Ethiopian Airlines Flight 302.
Lion Air Flight 610.
Two Boeing 737 Max 8 airplanes.
346 lives lost.
As human factors experts in flight safety, we’ve been poring over the seemingly endless stream of daily news stories about these tragedies. And what we’ve found is this: If there was ever a case study to show exactly how Human-Centered Design could have saved lives, this is it.
Let’s start at the beginning – with the Boeing 737. Did you know that the Boeing 737 is the highest-selling commercial jetliner of all time? It’s been in service since 1968. A study done in 2006 showed that there were 1,250 Boeing 737s airborne around the world at any given time. Every 5 seconds, two were either landing or taking off. Over the years, Boeing has maintained one of the best flight safety records in aviation.
As you would expect of any company, Boeing made design changes to their popular 737 over the last several decades. They’ve lengthened it, made multiple changes to its engines and wings, and have upgraded cockpits and interiors. In recent years, their goal has been to make the 737 more efficient. You can’t blame them for wanting to make these design updates. As times change and technology advances, redesigns are expected – no matter what the product is.
Marketing versus flight safety
We now live in a fast-paced world that’s dominated by the lust for technology and the power of marketing to make us think we need it. The rush to be the latest and greatest often leads companies to make poor decisions about design and about how to introduce technology to users. Often, technology is updated and integrated for technology’s sake or for the purpose of maintaining market dominance. Too often, companies lose sight of the fact that the purpose of technology is to solve people’s problems. This is a core principle of Human-Centered Design, and exactly where Boeing may have skipped a step.
It’s been reported that marketing was the catalyst for pushing Boeing to release its most recent technology, the Maneuvering Characteristics Augmentation System (MCAS), on its newly redesigned 737 Max planes. MCAS is an advanced “fly by wire” flight control feature, and a technology that Boeing most likely felt would give them the upper hand in their sales war against their biggest rival, Airbus.
You see, “fly by wire” was pioneered by Airbus over twenty years ago. Now it’s become the standard in the airline industry. “Fly by wire” is automation technology that converts the pilot’s movements of the flight controls into electrical signals, which are then interpreted by the flight control computers. Essentially, “fly by wire” is a computerized flight-control system that can fly the plane. Boeing’s MCAS takes this system to a whole new level of sophistication. In hindsight, a dangerous level. From a human factors standpoint, its release to the public should only have proceeded after user testing and after proper procedure and training protocols were established. To date, no evidence has been released that these critical design steps were completed to understand the safety risks associated with the design.
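To make the “fly by wire” idea concrete, here is a purely illustrative Python sketch. It is not any real Boeing or Airbus control law; the gain and threshold are invented. The point is the indirection: the pilot’s input becomes data, and software decides what the control surface actually does.

```python
# Illustrative only: a toy "fly by wire" mapping, not a real control law.
def control_law(yoke_deflection: float, angle_of_attack: float) -> float:
    """Convert a pilot's yoke input (-1.0..1.0) into an elevator command in degrees."""
    command = yoke_deflection * 15.0       # naive proportional mapping (invented gain)
    if angle_of_attack > 12.0:             # hypothetical envelope-protection threshold
        command = min(command, 0.0)        # software limits the nose-up command
    return command

# The pilot pulls back hard, but at a high angle of attack the computer
# refuses the nose-up command. The human never moves the elevator directly.
print(control_law(yoke_deflection=0.9, angle_of_attack=14.0))  # prints 0.0
```

That indirection is the whole appeal of fly by wire, and also the whole risk: whatever logic sits between the pilot and the surfaces must be understood by the people flying the plane.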
But time was ticking. In 2010, Airbus announced its plans to build a new fuel-efficient and cost-effective plane. Boeing, of course, wanted their own version. To make their latest 737 redesign as efficient and competitive as possible, Boeing set out to reduce fuel consumption by 30%. This is most likely why, as of January 31, 2019, Boeing had a whopping 5,011 firm orders from 79 customers for their 737 Max planes. Schedule pressure is another major contributor to errors of omission: skipping critical steps in order to get to market introduces risk to end users. No company ever wants to be in the position of not being able to fill hot orders. Consider Tesla in recent years. Tesla was unable to meet production demand and therefore couldn’t keep up with the orders placed, which created buyer resentment.
But back to Boeing. In order to reduce fuel consumption by 30%, Boeing’s planes needed engines so enormous that they could no longer be located under the wings (as in previous models). So, Boeing moved the engines forward. This, of course, altered the plane’s balance and aerodynamics, giving it a tendency to pitch its nose up. An aircraft whose nose climbs too steeply risks a stall. The fix? Their anti-stall system, MCAS, would be used. Pilots could shut it off if need be, but they would only have a short time to take back control of the plane and avoid a possible crash. However, pilots would now need to know that there was a new operational anomaly, that MCAS was the solution, and that they were responsible for recognizing the situation and completing the proper response.
Designing for flight safety
Now, when you’re truly designing for users and their safety, you must test, test, and test some more to ensure you haven’t introduced something new from a risk perspective. However, the problem begins with the design change motivation. The change on the Max 8 was driven more by a desire to reduce cost than by a problem with operation. This could be part of the reason why Boeing missed the potential for user risk. Now, we are not saying cost-driven redesigns shouldn’t be done. They absolutely should be done to create business efficiencies. But they need to be done with proper care and testing to protect end users. Part of our job as human factors experts is anticipating the unexpected and potentially catastrophic conditions that a design can cause. We look for every possibility of failure because we know that even small issues can cause disastrous outcomes – even when the changes seem unrelated to the operators. We don’t know exactly how much user testing was done when Boeing incorporated MCAS, if any. But that testing should have ensured that MCAS didn’t introduce new human risks relative to the current model benchmarks.
It’s still too soon to know exactly what happened on Ethiopian Airlines Flight 302. However, preliminary data retrieved from the flight data recorder shows a clear similarity with what happened on Lion Air Flight 610. In that case, Indonesian authorities have already released an investigative report. It points to black box data showing that the pilots struggled to maintain control of the plane as MCAS repeatedly pushed its nose down. Each time the plane pointed down, the pilots manually aimed the nose higher. But the plane kept repeating the sequence every five seconds. This happened 26 times before the plane crashed! The focus of the investigation has been whether faulty sensor information caused the nose to be forced down.
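To see why that repeating five-second cycle was so punishing, here is a toy reconstruction of the tug-of-war in Python. The numbers are invented for illustration and are not the actual MCAS logic or the flight data; the assumption that each manual recovery gained back slightly less than the automation took away is ours, made only to show how small deficits compound over 26 cycles.

```python
# Toy reconstruction of the nose-down/nose-up cycle; invented numbers, not flight data.
def simulate_tug_of_war(cycles: int = 26,
                        automation_nose_down: float = 2.5,
                        pilot_nose_up: float = 2.0) -> float:
    """Return the net stabilizer trim (arbitrary units, negative = nose down)."""
    trim = 0.0
    for _ in range(cycles):
        trim -= automation_nose_down   # automation trims the nose down
        trim += pilot_nose_up          # pilot manually trims the nose back up
        # ...roughly five seconds later the automation fires again...
    return trim

# If each recovery wins back a little less than the system takes away,
# the aircraft drifts steadily nose-down over the 26 cycles.
print(simulate_tug_of_war())  # prints -13.0 for these illustrative numbers
```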
We don’t know what simulator testing Boeing completed with these sensors. But we do know one thing: If you can’t use design to get rid of issues like these, you need to create as many safeguards against failure as possible – including clear communications and warnings. For example, a system could have been put into place to alert the pilot to an anomalous case. The system could have provided the correct action for him or her to take. The twist here, however, is that pilots would need to be aware of this anomaly, understand the information that they’re given and know the correct action that they need to take. Pilots of Max 8 planes have come out in the press in the wake of both crashes, saying that they didn’t know about the MCAS anomalies at all or that they received very little training on the new model of the 737.
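As one hypothetical example of such a safeguard (our sketch, not Boeing’s design): if two redundant angle-of-attack sensors disagree beyond some tolerance, the automation could stand down and the crew could receive an explicit alert that names the action to take. The threshold and the wording below are invented for illustration.

```python
# Hypothetical safeguard sketch, not Boeing's design: inhibit automatic trim
# and alert the crew when redundant angle-of-attack sensors disagree.
DISAGREE_THRESHOLD_DEG = 5.0   # assumed tolerance for this sketch

def alert_crew(message: str) -> None:
    print(f"[CAUTION] {message}")   # stand-in for a cockpit alert and checklist prompt

def automatic_trim_permitted(aoa_left: float, aoa_right: float) -> bool:
    """Allow automatic trim only while both sensors roughly agree."""
    if abs(aoa_left - aoa_right) > DISAGREE_THRESHOLD_DEG:
        alert_crew("AOA DISAGREE: automatic trim inhibited. "
                   "Fly pitch manually and trim as needed.")
        return False
    return True

# A faulty sensor reads wildly high: the automation is inhibited and the
# crew is told what happened and what to do next.
automatic_trim_permitted(aoa_left=21.0, aoa_right=4.5)
```

Even a safeguard like this only works if pilots have been told it exists and have practiced the response it asks of them.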
Who is responsible for bad design?
In the Lion Air case, the telemetry was fighting with itself. This type of issue has happened before in catastrophic cases like the Deepwater Horizon explosion, which killed 11 people, injured 17 and led to the largest environmental disaster in U.S. history. That disaster resulted from the failure of eight different safety systems that were meant to prevent it from happening in the first place! Deepwater Horizon was an instance where it was unreasonable to expect the operators to know what to do when the system was communicating contradictory information. Although steps were taken in the name of safety, they weren’t necessarily safe for the end user! The same holds true for the Chernobyl disaster, which happened because of a flawed reactor design. The safety culture at the time was non-existent, and human risk was not a critical design parameter.
When we look at situations like the Max 8 and other large-scale design failures like Deepwater Horizon and Chernobyl, we must ask ourselves: Is bad design the responsibility of the user to mitigate? Was it the responsibility of the pilot to prevent the issues introduced by MCAS? The answer: Absolutely not!
It’s the responsibility of the organization and its designers to identify and mitigate all system and human risks introduced through their designs and end products or services. If, as a designer, you’ve exhausted your ability to design an issue out, then you must introduce as many preventative measures and safeguards as possible. However, you first have to take the time to assess and make sure you have identified all possible failure modes – system and human-related. And no, they are not the same. The least desirable option is leaving a risk unchecked and up to a human in a high-stress situation to successfully overcome – praying that their two hours or less of training holds! When an accident occurs, we conveniently point the finger at the human because it’s easy. In reality, the problems run much deeper. It’s usually the management or design team that holds first tier responsibility for the catastrophe. But that raises bigger problems for a company, especially in an event of this caliber on the public stage. It can end a company – perfect past record or not.
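To make that assessment step concrete, here is a minimal sketch of a failure-mode register in Python. The entries, scores, and the “human is the only barrier” flag are invented examples, not Boeing’s analysis; the point is that system and human failure modes are enumerated side by side, and any risk left entirely to the operator is flagged for further mitigation.

```python
# Minimal failure-mode register sketch; entries and scores are invented examples.
from dataclasses import dataclass

@dataclass
class FailureMode:
    description: str
    severity: int              # 1 (minor) .. 5 (catastrophic)
    likelihood: int            # 1 (rare)  .. 5 (frequent)
    mitigation: str
    operator_is_only_barrier: bool

    @property
    def risk(self) -> int:
        return self.severity * self.likelihood

modes = [
    FailureMode("Automation trusts a single faulty sensor", 5, 2,
                "Cross-check redundant sensors", False),
    FailureMode("Crew unaware of new automatic trim behavior", 5, 3,
                "Full-motion simulator training on recovery", False),
    FailureMode("Crew must diagnose runaway trim within seconds", 5, 3,
                "Memory checklist only", True),
]

# Rank by risk and flag anything that leans entirely on the human.
for m in sorted(modes, key=lambda m: m.risk, reverse=True):
    flag = "  <-- unacceptable: human is the only barrier" if m.operator_is_only_barrier else ""
    print(f"risk={m.risk:2d}  {m.description}{flag}")
```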
It can take approximately 10,000 hours of exposure to or practice with anything to become an expert or to develop an innate response pattern to an operation. This figure may be debated in some circumstances, but for the sake of this blog, we’re using it to make a point. If an event produces a set of circumstances – such as the nose dives on the Max 8s – that does not match an operator’s experience or training, you cannot expect them to react with the correct response after minimal or no exposure to a successful recovery. The actions needed to recover from an incident like that are simply not intuitive. In addition, innate behavior typically takes over in high-stress situations. In the case of Lion Air Flight 610, the pilots’ innate reactions were modeled on their experience with other Boeing models. Unfortunately, the event required essentially the opposite response patterns.
Training is always necessary – even for small design changes.
Three people with knowledge of the cockpit voice recorder contents on Lion Air Flight 610 have told the press that in the final moments before the crash, the pilots scrambled through a handbook to understand why the plane was diving. This gives us some indication that they had little or no knowledge of the MCAS anomaly or of how to stop the plane from diving.
Airlines dictate the amount of training required to transition from flying one model of the 737 to the next. A Southwest Airlines captain recently told the New York Times that pilots had to watch a video before flying the Max 8 to “familiarize themselves with slight differences in the systems and the engines”. Many reports claim that this video was only 2 hours long and shown on an iPad.
According to a New York Times report, Boeing had not been able to prepare a training simulator for pilots in time for the launch of the new plane. Simulator cockpits teach pilots how to use any new functions of the instruments and software. This is all done in a safe atmosphere on the ground and gives pilots experience with solving the problem successfully – experience that must transfer to real-time operation before we can reasonably expect an operator to respond successfully.
“They were building the airplane and still designing it,” said Southwest Pilots Association training and standards chair Greg Bowen. “The data to build a simulator didn’t become available until about when the plane was ready to fly.”
A group of United Airlines pilots put together their own training manual on the Max 8 just before the airline accepted delivery of the planes in 2017. None of them had ever flown a Max 8 at the time. A 737 captain and union official named James LaRosa led the training group. He told the New York Times that he went to a Boeing training center in Seattle to try out a mock cockpit that didn’t move like typical simulators. During this time, he and his group put together a 13-page manual showing the differences between the Max 8 and other 737s. While the manual covered changes to engines and displays, it didn’t mention the MCAS anomaly at all. Is this type of training really enough? No. But it comes back to understanding the impact of seemingly small changes: how they introduce risk into the equation, and why training cannot be the only risk-mitigation tactic.
It has been suggested that Boeing’s marketing department left the MCAS anomaly out of training materials because it wanted to make the introduction of the new plane seem as easy as possible. Some reports also state that there was a great deal of secrecy around MCAS because of Boeing’s competition with Airbus. If these reports are true, then you could assume that the marketing department had a large influence on how the technology was introduced to users – not necessarily the engineers and designers who built it.
When disaster strikes an organization with a reputation built on consistent and safe operations, the public is left confused and often unforgiving (as they should be). Unfortunately, organizations are very complex, and even the best ones make mistakes. Over time, a halo effect born of experience often sets in. Experts sometimes forget about the risks present in everything they add to or take away from a known, well-understood design. The addition of new technology will betray them every time, because it requires the human to develop a new model of interaction. And yes, this is true even for small changes. If in doubt, think critically about all of the challenges we have experienced introducing driverless or driver-assist technologies into a very well-known design – the automobile. The challenge lies in understanding what it now means to be a driver in an AI world. We have yet to shift the focus to understanding the incremental design steps needed to bring the human operator along with the technology. Airplanes are no different, and it’s time to slow down the technology in favor of understanding the true human safety impacts.