Monday, April 08, 2019

Boeing Confirms Software At Fault In Ethiopian Crash


Last Thursday, Apr. 4, Ethiopian Transport Minister Dagmawit Moges released a preliminary report on the crash of an Ethiopian Airlines Boeing 737 Max 8 outside Addis Ababa last month, which killed all 157 people on board.  Cockpit voice recordings and data from the flight recorder make it very clear that, as Boeing CEO Dennis A. Muilenburg admitted regarding both this crash and that of an Indonesian Lion Air flight last fall, "it's apparent that in both flights the Maneuvering Characteristics Augmentation System, known as MCAS, activated in response to erroneous angle of attack information."  Boeing is currently scrambling to fix both that software problem and another minor one uncovered recently, but as of now, no 737 Max 8s are flying in the U. S. or much of anywhere else.  And the FBI is reportedly investigating how the plane was certified.

When we blogged about the Ethiopian crash three weeks ago, there were significant questions as to whether the MCAS alone was at fault, or whether pilot errors contributed to the crash.  According to a summary published in the Washington Post, however, Minister Moges said that the pilots did everything recommended by the manufacturer to disable the MCAS, which was repeatedly attempting to point the plane's nose downward in response to a single faulty angle-of-attack sensor output.  But their efforts proved futile, and the plane eventually keeled over into a 40-degree dive and crashed into the ground at more than 500 mph.

Our sympathy is with those who lost relatives and loved ones in both crashes.  Similar words were spoken by CEO Muilenburg, on whose head lies the ultimate responsibility for fixing these problems.  In doing so, he and his underlings will be dealing with the question of how to smoothly integrate control of life-critical systems when both humans and what amounts to artificial intelligence are involved.

This is not a new problem, but it has transformed so much over the years that it seems new. 

I once toured a museum near Lowell, Massachusetts, which preserved a good number of the original pieces of machinery used in one of the many water-powered textile mills that dotted the landscape in the early 1800s.  Attached to the main water turbine was a large, complicated assembly of gears, flywheels, springs, levers, and so on, which turned out to be the speed regulator for the mill.  As looms were cut in and out of the belt-and-shaft power distribution system, the load would vary, but it was important to keep the speed of the mill's shafts as constant as possible.  That complicated piece of machinery was a sophisticated control system that kept the wheels turning at the same rate to within a few percent, despite wide variations in load.
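For readers who like to see the idea in modern dress, here is a minimal sketch in Python of what that governor was doing.  Every constant in it is an illustrative invention of mine, not a measurement from any real mill; the point is only the feedback loop, which senses the speed error and moves the water gate to oppose it.

    # A toy speed governor: hold shaft speed steady while the load changes.
    # Constants are made up for illustration; the feedback idea is the point.

    def simulate_governor(steps=200, dt=0.1):
        target = 100.0    # desired shaft speed (arbitrary units)
        speed = 100.0     # current shaft speed
        gain = 0.05       # how aggressively the gate answers an error
        gate = 0.775      # water-gate opening, 0 (closed) to 1 (full)
        load = 0.3        # fraction of looms engaged

        for step in range(steps):
            if step == 100:
                load = 0.6                    # more looms cut in mid-run
            error = target - speed            # what the flyweights "sensed"
            gate += gain * error * dt         # what the linkage did about it
            gate = min(max(gate, 0.0), 1.0)   # a real gate has hard stops
            # Water power spins the shaft up; looms and friction drag it down.
            accel = 80.0 * gate - 40.0 * load - 0.5 * speed
            speed += accel * dt
            if step % 25 == 0:
                print(f"t={step * dt:4.1f}s  load={load:.1f}  "
                      f"gate={gate:.3f}  speed={speed:7.2f}")

    simulate_governor()

When the load doubles partway through the run, the speed dips for a moment and the gate opens wider until the error is worked off again, which is exactly the behavior the mill's gears and flyweights achieved without a line of code.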

I'm sure that from time to time the thing malfunctioned, and in such cases a human operator would have to intervene, shutting it down if it started to run too fast, for example, or if continued operation endangered someone caught in a belt.  So humans have been learning to get along with autonomous machinery for almost two hundred years.

The difference now is that in transportation systems (autonomous cars, airplanes), timing is critical.  And because cars and planes travel into novel situations, not all of which can be anticipated by software engineers, conditions can arise that make it hard or impossible for the humans who are ultimately responsible for the safety of the craft to respond.  That increasingly seems to be what happened to Ethiopian Airlines Flight 302, as evidenced by the black-box data clearly showing that a single angle-of-attack sensor was transmitting flawed data.
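One well-known defense against a single lying sensor is to cross-check redundant readings before letting automation act on them.  The following Python sketch is purely my own illustration of that idea, not Boeing's actual MCAS logic or its announced fix; the function name, thresholds, and sample readings are all invented for the example.

    # Hypothetical plausibility gate: compare two angle-of-attack (AoA)
    # vanes and refuse to command automatic trim when they disagree.
    # Thresholds and values are invented; this is not Boeing's code.

    AOA_DISAGREE_LIMIT = 5.0   # degrees of allowed sensor disagreement

    def trim_decision(aoa_left, aoa_right, stall_threshold=15.0):
        """Return a trim decision from two angle-of-attack readings."""
        if abs(aoa_left - aoa_right) > AOA_DISAGREE_LIMIT:
            # One vane is probably faulty; do nothing and alert the crew.
            return "inhibit automatic trim; flag AoA disagree to pilots"
        aoa = (aoa_left + aoa_right) / 2.0
        if aoa > stall_threshold:
            return "command nose-down trim"
        return "no action"

    print(trim_decision(40.0, 12.0))   # jammed vane -> inhibit, alert crew
    print(trim_decision(10.0, 10.4))   # healthy sensors -> no action

Press accounts of Boeing's planned software update suggest it takes a broadly similar approach, having the MCAS compare inputs from both angle-of-attack sensors rather than trusting one.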

Such issues have happened numerous times with the limited number of autonomous cars that have been fielded in recent years.  We know of at least two fatalities associated with them, and there have probably been many more near-misses or non-fatal accidents as well. 

But even a severe car wreck can kill at most a few people.  Commercial airliners are in a different category altogether.  They are operated by (mostly) seasoned professionals who should be able to trust that if they follow the procedures recommended by the manufacturer (in this case, Boeing), they will be able to deal with almost any imaginable contingency, even something like a stray plastic bag jamming an angle-of-attack sensor (this is my imagination working, but something had to make it give an erroneous reading).  In the case of the Ethiopian crash, that implied promise was broken.  The pilots did what they were told would disable the MCAS, but it did not disengage, with disastrous results.

It is unusual for a criminal investigation to be aimed at the civilian U. S. aircraft industry, whose safety record has been achieved under mostly cooperative conditions between the Federal Aviation Administration and the firms that make and fly the planes.  Obviously it is too soon to speculate about what, if anything, will turn up from such an investigation.  In teaching my engineering classes, I sometimes ask if anyone has encountered on-the-job situations whose ethics could be questioned.  And I have heard several stories about how inspection or test records were falsified in order to pass along defective products.  So such things do happen, but one hopes that in a firm with a reputation such as Boeing's, incidents like this are rare.

The marketplace has ways of punishing firms for bad behavior which are not just, perhaps, but nonetheless effective.  With the growth of Airbus, Boeing knows it has a formidable rival in commercial aircraft, and any airline with millions of dollars' worth of capital sitting idle on the ground while its 737 Max 8s wait for properly vetted software upgrades is bound to be having second thoughts about going with Boeing the next time it needs some planes.  I would not want to be one of the software engineers or managers dealing with this problem, as the reputation of the company may hinge on the timeliness and effectiveness of the fixes they come up with.

Boeing has been reasonably transparent about this problem so far, and I hope they continue to be up-front and frank with customers, regulators, investigators, and the public about the progress they make toward fixing these software issues.  People have been learning to get along with smart machines for centuries now, and I am confident that engineers can overcome this issue as well.  But it will take a lot of work and continued vigilance to keep something like it from happening in the future.

Sources:  The Washington Post carried the story "Additional software problem detected in Boeing 737 Max flight control system, officials say," on Apr. 4 at https://www.washingtonpost.com/world/africa/ethiopia-says-pilots-performed-boeings-recommendations-to-stop-doomed-aircraft-from-diving-urges-review-of-737-max-flight-control-system/2019/04/04/3a125942-4fec-11e9-bdb7-44f948cc0605_story.html.  I also consulted a Seattle Times article at https://www.seattletimes.com/business/boeing-aerospace/fbi-joining-criminal-investigation-into-certification-of-boeing-737-max/ and the original report from the Transport Ministry of Ethiopia, which the Washington Post currently has at https://www.washingtonpost.com/context/ethiopia-aircraft-accident-investigation-preliminary-report/?noteId=6375a995-4d9f-4543-bc1e-12666dfe2869&questionId=7ad6fc9d-5427-415d-b719-34ad0b3fecfd&utm_term=.55ff25187605.
