Monday, February 24, 2020

Divided Loyalties: The 737 Max Warning Light Glitch


In the sixth chapter of the Gospel of Matthew, Jesus is quoted as saying "No man can serve two masters; for either he will hate the one and love the other, or he will be devoted to one and despise the other."  The context is the impossibility of serving both God and mammon (money), but one does not have to be a Christian to recognize the shrewdness of Jesus' observation that divided loyalties sooner or later lead to trouble. 

A report from Bloomberg News this week makes this saying particularly relevant to the ongoing woes of Boeing, whose 737 MAX airliner is still grounded after two fatal crashes led to investigations revealing serious problems with the plane's software.  Now it appears that a warning light which could have helped mechanics fix the problem that contributed to the crashes wasn't even working, again due to software problems.

As we have mentioned in this blog before, both the Indonesian Lion Air crash in October 2018 and the Ethiopian Airlines March 2019 crash occurred when problems arose with the angle-of-attack sensors.  Specifically, one of them malfunctioned, and as a result, the defective software responded by essentially flying the plane into the ground, despite the pilots' efforts to stay aloft.  The warning light in question would have illuminated if the two angle-of-attack sensor readings disagreed, showing that one of them had a problem.  An alert pilot might have gotten a mechanic to fix the problem, which would have avoided the issue that led to the two fatal crashes.

But due to a separate software glitch, the warning light turned out not to work unless the customer also asked for an optional display showing each angle-of-attack sensor reading independently.  And 80% of 737 MAXes sold did not have that option, and so also had a defective warning light.  It's a little like if you ordered a car and found out that unless you also asked for optional fog lights, your brake lights wouldn't work. 
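
For readers who think in code, here is a minimal sketch of the kind of unintended coupling described above.  It is purely illustrative and is not Boeing's software:  the function names, the has_optional_indicator flag, and the 10-degree threshold are all my own assumptions.

```python
# Hypothetical threshold for illustration only; not a figure from the Bloomberg report.
DISAGREE_THRESHOLD_DEG = 10.0


def sensors_disagree(left_deg: float, right_deg: float) -> bool:
    """True when the two angle-of-attack readings differ by more than the threshold."""
    return abs(left_deg - right_deg) > DISAGREE_THRESHOLD_DEG


def light_as_intended(left_deg: float, right_deg: float, has_optional_indicator: bool) -> bool:
    """The warning light depends only on the sensors, never on the optional display."""
    return sensors_disagree(left_deg, right_deg)


def light_as_shipped(left_deg: float, right_deg: float, has_optional_indicator: bool) -> bool:
    """The glitch: the light is accidentally gated on the optional display being installed."""
    return has_optional_indicator and sensors_disagree(left_deg, right_deg)


# A plane without the optional display, with one sensor reading far off:
print(light_as_intended(5.0, 25.0, has_optional_indicator=False))
print(light_as_shipped(5.0, 25.0, has_optional_indicator=False))
```

With the same bad sensor reading, the first function lights the warning and the second stays dark, which is the brake-lights-and-fog-lights situation in miniature.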

By itself, the sensor disagreement warning light's malfunction was not a safety violation.  But in a letter written to Congress last July, the U. S. Federal Aviation Administration (FAA) acting head Daniel Elwell said, "A manufacturer cannot alter the airplane’s features after it has been certified."  The FAA is contemplating assessing fines against the company, and such fines can range up to the tens of millions of dollars.

That is a comparative drop in the bucket in relation to the estimated $18 billion that the firm has lost so far in the 737 MAX debacle since that fleet was grounded last year.  But the details of how Boeing discovered the warning-light glitch back in 2017 and decided not to fix it immediately reveal the glaring defects in a practice that the FAA decided to halt last November:  allowing Boeing-paid engineers to act as FAA inspectors for certain aspects of the certification and approval process.

Regardless of the details, the intended relationship between the FAA and private airplane manufacturers such as Boeing is inherently adversarial, to the extent that the point of having a regulatory agency is to ensure that the entity regulated doesn't get away with murder, or its corporate equivalent.  A simple example is the state of food manufacturing and sale in the U. S. prior to the establishment of the U. S. Food and Drug Administration, the history of which can be traced back to 1906.  Before then, it was perfectly legal to sell candy colored with arsenic-containing dyes to children, or fruit with traces of the arsenic-containing insecticide Paris green.  Once laws were passed against such abominations, the laws had to be enforced, which meant that chemists and inspectors paid by the government went out, collected samples, and tested them for harmful ingredients.  If found, the government used the evidence to levy fines and other penalties against the firms, and the U. S. food supply took a notable turn for the better.

But note that the members of the inspectorate (those charged with checking the output of the private manufacturers) owed their livelihood not to the manufacturers directly, but to the government.  This is a sound principle to guard against corruption and divided loyalties, but one that was neglected when Boeing convinced the FAA to allow some of Boeing's own employees to do inspections that the FAA would normally undertake.

According to the Bloomberg report, one such "inspector"—a Boeing employee authorized by the FAA to decide such matters—chose to let the warning-light glitch go until a future software update rather than issuing an immediate order to repair all the defective planes.  A clearer case of letting the fox watch over the henhouse would be hard to find. 

This lax procedure is probably not unrelated to the fact that Boeing is the only U. S. maker of large commercial aircraft.  Its only serious global competitor is the European combine Airbus.  If there were three or four viable U. S. airliner manufacturers, the FAA would be in a stronger position to levy serious and even firm-threatening penalties against Boeing, the reason being that the other hypothetical firms could take up any slack and still allow the U. S. airliner manufacturing business to function.

But both Boeing and the FAA know that is not the case, and that whatever Boeing does, the FAA isn't going to do anything on its own that would threaten the company's existence and put the U. S. out of the international airliner business. 

There are many bad things about monopolies, and one of the worst is that they encourage laziness, both on the part of the monopoly itself and on any agency charged with keeping an eye on it.  In surrendering some of its authority to Boeing employees, the FAA preserved the appearance of vigilance while relinquishing the reality.  When it ended such cozy arrangements last November, it took a step in the right direction of putting a respectable distance between itself and the industry it is charged with regulating.

But cultures and perceptions do not change overnight, and both Boeing and the FAA have a long way to go before they recover some of the public trust that went down in flames in the 737 MAX crashes. 

Sources:  The Bloomberg report on the prospect of FAA fines for the warning-light glitch was carried on the Fortune website on Feb. 21, 2020 at https://fortune.com/2020/02/21/boeing-737-max-warning-light-new-faa-fines/.

Monday, February 17, 2020

Will FIDO Make an End to Passwords?

Anybody who spends much time online these days, which is nearly everybody, wastes a certain amount of time and endures more or less annoyance in entering passwords.  An industry alliance called FIDO (for Fast IDentity Online) promises to make passwords a thing of the past.  But before that happens, there are both technical and social obstacles in the way.

Founded in 2013 by PayPal and other companies wishing to make it easier for people to log in to their sites, FIDO works by collapsing all the different password-validation operations for the sites you use into one device-specific process.  That would be a great improvement over the way things are now, as I will illustrate with a personal example.

Say I want to do the following:  check my bank balance, buy a component from a supplier in a hurry, log in to my university email,  and change a file on my class website. 

Right now I'd have to perform these steps flawlessly:  (a) log on to my bank's website and enter two separate passwords which have nothing to do with my other passwords, and therefore are not that easy to remember; (b) hunt up the place on my computer where I hide all the dozens of vendor passwords I've accumulated over the years, by remembering the name of the file I hid them in, and type the password into the vendor's website; (c) type in a long sequence of letters, some of which are capitalized, that the university recently made us switch to from an old shorter password, and hope I get it right, which I still do only about 80% of the time; and (d) for the class website, do a two-step verification involving not only the previously mentioned new long password, but also either asking the computer to call my office phone (which is fine if I'm in the office) or entering a six-digit number from a dongle they sold me.  The dongle works fine until I accidentally press its button two or three times without using the numbers, which I do from time to time because it's on a keychain in my pocket.  Then it loses sync with the computer, in which case I have to phone IT support and spend ten minutes or so waiting for them to hunt up the one guy who is authorized to re-sync dongles, and I read out three numbers in sequence to him, with thirty-second pauses in between.  Then I can go back, log in, and change the file on my class website.

This is not to knock the university's IT people.  They are understandably concerned about security, and within their limited resources they have come up with the best password protection they can figure out.  And admittedly, if I would just break down and buy a smartphone I wouldn't have to fool with the dongle. 

But the dongle is one of the technical hurdles FIDO will have to overcome in its march to eliminate passwords.  As I understand it from the FIDO Alliance website, once FIDO achieves universal buy-in, all password requests would be dealt with the same way.  If you have a smartphone that does fingerprint verification, the same fingerprint will work for every website.  If you do dongle verification, or smart-card verification, or voice-recognition verification, that same method will work for everything.  The method used will depend on the device that the user has access to. 

For old duffers like me who spend at least as much time using a laptop to access the Internet as they do with a phone, this prospect is not so encouraging, because it means that to take advantage of FIDO, I'd have to be using the same device all the time.  Or at least it seems to mean that.  But the global trend is toward using mobile phones for just about everything, and newer computers tend to have the hardware and software needed for fingerprint ID or similar biometric methods, so this issue will not be so serious going forward.

The social issue I mentioned is the simple fact that for FIDO to work, the websites all have to be able to take the FIDO "public-key cryptography" stuff that the user's device sets up.  And all the user-device makers have to make FIDO available on their devices.  Fortunately, the upsides to most parties involved way outweigh the downsides, which is why the people in charge of the Android operating system have recently upgraded their buy-in so that it will work with mobile browsers, according to a recent article on the Wired website.  So progress is being made in that area.
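
To give a rough idea of what the "public-key cryptography" stuff amounts to, here is a toy sketch of a challenge-and-response login as I understand the general idea.  It is not the actual FIDO protocol:  the function names are made up, and the tiny textbook RSA numbers are hopelessly insecure and are used only so the example runs with nothing but Python's standard library.

```python
import hashlib
import secrets

# Toy RSA key pair from tiny textbook primes (61 and 53). Hypothetical and
# insecure: real authenticators use vetted algorithms and much larger keys.
N = 61 * 53   # public modulus (3233), known to the website
E = 17        # public exponent, known to the website
D = 2753      # private exponent, never leaves the user's device


def digest(challenge: bytes) -> int:
    """Hash the challenge and reduce it to a number the toy key can sign."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def device_sign(challenge: bytes, user_verified: bool) -> int:
    """The device signs only after a local check (fingerprint, PIN, dongle button)."""
    if not user_verified:
        raise PermissionError("local user check failed")
    return pow(digest(challenge), D, N)


def site_verify(challenge: bytes, signature: int) -> bool:
    """The website checks the signature using only the public half of the key."""
    return pow(signature, E, N) == digest(challenge)


challenge = secrets.token_bytes(16)                     # website sends a fresh random challenge
signature = device_sign(challenge, user_verified=True)  # device answers after the local check
print(site_verify(challenge, signature))                # website accepts a valid answer
print(site_verify(challenge, (signature + 1) % N))      # and rejects a tampered one
```

The point of the design is that no password crosses the wire, and the website holds nothing secret that a thief could steal and reuse.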

For people and organizations unable or unwilling to do FIDO, there will still be the old-fashioned password, which brings to my mind scenes out of 1930s movies about Prohibition, where someone desirous of booze would appear before a door with a peephole in it and murmur, "Joe sent me."  Perhaps back then the formality of a password just added to the underworld glamour of obtaining illegal hooch.  But these days, when accessing multiple websites in a day is as routine as walking through multiple doors in a day, passwords have become a digital albatross around our collective necks that we would be more than happy to get rid of.

As is always the case with advances in widely used technology, somebody will figure out a way to hack FIDO.  The obvious weakness to me is the fact that with FIDO, all one's security eggs will be in one basket, so to speak.  Right now, if somebody hacked my bank password, for example, I might wake up broke tomorrow, but at least I could still make a secure purchase from Etsy—if I had any money.  But if FIDO becomes universal and someone manages to hack into your FIDO verification system, they can get into everything your current passwords give you access to, all at once. 

I'm sure the FIDO wizards have thought of this possibility and will try to deal with it somehow.  As long as FIDO will work better than my hardware dongle, I'm all for it, but it looks like it will be a while before it gains the degree of acceptance that would make a real dent in our need for remembering, typing in accurately, and dealing with the downsides of plain old-fashioned passwords. 

Sources:  I referred to a Wired article entitled "Android Is Helping Kill Passwords On a Billion Devices" at https://www.wired.com/story/android-passwordless-login-fido2/, the FIDO Alliance website at https://fidoalliance.org/, and the Wikipedia article "FIDO Alliance."

Monday, February 10, 2020

A Real Live Caucus-Race in Iowa


In Lewis Carroll's Alice's Adventures in Wonderland, Alice falls into a pool of her own tears (don't ask) and eventually makes her way to shore amid a crowd of animals.  One of them, a bird called a Dodo (which was extinct long before Carroll published his work in 1865), suggested that they all dry out by running a caucus-race.  The Dodo marked out an irregular course, and everyone "began running when they liked, and left off when they liked, so that it was not easy to know when the race was over."  The ensuing chaos was echoed in Democratic politics last week.

Last Monday, Feb. 3, the Iowa Democratic Party inadvertently staged a caucus-race of its own when precinct leaders in the 1,700 or so Democratic caucus sites tried to use a new smartphone app to report their results to headquarters.  To make a long and nail-biting night short, the app failed, and it took another day or longer before the nation learned the results of the very first official 2020 party selection process for the Democratic nominee.

Numerous analysts and commentators have opined on what went wrong and why, and as I'm no software expert, I will take their word that the cause was a combination of factors:  a programming flaw that led to serious inaccuracies in the transmitted data; a failure to test the app thoroughly before deploying it; and a user-hostile interface that required chairpersons to download auxiliary apps and/or change obscure phone settings to get it to work.  In view of the fact that the people using the app were not software experts, but volunteers from many walks of life, it's not surprising in retrospect that something went wrong.

One commentator I read remarked that sometimes the old-fashioned telephone is better than a newfangled app, noting that the Associated Press election-night results that serve as the national gold standard for election news are still reported by human beings over the telephone to a central location, where they are collated and disseminated to news outlets.  When it became clear that the app wasn't working, the Iowa precinct chairs resorted to phoning, but the headquarters phone bank was not prepared for such an onslaught of calls, and most callers just got a busy signal.

At no time were the actual vote tallies at risk of being lost.  The whole problem was one of communication, and not just on the night of the caucuses.  Reports indicate that the app was developed in haste and without adequate testing.  Its deployment was marked by insufficient training of volunteers, many of whom weren't able to get the thing to run on their phones easily even after they followed the instructions.  As a result, the very leading edge of what will be a long and twisting road for the Democratic candidates for nomination from here to the national election night next November was blunted by needless confusion and delay.

Software engineering as a discipline is one of the younger divisions of engineering, and in contrast to the older fields such as civil and mechanical engineering, traditions and standards for it are fairly new and still in a state of flux.  It is still a discipline in which people without college degrees can make viable careers, as many a high-school nerd who fell in love with coding can attest.  And this is not to say that a college degree is a guarantee of professional integrity.  The most such a degree can do is expose the student to a standard set of educational experiences that include a warning about the wider implications of what software engineers do, and how to take responsibility for the potential for failure that any act of programming entails.

In the absence of any detailed information about the firm that developed the failed app, all we can do is look at the results.  Unlike product rollouts, election and caucus dates are firmly entrenched in the calendar.  It looks like whoever had the bright idea to get an app developed for this purpose either didn't have the idea soon enough, or wasn't able to get the process rolling until just a few months before it had to be used. 

One of the reasons I'm in academia rather than industry is the fact that I don't work that well under time pressure.  Other people, including some software developers, thrive on it, and if so that's fine for them.  But there are limits to any speedup of work, and even if the software itself had been flawless, the time remaining between when the developers said it was ready to be deployed and the caucus date itself may not have been long enough to allow for adequate training.  It's difficult to get volunteers together for a purely auxiliary thing like software training.  It begins to sound too much like a job.  So that may be one reason that training was neglected.

And when the full magnitude of the disaster became obvious, there was evidently no Plan B in place.  I don't know how much it would have cost for the headquarters to have hired enough phone-answerers to handle telephoned-in results just in case the app didn't work, but it would have been money well spent.  The rule that ocean-going vessels must have enough lifeboats for everybody wasn't adopted until after the Titanic sank with all the people who couldn't get on the inadequate number of lifeboats.  If your primary system fails, a secondary system that works only halfway isn't much better than no system at all. 

In the event, the absence of results enabled all the candidates to claim victory for a while that night, just as at the end of Alice's Caucus-Race, the Dodo said, "Everybody has won, and all must have prizes."  The losers last Monday, unfortunately, were not only the candidates who came in behind the eventual winner, but everybody who wanted to know the results right away.

Sources:  I referred to a New York Times opinion piece by Charlie Warzel at https://www.nytimes.com/2020/02/04/opinion/iowa-caucus-app.html and articles at https://www.nbcnews.com/tech/security/iowa-caucus-app-was-rushed-flawed-beginning-experts-say-n1131216 and https://gritdaily.com/iowa-caucus-tech-disaster/.  I thank Michael Cook of Mercatornet.com for suggesting this topic and drawing my attention to the Times piece.

Monday, February 03, 2020

What Price Prosperity? The Watson Grinding & Manufacturing Explosion


People living in northwest Houston were awakened early Friday morning, Jan. 24, by a tremendous explosion.  A dramatic doorbell-camera video taken from a nearby neighborhood shows a brilliant flash followed by a rising fireball, and then the camera is knocked off its mountings by the arriving shock wave. 

The explosion demolished much of Watson Grinding and Manufacturing, killed two workers, injured about 20 people, and damaged some 400 homes and other structures nearby, 35 of them seriously. 

Investigators from the U. S. Bureau of Alcohol, Tobacco, Firearms and Explosives determined that the cause was likely a spark that set off propylene gas leaking from a 2,000-gallon tank, which was later secured by HazMat crews.  The family of one man killed in the explosion has filed a wrongful-death lawsuit, and Harris County has filed suit on behalf of its residents because it claims the firm failed to exercise due care to protect the public.

Chemical explosions and fires are nothing new to the residents of Greater Houston and the coastal region near it, which has one of the largest concentrations of refineries and petrochemical plants in the world.  The things you can make by tormenting hydrocarbons and other materials with extreme heat and pressure are valuable to the world's economy, and so a lot of money goes into building plants and equipment that necessarily involve dangerous materials and processes.

At this point it is rather speculative to ask what Watson Grinding and Manufacturing was doing with all that propylene.  The most likely answer is that they were using it as a substitute for the even more hazardous acetylene in welding operations and other applications requiring lots of gas-fueled heat, such as heat treating.  Propylene is a gas at room temperature, but like other light hydrocarbon compounds, it can be compressed into liquid form, and that is probably how it was stored in the tank.

One news report points out that any facility where more than 10,000 pounds of propylene are stored has to conform to certain reporting requirements, which Watson Grinding and Manufacturing apparently failed to do.  By my calculations, 2,000 gallons of the stuff weigh just about exactly 10,000 pounds, so they may have squeaked under the wire on this one.  Regardless of such reporting, it's obvious that not only did a serious propylene leak occur, but no alarms were set off by gas-leak detectors, which these days are not that expensive.  The two employees killed were in the facility's gym, exercising before a work day they never had.
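
The weight estimate above can be checked with a few lines of arithmetic.  The sketch below assumes a liquid density of about 0.6 kilograms per liter, which is my own round figure and not a number from the news reports.  Liquid propylene gets less dense as it warms, so the true weight in the tank would depend on how it was stored.

```python
LITERS_PER_US_GALLON = 3.785411784   # exact by definition
POUNDS_PER_KILOGRAM = 2.20462262


def liquid_weight_lb(gallons: float, density_kg_per_liter: float) -> float:
    """Weight in pounds of a volume of liquid with the given density."""
    liters = gallons * LITERS_PER_US_GALLON
    return liters * density_kg_per_liter * POUNDS_PER_KILOGRAM


# Assumption: roughly 0.6 kg/L for cold liquid propylene. A warmer tank would
# hold less dense liquid and therefore fewer pounds in the same 2,000 gallons.
ASSUMED_DENSITY_KG_PER_LITER = 0.6

tank_weight_lb = liquid_weight_lb(2000, ASSUMED_DENSITY_KG_PER_LITER)
print(f"{tank_weight_lb:,.0f} lb")
```

With that assumed density, a full tank comes out within a fraction of a percent of the 10,000-pound mark, which is why the reporting question is such a close call.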

Did these men, who were probably glad to have reasonably well-paying manufacturing jobs, deserve to die?  I can't imagine anyone saying "Yes" to that.  Yet they knew, or may have known, that the stuff they were working with was dangerous, and now and then things go wrong even in the best-run facilities.  A society can be judged by the way it deals with clear cases of injustice, and many industrial accidents fall into that category.

The people who own and/or operate the facility rarely spend much time on the shop floor, exposed to the hazards that they have created.  The people who are injured or killed are either low-level operatives or sometimes just innocent bystanders who could afford only the kind of housing you find in the vicinity of manufacturing districts.  The days when factory owners would build a mansion next door to the plant are long gone.

We don't have to imagine an alternative universe where regulations on manufacturing would be so tight that hardly anyone, down to the lowliest worker, would be at risk of injury or death.  We have only to look to places such as Amherst, Massachusetts, where I worked for many years.  The regulatory and civic environment was such that anyone wanting to run a plant offering hazards more serious than a paper cut had to flee to another location, such as Sunderland or Hadley.  If some dictator (I won't say anything about the current candidates for presidential nominations) managed to implement Massachusetts-style federal regulations on the whole country, including Houston, why, Houston would be a much safer place to be.  It would also turn into a ghost town, or large sections of it would, because the thing that has already happened to a great deal of manufacturing activity in the U. S. might well happen to the petrochemical and refining industries and all their ancillary commercial infrastructure as well:  it would move offshore to places less hostile to hazardous commercial activity.

This is not to set up a false two-way choice between prosperity and death for workers on the one hand, and third-world poverty and safety on the other hand.  There is a third way, one that allows for dangerous manufacturing processes, but with due attention paid to the hazards involved. 

To their credit, most of the firms doing hazardous manufacturing manage to keep their workers reasonably safe most of the time.  If factories blew up every day, it wouldn't be news.  Still, every explosion reveals a failure of attention:  attention to what might go wrong and diligence in stopping it from going wrong before it gets too far. 

In the coming months, investigators and lawyers will find out a lot about how Watson Grinding and Manufacturing dealt with these problems.  The most important "machinery" in a plant is invisible:  it's the network of minds and human relations that keep the place going while enforcing a culture of safety, and back up that culture with needed resources—or not, as the case may be.  It's obvious that something went wrong:  a leak that began small got bigger, a leak detector that should have sounded the alarm was malfunctioning or not purchased in the first place, and an unfortunate combination of circumstances culminated in the tragedy caught on the doorbell camera. 

The word "love" is rarely brought up in discussions of engineering ethics, but in the proper context, it provides the foundation for good corporate and industrial behavior.  Here's a question that those in charge at Watson could ask themselves.  Take someone you love—a son, daughter, spouse—and answer this question:  would you want your beloved to work at any job in your plant?  For years?  If not, why not?  The answer to those questions can motivate improvements that could prevent things like the explosion at Watson Grinding and Manufacturing in the future.  But only if they are asked, and answered in the right way, with actions as well as words.

Sources:  An Associated Press report on the explosion was carried by numerous outlets, including the Raleigh (NC) News & Observer at https://www.newsobserver.com/news/business/article239855223.html.  I also referred to reports by KPRC Houston at https://www.click2houston.com/news/local/2020/01/31/harris-county-injured-employee-sue-owner-of-watson-grinding-and-manufacturing-after-explosion/ and the Houston Chronicle at https://www.houstonchronicle.com/news/houston-texas/houston/article/What-we-know-about-company-west-explosion-gessner-15000960.php.