Spanair Flight JK 5022
One of the things we information security professionals are constantly told is to anticipate risks with extreme consequences when we perform risk analyses. Events such as Hurricane Katrina, the destruction of the World Trade Center by terrorists, and the Chernobyl nuclear power plant meltdown, with its gargantuan release of radioactivity, show that far-out-of-the-ordinary events sometimes occur and that we are usually ill-prepared to deal with them.
Several days ago some startling news began circulating on the Internet. On August 20, 2008, Spanair flight JK 5022 crashed immediately after taking off from Barajas Airport in Madrid, Spain, killing 154 of the 172 people on board. The plane had mechanical problems, the most significant of which was that its flaps and slats were retracted during takeoff. An earlier takeoff attempt the same day had been aborted because of mechanical problems, and mechanical problems in the plane had also been reported a few days earlier. A computing system at the airport ran an application designed to report problems such as the ones JK 5022 experienced. Had it worked properly, the pilot would almost certainly have aborted the takeoff, but it did not. El País, a Spanish daily newspaper, claims that the central computing system was so infected with Trojan horse programs that it did not function properly.
This news item needs to be interpreted with caution, as there is no conclusive evidence that the central computing system in question was infested with malware. An investigation into the incident is underway, and findings are due to be released by the end of this calendar year. If malware infestation turns out to be the cause of this unfortunate incident, however, it would be a landmark in information security: the first time a security-related cause was at the root of a massive catastrophe that claimed many lives.
Security letdowns have resulted in deaths before, however. In the 1980s a PC in a medical center that controlled the amount of radiation delivered to cancer patients undergoing radiation treatments malfunctioned because of a virus infection, causing it to deliver far too much radiation. Fortunately, only one fatality resulted, but even one is too many. Additionally, a control system at a textile plant that ran a carpet-rolling machine malfunctioned, causing two employees to suffocate after they were caught up in and trapped inside a carpet roll. Analysis of the control system later revealed that a malicious insider had tampered with it.
If malware turns out to be the cause of the Spanair crash, the incident will serve as a compelling wake-up call that people will not be able to ignore. When security problems turn into human fatalities, even the most security-hostile managers will be forced to ask questions such as “could something like this happen in our organization if we do not adequately control security risk?”
Incidentally, the “blame game” in this crash has taken an interesting twist. A mechanic who checked the plane before its takeoff attempt and a maintenance chief are being investigated and are likely to face manslaughter charges. But how could individuals in those roles have detected malware on the central computing system in question? Did anyone even think of running anti-malware tools on it? Was Spanair's information security staff aware of the risk of malware on this system, and did the staff make recommendations for mitigating that risk? We do not presently know the answers to these questions, but I suspect that in time we will. Meanwhile, the Spanair incident shows just how important it is to anticipate events of this magnitude and to ensure that the processes addressing the associated risks keep them at an acceptable level.