DARPA project probes COTS software testing and hardening

March 1, 1998
By Wilson Dizard III

PITTSBURGH - Mission controllers for the European Space Agency's initial Ariane-5 flight were confident that the rocket's June 1996 liftoff from Kourou, French Guiana, would kick off a successful series of launches for their orbital delivery vehicle. However, moments after the launch a software failure in a floating point-to-integer conversion module in one of the Ariane-5's inertial guidance systems caused the control apparatus to crash and shut down. The rocket's second inertial guidance system took over, but the same software flaw choked it as well.
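The kind of conversion failure described above can be illustrated with a minimal Python sketch. It is not the Ariane flight code: the 16-bit signed target range is an assumption made for illustration, and the function names are hypothetical. It only shows how a guarded narrowing conversion can either report an out-of-range value or clamp it, rather than letting the failure take down the caller.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumed target range (16-bit signed)

def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, raising if the value does not fit."""
    truncated = int(value)  # truncates toward zero
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in 16 bits")
    return truncated

def to_int16_saturating(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping out-of-range values."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))
```

Which behavior is appropriate depends on the system: raising makes the problem visible to the caller, while clamping keeps the computation running with a degraded value.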

Within seconds, the Ariane-5 spun out of its planned flight path, and mission controllers destroyed the rocket before it could crash to earth and cause serious damage. The multimillion-dollar consequences of this failure brought home once again how vulnerable even carefully tested mission-critical systems are to catastrophic software failures.

Software designers who write mission-critical code have worked for many years to determine what conditions bring about catastrophic system crashes. They have found two approaches to the problem: "fault injection," in which a system is tested with conflicting inputs, and traditional software testing, which involves overloading a system until it crashes.

Both approaches are featured in the widely applied public-domain CRASHME software test suite, which has been shown to disable many kinds of operating systems by subjecting them to exceptional inputs and overload conditions.

But the sledgehammer approach of CRASHME and similar tests generates neither detailed measurements nor software metrics that point to the exact reasons why an operating system malfunctions.

Researchers at Carnegie Mellon University's Institute for Complex Engineered Systems in Pittsburgh are working not only to refine automated methods of testing commercial off-the-shelf (COTS) operating systems, but also to build an automated tool that can protect operating systems and other software against crashes.

Philip Koopman is co-principal investigator of the Ballista project, which is named for an ancient catapult-like weapon and involves bombarding operating systems with conflicting inputs until the systems crash.

"Ballista was first funded a year and a half ago, and will run for three years," Koopman says. Officials of the Defense Advanced Research Projects Agency (DARPA) in Arlington, Va., support work by three Carnegie Mellon professors and three student researchers under a grant from their Embeddable Systems program.

Ballista team members chose to test the Posix operating system interface because systems designers widely use it in space and military systems. Also, because the Posix operating systems are mature commercial products, "nobody could say we were testing student code," Koopman quips.

While the Ballista test suite finds Posix flaws, "those bugs are not necessarily bugs in the strict software engineering sense," Koopman cautions. "But we can say that those systems are not robust. I never heard of a bug-free piece of software, but it's nice to have software that reacts gracefully [to flaws and bugs]."

One of the deliverables the Ballista team will provide to DARPA is a World Wide Web page to which other software engineers can submit code for testing by the Ballista suite and determine its robustness.

"We are working on a criterion for one aspect of software which is robustness. I'm pretty sure we'll be able to do that," Koopman says. "What DARPA really wants ... are automated tools to improve COTS software so it won't crash. Ballista is going to be a step closer to that goal in the area of reliability and fault tolerance."

As the Ballista program progresses, programmers will generate an automated robustness-hardening capability for new COTS programs and for older existing code. Hardening begins by probing a software module for its responses to exceptional inputs. Once Ballista identifies the robustness bugs, it automatically creates a software "wrapper" to filter out dangerous inputs.
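The probe-then-wrap idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ballista's implementation: the names `probe`, `make_wrapper`, and `fragile_sqrt` are hypothetical, an uncaught exception stands in for a crash, and the wrapper filters only the exact inputs that failed during probing.

```python
import math

def probe(func, test_inputs):
    """Call func on each exceptional input and collect the ones that fail."""
    bad = []
    for value in test_inputs:
        try:
            func(value)
        except Exception:  # an exception stands in for a crash in this sketch
            bad.append(value)
    return bad

def make_wrapper(func, bad_inputs):
    """Return a version of func that rejects inputs known to be dangerous."""
    def wrapped(value):
        if value in bad_inputs:
            return None  # report an error to the caller instead of calling func
        return func(value)
    return wrapped

def fragile_sqrt(x):
    """Hypothetical module under test: fails on negative or non-numeric input."""
    return math.sqrt(x)

# Probe once with a fixed list of exceptional inputs, then build the wrapper.
bad = probe(fragile_sqrt, [4.0, -1.0, None, "abc", 0.0])
safe_sqrt = make_wrapper(fragile_sqrt, bad)
```

Neither step needs the source code of the function under test, only the ability to call it. A practical wrapper would presumably generalize from the failing inputs (for example, rejecting all negative values) rather than memorize them.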

Program engineers intend the Ballista testing and hardening process to work in cases where machine-parsable software specifications are not available, and where source code for COTS packages is not available. The Ballista method will use an automated approach to save money, which is one of the main motives for military and aerospace project managers choosing COTS software in the first place.

"I see this as a preventative," Koopman says. "Our goal is a fully automatic system. If you have to spend a lot of money on it, it's not cost-effective. The idea is you take a COTS module and run it through Ballista in order to induce a crash or hang."

The Ballista system induces three types of failures: a catastrophic failure, bringing down the entire tested module and requiring a hardware reboot; a restart failure, which requires the resumption of a part of the program; and a limited single-task failure.

Ballista experts intend the software wrappers that their tool will generate to automatically warn the application manager - which in many cases is a software component itself - of the possibility of a crash.

"The key to our work is a higher degree of repeatability [compared to CRASHME testing]," Koopman says. "People have known how to get operating systems to crash for years. But those tests have had a lot of randomness. With Ballista, we can point to the one input that brought the system down."

The Ballista test results also generate detailed metrics. "We can say we ran so many tests and got so many failures," Koopman says. "We've proposed some high level metrics on average failure rates."
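One way to get this kind of repeatability, along with a simple failure-rate figure, is to enumerate test cases in a fixed order rather than at random. The Python sketch below is an illustration under stated assumptions, not the Ballista harness: `run_suite`, `copy_bytes`, and the value pools are hypothetical, and an exception stands in for a crash.

```python
from itertools import product

def run_suite(func, value_pools):
    """Run func on every combination of pooled values, in a fixed order."""
    failures = []
    total = 0
    for args in product(*value_pools):
        total += 1
        try:
            func(*args)
        except Exception as exc:
            # Record the exact input that failed so it can be replayed later.
            failures.append((args, type(exc).__name__))
    rate = len(failures) / total if total else 0.0
    return total, failures, rate

def copy_bytes(buffer, count):
    """Hypothetical function under test: return the first count bytes."""
    if count > len(buffer):
        raise IndexError("count exceeds buffer length")
    return buffer[:count]

pools = [
    [b"", b"abc", None],  # buffer: empty, normal, missing
    [0, 2, 10],           # count: zero, valid, too large
]
total, failures, rate = run_suite(copy_bytes, pools)
```

Because the combinations are generated in the same order on every run, each entry in `failures` names a single input tuple that can be replayed on its own, and `rate` gives the fraction of test cases that failed.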

Koopman cautions that his research project is still more than a year from completion. However, a prototype version of the Ballista software testing web page is already up and running. Koopman encourages interested military and aerospace software managers to test the capability of the Ballista prototype by using the Web-based method for challenging operating systems, which is available at http://www.cs.cmu.edu/afs/cs/project/edrc-ballista/www/index.html.

An Ariane 5 rocket was lost because of a software glitch. DARPA-funded researchers are trying to determine why software crashes happen and how to prevent them.
