Securing your system when the roof is on fire with High Availability Clusters

Oct. 17, 2016

Fault-tolerant systems are defined by their ability to continue operating when a component fails. Essentially, they must keep processing data no matter the situation (even if the system is on fire). So how do we ensure that data processing continues?

System developers must duplicate the hardware for every critical component of a system and teach the software to reroute the data flow to the alternate hardware once a failure is detected. This brings several challenges, including ensuring that the software reacts only when needed and that it successfully transfers operations to the duplicate hardware.
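
The "react only when needed" challenge can be made concrete with a small sketch. It is not taken from the white paper: the class name `FailoverMonitor`, the `is_primary_healthy` probe, and the threshold of three consecutive misses are all assumptions chosen for illustration. The idea is that a single missed health check is treated as a transient glitch, and traffic moves to the standby only after several misses in a row.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailoverMonitor:
    """Routes data to a primary unit and switches to a standby
    after repeated failed health checks (illustrative sketch only)."""

    is_primary_healthy: Callable[[], bool]  # hypothetical health probe
    miss_threshold: int = 3                 # assumed: consecutive misses before failover
    active: str = "primary"
    misses: int = 0

    def poll(self) -> str:
        """Run one health check and return the unit that should receive data."""
        if self.active == "standby":
            # Already failed over; stay on the standby until someone intervenes.
            return self.active
        if self.is_primary_healthy():
            self.misses = 0  # one good check clears any transient glitch
        else:
            self.misses += 1
            if self.misses >= self.miss_threshold:
                self.active = "standby"
        return self.active


def demo():
    # Simulated probe results: one transient glitch, then a real failure.
    results = iter([True, False, True, False, False, False, True])
    monitor = FailoverMonitor(is_primary_healthy=lambda: next(results))
    return [monitor.poll() for _ in range(7)]


if __name__ == "__main__":
    print(demo())
```

In the simulated run, the isolated miss on the second check does not trigger a switch; only the three consecutive misses do. A real system would also have to confirm that the standby actually took over, which this sketch leaves out.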

In a High Performance Embedded Computing (HPEC) cluster, there are compute nodes and a cluster manager, also known as the head node. The head node is the connection between the HPEC cluster and the external network. It controls all other devices and eases the administration of the compute nodes. Because the cluster manager provisions the compute nodes, replacing one after a hardware failure is simple. This decreases the risk of errors and allows for a confident node replacement even when the rest of the system may be failing.

While the head node offers a secure and reliable solution during a hardware failure, the downside is that the head node itself is a single point of failure for the entire system.
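
A little arithmetic shows why a single point of failure matters. The sketch below uses the standard availability formulas for parts in series (all must be up) and for redundant copies (any one suffices), and it assumes failures are independent. The figures 0.99 and 0.9999 are hypothetical values picked for illustration, not measurements from any real cluster.

```python
def series_availability(*parts: float) -> float:
    """Availability of a chain in which every part must be up."""
    total = 1.0
    for availability in parts:
        total *= availability
    return total


def redundant_availability(availability: float, copies: int = 2) -> float:
    """Availability of `copies` independent units when any one of them suffices."""
    return 1.0 - (1.0 - availability) ** copies


# Hypothetical figures, for illustration only.
head = 0.99            # assumed availability of one head node
compute_pool = 0.9999  # assumed availability of the compute nodes as a group

single_head = series_availability(head, compute_pool)
paired_head = series_availability(redundant_availability(head), compute_pool)

if __name__ == "__main__":
    print(f"one head node:  {single_head:.6f}")
    print(f"two head nodes: {paired_head:.6f}")
```

With one head node, the cluster can never be more available than that node, however reliable the compute nodes are. Under these assumptions, pairing the head node removes it as the limiting factor.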

What is the solution? A high availability setup derived from the HPC world. Download the white paper HPEC: High Availability by Design to learn more about:

  • High Availability clusters
  • Fault Tolerance Software
  • HPC applications for HPEC
  • Cluster Managers
  • The STONITH process

About the Author

Tammy Carter | Senior Product Manager – OpenHPEC

Tammy Carter is the Senior Product Manager for OpenHPEC products for Curtiss-Wright Defense Solutions, based in Ashburn, Virginia. She has over 20 years of experience in designing, developing, and integrating real-time embedded systems in the Defense, Communications, and Medical arenas, and an M.S. in Computer Science.
