Reconnecting with military program requirements for performance and interoperability

May 1, 2005
Switched fabrics for embedded high-performance computing: past, present and future perspectives

By Bernard Pelon

Standards and trade organizations such as the VMEbus International Trade Association (VITA) and the PCI Industrial Computer Manufacturers Group (PICMG) have worked for years to propose true ANSI standards for high-speed serial-switched networks, and in some cases are looking for the holy grail of switched fabrics.

This profusion of offerings does not guarantee that U.S. Department of Defense (DOD) requirements will be easy to meet, or that open-systems solutions will prevail.

There is a real risk, in fact, that many switched fabrics with standards-based physical layers will differentiate themselves only by their proprietary software stack and proprietary application programming interfaces (APIs), thus reestablishing the traditional vendor lock that is part of too many of today’s commercial off-the-shelf (COTS) solutions.

Fabrics have a longer life span than other technologies such as processors, which are replaced every 12 months. Systems designers must consider all the implications of this carefully; the stakes are high in risks and hidden costs, and those who fail to learn from history are bound to repeat its mistakes. Accordingly, we will ask what have we learned, where are we today, and where do we want to be?

A brief history

Switched fabrics for multiprocessing requirements emerged more than 10 years ago. Sky Computers, Mercury Computer Systems, and CSP Inc. (CSPI) were the only companies capable of delivering COTS systems with a switched fabric supporting 64 processors or more. Performance was in practice the only criterion; proprietary solutions were acceptable, and no one really considered open platforms and open standards.

Now Mercury’s Raceway and Sky’s SKYChannel are obsolete technologies, replaced by RapidIO and InfiniBand, respectively. On the other hand, Myrinet, which CSPI implemented more than seven years ago, moves forward, fueled by Myricom’s dominant position in high-performance computing (HPC) cluster technology.

Using a RapidIO switched fabric to build an embedded cluster can present a risk because mature serial implementations of switches and host interfaces are not yet available in silicon. The fact that Motorola PPC AltiVec chips, for example, are not yet available with RapidIO support is a concern for anyone with applications built upon the PPC AltiVec architecture.

By comparison, embedding Myrinet serial technologies in DOD applications leverages an existing and well-established clustering technology. Myrinet uses 16-port switches and network processors to interface with PCI and PCI-X host processors, and has even demonstrated interoperability with Gigabit Ethernet.

InfiniBand solutions from Mellanox fundamentally have the same characteristics as Myrinet. The one noticeable difference is how well the two technologies adapt to different design approaches. InfiniBand defines the Mellanox offering while Myricom, the supplier of Myrinet, delivers clustering technology that can switch many protocols, including Gigabit Ethernet.

Legacy and future evolution

The move from SKYChannel to InfiniBand is a revolution: a clear change from proprietary to open standards. Is moving from Raceway to RapidIO a real evolution toward open standards? Changing the physical layer to RapidIO while keeping the Raceway proprietary software stack maintains backward compatibility but does not address the interoperability of the system in a heterogeneous environment. Therefore, vendor lock may continue, and the only gain is increased performance. We are reminded of the historical “proprietary versus open” debate between the Digital VMS and UNIX operating systems. DOD programs require performance, but not at the expense of flexibility and interoperability.

The Myrinet transition from parallel to serial is an evolution at the physical layer only, without any change to the software stack. The standard APIs that the fabric supports, such as TCP/IP and the Message Passing Interface (MPI), remain unchanged. Because Myrinet is a network, it can evolve and encapsulate other protocols freely. A network processor is key to providing this support while offloading the processing node.

Myrinet behaves much like a high-performance Ethernet, which continuously improves performance (10, 100, and now 1000BaseT with Gigabit Ethernet) without any change to the application layer. The Myrinet switch fabric evolves to switch different protocols as they appear. This requires great attention to two key features: the physical layer interface and the software stack.

The VITA-41 Standard (VXS Backplane) represents an evolution in VMEbus technology to accommodate high-speed-interconnect serial fabrics while preserving the VME legacy. It is now a reality with vendors offering VXS backplanes. Serial chipsets for switched fabrics will readily move in, thanks to the VITA approach of impartial support of all serial technologies.

Embedded clusters

Once designers have a real switched-fabric implementation capable of supporting from four to thousands of nodes, they discover that they need a clustering technology to make it work. Clustering is much more than moving bytes around. It takes the time and practice that HPC developers have invested to understand clustering technology and performance.

Clusters measure themselves not by their raw components at the switch level, but by performance metrics and standard benchmarks. The point-to-point performance of a link is of little significance when what really matters is the collective performance, or throughput, of tens to hundreds of processors. More specifically, what really matters with an embedded cluster is the throughput per cubic foot, per watt.
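The density metric described above can be written down as a short sketch. The class name, field names, and the two example configurations below are hypothetical and purely illustrative; the throughput figure is assumed to come from a collective benchmark run rather than from a link's peak rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterConfig:
    """Hypothetical description of an embedded cluster (illustrative fields)."""
    name: str
    sustained_gflops: float   # collective benchmark throughput, not link peak
    volume_cubic_feet: float  # enclosure volume
    power_watts: float        # total power draw under load


def throughput_density(cfg: ClusterConfig) -> float:
    """Return sustained GFLOPS per cubic foot, per watt."""
    if cfg.volume_cubic_feet <= 0 or cfg.power_watts <= 0:
        raise ValueError("volume and power must be positive")
    return cfg.sustained_gflops / (cfg.volume_cubic_feet * cfg.power_watts)


# Made-up numbers, for illustration only.
compact = ClusterConfig("compact", sustained_gflops=200.0,
                        volume_cubic_feet=2.0, power_watts=500.0)
large = ClusterConfig("large", sustained_gflops=300.0,
                      volume_cubic_feet=4.0, power_watts=1000.0)

# Rank configurations by density rather than by raw throughput.
ranked = sorted([compact, large], key=throughput_density, reverse=True)
```

With these invented figures the smaller system ranks first even though its raw throughput is lower, which is the point of normalizing by volume and power.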

Measuring the effective performance of these systems requires software that can run from one generation of fabric to the next without any change to the application layer. This means standard APIs and stable software stacks. Research centers and commercial users, as well as DOD systems designers, cannot afford to redesign their applications for each fabric, and that is why the migration to MPI (Message Passing Interface) is pervasive. The success that the U.S. Defense Advanced Research Projects Agency (DARPA) had with the MPI initiative and its initial support of Myricom was decisive for the HPC community. Likewise, the Linux momentum continues to have a significant influence on defining an open, scalable platform.
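The portability argument can be illustrated with a minimal sketch in which application code is written once against a stable interface and the fabric underneath is swapped. Everything here is hypothetical: `Fabric`, `FabricGenA`, and `FabricGenB` are invented stand-ins that run in a single process, not real drivers or the MPI API itself. In real MPI code the comparable collective is `MPI_Allreduce`.

```python
from abc import ABC, abstractmethod
from typing import List


class Fabric(ABC):
    """Hypothetical stand-in for a standard message-passing API."""

    @abstractmethod
    def allreduce_sum(self, contributions: List[float]) -> float:
        """Combine one value per node into a single global sum."""


class FabricGenA(Fabric):
    """Illustrative older fabric: values are summed one after another."""

    def allreduce_sum(self, contributions: List[float]) -> float:
        return float(sum(contributions))


class FabricGenB(Fabric):
    """Illustrative newer fabric: values are summed pairwise, as in a tree."""

    def allreduce_sum(self, contributions: List[float]) -> float:
        values = list(contributions)
        while len(values) > 1:
            # Combine neighbours two at a time until one value remains.
            values = [sum(values[i:i + 2]) for i in range(0, len(values), 2)]
        return float(values[0]) if values else 0.0


def global_mean(fabric: Fabric, per_node_values: List[float]) -> float:
    """Application code: depends only on the interface, never on the fabric."""
    if not per_node_values:
        raise ValueError("need at least one node value")
    return fabric.allreduce_sum(per_node_values) / len(per_node_values)


samples = [1.0, 2.0, 3.0, 4.0]
result_a = global_mean(FabricGenA(), samples)
result_b = global_mean(FabricGenB(), samples)
```

The application function is identical for both fabric generations; only the object passed in changes, which is the property a standard API is meant to preserve across hardware upgrades.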

DOD programs can leverage the HPC cluster experience and deploy embedded Linux clusters. This is good news for fabric vendors like Myricom and Mellanox, with technologies that designers can embed on a VXS platform. Embedded HPC applications will benefit from the emergence of a standard open platform, and we can expect to see a welcome decline in legacy and vendor lock issues.

Market forces and dominant players

Behind the battle of standards for switched fabrics, dominant MPU players are competing for market share by covering a large spectrum of commercial applications. Intel, Motorola, IBM, and a few others are working hard to bridge the gap between on-chip performance and chip-to-chip interconnect, power consumption, performance, and scalability.

Intel’s 3GIO and InfiniBand efforts finally converged toward PCI-Express and advanced switching with real silicon in view, which created a de facto standard while carrying forward the PCI and PCI-X legacy. PCI-Express (a full-blown serial technology) is a revolution at the physical layer, with guaranteed insertion in most marketplaces due to its compatibility with PCI and PCI-X.

Motorola’s early RapidIO efforts resulted in an initiative that has moved on to a standardization organization; in essence, a reverse process compared to Intel. Perhaps, at a fundamental level, the shift from parallel to serial specifications delayed the silicon implementation. Why work on the parallel silicon if serial is the only way to the future? As of today, RapidIO remains a standard looking for real serial silicon.

Finally, IBM is engaging with all of these efforts across its broad spectrum of applications. IBM’s solutions will use everything from HyperTransport to RapidIO, InfiniBand, Myrinet, PCI-X, and PCI-Express, as well as its own proprietary bus for the new PPC 970 with AltiVec. IBM shows remarkable flexibility in its offerings, based on a broader market view and historical perspective. For example, IBM blade solutions are already shipping with integrated Myrinet technology for commercial HPC cluster applications.

Designers should not forget “reconfigurable computing with FPGA” and non-von-Neumann compute engines (ASICs, SOCs), which we will place in the “direct-mapped hardware” class of processing. Direct-mapped hardware offers a much higher giga-ops-per-watt ratio than the classic MPU, but at the cost of specialization.

Systems of systems

Switched fabrics with high performance are key, but the real direction is toward networked architectures that can interoperate. DARPA’s vision of “systems of systems” in which embedded systems are not designed, deployed, and used in isolation but rather in a cooperative way will influence the switched fabric.

The Embedded HPC Cluster must use open standards because limiting the focus to “in-the-box” will become less attractive.

The GRID Computing effort in the commercial market is certainly of interest in this respect. Myricom understands this evolution, offering a switched fabric that is agnostic insofar as it can switch Myrinet devices just as well as Gigabit Ethernet devices, and both at the same time. Perhaps a natural evolution is toward 10 Gigabit Ethernet. The good news is that there is an existing market for these emerging technologies, which are directly applicable to DOD program requirements for performance and interoperability.

Bernard Pelon is director of product research at CSP Inc. in Billerica, Mass.
