By Tom Litrenta
The past few years have seen the emergence of multifunctional radar systems that must run sophisticated, computationally intensive algorithms within limited space, often under constraints on power and heat dissipation. Even single-channel radar systems require many processors to keep up with today’s demanding real-time I/O requirements. Thus, multicore processors have become an extremely attractive alternative.
However, raw computational speed is not the only processing consideration. To meet today’s real-time constraints, radar pulse return data is distributed across a collection of processors. At some point in the processing, a fast Fourier transform (FFT) must be performed on column data. However, distributed column data is not contiguous in memory, so a column FFT would be extremely inefficient computationally. Virtually all radar systems therefore perform a distributed matrix transpose, or corner turn, to place the column data in contiguous rows. This distributed corner turn requires a considerable amount of I/O between processors and is often the system bottleneck. Even adding more multicore processors may not improve overall system performance. For instance, the performance of a radar system able to support 128 processors has been seen to saturate at 50 processors if the corner turn is not efficient.
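The corner turn described above can be illustrated with a minimal, single-process Python sketch. It is not drawn from any particular radar system: the function names, the even split of rows and columns across processors, and the tiny 4 x 4 matrix are all assumptions made for illustration, and a real system would move the blocks over the interconnect rather than through Python lists.

```python
# Illustrative simulation of a distributed corner turn (matrix transpose).
# All names are hypothetical; the "processors" are just Python lists.

def distribute_rows(matrix, n_procs):
    """Give each simulated processor a contiguous block of rows (pulses)."""
    block = len(matrix) // n_procs  # assumes the row count divides evenly
    return [matrix[p * block:(p + 1) * block] for p in range(n_procs)]


def corner_turn(local_blocks):
    """Redistribute so each processor holds whole columns, stored as rows."""
    n_procs = len(local_blocks)
    n_cols = len(local_blocks[0][0])
    col_block = n_cols // n_procs  # assumes the column count divides evenly
    turned = []
    for dest in range(n_procs):
        dest_rows = []
        for c in range(dest * col_block, (dest + 1) * col_block):
            # Column c is scattered over every source processor, so the
            # destination must gather one piece from each of them.
            column = [row[c] for src in local_blocks for row in src]
            dest_rows.append(column)
        turned.append(dest_rows)
    return turned


def messages_exchanged(n_procs):
    """In this all-to-all pattern each processor sends to every other one."""
    return n_procs * (n_procs - 1)


if __name__ == "__main__":
    # 4 pulses x 4 range samples; entry = 10 * pulse + sample.
    data = [[10 * r + c for c in range(4)] for r in range(4)]
    before = distribute_rows(data, n_procs=2)
    after = corner_turn(before)
    print("processor 0 before:", before[0])
    print("processor 0 after: ", after[0])
    print("transfers for 2 processors:", messages_exchanged(2))
```

In this simplified model the number of point-to-point transfers grows roughly with the square of the processor count, which is one plausible reason an inefficient corner turn could limit scaling; the sketch does not model bandwidth or latency.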
Today’s embedded radar systems are equipped with modern I/O interconnections, which improve the performance of a distributed corner turn. These I/O paths typically include serial switched fabrics such as PCI Express and Serial RapidIO, both between nodes on a board and between boards in a system.
Modern radar system processing requirements are constantly in a state of flux. Initial design specifications repeatedly change, so the design prototype needs to be scalable. System performance monitoring is also important for meeting real-time requirements. Even at final system implementation, additional design features may be added. Thus, for maximum cost effectiveness, it is desirable that the radar system designer has access to a user-friendly, scalable design interface that also provides real-time performance monitoring. Ideally, such an interface should be graphical and self-documenting.
It is always desirable to have as much processing power as possible for today’s multifunctional radar applications. However, conventional air-cooled systems with 6U boards are typically limited to an average power consumption of approximately 80 to 100 watts per board.
Previously, this power limitation implied that a 6U board could contain only four 1 GHz processors. However, today’s multicore chips can provide more processing power per board. For instance, the MPC8641D dual-core processor from Freescale Semiconductor dissipates a mere 25 watts at 1 GHz, so a 6U board carrying four MPC8641D processors will typically stay within the power budget for an air-cooled system.
The MPC8641D dual-core processor is a PowerPC design with special attention paid to modern I/O requirements. There are two built-in flexible I/O ports. The local interface supports dual 8-lane PCI Express and the fabric interface supports Serial RapidIO. Also present are four Ethernet controllers supporting transfer rates of 10, 100, and 1000 megabits per second. Furthermore, the MPC8641D is code compatible with the previous-generation MPC7448 single-core processor.
It becomes clear that future radar front-end processing capability will be enhanced by approximately a factor of two per board by comparison with the previous generation.
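The power and core-count arithmetic behind this factor of two can be checked with a short Python sketch. The wattage, chip count, and board budget are the figures quoted in the text; the constant and function names are hypothetical, and counting processor dissipation only (ignoring memory, fabric, and other board loads) is an assumption of the sketch.

```python
# Back-of-the-envelope check of the per-board power and core figures.
# Values come from the text; names are illustrative only.

BOARD_BUDGET_WATTS = (80, 100)   # typical air-cooled 6U range
CHIP_WATTS = 25                  # MPC8641D at 1 GHz
CHIPS_PER_BOARD = 4
CORES_PER_CHIP = 2               # MPC8641D is dual core
PREVIOUS_CORES_PER_BOARD = 4     # four single-core 1 GHz processors


def processor_power(chips=CHIPS_PER_BOARD, watts_per_chip=CHIP_WATTS):
    """Total processor dissipation per board (processors only)."""
    return chips * watts_per_chip


def fits_budget(power, budget=BOARD_BUDGET_WATTS):
    """True if the power does not exceed the upper end of the budget."""
    return power <= budget[1]


def core_gain(chips=CHIPS_PER_BOARD, cores_per_chip=CORES_PER_CHIP,
              previous_cores=PREVIOUS_CORES_PER_BOARD):
    """Ratio of cores per board now versus the previous generation."""
    return (chips * cores_per_chip) / previous_cores


if __name__ == "__main__":
    power = processor_power()
    print("processor power per board (W):", power)
    print("within air-cooled budget:", fits_budget(power))
    print("core-count gain per board:", core_gain())
```

Under these assumptions, four chips at 25 watts reach the upper end of the 80 to 100 watt range while doubling the number of 1 GHz cores per board.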
As noted, a radar corner turn can often be the dominant processing consideration. This distributed corner turn requires considerable I/O among all processors in the system, and it is desirable that it not become the system bottleneck. Modern radar systems have therefore improved I/O capabilities between processors at all system levels. In the following graphical display, the numerous paths (channels) connecting the brown and blue circles (tasks) represent the I/O required for a corner turn.
Today, the I/O between processors in a multicore processor is remarkably fast: for an MPC8641D, for example, transfer speeds of approximately 1.5 gigabytes per second have been measured.
Previously, PCI-X was used to transfer data between processors on a 6U board, with typical PCI-X I/O speeds of approximately 530 megabytes per second for a simple application. For a more complicated application, such as a corner turn, real-time speeds approaching 300 megabytes per second have been measured.
Today, 8-lane PCI Express is used, with recorded interprocessor I/O speeds of 1.8 gigabytes per second for a straightforward application.
Finally, StarFabric was previously used to perform I/O between boards. Typically, StarFabric I/O rates were found to be of the order of 330 megabytes per second. Today’s radar systems use either PCI Express or Serial RapidIO. Typically, Serial RapidIO would be chosen for larger systems (more than four boards) to avoid the PCI address-mapping limitations associated with PCI Express.
Thus, modern radar processing systems have experienced a five- to tenfold improvement in interprocessor I/O speed. This improvement will eliminate the corner turn bottleneck and allow for radar systems with a larger number of processors.
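To put the quoted transfer rates side by side, the following Python sketch estimates how long a fixed payload would take at each rate. The rates are those cited above; the 64 MB payload, the decimal convention of 1 gigabyte = 1000 megabytes, and the function names are assumptions chosen for illustration, and protocol overhead and contention are ignored.

```python
# Rough transfer-time comparison using the rates quoted in the text.
# The payload size and all names are hypothetical.

QUOTED_RATES_MB_S = {
    "PCI-X, simple application": 530,
    "PCI-X, corner turn": 300,
    "StarFabric, board to board": 330,
    "MPC8641D, on chip": 1500,
    "PCI Express, simple application": 1800,
}


def transfer_time_ms(payload_mb, rate_mb_s):
    """Ideal time in milliseconds to move payload_mb at rate_mb_s."""
    return 1000.0 * payload_mb / rate_mb_s


def speedup(new_rate_mb_s, old_rate_mb_s):
    """Ratio of two sustained transfer rates."""
    return new_rate_mb_s / old_rate_mb_s


if __name__ == "__main__":
    payload_mb = 64  # assumed payload, not a figure from the text
    for path, rate in QUOTED_RATES_MB_S.items():
        ms = transfer_time_ms(payload_mb, rate)
        print(f"{path:35s} {rate:5d} MB/s  {ms:7.1f} ms")
```

With these figures, the ratio of the PCI Express rate to the StarFabric rate, `speedup(1800, 330)`, is about 5.5, and the ratio to the PCI-X corner-turn rate is 6. These pairings compare different paths and workloads, so they are indicative only.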
It has long been understood that software development and maintenance costs often dominate radar project expenses. Thus it is important to have access to software development and performance analysis tools that simplify development and help reduce these costs. An example of such a tool is the AXIS graphical software suite from GE Fanuc Intelligent Platforms, which simplifies DSP development and thus reduces software costs. It comprises a high-performance signal and vector-processing library; a suite of integrated graphical tools that provide system visualization with the ability to configure, download, run, debug, and monitor a multiprocessor system from an integrated, intuitive GUI environment; and interprocessor communication software that provides high-throughput, low-latency, reconfigurable interconnects that facilitate data transport between tasks, processors, boards, and systems.
With radar systems, as with many other military applications, it is difficult to determine whether the requirement for greater processing capability has driven the development of more powerful hardware and software, or whether radar systems designers have been quick to perceive the opportunities presented by more capable hardware and easier-to-use software. Whichever is the case, it is certainly true that today’s sophisticated radar processing is reliant on three key elements. The first of these is a new generation of multicore processors that allow significantly more computational power per board while remaining within stringent heat and power constraints. The second is the role of serial switched fabrics in allowing enormous improvements to be made in board-to-board data transfers, such that multiprocessor configurations are not only more powerful, but also more flexible. The third is a new generation of powerful yet easy-to-use software tools, now becoming available, that provide important improvements in developer productivity in the face of complexity, growing requirements for scalability, the need to adapt to constantly changing applications, and the absolute necessity of fielding new applications as rapidly as possible.
Tom Litrenta is an engineer with GE Fanuc Intelligent Platforms.