FPGA frameworks are necessary to provide true commercial off-the-shelf solutions
By Jeff Milrod
In a perfect world, a commercial off-the-shelf (COTS) signal processing board would have a single “magic” processor with infinite throughput and computational performance, zero latency and power consumption, and ease of programming in high-level languages. We do not live in a perfect world, however, and signal processing challenges are becoming more complex and widespread, so designers have implemented a wide variety of COTS approaches that leverage existing real-world compute engines, all with varying compromises to the ideal solution.
General-purpose processors (GPPs), on the one hand, have tremendous performance along with well-established development tools and environments that make them relatively easy to program. On the other hand, these devices fall short when it comes to optimizing throughput, latency, or power.
Dedicated digital signal processors (DSPs) are superior to GPPs at optimizing for throughput, latency, and power, yet DSPs lack the raw computational performance and mature development tools of GPPs, often forcing multiprocessor solutions that are notoriously hard to develop for.
Recently, FPGAs have been brought to bear on the signal processing world as an adjunct, or even an alternative, to GPPs and DSPs. Clearly, FPGAs have tremendous advantages over standard processors, since throughput, computational performance, and latency can be optimally tuned in hardware, and recent generations, such as the Stratix III from Altera, have greatly improved power consumption. Even the traditional “Achilles’ heel” of using FPGAs for signal processing, the development environment and its associated complexities, has been minimized via tremendous improvements in vendor and third-party tools.
There is one major drawback to the use of FPGAs, however, that the argument above neglects: while an FPGA can perform impressive processing, it is not a processor, and therefore has no internal architecture, instruction set, data paths, or peripheral set. In fact, an FPGA simply provides the raw materials and components that enable, and require, a user to create everything from scratch. While that open-ended potential is understandably attractive to the engineer trying to find creative solutions to difficult signal processing problems, this “dirty little secret” of FPGAs often unwittingly undermines the whole purpose of COTS.
The advantages of COTS are well understood, but its basic goals of reducing development costs, risks, and time-to-deployment are severely compromised if users must create their own DMA/memory interface or host/control interface for the FPGA in addition to the specific algorithmic work for the application at hand. Yes, the PCB design is done, implemented, and proven, but now the FPGA pin-out is locked down, and the designer is given a generic set of cores that may or may not do what is needed, and may or may not work on that board. It is fair to say that this low-level FPGA development effort (not counting specific algorithmic work) is as big as, or bigger than, the schematic and printed circuit board development.
Therefore, if FPGAs are included as part of the signal-processing chain, the concept of COTS signal processing must be redefined. It must be more than just chips on a board, and must include:
- an FPGA board “framework” that provides fully validated board-level interfaces for I/O, communications, and memory;
- an internal data-flow interconnect fabric that enables the framework modules to be easily connected; and
- a control fabric that enables those modules to be easily coordinated and controlled.
When properly implemented, an FPGA framework creates a stable, high-performance signal processing platform that, as COTS intended, frees the user to focus on application development rather than reinventing board-level infrastructure.
Of course, the details of any specific FPGA framework will be driven by the board architecture and design. Consider a basic concept: the COTS framework modules provide the board-level interfacing and control. This kind of COTS solution includes memory interfaces with DMA engines, inter-processor communications, standard I/O interfaces, and host/control interfacing. The user only needs to develop the modules that add unique value for a given application, including processing and/or non-standard I/O.
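This division of labor can be illustrated with a minimal conceptual sketch. The sketch below is in Python rather than an HDL, and all module names (`MemoryInterface`, `UserFilter`, `run_chain`) are hypothetical, invented for illustration; it models only the idea that vendor-validated interface modules are reused as-is while the user writes just the value-adding processing module.

```python
# Conceptual model of the framework concept: vendor-supplied interface
# modules are fixed and validated; the user supplies only the processing
# module. All names here are hypothetical, for illustration only.

class FrameworkModule:
    """Base class for any module attached to the framework."""
    def process(self, data):
        return data  # interface modules pass data through unchanged

class MemoryInterface(FrameworkModule):
    """Vendor-validated memory interface modeled as a DMA-style buffer."""
    def __init__(self):
        self.buffer = []
    def process(self, data):
        self.buffer.append(list(data))  # model a DMA write to on-board memory
        return data

class UserFilter(FrameworkModule):
    """User-developed module: the only part written from scratch."""
    def __init__(self, gain):
        self.gain = gain
    def process(self, data):
        return [x * self.gain for x in data]

def run_chain(modules, data):
    """Push one data block through a sequential chain of modules."""
    for m in modules:
        data = m.process(data)
    return data

mem = MemoryInterface()
print(run_chain([mem, UserFilter(gain=2)], [1, 2, 3]))  # -> [2, 4, 6]
```

The point of the sketch is the asymmetry: `MemoryInterface` ships with the framework and is never touched, while `UserFilter` is the only code the application developer writes.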
A well-designed FPGA framework for signal processing will also separate the data-flow fabric from the command-and-control fabric. Since most signal processing applications require hard real-time determinism, mixing data and control on a single fabric can be disastrous: a burst of control traffic can stall the data path and break its timing guarantees.
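The separation of the two fabrics can be modeled with a small sketch. This is a hypothetical Python illustration, not any vendor's API: control transactions go to a memory-mapped register file on one path, while data blocks stream on a separate path that only reads the resulting settings.

```python
# Illustrative model of separate control and data fabrics: control writes
# land in memory-mapped registers; data streams on an independent path
# and never contends with control traffic. Names are invented.

class ControlFabric:
    """Memory-mapped command/control registers (e.g. gain, enable)."""
    def __init__(self):
        self.registers = {"gain": 1, "enable": 1}
    def write(self, reg, value):
        self.registers[reg] = value  # a control transaction

class DataFabric:
    """Streaming data path; consults settings but carries no control traffic."""
    def __init__(self, ctrl):
        self.ctrl = ctrl
    def stream(self, block):
        if not self.ctrl.registers["enable"]:
            return []
        gain = self.ctrl.registers["gain"]
        return [x * gain for x in block]

ctrl = ControlFabric()
data = DataFabric(ctrl)
ctrl.write("gain", 3)       # control write on its own fabric
print(data.stream([1, 2]))  # -> [3, 6]
```

Because the two classes share nothing but the register values, control activity cannot block or reorder the data stream, which is the determinism argument the framework separation is meant to preserve.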
These fabrics can be implemented using either proprietary or standard interconnects. Proprietary fabrics can be convenient for the COTS vendor, whose designers can optimize for their specific architecture and approach and leverage existing modules. For users, however, developing IP modules for a proprietary fabric can be difficult and cumbersome; even when the fabric is well designed and documented, it carries a non-trivial learning curve and a single point of support.
Alternatively, standard data and control fabrics are emerging that are supported by much broader communities. Current examples include the open-source OCP and Altera’s new Avalon, which provides a memory-mapped fabric for command and control and a streaming fabric for data flow. Avalon also has the unique advantage of being integrated into Altera’s Quartus II tools, further reducing the user’s development time and expanding the pool of available IP modules.
Similarly, the data-flow fabric can be implemented with buses and multiplexers to create a traditional in-out sequential data flow, or with a central switch, or switches, to create a more networked style of data flow. In the networked approach, data can flow from any module to any or all other modules, so users can add processing or I/O modules easily without having to work out and lock down fixed data flows at the architectural level. Using the switch, data-flow paths can be defined on the fly, so that any given processing module can be routed as a pre-processor, post-processor, or co-processor, either in parallel or sequentially pipelined.
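The switched approach described above can be sketched as a routing table that is changed at run time. Again this is a hypothetical Python model, not any vendor's fabric: the `Switch` class and module names are invented to show how a path can be redefined on the fly without rewiring the attached modules.

```python
# Model of a switched ("networked") data-flow fabric: a central switch
# holds a routing table that can be rewritten at run time, so any module
# can serve as a pre-, post-, or in-line processor. Names are invented.

class Switch:
    def __init__(self):
        self.modules = {}  # name -> callable processing module
        self.route = []    # ordered module names defining the data path

    def attach(self, name, module):
        self.modules[name] = module

    def set_route(self, names):
        self.route = list(names)  # redefine the data path on the fly

    def send(self, data):
        for name in self.route:
            data = self.modules[name](data)
        return data

sw = Switch()
sw.attach("io_in", lambda d: d)                   # framework I/O module
sw.attach("user", lambda d: [x + 1 for x in d])   # user processing module
sw.attach("io_out", lambda d: d)

sw.set_route(["io_in", "user", "io_out"])  # user module in the stream
print(sw.send([1, 2]))                     # -> [2, 3]

sw.set_route(["io_in", "io_out"])          # bypass: reroute, no rewiring
print(sw.send([1, 2]))                     # -> [1, 2]
```

Inserting, removing, or reordering a module is a one-line change to the routing table, which is the architectural flexibility the networked fabric is claimed to provide.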
Delivery of a COTS FPGA framework generally consists of a development kit of modules and a simulation environment, along with preconfigured framework implementation examples. The user can first design, compile, and simulate using the framework as delivered, and then develop and “bolt on” application-specific IP modules. If desired, the designer can even edit, modify, or re-create the existing framework modules, but that effort is only required in special circumstances, much like modifying COTS hardware.
Several COTS vendors are now moving to this FPGA framework concept, but terminology has not converged at this early stage and can vary widely from vendor to vendor, as can the comprehensiveness and sophistication of the frameworks themselves. One example of an existing COTS solution is BittWare’s ATLANTiS FPGA Framework, initially released in 2005, which supports several boards in different formats. The current incarnation primarily targets signal processing boards based on a hybrid architecture combining Analog Devices’ TigerSHARC DSPs and Altera’s Stratix II GX FPGAs.
Signal processing is hard. The ultimate goal of COTS is to reduce the barrier to deployed solutions, not just to provide raw material that can be used to build them. As FPGAs become ever more attractive as signal processing engines, COTS vendors must add more value to the FPGA with frameworks that provide stable, synthesizable board-level interfaces and on-chip communication structures that users can easily leverage. The COTS market must become more knowledgeable and sophisticated about this reality and redefine the concept and expectations of COTS signal processing.
Jeff Milrod is president and chief executive officer of BittWare Inc. in Concord, N.H.