Using High-Level Language to Implement Floating-Point Calculations on FPGAs

High-level languages reduce the complexity of hardware design.

The scientific community is interested in using field-programmable gate arrays (FPGAs) for scientific computations because they can be targeted for specific applications and achieve greater throughput at a lower power cost. However, these gains can usually only be achieved by a user with expert knowledge of hardware design. Therefore, despite improvements in FPGA technology that have allowed their use to become attractive for a wider range of applications, inexperience with hardware design remains a barrier for many.

Figure: Data flow between the Mitrion-C and host programs. Each of the Quad-Data Rate (QDR) memories directly available to the Virtex-II Pro contains 4 MB of space for input/output, for a total of 16 MB of input and output.

High-level languages use a variety of approaches to reduce the complexity of hardware design. In this project, Mitrion-C was used because it was readily available at the Naval Research Laboratory, and because it is a commercial product with fast and effective support services. Mitrion-C makes hardware design more accessible in two ways. First, algorithms are described in the Mitrion-C programming language, which uses “C-like” syntax and structures such as functions and loops. Second, the Mitrion Integrated Development Environment (IDE) packages together a user interface, compiler, and simulator.

In hardware design using a traditional hardware description language (HDL) such as Very High Speed Integrated Circuit HDL (VHDL), both simulation and synthesis are time-consuming and synthesis can often fail, requiring modification of the code. The Mitrion IDE simulates and generates VHDL in one step and also estimates whether a design will fit, based on the target hardware’s limitations. Therefore, as long as there are no syntax errors in the Mitrion code, the VHDL synthesis will most likely be successful, with the exception of cases where resource consumption exceeds the resources of the FPGA by a very small margin. One downside of using a high-level language is that the hardware designer loses a level of control. Although Mitrion-C offers explicit options for pipelining, how it achieves its optimizations is opaque to the user.

Simulating the interaction of a ray of light with an optical element, assuming that the element is a conic surface, requires several calculations. This project looked at two in particular: the intersection point of a ray with an element, and the vector normal to the element’s surface at the point of intersection.
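
As an illustration of these two calculations, the sketch below assumes the conic is written in the common vertex-at-origin implicit form c(x² + y² + (1 + k)z²) − 2z = 0, where c is the vertex curvature and k is the conic constant. That form, the function names, and the rule of taking the smallest positive root are assumptions made for this example; the brief does not state which formulation the authors used. Substituting the ray o + t·d into the implicit function gives a quadratic in t, and the surface normal is the normalized gradient of the same function.

```python
import math

def conic_f(p, c, k):
    """Assumed implicit conic: F(p) = c*(x^2 + y^2 + (1+k)*z^2) - 2*z."""
    x, y, z = p
    return c * (x * x + y * y + (1.0 + k) * z * z) - 2.0 * z

def intersect_ray_conic(o, d, c, k, eps=1e-12):
    """Return the smallest positive t with F(o + t*d) = 0, or None on a miss."""
    ox, oy, oz = o
    dx, dy, dz = d
    q = 1.0 + k
    # Coefficients of a*t^2 + b*t + c0 = 0 after substituting the ray into F.
    a = c * (dx * dx + dy * dy + q * dz * dz)
    b = 2.0 * c * (ox * dx + oy * dy + q * oz * dz) - 2.0 * dz
    c0 = conic_f(o, c, k)
    if abs(a) < eps:
        # Linear case, e.g. a plane (c = 0) or an axial ray on a paraboloid.
        if abs(b) < eps:
            return None
        roots = [-c0 / b]
    else:
        disc = b * b - 4.0 * a * c0
        if disc < 0.0:
            return None  # the ray misses the surface
        s = math.sqrt(disc)
        roots = [(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)]
    hits = [t for t in roots if t > eps]
    return min(hits) if hits else None

def point_on_ray(o, d, t):
    """Evaluate o + t*d component by component."""
    return tuple(oi + t * di for oi, di in zip(o, d))

def conic_normal(p, c, k):
    """Unit normal at p: the normalized gradient of F."""
    x, y, z = p
    gx = 2.0 * c * x
    gy = 2.0 * c * y
    gz = 2.0 * c * (1.0 + k) * z - 2.0
    n = math.sqrt(gx * gx + gy * gy + gz * gz)
    return (gx / n, gy / n, gz / n)

# Example: an axial ray striking a sphere of radius 10 (c = 0.1, k = 0).
t = intersect_ray_conic((0.0, 0.0, -1.0), (0.0, 0.0, 1.0), 0.1, 0.0)
hit = point_on_ray((0.0, 0.0, -1.0), (0.0, 0.0, 1.0), t)
normal = conic_normal(hit, 0.1, 0.0)
```

In the example, the ray meets the sphere at its vertex, so t is approximately 1 and the computed normal points along the negative z axis. Both routines reduce to a fixed sequence of floating-point multiplies, adds, a square root, and divides, which is the kind of workload the rest of this brief maps onto the FPGA.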

Mitrion-C version 1.4 was used to implement the two calculations. Each of the Quad-Data Rate (QDR) memories directly available to the Virtex-II Pro contains 4 MB of space for input/output, for a total of 16 MB of input and output. Since many scientific applications require more than 16 MB of input and output, a host program is needed to marshal data between the FPGA’s memory and the host memory on the same compute node.

The host program was written using the American National Standards Institute’s standard for C (ANSI-C), and run on one of the Advanced Micro Devices (AMD) Opteron 275 processors on the same compute node as the FPGA. The Cray XD1 supercomputer used in this project uses an interconnect system that allows data transfer between the FPGA and host RAM at a rate of 3.2 GB/s. Mitrion-C uses the full bandwidth provided by Cray.

In the host program, each of the FPGA’s QDR memories is treated as an array. The host program loads values into the arrays, sends the FPGA a start signal using a function provided by Mitrionics, and reads the results after it receives a done signal back from the FPGA.
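
The load, start, wait, and read sequence described above can be sketched as follows. The actual host program was written in ANSI C against functions supplied by Mitrionics, which this brief does not name; here `FakeFpga`, its methods, and the batch size are hypothetical stand-ins so that the control flow can run without hardware. The stand-in's "computation" simply doubles each value.

```python
WORDS_PER_BATCH = 4  # hypothetical; a real QDR memory holds far more 64-bit words

class FakeFpga:
    """Hypothetical software stand-in for the FPGA and one input/output memory pair."""

    def __init__(self, capacity):
        self.in_mem = [0.0] * capacity   # memory the host writes inputs into
        self.out_mem = [0.0] * capacity  # memory the host reads results from
        self.done = False

    def start(self, count):
        # Placeholder for the vendor-provided start call; the "design" runs at once.
        for i in range(count):
            self.out_mem[i] = 2.0 * self.in_mem[i]
        self.done = True

    def wait_done(self):
        # Placeholder for blocking until the FPGA raises its done signal.
        if not self.done:
            raise RuntimeError("FPGA was never started")
        self.done = False

def run_batches(fpga, data, capacity=WORDS_PER_BATCH):
    """Feed data through the device one memory-sized batch at a time."""
    results = []
    for base in range(0, len(data), capacity):
        batch = data[base:base + capacity]
        fpga.in_mem[:len(batch)] = batch           # load values into the array
        fpga.start(len(batch))                     # send the start signal
        fpga.wait_done()                           # wait for the done signal
        results.extend(fpga.out_mem[:len(batch)])  # read the results back
    return results

outputs = run_batches(FakeFpga(WORDS_PER_BATCH), [float(i) for i in range(1, 11)])
```

Batching in this way is what lets a data set larger than the FPGA's memories be processed: the host repeats the same four steps until the input is exhausted.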

The Mitrion-C program was split into three functions that: 1) read the inputs from QDR memory, 2) performed floating-point calculations, and 3) wrote the results to a different QDR memory. Data was stored in a list data structure and the program was run in a foreach loop. This combination explicitly instructs the Mitrion compiler to automatically pipeline the design.

As a benchmark, the performance of the Mitrion-C implementations of the ray-intersection and normal-vector calculations was compared with that of ANSI-C programs. Each of the 4 MB memories available to the Virtex-II Pro has a bitwidth of 64 bits. Although all four of the FPGA’s memories were used for input, two of the memories had to be used for output as well. Mitrion-C provides memory synchronization commands that enable bidirectional use of the FPGA’s memories with no effect on throughput.

As mentioned before, the maximum bandwidth of the interconnect between the FPGA’s QDR memories and host memory is 3.2 GB/s, so each of the four QDR memories accounts for 800 MB/s of that total. Since each FPGA memory can read or write 64 bits (8 bytes) every clock cycle, the 100-MHz clock used by Mitrion yields 8 bytes × 100 MHz = 800 MB/s per memory, which matches the maximum bandwidth of the memories.

Measurements confirmed that a throughput very near the limit of the memories (799.04 MB/s in the case of the normal-vector calculation) could be maintained over a large sample of data. Mitrion-C is a straightforward way to achieve the maximum throughput allowed by the memory bandwidth, provided that the intended design fits on the target FPGA.
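
The bandwidth figures quoted above follow from simple arithmetic, reproduced here as a quick check. The variable names are illustrative, MB is taken to mean 10^6 bytes (the convention under which 8 bytes at 100 MHz gives exactly 800 MB/s), and the measured figure is assumed to be a per-memory rate, since the text compares it with the limit of the memories.

```python
BYTES_PER_WORD = 64 // 8   # each QDR memory is 64 bits wide
CLOCK_HZ = 100_000_000     # 100-MHz clock used by Mitrion
NUM_MEMORIES = 4           # QDR memories attached to the FPGA

# Peak rate of one memory: one 8-byte word per clock cycle, in MB/s.
per_memory_mb_s = BYTES_PER_WORD * CLOCK_HZ / 1e6

# All four memories together, in GB/s, for comparison with the interconnect limit.
aggregate_gb_s = per_memory_mb_s * NUM_MEMORIES / 1e3

# Reported throughput of the normal-vector calculation, in MB/s.
measured_mb_s = 799.04
efficiency = measured_mb_s / per_memory_mb_s

print(per_memory_mb_s, aggregate_gb_s, round(100 * efficiency, 2))
```

Under these assumptions, the reported 799.04 MB/s corresponds to roughly 99.9% of the per-memory peak.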

This work was done by Kevin K. Liu, Charles B. Cameron, and Antal A. Sarkady of the US Naval Academy. NRL-0057



This Brief includes a Technical Support Package (TSP).
Using High-Level Language to Implement Floating-Point Calculations on FPGAs (reference NRL-0057) is currently available for download from the TSP library.


This article first appeared in the April 2012 issue of Defense Tech Briefs Magazine (Vol. 6 No. 2).



Overview

The document discusses the implementation of floating-point arithmetic on a Cray XD1 supercomputer using the high-level programming language Mitrion-C. It highlights the growing interest in Field-Programmable Gate Arrays (FPGAs) within the high-performance computing (HPC) community due to their potential for lower power consumption and higher throughput compared to traditional processors.

The authors emphasize the challenges associated with creating hardware designs for FPGAs, particularly as technology advances and complexity increases. Mitrion-C is presented as a solution that simplifies the hardware design process. It features a "C-like" syntax and structures, making it more accessible for developers familiar with traditional programming languages. The Mitrion Integrated Development Environment (IDE) further enhances usability by combining a user interface, compiler, and simulator, allowing for efficient simulation and synthesis of designs.

The document outlines the performance metrics achieved using Mitrion-C, including resource consumption, throughput, and power consumption. The authors found that Mitrion-C could achieve the maximum theoretical throughput allowed by memory bandwidth in cases where designs fit within the FPGA's resource limits. However, they noted that low-level programming might still be necessary to optimize trade-offs between throughput and resource consumption.

The study also discusses specific implementation details, such as the use of QDR memories and the impact of memory bandwidth on throughput. The authors observed that the normal-vector calculation consumed a significant portion of FPGA resources, indicating the need for careful resource management when designing applications. They concluded that while Mitrion-C requires an initial investment of time to learn, it significantly reduces the overall time spent in the hardware design cycle.

In terms of power consumption, the document reports that keeping the FPGAs powered incurs a constant increase of approximately 30%, but that using them for processing adds only about 3% more power compared to traditional sequential processors. This finding underscores the efficiency of FPGAs when utilized effectively.

Overall, the document advocates for the use of Mitrion-C as a valuable tool for harnessing the computational power of FPGAs, particularly for applications that do not exceed the resource limits of the target hardware. The authors recommend Mitrion-C for its ability to streamline the design process while delivering high performance in floating-point operations.