Test and Evaluation Uses of Heterogeneous Computing

General Purpose Graphics Processing Units (GPGPUs) offer power, space, cooling, and maintenance benefits.

Test and evaluation of current systems is a time-consuming process that reflects both the intricacies of the object under test and the range of equipment, personnel, and environments required. Many argue that this process consumes far too much of the time it takes to put new systems into the hands of warfighters, and expends too many resources with little evident benefit to those in combat. One approach to ameliorating these costs and delays is the increased use of computer simulations, ranging from agent-based models of battle spaces, to Mechanical Computer-Aided Engineering (MCAE) analyses of hardware, to computational fluid dynamics simulations used to assess everything from new airframes to the dispersion of chemical and biological agents.

One potential approach to reducing costs, time to roll-out, and physical danger, all while improving validity, transparency, and utility, is to adopt the strategy of heterogeneous computing. Heterogeneous computing is the use of a variety of different types of computational units to aid the central processing unit (CPU), such as accelerators like General Purpose Graphics Processing Units (GPGPUs), field programmable gate arrays (FPGAs), and digital signal processors.

The technique presented here is to use GPGPUs to handle computationally intensive activity "spikes" effectively. Three specific aspects of GPGPU use are presented: the hurdles and opportunities encountered in code design and development; codes modified in several areas of computational science; and performance results, expressed as floating-point operations per second (FLOPS) per watt, across a range of software and hardware configurations.

To better analyze potential test and evaluation use, this method was implemented for forces modeling and simulation, using existing DOD simulation codes running on advanced Linux clusters. GPGPU experiments were first conducted on a more manageable code set to ease the programming burden and hasten the results; Basic Linear Algebra Subprogram (BLAS) routines were seen as appropriate candidates. The arithmetic kernel of an MCAE "crash code" was used as the vehicle for a basic demonstration problem. This preliminary characterization of GPU acceleration focused on a subset of the large space of numerical algorithms, in this case factoring large sparse symmetric indefinite matrices.
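The brief does not include source code, but the kind of offload it describes, accelerating the dense arithmetic at the heart of a sparse factorization by calling BLAS routines on the GPU, can be sketched as follows. This is a minimal illustration assuming NVIDIA's cuBLAS library; the routine name update_frontal_matrix, the matrix shapes, and the Schur-complement-style update C = C - A*B^T are illustrative placeholders, not taken from the actual crash code. Error checking is omitted for brevity.

    /* Sketch: offload a dense GEMM update, as used inside a multifrontal
     * sparse factorization, to the GPU via cuBLAS (hypothetical example). */
    #include <cuda_runtime.h>
    #include <cublas_v2.h>

    static void update_frontal_matrix(const float *A, const float *B, float *C,
                                      int m, int n, int k)
    {
        float *dA, *dB, *dC;
        const float alpha = -1.0f, beta = 1.0f;   /* C <- C - A * B^T */
        cublasHandle_t handle;

        /* Allocate device buffers and copy the column-major operands over. */
        cudaMalloc((void **)&dA, (size_t)m * k * sizeof(float));
        cudaMalloc((void **)&dB, (size_t)n * k * sizeof(float));
        cudaMalloc((void **)&dC, (size_t)m * n * sizeof(float));
        cudaMemcpy(dA, A, (size_t)m * k * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(dB, B, (size_t)n * k * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(dC, C, (size_t)m * n * sizeof(float), cudaMemcpyHostToDevice);

        /* Perform the rank-k update on the GPU: C = beta*C + alpha * A * B^T. */
        cublasCreate(&handle);
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, m, n, k,
                    &alpha, dA, m, dB, n, &beta, dC, m);
        cublasDestroy(handle);

        /* Return the updated frontal matrix to the host and clean up. */
        cudaMemcpy(C, dC, (size_t)m * n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
    }

In practice, the host-device transfers shown here dominate unless the frontal matrices are large, which is one reason the demonstration focused on factoring large sparse systems rather than small ones.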

Given the great deal of interest in GPGPU acceleration, the following work, while necessarily preliminary because the devices on offer are still evolving rapidly, may prove useful to those facing power issues today. In any case, these analyses support the proposition that GPGPUs are a viable method of reducing power consumption per unit of computation (quantified here as FLOPS).

One can examine the extra power requirement of a system first at the maximum specified power drain, then at high computational loads, then at idle, and finally with the GPGPU card removed from the node. Three NVIDIA GPUs were tested: the 8800, 9400, and 9800. In each case, the host was chosen to best complement the GPU itself, so a different platform was used in every instance. This follows necessarily from the choice of target GPUs; forcing them all onto a single platform would have introduced its own compromises.

A Radio Shack Model 22-602 AC ammeter probe was used to measure current flow to the entire node. In each case, the current drawn by the node under test was measured, within the accuracy of the meter, while exercising the GPU (a) to the maximum extent feasible, (b) at idle while running, (c) in a sleep or hibernate mode, and (d) finally, with the subject card removed.

Care should be exercised when extrapolating these figures to the actual amperages that would be experienced in different computational environments or with different analytic tools. The meter could be relied upon to return consistent comparative figures, but the absolute numbers might be off by a significant fraction. Test and retest numbers were very stable, giving some assurance that the comparative values were meaningful. These data indicate that the entire node draws on the order of 50 percent more power at full load, and that the GPGPU adds on the order of 15 to 20 percent to power consumption even at rest, assuming one GPGPU card per processor.
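As a concrete illustration of the comparison being made, the sketch below converts measured line current into watts and then into GFLOPS per watt for the node with and without the GPGPU card. All numbers (line voltage, amperages, sustained GFLOPS) are hypothetical placeholders, not the measured values from this study.

    /* Sketch: FLOPS-per-watt comparison from ammeter readings (hypothetical data). */
    #include <stdio.h>

    int main(void)
    {
        const double line_voltage     = 120.0;  /* assumed US wall voltage        */
        const double amps_full_load   = 3.0;    /* hypothetical: node + GPGPU busy */
        const double amps_card_removed = 2.0;   /* hypothetical: node alone, busy  */
        const double gflops_with_gpu  = 60.0;   /* hypothetical sustained GFLOPS   */
        const double gflops_cpu_only  = 10.0;

        double watts_gpu = amps_full_load * line_voltage;
        double watts_cpu = amps_card_removed * line_voltage;

        printf("GPU node: %.1f W, %.2f GFLOPS/W\n",
               watts_gpu, gflops_with_gpu / watts_gpu);
        printf("CPU only: %.1f W, %.2f GFLOPS/W\n",
               watts_cpu, gflops_cpu_only / watts_cpu);
        return 0;
    }

Because the meter is more trustworthy for relative than absolute readings, the ratio of the two GFLOPS-per-watt figures is the meaningful output here, not either number on its own.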

This work was done by Dan M. Davis, Gene Wagenbreth, and Robert F. Lucas of the Information Sciences Institute at the University of Southern California; and Paul C. Gregory of Lockheed Martin for the Air Force Research Laboratory. AFRL-0199



This Brief includes a Technical Support Package (TSP). "Test and Evaluation Uses of Heterogeneous Computing" (reference AFRL-0199) is currently available for download from the TSP library.





This article first appeared in the August 2011 issue of Defense Tech Briefs Magazine (Vol. 5 No. 4).



Overview

The document titled "The Test and Evaluation Uses of Heterogeneous Computing: GPGPUs and Other Approaches" discusses the integration of General Purpose Graphics Processing Units (GPGPUs) and other heterogeneous computing technologies in the test and evaluation (T&E) community. It emphasizes the growing need for efficient computational power while managing energy consumption, particularly in military and defense applications.

The authors highlight that modern processors consume significant energy primarily in supplying instructions and data to the functional units. As interconnect scaling lags behind advances in logic, interconnects account for a substantial portion of energy use, over 70% in some cases. The ELM architecture is introduced as a solution, aiming to execute real-time, computationally intensive tasks while minimizing power usage. This architecture replaces fixed-function hardware with software, allowing for more cost-effective updates and power savings.

The document also addresses the challenges faced in T&E settings, where energy conservation is critical due to varying costs and environmental constraints. It notes that electronics, particularly microcircuitry, are sensitive to heat and energy limitations, necessitating innovative approaches to maintain performance without excessive power consumption.

The authors explore the use of GPGPUs to handle computational spikes effectively, detailing their experiences with code development and modifications in computational science. They report on the potential for significant power reductions—up to two orders of magnitude for individual operations—through the use of tiled architectures and efficient instruction management.

Additionally, the document discusses IBM's Blue Gene initiative, which integrates essential subsystems on a single chip, achieving low power dissipation (around 17 W per node). This design allows for the installation of numerous compute nodes within standard power and cooling limits, enhancing overall performance metrics such as FLOPS per watt.

In conclusion, the authors assert that heterogeneous computing, particularly through GPGPUs, presents numerous advantages for the T&E community. They advocate for further exploration and adoption of these technologies to meet the increasing demands for computational power while addressing the critical issue of energy consumption. The document serves as a call to action for researchers and practitioners to leverage these advancements for improved efficiency in defense systems and simulations.