Quantifying Generative AI in Defense Applications
Defense Advanced Research Projects Agency (DARPA)
Arlington, VA
703-526-6630
www.darpa.mil

Relying on credible, concrete information is essential in high-stakes decision-making. So, how can society be sure generative artificial intelligence (AI) will be safe and effective for such applications?
Over the past century, one of humanity's most significant innovations has been the ability to move people and things quickly over large scales. Everything from bridges to jets and rockets uses mathematical foundations to understand the physical world and reliably build these systems and structures.
Yet, as society catapults into an era of exploring and applying AI to quickly deliver information to people, methods for guaranteeing the capabilities (and limitations) of generative AI systems do not exist. Neither do insights into when and why those capabilities manifest.
The Defense Advanced Research Projects Agency (DARPA) has been a long-term investor in AI research and development. With the influx of large language models, the agency continues to invest in areas that show promise in filling the fundamental gaps between state-of-the-art systems and national security applications, including the Department of Defense’s (DoD) mission-critical needs.
As decision-making becomes faster due to generative AI, the agency seeks to develop mathematical foundations for assessing generative AI and providing guarantees necessary to deploy the technology safely and effectively across the DoD and society.
According to Dr. Patrick Shafto, program manager for DARPA's Artificial Intelligence Quantified (AIQ) program, it all boils down to math.
"AI has achieved near human-level performance in domains including text generation, game playing, and such, which raises the prospect of widespread integration with human partners in the military and society," he explained. "And at the most general level, we're interested in determining how to ensure AI systems will have the properties needed to solve various problems."
AIQ will explore the hypothesis that mathematical foundations, combined with advances in measurement and modeling, will make it possible to guarantee an AI system's capabilities and to explain when they will or will not manifest, and why.
Today, if you ask a generative AI chatbot a question, there's no guarantee that it will get the answer right.
Furthermore, even slightly rewording the same question, or simply changing the order of its words, can produce a completely different answer.
Shafto says that mathematical foundations, combined with advances in measurement and modeling, may make it possible to guarantee AI capabilities in a quantified way, and that generalization is key.
Current AI evaluation focuses on giving AI systems quizzes, like we would give to a person. However, there is no reason to believe that the answers would be the same even for simple rewordings of the same question, never mind real-world applications. That is, we want guarantees about generalization, and math is required for that.
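To make the rewording problem concrete, here is a minimal Python sketch that measures how often a system gives the same answer to several phrasings of one question. It is an illustration only, not a method described by DARPA or the AIQ program. The names `consistency_rate` and `toy_model` are hypothetical, and `toy_model` is a deliberately brittle stand-in for a real chatbot.

```python
from collections import Counter
from typing import Callable, Sequence


def consistency_rate(model: Callable[[str], str], rewordings: Sequence[str]) -> float:
    """Return the fraction of rewordings whose answer matches the most common answer."""
    if not rewordings:
        raise ValueError("need at least one rewording")
    # Normalize lightly so trivial case/whitespace differences do not count as disagreement.
    answers = [model(question).strip().lower() for question in rewordings]
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers)


def toy_model(question: str) -> str:
    """Hypothetical stand-in for a chatbot: its answer depends on surface wording."""
    return "Paris" if question.lower().startswith("what") else "Lyon"


# Four phrasings of the same question (assumed examples, not from the article).
rewordings = [
    "What is the capital of France?",
    "What city is France's capital?",
    "What is France's capital city?",
    "Name the capital of France.",
]

rate = consistency_rate(toy_model, rewordings)
print(f"consistency across rewordings: {rate:.2f}")
```

Even a perfect score on a fixed list like this is still a quiz. It says nothing about phrasings that were not tested, which is the generalization gap the article describes.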
Through AIQ, DARPA will work closely with partners at the National Institute of Standards and Technology (NIST) and the DoD to ensure that when AI systems are deployed in high-stakes situations, one can have confidence in predicting their performance.