Enlightened Multiscale Simulation of Biochemical Networks

Easy exchange of model information will support post-genomics research in systems biology.

A continuing research project is dedicated to developing mathematical and software infrastructure in support of post-genomics research in systems biology. One near-term objective of the project is to contribute to a deeper understanding of the organizational principles of biological networks. A distinguishing theme of this project is its focus on scalable methods of robustness analysis and on theoretically sound methods of using experimental data to validate (or invalidate) models; this theme stands in contrast to the heretofore prevalent practice of relying purely on simulation.

A central goal of modeling and simulation in systems biology is to connect molecular mechanisms to network functions and, in turn, to questions of biomedical relevance. Unfortunately, many of the most critical questions involve events that are extremely rare at the level of individual cells in an organism, yet are catastrophic to the organism as a whole. Consequently, simulation methods that may be adequate for studying generic or typical behavior are inadequate for exploring worst-case scenarios, which are computationally intractable using conventional methods. To overcome this limitation, the present project is extending best-practice software tools and algorithms for robustness analysis, which have become standards in engineering, to models of biological relevance, which are typically nonlinear, hybrid, uncertain, and stochastic. This effort includes integration of formal inference methods from previously fragmented theories in computer science with those of control and dynamical systems.

The theoretical framework being developed in this project represents an unprecedented opportunity to create a system for analysis and validation (or invalidation) of models and for iterative experimentation on them. Such models may be large-scale, stochastic, nonlinear, nonequilibrium, and of a mixed continuous and discrete nature, with multiple time and spatial scales. The remarkable quality of the theoretical framework is that it can be used to prove conjectures regarding such complex, difficult-to-analyze models. Examples of conjectures that can be proven are (1) a given model cannot fit the experimental data, no matter what parameters are used, and (2) a given model is robust, no matter how its parameters are varied. Heretofore, there has been no way of proving such conjectures except in cases of much simpler models. The ability to prove such conjectures, combined with sophisticated robustness analysis methods, is what is needed to analyze realistic biological models and relate them to experimental data to help answer the question, “What is the next experiment that would best differentiate among the current alternative hypotheses?”

An especially notable recent product of this continuing development effort is the Systems Biology Markup Language (SBML), a machine-readable language that provides a format, based on Extensible Markup Language (XML), for representing models in such a way that they can be executed within, and exchanged among, different software systems. By using SBML to define their input and output formats, different software tools can all operate on an identical representation of a model, removing opportunities for errors in translation and assuring a common starting point for analyses and simulations.

SBML can encode models representing biochemical entities (species) linked by reactions to form networks. An important principle is that a model is decomposed into explicitly labeled constituent elements, the set of which resembles a verbose rendition of chemical-reaction equations. The representation deliberately does not cast the model directly into a set of differential equations or other specific interpretation of the model. This explicit, modeling framework-agnostic decomposition makes it easier for a software tool to interpret the model and translate the SBML form into whatever internal form the tool uses. The formalisms in SBML enable modeling of a wide range of biological phenomena, including (and not limited to) metabolism, cell signaling, and gene regulation. SBML affords significant flexibility and power by making it possible to define arbitrary formulae for rates of change of variables and to express other constraints mathematically.
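The decomposition described above can be illustrated with a minimal sketch. It assumes a hypothetical one-reaction toy model (species A converted to B) written against the SBML Level 2 Version 4 namespace, and it uses only the Python standard library rather than any SBML-specific tool. The function names are illustrative, not part of SBML. The stoichiometric matrix built at the end is just one interpretation a tool might derive from the labeled elements; the SBML text itself commits to none.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 2 Version 4 (the toy model below is an assumption).
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}

TOY_MODEL = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_conversion">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialAmount="10"/>
      <species id="B" compartment="cell" initialAmount="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion" reversible="false">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()


def _side(reaction, list_tag):
    """Map species id -> stoichiometry for one side of a reaction."""
    refs = reaction.findall(f"sbml:{list_tag}/sbml:speciesReference", SBML_NS)
    # In SBML Level 2 a missing stoichiometry attribute defaults to 1.
    return {r.get("species"): float(r.get("stoichiometry", "1")) for r in refs}


def parse_network(xml_text):
    """Read the explicitly labeled species and reactions from SBML text."""
    root = ET.fromstring(xml_text)
    species = [s.get("id") for s in
               root.findall(".//sbml:listOfSpecies/sbml:species", SBML_NS)]
    reactions = {}
    for rxn in root.findall(".//sbml:listOfReactions/sbml:reaction", SBML_NS):
        reactions[rxn.get("id")] = {
            "reactants": _side(rxn, "listOfReactants"),
            "products": _side(rxn, "listOfProducts"),
        }
    return species, reactions


def stoichiometric_matrix(species, reactions):
    """One possible internal form: rows are species, columns are reactions."""
    return [
        [rxn["products"].get(sp, 0.0) - rxn["reactants"].get(sp, 0.0)
         for rxn in reactions.values()]
        for sp in species
    ]


if __name__ == "__main__":
    names, rxns = parse_network(TOY_MODEL)
    print(names, stoichiometric_matrix(names, rxns))
```

A differential-equation solver and a stochastic simulator could both start from the same parsed species and reactions and differ only in what they build next, which is the point of keeping the representation framework-agnostic.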

The software infrastructure of SBML includes libSBML, which is an embedded software library that provides an application programming interface (API) for working with SBML in the C, C++, Java, Perl, MATLAB, Lisp, and Python programming languages. Programmers can embed libSBML in their application programs, saving themselves the work of implementing their own parsing, manipulation, and validation software. libSBML is open-source software written in C and C++ and is highly portable. It is currently supported on the Linux, Solaris, Mac OS X, and Microsoft Windows operating systems.
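As a rough illustration of what embedding libSBML looks like from Python, the sketch below reads a model from a string and lists its species and reactions. It assumes the present-day Python bindings (distributed as the python-libsbml package), which may differ from the version described in this brief. The toy model and the helper name summarize_model are hypothetical. If the library is not installed, the helper returns None instead of failing.

```python
try:
    import libsbml  # assumption: provided by the python-libsbml package
except ImportError:
    libsbml = None  # the sketch degrades gracefully without the library

# Hypothetical toy model: species A is converted to species B.
TOY_SBML = """<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_conversion">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialAmount="10"/>
      <species id="B" compartment="cell" initialAmount="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion" reversible="false">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def summarize_model(sbml_text):
    """Return species and reaction ids, or None if libSBML is unavailable."""
    if libsbml is None:
        return None
    # Parsing is delegated to the library instead of hand-written XML code.
    document = libsbml.readSBMLFromString(sbml_text)
    model = document.getModel()
    if model is None:
        raise ValueError("no model element could be read from the input")
    return {
        "species": [model.getSpecies(i).getId()
                    for i in range(model.getNumSpecies())],
        "reactions": [model.getReaction(i).getId()
                      for i in range(model.getNumReactions())],
        # Count of issues (of any severity) logged while reading.
        "issues_logged": document.getNumErrors(),
    }


if __name__ == "__main__":
    print(summarize_model(TOY_SBML))
```

The application code never touches XML directly, which is the saving the paragraph above describes: parsing and consistency checks stay inside the library.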

This work was done by John C. Doyle and Michael Hucka of California Institute of Technology for the Air Force Research Laboratory.



This Brief includes a Technical Support Package (TSP). "Enlightened Multiscale Simulation of Biochemical Networks" (reference AFRL-0058) is currently available for download from the TSP library.





This article first appeared in the February, 2008 issue of Defense Tech Briefs Magazine (Vol. 2 No. 1).



Overview

The document titled "Enlightened Multiscale Simulation of Biochemical Networks" is a final report authored by John C. Doyle and Michael Hucka from the California Institute of Technology, covering research conducted from September 1, 2001, to July 31, 2006. It is approved for public release by the Air Force Research Laboratory and focuses on the development of mathematical and software infrastructure to support post-genomics research in systems biology.

The primary objective of the research is to enhance the understanding of the organizational principles of biological networks through scalable methods that emphasize robustness and model validation, rather than relying solely on simulation. A significant aspect of this work is the Systems Biology Markup Language (SBML), which serves as a machine-readable exchange language for computational models of biochemical networks. The report highlights the development of libSBML, an embedded software library designed to facilitate interaction with SBML across various programming languages, including C, C++, Java, Perl, MATLAB, Lisp, and Python. The library is characterized as free, open-source, and portable across multiple operating systems, including Linux, Windows, Mac OS X, and Solaris.

The report outlines two main goals achieved during the project: the continued development of libSBML, which includes increased support for SBML features and added functionality, and the promotion of SBML's use and evolution, particularly through direct support for DARPA's Bio-SPICE initiative. Additionally, the document discusses the need for an API that can interpret and verify the semantic correctness of units used in models, ensuring that complex mathematical equations yield proper units after parameter substitution. It also addresses the development of an API for SBML annotations, which allows software tools to insert and manipulate additional information in a structured manner.
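The kind of unit check mentioned above can be sketched in a few lines. This is only an illustration of the idea, not the API the report describes: units are represented here as hypothetical dictionaries of base-unit exponents, multiplying quantities adds the exponents, and a rate law is accepted only if the product of its factors matches the expected units.

```python
def multiply_units(*units):
    """Combine units of multiplied quantities by adding base-unit exponents."""
    total = {}
    for unit in units:
        for base, exponent in unit.items():
            total[base] = total.get(base, 0) + exponent
    # Drop bases that cancel so that equal units compare equal.
    return {base: exp for base, exp in total.items() if exp != 0}


def rate_law_units_ok(factor_units, expected):
    """True if the product of the factors carries the expected units."""
    return multiply_units(*factor_units) == expected


# Illustrative units (names and representation are assumptions).
PER_SECOND = {"second": -1}                       # first-order rate constant
MOLE_PER_LITRE = {"mole": 1, "litre": -1}         # concentration
MOLE_PER_LITRE_PER_SECOND = {"mole": 1, "litre": -1, "second": -1}

if __name__ == "__main__":
    # rate = k * [A] is consistent with concentration per unit time.
    print(rate_law_units_ok([PER_SECOND, MOLE_PER_LITRE],
                            MOLE_PER_LITRE_PER_SECOND))
    # rate = k * [A] * [B] with the same first-order k is not.
    print(rate_law_units_ok([PER_SECOND, MOLE_PER_LITRE, MOLE_PER_LITRE],
                            MOLE_PER_LITRE_PER_SECOND))
```

A real checker would also need to handle unit scales, sums, powers, and functions in the formula; the sketch covers products only.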

Overall, the report emphasizes the importance of robust mathematical frameworks and software tools in advancing systems biology, particularly in the context of biochemical networks. By providing a comprehensive overview of the project's achievements and future directions, the document serves as a valuable resource for researchers and developers in the field, facilitating further exploration and innovation in biochemical simulations and modeling.