Standards covered by or related to COMBINE activities
One of the major goals of COMBINE is to improve the interoperability of existing standards, and to foster or support fledgling efforts aimed at filling gaps or new needs. Listed below are some of the major community standard representation formats covered by or related to COMBINE activity.
The following standardization activities are open community efforts. The standards are described in freely available specifications and associated tools (XML schemas, UML diagrams, etc.). They are piloted by democratically elected editorial boards, sometimes assisted by scientific committees. Decent software support exists, including API implementations. The development is supported by central teams and/or funding sources. The different formats try to avoid overlapping and instead strive to interoperate via interconversion, cross-linking, use of common metadata layers, etc.
BioPAX is a standard language that aims to enable integration, exchange and analysis of biological pathway data. It is expressed in OWL.
The latest specification is BioPAX Level 3.
BioPAX development is coordinated by an elected editorial board and a Scientific Advisory Board.
The Systems Biology Graphical Notation (SBGN) is a set of standard graphical languages for describing biological knowledge visually. It is currently made up of three languages describing Process Descriptions, Entity Relationships and Activity Flows.
The latest specifications are SBGN PD Level 1 Version 1.3, SBGN ER Level 1 Version 2 and SBGN AF Level 1 Version 1.
SBGN development is coordinated by an elected editorial board and a Scientific Committee.
Several data resources and software tools claim support for SBGN. An API is available to help implement support: libSBGN.
The Systems Biology Markup Language (SBML) is a computer-readable XML format for representing models of biological processes. SBML is suitable for, but not limited to, models using a process description approach.
The latest stable specification is Level 3 Version 1 Core.
SBML development is coordinated by an elected editorial board and a central developer team.
The Simulation Experiment Description Markup Language (SED-ML) is an XML-based format for encoding simulation experiments. SED-ML makes it possible to define the models to use, the experimental tasks to run and the results to produce. SED-ML can be used with models encoded in several languages, as long as they are in XML.
The latest stable specification is Level 1 Version 2.
SED-ML development is coordinated by an elected editorial board.
The CellML language is an XML markup language to store and exchange computer-based mathematical models. CellML is being developed by the Auckland Bioengineering Institute at the University of Auckland and affiliated research groups.
The latest stable specification is Version 1.1.
CellML development is coordinated by an elected editorial board.
APIs are available to help implement support: CellML API in C.
The Synthetic Biology Open Language (SBOL) is a language for the description and the exchange of synthetic biological parts, devices and systems.
SBOL is supported by many software tools. APIs are available to help implement support.
The NeuroML project focuses on the development of an XML-based description language that provides a common data format for defining and exchanging descriptions of neuronal cell and network models.
The latest stable specification of NeuroML is version 2 beta 3.
NeuroML development is coordinated by the NeuroML Editorial Board.
NeuroML is supported by many software tools and databases.
Associated standardization efforts
The standardisation efforts described below are not community-developed representation formats. Rather, they are tools that add a layer of semantics, facilitating the use and interoperability of COMBINE representation formats or enhancing their usefulness.
A COMBINE archive is a single file bundling the various documents necessary for a modelling and simulation project, and all relevant information. The archive is encoded using the Open Modeling EXchange format (OMEX).
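As a rough illustration of what "a single file bundling the various documents" can look like in practice, the sketch below builds and reads back a small archive using only the Python standard library. It assumes the layout described in the OMEX specification as we understand it: a ZIP container with a `manifest.xml` at its root that lists each entry's location and format. The helper names `build_archive` and `list_contents`, the file names and the placeholder file contents are all hypothetical; dedicated COMBINE archive libraries should be preferred for real work.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace and format URI prefix as understood from the OMEX specification.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
SPEC = "http://identifiers.org/combine.specifications/"


def build_archive(files):
    """Bundle files into an in-memory archive.

    files maps an entry name to a (format URI, content bytes) pair.
    Returns the archive as bytes.
    """
    ET.register_namespace("", OMEX_NS)
    manifest = ET.Element(f"{{{OMEX_NS}}}omexManifest")

    # The manifest lists itself first, then every bundled file.
    entries = {"manifest.xml": SPEC + "omex-manifest"}
    entries.update({name: fmt for name, (fmt, _) in files.items()})
    for name, fmt in entries.items():
        ET.SubElement(
            manifest,
            f"{{{OMEX_NS}}}content",
            {"location": "./" + name, "format": fmt},
        )

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", ET.tostring(manifest, encoding="utf-8"))
        for name, (_, data) in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def list_contents(archive_bytes):
    """Return (location, format) pairs read from the archive's manifest."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    return [
        (content.get("location"), content.get("format"))
        for content in root.iter(f"{{{OMEX_NS}}}content")
    ]


# Hypothetical project: one model and one simulation description.
omex = build_archive({
    "model.xml": (SPEC + "sbml", b"<sbml/>"),
    "experiment.sedml": (SPEC + "sed-ml", b"<sedML/>"),
})
for location, fmt in list_contents(omex):
    print(location, fmt)
```

Because the manifest records a format for every entry, a tool can decide which files it understands without opening each one.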
MIRIAM Unique Resource Identifiers allow one to uniquely and unambiguously identify an entity in a stable and perennial manner. MIRIAM Registry is a set of services and resources that provide support for generating, interpreting and resolving MIRIAM URIs. Through the Identifiers.org technology, MIRIAM URIs can be dereferenced in a flexible and robust way.
MIRIAM URIs are used by SBML, SED-ML, CellML and BioPAX controlled annotation schemes.
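To make the two URI styles concrete, here is a minimal sketch that builds a MIRIAM URN and converts it to an Identifiers.org URL. It assumes the commonly seen patterns `urn:miriam:<collection>:<identifier>` (with the identifier percent-encoded) and `http://identifiers.org/<collection>/<identifier>`. The function names are hypothetical, and the collection names and identifiers used below are illustrative only; the Registry remains the authority on which collections exist and how their identifiers look.

```python
from urllib.parse import quote, unquote


def miriam_urn(collection, identifier):
    """Build a URN-style MIRIAM URI; reserved characters such as ':' are encoded."""
    return f"urn:miriam:{collection}:{quote(identifier, safe='')}"


def identifiers_org_url(collection, identifier):
    """Build a resolvable Identifiers.org URL for the same entity."""
    return f"http://identifiers.org/{collection}/{identifier}"


def urn_to_url(urn):
    """Convert a URN-style MIRIAM URI to the Identifiers.org form."""
    scheme, authority, collection, encoded_id = urn.split(":", 3)
    if (scheme, authority) != ("urn", "miriam"):
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    return identifiers_org_url(collection, unquote(encoded_id))


# Illustrative collection names and identifiers.
print(miriam_urn("obo.go", "GO:0006915"))
print(urn_to_url("urn:miriam:uniprot:P12345"))
```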
Systems Biology Ontology
The Systems Biology Ontology (SBO) is a set of controlled, relational vocabularies of terms commonly used in Systems Biology, and in particular in computational modeling.
Each element of an SBML file carries an optional attribute, sboTerm, whose value must be a term from SBO.
Each symbol of SBGN is associated with an SBO term.
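The sboTerm attribute can be read with any XML parser. The sketch below, using only the Python standard library, collects the sboTerm of every element that has one and checks that it matches the `SBO:` prefix followed by seven digits. The XML is a pared-down fragment written for this example and is not a complete, valid SBML model; the function name `sbo_terms` and the specific term numbers are illustrative assumptions. Checking that a term actually exists in SBO would require the ontology itself.

```python
import re
import xml.etree.ElementTree as ET

# Pared-down, hypothetical fragment; not a complete SBML model.
SBML_FRAGMENT = """\
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="example">
    <listOfSpecies>
      <species id="S1" compartment="c" sboTerm="SBO:0000247"/>
      <species id="S2" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" sboTerm="SBO:0000176"/>
    </listOfReactions>
  </model>
</sbml>
"""

SBO_PATTERN = re.compile(r"SBO:\d{7}")


def sbo_terms(xml_text):
    """Map element id -> sboTerm for every element carrying the attribute."""
    terms = {}
    for element in ET.fromstring(xml_text).iter():
        term = element.get("sboTerm")
        if term is None:
            continue  # the attribute is optional
        if not SBO_PATTERN.fullmatch(term):
            raise ValueError(f"malformed sboTerm {term!r} on {element.get('id')}")
        terms[element.get("id")] = term
    return terms


print(sbo_terms(SBML_FRAGMENT))
```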
Kinetic Simulation Algorithm Ontology
The Kinetic Simulation Algorithm Ontology (KiSAO) describes existing algorithms and their inter-relationships through their characteristics and parameters.
KiSAO is used in SED-ML, which allows simulation software to automatically choose the best algorithm available to perform a simulation and unambiguously refer to it.
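A simplified sketch of how a tool might act on this: read the KiSAO identifier attached to each simulation in a SED-ML document, then look it up in the tool's own table of solvers. The fragment below is a hypothetical, pared-down SED-ML Level 1 Version 2 document. The `SOLVERS` table, its solver names and the function names are assumptions made for this example, and the two KiSAO identifiers are used purely as illustrations. A real tool could go further and use the relationships recorded in KiSAO to pick a substitute when the requested algorithm is unavailable, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

NS = {"s": "http://sed-ml.org/sed-ml/level1/version2"}

# Hypothetical, pared-down SED-ML fragment.
SEDML_FRAGMENT = """\
<sedML xmlns="http://sed-ml.org/sed-ml/level1/version2" level="1" version="2">
  <listOfSimulations>
    <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0"
                       outputEndTime="10" numberOfPoints="100">
      <algorithm kisaoID="KISAO:0000019"/>
    </uniformTimeCourse>
  </listOfSimulations>
</sedML>
"""

# Hypothetical table of the algorithms one tool implements.
SOLVERS = {
    "KISAO:0000019": "deterministic_ode_solver",
    "KISAO:0000029": "stochastic_solver",
}


def requested_algorithms(xml_text):
    """Map simulation id -> KiSAO identifier requested for that simulation."""
    root = ET.fromstring(xml_text)
    requested = {}
    for simulation in root.findall("s:listOfSimulations/*", NS):
        algorithm = simulation.find("s:algorithm", NS)
        if algorithm is not None:
            requested[simulation.get("id")] = algorithm.get("kisaoID")
    return requested


def choose_solver(kisao_id):
    """Return the tool's solver for a KiSAO identifier, or None if unsupported."""
    return SOLVERS.get(kisao_id)


for sim_id, kisao_id in requested_algorithms(SEDML_FRAGMENT).items():
    print(sim_id, kisao_id, choose_solver(kisao_id))
```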
BioModels.net qualifiers are standardized relationships (predicates) that specify the relation between an object represented in a description language and the external resource used to annotate it. The relationship is rarely one-to-one, and the information content of an annotation is greatly increased if one knows what it represents, rather than only knowing that it is "related to" the model component.
Related standardization efforts
The following standardization efforts are of interest for COMBINE, either as candidate standards, or similar efforts in different domains.
Computational Neuroscience Ontology
The goal of FieldML (Field Modelling/Markup Language) is to be a declarative language for building hierarchical models represented by generalized mathematical fields. Its primary use will be to represent the dynamic geometry and solution fields from computational models of cells, tissues and organs.
GPML (GenMAPP Pathway Markup Language) is an XML-based format to define a pathway consisting of purely graphical elements (such as lines and shapes) or graphical elements with added biological information (such as genes, proteins and datanodes).
The mathematical modelling ontology (MAMO) is an ontology describing and classifying the mathematical models used in the life sciences (for the time being). MAMO provides the types of models, the variables they use, the readout to expect and other relevant features.
The Network Interchange for Neuroscience Modeling Language (NineML) is a language developed by the International Neuroinformatics Coordinating Facility (INCF) and designed for the description of large networks of spiking neurons.
The Numerical Markup Language (NuML) (pronounced "neumeul", not "new em el", which sounds like NewML) is a simple XML format to exchange multidimensional arrays of numbers to be used with model and simulation descriptions. NuML was initially developed as part of the Systems Biology Results Markup Language (SBRML).
The Pharmacometrics Markup Language (PharmML) is an exchange format for encoding models, associated tasks and their annotation as used in pharmacometrics. PharmML is developed by the DDMoRe consortium, a European Innovative Medicines Initiative (IMI) project.
The Proteomics Standards Initiative Molecular Interaction XML Format is a data exchange format for molecular interactions developed by the HUPO Proteomics Standards Initiative.
The Terminology for the Description of Dynamics is a project to build an ontology for dynamical behaviours, observable dynamical phenomena, and control elements of bio-models and biological systems in Systems Biology and Synthetic Biology.
More information about standards used to share data in life sciences can be found on the website of the BioSharing initiative.