Standards covered by or related to COMBINE activities

One of the major goals of COMBINE is to improve the interoperability of existing standards, and to foster or support fledgling efforts aimed at filling gaps or addressing new needs. Listed below are some of the major community standard representation formats covered by or related to COMBINE activity.

COMBINE standards

The following standardization activities are open community efforts. The standards are described in freely available specifications and associated resources (XML schemas, UML diagrams, etc.). They are steered by democratically elected editorial boards, sometimes assisted by scientific committees. Decent software support exists, including API implementations. Development is supported by central teams and/or funding sources. The different formats try to avoid overlap and instead strive to interoperate through interconversion, cross-linking, use of common metadata layers, etc.

A comprehensive list of specification documents is also available, following the COMBINE specification infrastructure.

BioPAX
BioPAX is a standard language that aims to enable integration, exchange and analysis of biological pathway data. It is expressed in OWL.

The latest specification is BioPAX Level 3.

BioPAX development is coordinated by an elected editorial board and a Scientific Advisory Board.

BioPAX is supported by many pathway databases and processing tools. An API is available to help implement support: Paxtools.

More information

SBGN
The Systems Biology Graphical Notation (SBGN) is a set of standard graphical languages for describing biological knowledge visually. It is currently made up of three languages describing Process Descriptions, Entity Relationships and Activity Flows.

The latest specifications are SBGN PD Level 1 Version 1.3, SBGN ER Level 1 Version 2 and SBGN AF Level 1 Version 1.2.

SBGN development is coordinated by an elected editorial board and a Scientific Committee.

Several data resources and software tools claim support for SBGN. An API is available to help implement support: libSBGN.

More information

SBML
The Systems Biology Markup Language (SBML) is a computer-readable XML format for representing models of biological processes. SBML is suitable for, but not limited to, models using a process description approach.

The latest stable specification is Level 3 Version 1 Core.

SBML development is coordinated by an elected editorial board and a central developer team.

Over 250 software systems known to support SBML can be found in the SBML software guide. APIs are available to help implement support: libSBML in C++ and JSBML in Java.

More information

SED-ML

The Simulation Experiment Description Markup Language (SED-ML) is an XML-based format for encoding simulation experiments. SED-ML makes it possible to define the models to use, the experimental tasks to run and the results to produce. SED-ML can be used with models encoded in several languages, as long as they are in XML.

The latest stable specification is Level 1 Version 3.

SED-ML development is coordinated by an elected editorial board.

APIs are available to help implement support: libSedML in C#, libSEDML in C++ with SWIG bindings for Python, Java, Perl, R and Ruby, and jlibsedml in Java.

More information

CellML

The CellML language is an XML markup language to store and exchange computer-based mathematical models. CellML is being developed by the Auckland Bioengineering Institute at the University of Auckland and affiliated research groups.

The latest stable specification is Version 1.1.

CellML development is coordinated by an elected editorial board.

APIs are available to help implement support: CellML API in C.

More information

SBOL Data

The Synthetic Biology Open Language Data (SBOL Data) is a language for the description and the exchange of synthetic biological parts, devices and systems.

The latest stable specification of SBOL Data is 2.1.0.

SBOL Data is developed by the SBOL Developers Group. The development is coordinated by an editorial board and the SBOL Chair.

SBOL Data is supported by many software tools. APIs are available to help implement support for this data standard.

More information

SBOL Visual

The Synthetic Biology Open Language Visual (SBOL Visual) is an open-source graphical notation that uses schematic “glyphs” to specify genetic parts, devices, modules, and systems.

The latest stable specification of SBOL Visual is 1.0.0.

SBOL Visual is developed by the SBOL Developers Group and the SBOL Visual Group. Development is coordinated by an editorial board and the SBOL Chair.

SBOL Visual is supported by many software tools.

More information


NeuroML

The NeuroML project focuses on the development of an XML-based description language that provides a common data format for defining and exchanging descriptions of neuronal cell and network models.

The latest stable specification of NeuroML is version 2 beta 3.

NeuroML development is coordinated by the NeuroML Editorial Board.

NeuroML is supported by many software tools and databases.

More information


Associated standardization efforts

The standardization efforts described below are not community-developed representation formats. Rather, they are tools that add a layer of semantics to COMBINE representation formats, facilitating their use, improving their interoperability, or enhancing their usefulness.

COMBINE Archive

A COMBINE archive is a single file bundling the various documents necessary for a modelling and simulation project, and all relevant information. The archive is encoded using the Open Modeling EXchange format (OMEX).
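
To make the bundling idea concrete, here is a minimal sketch using only the Python standard library. It assumes that an OMEX file is a ZIP container with a manifest.xml listing each entry and its format, and that the format URIs shown are the ones to use; both assumptions should be checked against the OMEX specification. The helper names build_manifest and write_archive and the file names are hypothetical.

```python
import io
import zipfile
from xml.sax.saxutils import quoteattr

# Namespace and format URIs assumed from the OMEX manifest convention.
OMEX_MANIFEST = "http://identifiers.org/combine.specifications/omex-manifest"
FORMAT_OMEX = "http://identifiers.org/combine.specifications/omex"
FORMAT_SBML = "http://identifiers.org/combine.specifications/sbml"
FORMAT_SEDML = "http://identifiers.org/combine.specifications/sed-ml"


def build_manifest(entries):
    """Return manifest.xml text for a dict {file name: format URI}."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<omexManifest xmlns="{OMEX_MANIFEST}">',
        f'  <content location="." format="{FORMAT_OMEX}"/>',
        f'  <content location="./manifest.xml" format="{OMEX_MANIFEST}"/>',
    ]
    for name, fmt in entries.items():
        location = quoteattr("./" + name)  # quoteattr adds the quotes
        lines.append(f"  <content location={location} format={quoteattr(fmt)}/>")
    lines.append("</omexManifest>")
    return "\n".join(lines) + "\n"


def write_archive(target, files):
    """Write {file name: (bytes, format URI)} plus a manifest into a ZIP."""
    manifest = build_manifest({name: fmt for name, (_, fmt) in files.items()})
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", manifest)
        for name, (payload, _) in files.items():
            archive.writestr(name, payload)


# Placeholder documents stand in for a real model and simulation description.
buffer = io.BytesIO()
write_archive(buffer, {
    "model.xml": (b"<sbml/>", FORMAT_SBML),
    "experiment.sedml": (b"<sedML/>", FORMAT_SEDML),
})
names = zipfile.ZipFile(buffer).namelist()
print(names)
```

Passing a file path instead of the in-memory buffer would write the archive to disk. A real project would also add metadata and any data files the simulation needs.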

Identifiers.org URIs

MIRIAM Unique Resource Identifiers allow one to uniquely and unambiguously identify an entity in a stable and perennial manner. MIRIAM Registry is a set of services and resources that provide support for generating, interpreting and resolving MIRIAM URIs. Through the Identifiers.org technology, MIRIAM URIs can be dereferenced in a flexible and robust way.

MIRIAM URIs are used by SBML, SED-ML, CellML and BioPAX controlled annotation schemes.
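
As an illustration of dereferencing, the sketch below rewrites a MIRIAM URN into a resolvable Identifiers.org URL. The "urn:miriam:namespace:accession" layout, the percent-encoding of the accession, and the resolver base URL are assumptions made for this example; the function name miriam_urn_to_url and the accession P12345 are placeholders.

```python
from urllib.parse import unquote


def miriam_urn_to_url(urn, base="https://identifiers.org"):
    """Convert 'urn:miriam:<namespace>:<accession>' to '<base>/<namespace>/<accession>'."""
    prefix = "urn:miriam:"
    if not urn.startswith(prefix):
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    namespace, separator, accession = urn[len(prefix):].partition(":")
    if not separator or not namespace or not accession:
        raise ValueError(f"missing namespace or accession: {urn!r}")
    # Accessions that contain ':' are assumed to be percent-encoded in the URN.
    return f"{base}/{namespace}/{unquote(accession)}"


url = miriam_urn_to_url("urn:miriam:uniprot:P12345")
print(url)
```

The returned URL can then be handed to any HTTP client, which leaves the choice of the physical data source to the resolver.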

Systems Biology Ontology

The Systems Biology Ontology (SBO) is a set of controlled, relational vocabularies of terms commonly used in Systems Biology, and in particular in computational modeling.

Each element of an SBML file carries an optional attribute, sboTerm, whose value must be a term from SBO.

Each symbol of SBGN is associated with an SBO term.
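
The sketch below shows how a tool might read sboTerm attributes from an SBML document with the Python standard library. It assumes that SBO identifiers have the form "SBO:" followed by seven digits. The XML fragment is illustrative only and is not a complete, valid model, and the function name sbo_terms is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SBO_PATTERN = re.compile(r"SBO:\d{7}")  # assumed identifier syntax

SBML_SNIPPET = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="demo">
    <listOfSpecies>
      <species id="S1" sboTerm="SBO:0000247"/>
      <species id="S2"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" sboTerm="SBO:179"/>
    </listOfReactions>
  </model>
</sbml>"""


def sbo_terms(xml_text):
    """Return (element name, id, sboTerm, well_formed) for annotated elements."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        term = element.get("sboTerm")
        if term is None:
            continue  # the attribute is optional
        name = element.tag.rsplit("}", 1)[-1]  # drop the XML namespace
        well_formed = bool(SBO_PATTERN.fullmatch(term))
        found.append((name, element.get("id"), term, well_formed))
    return found


terms = sbo_terms(SBML_SNIPPET)
for row in terms:
    print(row)
```

The check here is purely syntactic. Confirming that a term actually exists in SBO would require the ontology itself.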

Kinetic Simulation Algorithm Ontology

The Kinetic Simulation Algorithm Ontology (KiSAO) describes existing algorithms and their inter-relationships through their characteristics and parameters.

KiSAO is used in SED-ML, which allows simulation software to automatically choose the best algorithm available to perform a simulation and unambiguously refer to it.
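
As a rough sketch of that lookup, the code below reads the KiSAO identifier requested by each simulation in a SED-ML fragment and matches it against a table of locally available solvers. The fragment is illustrative and incomplete, the element and attribute names (uniformTimeCourse, algorithm, kisaoID) are assumed to follow the SED-ML specification, and LOCAL_SOLVERS with its solver name is entirely hypothetical.

```python
import xml.etree.ElementTree as ET

SEDML_SNIPPET = """<sedML xmlns="http://sed-ml.org/sed-ml/level1/version3" level="1" version="3">
  <listOfSimulations>
    <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0"
                       outputEndTime="10" numberOfPoints="100">
      <algorithm kisaoID="KISAO:0000019"/>
    </uniformTimeCourse>
    <uniformTimeCourse id="sim2" initialTime="0" outputStartTime="0"
                       outputEndTime="10" numberOfPoints="100">
      <algorithm kisaoID="KISAO:0000029"/>
    </uniformTimeCourse>
  </listOfSimulations>
</sedML>"""

# Hypothetical table of solvers implemented by a tool, keyed by KiSAO ID.
LOCAL_SOLVERS = {"KISAO:0000019": "local_ode_solver"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def requested_algorithms(sedml_text):
    """Map each simulation id to the KiSAO ID of its algorithm element."""
    root = ET.fromstring(sedml_text)
    requests = {}
    for simulation in root.iter():
        if local_name(simulation.tag) != "uniformTimeCourse":
            continue
        for child in simulation:
            if local_name(child.tag) == "algorithm":
                requests[simulation.get("id")] = child.get("kisaoID")
    return requests


def resolve(requests, solvers):
    """Return the local solver for each simulation, or None if unavailable."""
    return {sim_id: solvers.get(kisao_id) for sim_id, kisao_id in requests.items()}


requests = requested_algorithms(SEDML_SNIPPET)
resolved = resolve(requests, LOCAL_SOLVERS)
print(resolved)
```

This sketch only does exact matching. When the requested algorithm is missing, a tool could instead consult the relationships recorded in KiSAO to select a closely related one.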

BioModels.net qualifiers

BioModels.net qualifiers are standardized relationships (predicates) that specify the relation between an object represented in a description language and the external resource used to annotate it. The relationship is rarely one-to-one, and the information content of an annotation is greatly increased if one knows what it represents, rather than only knowing that it is "related to" the model component.
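
The following sketch extracts (component, qualifier, resource) triples from an RDF annotation of the kind embedded in model files. The RDF layout and the two qualifier namespaces are assumptions based on common usage, and the resource URIs are placeholders rather than real annotations; the function name annotation_triples is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"
BQMODEL = "http://biomodels.net/model-qualifiers/"

ANNOTATION = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}">
  <rdf:Description rdf:about="#S1">
    <bqbiol:is>
      <rdf:Bag>
        <rdf:li rdf:resource="https://identifiers.org/uniprot/P12345"/>
      </rdf:Bag>
    </bqbiol:is>
    <bqbiol:isVersionOf>
      <rdf:Bag>
        <rdf:li rdf:resource="https://identifiers.org/uniprot/P67890"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
  </rdf:Description>
</rdf:RDF>"""


def annotation_triples(rdf_text):
    """Return (subject, qualifier, resource) for each qualified annotation."""
    root = ET.fromstring(rdf_text)
    triples = []
    for description in root.iter(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for qualifier in description:
            # ElementTree tags look like '{namespace}localname'.
            namespace, _, name = qualifier.tag[1:].partition("}")
            if namespace not in (BQBIOL, BQMODEL):
                continue
            prefix = "bqbiol" if namespace == BQBIOL else "bqmodel"
            for item in qualifier.iter(f"{{{RDF}}}li"):
                resource = item.get(f"{{{RDF}}}resource")
                triples.append((subject, f"{prefix}:{name}", resource))
    return triples


triples = annotation_triples(ANNOTATION)
for triple in triples:
    print(triple)
```

Keeping the qualifier in each triple is what lets a consumer distinguish "is exactly this entity" from "is a version of this entity", which a bare link could not express.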

Related standardization efforts

The following standardization efforts are of interest for COMBINE, either as candidate standards, or similar efforts in different domains.

Computational Neuroscience Ontology

The Computational Neuroscience Ontology (CNO) is a controlled vocabulary composed of classes representing general concepts related to computational neuroscience.

FieldML

The goal of FieldML (Field Modelling/Markup Language) is to be a declarative language for building hierarchical models represented by generalized mathematical fields. Its primary use will be to represent the dynamic geometry and solution fields from computational models of cells, tissues and organs.

GPML

GPML (GenMAPP Pathway Markup Language) is an XML-based format to define a pathway consisting of purely graphical elements (such as lines and shapes) or graphical elements with added biological information (such as genes, proteins and datanodes).

MAMO

The mathematical modelling ontology (MAMO) is an ontology describing and classifying the mathematical models used in the life sciences (for the time being). MAMO provides the types of models, the variables they use, the readout to expect and other relevant features.


NineML

The Network Interchange for Neuroscience Modeling Language (NineML) is a language developed by the International Neuroinformatics Coordinating Facility (INCF) and designed for the description of large networks of spiking neurons.

NuML

The Numerical Markup Language (NuML) (pronounced "neumeul", not "new em el", which sounds like NewML) is a simple XML format to exchange multidimensional arrays of numbers to be used with model and simulation descriptions. NuML was initially developed as part of the Systems Biology Results Markup Language (SBRML).

PharmML

The Pharmacometrics Markup Language (PharmML) is an exchange format for encoding models, associated tasks and their annotation as used in pharmacometrics. PharmML is developed by the DDMoRe consortium, a European Innovative Medicines Initiative (IMI) project.

PSI-MI

The Proteomics Standards Initiative Molecular Interaction XML Format (PSI-MI) is a data exchange format for molecular interactions developed by the HUPO Proteomics Standards Initiative.

SpineML

The Spiking Neural Mark-up Language (SpineML) is a declarative XML-based model description language for large-scale neural network models. It is partially based on the INCF NineML.

TEDDY

The Terminology for the Description of Dynamics is a project to build an ontology for dynamical behaviours, observable dynamical phenomena, and control elements of bio-models and biological systems in Systems Biology and Synthetic Biology.


BioSharing

More information about standards used to share data in life sciences can be found on the website of the BioSharing initiative.