2.2. What is ISA-TAB?

ISA-TAB is an abbreviation of “Investigation-Study-Assay TAB delimited format”. The ISAcreator software is used to create archives that contain the experiment descriptions as well as the raw data of an investigation. An archive typically includes all the work that is part of a publication. It contains three tables that describe the experimental set-up hierarchically. To avoid entering duplicate information, different types of data are entered in different parts of the archive.

The top-level table in an ISA-TAB archive is the Investigation. An Investigation contains one or more Studies. A Study groups work on similar biological materials, e.g. the same types of treatments and cells, which may be investigated using different technologies. A Study contains one or more Assays. Each Assay is technology-specific, and common features associated with a particular technology (e.g. Affymetrix microarrays) are captured in SEURAT-1/ToxBank assay templates. We will be using a SEURAT-1 configured ... An Assay contains links to one or more data files; the Assay table holds these links together with details about the protocols that were used to derive the files.
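The Investigation, Study and Assay hierarchy can be pictured as nested records. The following is a minimal Python sketch for illustration only; the class names, field names and file names are assumptions made for this example and are not part of ISA-TAB or ISAcreator.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    technology: str  # each assay is specific to one technology
    data_files: List[str] = field(default_factory=list)  # links to data files


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)  # one or more assays


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)  # one or more studies

    def all_data_files(self) -> List[str]:
        """Collect every data file linked from any assay of any study."""
        return [f for s in self.studies for a in s.assays for f in a.data_files]


# Hypothetical archive: one study examined with two technologies.
investigation = Investigation(
    title="Example investigation",
    studies=[
        Study(
            title="Example study",
            assays=[
                Assay("DNA microarray", ["sample1.cel", "sample2.cel"]),
                Assay("mass spectrometry", ["run1.txt"]),
            ],
        )
    ],
)
```

Calling `investigation.all_data_files()` walks the hierarchy from top to bottom and returns the file names in the order in which they were entered.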

Investigation-Study-Assay tab-delimited files store the metadata of the experiment, while data files store the actual readouts. Since ISA-TAB does not specify the format of the data files, assay-specific formats can be used (e.g. .txt, .cel, spreadsheets).

ISA-TAB is designed to describe the workflow of processing biological material and data. Section 5.1 of the ISA-TAB specification * states: “a protocol takes one or more inputs (biological material or data) and generates one or more outputs (biological material or data). Therefore protocols correspond to edges in the experimental graph, while materials and data correspond to the nodes”. The workflow is represented in spreadsheet (tab-delimited text) format using a left-to-right convention: the first node (e.g. a sample) is processed according to the protocol, and the second node is obtained as the result.

Node1 Name | Protocol REF | Node2 Name

A "Node Name" can be either a Study node or an Assay node. Study nodes denote the study subjects (biological samples), while Assay nodes describe the experiment itself. Processing a subject may involve more than a single step (e.g. preparation of a culture). Characteristics, protocol parameters and factors can be assigned to the nodes and protocols.

The figure in the following section (2.3.) outlines a simple experiment workflow composed of two named nodes (Source Name and Sample Name) and a study protocol description (growth protocol).
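The left-to-right convention can also be illustrated in code. The Python sketch below reads a small tab-delimited table and turns each row into (input node, protocol, output node) triples. Columns whose header contains "Name" or "File" are treated as nodes and "Protocol REF" columns as edges. The table contents (culture1, sample1, sample2) and the function name read_edges are hypothetical and chosen only for this example; all other columns are ignored here.

```python
import csv
import io

# Hypothetical study table, tab-delimited and read from left to right.
STUDY_TABLE = (
    "Source Name\tProtocol REF\tSample Name\n"
    "culture1\tgrowth protocol\tsample1\n"
    "culture1\tgrowth protocol\tsample2\n"
)


def read_edges(text):
    """Return (input node, protocol, output node) triples for every row."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    edges = []
    for row in reader:
        previous = None  # last node seen in this row
        protocol = None  # protocol applied since that node
        for column, value in zip(header, row):
            if column == "Protocol REF":
                protocol = value
            elif "Name" in column or "File" in column:
                # A node column: connect it to the previous node, if any.
                if previous is not None:
                    edges.append((previous, protocol, value))
                previous, protocol = value, None
    return edges


edges = read_edges(STUDY_TABLE)
for source, protocol, target in edges:
    print(f"{source} --[{protocol}]--> {target}")
```

Because the function walks the columns in order, a row with several processing steps (node, protocol, node, protocol, node) yields one triple per step.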

* http://isatab.sourceforge.net/docs/ISA-TAB_release-candidate-1_v1.0_24no.... “5. Design patterns for Study and Assay files. Like MAGE-TAB before it, ISA-TAB is a framework with which to capture and communicate the complexity of the experimental metadata required to interpret an experiment. The fields in the Study and Assay files whose headers containing the string ‘Name’ (e.g. Source Name) and ‘File’ (e.g. Raw Data File) represent nodes in the experimental graph, corresponding to material (e.g., samples, RNA extracts, synthetic material etc.) or data objects. Edges show the relationships between nodes in the experimental graph.”

2.3. Understanding ISA-TAB Investigations