Creating datasets in ISA-Tab

ToxBank Guide
Step-by-step instructions for creating datasets in ISA-Tab (dose-response and ‘omics).

The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n° [267042]. The research leading to these results has received financing from Cosmetics Europe. This document reflects only the author’s views, and the Community, Cosmetics Europe, and ToxBank are not liable for any use that may be made of the information contained therein.

Grant Agreement HEALTH-F5-2010-267042
Acronym ToxBank
Name ToxBank – Supporting Integrated Data Analysis and Servicing of Alternative Testing Methods in Toxicology
Scientific Coordinator Douglas Connect (DC)
Administrative Coordinator Istituto Di Ricerche Farmacologiche Mario Negri (IRFMN)

This guide covers the role of ToxBank and the ToxBank Data Warehouse (sections 1-2); background on the SEURAT-1/ToxBank ISAcreator, the design principles behind the ISA-Tab file format, and some technical information (section 3); background on the toxicity ontologies and keyword hierarchy used to annotate protocols and data in the ToxBank Data Warehouse (section 4); instructions for getting started with creating datasets, using a dose-response study (section 5); and a detailed guide to creating a dose-response gene expression profiling dataset from a publicly available gene expression dataset (section 6). If you are reasonably familiar with ISA-Tab and ISAcreator, you can skip directly to section 5 or section 6.

ISA-Tab is designed to describe all meta-information necessary for reproducibility and downstream analysis (QC, normalisation, association with traits of interest, toxicological predictions). This includes investigation, sample and assay parameters and links to ontology terms.
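As a rough illustration of how this metadata is split across files, the sketch below groups the files of a dataset directory by the ISA-Tab naming convention (i_*.txt for the investigation, s_*.txt for study/sample tables, a_*.txt for assay tables). The function name and the idea of scanning a single directory are assumptions made for this example; they are not part of ISAcreator or the ToxBank tools.

```python
from pathlib import Path

# Filename prefixes used by the ISA-Tab naming convention.
ISA_PREFIXES = {"i_": "investigation", "s_": "study", "a_": "assay"}


def classify_isatab_files(directory):
    """Group the files in a directory by their ISA-Tab role.

    Hypothetical helper: .txt files whose names start with i_, s_ or a_
    are assigned to the matching role; everything else (for example raw
    data files) is listed under "other".
    """
    groups = {role: [] for role in ISA_PREFIXES.values()}
    groups["other"] = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        role = "other"
        if path.suffix.lower() == ".txt":
            role = ISA_PREFIXES.get(path.name[:2].lower(), "other")
        groups[role].append(path.name)
    return groups
```

Calling `classify_isatab_files` on a dataset directory gives a quick check that an investigation file, at least one study file, and the expected assay files are present before the dataset is opened in ISAcreator.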

ISA-Tab is a universal data exchange and annotation format for biology-related studies. The ISA-Tab tools are available at: http://isatab.sourceforge.net/tools.html. This guide is partly based on an EBI/diXa ISA-Tab guide (Stathis Kanterakis, EBI, 18/01/2013): http://www.dixa-fp7.eu/news/dixa-isatab-tutorial. It uses publicly available datasets from the large Japanese Toxicogenomics Project, TG-GATEs: http://toxico.nibio.go.jp/open-tggates/search.html.

A SEURAT-1 customized version of the ISAcreator software, which can interface with the ToxBank Data Warehouse, is available from the help page of the ToxBank Data Warehouse.

To get started, download the software from https://services.toxbank.net/toxbank-ui/help. Running the software requires a Java virtual machine (JVM).

NOTE: Following this guide, you will also include the raw data files of an expression study. While you practice loading data into the ToxBank Data Warehouse (TBDW), remove the .cel files (about 150 MB in compressed format) before uploading to save time. Alternatively, the raw data files can be uploaded to the ToxBank FTP site and linked from the ISA-Tab fields (contact ToxBank support for details). Unpublished SEURAT-1 studies should include the raw data files as well. The TBDW includes mechanisms that keep submitted data secure and restrict access to authorized persons.
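Leaving out the raw .cel files for a practice upload can be scripted. The following is a minimal sketch, assuming the dataset sits in a single directory and is packaged as a zip archive; the function name and paths are hypothetical, and the archive layout the TBDW actually expects should be checked against its help page.

```python
import zipfile
from pathlib import Path


def zip_without_cel(dataset_dir, archive_path):
    """Zip dataset_dir, skipping raw .cel files.

    Hypothetical helper for practice uploads. Returns the relative
    paths of the files that were left out of the archive.
    """
    dataset_dir = Path(dataset_dir)
    archive_path = Path(archive_path)
    skipped = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(dataset_dir.rglob("*")):
            # Ignore directories and the archive itself if it lives in the dataset.
            if not path.is_file() or path.resolve() == archive_path.resolve():
                continue
            relative = path.relative_to(dataset_dir).as_posix()
            # Match the extension case-insensitively (.cel, .CEL).
            if path.suffix.lower() == ".cel":
                skipped.append(relative)
                continue
            archive.write(path, arcname=relative)
    return skipped
```

The returned list records which raw files were omitted, so they can later be uploaded to the ToxBank FTP site or restored for a full submission. Raw files that carry a different extension (for example, individually compressed files) are not detected by this check.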

Proceed to: