Toxygates: exposing toxicogenomics datasets and linking with InterMine

This is a guest post from our colleague Johan Nyström-Persson, who works with ToxyGates and the NIBIOHN in Japan.

Toxygates (http://toxygates.nibiohn.go.jp) has been developed as a user-friendly toxicogenomics analysis platform at the Mizuguchi Lab, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) in Osaka since 2012. The first public release was in 2013. At this time, the main focus of Toxygates was exposing the Open TG-GATEs dataset, a large, systematically organised toxicogenomics dataset compiled during more than a decade by the Japanese Toxicogenomics Project (http://toxico.nibiohn.go.jp). This dataset consists of over 24,000 microarray samples. To make use of such a large dataset without time-consuming data manipulation and programming, it is necessary to have a rich user interface and access to many kinds of secondary data.

Toxygates allows anyone with a web browser to explore and analyse this data in context. Various kinds of filtering and statistical testing are available, allowing users to discover and refine gene sets of interest, with respect to particular compounds. For a reasonably sized data selection, hierarchical clustering and heat-maps can be displayed directly in the browser. Through TargetMine (http://targetmine.nibiohn.go.jp) integration (based on the InterMine framework), enrichment of various kinds is possible. Compounds can also be ranked according to how they influence genes of interest.

To support all of these functions, we came up with the concept of a “hybrid” data model which recognises that, while gene expression values by themselves may be viewed as a large matrix with a flat structure, secondary annotations of genes and samples, such as
proteins, pathways, GO terms or pathological findings, have an open-ended structure. Thus, we combine an efficient key-value store (for gene expressions) with RDF and linked data (for gene and sample annotations) to allow for both high performance and a flexible data structure.

Today, the project continues to evolve in new directions as a general transcriptomics data analysis platform. We have integrated Toxygates not only with TargetMine, but also with HumanMine, RatMine and MouseMine. Recently, users can also upload their own transcriptomics data and analyse it in context alongside Open TG-GATEs data. We may
also add more datasets in the future.

P1000874The current project members are Kenji Mizuguchi (project leader) and Chen Yi-An (NIBIOHN), Johan Nyström-Persson and Yuji Kosugi (Level Five), and Yayoi Natsume-Kitatani and Yoshinobu Igarashi (NIBIOHN).

Advertisements