Authors: Sean Buehler 1, Nic Herndon 1, Emily Grau 1, Ming Chen 2, Abdullah Almsaeed 2, Connor Wytko 3, Brian Soto 3, Sook Jung 3, Shawna Spoor 3, Kuangching Wang 4, Chun-Huai Cheng 3, Nick Watts 4, Lacey Sandserson 5, Jill Wegrzyn 1, Doreen Main 3, Alex Feltus 4, Margaret Staton 6, Stephen Ficklin 3, Nathan Henry 6
1 : University of Connecticut
2 : University of Tennessee Institute of Agriculture
3 : Washington State University
4 : Clemson University
5 : University of Saskatchewan
6 : University of Tennessee
Abstract: Species- or clade-specific genomics databases offer scientists curated, specialized data along with relevant metadata. As the quantity of data from next-generation sequencing increases, storing, transferring, and analyzing it efficiently becomes a tremendous challenge. The open-source platform Tripal connects Drupal (a content management system) and Chado [1,2] (a standardized relational database model for biological data). Today, a coalition of genomics databases uses Tripal for online data repository needs. Recent Tripal development focuses on achieving not just storage but also cross-database discovery, allowing data to be delivered directly to the Galaxy platform [3]. Through the new Tripal Galaxy module, scientists can select custom datasets from within and across Tripal databases and import them directly into a Galaxy instance from within a Tripal repository. A team of developers representing several plant genomics databases is implementing workflows for differential gene expression, variant detection, and association genetics. These workflows are provided to the public as modules for Tripal databases. Current efforts at TreeGenes [4], a repository for forest tree genomics, focus on implementing association mapping and landscape genomics analysis in Tripal and Galaxy. This work will include data integration and analytical capabilities for genotype, phenotype, and environmental data from thousands of individual tree accessions.