The 2017 Galaxy Community Conference (GCC2017) is being held in Montpellier, France, 26-30 June. GCC2017 will include keynotes and accepted talks, poster sessions, demos, birds-of-a-feather meetups, exhibitors, and plenty of networking opportunities. There will also be three days of pre-conference activities, including hackathons and training. If you work in data-intensive biomedical research, there is no better place than GCC2017 to present your work and to learn from others.
Friday, June 30 • 15:20 - 16:35
P16: Galaxy-P rides the Jetstream: Cloud-based multi-omic informatics

Timothy Griffin 1*, Matthew Chambers 2, James Johnson 1, Thomas McGowan 1, Thomas Doak 3, Jeremy Fischer 3, Praveen Kumar 1, Pratik Jagtap 1

1 : University of Minnesota
2 : Vanderbilt University
3 : Indiana University
* : Corresponding author

The collaborative Galaxy for proteomics project (Galaxy-P) has demonstrated Galaxy's value in integrating genomic and mass spectrometry (MS)-based proteomic software for multi-omic applications such as proteogenomic and metaproteomic analysis. Thus far, Galaxy-P tools and workflows have been most readily available to users operating local Galaxy instances. To increase accessibility, the Galaxy-P team has partnered with Jetstream, the NSF-funded cloud-based cyberinfrastructure. In the initial phase of developing the Galaxy-P/Jetstream resources, we have focused on an instance with tools and workflows that integrate RNA-seq and MS-based proteomics data for proteogenomic studies. The first workflow generates protein sequence databases by in-silico translation of RNA-seq data, focusing on proteins potentially expressed from RNA variants such as single nucleotide variants and insertions/deletions (indels). The second workflow performs sequence database searching using the customized FASTA database produced by the first workflow. It also includes a step in which putative variant peptide sequences are searched against known proteins with BLASTP to confirm their novelty. More recently, we have established a second instance for metaproteomic analysis, in which proteins expressed by microbial communities are characterized. This instance combines software for generating protein sequence databases from metagenomic data with tools for sequence database searching and for visualizing the taxonomy and molecular functions represented by the identified metaproteins. For both resources, we are leveraging Jetstream's ability to accommodate the increasing memory and processor load that data-intensive analyses require. We are also using the extensibility of the Galaxy framework to continue expanding the functionality of these cloud-based instances.
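
As a rough illustration of the proteogenomic idea in the abstract (and not the Galaxy-P tools or workflows themselves), the Python sketch below applies a single nucleotide variant to a toy coding sequence, translates it in silico, digests the result into tryptic peptides, and keeps only peptides absent from the reference protein. The sequences, the function names, and the exact-substring novelty check are all assumptions made for this example; the substring check is only a simple stand-in for the BLASTP step the abstract describes.

```python
# Illustrative sketch only: toy sequences and hypothetical helper names,
# not the Galaxy-P tools or workflows.
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_snv(cds, pos, alt):
    """Return the coding sequence with the base at 0-based `pos` replaced."""
    return cds[:pos] + alt + cds[pos + 1:]


def translate(cds):
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def tryptic_peptides(protein):
    """Cleave after K or R unless followed by P (simplified, no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def novel_peptides(variant_protein, reference_proteins):
    """Keep variant peptides that match no reference protein exactly.

    Assumption: exact substring matching stands in for a BLASTP search.
    """
    return [
        pep for pep in tryptic_peptides(variant_protein)
        if not any(pep in ref for ref in reference_proteins)
    ]


# Toy reference coding sequence encoding MAKGLERSVK followed by a stop codon.
REFERENCE_CDS = "ATGGCTAAAGGTCTGGAACGTTCTGTTAAATAA"
reference_protein = translate(REFERENCE_CDS)

# Hypothetical SNV: A>T at 0-based position 17 turns codon GAA (E) into GAT (D).
variant_protein = translate(apply_snv(REFERENCE_CDS, 17, "T"))

novel = novel_peptides(variant_protein, [reference_protein])
print(novel)  # ['GLDR']
```

In this toy case only the peptide spanning the substituted residue survives the novelty filter, which mirrors why a customized database is needed: a search against the reference proteins alone could not identify that peptide.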

Timothy Griffin

Center for Mass Spectrometry and Proteomics, University of Minnesota

Le Corum

