Authors
Daniel Blankenberg 1

1: Penn State / Galaxy Team

Abstract
A key feature of the design of the Galaxy platform is the ease with which new tool configurations can be created and shared. Although the community has done a phenomenal job of making thousands of new tools available, transforming large tool suites into Galaxy tools remains an arduous task. For example, the metagenomics packages QIIME (doi:10.1038/nmeth.f.303) and mothur (doi:10.1128/AEM.01541-09) contain over a hundred tools each. Adding both packages to Galaxy has required tremendous effort and patience, spanning months to years.
A completely manual process of making command-line utilities function as Galaxy tools does not scale. To address this hurdle, we must enable automatic generation of Galaxy tools. For a single standalone tool, Planemo's tool_init command can generate a starter-quality tool configuration. However, for tool generation to succeed on a large scale, software developers must take care to design well-thought-out command-line interfaces that make use of standard infrastructure components.
We present two examples of programmatic generation of tools. The first, Anvi'o (doi:10.7717/peerj.1319), is an analysis platform consisting of approximately 50 command-line tools plus an interactive visualization tool. It allows users to perform metagenomic binning, characterize single-nucleotide variation, study bacterial pangenomes, predict the number of bacterial genomes in a metagenomic assembly, or even remove contamination from eukaryotic assembly projects. Galaxy tool configurations, with production-quality interfaces, were generated automatically for each of the Anvi'o platform's commands. The second example is a command-line utility that can convert any R package into a set of Galaxy tools.
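To illustrate why a command-line interface built on standard components lends itself to automatic tool generation, here is a minimal sketch that walks a Python argparse parser and emits a starter Galaxy tool XML skeleton. It is not the generator used for Anvi'o or for R packages. The function name `parser_to_tool_xml`, the type mapping, and the `filter-reads` demo parser are all hypothetical, and the sketch assumes the target tool defines its interface with argparse.

```python
import argparse
import xml.etree.ElementTree as ET

# Illustrative (assumed) mapping from argparse types to Galaxy <param> types.
GALAXY_TYPES = {int: "integer", float: "float", str: "text", None: "text"}


def parser_to_tool_xml(parser, tool_id, executable):
    """Build a starter Galaxy tool XML string from an argparse parser."""
    tool = ET.Element("tool", id=tool_id, name=parser.prog, version="0.1.0")
    if parser.description:
        ET.SubElement(tool, "description").text = parser.description
    command = ET.SubElement(tool, "command")
    inputs = ET.SubElement(tool, "inputs")
    ET.SubElement(tool, "outputs")  # left empty: outputs need manual curation

    parts = [executable]
    # parser._actions is a private argparse attribute; used here for brevity.
    for action in parser._actions:
        if action.dest == "help" or not action.option_strings:
            continue  # this sketch skips -h/--help and positional arguments
        flag = action.option_strings[-1]
        label = action.help or action.dest
        if action.nargs == 0 and action.const is True:
            # store_true flag: a Galaxy boolean that emits the flag when checked
            ET.SubElement(inputs, "param", name=action.dest, type="boolean",
                          truevalue=flag, falsevalue="", label=label)
            parts.append("$" + action.dest)
        else:
            attrs = {"name": action.dest,
                     "type": GALAXY_TYPES.get(action.type, "text"),
                     "label": label}
            if action.default is not None:
                attrs["value"] = str(action.default)
            ET.SubElement(inputs, "param", attrs)
            # Quote the substituted value in the command template.
            parts.append(f"{flag} '${action.dest}'")
    command.text = " ".join(parts)
    return ET.tostring(tool, encoding="unicode")


# Hypothetical command-line tool used only to exercise the sketch.
demo = argparse.ArgumentParser(prog="filter-reads",
                               description="Filter reads by length")
demo.add_argument("--min-length", type=int, default=10,
                  help="Minimum read length")
demo.add_argument("--verbose", action="store_true", help="Verbose logging")

print(parser_to_tool_xml(demo, "filter_reads", "filter-reads"))
```

Because option names, types, defaults, and help strings are all declared in one place, each can be mapped mechanically to a Galaxy parameter. A production-quality generator would additionally need to handle positional arguments, input and output datasets, tests, and help text, which this sketch leaves out.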