Bioinformatics workflow development environments
From Wikipedia, the free encyclopedia
A bioinformatics workflow development environment is a specialized form of integrated development environment (IDE) designed to compose and execute a series of computational or data manipulation steps, known as a workflow, in one specific domain of science: bioinformatics.
Reader beware: By its very nature, this page is susceptible to link spam and advertisements or personal endorsements disguised as facts.
There are currently many different workflow systems, both open source and closed source, commercial and academic. Some have been developed more generally as scientific workflow systems for use by scientists from many different disciplines, such as astronomy and earthquake prediction. A list of scientific workflow environments in alphabetical order is shown below. See the external links below for surveys of the various scientific workflow systems.
Potentially important differentiating features
- Single-process, multi-threaded, or distributed computing
- Client/server or standalone
- Database or flat-file resources
- Federated or data-warehousing
- Graphical or scripted environment
- Proprietary libraries, existing community-supported resources such as BioPerl, BioJava and EMBOSS, or the system's own community-supported libraries
- Compiled or interpreted workflows
- Single-user or shared IDE, with built-in workflow documentation features and a concurrent versioning system
- May include web services (SOAP/XML)
- May include a system for managing the generated results
- Supported scripting languages, typically Java, Perl or Python
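To make the first distinction in the list concrete, the following is a minimal Python sketch, not taken from any of the systems listed on this page. It treats a workflow as an ordered list of steps and runs the same workflow either in a single process or across a thread pool. The step functions (`reverse_complement`, `gc_content`), the `WORKFLOW` list and the runner names are all hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def gc_content(seq):
    """Return the fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_workflow(seq, steps):
    """Apply each step in order, feeding its output to the next step."""
    result = seq
    for step in steps:
        result = step(result)
    return result


# A hypothetical three-step workflow: normalise case, reverse-complement, measure GC.
WORKFLOW = [str.upper, reverse_complement, gc_content]


def run_single_process(seqs):
    """Run the workflow on each input, one after another."""
    return [run_workflow(s, WORKFLOW) for s in seqs]


def run_multi_threaded(seqs, workers=4):
    """Run the workflow on each input using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the inputs in its results.
        return list(pool.map(lambda s: run_workflow(s, WORKFLOW), seqs))
```

Both runners return the same results for the same inputs; only the execution strategy differs. A distributed system would replace the thread pool with a scheduler that dispatches the steps to other machines.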
Alphabetical List of Examples
DiscoveryNet
DiscoveryNet is a £2m EPSRC-funded project at Imperial College London to build an e-Science platform for scientific discovery from the data generated by a wide variety of high-throughput devices.
Geodise
See Grid Enabled Optimisation and Design Search for Engineering (Geodise), developed at the University of Southampton.
Kepler
The Kepler workflow system enables scientists in a variety of disciplines, such as biology, ecology and astronomy, to compose and execute workflows. Kepler is based on the Ptolemy II system for heterogeneous, concurrent modeling and design, which was developed by members of the Ptolemy project at the University of California, Berkeley. Although Ptolemy II was not originally intended for scientific workflows, it provides a mature platform for building and executing them and supports multiple models of computation.
Pegasus
Pegasus is a flexible framework that enables the mapping of complex scientific workflows onto the grid. It was developed at the Information Sciences Institute at the University of Southern California.
Pegasys
Pegasys (http://bioinformatics.ubc.ca/pegasys/) is software for executing and integrating analyses of biological sequences, developed at the University of British Columbia.
Taverna
The Taverna workbench is an open source workflow system that enables scientists (typically, though not exclusively, in bioinformatics) to compose and execute scientific workflows. It has been developed as part of myGrid, a £5.5m EPSRC project based at the University of Manchester.
Triana
The Triana project is an open source problem solving environment developed at Cardiff University that combines an intuitive visual interface with powerful data analysis tools.
Wildfire
Wildfire is a distributed, Grid-enabled workflow construction and execution environment. It has a graphical user interface for constructing and running workflows. Wildfire borrows user interface features from Jemboss and adds a drag-and-drop interface that allows the user to compose EMBOSS (and other) programs into workflows. For execution, Wildfire uses GEL, its underlying workflow execution engine, which can exploit the parallelism available on multi-CPU machines, including Beowulf-class clusters and Grids.
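The general idea of exploiting available parallelism in a workflow can be sketched as follows. This is an illustrative Python example only: it is not GEL syntax and does not describe how Wildfire or GEL is implemented. It assumes Python 3.9 or later (for `graphlib`), and the names `run_dag`, `EXAMPLE_TASKS` and `EXAMPLE_DEPS` are hypothetical. Each step starts as soon as all the steps it depends on have finished, so independent steps can run at the same time.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(tasks, deps, workers=4):
    """Run a workflow expressed as a dependency graph.

    tasks maps a step name to a callable that receives a dict of the
    results of its upstream steps; deps maps a step name to the set of
    step names it depends on (an empty set for a starting step).
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results = {}
    running = {}  # future -> step name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Submit every step whose dependencies are all complete.
            for name in sorter.get_ready():
                inputs = {d: results[d] for d in deps.get(name, ())}
                running[pool.submit(tasks[name], inputs)] = name
            # Block until at least one running step finishes.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # may unlock downstream steps
    return results


# Hypothetical four-step workflow: "gc" and "length" both depend only on
# "load", so they are free to run concurrently; "report" waits for both.
EXAMPLE_TASKS = {
    "load": lambda _: "ATGC",
    "gc": lambda r: sum(base in "GC" for base in r["load"]),
    "length": lambda r: len(r["load"]),
    "report": lambda r: r["gc"] / r["length"],
}
EXAMPLE_DEPS = {
    "load": set(),
    "gc": {"load"},
    "length": {"load"},
    "report": {"gc", "length"},
}
```

Calling `run_dag(EXAMPLE_TASKS, EXAMPLE_DEPS)` returns a dictionary holding the result of every step. On a cluster or Grid, the thread pool would be replaced by a mechanism for submitting jobs to remote nodes, but the dependency-driven scheduling is the same in principle.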
External links
- A survey of Scientific Workflow Development Environments
- Taverna: Lessons in creating a workflow environment for the Life Sciences. This paper reviews some of the above workflow systems.
- A taxonomy of scientific workflow systems for grid computing, from the ACM SIGMOD Record