Approaches to Web Development for Bioinformatics


Introduction

One of the driving forces behind research in biology and medicine over the past decade or two has been the availability of huge amounts of DNA and protein data.  The Human Genome Project [1] was one of the major initiatives.  It led to massive improvements in the efficiency of DNA sequencing and inspired numerous other genome projects.  However, you do not have to be the US National Institutes of Health (NIH) to create a bioinformatics portal.  Smaller organizations may have many reasons for building a bioinformatics portal to collaborate with others, including sharing data and research results.

Given that you have an idea for analyzing or presenting data in a particular way, a complete bioinformatics web application depends on the following basic pieces, which are the subject of this article:

  1. A source of data. This may be your own organization's data or data from a public source.
  2. An application programming language to access and analyze the data. You need to know how to use it to read and process the data from the first step.
  3. A web application platform to provide an HTML user interface for your data and analysis results. This enables you to present the data and analysis from the previous step to your users and allow them to interact with it.
  4. Optionally, a data store, such as a relational database, to store results or users' data. You will need this if your users are to save any data for use in a future visit.
  5. Optionally, software tools and libraries developed by others. Reusing these can save you time and may enable you to do things that you don't have the expertise to do yourself.
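
To make these pieces concrete, here is a minimal sketch of how they fit together. It is written in Python purely for brevity; Python is not one of the platforms compared in this article. Everything in it is an assumption made for illustration: the inline FASTA records are invented, the function names (parse_fasta, gc_content, render_html, save_results) and the gc_results table are hypothetical, and GC content merely stands in for whatever analysis you have in mind. In a real application a library such as BioPerl or BioJava would normally replace the hand-written parser (piece 5).

```python
import html
import sqlite3

# Piece 1: a source of data. A real portal would read a file or query a
# public database; these two FASTA records are made up for illustration.
FASTA_TEXT = """>seq1 hypothetical example record
ATGCGCGTTA
GCCA
>seq2 another hypothetical record
ATATATAT
"""


def parse_fasta(text):
    """Return a dict mapping each record identifier to its sequence."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first word after the '>' marker.
            current_id = line[1:].split()[0]
            records[current_id] = ""
        elif current_id is not None:
            # Sequence data may span several lines; join them.
            records[current_id] += line.upper()
    return records


# Piece 2: application code that analyzes the data.
def gc_content(sequence):
    """Return the fraction of G and C bases (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    gc_count = sum(1 for base in sequence if base in "GC")
    return gc_count / len(sequence)


# Piece 3: an HTML presentation of the results. A web platform would
# send this string to the browser; here it is only built and returned.
def render_html(results):
    """Return an HTML table with one row per sequence identifier."""
    rows = "".join(
        "<tr><td>{}</td><td>{:.1f}%</td></tr>".format(
            html.escape(seq_id), 100 * value
        )
        for seq_id, value in results.items()
    )
    header = "<tr><th>Sequence</th><th>GC content</th></tr>"
    return "<table>" + header + rows + "</table>"


# Piece 4 (optional): a relational data store for results.
def save_results(connection, results):
    """Store the results in a hypothetical gc_results table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS gc_results "
        "(seq_id TEXT PRIMARY KEY, gc REAL)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO gc_results (seq_id, gc) VALUES (?, ?)",
        list(results.items()),
    )
    connection.commit()


if __name__ == "__main__":
    sequences = parse_fasta(FASTA_TEXT)
    gc_by_id = {seq_id: gc_content(seq) for seq_id, seq in sequences.items()}
    database = sqlite3.connect(":memory:")  # in-memory database for the demo
    save_results(database, gc_by_id)
    print(render_html(gc_by_id))
```

Each function corresponds to one piece in the list, so any of them can be swapped out independently, for example by replacing the inline text with a download from a public database or the in-memory SQLite database with a server-based one.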

A number of bioinformatics databases are publicly available.  In addition, their providers often make the databases and software tools themselves available so that you can build your own web user interfaces on top of them.  The databases include:

When developing bioinformatics software, it can be a challenge to be an expert in biology and in software engineering at the same time.  This is made worse by the many different programming platforms and technologies used in bioinformatics.  For this reason, I have included introductory material showing how to get a simple example running on each platform, so that you can appreciate what the issues will be, followed by pointers to further resources.  My goal is for this article to serve as an aid to getting started and to weighing the choices among so many different software technologies.  The following table summarizes the programming languages, web application platforms, and bioinformatics tools and libraries discussed in this article, as well as their relative advantages and disadvantages.

| Programming Language | Web Technology | Bioinformatics Tools | Advantages | Disadvantages |
|---|---|---|---|---|
| Perl | Common Gateway Interface (CGI) | BioPerl | Large active open source community, cheap in an ISP hosted environment | Language not as structured as strongly typed languages |
| Java | Servlet / Java 2 Enterprise Edition | BioJava | Large active open source community, well developed web programming model | Expensive in an ISP hosted environment |
| C and C++ | (CGI) | BLAST | Some critical programs, such as BLAST, are already written using it, can be integrated with Perl | Not suitable for web application development (too complex) |
| C# | ASP.NET | | Relatively cheap in an ISP hosted environment, well developed web programming model | Small active open source community, tools are not open source or free |
| PHP | Apache Module and others | BioPHP | Cheap in an ISP hosted environment | Smaller active open source community, language not as suitable to large scale software development |
| JavaScript (AJAX) | All other server side languages | | Better user experience | Smaller active open source community, adds an additional level of software development complexity |

Many of the tools described are open source.  This can present a challenge to some commercial organizations where ownership of intellectual property is very important.  See the article Open Source Software and Documentation for Chemistry and Biomedical Sciences on this web site for more details on open source.

After looking at each, we will find that, despite their individual strengths and weaknesses, all the web application platforms discussed have the pieces needed for bioinformatics applications and, at a fundamental level, have broadly similar capabilities.  Given that, there will be many bioinformatics applications written on different platforms.  The next question is how to integrate them to provide users with something they can handle.  The section Application Integration discusses this question.

See the page Bioinformatics Resources on this web site for a list of popular and user suggested online resources.



Please send ideas and opinions by email at alexamies@gmail.com.

© 2006-2007 Alex Amies