One of the driving forces behind research in biology and medicine over the past decade or two has been the availability of huge amounts of DNA and protein data. The Human Genome Project was one of the major initiatives. It led to massive improvements in the efficiency of DNA sequencing and inspired numerous other genome projects. However, you do not have to be the US National Institutes of Health (NIH) to create a bioinformatics portal. Smaller organizations may have many reasons for building a bioinformatics portal to collaborate with others, including sharing data and research results.
Given that you have an idea for analyzing or presenting data in a particular way, a complete bioinformatics web application depends on these basic pieces, which are what this article is all about:
A number of bioinformatics databases are publicly available and, in addition, their maintainers make the databases and software tools themselves available so that you can build your own web user interfaces. The databases include:
When developing bioinformatics software, it can be a challenge to be an expert in biology and in software engineering at the same time. This is made worse by the many different programming platforms and technologies used in bioinformatics. For this reason, I have included enough introductory material to get a simple example running on each platform, so that you can appreciate what the issues will be, followed by pointers to further resources. My goal is for this article to serve as an aid to getting started and to weighing the choices among so many different software technologies. The following table summarizes the programming languages, web application platforms, and bioinformatics tools and libraries discussed in this article, as well as their relative advantages and disadvantages.
| Programming Language | Web Technology | Bioinformatics Tools | Advantages | Disadvantages |
|---|---|---|---|---|
| Perl | Common Gateway Interface (CGI) | BioPerl | Large active open source community, cheap in an ISP hosted environment | Language not as structured as strongly typed languages |
| Java | Servlet / Java 2 Enterprise Edition | BioJava | Large active open source community, well developed web programming model | Expensive in an ISP hosted environment |
| C and C++ | (CGI) | BLAST | Some critical programs, such as BLAST, are already written using it, can be integrated with Perl | Not suitable for web application development (too complex) |
| C# | ASP.NET | | Relatively cheap in an ISP hosted environment, well developed web programming model | Small active open source community, tools are not open source or free |
| PHP | Apache Module and others | BioPHP | Cheap in an ISP hosted environment | Smaller active open source community, language not as suitable to large scale software development |
| JavaScript (AJAX) | All other server side languages | | Better user experience | Smaller active open source community, adds an additional level of software development complexity |
Many of the tools described are open source. This can present a challenge to some commercial organizations where ownership of intellectual property is very important. See the article Open Source Software and Documentation for Chemistry and Biomedical Sciences on this web site for more details on open source.
After looking at each, we will find that, despite their individual strengths and weaknesses, all the web application platforms discussed have the fundamental pieces needed for bioinformatics applications and, at a fundamental level, have broadly similar capabilities. Given that, there will be many bioinformatics applications written on different platforms. The next question is how to integrate them so that users are presented with something they can handle. The section Application Integration discusses these questions.
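To make the idea of platform-independent "fundamental pieces" concrete, here is a minimal sketch. It assumes those pieces are roughly: accepting user input over HTTP, running an analysis on sequence data, and returning HTML. Python and its WSGI interface are used purely for brevity. Python is not one of the platforms compared in the table, and the function names (`parse_fasta`, `gc_content`), the `fasta` query parameter, and the choice of GC content as the analysis are all illustrative assumptions rather than anything prescribed by this article.

```python
from html import escape
from urllib.parse import parse_qs


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record id: sequence}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the id.
            tokens = line[1:].split()
            current = tokens[0] if tokens else "unnamed"
            records[current] = ""
        elif current is not None:
            # Sequence line: append to the record opened by the last header.
            records[current] += line.upper()
    return records


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    if not sequence:
        return 0.0
    return sum(1 for base in sequence if base in "GC") / len(sequence)


def application(environ, start_response):
    """WSGI entry point: read ?fasta=... from the query string, return HTML."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    fasta_text = query.get("fasta", [""])[0]
    rows = "".join(
        "<tr><td>{}</td><td>{:.2f}</td></tr>".format(escape(name), gc_content(seq))
        for name, seq in parse_fasta(fasta_text).items()
    )
    body = (
        "<table><tr><th>Sequence</th><th>GC fraction</th></tr>{}</table>"
    ).format(rows)
    payload = body.encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(payload))),
        ],
    )
    return [payload]
```

The same three steps (read the request, process the sequence, render the result) map onto a CGI script, a servlet, or an ASP.NET page. In a real application the analysis step would more likely call a library such as BioPerl or BioJava, or an external program such as BLAST, than hand-written parsing code.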
See the page Bioinformatics Resources on this web site for a list of popular and user suggested online resources.
Please send ideas and opinions by email at alexamies@gmail.com.