Approaches to Web Development for Bioinformatics

Java

This section discusses the basics of the Java language with a bioinformatics flavor. The section Web Programming with Java discusses developing web user interfaces with Java and the section BioJava discusses the BioJava open source project. Several other sections give Java examples as well, including the sections AJAX and Accessing a Relational Database with Java.

Language Basics

The Sun Microsystems Java web site [16] is a great place to begin studying Java. You can freely download the latest Java Development Kit (JDK) and work your way through the Java Tutorial.

Here is a Java program that does the same thing as our Hello World Perl program.  The file is in Hello.java.


public class Hello {

public static void main(String[] argv) {
System.out.println("Hello World!");
}
}

Compile it by entering the command below.  This generates the file Hello.class in the same directory.


> javac Hello.java

Run the program by entering the command


> java Hello

The first thing you may notice when comparing it to our Perl Hello World program is that it takes several lines to write instead of one, ignoring the comments in the Perl program.  All Java code must be enclosed in a class (Hello).  Procedural code is enclosed in a method (main).  Classes and methods have a visibility (public).  The main method signature introduces the distinction between static and instance members.  The method return type (void) must be declared even when the method returns nothing.  Finally, you have to go through a class (System) to reach its output stream object (out), and then call that object's println() method to print the text.
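The concepts listed above can be seen together in a small sketch. The class DnaSequence and its methods are hypothetical names made up for illustration; they are not part of the examples on this site.

```java
/**
 * Illustrative class (hypothetical) showing visibility,
 * static members, and instance members.
 */
class DnaSequence {

    // private: visible only inside this class
    private final String symbols;

    // Constructor: creates one instance holding one sequence
    public DnaSequence(String symbols) {
        this.symbols = symbols;
    }

    // Instance method: works on the data of one particular object
    public int length() {
        return symbols.length();
    }

    // Static method: belongs to the class itself, so no object is needed
    public static boolean isNucleotide(char c) {
        return "acgt".indexOf(Character.toLowerCase(c)) >= 0;
    }

    // main is static because it is called before any object exists
    public static void main(String[] argv) {
        DnaSequence seq = new DnaSequence("atcg");
        System.out.println("Length: " + seq.length());
        System.out.println("Is N a nucleotide? " + DnaSequence.isNucleotide('N'));
    }
}
```

Calling length() requires an object created with new, while isNucleotide() is called on the class name directly, in the same way that main is.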

Although it seems more complex than the Perl example, all these concepts have been developed with code reuse in mind. Let's look at an example that is similar to the regular expression example with Perl that was used to validate that all the symbols in a DNA string were valid nucleotides.  The file is Validate.java.


import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Class to demonstrate validation of an input string to make
 * sure all symbols are valid nucleotides.
 */
public class Validate {

    /** The entry point for the program
     * @param argv The first and only command line argument is the string to validate
     */
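    public static void main(String[] argv) {
        // Guard against a missing argument, which would otherwise throw an exception
        if (argv.length != 1) {
            System.err.println("Usage: java Validate <string>");
            return;
        }
        // Match any character that is not a, t, c or g, ignoring case
        Pattern pattern = Pattern.compile("[^atcg]", Pattern.CASE_INSENSITIVE);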
        Matcher matcher = pattern.matcher(argv[0]);
        if (matcher.find()) {
            System.out.println("Invalid symbol " + matcher.group() +
                " found at position " + matcher.start() + ".");
        }
    }
}

Compile the class with the command


>javac Validate.java

Test the program with similar input to the Perl script:


>java Validate atcgNSTOP

The output is similar to that of the Perl script:


Invalid symbol N found at position 4.

A couple of points are immediately obvious:

The language constructs described above allow programmers to better encapsulate their code behind an application programming interface (API).  The interfaces exposed to users of the API can be kept to the minimum needed.  Users of the API are constrained to use it in a particular way, according to the Java types in the interface, and the Java platform comes with tools to document the API in HTML.  This creates some programming overhead, but many large-scale software development projects, both for products and for in-house business systems, have found it worthwhile.
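As a rough sketch of this idea, the validation logic from Validate.java could be wrapped in a small API. The class name NucleotideValidator and the method firstInvalidPosition are assumptions chosen for illustration. Only one method is exposed; the regular expression is a private implementation detail, and the comment block on the method is in the form that the javadoc tool reads.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical API for validating DNA strings. Only a single
 * method is exposed; the regular expression stays private.
 */
final class NucleotideValidator {

    // Implementation detail hidden from users of the API
    private static final Pattern INVALID =
            Pattern.compile("[^atcg]", Pattern.CASE_INSENSITIVE);

    // Private constructor: the class only offers static methods
    private NucleotideValidator() {
    }

    /**
     * Finds the first symbol that is not a valid nucleotide.
     * @param dna the string to check
     * @return the zero-based position of the first invalid symbol,
     *         or -1 if every symbol is valid
     */
    public static int firstInvalidPosition(String dna) {
        Matcher matcher = INVALID.matcher(dna);
        return matcher.find() ? matcher.start() : -1;
    }

    /** Small demonstration of calling the API. */
    public static void main(String[] argv) {
        System.out.println(firstInvalidPosition("atcgNSTOP"));
        System.out.println(firstInvalidPosition("ATCG"));
    }
}
```

Because callers see only the method signature, the pattern could later be replaced, for example to accept ambiguity codes, without changing any code that uses the class.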

Finally, to complete our comparison of Perl with Java, let's look at an example that translates an RNA sequence to an amino acid sequence. The Java source is in Translate.java.

import java.util.HashMap;
import java.util.Map;

/**
 * Class demonstrates translation of an RNA sequence into an amino acid sequence.
 */
public class Translate {

    // RNA String to translate
    private static final String RNA = "auggcacaggcacuguugguacccccaggaccugaaagcuuccgccuuuuuacuaga";

    // Map from each codon to the amino acid it codes for
    private static final Map<String, String> TRANSLATION = new HashMap<String, String>();

    // Codons to translate
    private static final String[] CODONS = {
            "uuu", "uuc", "uua", "uug",
            "ucu", "ucc", "uca", "ucg",
            "uau", "uac", "uaa", "uag",
            "ugu", "ugc", "uga", "ugg",
            "cuu", "cuc", "cua", "cug",
            "ccu", "ccc", "cca", "ccg",
            "cau", "cac", "caa", "cag",
            "cgu", "cgc", "cga", "cgg",
            "auu", "auc", "aua", "aug",
            "acu", "acc", "aca", "acg",
            "aau", "aac", "aaa", "aag",
            "agu", "agc", "aga", "agg",
            "guu", "guc", "gua", "gug",
            "gcu", "gcc", "gca", "gcg",
            "gau", "gac", "gaa", "gag",
            "ggu", "ggc", "gga", "ggg"
            };

    // Amino acids, listed in the same order as the entries in CODONS
    private static final String[] AMINO_ACIDS = {
            "F", "F", "L", "L",
            "S", "S", "S", "S",
            "Y", "Y", "--STOP--", "--STOP--",
            "C", "C", "--STOP--", "W",
            "L", "L", "L", "L",
            "P", "P", "P", "P",
            "H", "H", "Q", "Q",
            "R", "R", "R", "R",
            "I", "I", "I", "M",
            "T", "T", "T", "T",
            "N", "N", "K", "K",
            "S", "S", "R", "R",
            "V", "V", "V", "V",
            "A", "A", "A", "A",
            "D", "D", "E", "E",
            "G", "G", "G", "G"
            };

    // initialize the map
    private static void init() {
        for (int i=0; i<CODONS.length; i++) {
            TRANSLATION.put(CODONS[i], AMINO_ACIDS[i]);
        }
    }

    /** The entry point for the program
     * @param argv No command line arguments are used
     */
    public static void main(String[] argv) {
        init();
        StringBuilder aminoAcidSequence = new StringBuilder();
        int i = 0;
        while (i < RNA.length()) {
            aminoAcidSequence.append(TRANSLATION.get(RNA.substring(i, i+3)));
            i += 3;
        }
        System.out.println("Amino acid string: " + aminoAcidSequence);
    }
}

The program can be compiled and run with the commands


>javac Translate.java
>java Translate
Amino acid string: MAQALLVPPGPESFRLFTR
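One possible variation on Translate.java, offered as a sketch rather than as the approach used on this site, builds the same codon table with nested loops and moves the translation into a reusable method. The class name CodonTable, the method translate, and the use of '*' for a stop codon are assumptions made for illustration. The method also ignores an incomplete codon at the end of the input, which the loop in Translate.java does not handle.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical variation on Translate: the codon table is built
 * with nested loops and translation is a reusable method.
 */
class CodonTable {

    // Bases in the same order used by the CODONS array in Translate.java
    private static final String BASES = "ucag";

    // One letter per codon in that order; '*' marks a stop codon
    private static final String AMINO_ACIDS =
            "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    private static final Map<String, Character> TABLE = new HashMap<String, Character>();

    // Fill the table once, when the class is loaded
    static {
        int index = 0;
        for (char first : BASES.toCharArray()) {
            for (char second : BASES.toCharArray()) {
                for (char third : BASES.toCharArray()) {
                    TABLE.put("" + first + second + third, AMINO_ACIDS.charAt(index++));
                }
            }
        }
    }

    /**
     * Translates an RNA string codon by codon.
     * @param rna the RNA sequence, in upper or lower case
     * @return one letter per complete codon, with '*' for a stop codon
     */
    public static String translate(String rna) {
        String lower = rna.toLowerCase();
        StringBuilder protein = new StringBuilder();
        // Stop before any incomplete codon at the end of the sequence
        for (int i = 0; i + 3 <= lower.length(); i += 3) {
            Character aminoAcid = TABLE.get(lower.substring(i, i + 3));
            if (aminoAcid == null) {
                throw new IllegalArgumentException("Unknown codon at position " + i);
            }
            protein.append(aminoAcid.charValue());
        }
        return protein.toString();
    }

    public static void main(String[] argv) {
        String rna = "auggcacaggcacuguugguacccccaggaccugaaagcuuccgccuuuuuacuaga";
        System.out.println("Amino acid string: " + translate(rna));
    }
}
```

For the RNA string used in Translate.java this prints the same amino acid string as the original program.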

Although the program is more verbose than the Perl equivalent, there are several notable things:

Tools for Java Development

A number of tools for Java development are included with the basic Java Development Kit (JDK).  These include the compiler javac, the documentation tool javadoc, the Java ARchiving (compression) tool jar, and performance instrumentation.  In addition, there are probably more freely available development tools for Java than for any other platform, including

There are also a number of commercial tools that can be useful for web development and performance testing.

Java Compared with Other Languages

Java has been the language of choice for introductory computer science courses at many universities and for project and product development at many companies.  It is a very well designed language and it is a pleasure (for me, at least) to program in.  However, it has a number of disadvantages

The availability of open source projects in Java is also an important factor.  There are a number of open source bioinformatics projects, such as BioJava, and there are a huge number of other open source Java projects of all kinds.

See the page Java Resources on this web site for a list of popular and user suggested online resources.


Please send ideas and opinions by email at alexamies@gmail.com.

© 2006-2007 Alex Amies