Approaches to Web Development for Bioinformatics


Working with Files and Text

The section Working with Data from Public Databases describes some of the common data formats used in bioinformatics. One of these is the FASTA format, which holds sequence data. Each sequence in a FASTA file is preceded by a description line that begins with the character '>' and works like a comment. This PHP script reads a file and prints each FASTA description line.

PHP

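<?php
// Open the FASTA file for reading; stop if it cannot be opened.
$fp = fopen('HD.txt', 'r') or die("Can't open file.");
// Compare against false so that a final line of just "0" does not end the loop early.
while (($line = fgets($fp, 1024)) !== false) {
    // A FASTA description line starts with '>'.
    if ($line[0] == '>') {
        // Print the description without the leading '>'.
        echo "Sequence description: " . substr($line, 1);
    }
}
fclose($fp) or die("Can't close file.");
?>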

The fopen function opens the file named by its first argument. The second argument is the mode in which the file is opened; a value of 'r' means read only. fopen returns a file handle, which is stored in $fp in this example. The fgets function reads one line of text from that handle, which is stored in the variable $line here. Its second argument is the maximum length to read: fgets returns at most one byte fewer than this value, so a longer line is returned in pieces over several calls.
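The effect of the length argument to fgets is easiest to see with a deliberately small value. The following is a minimal sketch, not part of the original script: it writes a made-up two-line FASTA record to a temporary file (the file contents and the variable names $path and $chunks are assumptions for illustration) and reads it back with a length of 6.

```php
<?php
// Hypothetical demo data: a tiny FASTA record written to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'fasta');
file_put_contents($path, ">seq1 example description\nACGTACGT\n");

$fp = fopen($path, 'r');
if ($fp === false) {
    die("Can't open file.");
}

$chunks = array();
// With a length of 6, each call returns at most 5 bytes,
// stopping earlier if it reaches a newline or the end of the file.
while (($chunk = fgets($fp, 6)) !== false) {
    $chunks[] = $chunk;
}
fclose($fp);
unlink($path);

// Show the pieces that fgets returned.
print_r($chunks);
```

The first piece returned is '>seq1', and joining all the pieces gives back the original text. With a length of 1024, as in the script above, typical description lines fit in a single call.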

The program then checks whether the line begins with the character '>'. If so, it prints the rest of the line, omitting the '>' by using the substr function. Finally, the script closes the file with the fclose function. The output of this program for the example FASTA file HD.txt is:


Sequence description: gi|90903230|ref|NM_002111.6| Homo sapiens huntingtin (Huntington disease) (HD), mRNA
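The same steps can be wrapped in a function that returns the descriptions instead of printing them, which is convenient when a file holds more than one sequence. This is a sketch under stated assumptions rather than part of the original script: the function name fasta_descriptions and the two-record sample data are hypothetical.

```php
<?php
// Hypothetical helper (not from the original text): collect every FASTA
// description line in a file, without the leading '>' or trailing newline.
function fasta_descriptions($path)
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        return false;
    }
    $descriptions = array();
    // Without a length argument, fgets reads each line in full.
    while (($line = fgets($fp)) !== false) {
        if (strncmp($line, '>', 1) === 0) {
            $descriptions[] = rtrim(substr($line, 1), "\r\n");
        }
    }
    fclose($fp);
    return $descriptions;
}

// Usage with a small made-up two-record file.
$path = tempnam(sys_get_temp_dir(), 'fasta');
file_put_contents($path, ">seq1 first record\nACGT\n>seq2 second record\nTTGA\n");
$found = fasta_descriptions($path);
unlink($path);

foreach ($found as $d) {
    echo "Sequence description: $d\n";
}
```

Returning an array keeps the file handling separate from the output, so the caller can decide whether to print the descriptions, count them, or pass them on to other code.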



© 2006-2007 Alex Amies