Approaches to Web Development for Bioinformatics

Developing Applications with C#


Basic Text Processing

The string class in C# has many useful methods for basic text processing. In addition, C# has regular expression capabilities similar to those of Perl, Java, and other languages. Here is an example that accepts a DNA string from the command line, validates it with a regular expression, and then transcribes it to RNA. The code is in the file DNAtoRNA.cs.


C#

// Transcribe a DNA string to the equivalent RNA.
// Also converts upper to lower case.

using System;
using System.Text.RegularExpressions;

class DNAtoRNA
{
    // Validates a DNA string: returns true only if every character
    // is one of a, t, c, or g (in either case)
    public bool IsValid(string dna)
    {
        // The pattern matches any single character that is NOT a, t, c, or g
        Regex rx = new Regex("[^atcg]", RegexOptions.IgnoreCase);
        MatchCollection matches = rx.Matches(dna);
        if (matches.Count == 0)
        {
            return true;
        }

        // Print out the invalid characters to the console
        foreach (Match match in matches)
        {
            Console.WriteLine("Character {0} at position {1} is not valid.", match, match.Index);
        }

        return false;
    }

    // Does the transcription
    public string Transcribe(string dna)
    {
        return dna.ToLower().Replace("t", "u");
    }

    // Entry point from the console
    public static int Main(string[] args)
    {
        // Validate input
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: DNAtoRNA {DNA string}");
            Console.WriteLine("Example: DNAtoRNA agctagAGG");
            return -1;
        }

        // Repeat input to console
        string dna = args[0];
        Console.WriteLine("Input DNA: {0}", dna);

        DNAtoRNA converter = new DNAtoRNA();

        // Validate
        if (converter.IsValid(dna))
        {
            // Compute RNA and output to console
            string rna = converter.Transcribe(dna);
            Console.WriteLine("Output RNA string: {0}", rna);
        }

        return 0;
    }
}

The program defines a class called DNAtoRNA. The first method, IsValid, uses the Regex class in namespace System.Text.RegularExpressions to search for any character other than 'a', 't', 'c', or 'g', ignoring case. Each invalid character found is printed to the console along with its zero-based position in the string. See the MSDN library documentation for more details on Regex and the other classes in namespace System.Text.RegularExpressions. The method Transcribe uses basic string methods: ToLower converts the string to lower case and Replace substitutes "u" for every occurrence of "t". Compile the class and run it with an argument like this:


>DNAtoRNA.exe agctagAGGWXY
Input DNA: agctagAGGWXY
Character W at position 9 is not valid.
Character X at position 10 is not valid.
Character Y at position 11 is not valid.
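
When only a yes or no answer is needed, without a report of each invalid character, the check can be written as a single anchored match. The following is a minimal sketch of that variation, not part of DNAtoRNA.cs: the class names DnaTools and DnaToolsDemo and the method name IsValidDna are hypothetical and used only for illustration. Unlike IsValid above, this version also rejects an empty string, because the pattern requires at least one character.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class for illustration; not part of DNAtoRNA.cs.
static class DnaTools
{
    // \A and \z anchor the pattern to the start and end of the input,
    // so one IsMatch call checks that every character is a, c, g, or t.
    private static readonly Regex ValidDna =
        new Regex(@"\A[acgt]+\z", RegexOptions.IgnoreCase);

    // Returns true for a non-empty string made up only of a, c, g, t
    // in either case.
    public static bool IsValidDna(string dna)
    {
        return dna != null && ValidDna.IsMatch(dna);
    }

    // Same transcription as Transcribe above: lower-case, then t -> u.
    public static string Transcribe(string dna)
    {
        return dna.ToLowerInvariant().Replace('t', 'u');
    }
}

class DnaToolsDemo
{
    static void Main()
    {
        Console.WriteLine(DnaTools.IsValidDna("agctagAGG"));     // True
        Console.WriteLine(DnaTools.IsValidDna("agctagAGGWXY"));  // False
        Console.WriteLine(DnaTools.Transcribe("agctagAGG"));     // agcuagagg
    }
}
```

The trade-off is that the anchored pattern cannot say which characters are wrong or where they occur, so the Matches approach shown earlier remains the better choice when the user needs that feedback.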



Please send ideas and opinions by email at alexamies@gmail.com.

© 2006-2007 Alex Amies