A web server receives and responds to Hypertext Transfer Protocol (HTTP) requests, so web servers are more accurately called HTTP servers. When developing a web application, you will be most concerned with how your application programming language interfaces with the web server. That is discussed for several programming languages in the following sections. However, a number of aspects are common to many application programming languages, and some are particular to certain web servers. These aspects include file permissions and access, authentication, and encryption. The individual web servers and those common aspects are the subject of this section.
The two most commonly used web servers are the Apache HTTP Server and Microsoft Internet Information Services (IIS). There are also a number of others on the market. Java application servers usually include their own web server but also provide plug-ins that support integration with other web servers. It can be a good idea to use a web server other than the one built into the Java application server, to take advantage of a more production-ready server for security purposes. Many development environments, including Eclipse and Microsoft Visual Studio, have built-in web servers, making it less necessary for developers to be familiar with production web servers.
Regardless of the web server you choose, two files that you will probably want to have
in the root document directory are robots.txt and favicon.ico.
The file robots.txt tells web crawlers where to look and where not to look
to index the files on your server. A basic robots.txt file consists of the two lines
User-agent: * and Disallow: /cgi-bin/. This tells all web crawlers that they may look
anywhere except the directory cgi-bin.
The file favicon.ico is the icon that browsers place at the left-hand side of
the address bar and on tabs. You can create one with a graphics program, such as GIMP.
Some browsers (mostly IE) automatically look for the favicon.ico file in
the same directory that the page is served from. You can also explicitly direct browsers
to the icon file with a link element in the HTML head, such as
<link rel="shortcut icon" href="/favicon.ico">.
The Apache HTTP Server is reportedly the most commonly used web server. It is a favorite of Internet Service Providers (ISPs) and of the open source community. The Apache HTTP Server can be freely downloaded from the Apache site. There are versions of the server for many different operating systems, including Windows, which has an easy installation program. My environment at present is Apache 2.2.4 on Windows for development and an older version of Apache on Linux in production.
The Apache executable is called httpd and is in the bin directory under the Apache
installation tree. On Windows you can start it as a Windows service using the
convenient Apache icon in the system tray. The configuration file is httpd.conf
in the conf directory. The first thing that you will likely want to do is change the
document root, which is the location of the files served in response to HTTP requests.
To do that, modify the DocumentRoot directive and add or modify the matching
<Directory> section.
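A minimal sketch of those two changes, in the Apache 2.2 syntax that matches the version mentioned above. The path C:/www/htdocs is a hypothetical location, not one taken from this article; substitute the directory that holds your own files.

```apache
# Hypothetical path: replace with the directory that holds your site files.
DocumentRoot "C:/www/htdocs"

<Directory "C:/www/htdocs">
    # Follow symbolic links; directory listings stay off because Indexes is not set.
    Options FollowSymLinks
    # Ignore .htaccess files in this tree.
    AllowOverride None
    # Apache 2.2 access control: let all clients read these files.
    Order allow,deny
    Allow from all
</Directory>
```

On Apache 2.4 and later, the Order and Allow lines are replaced by a single Require all granted line.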
By default the Apache log files access.log and error.log will
be written to the logs directory. The first thing that you will probably notice is
404 File Not Found entries for robots.txt and favicon.ico.
Apache modules are a way to add functionality to the basic web server. The Common Gateway Interface (CGI) and PHP web platforms discussed later in this article are Apache modules. Other useful modules are available for authorization, virtual hosts, logging and URL mapping.
The Common Gateway Interface (CGI) is discussed in the section Web User Interfaces of this article. Here I am including some notes on setting it up for Apache. See the Apache Tutorial: Dynamic Content with CGI and Using Apache with Microsoft Windows.
To set up CGI on Apache you need to load the CGI module (mod_cgi) with a LoadModule line in httpd.conf.
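As a sketch, assuming the module file sits in the modules directory of a standard installation, the line would look like this:

```apache
# Load the CGI module; the path is relative to the server root.
LoadModule cgi_module modules/mod_cgi.so
```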
It is good practice to set a script alias (ScriptAlias, a directive of the mod_alias module, discussed below) so that scripts are executed rather than accessed directly as files. This can be done with a ScriptAlias line in httpd.conf.
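A sketch of such a line follows. The directory C:/www/cgi-bin/ is a hypothetical location chosen for illustration.

```apache
# Map URLs starting with /cgi-bin/ to the script directory (hypothetical path)
# and treat every file found there as a CGI script.
ScriptAlias /cgi-bin/ "C:/www/cgi-bin/"
```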
Forward slashes are used in paths on Windows, just as on UNIX. A <Directory> stanza for the script directory is also needed.
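A sketch of such a stanza in Apache 2.2 syntax, assuming a hypothetical script directory C:/www/cgi-bin that is the target of a ScriptAlias:

```apache
<Directory "C:/www/cgi-bin">
    # Ignore .htaccess files and enable no extra features here;
    # ScriptAlias already marks the contents as CGI scripts.
    AllowOverride None
    Options None
    # Apache 2.2 access control: allow all clients to call the scripts.
    Order allow,deny
    Allow from all
</Directory>
```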
On UNIX and Linux the first line of a script usually starts with #! followed by the
path of the interpreter, for example #!/usr/bin/perl, to let the server
know what program to use to execute the script.
On Windows it is convenient to set the ScriptInterpreterSource directive so that
the execution program is looked up in the Windows registry, based on the file
extension of the script. You can do that with a single line in httpd.conf.
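A sketch of that line, which can be placed at server level or inside the <Directory> stanza for the scripts:

```apache
# Find the interpreter through the Windows registry, keyed on the script's
# file extension, instead of reading the #! line of the script.
ScriptInterpreterSource Registry
```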
The mod_alias and mod_rewrite modules are designed for URL mapping. The mod_alias module is mostly suitable for mapping URLs to files that can exist either on your server or somewhere else. The mod_rewrite module can map URLs that match regular expressions to either files or scripts that generate dynamic content. This can be useful in mapping REST patterns to scripts that process requests and dynamically generate data, without requiring users to see the unfriendly URLs involved. See the section Application Integration for a discussion of REST.
To enable the rewrite module, add or uncomment its LoadModule line in the httpd.conf file.
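As a sketch, assuming the module file sits in the modules directory of a standard installation:

```apache
# Load the rewrite module; the path is relative to the server root.
LoadModule rewrite_module modules/mod_rewrite.so
```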
As an example, to have the mod_rewrite module map URLs of the form
http://host/gene/symbol, where symbol is an arbitrary symbol for a gene,
to a script that dynamically generates content about the gene, you could use a
rewriting rule placed inside an <IfModule> stanza.
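A sketch of such a rule for the server-level configuration in httpd.conf. The exact regular expression is my assumption: it restricts the symbol to letters and digits so that a direct request for the script itself is not rewritten.

```apache
<IfModule rewrite_module>
    # Switch on rewriting for this server.
    RewriteEngine On
    # Capture the gene symbol after /gene/ and pass it on as the symbol
    # parameter; QSA keeps any query string from the original request.
    RewriteRule ^/gene/([A-Za-z0-9]+)$ /gene/geneinfo.php?symbol=$1 [QSA]
</IfModule>
```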
The <IfModule> stanza is only used if the rewrite_module is loaded. The RewriteEngine
directive turns the rewrite engine on. Any URL matching the regular expression is mapped
to the script geneinfo.php in the directory gene, with the symbol of the gene added to
the query string. In the regular expression, the parenthesized group captures the gene
symbol, which the substitution refers to as $1.
For example, the URL http://localhost/gene/HD is mapped to
http://localhost/gene/geneinfo.php?symbol=HD. The QSA
(query string append) flag is intended for rules that modify the query string: it appends
any query string on the original request to the rewritten one instead of discarding it.
Please send ideas and opinions by email at alexamies@gmail.com.