Chapter 8. Web Services (208)

This topic has a weight of 11 points and contains the following objectives:

Objective 208.1; Basic Apache Configuration (4 points)

Candidates should be able to install and configure a web server. This objective includes monitoring the server's load and performance, restricting client user access, configuring support for scripting languages as modules and setting up client user authentication. Also included is configuring server options to restrict usage of resources. Candidates should be able to configure a web server to use virtual hosts and customize file access.

Objective 208.2; Apache configuration for HTTPS (3 points)

Candidates should be able to configure a web server to provide HTTPS.

Objective 208.3; Implementing Squid as a caching proxy (2 points)

Candidates should be able to install and configure a proxy server, including access policies, authentication and resource usage.

Objective 208.4; Implementing Nginx as a web server and a reverse proxy (2 points)

Candidates should be able to install and configure a reverse proxy server, Nginx. Basic configuration of Nginx as an HTTP server is included.

Basic Apache Configuration (208.1)

Candidates should be able to install and configure a web server. This objective includes monitoring the server's load and performance, restricting client user access, configuring support for scripting languages as modules and setting up client user authentication. Also included is configuring server options to restrict usage of resources. Candidates should be able to configure a web server to use virtual hosts and customise file access.

Key Knowledge Areas

Apache 2.x configuration files, terms and utilities

Apache log files configuration and content

Access restriction methods and files

mod_perl and PHP configuration

Client user authentication files and utilities

Configuration of maximum requests, minimum and maximum servers and clients

Apache 2.x virtual host implementation (with and without dedicated IP addresses)

Using redirect statements in Apache's configuration files to customise file access

Terms and utilities

  • access.log or access_log

  • error.log or error_log

  • .htaccess

  • httpd.conf

  • mod_auth

  • htpasswd

  • AuthUserFile, AuthGroupFile

  • apache2ctl

  • httpd

Resources: LinuxRef06; Coar00; Poet99; Wilson00; Engelschall00; PerlRef01; Krause01; the man pages for the various commands.

Installing the Apache web-server

Building Apache from source was routinely done when Apache emerged. Nowadays it is included in binary format in most distributions. Apache can be configured by setting options in configuration files. Where to find configuration files and how they are organized varies. Red Hat and similar distributions have their configuration files in the /etc/httpd/conf directory. Other locations which are or have been used are /etc/apache/config, /etc/httpd/config and /etc/apache2.

On most distributions the main configuration file is httpd.conf; in some cases it is apache2.conf. Depending on your distribution, it might be one big file or a small generic one with references to other configuration files. Red Hat distributions use the latter approach. The main configuration file holds generic settings such as the server name, the port(s) and IP addresses to listen on, and the user and group Apache should switch to after startup. There are also various directives that influence the way Apache serves files from its document tree. For example, there are Directory directives that control whether PHP files located in a directory may be executed. The default configuration file is meant to be self-explanatory and contains a lot of valuable information that we shall not repeat here.

An additional method to set options for a subdivision of the document tree is by means of an .htaccess file. For security reasons you will also need to enable the use of .htaccess files in the main configuration file by setting the AllowOverride directive for that Directory context. All options in an .htaccess file influence files in the directory and the ones below it, unless they are overridden by another .htaccess file or directives in the main configuration file.

Modularity

Apache has a modular source code architecture. You can custom build a server with only modules you really want. Many modules are available on the Internet and you could also write your own.

Modules are compiled objects written in C. If you have questions about the development of Apache modules, join the Apache-modules mailing list at http://modules.apache.org. Remember to do your homework first: research past messages and check all the documentation on the Apache site before posting questions.

Special modules exist for the use of interpreted languages like Perl and Tcl. They allow Apache to run interpreted scripts natively without having to reload an interpreter every time a script runs (e.g. mod_perl and mod_tcl). These modules include an API to allow for modules written in an interpreted (scripted) language.

The modular structure of Apache's source code should not be confused with the functionality of run-time loading of Apache modules. Run-time modules are loaded after the core functionality of Apache has started and are a relatively new feature. In older versions, to use the functionality of a module, it needed to be compiled in during the build phase. Current implementations of Apache are capable of run-time module loading. The section on DSO has more details.

Run-time loading of modules (DSO)

Most modern Unix derivatives have a mechanism for the on-demand linking and loading of so-called Dynamic Shared Objects (DSO). This is a way to load a special program into the address space of an executable at run-time. This can usually be done in two ways: either automatically by a system program called ld.so when the executable is started, or manually from within the executing program with the library functions dlopen() and dlsym().

In the latter method the DSOs are usually called shared objects or DSO files and can be named with an arbitrary extension. By convention the extension .so is used. These files are usually installed in a program-specific directory. The executable program manually loads the DSO at run-time into its address space via dlopen().

As of version 1.3 Apache has been capable of loading DSOs, both to load core functionality and to extend its functionality at run-time. This was a relatively easy step because Apache source code was designed to be modular from the beginning.

Tip

How to run Apache-SSL as a shareable (DSO) module: first, configure the shared module support in the source tree:

	./configure --enable-shared=apache_ssl
				

then enable the module in your httpd.conf:

	LoadModule apache_ssl_module modules/libssl.so
				

Tip

To see whether your version of Apache supports DSOs, execute the command httpd -l, which lists the modules that have been compiled into Apache. If mod_so.c appears in the list of modules then your Apache server can make use of dynamic modules.

APache eXtenSion (APXS) support tool

APXS is a support tool, available since Apache 1.3, that can be used to build an Apache module as a DSO outside the Apache source tree. It knows the platform-dependent build parameters for making DSO files and provides an easy way to run the build commands with them.

Monitoring Apache load and performance

An Open Source system that can be used to periodically load-test pages of web-servers is Cricket. Cricket can be easily set up to record page-load times, and it has a web-based grapher that will generate charts to display the data in several formats. It is based on RRDtool whose ancestor is MRTG (short for Multi-Router Traffic Grapher). RRDtool (Round Robin Data Tool) is a package that collects data in round robin databases; each data file is fixed in size so that running Cricket does not slowly fill up your disks. The database tables are sized when created and do not grow larger over time. As the data ages, it is averaged.

Enhancing Apache performance

Apache developers focus on correctness and configurability first. But even then, Apache's performance is quite good. It can saturate a 45 Mbps line with ease even when running on low-end servers. So if your website feels sluggish, consider external factors first. Nowadays many smaller sites share oversubscribed network lines, which is quite often the culprit. Slow sites may also be caused by CGI scripts. The constant loading and unloading of interpreters introduces a lot of overhead. You may consider the use of special modules like mod_php or mod_perl instead. Another major cause of bad performance is insufficient disk speed or a slow database subsystem.

Another problem may be a lack of RAM, which may result in swapping. A swapping webserver will perform badly, especially if the disk subsystem is not up to par, causing users to hit stop and reload, further increasing the load. You can use the MaxClients setting to limit the number of children your server may spawn, hence reducing the memory footprint.
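The relation between available RAM and a sensible MaxClients value can be sketched as simple arithmetic. The following Python fragment is only an illustration: the function name and the memory figures are assumptions, and the real per-child memory use has to be measured on your own server.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, per_child_mb):
    """Rough upper bound for MaxClients so that all children fit in RAM.

    total_ram_mb: physical memory of the host (assumed figure)
    reserved_mb:  memory kept free for the OS, database, etc. (assumed)
    per_child_mb: measured resident size of one Apache child (assumed)
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available = total_ram_mb - reserved_mb
    # Never return a negative number of children.
    return max(available // per_child_mb, 0)


# Hypothetical host: 2 GB RAM, 512 MB reserved, 24 MB per child.
print(estimate_max_clients(2048, 512, 24))
```

A value obtained this way is a starting point only; observe the server under load before settling on a final setting.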

Apache access_log file

The access_log contains a generic overview of page requests for your web-server. The format of the access log is highly configurable. The format is specified using a format string that looks much like a C-style printf format string. A typical configuration for the access log might look like the following:

	LogFormat "%h %l %u %t \"%r\" %>s %b" common
	CustomLog logs/access_log common
			

This defines the nickname common and associates it with a particular log format string. The format as shown is known as the Common Log Format (CLF). It is a standard format produced by many web servers and can be read by most log analysis programs. Log file entries produced in CLF will look similar to this line:

	127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
			

CLF contains the following fields:

  1. IP address of the client (%h)

  2. RFC 1413 identity determined by identd (%l)

  3. userid of person requesting (%u)

  4. time the server received the request (%t)

  5. request line of user (%r)

  6. status code the server sent to the client (%>s)

  7. size of object returned (%b).
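To see how these seven fields map onto a log line, the following Python sketch splits a CLF entry with a regular expression. The pattern and the function name parse_clf are illustrative assumptions, not part of Apache; request lines containing escaped quotes are not handled.

```python
import re

# One named group per CLF field, in the order %h %l %u %t "%r" %>s %b.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line):
    """Return a dict with the CLF fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # %b logs "-" instead of 0 when no bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
entry = parse_clf(sample)
print(entry["host"], entry["user"], entry["status"], entry["size"])
```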

Apache error_log file

The server error log, whose name and location are set by the ErrorLog directive, is the most important log file. This is the place where Apache httpd will send diagnostic information and record any errors that it encounters in processing requests. It is the first place to look when a problem occurs with starting the server or with the operation of the server, since it will often contain details of what went wrong and how to fix it.

The error log is usually written to a file (typically error_log on Unix systems and error.log on Windows). On Unix systems it is also possible to have the server send errors to syslog or pipe them to a program.

The format of the error log is relatively free-form and descriptive. But there is certain information that is contained in most error log entries. For example, here is a typical message:

	[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server \
		configuration: /export/home/live/ap/htdocs/test
			

The first item in the log entry is the date and time of the message. The second item lists the severity of the error being reported. The LogLevel directive is used to control the types of errors that are sent to the error log by restricting the severity level. The third item gives the IP address of the client that generated the error. Beyond that is the message itself, which in this case indicates that the server has been configured to deny the client access. The server reports the file-system path (as opposed to the web path) of the requested document.

A very wide variety of different messages can appear in the error log. Most look similar to the example above. The error log will also contain debugging output from CGI scripts. Any information written to stderr by a CGI script will be copied directly to the error log.

In older versions of Apache it is not possible to customize the error log by adding or removing information; as of Apache 2.4 the ErrorLogFormat directive allows this. Error log entries dealing with particular requests have corresponding entries in the access log. For example, the above example entry corresponds to an access log entry with status code 403. Since it is possible to customize the access log, you can obtain more information about error conditions using that log file.

During testing, it is often useful to continuously monitor the error log for any problems. On Unix systems, you can accomplish this using:

	tail -f error_log
			

Restricting client user access

Many systems use either DAC or MAC to control access to objects:

Discretionary Access Control (DAC)

A system that employs DAC allows users to set object permissions themselves. They can change these at their discretion.

Mandatory Access Controls (MAC)

A system that employs MAC has all its objects (e.g., files) under strict control of a system administrator. Users are not allowed to set any permissions themselves.

Apache takes a liberal stance and defines discretionary controls to be controls based on usernames and passwords, and mandatory controls to be based on static or quasi-static data like the IP address of the requesting client.

Apache uses modules to authenticate and authorise users. In general, modules will use some form of database to store and retrieve credential data. The mod_auth module, for instance, uses text files, whereas mod_auth_dbm employs a DBM database.

Below is a list of the security-related modules that are included as part of the standard Apache distribution.

mod_auth

(DAC) This is the basis for most Apache security modules; it uses ordinary text files for the authentication database.

mod_access

(MAC) This is the only module in the standard Apache distribution which applies what Apache defines as mandatory controls. It allows you to list hosts, domains, and/or IP addresses or networks that are permitted or denied access to documents.

mod_auth_anon

(DAC) This module mimics the behaviour of anonymous FTP. Rather than having a database of valid credentials, it recognizes a list of valid usernames (i.e., the way an FTP server recognizes ftp and anonymous) and grants access to any of those with virtually any password. This module is more useful for logging access to resources and keeping robots out than it is for actual access control.

mod_auth_db

(DAC) This module is essentially the same as mod_auth, except that the authentication credentials are stored in a Berkeley DB file format. The directives contain the additional letters DB (e.g., AuthDBUserFile).

mod_auth_dbm

(DAC) Like mod_auth_db, except that credentials are stored in a DBM file.

mod_auth_digest

(DAC) This module implements HTTP Digest Authentication (RFC 2617), which provides a more secure alternative to mod_auth_basic. After receiving a request and a user name, the server will challenge the client by sending a nonce. The contents of a nonce can be any (preferably base 64 encoded) string, and the server may use the nonce to prevent replay attacks.

A nonce might, for example, be constructed using an encrypted timestamp within a resolution of a minute, i.e. '201311291619'. The timestamp (and maybe other static data identifying the requested URI) might be encrypted using a private key known only to the server.

Upon receipt of the nonce the client calculates a hash (by default an MD5 checksum) of the received nonce, the username, the password, the HTTP method, and the requested URI and sends the result back to the server. The server will gather the same data from session data and password data retrieved from a local digest database. To reconstruct the nonce the server will try twice: the first try will use the current clock time, the second try (if necessary) will use the current clock time minus one minute. One of the tries should give the exact same hash the client calculated. If so, access to the page will be granted. This restricts the validity of the challenge to one minute and prevents replay attacks.

Please note that the contents of the nonce can be chosen by the server at will. The example provided is one of many possibilities. Like with mod_auth, the credentials are stored in a text file (the digest database). Digest database files are managed with the htdigest tool. Please refer to the module documentation for more details.
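The challenge/response exchange described above can be sketched in Python. This is a simplified illustration, not the code used by mod_auth_digest. It assumes the basic response formula of RFC 2617 without the optional qop extension, it adds the realm that the RFC includes in the hash, and it uses plain timestamp strings as nonces where a real server would encrypt them. All names and values are hypothetical.

```python
import hashlib


def md5_hex(text):
    """Hex-encoded MD5 checksum of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the client's response to the server's nonce (no qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # credentials part
    ha2 = md5_hex(f"{method}:{uri}")                 # request part
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


def server_accepts(response, username, realm, password, method, uri,
                   candidate_nonces):
    """Server side: recompute the hash for each nonce still considered valid
    (e.g. the current minute and the previous one) and compare."""
    return any(
        response == digest_response(username, realm, password,
                                    method, uri, nonce)
        for nonce in candidate_nonces
    )


# The client answers a challenge that was issued at 16:19.
answer = digest_response("frank", "Insiders Only", "secret",
                         "GET", "/foo.html", "201311291619")

# At 16:20 the server still accepts it; at 16:21 it no longer does.
print(server_accepts(answer, "frank", "Insiders Only", "secret",
                     "GET", "/foo.html", ["201311291620", "201311291619"]))
print(server_accepts(answer, "frank", "Insiders Only", "secret",
                     "GET", "/foo.html", ["201311291621", "201311291620"]))
```

Note that the password itself never crosses the network; only the hash does, and a captured hash becomes useless once the nonce has expired.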

Configuring authentication modules

Apache security modules are configured by configuration directives. These are read from either the centralized configuration files (mostly found under or in the /etc/ directory) or from decentralized .htaccess files. The latter are mostly used to restrict access to directories and are placed in the top level directory of the tree they help to protect. For example, authentication modules will read the location of their databases using the AuthUserFile or AuthDBMGroupFile directives.

Centralized configuration.  This is an example of a configuration as it might occur in a centralized configuration file:

	<Directory /home/johnson/public_html>
	<Files foo.bar>
	AuthName "Foo for Thought"
	AuthType Basic
	AuthUserFile /home/johnson/foo.htpasswd
	Require valid-user
	</Files>
	</Directory>
				

The resource being protected is any file named foo.bar in the /home/johnson/public_html directory or any underlying subdirectory. Likewise, the file specifies who is authorized to access foo.bar: any user that has credentials in the /home/johnson/foo.htpasswd file.

Decentralized configuration.  The alternate approach is to place a .htaccess file in the top level directory of any document tree that needs access protection. Note that you must set the directive AllowOverride in the central configuration to enable this.

The first section of .htaccess determines which authentication type should be used. It can contain the name of the password or group file to be used, e.g.:

	AuthUserFile {path to passwd file}
	AuthGroupFile {path to group file}
	AuthName {title for dialog box}
	AuthType Basic
			

The second section of .htaccess ensures that only user {username} can access (GET) the current directory:

	<Limit GET>
	require user {username} 
	</Limit>
			

The Limit section can contain other directives to restrict access to certain IP addresses or to a group of users.

The following would permit any client on the local network (IP addresses 10.*.*.*) to access the foo.html page and require a username and password for anyone else:

	<Files foo.html>
	Order Deny,Allow
	Deny from All
	Allow from 10.0.0.0/255.0.0.0
	AuthName "Insiders Only"
	AuthType Basic
	AuthUserFile /usr/local/web/apache/.htpasswd-foo
	Require valid-user
	Satisfy Any
	</Files>
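The way Satisfy combines the host-based and the user-based restriction in this example can be modelled in a few lines of Python. This is a sketch of the decision logic only; the function name and the boolean parameter standing in for a successful password check are assumptions.

```python
import ipaddress

# Same network as in "Allow from 10.0.0.0/255.0.0.0".
LOCAL_NETWORK = ipaddress.ip_network("10.0.0.0/255.0.0.0")


def access_granted(client_ip, has_valid_credentials, satisfy="any"):
    """Combine the host check and the user check the way Satisfy does."""
    host_allowed = ipaddress.ip_address(client_ip) in LOCAL_NETWORK
    if satisfy == "any":
        # Either restriction being met is enough.
        return host_allowed or has_valid_credentials
    # "all": both restrictions must be met.
    return host_allowed and has_valid_credentials


print(access_granted("10.1.2.3", False))   # local client, no password needed
print(access_granted("192.0.2.7", False))  # outsider without credentials
print(access_granted("192.0.2.7", True))   # outsider with valid credentials
```

With Satisfy All instead of Satisfy Any, local clients would have to supply a password as well and outsiders would be refused regardless of their credentials.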
			

User files

The mod_auth module uses plain text files that contain lists of valid users. The htpasswd command can be used to create and update these files. The resulting files are plain text files, which can be read by any editor. They contain entries of the form username:password, where the password is encrypted. Additional fields are allowed, but ignored by the software.

htpasswd encrypts passwords using either a version of MD5 modified for Apache or the older crypt() routine. You can mix and match.

	SYNOPSIS
	htpasswd [ -c ] passwdfile username
			

Here are two examples of using htpasswd to maintain an Apache password file. The first creates a new password file while adding a user. The second works on the existing file: it adds the named user or, if that user already exists, changes the password.

	$ htpasswd -c /home/joe/public/.htpasswd joe
	$ htpasswd /home/joe/public/.htpasswd stephan
			

Note

Using the -c option, the specified password file will be overwritten if it already exists!

Group files

Apache can work with group files. Group files contain group names followed by the names of the people in the group. By authorizing a group, all users in that group have access. Group files are known as .htgroup files and by convention bear that name - though you can use any name you want. Group files can be located anywhere in the directory tree but are normally placed in the toplevel directory of the tree they help to protect. To allow the use of group files you will need to include some directives in the Apache main configuration file. This will normally be inside the proper Directory definition. An example:

Apache main configuration file:

	...
	AuthType Basic
	AuthUserFile /var/www/.htpasswd
	AuthGroupFile /var/www/.htgroup
	Require group Management
	...
		

The associated .htgroup file might have the following syntax:

	Management: bob alice
	Accounting: joe
		

Now the accounts 'bob' and 'alice' would have access to the resource, but account 'joe' would not: the Require group Management statement in the main configuration file demands membership of the 'Management' group, and 'joe' is not a member of it. For this to work the users specified in the .htgroup file must have an entry in the .htpasswd file as well.
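The group lookup just described can be mimicked with a short Python sketch that reads the group file format shown above. The function names are hypothetical, and the password check that Apache performs first is left out.

```python
def parse_htgroup(text):
    """Parse 'group: user1 user2' lines into a dict of group -> set of users."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, members = line.partition(":")
        groups.setdefault(name.strip(), set()).update(members.split())
    return groups


def user_in_group(groups, user, group):
    """True if the user is listed as a member of the given group."""
    return user in groups.get(group, set())


htgroup = """\
Management: bob alice
Accounting: joe
"""
groups = parse_htgroup(htgroup)
for user in ("bob", "alice", "joe"):
    print(user, user_in_group(groups, user, "Management"))
```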

Note

A username can be in more than one group entry. This simply means that the user is a member of both groups.

To manage a DBM database (as used by mod_auth_dbm) you may use dbmmanage. For other types of user files/databases, please consult the documentation that comes with the chosen module.

Note

Make sure the various files are readable by the webserver.

Configuring mod_perl

mod_perl is another module for Apache, which loads the Perl interpreter into your Apache webserver, reducing spawning of child processes and hence memory footprint and need for processor power. Another benefit is code-caching: modules and scripts are loaded and compiled only once, and will be served from the cache for the rest of the webserver's life.

Using mod_perl allows inclusion of Perl statements into your webpages, which will be executed dynamically if the page is requested. A very basic page might look like this:

	print "Content-type: text/plain\r\n\r\n";
	print "Hello, you perly thing!\n";
			

mod_perl also allows you to write new modules in Perl. You have full access to the inner workings of the web server and can intervene at any stage of request-processing. This allows for customized processing of (to name just a few of the phases) URI->filename translation, authentication, response generation and logging. There is very little run-time overhead.

The standard Common Gateway Interface (CGI) within Apache can be replaced entirely with Perl code that handles the response generation phase of request processing. mod_perl includes two general purpose modules for this purpose. The first is Apache::Registry, which can transparently run well-written existing Perl CGI scripts. If you have badly written scripts, you should rewrite them. If you lack resources, you may choose to use the second module, Apache::PerlRun, instead because it doesn't use caching and is far more permissive than Apache::Registry.

You can configure your httpd server and handlers in Perl using PerlSetVar, and <Perl> sections. You can also define your own configuration directives, to be read by your own modules.

There are many ways to install mod_perl, e.g. as a DSO, either using APXS or not, from source or from RPMs. Most of the possible scenarios can be found in the Mod_perl Guide PerlRef01.

Building Apache from source code

For building Apache from source code you should have downloaded the Apache source code, the source code for mod_perl and have unpacked these in the same directory [5]. You'll need a recent version of perl installed on your system. To build the module, in most cases, these commands will suffice:

	$ cd ${the-name-of-the-directory-with-the-sources-for-the-module}
	$ perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
	DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
	$ make && make test && make install
				

After building the module, you should also build the Apache server. This can be done using the following commands:

	$ cd ${the-name-of-the-directory-with-the-sources-for-Apache}
	$ make install
				

All that's left then is to add a few configuration lines to httpd.conf (the Apache configuration file) and start the server. Which lines you should add depends on the specific type of installation, but usually a few LoadModule and AddModule lines suffice.

As an example, these are the lines you would need to add to httpd.conf to use mod_perl as a DSO:

	LoadModule perl_module modules/libperl.so
	AddModule mod_perl.c
	PerlModule Apache::Registry 

	Alias /perl/ /home/httpd/perl/ 
	<Location /perl>
	SetHandler perl-script 
	PerlHandler Apache::Registry 
	Options +ExecCGI
	PerlSendHeader On 
	</Location>
				

The first two lines will add the mod_perl module when Apache starts. During startup, the PerlModule directive ensures that the named Perl module is read in too. This usually is a Perl package file ending in .pm. The Alias keyword reroutes requests for URIs in the form http://www.host.com/perl/file.pl to the directory /home/httpd/perl. Next, we define settings for that location. Because of the SetHandler directive, all requests for a Perl file in the directory /home/httpd/perl will now be passed to the perl-script handler, which the PerlHandler directive assigns to the Apache::Registry module. The next line simply allows execution of CGI scripts in the specified location instead of displaying the file. Any URI of the form http://www.host.com/perl/file.pl will now be compiled once and cached in memory. The memory image will be refreshed by recompiling the Perl routine whenever its source is updated on disk. Setting PerlSendHeader to On tells the server to send HTTP headers to the browser on every script invocation, but most of the time it is better either to use the $r->send_http_header method of the Apache Perl API or to use the $q->header method from the CGI.pm module.

Configuring mod_php support

PHP is a server-side, cross-platform, HTML embedded scripting language. PHP started as a quick Perl hack written by Rasmus Lerdorf in late 1994. Later he rewrote his code in C and hence the "Personal Home Page/Forms Interpreter" (PHP/FI) was born. Over the next two to three years, it evolved into PHP/FI 2.0. Zeev Suraski and Andi Gutmans wrote a new parser in the summer of 1997, which led to the introduction of PHP 3.0. PHP 3.0 defined the syntax and semantics used in both versions 3 and 4. PHP became the de facto programming language for millions of web developers. Still another version of the (Zend) parser and much better support for object oriented programming led to the introduction of version 5.0 in July 2004. Several subversions followed, and work on version 6 was started to add native Unicode support. However, this version was abandoned. Version 7.0 was released in December 2015.

PHP can be called from the CGI interface, but the common approach is to configure PHP in the Apache web server as a (dynamic) DSO module. To do this, you can either use pre-built modules extracted from RPMs or roll your own from the source code[6]. You need to configure the make process first. To tell configure to build the module as a DSO, you need to tell it to use APXS:

	./configure --with-apxs


or, in case you want to specify the location for the apxs binary:

	./configure --with-apxs={path-to-apxs}/apxs
			

Next, you can compile PHP by running the make command. Once all the source files are successfully compiled, install PHP by using the make install command.

Before Apache can use PHP, it has to know about the PHP module and when to use it. The apxs program took care of telling Apache about the PHP module, so all that is left to do is tell Apache about .php files. File types are controlled in the httpd.conf file, and it usually includes lines about PHP that are commented out. You may want to search for these lines and uncomment them:

	AddType application/x-httpd-php .php
			

Then restart Apache by issuing the apachectl restart command. The apachectl command is another way of passing commands to the Apache server instead of using /etc/init.d/httpd. Consult the apachectl(8) manpage for more information.

To test whether it actually works, create the following page:

	<HTML>
	<HEAD><TITLE>PHP Test </TITLE></HEAD>
	<BODY>
	<?php phpinfo( ) ?>
	</BODY>
	</HTML>
			

Save the file as test.php in Apache's htdocs directory and aim your browser at http://localhost/test.php. A page should appear with the PHP logo and additional information about your PHP configuration. Notice that PHP commands are contained by <?php and ?> tags.

The httpd binary

The httpd binary is the actual HTTP server component of Apache. During normal operation, it is recommended to use the apachectl or apache2ctl command to control the httpd daemon. On some distributions the httpd binary is named apache2.

Apache used to be a daemon that forked child-processes only when needed. To allow better response times, nowadays Apache can also be run in pre-forked mode. This means that the server will spawn a number of child-processes in advance, ready to serve any communication requests. On most distributions the pre-forked mode is run by default.

Configuring Apache server options

The httpd.conf file contains a number of sections that allow you to configure the behavior of the Apache server. A number of keywords/sections are listed below.

MaxKeepAliveRequests

The maximum number of requests to allow during a persistent connection. Set to 0 to allow an unlimited number.

StartServers

The number of servers to start initially.

MinSpareServers, MaxSpareServers

Used for server-pool size regulation. Rather than making you guess how many server processes you need, Apache dynamically adapts to the load it sees. That is, it tries to maintain enough server processes to handle the current load, plus a few spare servers to handle transient load spikes (e.g., multiple simultaneous requests from a single browser). It does this by periodically checking how many servers are waiting for a request. If there are fewer than MinSpareServers, it creates a new spare. If there are more than MaxSpareServers, the superfluous spares are killed.

MaxClients

Limit on the total number of servers running, i.e., limit on the number of clients that can simultaneously connect. If this limit is ever reached, clients will be locked out, so it should not be set too low. It is intended mainly as a brake to keep a runaway server from taking the system with it as it spirals down. In Apache 2.4 this directive was renamed MaxRequestWorkers; the old name is still supported.
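How MinSpareServers, MaxSpareServers and MaxClients interact can be illustrated with a small decision function in Python. This is a deliberately simplified model of one periodic check, not Apache's actual algorithm; the function name and the sample numbers are assumptions.

```python
def regulate_pool(idle, total, min_spare, max_spare, max_clients):
    """Decide what one periodic check of the server pool would do.

    idle:  number of servers currently waiting for a request
    total: number of servers currently running (busy + idle)
    """
    if idle < min_spare and total < max_clients:
        return "spawn"  # too few spares and still room below MaxClients
    if idle > max_spare:
        return "kill"   # superfluous spares are removed
    return "hold"       # pool size is fine, or the MaxClients cap is reached


# Hypothetical settings: MinSpareServers 5, MaxSpareServers 10, MaxClients 150.
print(regulate_pool(idle=2, total=10, min_spare=5, max_spare=10, max_clients=150))
print(regulate_pool(idle=12, total=20, min_spare=5, max_spare=10, max_clients=150))
print(regulate_pool(idle=0, total=150, min_spare=5, max_spare=10, max_clients=150))
```

The last call shows the situation the MaxClients description warns about: no spare servers are left, but no new ones may be started either, so additional clients have to wait.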

Note

In most Red Hat derivatives the Apache configuration is split into two subdirectories. The main configuration file httpd.conf is located in /etc/httpd/conf. The configuration of Apache modules is located in /etc/httpd/conf.d. Files in that directory with the suffix .conf are added to the Apache configuration during startup of Apache.

Apache Virtual Hosting

Virtual Hosting is a technique that provides the capability to host more than one domain on one physical host. There are two methods to implement virtual hosting:

* Name-based virtual hosting.  With name-based virtual hosting, the server relies on the client (e.g. the browser) to report the hostname as part of the HTTP headers. Using this technique, many different hosts can share the same IP address.

* IP-based virtual hosting.  Using this method, each (web) domain has its own unique IP address. Since one physical host can have more than one IP address, one host can serve more than one (web) domain. In other words: IP-based virtual hosts use the IP address of the connection to determine the correct virtual host to serve.

Name-based virtual hosting

Name-based virtual hosting is a fairly simple technique. You need to configure your DNS server to map each hostname to the correct IP address first, then configure the Apache HTTP Server to recognize the different hostnames.

Tip

Name-based virtual hosting eases the demand for scarce IP addresses. Therefore you should use name-based virtual hosting unless there is a specific reason to choose IP-based virtual hosting, see IP-based Virtual Hosting.

To use name-based virtual hosting, you must designate the IP address (and possibly port) on the server that will be accepting requests for the hosts. In Apache 2.2 and earlier this is configured using the NameVirtualHost directive. In Apache 2.4 the directive is deprecated and has no effect, because name-based virtual hosting is enabled automatically whenever several <VirtualHost> blocks share the same address and port. Any IP address can be used, but normally all IP addresses on the server should be used, so you can use * as the argument to NameVirtualHost. Note that mentioning an IP address in a NameVirtualHost directive does not automatically make the server listen on that IP address. Which addresses and ports Apache listens on is set with the Listen directive (or BindAddress in old versions); the VirtualHost directive then selects which virtual host answers the requests that arrive on them.

The BindAddress directive could be used in older Apache versions (up to Apache 1.3). It was deprecated and has been removed in Apache 2.0. Equivalent functionality, with more control over the IP addresses and ports Apache listens on, is available through the Listen directive.

  • BindAddress (deprecated) is used to restrict the server to listening to a single address, and can be used to permit multiple Apache servers on the same machine to listen to different IP addresses;

  • Listen can be used to make a single Apache server listen to more than one IP address and/or port.
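
A minimal sketch of the Listen directive in both forms follows. The address 192.0.2.10 and port 8080 are placeholders chosen for illustration, not values from a real setup.

```apache
# Listen on port 80 on every IP address of the machine
Listen 80

# Additionally listen on one specific address and port
# (192.0.2.10 is a documentation address used as a placeholder)
Listen 192.0.2.10:8080
```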

The next step is to create a <VirtualHost> block for each host you would like to serve. The argument to the <VirtualHost> directive should be the same as the argument to the NameVirtualHost directive (i.e., an IP address or * for all addresses). Inside each <VirtualHost> block you will need, at minimum, a ServerName directive to designate which host is served and a DocumentRoot directive to point out where in the filesystem the content for that host can be found.

Suppose that both www.domain.tld and www.otherdomain.tld point to the IP address 111.22.33.44. You then simply add the following to httpd.conf:

	NameVirtualHost 111.22.33.44

	<VirtualHost 111.22.33.44>
	ServerName www.domain.tld
	DocumentRoot /www/domain
	</VirtualHost>

	<VirtualHost 111.22.33.44>
	ServerName www.otherdomain.tld
	DocumentRoot /www/otherdomain
	</VirtualHost>
				

In the simplest case, the IP address 111.22.33.44 in the example above can be replaced by * to match all IP addresses for your server. Many servers want to be accessible by more than one name. This is possible with the ServerAlias directive placed inside the <VirtualHost> section.

If, for example, you add the following to the first <VirtualHost> block above

	ServerAlias domain.tld *.domain.tld 
				

then requests for all hosts in the domain.tld domain will be served by the www.domain.tld virtual host. The wildcard characters * and ? can be used to match names.
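
Putting the wildcard address and ServerAlias together, the first virtual host of the example might look roughly like the sketch below. It assumes an Apache version that still uses NameVirtualHost; on Apache 2.4 that line can be left out.

```apache
# Accept name-based requests on all addresses of the server
NameVirtualHost *

<VirtualHost *>
    ServerName www.domain.tld
    # Also answer for the bare domain and any other host in domain.tld
    ServerAlias domain.tld *.domain.tld
    DocumentRoot /www/domain
</VirtualHost>
```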

Tip

Of course, you can't just make up names and place them in ServerName or ServerAlias. Your DNS server must be properly configured to map those names to the IP address in the NameVirtualHost directive.

Finally, you can fine-tune the configuration of the virtual hosts by placing other directives inside the <VirtualHost> containers. Most directives can be placed in these containers and will then change the configuration only of the relevant virtual host. Configuration directives set in the main server context (outside any <VirtualHost> container) will be used only if they are not overridden by the virtual host settings.
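
The following sketch illustrates how a setting in the main server context is inherited and how a virtual host overrides it. The mail addresses and the log path are hypothetical values used only for illustration.

```apache
# Main server context: default inherited by every virtual host
ServerAdmin webmaster@domain.tld

<VirtualHost 111.22.33.44>
    ServerName www.otherdomain.tld
    DocumentRoot /www/otherdomain
    # Overrides the main server value for this host only
    ServerAdmin webmaster@otherdomain.tld
    # Separate error log for this host (placeholder path)
    ErrorLog /var/log/httpd/otherdomain-error_log
</VirtualHost>
```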

Now when a request arrives, the server will first check whether it arrived on an IP address that matches the NameVirtualHost. If it did, the server will look at each <VirtualHost> section with a matching IP address and try to find one where the ServerName or ServerAlias matches the requested hostname. If it finds one, it then uses the corresponding configuration for that server. If no matching virtual host is found, then the first listed virtual host that matches the IP address will be used.

As a consequence, the first listed virtual host is the default virtual host. The DocumentRoot from the main server will never be used when an IP address matches the NameVirtualHost directive. If you would like to have a special configuration for requests that do not match any particular virtual host, simply put that configuration in a <VirtualHost> container and place it before any other <VirtualHost> container in the configuration file.
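
A catch-all configuration of this kind could be sketched as follows. The name default.domain.tld and the directory /www/default are hypothetical; only the ordering of the blocks matters.

```apache
NameVirtualHost 111.22.33.44

# Listed first, so it serves every request on this address
# whose hostname matches none of the blocks below
<VirtualHost 111.22.33.44>
    ServerName default.domain.tld
    DocumentRoot /www/default
</VirtualHost>

<VirtualHost 111.22.33.44>
    ServerName www.domain.tld
    DocumentRoot /www/domain
</VirtualHost>
```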

IP-based virtual hosting

Despite the advantages of name-based virtual hosting, there are some reasons why you might consider using IP-based virtual hosting instead:

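  • Name-based virtual hosting cannot be used with SSL secure servers unless both the server and the client support Server Name Indication (SNI). Without SNI the server has to present its certificate before it has seen the requested hostname, because of the nature of the SSL protocol;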

  • Some ancient clients are not compatible with name-based virtual hosting. For name-based virtual hosting to work, the client must send the HTTP Host header. This is required by HTTP/1.1, and is implemented by all modern HTTP/1.0 browsers as an extension;

  • Some operating systems and network equipment implement bandwidth management techniques that cannot differentiate between hosts unless they are on separate IP addresses.

As the term IP-based indicates, the server must have a different IP address for each IP-based virtual host. This can be achieved by equipping the machine with several physical network connections or by use of virtual interfaces, which are supported by most modern operating systems (see system documentation for details on IP aliasing and the ifconfig command).

There are two ways of configuring Apache to support multiple hosts:

  • By running a separate httpd daemon for each hostname;

  • By running a single daemon which supports all the virtual hosts.

Use multiple daemons when:

  • There are security issues, e.g., if you want to maintain strict separation between the web-pages for separate customers. In this case you would need one daemon per customer, each running with different User, Group, Listen and ServerRoot settings;

  • You can afford the memory and file descriptor requirements of listening to every IP alias on the machine. It's only possible to Listen to the wildcard address, or to specific addresses. So, if you need to listen to a specific address, you will need to listen to all specific addresses.

Use a single daemon when:

  • Sharing of the httpd configuration between virtual hosts is acceptable;

  • The machine serves a large number of requests, and so the performance loss in running separate daemons may be significant.

Setting up multiple daemons

Create a separate httpd installation for each virtual host. For each installation, use the Listen directive in the configuration file to select which IP address (or virtual host) that daemon services:

	Listen 123.45.67.89:80
					

You can use the domain name if you want to, but it is recommended you use the IP address instead.
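
As an illustration, the configuration file for one such daemon might contain settings like those below. The installation directory, the user and group customer1, and the host name are assumptions made for this sketch.

```apache
# Hypothetical configuration for the daemon that serves one customer
ServerRoot /usr/local/apache-customer1

# Run this daemon under an account of its own
User customer1
Group customer1

# Serve only the address that belongs to this customer
Listen 123.45.67.89:80
ServerName www.customer1.tld
DocumentRoot /www/customer1
```

Each daemon is then started with its own configuration file, for example by passing the file to httpd with the -f option.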

Setting up a single daemon

For this case, a single httpd will service requests for the main server and all the virtual hosts. The VirtualHost directive in the configuration file is used to give the ServerAdmin, ServerName, DocumentRoot, ErrorLog and TransferLog or CustomLog directives different values for each virtual host.

	<VirtualHost www.smallco.com>
	ServerAdmin webmaster@mail.smallco.com
	DocumentRoot /groups/smallco/www
	ServerName www.smallco.com
	ErrorLog /groups/smallco/logs/error_log
	TransferLog /groups/smallco/logs/access_log
	</VirtualHost>

	<VirtualHost www.baygroup.org>
	ServerAdmin webmaster@mail.baygroup.org
	DocumentRoot /groups/baygroup/www
	ServerName www.baygroup.org
	ErrorLog /groups/baygroup/logs/error_log
	TransferLog /groups/baygroup/logs/access_log
	</VirtualHost>
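
The same kind of block can also be written with an IP address as the argument to <VirtualHost>, so that the server does not depend on DNS lookups at startup. The sketch below assumes that www.smallco.com resolves to the placeholder address 192.0.2.1, and it uses CustomLog with an explicitly defined log format instead of TransferLog.

```apache
# Log format nickname used by CustomLog below
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# 192.0.2.1 is a placeholder for the address of www.smallco.com
<VirtualHost 192.0.2.1>
    ServerAdmin webmaster@mail.smallco.com
    DocumentRoot /groups/smallco/www
    ServerName www.smallco.com
    ErrorLog /groups/smallco/logs/error_log
    CustomLog /groups/smallco/logs/access_log common
</VirtualHost>
```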
					

Customizing file access

Redirect allows you to tell clients about documents which used to exist in your server's namespace but no longer do, and where to look for the relocated document.

	Redirect {old-URI} {new-URI}
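
Two concrete uses of Redirect are sketched below. The paths and target URLs are made up for illustration, and the directive is assumed to be available through mod_alias.

```apache
# Temporary redirect (status 302, the default) for a single document
Redirect /old/page.html http://www.domain.tld/new/page.html

# Permanent redirect (status 301) for everything below /docs
Redirect permanent /docs http://www.domain.tld/documentation
```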
			



[5] The mod_perl module can be obtained at perl.apache.org, the source code for Apache at www.apache.org

[6] The source code for PHP4 can be obtained at www.php.net

Copyright Snow B.V. The Netherlands