Understanding Apache 2.2 Configuration

Brad Nicholes
Senior Software Engineer, Novell Inc.
Member, Apache Software Foundation
[email protected] (with revisions)







"A PAtCHy server": developed by the Apache Group, formed in February 1995 around a number of people who provided patch files for NCSA httpd 1.3 by Rob McCool.
History: http://www.apache.org/ABOUT_APACHE.html
First official public release (0.6.2) in April 1995.
Added adaptive pre-forked child processes (very important!).
Added a modular structure and API for extensibility (Bob Thau).
Ported to multiple platforms and added documentation.
Apache 1.0 was released on December 1, 1995.
Overtook NCSA httpd to become the #1 server on the Internet.



Apache is the current market-share leader among web servers.
You can download it from www.apache.org
See the survey statistics at http://news.netcraft.com/archives/web_server_survey.html

Shipping:
  Apache 1.3.37 – Maintenance mode, no new development
  Apache 2.0.59 – Maintenance mode, no new development
  Apache 2.2.9 – Current release
Development:
  Apache 2.3.x-dev – Unstable, all new development happens here first

Download httpd-2.2.0.tar.bz2 from http://www.apache.org/dist or a closer mirror site
$ tar xjf httpd-2.2.0.tar.bz2
$ cd httpd-2.2.0
$ ./configure --prefix=PREFIX
$ make
$ make install
$ PREFIX/bin/apachectl start

Here PREFIX is the directory under which the server will be installed, typically /usr/local/apache.
To build Apache with specific features, pass the corresponding options to the configure command. You can list the available options with "./configure --help".
For example, httpd can be compiled with the proxy and cache modules by passing options such as --enable-proxy and --enable-cache to configure.

File Locations






Modules - /usr/lib/apache2
Configuration - /etc/apache2
Logs - /var/log/apache2
Cgi-bin - /srv/www/cgi-bin
DocumentRoot - /srv/www/htdocs
Binary - /usr/sbin/httpd2 (symlink to the actual binary)
  /usr/sbin/httpd2-worker
  /usr/sbin/httpd2-prefork
Other support binaries - /usr/sbin
Startup script - /usr/sbin/rcapache2
  Symlink to /etc/init.d/apache2


Multi-Processing Modules (MPMs) accommodate a wide variety of operating environments on different platforms
Responsible for:
  Binding to network ports
  Accepting requests
  Dispatching worker threads to handle requests
Allow customization for particular sites:
  Scalability in a threaded environment – Worker MPM
  Compatibility with older modules – Prefork MPM
  Platform custom – NetWare MPM, WinNT MPM

The Worker MPM combines the multi-process and multi-threaded models
Variable number of processes (parents), each with a fixed number of threads
Each child process handles many concurrent connections
Stability of multiple processes
Performance of multiple threads
Reduces the memory footprint

Worker MPM - Multi-Processing Module implementing a hybrid multi-threaded / multi-process web server
  StartServers - Number of child server processes created at startup
  MinSpareThreads - Minimum number of idle threads allowed before additional worker threads are created
  MaxSpareThreads - Maximum number of idle threads allowed before excess worker threads are destroyed
  MaxClients - Maximum number of worker threads allowed
  MaxMemFree - Maximum amount of memory that the main allocator is allowed to hold without calling free()
  ThreadsPerChild - Number of threads created by each child process
http://httpd.apache.org/docs/2.2/mod/worker.html
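
As a rough illustration of how the Worker MPM directives fit together, a configuration block might look like the sketch below. The numbers are illustrative assumptions, not tuning advice. With these values, MaxClients divided by ThreadsPerChild allows at most 6 child processes.

```apache
# Illustrative Worker MPM settings (values are assumptions, not recommendations)
<IfModule mpm_worker_module>
    # Child processes created at startup
    StartServers          2
    # Keep between 25 and 75 idle threads available across all children
    MinSpareThreads      25
    MaxSpareThreads      75
    # Fixed number of threads in each child process
    ThreadsPerChild      25
    # Upper limit on worker threads, i.e. simultaneous requests
    MaxClients          150
</IfModule>
```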

Prefork MPM: stable but slower (based on documentation)
One parent (master server), many children (workers)
Each child server is a process itself
Each child handles one connection at a time
Uses more memory
Similar to the NetWare MPM but using processes instead of threads

Prefork MPM - Implements a non-threaded, preforking web server
  StartServers - Number of child server processes created at startup
  MinSpareServers - Minimum number of idle child server processes
  MaxSpareServers - Maximum number of idle child server processes
  MaxClients - Maximum number of child processes that will be created to serve requests
  MaxMemFree - Maximum amount of memory that the main allocator is allowed to hold without calling free()
http://httpd.apache.org/docs/2.2/mod/prefork.html
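
For comparison, a Prefork MPM block could be sketched as follows. Again the numbers are illustrative assumptions. Because every child handles a single connection, MaxClients here is directly the maximum number of child processes.

```apache
# Illustrative Prefork MPM settings (values are assumptions, not recommendations)
<IfModule mpm_prefork_module>
    # Child processes created at startup
    StartServers          5
    # Keep between 5 and 10 idle child processes waiting for requests
    MinSpareServers       5
    MaxSpareServers      10
    # Upper limit on child processes, i.e. simultaneous requests
    MaxClients          150
</IfModule>
```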

Reading the Documentation
Online: http://httpd.apache.org/docs/2.2/
Also installed with every instance of Apache
Most directives consist of a name and a single value
Some directives may have multiple, optional or boolean values
Example directive:






The default httpd.conf file contains a very good explanation of each directive that is used and why
The directives are not ordered
The configuration file contains one directive per line, but a “\” as the last character on a line indicates that the directive continues on the next line
Directive names are case-insensitive, but some arguments may be case-sensitive
Lines that begin with “#” are considered to be comments
<IfDefine> can be used to mark sections of the configuration file that are processed only if a specific parameter has been defined on the httpd command line (with -D)
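
A minimal sketch of these syntax rules in one fragment. The e-mail address and the SSL parameter name are placeholders chosen for illustration; the <IfDefine> test refers to a parameter passed at startup, for example "httpd -D SSL".

```apache
# Lines beginning with "#" are comments.
# Directive names are case-insensitive, so "serveradmin" would work as well.
ServerAdmin webmaster@example.com

# A trailing backslash continues the directive on the next line.
DirectoryIndex index.html \
    index.html.var

# Processed only when the server is started with -D SSL.
<IfDefine SSL>
    Listen 443
</IfDefine>
```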

Containers are directives that limit the application of other directives.
They are written as a block, like a tag section in HTML:
<VirtualHost host[:port]>
...
</VirtualHost>
Containers: <VirtualHost …>, <Directory dir>, <Files file>, <Location URL>. <Directory>, <Files> and <Location> are merged in ascending order of authority, so <Location> can override the others.
dir, file and URL can be specified using wildcards, or full regular expressions when preceded by “~”
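
The container types can be sketched side by side as follows. The directory path is a placeholder; the <Files> block uses the "~" form to switch its argument to a regular expression.

```apache
# Applies to a file system directory and everything below it
<Directory "/srv/www/htdocs">
    Options Indexes
</Directory>

# "~" makes the argument a regular expression: any file name starting with ".ht"
<Files ~ "^\.ht">
    Order allow,deny
    Deny from all
</Files>

# Applies to a URL path; merged after <Directory> and <Files>
<Location /server-status>
    SetHandler server-status
</Location>
```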

KeepAlive [on|off] (on): keeps a connection alive for up to n requests before terminating it, provided each request arrives before the timeout. n is defined by the MaxKeepAliveRequests <n> (100) directive.
KeepAliveTimeout <n> (15): waits n seconds for the next request before terminating the connection.
Timeout <n> (300): maximum time in seconds to wait for a block of data.
HostnameLookups [on|off|double] (off): performs a reverse DNS lookup so that the domain name of the requesting client can be logged.
MaxClients <n> (256): the limit on the number of simultaneous requests (hence the number of child processes).
MaxRequestsPerChild <n> (0): a child server exits after <n> requests, which limits the effect of memory leaks. 0 means unlimited requests.
Min/MaxSpareServers <n> (5/10): number of idle child servers.
StartServers <n> (5): sets the number of child server processes created on startup.
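
Written out as configuration, the connection-related directives above might appear as in this sketch, which simply uses the values shown in parentheses.

```apache
# Persistent connections: up to 100 requests per connection,
# waiting at most 15 seconds for the next request
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15

# Give up on a send or receive after 300 seconds
Timeout 300

# Log client IP addresses without reverse DNS lookups
HostnameLookups Off
```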

ServerRoot – Base directory for the server installation
  All relative paths are derived from the ServerRoot
  If you have multiple installations of the web server, make sure that each ServerRoot points to its respective install location
PidFile – File where the server records the process ID of the daemon
  If an error message on Linux indicates that httpd is already running when starting Apache, it may be that an old httpd.pid file was orphaned after an abnormal shutdown (e.g., kill -9)
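
A small sketch of how the two directives relate, assuming an installation under /usr/local/apache: the relative PidFile path is resolved against ServerRoot.

```apache
# Base directory of this installation (assumed location)
ServerRoot "/usr/local/apache"

# Relative path, resolved against ServerRoot:
# the PID is written to /usr/local/apache/logs/httpd.pid
PidFile logs/httpd.pid
```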

Timeout – Amount of time the server will wait for send or receive events before failing a request (default 300 seconds, or 5 minutes)
  If Apache appears to hang while shutting down on NetWare, it may be that a worker thread is waiting for data from the client. After the timeout period has expired, Apache will shut down normally.
KeepAlive – Enables persistent connections (i.e., avoids having to reconnect with the same client on subsequent requests)
  If the connection is not properly terminated by the client, the connection will be held for the duration of the KeepAliveTimeout value. This could cause unnecessary latency when responding to new requests on a busy server

Listen – Binds Apache to a specific IP address and/or port
  If only a port is specified, Apache will listen on that port on all IP addresses assigned to the box
LoadModule – Loads an external Apache module
  <IfModule> – Should surround module-specific directives to prevent an invalid configuration if a module has not been loaded
UseCanonicalName – Determines how Apache constructs self-referencing URLs (e.g., redirects)
  ServerName – Used to construct a self-referencing URL when UseCanonicalName is set to On. Otherwise Apache uses the host name supplied by the client
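
These directives could be combined as in the following sketch. The IP address, host name and the choice of mod_status are placeholders for illustration only.

```apache
# Listen on port 80 on every address, and on port 8080 on one address only
Listen 80
Listen 192.0.2.10:8080

# Load a module, then guard its directives so the configuration
# stays valid if the LoadModule line is removed
LoadModule status_module modules/mod_status.so
<IfModule status_module>
    ExtendedStatus On
</IfModule>

# Build self-referencing URLs from ServerName rather than
# from the host name supplied by the client
ServerName www.example.com:80
UseCanonicalName On
```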

DocumentRoot – Default location from which all documents are served
  If an alias for a URI is not found, Apache will attempt to serve the page from the DocumentRoot
Options – Configures the features that are available in a specific directory
  Indexes – Allows a directory listing
    AddIcon – Specifies the location and file name of the icon that should be displayed for a given file type
  MultiViews – Allows language negotiation
  ExecCGI – Allows CGI binaries or scripts to be executed
  Includes – Enables server-side includes (parsed HTML)
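
A possible use of these options, assuming the SUSE-style paths listed under File Locations. The icon path is a placeholder.

```apache
DocumentRoot "/srv/www/htdocs"

# Directory listings, language negotiation and server-side includes
<Directory "/srv/www/htdocs">
    Options Indexes MultiViews Includes
</Directory>

# Allow CGI programs to run only in the cgi-bin directory
<Directory "/srv/www/cgi-bin">
    Options ExecCGI
</Directory>

# Icon shown next to .txt files in generated directory listings
AddIcon /icons/text.gif .txt
```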

Order/Allow/Deny – Specify access control restrictions
  The Order directive determines the order in which Allow and Deny are evaluated, and therefore whether access is granted or denied by default
  Both Allow and Deny can be used to restrict access based on full or partial IP addresses, network masks or environment variables
DirectoryIndex – Specifies the default file name(s) to serve when no page is specified in the request
  The file index.html.var can be used to specify additional language negotiation rules rather than an actual web page
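
An access control block in the Apache 2.2 style might be sketched as below. The directory, network and domain are placeholders. With "Order deny,allow" the Deny rules are evaluated first and Allow rules can then override them, so only the listed clients get in.

```apache
<Directory "/srv/www/htdocs/internal">
    # Evaluate Deny first, then Allow; Allow overrides Deny
    Order deny,allow
    Deny from all
    # Network mask and partial domain name
    Allow from 192.168.1.0/24
    Allow from .example.com
</Directory>

# Try index.html first, then the type-map file
DirectoryIndex index.html index.html.var
```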

CustomLog – Defines the location and format of a custom log file
  When used with the LogFormat directive, the contents of the log file as well as the format can be specified
  Multiple log files can be defined containing different information or layouts (Warning: specifying additional log files may hurt performance)
Alias – Associates a URI prefix with a physical directory location
  <Directory>/<Location>/<Files> – Should accompany the Alias directive to indicate how files are accessed from the aliased location
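
As an illustration, a named log format paired with a CustomLog, and an Alias paired with its <Directory> block. The paths are assumptions based on a /usr/local/apache installation.

```apache
# Define a format nicknamed "common", then write a log using it
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common

# Map the URI prefix /icons/ to a directory outside the DocumentRoot
Alias /icons/ "/usr/local/apache/icons/"

# State how files in the aliased location may be accessed
<Directory "/usr/local/apache/icons">
    Options Indexes MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
```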

ErrorDocument – Defines a custom or user-friendly response to an HTTP error
  The response can be plain text, a local redirect or an external redirect
  If the response is a redirect, the language can be negotiated so that it is appropriate for the request
BrowserMatch – Customizes the request handling for particular browsers
  Can be used to force an HTTP 1.0 response rather than 1.1, or to turn off keepalive connections for older browsers
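
The three kinds of ErrorDocument response and two typical BrowserMatch rules could look like this sketch. The URLs and the message text are placeholders.

```apache
# Plain text message
ErrorDocument 500 "The server encountered an internal error."
# Local redirect to a URL path on this server
ErrorDocument 404 /errors/not_found.html
# External redirect
ErrorDocument 403 http://www.example.com/forbidden.html

# Turn off keepalive for an old browser
BrowserMatch "Mozilla/2" nokeepalive
# Force an HTTP 1.0 response for a client that mishandles 1.1
BrowserMatch "RealPlayer 4\.0" force-response-1.0
```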

Functional blocks of directives can be put into a separate configuration file
Use the “Include” directive to instruct Apache to read additional configuration files
If the “Include” directive specifies a directory, all files within the directory will be read as additional configuration files
Wildcards can be used to specify a certain set of additional configuration files (Include conf/*.conf)
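
The three forms of Include can be sketched as follows. The conf/vhosts directory is a hypothetical location used only for illustration.

```apache
# A single file
Include conf/extra/httpd-vhosts.conf

# Every file in a directory
Include conf/vhosts/

# Only the files matching a wildcard
Include conf/vhosts/*.conf
```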

Apache supports two types of virtual hosts
Name-based virtual host
  Selects a virtual host configuration based on the domain name of the request
  Allows more than one virtual host per IP address
IP-based virtual host
  Selects a virtual host configuration based on the IP address of the request
  Each IP address belongs to a specific virtual host
Each virtual host can be configured independently
  ServerName, DocumentRoot, Aliases, log files, etc.

There are a few ways we can host a web site:
Name-based virtual hosting
  A set of hostnames shares the same IP address (similar to an alias)
  Uses the Host: header in the HTTP request (the browser fills in the hostname) to distinguish the different web sites
  Each hostname has its own site configuration and document root
  Requires either that the hostnames are registered DNS names or that the client machines map them to the IP address in a hosts file such as /etc/hosts (Unix) or C:\WINDOWS\system32\drivers\etc\hosts (Windows)
IP-based virtual hosting
  Requires a unique IP address for each virtual hosting site
  Use IP aliases to configure the same Network Interface Card (NIC) to listen on different IP addresses, e.g., ifconfig eth0:1 128.198.160.33
  Some Unix systems limit how many IP aliases can be supported
Use <VirtualHost hostname[:port]> block directives
  Specify ServerAdmin, DocumentRoot, ServerName, ErrorLog, TransferLog for each individual virtual host

With virtual machines (VMware/VPC), we can configure a virtual machine for each web site. This gives each site total control of the OS of its virtual machine.
We can gracefully shut down or restart an individual web site (for maintenance, configuration or software updates)
  This cannot be done with name-based or IP-based virtual hosting
We can configure different software packages and a different OS for each individual web site
Allows total control for the admin of the web site (root privilege, user creation, etc.)
Disadvantage: requires more resources (CPU, memory, disk)

NameVirtualHost *:80
<VirtualHost *:80>
    ServerName www.domain.com
    ServerAlias domain.com *.domain.com
    DocumentRoot /www/domain
</VirtualHost>
<VirtualHost _default_>
    ServerName www.otherdomain.com
    DocumentRoot /www/otherdomain
</VirtualHost>
NameVirtualHost specifies the IP address that will be shared
The ServerAlias directive allows access to a specific virtual host by different domain names
Apache uses the ServerName directive to decide which virtual host configuration applies, based on the Host: request header

<VirtualHost www.smallco.com>
    ServerAdmin [email protected]
    DocumentRoot /groups/smallco/www
    ServerName www.smallco.com
    ErrorLog /groups/smallco/logs/error_log
    CustomLog /groups/smallco/logs/access_log combined
</VirtualHost>
<VirtualHost www.baygroup.org>
    ServerAdmin [email protected]
    DocumentRoot /groups/baygroup/www
    ServerName www.baygroup.org
    ErrorLog /groups/baygroup/logs/error_log
    CustomLog /groups/baygroup/logs/access_log combined
</VirtualHost>
Apache determines which virtual host to use based on the IP address resolved from the host name
Almost any configuration directive can be put in a virtual host block, with the exception of some of the process creation directives

A single instance of the Apache web server can be used to serve page content in multiple languages
Language negotiation does not depend on the language in which the server is installed
The <Directory> or <Location> block must contain one of the following:
  “Options MultiViews” to enable language file matching
  “AddHandler type-map var” to specify a type-map file that contains language definitions
Each HTML file encoded for a different language must have the corresponding language extension appended
  Example: index.html.en – English, index.html.fr – French

The following directives are used by the language negotiation functionality:
  - AddLanguage
  - LanguagePriority
  - DefaultLanguage
  - ForceLanguagePriority
  - AddCharset
  - AddDefaultCharset
Each browser request contains an “Accept-Language” header that indicates the language(s) that the client will accept
The languages are usually specified by keys of either 2 or 4 characters (en, en-us, fr, de, es, ...)
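
A minimal sketch of a language negotiation setup using these directives. The set of languages, the directory path and the charset are assumptions chosen for illustration.

```apache
# Map language keys from the Accept-Language header to file extensions
AddLanguage en .en
AddLanguage fr .fr
AddLanguage de .de

# Order of preference when the client states none, and how to break ties
LanguagePriority en fr de
ForceLanguagePriority Prefer Fallback

AddDefaultCharset UTF-8

# Enable language file matching (index.html.en, index.html.fr, ...)
<Directory "/srv/www/htdocs">
    Options MultiViews
</Directory>

# Alternatively, treat .var files as type-maps
AddHandler type-map .var
```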

MultiViews-enabled negotiation
  Apache matches the “Accept-Language” key to a file extension through the “AddLanguage” directives in the httpd.conf file
  Apache first searches for an exact match of the specified file
  Apache next searches for the specified file with the 2 or 4 character language extension appended
Type-map enabled negotiation
  Apache searches for the specified file with the type-map extension (usually .var)
  Apache reads the .var file and selects the file name that is associated with the appropriate language
If a language file is not found, Apache will fall back to the LanguagePriority and ForceLanguagePriority directives to determine how to handle the request
More info: http://httpd.apache.org/docs/2.2/content-negotiation.html

Directives enclosed in a <Directory> block apply to the specified file system directory and its sub-directories
Directives enclosed in a <Location> block apply to the specified web space container
  <Location /private> would apply to any URL-path that begins with “/private”
    http://your.domain.com/private
    http://your.domain.com/private123
    http://your.domain.com/private/mydocs/index.html
  Able to apply directives to locations that don't physically exist, such as a module handler
    <Location /server-status>
        SetHandler server-status
    </Location>

Default SSL port for an HTTP server is 443
All SSL requests and responses are handled through the mod_ssl module (NetWare handles SSL natively)
SSL configuration is done by creating a virtual host that listens on the designated SSL port
An example SSL configuration is found in conf/extra/httpd-ssl.conf of the Apache HTTPD distribution
Additional documentation can be found at:
  http://httpd.apache.org/docs/2.2/ssl
  http://httpd.apache.org/docs/2.2/mod/mod_ssl.html
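
A stripped-down SSL virtual host might look like the sketch below. The host name and the certificate and key paths are placeholders; the shipped httpd-ssl.conf contains many more settings.

```apache
LoadModule ssl_module modules/mod_ssl.so

# Listen on the default SSL port
Listen 443

<VirtualHost _default_:443>
    ServerName www.example.com:443
    # Turn on SSL for this virtual host
    SSLEngine on
    # Hypothetical certificate and private key locations
    SSLCertificateFile "/usr/local/apache/conf/server.crt"
    SSLCertificateKeyFile "/usr/local/apache/conf/server.key"
</VirtualHost>
```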

Terms / Authentication Elements:
  Authentication Type – Type of encryption used during transport of the authentication credentials (Basic or Digest)
  Authentication Method/Provider – Process by which a user is verified to be who they say they are
  Authorization – Process by which authenticated users are granted or denied access based on specific criteria
Before Apache 2.2, every authentication module had to implement all three elements
  Choosing an AuthType limited which authentication and authorization methods could be used
  Potential for inconsistencies across authentication modules
Note: Pay close attention to the words Authentication vs. Authorization

The functionality of each Apache 2.0 authentication module has been split out into the three authentication elements for Apache 2.2
Overlapping functionality among the modules was simply eliminated in favor of a base implementation
The module name indicates which element of the authentication functionality it performs
  mod_auth_xxx – Implements an Authentication Type
  mod_authn_xxx – Implements an Authentication Method or Provider
  mod_authz_xxx – Implements an Authorization Method

New Modules – Authentication Type
Modules and their directives:
mod_auth_basic – Basic authentication: user credentials are received by the server as unencrypted data
• AuthBasicAuthoritative
• AuthBasicProvider
mod_auth_digest – MD5 Digest authentication: user credentials are received by the server in encrypted format
• AuthDigestAlgorithm
• AuthDigestDomain
• AuthDigestNcCheck
• AuthDigestNonceFormat
• AuthDigestNonceLifetime
• AuthDigestProvider
• AuthDigestQop
• AuthDigestShmemSize
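
To show how the Digest directives are used together, here is a sketch of a protected location. The realm name, URL path and password file location are placeholders.

```apache
<Location /private/>
    # Authentication type: Digest instead of Basic
    AuthType Digest
    AuthName "private area"
    # URIs that share this protection space
    AuthDigestDomain /private/
    # Authentication provider: a flat file of users
    AuthDigestProvider file
    AuthUserFile /web/auth/.digest_pw
    # Authorization: any successfully authenticated user
    Require valid-user
</Location>
```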

New Modules – Authentication Providers
Modules and their directives:
mod_authn_anon – Allows “anonymous” user access to authenticated areas
• Anonymous
• Anonymous_LogEmail
• Anonymous_MustGiveEmail
• Anonymous_NoUserID
• Anonymous_VerifyEmail
mod_authn_dbm – DBM file based user authentication
• AuthDBMType
• AuthDBMUserFile
mod_authn_default – Authentication fallback module
• AuthDefaultAuthoritative
mod_authn_file – File based user authentication
• AuthUserFile
mod_authnz_ldap – LDAP directory based authentication
• AuthLDAPBindDN
• AuthLDAPBindPassword
• AuthLDAPCharsetConfig
• AuthLDAPDereferenceAliases
• AuthLDAPRemoteUserIsDN
• AuthLDAPUrl

New Modules – Authorization
Modules and their directives:
mod_authnz_ldap – LDAP directory based authorization
• Require ldap-user
• Require ldap-group
• Require ldap-dn
• Require ldap-attribute
• Require ldap-filter
• AuthLDAPCompareDNOnServer
• AuthLDAPGroupAttribute
• AuthLDAPGroupAttributeIsDN
• AuthzLDAPAuthoritative
mod_authz_default – Authorization fallback module
• AuthzDefaultAuthoritative
mod_authz_dbm – DBM file based group authorization
• Require file-group*
• Require group
• AuthDBMGroupFile
• AuthzDBMAuthoritative
• AuthzDBMType
mod_authz_groupfile – File based group authorization
• Require file-group*
• Require group
• AuthGroupFile
• AuthzGroupFileAuthoritative
mod_authz_host – Group authorization based on host (name or IP address)
• Allow
• Deny
• Order
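
As one example of pairing an authentication provider with a separate authorization module, file based group authorization could be sketched as follows. The directory, file names and group name are hypothetical.

```apache
LoadModule auth_basic_module modules/mod_auth_basic.so
LoadModule authn_file_module modules/mod_authn_file.so
LoadModule authz_groupfile_module modules/mod_authz_groupfile.so

<Directory /www/docs/private>
    AuthType Basic
    AuthName "Private"
    # Authentication: look up the user in a flat file
    AuthBasicProvider file
    AuthUserFile /www/users/users.dat
    # Authorization: the user must belong to the "admins" group
    AuthGroupFile /www/users/groups.dat
    Require group admins
</Directory>
```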

New Modules – Authorization
Modules and their directives:
mod_authz_owner – Authorization based on file ownership
• Require file-owner
• AuthzOwnerAuthoritative
mod_authz_user – User authorization
• Require valid-user
• Require user
• AuthzUserAuthoritative

New Directives
  AuthBasicProvider On|Off|provider-name [provider-name] …
  AuthDigestProvider On|Off|provider-name [provider-name] …
  AuthzXXXAuthoritative On|Off
Renamed Directives
  AuthBasicAuthoritative On|Off
Multiple modules must be loaded (auth, authn, authz) rather than a single mod_auth_xxx module

Apache 2.0
  Require Valid-User
  Require User user-id [user-id] …
  Require Group group-name [group-name] …
Apache 2.2
  Same as Apache 2.0
  LDAP – ldap-user, ldap-group, ldap-dn, ldap-filter, ldap-attribute
  GroupFile – file-group*
  DBM – file-group*
  Owner – file-owner
  Since multiple authorization methods can be used, in most cases the type names should be unique

LoadModule auth_basic_module modules/mod_auth_basic.so
LoadModule authn_file_module modules/mod_authn_file.so
LoadModule authz_user_module modules/mod_authz_user.so
LoadModule authz_host_module modules/mod_authz_host.so
<Directory /www/docs>
    Order deny,allow
    Allow from all
    AuthType Basic
    AuthName Authentication_Test
    AuthBasicProvider file
    AuthUserFile /www/users/users.dat
    Require valid-user
</Directory>
The authentication provider is file based and the authorization method is any valid-user

LoadModule auth_basic_module modules/mod_auth_basic.so
LoadModule authn_file_module modules/mod_authn_file.so
LoadModule authz_user_module modules/mod_authz_user.so
LoadModule authz_host_module modules/mod_authz_host.so
LoadModule authnz_ldap_module modules/mod_authnz_ldap.so
LoadModule ldap_module modules/mod_ldap.so
<Directory /www/docs>
    Order deny,allow
    Allow from all
    AuthType Basic
    AuthName Authentication_Test
    AuthBasicProvider file ldap
    AuthUserFile /www/users/users.dat
    AuthLDAPURL ldap://ldap.server.com/o=my-context
    AuthzLDAPAuthoritative off
    Require valid-user
</Directory>
The authentication includes both file and LDAP providers, with the file provider taking precedence, followed by LDAP

Moving from hook-based to provider-based authorization
“AND/OR/NOT” logic in authorization
Host Access Control as an authorization type
  Require IP …, Require Host …, Require Env …
  Require All Granted, Require All Denied
“Order Allow/Deny”, “Satisfy”: where did they go?
  For backward compatibility with the 2.0/2.2 Host Access Control, use the mod_access_compat module

Allows authorization to be granted or denied based on a complex set of “Require…” statements
New Directives
  <SatisfyAll> … </SatisfyAll> – Must satisfy all of the encapsulated statements
  <SatisfyOne> … </SatisfyOne> – Must satisfy at least one of the encapsulated statements
  <RequireAlias> … </RequireAlias> – Defines a ‘Require’ alias
  Reject – Reject all matching elements