Next: 4.5 The Configuration Processor
Up: 4. Inside Apache
Previous: 4.3 Multitasking server architectures
Contents
Index
4.4 The Request-Response Loop
The Request-Response Loop is the heart of the HTTP Server. Every
Apache child server runs this loop until it terminates, either because
the master server asked it to exit or because it detected that
its generation is obsolete.
Figure 4.13 shows the request-response
loop and the keep-alive loop. To be exact, the request-response
loop deals with waiting for and accepting connection requests while
the keep-alive loop deals with receiving and responding to HTTP requests
on that connection.
Depending on the multitasking architecture used, each idle worker
either tries to become the listener or waits for a job in a job queue.
In both cases it is suspended until the mutex indicates that it is
the next listener or the job queue indicates that a new job is
available.
The transitions in the rectangle ``wait for TCP request'' in figure
4.13 show the Leader-Follower
model of the Preforking MPM: The child server task tries to get the
accept mutex to become the listener. Once it holds the mutex, it is
suspended again until a TCP connection request comes in. It then
accepts the connection and releases the accept mutex (see also
child_main() in prefork.c).
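The accept-mutex behavior described above can be modeled in a few lines. This is a minimal sketch, not the code of child_main(): threads stand in for the child server processes, a queue stands in for the listening socket, and the names child_main_sketch, pending and handled are assumptions made for illustration.

```python
import queue
import threading

# Hypothetical stand-in for the listening socket: a queue of pending
# connection requests. None is used as the "please exit" marker.
pending = queue.Queue()
for conn_id in range(6):
    pending.put(conn_id)

accept_mutex = threading.Lock()   # only the holder may accept
handled = []                      # (worker, connection) pairs
handled_lock = threading.Lock()   # protects the shared list above

def child_main_sketch(worker_id):
    while True:
        with accept_mutex:        # become the listener
            conn = pending.get()  # suspended until a connection arrives
        # The mutex is released here, so another idle worker can
        # become the listener while this one handles the connection.
        if conn is None:          # asked to exit
            return
        with handled_lock:        # "process" the connection
            handled.append((worker_id, conn))

workers = [threading.Thread(target=child_main_sketch, args=(i,))
           for i in range(3)]
for w in workers:
    w.start()
for _ in workers:                 # one exit marker per worker
    pending.put(None)
for w in workers:
    w.join()
```

Every connection is accepted by exactly one worker, because only the holder of the mutex reads from the queue. Which worker gets which connection is not deterministic.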
After a child server has received a connection request, it leaves the
scope of the MPM and triggers the hooks pre_connection
and process_connection.
The module http_core.c registers the handler ap_process_http_connection()
for the latter hook; this handler reads and processes the request.
4.4.3 Waiting for and reading HTTP requests
An HTTP client, for example a browser, re-uses an existing TCP connection
for a sequence of requests. An HTML document with 5 images results
in a sequence of 6 HTTP requests that can use the same TCP connection.
The TCP connection is closed after a time-out period (usually 15
seconds). The loop takes its name from the value ``keep-alive'' of
the HTTP header used to control the connection.
The keep-alive loop for reading and processing HTTP requests is specific
to HTTP. Therefore, in Apache 2, the module http_core.c
registers the handler ap_process_http_connection(),
which includes the keep-alive loop. Similar to the transient pool,
the request pool is cleared with every run of the keep-alive loop.
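The shape of the keep-alive loop can be sketched as follows. This is a simplified model under stated assumptions, not the body of ap_process_http_connection(): the classes Request and Connection and the function process_http_connection_sketch are hypothetical, an empty list of incoming requests models the time-out, and a plain list models the request pool.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    path: str
    keep_alive: bool = True   # what the client asked for

@dataclass
class Connection:
    incoming: list                                    # requests still to arrive
    request_pool: list = field(default_factory=list)  # per-request memory
    served: list = field(default_factory=list)

def process_http_connection_sketch(conn):
    """Serve requests on one TCP connection until it should be closed."""
    while conn.incoming:                 # nothing left models the time-out
        request = conn.incoming.pop(0)   # read the next request
        conn.request_pool.append("buffers for " + request.path)
        conn.served.append(request.path) # process the request
        conn.request_pool.clear()        # cleared on every pass of the loop
        if not request.keep_alive:       # client asked to close
            break
    return conn.served

# An HTML document with 5 images: 6 requests on one connection.
page = Connection(incoming=[Request("/index.html")] +
                  [Request("/img%d.png" % i) for i in range(1, 6)])
served = process_http_connection_sketch(page)

# A client that does not ask for keep-alive gets only one response.
closing = Connection(incoming=[Request("/a", keep_alive=False), Request("/b")])
served_closing = process_http_connection_sketch(closing)
```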
The child server reads the request header with the procedure ap_read_request()
in protocol.c; the request body is left to the corresponding content
handler. It stores the result of the parsing
in the data structure request_rec. Many errors can occur in this
phase (see footnote 4.6). Note that only the header of the HTTP request is read at this time!
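To illustrate that only the header is consumed at this point, here is a minimal sketch of header parsing. It is not ap_read_request(): the class RequestRec only borrows the idea of request_rec, the field names are assumptions, and none of the error handling mentioned above is modeled.

```python
import io
from dataclasses import dataclass, field

@dataclass
class RequestRec:
    """Hypothetical, much reduced stand-in for request_rec."""
    method: str
    uri: str
    protocol: str
    headers_in: dict = field(default_factory=dict)

def read_request_sketch(stream):
    """Parse the request line and header fields; leave the body unread."""
    request_line = stream.readline().decode("ascii").rstrip("\r\n")
    method, uri, protocol = request_line.split(" ", 2)
    rec = RequestRec(method, uri, protocol)
    while True:
        line = stream.readline().decode("ascii").rstrip("\r\n")
        if line == "":                   # empty line ends the header
            break
        name, _, value = line.partition(":")
        rec.headers_in[name.strip().lower()] = value.strip()
    return rec

raw = (b"POST /form HTTP/1.1\r\n"
       b"Host: example.org\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"a=b&c")
stream = io.BytesIO(raw)                 # stands in for the TCP connection
rec = read_request_sketch(stream)
unread_body = stream.read()              # still waiting for a content handler
```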
4.4.4 Process HTTP Request
After the HTTP header has been read, the child server status changes
to ``busy_write''. Now it's
time to respond to the request.
Figure 4.14 shows the details
of the request processing in Apache 2.0. Request processing in Apache
1.3 is almost the same. The only major difference is that Apache 1.3
allows only a single content handler, whereas in Apache 2.0 multiple
modules can take part in forming the response because the filter
concept is used.
The procedure ap_process_request()
in http_request.c calls process_request_internal()
in request.c. What happens in this procedure is shown
in figure 4.14, which is similar
to figure 3.9 in section 3.3
but provides technical details and explanations:
- First the URI is normalized (ap_unescape_url(),
ap_getparents()):
Apache decodes escape character sequences like ``%20''
and removes path 'noise' like ``./xxx'' or ``xxx/../''.
- Then Apache retrieves the configuration for the Request URI: location_walk().
This happens before a module translates the URI because it can influence
the way the URI is translated. Detailed information about Apache's
configuration management can be found in section 4.5.
- ap_translate_name():
Some module handler must translate the request URI into a local resource
name, usually a path in the file system.
- Apache then retrieves the configuration information for the URI
once more with location_walk(), because the URI can differ from the
request URI after being processed by ap_translate_name().
The core handler for the hook map_to_storage,
core_map_to_storage()
calls ap_directory_walk()
and ap_file_walk()
which collect and merge configuration information for the path and
the file name of the requested resource. The result is the configuration
that will be given to all module handlers that process this request.
(``walk'': Apache traverses the configuration information of every
section of the path or URI from the root to the leaf and merges them.
The .htaccess files are read by directory_walk() in
every directory and by file_walk() in the leaf directory.)
- header_parser:
Every module has the opportunity to process the header (to read cookies
for example).
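The first steps of the list above, cleaning the URI and then walking the configuration from the root to the leaf, can be sketched as follows. This is an illustration under assumptions, not Apache's algorithm: clean_uri_sketch and walk_sketch are hypothetical names, the SECTIONS table is invented sample data, trailing slashes are dropped for brevity, and real per-module merging is more elaborate than overwriting dictionary keys.

```python
from urllib.parse import unquote

def clean_uri_sketch(uri):
    """Decode %xx escapes, then drop './' and 'xxx/../' path noise."""
    path = unquote(uri)
    stack = []
    for segment in path.split("/"):
        if segment in ("", "."):       # empty or './' segment: noise
            continue
        if segment == "..":            # 'xxx/../' cancels the parent
            if stack:
                stack.pop()
            continue
        stack.append(segment)
    return "/" + "/".join(stack)

# Hypothetical per-directory configuration sections.
SECTIONS = {
    "/": {"Options": "None", "AllowOverride": "None"},
    "/docs": {"Options": "Indexes"},
    "/docs/my files": {"AuthType": "Basic"},
}

def walk_sketch(path, sections):
    """Merge the sections from the root down to the leaf of the path."""
    merged = dict(sections.get("/", {}))
    prefix = ""
    for segment in path.strip("/").split("/"):
        prefix += "/" + segment
        merged.update(sections.get(prefix, {}))   # deeper sections win
    return merged

cleaned = clean_uri_sketch("/docs/./my%20files/tmp/../index.html")
config = walk_sketch(cleaned, SECTIONS)
```

The merged result is what the text calls the configuration that is given to all module handlers processing the request.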
There are two independent authorization checks:
- Access check based on the IP address of the client computer
- Authorization check based on the identity of the client user (to get
the identity, an authentication check is necessary)
The administrator can configure for each resource:
- users, groups, IP addresses and network addresses
- the rules for the authorization check (allow or deny IP addresses or
users; require either that both the IP check and the identity check
succeed, or that at least one of the two succeeds)
The complex behavior of the authorization check could not be illustrated
completely in figure 4.14.
Use the matrix on the left-hand side to map the program statements
to the operations.
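How the two independent checks combine can be sketched as follows. This is a minimal model under assumptions: the function names, the rule dictionary and its keys are hypothetical, and the user is assumed to be already authenticated (None means no identity is known).

```python
import ipaddress

def access_check(client_ip, allowed_networks):
    """Check based on the IP address of the client computer."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed_networks)

def auth_check(user, allowed_users):
    """Check based on the (already authenticated) identity of the user."""
    return user is not None and user in allowed_users

def request_allowed(client_ip, user, rule):
    ip_ok = access_check(client_ip, rule["networks"])
    user_ok = auth_check(user, rule["users"])
    if rule["satisfy"] == "all":       # both checks must succeed
        return ip_ok and user_ok
    if rule["satisfy"] == "any":       # one successful check is enough
        return ip_ok or user_ok
    raise ValueError("satisfy must be 'all' or 'any'")

# Hypothetical rules for one resource.
strict = {"networks": ["192.168.0.0/16"], "users": {"alice"}, "satisfy": "all"}
relaxed = dict(strict, satisfy="any")
```

With these sample rules, an anonymous client inside 192.168.0.0/16 is rejected by strict but accepted by relaxed, and alice is accepted by relaxed from any address.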
- access_checker:
IP-based authorization check
- ap_check_user_id:
authentication check
- auth_checker:
authorization check based on the user's identity
- type_checker:
get the MIME type of the requested resource. This is important for
selecting the corresponding content handler.
- fixups:
Any module can write data into the response header (to set a cookie,
for example).
- insert_filter:
Any module can register output filters.
- handler:
The module registered for the MIME type of the resource offers one
or more content handlers to create the response data (header and body).
Some handlers, e.g. the CGI module content handler, read the body
of the request. The content handler sends the response body through
the output filter chain.
- ap_finalize_request_protocol():
This method should have been named ``finalize response''. Its
sole purpose is to send the terminating protocol information for any
wrappers around the response message body (i.e. transfer encodings).
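The relation between the content handler and the output filter chain in the last steps can be sketched as follows. This is a simplified model, not Apache's filter API: the handler and filter functions are hypothetical, and plain strings stand in for the data that passes through the chain.

```python
def content_handler_sketch():
    """Hypothetical content handler: produces the body in chunks."""
    yield "hello "
    yield "world"

def upper_filter(chunks):
    """Output filter that transforms every chunk."""
    for chunk in chunks:
        yield chunk.upper()

def footer_filter(chunks):
    """Output filter that appends data after the last chunk."""
    for chunk in chunks:
        yield chunk
    yield " (footer)"

def send_through_filters(chunks, output_filters):
    """Pass the handler's output through each registered filter in order."""
    for output_filter in output_filters:
        chunks = output_filter(chunks)
    return "".join(chunks)

body = send_through_filters(content_handler_sketch(),
                            [upper_filter, footer_filter])
# Registering the same filters in the opposite order changes the result,
# because the footer then passes through upper_filter as well.
body_reversed = send_through_filters(content_handler_sketch(),
                                     [footer_filter, upper_filter])
```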
The error handling shown on the left side is 'misused' if Apache wants
to skip further request processing. The procedure ap_die()
checks the status and sends an error message only if an error occurred.
This ``misuse'' happens, for example, if ap_translate_name
is successful and returns ``DONE'': the remaining steps are skipped,
but no error message is sent.
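The way a status such as DONE ends processing early through the error path can be sketched as follows. This is a simplified model under assumptions: the functions and handlers are hypothetical, every hook is treated as "first handler that does not decline wins", and a hook that every handler declines is simply skipped, which is not how Apache treats all of its hooks. The numeric values of OK, DECLINED and DONE are the ones defined in Apache's httpd.h.

```python
OK, DECLINED, DONE = 0, -1, -2    # values as defined in httpd.h
HTTP_FORBIDDEN = 403

def run_hook(handlers, request):
    """Return the status of the first handler that does not decline."""
    for handler in handlers:
        status = handler(request)
        if status != DECLINED:
            return status
    return DECLINED

def die_sketch(status, request):
    """Error path: an error message is sent only for a real error."""
    if status == DONE:
        request["log"].append("done: no error page")
    else:
        request["log"].append("error page %d" % status)
    return status

def process_request_sketch(request, phases):
    for name, handlers in phases:
        status = run_hook(handlers, request)
        if status not in (OK, DECLINED):   # leave the normal path
            return die_sketch(status, request)
        request["log"].append("passed " + name)
    return OK

def translate_and_answer(request):         # hypothetical module handler
    request["log"].append("translate_name wrote the response itself")
    return DONE

def deny_everyone(request):                # hypothetical module handler
    return HTTP_FORBIDDEN

def accept(request):                       # hypothetical module handler
    return OK

early = {"log": []}
early_status = process_request_sketch(
    early, [("translate_name", [translate_and_answer]), ("handler", [accept])])

denied = {"log": []}
denied_status = process_request_sketch(
    denied, [("translate_name", [accept]),
             ("access_checker", [deny_everyone]),
             ("handler", [accept])])
```

In both runs the content handler phase is never reached, but only the second run produces an error page.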
For more information on filters, check section 3.3.4.
Footnotes
- 4.6: If you want to check this, establish a TCP/IP connection with telnet
(telnet hostname port) and type in a correct HTTP request
according to HTTP/1.1.
Apache Modeling Portal
2004-10-29