By Evan Miller
DRAFT: July 14, 2008
To fully appreciate Nginx, the web server, it helps to understand Batman, the comic book character.
Batman is fast. Nginx is fast. Batman fights crime. Nginx fights wasted CPU cycles and memory leaks. Batman performs well under pressure. Nginx, for its part, excels under heavy server loads.
But Batman would be almost nothing without the Batman utility belt.
Figure 1: The Batman utility belt, gripping Christian Bale's love handles.
At any given time, Batman's utility belt might contain a lock pick, several batarangs, bat-cuffs, a bat-tracer, bat-darts, night vision goggles, thermite grenades, smoke pellets, a flashlight, a kryptonite ring, an acetylene torch, or an Apple iPhone. When Batman needs to tranquilize, blind, deafen, stun, track, stop, smoke out, or text-message the enemy, you better believe he's reaching down for his bat-belt. The belt is so crucial to Batman's operations that if Batman had to choose between wearing pants and wearing the utility belt, he would definitely choose the belt. In fact, he *did* choose the utility belt, and that's why Batman wears rubber tights instead of pants (Fig. 1).
Instead of a utility belt, Nginx has a module chain. When Nginx needs to gzip or chunk-encode a response, it whips out a module to do the work. When Nginx blocks access to a resource based on IP address or HTTP auth credentials, a module does the deflecting. When Nginx communicates with Memcache or FastCGI servers, a module is the walkie-talkie.
Batman's utility belt holds a lot of doo-hickeys, but occasionally Batman needs a new tool. Maybe there's a new enemy against whom bat-cuffs and batarangs are ineffectual. Or Batman needs a new ability, like being able to breathe underwater. That's when Batman rings up Lucius Fox to engineer the appropriate bat-gadget.
Figure 2: Bruce Wayne (née Batman) consults with his engineer, Lucius Fox
The purpose of this guide is to teach you the details of Nginx's module chain, so that you may be like Lucius Fox. When you're done with the guide, you'll be able to design and produce high-quality modules that enable Nginx to do things it couldn't do before. Nginx's module system has a lot of nuance and nitty-gritty, so you'll probably want to refer back to this document often. I have tried to make the concepts as clear as possible, but I'll be blunt: writing Nginx modules can still be hard work.
But whoever said making bat-tools would be easy?
You should be comfortable with C. Not just "C-syntax"; you should know your way around a struct and not be scared off by pointers and function references, and be cognizant of the preprocessor. If you need to brush up, nothing beats K&R.
Basic understanding of HTTP is useful. You'll be working on a web server, after all.
You should also be familiar with Nginx's configuration file. If you're not, here's the gist of it: there are four contexts (called main, server, upstream, and location) which can contain directives with one or more arguments. Directives in the main context apply to everything; directives in the server context apply to a particular host/port; directives in the upstream context refer to a set of backend servers; and directives in a location context apply only to matching web locations (e.g., "/", "/images", etc.) A location context inherits from the surrounding server context, and a server context inherits from the main context. The upstream context neither inherits nor imparts its properties; it has its own special directives that don't really apply elsewhere. I'll refer to these four contexts quite a bit, so... don't forget them.
Let's get started.
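Nginx modules have three roles we'll cover:
handlers process a request and produce output
filters manipulate the output produced by a handler
load-balancers choose a backend server to send a request to, when more than one backend server is eligible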
Modules do all of the "real work" that you might associate with a web server: whenever Nginx serves a file or proxies a request to another server, there's a handler module doing the work; when Nginx gzips the output or executes a server-side include, it's using filter modules. The "core" of Nginx simply takes care of all the network and application protocols and sets up the sequence of modules that are eligible to process a request. The de-centralized architecture makes it possible for *you* to make a nice self-contained unit that does something you want.
Note: Unlike modules in Apache, Nginx modules are not dynamically linked. (In other words, they're compiled right into the Nginx binary.)
How does a module get invoked? Typically, at server startup, each handler gets a chance to attach itself to particular locations defined in the configuration; if more than one handler attaches to a particular location, only one will "win" (but a good config writer won't let a conflict happen). A handler can return in three ways: all is good, there was an error, or it can decline to process the request and defer to the default handler (typically something that serves static files).
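The three outcomes can be pictured with a small stand-alone sketch. Everything here (the `DEMO_*` codes, `demo_gif_handler`, `demo_dispatch`) is a hypothetical stand-in and not part of Nginx; the real return codes and the real dispatch logic live in the Nginx core.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical return codes; Nginx defines its own NGX_* values. */
#define DEMO_OK         0
#define DEMO_ERROR     (-1)
#define DEMO_DECLINED  (-2)

typedef int (*demo_handler_pt)(const char *uri);

/* Stand-in for the default handler: pretend to serve a static file. */
static int
demo_static_handler(const char *uri)
{
    (void) uri;
    return DEMO_OK;
}

/* A custom handler that only cares about URIs ending in ".gif". */
static int
demo_gif_handler(const char *uri)
{
    size_t len = strlen(uri);

    if (len < 5 || strcmp(uri + len - 4, ".gif") != 0) {
        return DEMO_DECLINED;        /* not ours: defer to the default */
    }
    if (uri[len - 5] == '/') {
        return DEMO_ERROR;           /* no file name before ".gif" */
    }
    return DEMO_OK;
}

/* Run the location's handler; fall back to the default if it declines.
 * *used_default reports which path was taken. */
static int
demo_dispatch(demo_handler_pt handler, const char *uri, int *used_default)
{
    int rc;

    assert(uri != NULL && used_default != NULL);

    *used_default = 0;
    rc = (handler != NULL) ? handler(uri) : DEMO_DECLINED;
    if (rc == DEMO_DECLINED) {
        *used_default = 1;
        rc = demo_static_handler(uri);
    }
    return rc;
}
```

Calling `demo_dispatch(demo_gif_handler, "/index.html", &flag)` falls through to the static handler, while `"/circle.gif"` is answered by the custom handler itself.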
If the handler happens to be a reverse proxy to some set of backend servers, there is room for another type of module: the load-balancer. A load-balancer takes a request and a set of backend servers and decides which server will get the request. Nginx ships with two load-balancing modules: round-robin, which deals out requests like cards at the start of a poker game, and the "IP hash" method, which ensures that a particular client will hit the same backend server across multiple requests.
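To get a feel for the difference between the two strategies, here is a rough stand-alone sketch. The types, function names, and the toy hash are all invented for illustration; Nginx's own load-balancing modules are more involved (weights, failure tracking, and so on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round-robin state: hands out backend indexes 0, 1, ..., n-1, 0, ... */
typedef struct {
    size_t next;        /* index to hand out on the next call */
    size_t nbackends;   /* number of backends, must be > 0 */
} demo_rr_t;

static size_t
demo_rr_pick(demo_rr_t *rr)
{
    size_t chosen;

    assert(rr != NULL && rr->nbackends > 0);

    chosen = rr->next;
    rr->next = (rr->next + 1) % rr->nbackends;
    return chosen;
}

/* Hash-based choice: the same IPv4 address always maps to the same
 * backend index. The mixing steps are a toy hash, not Nginx's. */
static size_t
demo_ip_hash_pick(uint32_t ipv4, size_t nbackends)
{
    uint32_t h = ipv4;

    assert(nbackends > 0);

    h ^= h >> 16;
    h *= 0x45d9f3bu;
    h ^= h >> 16;
    return (size_t) (h % nbackends);
}
```

Round-robin needs mutable state but no knowledge of the client; the hash needs no state but depends entirely on the client address.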
If the handler does not produce an error, the filters are called. Multiple filters can hook into each location, so that (for example) a response can be compressed and then chunked. The order of their execution is determined at compile-time. Filters have the classic "CHAIN OF RESPONSIBILITY" design pattern: one filter is called, does its work, and then calls the next filter, until the final filter is called, and Nginx finishes up the response.
The really cool part about the filter chain is that each filter doesn't wait for the previous filter to finish; it can process the previous filter's output as it's being produced, sort of like the Unix pipeline. Filters operate on buffers, which are usually the size of a page (4K), although you can change this in your nginx.conf. This means, for example, a module can start compressing the response from a backend server and stream it to the client before the module has received the entire response from the backend. Nice!
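So to wrap up the conceptual overview, the typical processing cycle goes:
Client sends an HTTP request
Nginx chooses the appropriate handler based on the location config
(If applicable) the load-balancer picks a backend server
The handler does its thing and passes each output buffer to the first filter
Each filter passes its output to the next filter in the chain
The final response is sent to the client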
I say "typically" because Nginx's module invocation is extremely customizable. It places a big burden on module writers to define exactly how and when the module should run (I happen to think too big a burden). Invocation is actually performed through a series of callbacks, and there are a lot of them. Namely, you can provide a function to be executed:
Holy mackerel! It's a bit overwhelming. You've got a lot of power at your disposal, but you can still do something useful using only a couple of these hooks and a couple of corresponding functions. Time to dive into some modules.
As I said, you have a lot of flexibility when it comes to making an Nginx module. This section will describe the parts that are almost always present. It's intended as a guide for understanding a module, and a reference for when you think you're ready to start writing a module.
Modules can define up to three configuration structs, one for the main, server, and location contexts. Most modules just need a location configuration. The naming convention for these is ngx_http_<module name>_(main|srv|loc)_conf_t. Here's an example, taken from the dav module:
typedef struct {
ngx_uint_t methods;
ngx_flag_t create_full_put_path;
ngx_uint_t access;
} ngx_http_dav_loc_conf_t;
Notice that Nginx has special data types (ngx_uint_t and ngx_flag_t); these are just aliases for the primitive data types you know and love (cf. core/ngx_config.h if you're curious).
The elements in the configuration structs are populated by module directives.
A module's directives appear in a static array of ngx_command_ts. Here's an example of how they're declared, taken from a small module I wrote:
static ngx_command_t ngx_http_circle_gif_commands[] = {
{ ngx_string("circle_gif"),
NGX_HTTP_LOC_CONF|NGX_CONF_NOARGS,
ngx_http_circle_gif,
NGX_HTTP_LOC_CONF_OFFSET,
0,
NULL },
{ ngx_string("circle_gif_min_radius"),
NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE1,
ngx_conf_set_num_slot,
NGX_HTTP_LOC_CONF_OFFSET,
offsetof(ngx_http_circle_gif_loc_conf_t, min_radius),
NULL },
...
ngx_null_command
};
And here is the declaration of ngx_command_t (the struct we're declaring), found in core/ngx_conf_file.h:
struct ngx_command_t {
ngx_str_t name;
ngx_uint_t type;
char *(*set)(ngx_conf_t *cf, ngx_command_t *cmd, void *conf);
ngx_uint_t conf;
ngx_uint_t offset;
void *post;
};
It seems like a bit much, but each element has a purpose.
The name is the directive string, no spaces. The data type is an ngx_str_t, which is usually instantiated with just (e.g.) ngx_string("proxy_pass"). Note: an ngx_str_t is a struct with a data element, which is a string, and a len element, which is the length of that string. Nginx uses this data structure most places you'd expect a string.
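To make the counted-string idea concrete, here is a minimal stand-alone sketch. The names `demo_str_t`, `DEMO_STRING`, and `demo_str_eq` are made up for illustration and only mimic the shape described above; the real definitions live in the Nginx source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a counted string: a length plus a pointer
 * to the bytes. Nothing requires the bytes to be NUL-terminated. */
typedef struct {
    size_t         len;
    unsigned char *data;
} demo_str_t;

/* Build a demo_str_t from a string literal. sizeof counts the trailing
 * NUL, so subtract one to get the visible length. */
#define DEMO_STRING(literal) \
    { sizeof(literal) - 1, (unsigned char *) (literal) }

/* Compare two counted strings without relying on a terminator. */
static int
demo_str_eq(const demo_str_t *a, const demo_str_t *b)
{
    assert(a != NULL && b != NULL);

    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

Because the length travels with the pointer, a string like this can point into the middle of a larger buffer (a request line, say) without copying.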
type is a set of flags that indicate where the directive is legal and how many arguments the directive takes. Applicable flags, which are bitwise-OR'd, are:
NGX_HTTP_MAIN_CONF: directive is valid in the main config
NGX_HTTP_SRV_CONF: directive is valid in the server (host) config
NGX_HTTP_LOC_CONF: directive is valid in a location config
NGX_HTTP_UPS_CONF: directive is valid in an upstream config
NGX_CONF_NOARGS: directive can take 0 arguments
NGX_CONF_TAKE1: directive can take exactly 1 argument
NGX_CONF_TAKE2: directive can take exactly 2 arguments
NGX_CONF_TAKE7: directive can take exactly 7 arguments
NGX_CONF_FLAG: directive takes a boolean ("on" or "off")
NGX_CONF_1MORE: directive must be passed at least one argument
NGX_CONF_2MORE: directive must be passed at least two arguments
There are a few other options, too, see core/ngx_conf_file.h.
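As a rough illustration of how a bitwise-OR'd type field can be checked, consider the sketch below. The `DEMO_*` constants and their values are invented for this example (the real NGX_* constants have different values), and `demo_directive_allowed` is a hypothetical helper, not an Nginx function.

```c
#include <assert.h>

/* Hypothetical flag values, one bit each. */
#define DEMO_CONF_NOARGS  0x01u    /* directive takes no arguments  */
#define DEMO_CONF_TAKE1   0x02u    /* directive takes one argument  */
#define DEMO_CONF_TAKE2   0x04u    /* directive takes two arguments */
#define DEMO_MAIN_CONF    0x100u   /* legal in the main context     */
#define DEMO_SRV_CONF     0x200u   /* legal in a server context     */
#define DEMO_LOC_CONF     0x400u   /* legal in a location context   */

/* Return 1 if a directive with the given type flags may appear in
 * `context` (exactly one DEMO_*_CONF bit) with `nargs` arguments. */
static int
demo_directive_allowed(unsigned type, unsigned context, unsigned nargs)
{
    unsigned arg_flag;

    assert(nargs <= 2);           /* this sketch models 0..2 arguments */

    arg_flag = 1u << nargs;       /* 0 -> NOARGS, 1 -> TAKE1, 2 -> TAKE2 */

    return (type & context) != 0 && (type & arg_flag) != 0;
}
```

A directive declared with `DEMO_MAIN_CONF|DEMO_SRV_CONF|DEMO_LOC_CONF|DEMO_CONF_TAKE1` passes the check in any of the three contexts, but only with exactly one argument.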
The set struct element is a pointer to a function for setting up part of the module's configuration; typically this function will translate the arguments passed to this directive and save an appropriate value in its configuration struct. This setup function will take three arguments:
a pointer to an ngx_conf_t struct, which contains the arguments passed to the directive
a pointer to the current ngx_command_t struct
a pointer to the module's custom configuration struct
This setup function will be called when the directive is encountered. Nginx provides a number of functions for setting particular types of values in the custom configuration struct. These functions include:
ngx_conf_set_flag_slot: translates "on" or "off" to 1 or 0
ngx_conf_set_str_slot: saves a string as an ngx_str_t
ngx_conf_set_num_slot: parses a number and saves it to an int
ngx_conf_set_size_slot: parses a data size ("8k", "1m", etc.) and saves it to a size_t
There are several others, and they're quite handy (see core/ngx_conf_file.h). Modules can also put a reference to their own function here, if the built-ins aren't quite good enough.
How do these built-in functions know where to save the data? That's where the next two elements of ngx_command_t come in, conf and offset. conf tells Nginx whether this value will get saved to the module's main configuration, server configuration, or location configuration (with NGX_HTTP_MAIN_CONF_OFFSET, NGX_HTTP_SRV_CONF_OFFSET, or NGX_HTTP_LOC_CONF_OFFSET). offset then specifies which part of this configuration struct to write to.
Finally, post is just a pointer to other crap the module might need while it's reading the configuration. It's often NULL.
The commands array is terminated with ngx_null_command as the last element.
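The offset trick is easier to see in isolation. The sketch below is a simplified model, not Nginx code: `demo_loc_conf_t`, `demo_command_t`, and `demo_set_num_slot` are hypothetical names, and the parsing uses plain `strtoul` rather than whatever the built-in slot functions do internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical location configuration, loosely shaped like the
 * circle_gif one. */
typedef struct {
    unsigned long min_radius;
    unsigned long max_radius;
} demo_loc_conf_t;

/* Hypothetical, stripped-down directive record. */
typedef struct {
    const char *name;
    size_t      offset;    /* where in the conf struct the value goes */
} demo_command_t;

/* Generic setter: parse `arg` as a number and store it at
 * conf + cmd->offset, without knowing the struct's type.
 * Returns 0 on success, -1 if strtoul cannot consume the whole string. */
static int
demo_set_num_slot(const demo_command_t *cmd, void *conf, const char *arg)
{
    char          *end;
    unsigned long  value;
    unsigned long *field;

    assert(cmd != NULL && conf != NULL && arg != NULL);

    value = strtoul(arg, &end, 10);
    if (end == arg || *end != '\0') {
        return -1;
    }

    field = (unsigned long *) ((char *) conf + cmd->offset);
    *field = value;
    return 0;
}
```

One setter serves every numeric field: the directive table supplies `offsetof(demo_loc_conf_t, min_radius)` for one directive and `offsetof(demo_loc_conf_t, max_radius)` for another.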
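The module context is a static ngx_http_module_t struct, which just has a bunch of function references for creating the three configurations and merging them together. Its name is ngx_http_<module name>_module_ctx. In order, the function references are: preconfiguration, postconfiguration, creating the main conf, initializing the main conf, creating the server conf, merging it with the main conf, creating the location conf, and merging it with the server conf.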
These take different arguments depending on what they're doing. Here's the struct definition, taken from http/ngx_http_config.h, so you can see the different function signatures of the callbacks:
typedef struct {
ngx_int_t (*preconfiguration)(ngx_conf_t *cf);
ngx_int_t (*postconfiguration)(ngx_conf_t *cf);
void *(*create_main_conf)(ngx_conf_t *cf);
char *(*init_main_conf)(ngx_conf_t *cf, void *conf);
void *(*create_srv_conf)(ngx_conf_t *cf);
char *(*merge_srv_conf)(ngx_conf_t *cf, void *prev, void *conf);
void *(*create_loc_conf)(ngx_conf_t *cf);
char *(*merge_loc_conf)(ngx_conf_t *cf, void *prev, void *conf);
} ngx_http_module_t;
You can set functions you don't need to NULL, and Nginx will figure it out.
Most handlers just use the last two: a function to allocate memory for location-specific configuration (called ngx_http_<module name>_create_loc_conf), and a function to set defaults and merge this configuration with any inherited configuration (called ngx_http_<module name>_merge_loc_conf). The merge function is also responsible for producing an error if the configuration is invalid; these errors halt server startup.
Here's an example module context struct:
static ngx_http_module_t ngx_http_circle_gif_module_ctx = {
NULL, /* preconfiguration */
NULL, /* postconfiguration */
NULL, /* create main configuration */
NULL, /* init main configuration */
NULL, /* create server configuration */
NULL, /* merge server configuration */
ngx_http_circle_gif_create_loc_conf, /* create location configuration */
ngx_http_circle_gif_merge_loc_conf /* merge location configuration */
};
Time to dig in deep a little bit. These configuration callbacks look quite similar across all modules and use the same parts of the Nginx API, so they're worth knowing about.
Here's what a bare-bones create_loc_conf function looks like, taken from the circle_gif module I wrote (see the source). It takes a directive struct (ngx_conf_t) and returns a newly created module configuration struct (in this case ngx_http_circle_gif_loc_conf_t).
static void *
ngx_http_circle_gif_create_loc_conf(ngx_conf_t *cf)
{
ngx_http_circle_gif_loc_conf_t *conf;
conf = ngx_pcalloc(cf->pool, sizeof(ngx_http_circle_gif_loc_conf_t));
if (conf == NULL) {
return NGX_CONF_ERROR;
}
conf->min_radius = NGX_CONF_UNSET_UINT;
conf->max_radius = NGX_CONF_UNSET_UINT;
return conf;
}
First thing to notice is Nginx's memory allocation; it takes care of the free'ing as long as the module uses ngx_palloc (a malloc wrapper) or ngx_pcalloc (a calloc wrapper).
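The "allocate now, free everything later" idea can be sketched without Nginx at all. The toy pool below simply remembers every allocation in a linked list and releases them together; this is a deliberately simplified substitute, and Nginx's real pools work differently under the hood. All names (`demo_pool_t`, `demo_pcalloc`, `demo_destroy_pool`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation. The extra union members
 * only exist to keep the payload that follows generously aligned. */
typedef union demo_block_u demo_block_t;
union demo_block_u {
    demo_block_t *next;
    long double   align_ld;
    long long     align_ll;
};

typedef struct {
    demo_block_t *head;    /* most recent allocation, or NULL */
} demo_pool_t;

/* Zero-filled allocation that the pool will free on the caller's behalf. */
static void *
demo_pcalloc(demo_pool_t *pool, size_t size)
{
    demo_block_t *blk;

    assert(pool != NULL);

    blk = calloc(1, sizeof(demo_block_t) + size);
    if (blk == NULL) {
        return NULL;
    }

    blk->next = pool->head;
    pool->head = blk;
    return blk + 1;        /* payload starts right after the header */
}

/* Free every allocation made from the pool; returns how many there were. */
static size_t
demo_destroy_pool(demo_pool_t *pool)
{
    size_t        freed = 0;
    demo_block_t *blk, *next;

    assert(pool != NULL);

    for (blk = pool->head; blk != NULL; blk = next) {
        next = blk->next;
        free(blk);
        freed++;
    }
    pool->head = NULL;
    return freed;
}
```

The calling code never frees individual objects; it only has to make sure the pool itself is destroyed once, which is the property that makes module code so much shorter.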
The possible UNSET constants are NGX_CONF_UNSET_UINT, NGX_CONF_UNSET_PTR, NGX_CONF_UNSET_SIZE, NGX_CONF_UNSET_MSEC, and the catch-all NGX_CONF_UNSET. The UNSET constants tell the merging function that the value should be overridden.
Here's the merging function used in the circle_gif module:
static char *
ngx_http_circle_gif_merge_loc_conf(ngx_conf_t *cf, void *parent, void *child)
{
ngx_http_circle_gif_loc_conf_t *prev = parent;
ngx_http_circle_gif_loc_conf_t *conf = child;
ngx_conf_merge_uint_value(conf->min_radius, prev->min_radius, 10);
ngx_conf_merge_uint_value(conf->max_radius, prev->max_radius, 20);
if (conf->min_radius < 1) {
ngx_conf_log_error(NGX_LOG_EMERG, cf, 0,
"min_radius must be equal or more than 1");
return NGX_CONF_ERROR;
}
if (conf->max_radius < conf->min_radius) {
ngx_conf_log_error(NGX_LOG_EMERG, cf, 0,
"max_radius must be equal or more than min_radius");
return NGX_CONF_ERROR;
}
return NGX_CONF_OK;
}
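Notice first that Nginx provides nice merging functions for different data types (ngx_conf_merge_<data type>_value); the arguments are
this location's value
the value to inherit if this location's value is not set
the default to use if neither of the first two is set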
The result is then stored in the first argument. Available merge functions include ngx_conf_merge_size_value, ngx_conf_merge_msec_value, and others. See core/ngx_conf_file.h for a full list.
Trivia question: How do these functions write to the first argument, since the first argument is passed in by value?
Answer: these functions are defined by the preprocessor (so they expand to a few "if" statements and assignments before reaching the compiler).
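A stand-alone sketch shows why a macro can do what a function could not. `DEMO_UNSET_UINT`, `DEMO_MERGE_UINT`, and `demo_merge` are hypothetical names written for this example; they follow the "child, then parent, then default" rule described above but are not copied from the Nginx headers.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical sentinel meaning "no directive set this value". */
#define DEMO_UNSET_UINT  UINT_MAX

/* A macro, not a function: `conf` is substituted textually, so the
 * assignment lands on the caller's own variable. */
#define DEMO_MERGE_UINT(conf, prev, default_value)                      \
    do {                                                                \
        if ((conf) == DEMO_UNSET_UINT) {                                \
            (conf) = ((prev) == DEMO_UNSET_UINT) ? (default_value)      \
                                                 : (prev);              \
        }                                                               \
    } while (0)

/* Resolve a child's value given its parent's value and a default. */
static unsigned
demo_merge(unsigned child, unsigned parent, unsigned default_value)
{
    DEMO_MERGE_UINT(child, parent, default_value);

    /* The result is only "unset" if the default itself was unset. */
    assert(child != DEMO_UNSET_UINT || default_value == DEMO_UNSET_UINT);
    return child;
}
```

After preprocessing, the macro call inside `demo_merge` is just an `if` statement that assigns to `child` directly, which is how the "first argument" ends up modified.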
Notice also how errors are produced; the function writes something to the log file, and returns NGX_CONF_ERROR. That return code halts server startup. (Since the message is logged at level NGX_LOG_EMERG, the message will also go to standard error; FYI, core/ngx_log.h has a list of log levels.)
Next we add one more layer of indirection, the ngx_module_t struct. The variable is called ngx_http_<module name>_module. This is where references to the context and directives go, as well as the remaining callbacks (exit thread, exit process, etc.). The module definition is sometimes used as a key to look up data associated with a particular module. The module definition usually looks like this:
ngx_module_t ngx_http_<module name>_module = {
NGX_MODULE_V1,
&ngx_http_<module name>_module_ctx, /* module context */
ngx_http_<module name>_commands, /* module directives */
NGX_HTTP_MODULE, /* module type */
NULL, /* init master */
NULL, /* init module */
NULL, /* init process */
NULL, /* init thread */
NULL, /* exit thread */
NULL, /* exit process */
NULL, /* exit master */
NGX_MODULE_V1_PADDING
};
...substituting <module name> appropriately. Modules can add callbacks for process/thread creation and death, but most modules keep things simple. (For the arguments passed to each callback, see core/ngx_conf_file.h.)
Modules are installed in two ways: handlers are usually installed by a directive's callback, and filters are usually installed by a postconfiguration callback in the module context struct. Finally we're going to tell Nginx where to find our code. (Load-balancers are a special and somewhat convoluted case; see Anatomy of a Load-balancer.)
Handlers are installed by adding code to the callback of the directive that enables the module. For example, my circle gif ngx_command_t looks like this:
{ ngx_string("circle_gif"),
NGX_HTTP_LOC_CONF|NGX_CONF_NOARGS,
ngx_http_circle_gif,
0,
0,
NULL }
The callback is the third element, in this case ngx_http_circle_gif. Recall that the arguments to this callback are the directive struct (ngx_conf_t, which holds the user's arguments), the relevant ngx_command_t struct, and a pointer to the module's custom configuration struct. For my circle gif module, the function looks like:
static char *
ngx_http_circle_gif(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
ngx_http_core_loc_conf_t *clcf;
clcf = ngx_http_conf_get_module_loc_conf(cf, ngx_http_core_module);
clcf->handler = ngx_http_circle_gif_handler;
return NGX_CONF_OK;
}
There are two steps here: first, get the "core" struct for this location, then assign a handler to it. Pretty simple, eh?
Filters are installed in the post-configuration step. There are actually two types of filters: header filters that manipulate the HTTP headers, and body filters that manipulate the payload. We install both in the same place.
Let's take a look at the chunked filter module for a simple example. Its module context looks like this:
static ngx_http_module_t ngx_http_chunked_filter_module_ctx = {
NULL, /* preconfiguration */
ngx_http_chunked_filter_init, /* postconfiguration */
...
};
Here's what happens in ngx_http_chunked_filter_init:
static ngx_int_t
ngx_http_chunked_filter_init(ngx_conf_t *cf)
{
ngx_http_next_header_filter = ngx_http_top_header_filter;
ngx_http_top_header_filter = ngx_http_chunked_header_filter;
ngx_http_next_body_filter = ngx_http_top_body_filter;
ngx_http_top_body_filter = ngx_http_chunked_body_filter;
return NGX_OK;
}
What's going on here? Well, if you remember, filters are set up with a CHAIN OF RESPONSIBILITY. When a handler generates a response, it calls two functions: ngx_http_output_filter, which calls the global function reference ngx_http_top_body_filter; and ngx_http_send_header, which calls the global function reference ngx_http_top_header_filter.
ngx_http_top_body_filter and ngx_http_top_header_filter are the respective "heads" of the body and header filter chains. Each "link" on the chain keeps a function reference to the next link in the chain (the references are called ngx_http_next_body_filter and ngx_http_next_header_filter). When a filter is finished executing, it just calls the next filter, until a specially defined "write" filter is called, which wraps up the HTTP response. What you see in this filter_init function is the module adding itself to the filter chains; it keeps a reference to the old "top" filters in its own "next" variables and declares its functions to be the new "top" filters. (Thus, the last filter to be installed is the first to be executed.)
Side note: how does this work exactly?
Each filter either returns an error code or uses this as the return statement:
return ngx_http_next_body_filter(r, in);
Thus, if the filter chain reaches the (specially-defined) end of the chain, an "OK" response is returned, but if there's an error along the way, the chain is cut short and Nginx serves up the appropriate error message. It's a singly-linked list with fast failures implemented solely with function references. Brilliant.
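Here is the same "save the old top, become the new top" pattern reduced to a toy that compiles on its own. The filters operate on a single int instead of a request and a buffer chain, and every name (`demo_top_filter`, `demo_double_filter`, and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical filter signature: returns 0 on success, nonzero on error. */
typedef int (*demo_filter_pt)(int *value);

/* Terminal "write" filter: the specially defined end of the chain. */
static int
demo_write_filter(int *value)
{
    (void) value;
    return 0;
}

/* Head of the chain; starts out pointing at the terminal filter. */
static demo_filter_pt demo_top_filter = demo_write_filter;

/* Each filter keeps its own private reference to the next link. */
static demo_filter_pt demo_double_next;
static demo_filter_pt demo_check_next;

static int
demo_double_filter(int *value)
{
    *value *= 2;
    return demo_double_next(value);
}

static int
demo_check_filter(int *value)
{
    if (*value < 0) {
        return -1;                 /* error: the chain is cut short */
    }
    return demo_check_next(value);
}

/* Install both filters the way a postconfiguration callback would. */
static void
demo_install_filters(void)
{
    demo_double_next = demo_top_filter;
    demo_top_filter = demo_double_filter;

    demo_check_next = demo_top_filter;
    demo_top_filter = demo_check_filter;

    /* Installed last, so the check filter now runs first. */
    assert(demo_check_next == demo_double_filter);
}
```

Calling `demo_top_filter` on a negative value returns the error from the check filter and never reaches the doubling filter, which is the "fast failure" behavior described above.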
Now we'll put some trivial modules under the microscope to see how they work.
Handlers typically do four things: get the location configuration, generate an appropriate response, send the header, and send the body. A handler has one argument, the request struct. A request struct has a lot of useful information about the client request, such as the request method, URI, and headers. We'll go over these steps one by one.
This part's easy. All you need to do is call ngx_http_get_module_loc_conf and pass in the current request struct and the module definition. Here's the relevant part of my circle gif handler:
static ngx_int_t
ngx_http_circle_gif_handler(ngx_http_request_t *r)
{
ngx_http_circle_gif_loc_conf_t *circle_gif_config;
circle_gif_config = ngx_http_get_module_loc_conf(r, ngx_http_circle_gif_module);
...
Now I've got access to all the variables that I set up in my merge function.
This is the interesting part where modules actually do work.
The request struct will be helpful here, particularly these elements:
typedef struct {
...
/* the memory pool, used in the ngx_palloc functions */
ngx_pool_t *pool;
ngx_str_t uri;
ngx_str_t args;
ngx_http_headers_in_t headers_in;
...
} ngx_http_request_t;
uri is the path of the request, e.g. "/query.cgi".
args is the part of the request after the question mark (e.g. "name=john").
headers_in has a lot of useful stuff, such as cookies and browser information, but many modules don't need anything from it. See http/ngx_http_request.h if you're interested.
This should be enough information to produce some useful output. The full ngx_http_request_t struct can be found in http/ngx_http_request.h.
The response headers live in a struct called headers_out referenced by the request struct. The handler sets the ones it wants and then calls ngx_http_send_header(r). Some useful parts of headers_out include:
typedef struct {
...
ngx_uint_t status;
size_t content_type_len;
ngx_str_t content_type;
ngx_table_elt_t *content_encoding;
off_t content_length_n;
time_t date_time;
time_t last_modified_time;
...
} ngx_http_headers_out_t;
(The rest can be found in http/ngx_http_request.h.)
So for example, if a module were to set the Content-Type to "image/gif", Content-Length to 100, and return a 200 OK response, this code would do the trick:
r->headers_out.status = NGX_HTTP_OK;
r->headers_out.content_length_n = 100;
r->headers_out.content_type.len = sizeof("image/gif") - 1;
r->headers_out.content_type.data = (u_char *) "image/gif";
ngx_http_send_header(r);
Most legal HTTP headers are available (somewhere) for your setting pleasure. However, some headers are a bit trickier to set than the ones you see above; for example, content_encoding has type (ngx_table_elt_t*), so the module must allocate memory for it. This is done with a function called ngx_list_push, which takes in an ngx_list_t (similar to an array) and returns a reference to a newly created member of the list (of type ngx_table_elt_t). The following code sets the Content-Encoding to "deflate" and sends the header:
r->headers_out.content_encoding = ngx_list_push(&r->headers_out.headers);
if (r->headers_out.content_encoding == NULL) {
return NGX_ERROR;
}
r->headers_out.content_encoding->hash = 1;
r->headers_out.content_encoding->key.len = sizeof("Content-Encoding") - 1;
r->headers_out.content_encoding->key.data = (u_char *) "Content-Encoding";
r->headers_out.content_encoding->value.len = sizeof("deflate") - 1;
r->headers_out.content_encoding->value.data = (u_char *) "deflate";
ngx_http_send_header(r);
This mechanism is usually used when a header can have multiple values simultaneously; it (theoretically) makes it easier for filter modules to add and delete certain values while preserving others, because they don't have to resort to string manipulation.
Now that the module has generated a response and put it in memory, it needs to assign the response to a special buffer, and then assign the buffer to a chain link, and then call the "send body" function on the chain link.
What are the chain links for? Nginx lets handler modules generate (and filter modules process) responses one buffer at a time; each chain link keeps a pointer to the next link in the chain, or NULL if it's the last one. We'll keep it simple and assume there is just one buffer.
First, a module will declare the buffer and the chain link:
ngx_buf_t *b;
ngx_chain_t out;
The next step is to allocate the buffer and point our response data to it:
b = ngx_pcalloc(r->pool, sizeof(ngx_buf_t));
if (b == NULL) {
ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
"Failed to allocate response buffer.");
return NGX_HTTP_INTERNAL_SERVER_ERROR;
}
b->pos = some_bytes; /* first position in memory of the data */
b->last = some_bytes + some_bytes_length; /* last position */
b->memory = 1; /* content is in read-only memory */
/* (i.e., filters should copy it rather than rewrite in place) */
b->last_buf = 1; /* there will be no more buffers in the request */
Now the module attaches it to the chain link:
out.buf = b;
out.next = NULL;
FINALLY, we send the body, and return the status code of the output filter chain all in one go:
return ngx_http_output_filter(r, &out);
Buffer chains are a critical part of Nginx's IO model, so you should be comfortable with how they work.
Trivia question: Why does the buffer have the last_buf variable, when we can tell we're at the end of a chain by checking "next" for NULL?
Answer: A chain might be incomplete, i.e., have multiple buffers, but not all the buffers in this request or response. So some buffers are at the end of the chain but not the end of a request. This brings us to...
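The distinction between "end of this chain" and "end of the response" can be modeled in a few lines. The structs below are hypothetical, trimmed-down imitations of a buffer and a chain link, and `demo_chain_size` is an invented helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down buffer. */
typedef struct {
    const unsigned char *pos;        /* first byte of data            */
    const unsigned char *last;       /* one past the final byte       */
    unsigned             last_buf;   /* 1 if the response ends here   */
} demo_buf_t;

/* Hypothetical chain link: a buffer plus a pointer to the next link. */
typedef struct demo_chain_s demo_chain_t;
struct demo_chain_s {
    demo_buf_t   *buf;
    demo_chain_t *next;
};

/* Add up the bytes in one chain and report, via *done, whether any
 * buffer in it was flagged as the end of the whole response. */
static size_t
demo_chain_size(const demo_chain_t *cl, int *done)
{
    size_t total = 0;

    assert(done != NULL);
    *done = 0;

    for ( ; cl != NULL; cl = cl->next) {
        total += (size_t) (cl->buf->last - cl->buf->pos);
        if (cl->buf->last_buf) {
            *done = 1;
        }
    }
    return total;
}
```

A consumer that receives two separate one-link chains sees `next == NULL` both times, but only the second call sets `*done`, so it knows when the response is actually complete.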
I waved my hands a bit about having your handler generate a response. Sometimes you'll be able to get that response just with a chunk of C code, but often you'll want to talk to another server (for example, if you're writing a module to implement another network protocol). You could do all of the network programming yourself, but what happens if you receive a partial response? You don't want to block the primary event loop with your own event loop while you're waiting for the rest of the response. You'd kill Nginx's performance. Fortunately, Nginx lets you hook right into its own mechanisms for dealing with back-end servers (called "upstreams"), so your module can talk to another server without getting in the way of other requests. This section describes how a module talks to an upstream, such as Memcached, FastCGI, or another HTTP server.
Unlike the handler function for other modules, the handler function of an upstream module does little "real work". It does not call ngx_http_output_filter. It merely sets callbacks that will be invoked when the upstream server is ready to be written to and read from. There are actually 6 available hooks:
create_request crafts a request buffer (or chain of them) to be sent to the upstream
reinit_request is called if the connection to the back-end is reset (just before create_request is called for the second time)
process_header processes the first bit of the upstream's response, and usually saves a pointer to the upstream's "payload"
abort_request is called if the client aborts the request
finalize_request is called when Nginx is finished reading from the upstream
input_filter is a body filter that can be called on the response body (e.g., to remove a trailer)
How do these get attached? An example is in order. Here's a simplified version of the proxy module's handler:
static ngx_int_t
ngx_http_proxy_handler(ngx_http_request_t *r)
{
ngx_int_t rc;
ngx_http_upstream_t *u;
ngx_http_proxy_loc_conf_t *plcf;
plcf = ngx_http_get_module_loc_conf(r, ngx_http_proxy_module);
/* set up our upstream struct */
u = ngx_pcalloc(r->pool, sizeof(ngx_http_upstream_t));
if (u == NULL) {
return NGX_HTTP_INTERNAL_SERVER_ERROR;
}
u->peer.log = r->connection->log;
u->peer.log_error = NGX_ERROR_ERR;
u->output.tag = (ngx_buf_tag_t) &ngx_http_proxy_module;
u->conf = &plcf->upstream;
/* attach the callback functions */
u->create_request = ngx_http_proxy_create_request;
u->reinit_request = ngx_http_proxy_reinit_request;
u->process_header = ngx_http_proxy_process_status_line;
u->abort_request = ngx_http_proxy_abort_request;
u->finalize_request = ngx_http_proxy_finalize_request;
r->upstream = u;
rc = ngx_http_read_client_request_body(r, ngx_http_upstream_init);
if (rc >= NGX_HTTP_SPECIAL_RESPONSE) {
return rc;
}
return NGX_DONE;
}
It does a bit of housekeeping, but the important parts are the callbacks. Also notice the bit about ngx_http_read_client_request_body. That's setting another callback for when Nginx has finished reading from the client.
What will each of these callbacks do? Usually, reinit_request, abort_request, and finalize_request will set or reset some sort of internal state and are only a few lines long. The real workhorses are create_request and process_header.
For the sake of simplicity, let's suppose I have an upstream server that reads in one character and prints out two characters. What would my functions look like?
The create_request needs to allocate a buffer for the single-character request, allocate a chain link for that buffer, and then point the upstream struct to that chain link. It would look like this:
static ngx_int_t
ngx_http_character_server_create_request(ngx_http_request_t *r)
{
/* make a buffer and chain */
ngx_buf_t *b;
ngx_chain_t *cl;
b = ngx_create_temp_buf(r->pool, sizeof("a") - 1);
if (b == NULL)
return NGX_ERROR;
cl = ngx_alloc_chain_link(r->pool);
if (cl == NULL)
return NGX_ERROR;
/* hook the buffer to the chain */
cl->buf = b;
cl->next = NULL; /* a freshly allocated link does not come with next cleared */
/* chain to the upstream */
r->upstream->request_bufs = cl;
/* now write to the buffer */
b->pos = (u_char *) "a";
b->last = b->pos + sizeof("a") - 1;
return NGX_OK;
}
That wasn't so bad, was it? Of course, in reality you'll probably want to use the request URI in some meaningful way. It's available as an ngx_str_t in r->uri, the GET parameters are in r->args, and don't forget you also have access to the request headers and cookies.
Now it's time for the process_header. Just as create_request added a pointer to the request body, process_header shifts the response pointer to the part that the client will receive. It also reads in the header from the upstream and sets the client response headers accordingly.
Here's a bare-minimum example, reading in that two-character response. Let's suppose the first character is the "status" character. If it's a question mark, we want to return a 404 File Not Found to the client and disregard the other character. If it's a space, then we want to return the other character to the client along with a 200 OK response. All right, it's not the most useful protocol, but it's a good demonstration. How would we write this process_header function?
static ngx_int_t
ngx_http_character_server_process_header(ngx_http_request_t *r)
{
ngx_http_upstream_t *u;
u = r->upstream;
/* read the first character */
switch(u->buffer.pos[0]) {
case '?':
r->header_only = 1; /* suppress this buffer from the client */
u->headers_in.status_n = 404;
break;
case ' ':
u->buffer.pos++; /* move the buffer to point to the next character */
u->headers_in.status_n = 200;
break;
}
return NGX_OK;
}
That's it. Manipulate the header, change the pointer, it's done. Notice that headers_in is actually a response header struct like we've seen before (cf. http/ngx_http_request.h), but it can be populated with the headers from the upstream. A real proxying module will do a lot more header processing, not to mention error handling, but you get the main idea.
But... what if we don't have the whole header from the upstream in one buffer?
Well, remember how I said that abort_request, reinit_request, and finalize_request could be used for resetting internal state? That's because many upstream modules have internal state. The module will need to define a custom context struct to keep track of what it has read so far from an upstream. This is NOT the same as the "Module Context" referred to above. That's of a pre-defined type, whereas the custom context can have whatever elements and data you need (it's your struct). This context struct should be instantiated inside the create_request function, perhaps like this:
ngx_http_character_server_ctx_t *p; /* my custom context struct */
p = ngx_pcalloc(r->pool, sizeof(ngx_http_character_server_ctx_t));
if (p == NULL) {
return NGX_HTTP_INTERNAL_SERVER_ERROR;
}
ngx_http_set_ctx(r, p, ngx_http_character_server_module);
That last line essentially registers the custom context struct with a particular request and module name for easy retrieval later. Whenever you need this context struct (probably in all the other callbacks), just do:
ngx_http_character_server_ctx_t *p;
p = ngx_http_get_module_ctx(r, ngx_http_character_server_module);
And p will have the current state. Set it, reset it, increment, decrement, shove arbitrary data in there, whatever you want. This is a great way to use a persistent state machine when reading from an upstream that returns data in chunks, again without blocking the primary event loop. Nice!
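Here is a minimal standalone sketch of that idea in plain C, applied to the two-character protocol from earlier. It assumes the response may arrive one byte at a time. The char_ctx_t struct, the char_feed function and the CHAR_* codes are hypothetical stand-ins for your custom context, your process_header logic, and NGX_OK / NGX_AGAIN / NGX_ERROR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes standing in for NGX_OK / NGX_ERROR / NGX_AGAIN. */
enum { CHAR_OK = 0, CHAR_ERROR = -1, CHAR_AGAIN = -2 };

typedef enum { WANT_STATUS, WANT_BODY, DONE } char_state_t;

/* Hypothetical custom context: everything we have parsed so far. */
typedef struct {
    char_state_t state;
    int          status;   /* 200 or 404 once the status character is seen */
    char         body;     /* the payload character, valid when status == 200 */
} char_ctx_t;

/*
 * Feed whatever bytes have arrived so far. Returns CHAR_AGAIN while the
 * response is incomplete, so the caller can go back to the event loop and
 * call again with the next chunk; the context remembers where we stopped.
 */
static int
char_feed(char_ctx_t *ctx, const unsigned char *pos, const unsigned char *last)
{
    assert(ctx != NULL && pos <= last);

    for ( /* void */ ; pos < last; pos++) {
        switch (ctx->state) {

        case WANT_STATUS:
            if (*pos == '?') {
                ctx->status = 404;
                ctx->state = DONE;       /* disregard the other character */
            } else if (*pos == ' ') {
                ctx->status = 200;
                ctx->state = WANT_BODY;
            } else {
                return CHAR_ERROR;       /* not part of the toy protocol */
            }
            break;

        case WANT_BODY:
            ctx->body = (char) *pos;
            ctx->state = DONE;
            break;

        case DONE:
            return CHAR_OK;
        }
    }

    return ctx->state == DONE ? CHAR_OK : CHAR_AGAIN;
}
```

Because all the state lives in the context rather than in local variables, it does not matter how the upstream splits its response across reads.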
I've said all I know about handler modules. It's time to move on to filter modules, the components in the output filter chain. Header filters manipulate the HTTP headers, and body filters manipulate the response content.
A header filter consists of three basic steps: decide whether to operate on this response, operate on the response, and call the next filter.
To take an example, here's a simplified version of the "not modified" header filter, which sets the status to 304 Not Modified if the client's If-Modified-Since header matches the response's Last-Modified header. Note that header filters take in the ngx_http_request_t struct as the only argument, which gets us access to both the client headers and soon-to-be-sent response headers.
static ngx_int_t
ngx_http_not_modified_header_filter(ngx_http_request_t *r)
{
time_t if_modified_since;
if_modified_since = ngx_http_parse_time(r->headers_in.if_modified_since->value.data,
r->headers_in.if_modified_since->value.len);
/* step 1: decide whether to operate */
if (if_modified_since != NGX_ERROR &&
if_modified_since == r->headers_out.last_modified_time) {
/* step 2: operate on the header */
r->headers_out.status = NGX_HTTP_NOT_MODIFIED;
r->headers_out.content_type.len = 0;
ngx_http_clear_content_length(r);
ngx_http_clear_accept_ranges(r);
}
/* step 3: call the next filter */
return ngx_http_next_header_filter(r);
}
The headers_out structure is just the same as we saw in the section about handlers (cf. http/ngx_http_request.h), and can be manipulated to no end.
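The "call the next filter" step only works because each filter remembers who was at the head of the chain before it was installed. The following is a minimal standalone sketch of that chaining in plain C. Every toy_ name is hypothetical, and the request struct is reduced to the three fields this example needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for ngx_http_request_t. */
typedef struct {
    long if_modified_since;   /* from the client; -1 if the header is absent */
    long last_modified;       /* from the response */
    int  status;
} toy_request_t;

typedef int (*toy_header_filter_pt)(toy_request_t *r);

/* The end of the chain; in a real server this would write the headers out. */
static int
toy_write_header_filter(toy_request_t *r)
{
    (void) r;
    return 0;
}

/* Head of the chain; each module pushes itself onto the front. */
static toy_header_filter_pt toy_top_header_filter = toy_write_header_filter;

/* Our module's private pointer to whatever was on top before us. */
static toy_header_filter_pt toy_next_header_filter;

static int
toy_not_modified_header_filter(toy_request_t *r)
{
    assert(r != NULL);

    /* step 1: decide whether to operate */
    if (r->if_modified_since != -1
        && r->if_modified_since == r->last_modified)
    {
        /* step 2: operate on the header */
        r->status = 304;
    }

    /* step 3: call the next filter */
    return toy_next_header_filter(r);
}

/* Installation: remember the old head, then become the new head. */
static void
toy_install_filter(void)
{
    toy_next_header_filter = toy_top_header_filter;
    toy_top_header_filter = toy_not_modified_header_filter;
}
```

A filter that forgets step 3 silently cuts off every filter installed before it, which is why the example calls the next filter even when it decided not to operate.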
The buffer chain makes it a little tricky to write a body filter, because the body filter can only operate on one buffer (chain link) at a time. The module must decide whether to overwrite the input buffer, replace the buffer with a newly allocated buffer, or insert a new buffer before or after the buffer in question. To complicate things, sometimes a module will receive several buffers so that it has an incomplete buffer chain that it must operate on. Unfortunately, Nginx does not provide a high-level API for manipulating the buffer chain, so body filters can be difficult to understand (and to write). But, here are some operations you might see in action.
A body filter's prototype might look like this (example taken from the "chunked" filter in the Nginx source):
static ngx_int_t ngx_http_chunked_body_filter(ngx_http_request_t *r, ngx_chain_t *in);
The first argument is our old friend the request struct. The second argument is a pointer to the head of the current partial chain (which could contain 0, 1, or more buffers).
Let's take a simple example. Suppose we want to insert the text "<!-- Served by Nginx -->" at the end of every response. First, we need to figure out if the response's final buffer is included in the buffer chain we were given. Like I said, there's not a fancy API, so we'll be rolling our own for loop:
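ngx_chain_t *chain_link;
int chain_contains_last_buffer = 0;
/* stop on (not past) the final link, so chain_link ends up pointing at it */
for ( chain_link = in; chain_link != NULL; chain_link = chain_link->next ) {
    if (chain_link->buf->last_buf)
        chain_contains_last_buffer = 1;
    if (chain_link->next == NULL)
        break;
}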
Now let's bail out if we don't have that last buffer:
if (!chain_contains_last_buffer)
return ngx_http_next_body_filter(r, in);
Super, now the last buffer is stored in chain_link. Now we allocate a new buffer:
ngx_buf_t *b;
b = ngx_calloc_buf(r->pool);
if (b == NULL) {
return NGX_ERROR;
}
And put some data in it:
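b->pos = (u_char *) "<!-- Served by Nginx -->";
b->last = b->pos + sizeof("<!-- Served by Nginx -->") - 1;
b->memory = 1; /* the data is in read-only memory; without a memory flag Nginx treats the buffer as empty */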
And hook the buffer into a new chain link:
ngx_chain_t *added_link;
added_link = ngx_alloc_chain_link(r->pool);
if (added_link == NULL)
return NGX_ERROR;
added_link->buf = b;
added_link->next = NULL;
Finally, hook the new chain link to the final chain link we found before:
chain_link->next = added_link;
And reset the "last_buf" variables to reflect reality:
chain_link->buf->last_buf = 0;
added_link->buf->last_buf = 1;
And pass along the modified chain to the next output filter:
return ngx_http_next_body_filter(r, in);
The resulting function takes much more effort than what you'd do with, say, mod_perl ($response->body =~ s/$/<!-- Served by mod_perl -->/), but the buffer chain is a very powerful construct, allowing programmers to process data incrementally so that the client gets something as soon as possible. However, in my opinion, the buffer chain desperately needs a cleaner interface so that programmers can't leave the chain in an inconsistent state. For now, manipulate it at your own risk.
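If you want to experiment with the pointer juggling outside of Nginx, here is a minimal standalone sketch in plain C. The toy_buf_t and toy_chain_t types are hypothetical miniatures of ngx_buf_t and ngx_chain_t, and the sketch assumes that last_buf, when present, is set on the final link of the chain it was handed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature versions of ngx_buf_t and ngx_chain_t. */
typedef struct {
    const char *pos;
    const char *last;
    unsigned    last_buf:1;
} toy_buf_t;

typedef struct toy_chain_s toy_chain_t;

struct toy_chain_s {
    toy_buf_t   *buf;
    toy_chain_t *next;
};

/*
 * If the chain ends with the response's final buffer, hang `added` after it
 * and move the last_buf flag over. Returns 1 if the trailer was appended,
 * 0 if the final buffer has not arrived yet (or the chain is empty).
 */
static int
toy_append_trailer(toy_chain_t *in, toy_chain_t *added)
{
    toy_chain_t *cl;

    assert(added != NULL && added->buf != NULL);

    if (in == NULL) {
        return 0;
    }

    /* walk to the final link */
    for (cl = in; cl->next != NULL; cl = cl->next) { /* void */ }

    if (!cl->buf->last_buf) {
        return 0;
    }

    added->next = NULL;
    cl->next = added;

    /* exactly one buffer in the response may claim to be the last */
    cl->buf->last_buf = 0;
    added->buf->last_buf = 1;

    return 1;
}
```

The two flag assignments at the end are the part that is easiest to forget, and forgetting them is exactly the kind of inconsistent state complained about above.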
A load-balancer is just a way to decide which backend server will receive a particular request; implementations exist for distributing requests in round-robin fashion or hashing some information about the request. This section will describe both a load-balancer's installation and its invocation, using the upstream_hash module (full source) as an example. upstream_hash chooses a backend by hashing a variable specified in nginx.conf.
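A load-balancing module has six pieces:
1. The enabling configuration directive (e.g., hash;) will call a registration function
2. The registration function will define the legal server options (e.g., weight=) and register an upstream initialization function
3. The upstream initialization function resolves the server names to particular IP addresses, allocates space for sockets, and sets a callback to the peer initialization function
4. The peer initialization function, called once per request, sets up data structures that the load-balancing function will access and manipulate
5. The load-balancing function decides where to route requests; it is called at least once per client request (more, if a backend request fails)
6. The peer release function can update statistics after communication with a particular backend server has finished (whether successfully or not)
It's a lot, but I'll break it down into pieces.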
Directive declarations, recall, specify both where they're valid and a function to call when they're encountered. A directive for a load-balancer should have the NGX_HTTP_UPS_CONF flag set, so that Nginx knows this directive is only valid inside an upstream block. It should provide a pointer to a registration function. Here's the directive declaration from the upstream_hash module:
{ ngx_string("hash"),
NGX_HTTP_UPS_CONF|NGX_CONF_TAKE1,
ngx_http_upstream_hash,
0,
0,
NULL },
Nothing new there.
The callback ngx_http_upstream_hash above is the registration function, so named (by me) because it registers an upstream initialization function with the surrounding upstream configuration. In addition, the registration function defines which options to the server directive are legal inside this particular upstream block (e.g., weight=, fail_timeout=). Here's the registration function of the upstream_hash module:
static char *
ngx_http_upstream_hash(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
ngx_http_upstream_srv_conf_t *uscf;
ngx_http_script_compile_t sc;
ngx_str_t *value;
ngx_array_t *vars_lengths, *vars_values;
value = cf->args->elts;
/* the following is necessary to evaluate the argument to "hash" as a $variable */
ngx_memzero(&sc, sizeof(ngx_http_script_compile_t));
vars_lengths = NULL;
vars_values = NULL;
sc.cf = cf;
sc.source = &value[1];
sc.lengths = &vars_lengths;
sc.values = &vars_values;
sc.complete_lengths = 1;
sc.complete_values = 1;
if (ngx_http_script_compile(&sc) != NGX_OK) {
return NGX_CONF_ERROR;
}
/* end of $variable stuff */
uscf = ngx_http_conf_get_module_srv_conf(cf, ngx_http_upstream_module);
/* the upstream initialization function */
uscf->peer.init_upstream = ngx_http_upstream_init_hash;
uscf->flags = NGX_HTTP_UPSTREAM_CREATE;
/* OK, more $variable stuff */
uscf->values = vars_values->elts;
uscf->lengths = vars_lengths->elts;
/* set a default value for "hash_method" */
if (uscf->hash_function == NULL) {
uscf->hash_function = ngx_hash_key;
}
return NGX_CONF_OK;
}
Aside from jumping through hoops so we can evaluate $variable later, it's pretty straightforward; assign a callback, set some flags. What flags are available?
NGX_HTTP_UPSTREAM_CREATE: let there be server directives in this upstream block. I can't think of a situation where you wouldn't use this.
NGX_HTTP_UPSTREAM_WEIGHT: let the server directives take a weight= option
NGX_HTTP_UPSTREAM_MAX_FAILS: allow the max_fails= option
NGX_HTTP_UPSTREAM_FAIL_TIMEOUT: allow the fail_timeout= option
NGX_HTTP_UPSTREAM_DOWN: allow the down option
NGX_HTTP_UPSTREAM_BACKUP: allow the backup option
Each module will have access to these configuration values. It's up to the module to decide what to do with them. That is, max_fails will not be automatically enforced; all the failure logic is up to the module author. More on that later. For now, we still haven't finished following the trail of callbacks. Next up, we have the upstream initialization function (the init_upstream callback in the previous function).
The purpose of the upstream initialization function is to resolve the host names, allocate space for sockets, and assign (yet another) callback. Here's how upstream_hash does it:
ngx_int_t
ngx_http_upstream_init_hash(ngx_conf_t *cf, ngx_http_upstream_srv_conf_t *us)
{
ngx_uint_t i, j, n;
ngx_http_upstream_server_t *server;
ngx_http_upstream_hash_peers_t *peers;
/* set the callback */
us->peer.init = ngx_http_upstream_init_upstream_hash_peer;
if (!us->servers) {
return NGX_ERROR;
}
server = us->servers->elts;
/* figure out how many IP addresses are in this upstream block. */
/* remember a domain name can resolve to multiple IP addresses. */
for (n = 0, i = 0; i < us->servers->nelts; i++) {
n += server[i].naddrs;
}
/* allocate space for sockets, etc */
peers = ngx_pcalloc(cf->pool, sizeof(ngx_http_upstream_hash_peers_t)
+ sizeof(ngx_peer_addr_t) * (n - 1));
if (peers == NULL) {
return NGX_ERROR;
}
peers->number = n;
/* one port/IP address per peer */
for (n = 0, i = 0; i < us->servers->nelts; i++) {
for (j = 0; j < server[i].naddrs; j++, n++) {
peers->peer[n].sockaddr = server[i].addrs[j].sockaddr;
peers->peer[n].socklen = server[i].addrs[j].socklen;
peers->peer[n].name = server[i].addrs[j].name;
}
}
/* save a pointer to our peers for later */
us->peer.data = peers;
return NGX_OK;
}
This function is a bit more involved than one might hope. Most of the work seems like it should be abstracted, but it's not, so that's what we live with. One strategy for simplifying things is to call the upstream initialization function of another module, have it do all the dirty work (peer allocation, etc), and then override the us->peer.init callback afterwards. For an example, see http/modules/ngx_http_upstream_ip_hash_module.c.
The important bit from our point of view is setting a pointer to the peer initialization function, in this case ngx_http_upstream_init_upstream_hash_peer.
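The count-then-fill pattern in the initialization function is easier to see without the Nginx types in the way. Here is a minimal standalone sketch in plain C. The toy_server_t and toy_peers_t types are hypothetical, addresses are plain strings for brevity, and the sketch uses a C99 flexible array member instead of the "one slot plus n - 1 extra" sizing used in the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in: one configured server may own several addresses. */
typedef struct {
    const char **addrs;    /* resolved addresses, as strings for brevity */
    size_t       naddrs;
} toy_server_t;

/* Hypothetical peer table; peer[] is sized at allocation time. */
typedef struct {
    size_t       number;
    const char  *peer[];
} toy_peers_t;

/* Returns a heap-allocated peer table (caller frees), or NULL on failure. */
static toy_peers_t *
toy_flatten_peers(const toy_server_t *server, size_t nservers)
{
    size_t       i, j, n;
    toy_peers_t *peers;

    assert(server != NULL || nservers == 0);

    /* first pass: count the addresses across all servers */
    for (n = 0, i = 0; i < nservers; i++) {
        n += server[i].naddrs;
    }

    if (n == 0) {
        return NULL;
    }

    peers = calloc(1, sizeof(toy_peers_t) + sizeof(const char *) * n);
    if (peers == NULL) {
        return NULL;
    }

    peers->number = n;

    /* second pass: one peer per address */
    for (n = 0, i = 0; i < nservers; i++) {
        for (j = 0; j < server[i].naddrs; j++, n++) {
            peers->peer[n] = server[i].addrs[j];
        }
    }

    return peers;
}
```

With two servers that resolve to two and one addresses respectively, the resulting table has three peers, and the load-balancer later indexes into it without caring which server directive an address came from.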
The peer initialization function is called once per request. It sets up a data structure that the module will use as it tries to find an appropriate backend server to service that request; this structure is persistent across backend re-tries, so it's a convenient place to keep track of the number of connection failures, or a computed hash value. By convention, this struct is called ngx_http_upstream_<module name>_peer_data_t.
In addition, the peer initialization function sets up two callbacks:
get: the load-balancing function
free: the peer release function (usually just updates some statistics when a connection finishes)
As if that weren't enough, it also initializes a variable called tries. As long as tries is positive, nginx will keep retrying this load-balancer. When tries is zero, nginx will give up. It's up to the get and free functions to set tries appropriately.
Here's a peer initialization function from the upstream_hash module:
static ngx_int_t
ngx_http_upstream_init_upstream_hash_peer(ngx_http_request_t *r,
ngx_http_upstream_srv_conf_t *us)
{
ngx_http_upstream_hash_peer_data_t *uhpd;
ngx_str_t val;
/* evaluate the argument to "hash" */
if (ngx_http_script_run(r, &val, us->lengths, 0, us->values) == NULL) {
return NGX_ERROR;
}
/* data persistent through the request */
uhpd = ngx_pcalloc(r->pool, sizeof(ngx_http_upstream_hash_peer_data_t)
+ sizeof(uintptr_t)
* ((ngx_http_upstream_hash_peers_t *)us->peer.data)->number
/ (8 * sizeof(uintptr_t)));
if (uhpd == NULL) {
return NGX_ERROR;
}
/* save our struct for later */
r->upstream->peer.data = uhpd;
uhpd->peers = us->peer.data;
/* set the callbacks and initialize "tries" to "hash_again" + 1*/
r->upstream->peer.free = ngx_http_upstream_free_hash_peer;
r->upstream->peer.get = ngx_http_upstream_get_hash_peer;
r->upstream->peer.tries = us->retries + 1;
/* do the hash and save the result */
uhpd->hash = us->hash_function(val.data, val.len);
return NGX_OK;
}
That wasn't so bad. Now we're ready to pick an upstream server.
It's time for the main course. The real meat and potatoes. This is where the module picks an upstream. The load-balancing function's prototype looks like:
static ngx_int_t
ngx_http_upstream_get_<module_name>_peer(ngx_peer_connection_t *pc, void *data);
data is our struct of useful information concerning this client connection. pc will have information about the server we're going to connect to. The job of the load-balancing function is to fill in values for pc->sockaddr, pc->socklen, and pc->name. If you know some network programming, then those variable names might be familiar; but they're actually not very important to the task at hand. We don't care what they stand for; we just want to know where to find appropriate values to fill them.
This function must find a list of available servers, choose one, and assign its values to pc. Let's look at how upstream_hash does it.
upstream_hash previously stashed the server list into the ngx_http_upstream_hash_peer_data_t struct back in the call to ngx_http_upstream_init_hash (above). This struct is now available as data:
ngx_http_upstream_hash_peer_data_t *uhpd = data;
The list of peers is now stored in uhpd->peers->peer. Let's pick a peer from this array by taking the computed hash value modulo the number of servers:
ngx_peer_addr_t *peer = &uhpd->peers->peer[uhpd->hash % uhpd->peers->number];
Now for the grand finale:
pc->sockaddr = peer->sockaddr;
pc->socklen = peer->socklen;
pc->name = &peer->name;
return NGX_OK;
That's all! If the load-balancer returns NGX_OK, it means, "go ahead and try this server". If it returns NGX_BUSY, it means all the backend hosts are unavailable, and Nginx should try again.
But... how do we keep track of what's unavailable? And what if we don't want it to try again?
The peer release function operates after an upstream connection takes place; its purpose is to track failures. Here is its function prototype:
void
ngx_http_upstream_free_<module name>_peer(ngx_peer_connection_t *pc, void *data,
ngx_uint_t state);
The first two parameters are just the same as we saw in the load-balancer function. The third parameter is a state variable, which indicates whether the connection was successful. It may contain two values bitwise OR'd together: NGX_PEER_FAILED (the connection failed) and NGX_PEER_NEXT (either the connection failed, or it succeeded but the application returned an error). Zero means the connection succeeded.
It's up to the module author to decide what to do about these failure events. If they are to be used at all, the results should be stored in data, a pointer to the custom per-request data struct.
But the crucial purpose of the peer release function is to set pc->tries to zero if you don't want Nginx to keep trying this load-balancer during this request. The simplest peer release function would look like this:
pc->tries = 0;
That would ensure that if there's ever an error reaching a backend server, a 502 Bad Gateway error will be returned to the client.
Here's a more complicated example, taken from the upstream_hash module. If a backend connection fails, it marks it as failed in a bit-vector (called tried, an array of type uintptr_t), then keeps choosing a new backend until it finds one that has not failed.
#define ngx_bitvector_index(index) ((index) / (8 * sizeof(uintptr_t)))
#define ngx_bitvector_bit(index) ((uintptr_t) 1 << ((index) % (8 * sizeof(uintptr_t))))
static void
ngx_http_upstream_free_hash_peer(ngx_peer_connection_t *pc, void *data,
ngx_uint_t state)
{
ngx_http_upstream_hash_peer_data_t *uhpd = data;
ngx_uint_t current;
if (state & NGX_PEER_FAILED
&& --pc->tries)
{
/* the backend that failed */
current = uhpd->hash % uhpd->peers->number;
/* mark it in the bit-vector */
uhpd->tried[ngx_bitvector_index(current)] |= ngx_bitvector_bit(current);
do { /* rehash until we're out of retries or we find one that hasn't been tried */
uhpd->hash = ngx_hash_key((u_char *)&uhpd->hash, sizeof(ngx_uint_t));
current = uhpd->hash % uhpd->peers->number;
} while ((uhpd->tried[ngx_bitvector_index(current)] & ngx_bitvector_bit(current)) && --pc->tries);
}
}
This works because the load-balancing function will just look at the new value of uhpd->hash.
Many applications won't need retry or high-availability logic, but it's possible to provide it with just a few lines of code like you see here.
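To play with the retry logic in isolation, here is a minimal standalone sketch in plain C. All toy_ names are hypothetical, toy_rehash is an arbitrary mixer standing in for ngx_hash_key, and the bit-vector has a fixed capacity (TOY_MAX_PEERS) rather than being sized per request as in the module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_WORD_BITS      (8 * sizeof(uintptr_t))
#define toy_bit_index(i)   ((i) / TOY_WORD_BITS)
#define toy_bit_mask(i)    ((uintptr_t) 1 << ((i) % TOY_WORD_BITS))

#define TOY_MAX_PEERS      128

/* Hypothetical per-request data: current hash, retry budget, failed peers. */
typedef struct {
    unsigned   hash;
    unsigned   npeers;
    unsigned   tries;
    uintptr_t  tried[(TOY_MAX_PEERS + TOY_WORD_BITS - 1) / TOY_WORD_BITS];
} toy_peer_data_t;

/* Arbitrary deterministic mixer; an assumption, not what ngx_hash_key does. */
static unsigned
toy_rehash(unsigned h)
{
    return h * 2654435761u + 1u;
}

static unsigned
toy_current_peer(const toy_peer_data_t *d)
{
    return d->hash % d->npeers;
}

/* Call after the current peer failed; picks a peer that has not failed yet. */
static void
toy_peer_failed(toy_peer_data_t *d)
{
    unsigned current;

    assert(d != NULL && d->npeers > 0 && d->npeers <= TOY_MAX_PEERS);

    /* spend one try; give up if the budget is exhausted */
    if (d->tries == 0 || --d->tries == 0) {
        return;
    }

    /* mark the peer that just failed */
    current = toy_current_peer(d);
    d->tried[toy_bit_index(current)] |= toy_bit_mask(current);

    /* rehash until we land on an untried peer or run out of tries */
    do {
        d->hash = toy_rehash(d->hash);
        current = toy_current_peer(d);
    } while ((d->tried[toy_bit_index(current)] & toy_bit_mask(current))
             && --d->tries);
}
```

After toy_peer_failed returns, either tries has reached zero or toy_current_peer points at a backend whose bit is still clear, which is the same guarantee the load-balancing function relies on.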
The examples above were fairly simple. This section will give you some tips for performing more complicated functions in your Nginx module. Since these are "advanced topics", the code examples might be less detailed than in previous sections.
Guest chapter written by Grzegorz Nosek
Nginx, while being unthreaded, allows worker processes to share memory between them. However, this is quite different from the standard pool allocator as the shared segment has fixed size and cannot be resized without restarting nginx or destroying its contents in another way.
First of all, caveat hacker. This guide has been written several months after hands-on experience with shared memory in nginx and while I try my best to be accurate (and have spent some time refreshing my memory), in no way is it guaranteed. You've been warned.
Also, 100% of this knowledge comes from reading the source and reverse-engineering the core concepts, so there are probably better ways to do most of the stuff described.
Oh, and this guide is based on 0.6.31, though 0.5.x is 100% compatible AFAIK and 0.7.x also brings no compatibility-breaking changes that I know of.
For real-world usage of shared memory in nginx, see my upstream_fair module.
This probably does not work on Windows at all. Core dumps in the rear mirror are closer than they appear.
To create a shared memory segment in nginx, you need to:
provide a constructor function to initialise the segment
call ngx_shared_memory_add
These two points contain the main gotchas (that I came across), namely:
Your constructor will be called multiple times and it's up to you to find out whether you're called the first time (and should set something up), or not (and should probably leave everything alone). The prototype for the shared memory constructor looks like:
static ngx_int_t init(ngx_shm_zone_t *shm_zone, void *data);
The data variable will contain the contents of oshm_zone->data, where
oshm_zone is the "old" shm zone descriptor (more about it later). This
variable is the only value that can survive a reload, so you must use it
if you don't want to lose the contents of your shared memory.
Your constructor function will probably look roughly similar to the one
in upstream_fair, i.e.:
static ngx_int_t
init(ngx_shm_zone_t *shm_zone, void *data)
{
if (data) { /* we're being reloaded, propagate the data "cookie" */
shm_zone->data = data;
return NGX_OK;
}
/* set up whatever structures you wish to keep in the shm */
/* initialise shm_zone->data so that we know we have
been called; if nothing interesting comes to your mind, try
shm_zone->shm.addr or, if you're desperate, (void*) 1, just set
the value to something non-NULL for future invocations
*/
shm_zone->data = something_interesting;
return NGX_OK;
}
You must be careful about when you access the shm segment.
The interface for adding a shared memory segment looks like:
ngx_shm_zone_t *
ngx_shared_memory_add(ngx_conf_t *cf, ngx_str_t *name, size_t size,
void *tag);
cf is the reference to the config file (you'll probably create the
segment in response to a config option), name is the name of the segment
(as a ngx_str_t, i.e. a counted string), size is the size in bytes
(which will usually get rounded up to the nearest multiple of the page
size, e.g. 4KB on many popular architectures) and tag is a, well, tag
for detecting naming conflicts. If you call ngx_shared_memory_add
multiple times with the same name, tag and size, you'll get only a
single segment. If you specify different names, you'll get several
distinct segments and if you specify the same name but different size or
tag, you'll get an error. A good choice for the tag value could be e.g.
the pointer to your module descriptor.
After you call ngx_shared_memory_add and receive the new shm_zone
descriptor, you must set up the constructor in shm_zone->init. Wait...
after you add the segment? Yes, and that's a major gotcha. This implies
that the segment is not created while calling ngx_shared_memory_add
(because you specify the constructor only later). What really happens
looks like this (grossly simplified):
1. parse the whole config file, noting requested shm segments
2. afterwards, create/destroy all the segments in one go.
   The constructors are called here. Note that every time your ctor
   is called, it is with another value of shm_zone. The reason is that
   the descriptor lives as long as the cycle (generation in Apache terms)
   while the segment lives as long as the master and all the workers. To
   let some data survive a reload, you have access to the old
   descriptor's ->data field (mentioned above).
3. (re)start workers which begin handling requests
4. upon receipt of SIGHUP, goto 1
Also, you really must set the constructor, otherwise nginx will consider your segment unused and won't create it at all.
Now that you know it, it's pretty clear that you cannot rely on having
access to the shared memory while parsing the config. You can access the
whole segment as shm_zone->shm.addr (which will be NULL before the segment
gets really created). Any access after the first parsing run (e.g.
inside request handlers or on subsequent reloads) should be fine.
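The "called multiple times" behaviour of the constructor is easy to get wrong, so here is a minimal standalone sketch in plain C that simulates two configuration cycles. The toy_shm_zone_t type, toy_zone_init and toy_cycle are hypothetical; toy_cycle is a gross simplification of what the core does on each (re)load, based only on the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of ngx_shm_zone_t: just the two fields we discuss. */
typedef struct {
    void *addr;   /* start of the shared segment (outlives the descriptor) */
    void *data;   /* per-descriptor cookie (lost unless we carry it over) */
} toy_shm_zone_t;

static int toy_setup_calls;   /* how many times we really initialised */

static int
toy_zone_init(toy_shm_zone_t *zone, void *data)
{
    assert(zone != NULL);

    if (data != NULL) {
        /* reload: the segment is already set up, just propagate the cookie */
        zone->data = data;
        return 0;
    }

    /* first call: this is where you would build structures in the segment */
    toy_setup_calls++;

    /* any non-NULL value will do as the "already initialised" marker */
    zone->data = zone->addr;

    return 0;
}

/* One config cycle: fresh descriptor, same segment, old cookie passed in. */
static int
toy_cycle(toy_shm_zone_t *new_zone, const toy_shm_zone_t *old_zone,
    void *segment)
{
    assert(new_zone != NULL && segment != NULL);

    new_zone->addr = segment;
    new_zone->data = NULL;

    return toy_zone_init(new_zone, old_zone ? old_zone->data : NULL);
}
```

Running toy_cycle twice against the same segment calls the constructor twice but performs the real setup only once, which is the behaviour you want across a SIGHUP.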
Now that you have your new and shiny shm segment, how do you use it? The simplest way is to use another memory tool that nginx has at your disposal, namely the slab allocator. Nginx is nice enough to initialise the slab for you in every new shm segment, so you can either use it, or ignore the slab structures and overwrite them with your own data.
The interface consists of two functions:
void *ngx_slab_alloc(ngx_slab_pool_t *pool, size_t size);
void ngx_slab_free(ngx_slab_pool_t *pool, void *p);
The first argument is simply (ngx_slab_pool_t *)shm_zone->shm.addr and the other one is either the size of the block to allocate, or the pointer to the block to free. (trivia: not once is ngx_slab_free called in vanilla nginx code)
Remember that shared memory is inherently dangerous because you can have multiple processes accessing it at the same time. The slab allocator has a per-segment lock (shpool->mutex) which is used to protect the segment against concurrent modifications.
You can also acquire and release the lock yourself, which is useful if
you want to implement some more complicated operations on the segment,
like searching or walking a tree. The two snippets below are essentially
equivalent:
/*
void *new_block;
ngx_slab_pool_t *shpool = (ngx_slab_pool_t *)shm_zone->shm.addr;
*/
new_block = ngx_slab_alloc(shpool, ngx_pagesize);
ngx_shmtx_lock(&shpool->mutex);
new_block = ngx_slab_alloc_locked(shpool, ngx_pagesize);
ngx_shmtx_unlock(&shpool->mutex);
In fact, ngx_slab_alloc looks almost exactly like the second snippet.
If you perform any operations which depend on no new allocations (or, more to the point, frees), protect them with the slab mutex. However, remember that nginx mutexes are implemented as spinlocks (non-sleeping), so while they are very fast in the uncontended case, they can easily eat 100% CPU when waiting. So don't do any long-running operations while holding the mutex (especially I/O, but you should avoid any system calls at all).
You can also use your own mutexes for more fine-grained locking, via the ngx_mutex_init(), ngx_mutex_lock() and ngx_mutex_unlock() functions.
As an alternative for locks, you can use atomic variables which are guaranteed to be read or written in an uninterruptible way (no worker process may see the value halfway as it's being written by another one).
Atomic variables are defined with the type ngx_atomic_t or ngx_atomic_uint_t (depending on signedness). They should have at least 32 bits. To simply read or unconditionally set an atomic variable, you don't need any special constructs:
ngx_atomic_t i = an_atomic_var;
an_atomic_var = i + 5;
Note that anything can happen between the two lines: context switches, execution of code on other CPUs, etc.
To atomically read and modify a variable, you have two functions (very
platform-specific) with their interface declared in
src/os/unix/ngx_atomic.h:
ngx_atomic_cmp_set(lock, old, new)
Atomically compares *lock with old and, if they are equal, stores new at the
same address. Returns 1 if *lock was equal to old (and was overwritten), 0 otherwise.
ngx_atomic_fetch_add(value, add)
Atomically adds add to *value and returns the old *value.
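If you'd like to try these two contracts without building Nginx, here is a minimal sketch using C11's <stdatomic.h>. The toy_ wrappers are hypothetical and only mimic the return-value conventions described above; they are not the Nginx implementations, and C11 atomics are specified for threads, so treat this as a model of the semantics rather than a drop-in for shared memory between processes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Same contract as described for ngx_atomic_cmp_set: 1 on success, else 0. */
static int
toy_atomic_cmp_set(atomic_ulong *lock, unsigned long old, unsigned long set)
{
    /* stores `set` only if *lock still equals `old` */
    return atomic_compare_exchange_strong(lock, &old, set) ? 1 : 0;
}

/* Same contract as described for ngx_atomic_fetch_add: returns the old value. */
static unsigned long
toy_atomic_fetch_add(atomic_ulong *value, unsigned long add)
{
    return atomic_fetch_add(value, add);
}

/*
 * A safe "add 5": retry the compare-and-set until nobody else has modified
 * the variable between our read and our write.
 */
static void
toy_add_five(atomic_ulong *v)
{
    unsigned long seen;

    assert(v != NULL);

    do {
        seen = atomic_load(v);
    } while (!toy_atomic_cmp_set(v, seen, seen + 5));
}
```

The retry loop in toy_add_five is the usual shape of a lock-free update; for a plain increment, the fetch-and-add primitive does the same job in one step.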
OK, you have your data neatly allocated, protected with a suitable lock but you'd also like to organise it somehow. Again, nginx has a very nice structure just for this purpose - a red-black tree.
Highlights (API-wise):
... ngx_rbt_red(the_newly_added_node) to rebalance the tree
This chapter is about shared memory, not rbtrees so shoo! Go read the source for upstream_fair to see creating and walking an rbtree in action.
Subrequests are one of the most powerful aspects of Nginx. With subrequests, you can return the results of a different URL than what the client originally requested. Some web frameworks call this an "internal redirect." But Nginx goes further: not only can modules perform multiple subrequests and combine the outputs into a single response, subrequests can perform their own sub-subrequests, and sub-subrequests can initiate sub-sub-subrequests, and... you get the idea. Subrequests can map to files on the hard disk, other handlers, or upstream servers; it doesn't matter from the perspective of Nginx. As far as I know, only filters can issue subrequests.
If all you want to do is return a different URL than what the client originally requested, you will want to use the ngx_http_internal_redirect function. Its prototype is:
ngx_int_t
ngx_http_internal_redirect(ngx_http_request_t *r, ngx_str_t *uri, ngx_str_t *args)
Where r is the request struct, and uri and args are the new URI. Note that URIs must be locations already defined in nginx.conf; you cannot, for instance, redirect to an arbitrary domain. Handlers should return the return value of ngx_http_internal_redirect, i.e. redirecting handlers will typically end like
return ngx_http_internal_redirect(r, &uri, &args);
Internal redirects are used in the "index" module (which maps URLs that end in / to index.html) as well as Nginx's X-Accel-Redirect feature.
Subrequests are most useful for inserting additional content based on data from the original response. For example, the SSI (server-side include) module uses a filter to scan the contents of the returned document, and then replaces "include" directives with the contents of the specified URLs.
We'll start with a simpler example. We'll make a filter that treats the entire contents of a document as a URL to be retrieved, and then appends the new document to the URL itself. Remember that the URL must be a location in nginx.conf.
static ngx_int_t
ngx_http_append_uri_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
{
int rc;
ngx_str_t uri;
ngx_http_request_t *sr;
/* First copy the document buffer into the URI string */
uri.len = in->buf->last - in->buf->pos;
uri.data = ngx_palloc(r->pool, uri.len);
if (uri.data == NULL)
return NGX_ERROR;
ngx_memcpy(uri.data, in->buf->pos, uri.len);
/* Now return the original document (i.e. the URI) to the client */
rc = ngx_http_next_body_filter(r, in);
if (rc == NGX_ERROR)
return rc;
/* Finally issue the subrequest */
return ngx_http_subrequest(r, &uri, NULL /* args */, &sr,
NULL /* callback */, 0 /* flags */);
}
The prototype of ngx_http_subrequest is:
ngx_int_t ngx_http_subrequest(ngx_http_request_t *r,
ngx_str_t *uri, ngx_str_t *args, ngx_http_request_t **psr,
ngx_http_post_subrequest_t *ps, ngx_uint_t flags)
Where:
*r is the original request
*uri and *args refer to the sub-request
**psr is a reference to a NULL pointer that will point to the new (sub-)request structure
*ps is a callback for when the subrequest is finished. I've never used this, but see http/ngx_http_request.h for details.
flags can be a bitwise-OR'ed combination of:
NGX_HTTP_ZERO_IN_URI: the URI contains a character with ASCII code 0 (also known as '\0'), or contains "%00"
NGX_HTTP_SUBREQUEST_IN_MEMORY: store the result of the subrequest in a contiguous chunk of memory (usually not necessary)
The results of the subrequest will be inserted where you expect. If you want to modify the results of the subrequest, you can use another filter (or the same one!). You can tell whether a filter is operating on the primary request or a subrequest with this test:
if (r == r->main) {
/* primary request */
} else {
/* subrequest */
}
The simplest example of a module that issues a single subrequest is the "addition" module.
You might think issuing multiple subrequests is as simple as:
int rc1, rc2, rc3;
rc1 = ngx_http_subrequest(r, uri1, ...);
rc2 = ngx_http_subrequest(r, uri2, ...);
rc3 = ngx_http_subrequest(r, uri3, ...);
You'd be wrong! Remember that Nginx is single-threaded. Subrequests might need to access the network, and if so, Nginx needs to return to its other work while it waits for a response. So we need to check the return value of ngx_http_subrequest, which can be one of:
NGX_OK: the subrequest finished without touching the network
NGX_DONE: the client reset the network connection
NGX_ERROR: there was a server error of some sort
NGX_AGAIN: the subrequest requires network activity
If your subrequest returns NGX_AGAIN, your filter should also immediately return NGX_AGAIN. When that subrequest finishes, and the results have been sent to the client, Nginx is nice enough to call your filter again, from which you can issue the next subrequest (or do some work in between subrequests). It helps, of course, to keep track of your planned subrequests in a context struct. You should also take care to return errors immediately, too.
Let's make a simple example. Suppose our context struct contains an array of URIs, and the index of the next subrequest:
typedef struct {
ngx_array_t uris;
int i;
} my_ctx_t;
Then a filter that simply concatenates the contents of these URIs together might look something like:
static ngx_int_t
ngx_http_multiple_uris_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
{
    my_ctx_t            *ctx;
    ngx_int_t            rc = NGX_OK;
    ngx_http_request_t  *sr;

    if (r != r->main) { /* subrequest: pass its output straight through */
        return ngx_http_next_body_filter(r, in);
    }

    ctx = ngx_http_get_module_ctx(r, my_module);
    if (ctx == NULL) {
        /* populate ctx and ctx->uris here */
    }

    /* issue subrequests until one needs the network or fails */
    while (rc == NGX_OK && (ngx_uint_t) ctx->i < ctx->uris.nelts) {
        rc = ngx_http_subrequest(r, &((ngx_str_t *) ctx->uris.elts)[ctx->i++],
                                 NULL /* args */, &sr, NULL /* cb */, 0 /* flags */);
    }

    return rc; /* NGX_OK/NGX_ERROR/NGX_DONE/NGX_AGAIN */
}
Let's think this code through. There might be more going on than you expect.
First, the filter is called on the original response. Based on this response we populate ctx and ctx->uris. Then we enter the while loop and call ngx_http_subrequest for the first time.
If ngx_http_subrequest returns NGX_OK then we move on to the next subrequest immediately. If it returns NGX_AGAIN, we break out of the while loop and return NGX_AGAIN.
Suppose we've returned an NGX_AGAIN. The subrequest is pending some network activity, and Nginx has moved on to other things. But when that subrequest is finished, Nginx will call our filter at least two more times:
r set to the subrequest, and in set to buffers from the subrequest's response
r set to the original request, and in set to NULL
To distinguish these two cases, we must test whether r == r->main. In this example we call the next filter if we're filtering the subrequest. But if we're in the main request, we'll just pick up the while loop where we last left off. in will be set to NULL because there aren't actually any new buffers to process.
When the last subrequest finishes and all is well, we return NGX_OK.
This example is of course greatly simplified. You'll have to figure out how to populate ctx->uris on your own. But the example shows how simple it is to re-enter the subrequesting loop, and break out as soon as we get an error or NGX_AGAIN.
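If the re-entry is hard to picture, here is a toy simulation you can compile without Nginx. Everything in it is a stand-in I made up for illustration: the SIM_* codes, sim_request_t, sim_ctx_t, sim_subrequest, sim_body_filter and sim_run are not Nginx APIs, and the rule that "/net..." URIs need the network while "/bad..." URIs fail is pure assumption. The real server does far more between calls; the sketch only shows how the index in the context struct lets the loop resume where it stopped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for Nginx's return codes (not the real macros). */
enum { SIM_OK = 0, SIM_ERROR = -1, SIM_AGAIN = -2 };

typedef struct sim_request_s sim_request_t;
struct sim_request_s {
    sim_request_t *main;   /* points to itself for the primary request */
};

/* Same idea as my_ctx_t: the planned URIs plus the index of the next one. */
typedef struct {
    const char **uris;
    size_t       nelts;
    size_t       i;
} sim_ctx_t;

/* Hypothetical subrequest: "/net..." needs the network, "/bad..." fails. */
int
sim_subrequest(sim_request_t *r, const char *uri, sim_request_t *sr)
{
    assert(r != NULL && uri != NULL && sr != NULL);
    sr->main = r->main;
    if (strncmp(uri, "/bad", 4) == 0) {
        return SIM_ERROR;
    }
    return (strncmp(uri, "/net", 4) == 0) ? SIM_AGAIN : SIM_OK;
}

/* Mirrors the filter above: subrequests pass through, the main request
 * resumes the loop wherever ctx->i left off. */
int
sim_body_filter(sim_request_t *r, sim_ctx_t *ctx, sim_request_t *sr)
{
    int rc = SIM_OK;

    if (r != r->main) {
        return SIM_OK;  /* the real filter would call the next body filter */
    }
    while (rc == SIM_OK && ctx->i < ctx->nelts) {
        rc = sim_subrequest(r, ctx->uris[ctx->i++], sr);
    }
    return rc;
}

/* Plays the part of Nginx: after each SIM_AGAIN, call the filter once for
 * the subrequest's output, then again for the main request. Returns the
 * final code and stores the number of main-request calls in *calls. */
int
sim_run(sim_ctx_t *ctx, size_t *calls)
{
    sim_request_t main_req, sr;
    int           rc;

    main_req.main = &main_req;
    *calls = 0;

    for ( ;; ) {
        rc = sim_body_filter(&main_req, ctx, &sr);
        (*calls)++;
        if (rc != SIM_AGAIN) {
            return rc;
        }
        (void) sim_body_filter(&sr, ctx, NULL);  /* r = the subrequest */
    }
}
```

Feed sim_run the URIs /local/a, /net/b, /local/c, /net/d and the filter runs three times for the main request: the first two calls each stop at a "/net" URI with SIM_AGAIN, and the third finds nothing left to issue and returns SIM_OK. Swap in a "/bad" URI and the loop stops on the spot with SIM_ERROR, leaving the remaining URIs untouched.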
It's also possible to issue several subrequests at once without waiting for previous subrequests to finish. This technique is, in fact, too advanced even for Emiller's and Gnosek's Advanced Topics in Nginx Module Development. See the SSI module for an example.
Topics not yet covered in this guide:
So by now, you should be prepared to look at an Nginx module and try to understand what's going on (and you'll know where to look for help). Take a look in src/http/modules/ to see the available modules. Pick a module that's similar to what you're trying to accomplish and look through it. Stuff look familiar? It should. Go back and forth between this guide and the module source until you understand what's going on.
But Emiller didn't write a Balls-In Guide to Reading Nginx Modules. Hell no. This is a Balls-Out Guide. We're not reading. We're writing. Creating. Sharing with the world.
First thing, you're going to need a place to work on your module. Make a folder for your module anywhere on your hard drive, but separate from the Nginx source (and make sure you have the latest copy from nginx.net). Your new folder should contain two files to start with:
"config"
"ngx_http_<your module>_module.c"
The "config" file will be included by ./configure, and its contents will depend on the type of module.
"config" for filter modules:
ngx_addon_name=ngx_http_<your module>_module
HTTP_AUX_FILTER_MODULES="$HTTP_AUX_FILTER_MODULES ngx_http_<your module>_module"
NGX_ADDON_SRCS="$NGX_ADDON_SRCS $ngx_addon_dir/ngx_http_<your module>_module.c"
"config" for other modules:
ngx_addon_name=ngx_http_<your module>_module
HTTP_MODULES="$HTTP_MODULES ngx_http_<your module>_module"
NGX_ADDON_SRCS="$NGX_ADDON_SRCS $ngx_addon_dir/ngx_http_<your module>_module.c"
Now for your C file. I recommend copying an existing module that does something similar to what you want, but rename it "ngx_http_<your module>_module.c". Let this be your model as you change the behavior to suit your needs, and refer to this guide as you understand and refashion the different pieces.
When you're ready to compile, just go into the Nginx directory and type
./configure --add-module=path/to/your/new/module/directory
and then make and make install like you normally would. If all goes well, your module will be compiled right in. Nice, huh? No need to muck with the Nginx source, and adding your module to new versions of Nginx is a snap: just use that same ./configure command. By the way, if your module needs any dynamically linked libraries, you can add this to your "config" file:
CORE_LIBS="$CORE_LIBS -lfoo"
Where foo is the library you need. If you make a cool or useful module, be sure to send a note to the Nginx mailing list and share your work.
Happy hacking!
Nginx source tree (cross-referenced)
Nginx module directory (cross-referenced)