
Advanced PHP Programming
578 Chapter 22 Extending PHP: Part II
    if((fd = open(filename, O_RDWR)) == -1) {
        return NULL;
    }
    if(!file_length) {
        if(fstat(fd, &sb) == -1) {
            close(fd);
            return NULL;
        }
        file_length = sb.st_size;
    }
    if((mpos = mmap(NULL, file_length, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0)) == MAP_FAILED) {
        close(fd);  /* do not leak the descriptor when the mapping fails */
        return NULL;
    }
    data = emalloc(sizeof(struct mmap_stream_data));
    data->base_pos = mpos;
    data->current_pos = mpos;
    data->len = file_length;
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    stream = php_stream_alloc(&mmap_ops, data, NULL, "mode");
    if(opened_path) {
        *opened_path = estrdup(filename);
    }
    return stream;
}
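The open/fstat/mmap sequence used by the opener can be tried outside PHP. The following is a minimal sketch, assuming a POSIX system; map_file is a hypothetical helper name, not part of the streams API, and it uses plain return values instead of PHP's memory and stream allocators.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of PHP: map a whole file into memory.
 * Returns the mapping and stores its length in *len, or NULL on failure. */
void *map_file(const char *path, size_t *len)
{
    struct stat sb;
    void *base;
    int fd = open(path, O_RDWR);

    if (fd == -1) {
        return NULL;
    }
    if (fstat(fd, &sb) == -1 || sb.st_size == 0) {
        close(fd);              /* mmap() rejects zero-length mappings */
        return NULL;
    }
    base = mmap(NULL, (size_t)sb.st_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (base == MAP_FAILED) {
        return NULL;
    }
    *len = (size_t)sb.st_size;
    return base;
}
```

Because the mapping is MAP_PRIVATE, writes through the returned pointer are copy-on-write and never reach the file. The caller releases the mapping with munmap(base, len).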
Now you only need to register the mmap wrapper with the engine. To do so, you add a registration hook to the MINIT function, as follows:
PHP_MINIT_FUNCTION(mmap_session)
{
    php_register_url_stream_wrapper("mmap", &mmap_wrapper TSRMLS_CC);
    return SUCCESS;
}
Here the first argument, "mmap", instructs the streams subsystem to dispatch to the wrapper any URLs with the protocol mmap. You also need to unregister the wrapper in MSHUTDOWN:
PHP_MSHUTDOWN_FUNCTION(mmap_session)
{
    php_unregister_url_stream_wrapper("mmap" TSRMLS_CC);
    return SUCCESS;
}
This section provides only a brief treatment of the streams API. Another of its cool features is the ability to write stacked stream filters. These stream filters allow you to transparently modify data read from or written to a stream. PHP 5 features a number of stock stream filters, including the following:

• Content compression
• HTTP 1.1 chunked encoding/decoding
• Streaming cryptographic ciphers via mcrypt
• Whitespace folding
The streams API’s ability to allow you to transparently affect all the internal I/O functions in PHP is extremely powerful. It is only beginning to be fully explored, but I expect some very ingenious uses of its capabilities over the coming years.
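The idea of stacking filters can be illustrated without the streams API. The following is only a conceptual sketch: byte_filter, filter_upper, filter_rot13, and apply_chain are hypothetical names, and the real PHP filter interface is different. It shows a chain of transformations applied in order to a data buffer.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter type: transforms a buffer in place. */
typedef void (*byte_filter)(char *data, size_t len);

void filter_upper(char *data, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        data[i] = (char)toupper((unsigned char)data[i]);
    }
}

void filter_rot13(char *data, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        char c = data[i];

        if (c >= 'a' && c <= 'z') {
            data[i] = (char)('a' + (c - 'a' + 13) % 26);
        } else if (c >= 'A' && c <= 'Z') {
            data[i] = (char)('A' + (c - 'A' + 13) % 26);
        }
    }
}

/* Run every filter in the chain, first to last, over the same buffer. */
void apply_chain(const byte_filter *chain, size_t count,
                 char *data, size_t len)
{
    size_t i;

    for (i = 0; i < count; i++) {
        chain[i](data, len);
    }
}
```

The code that produces or consumes the data never needs to know which filters are in the chain, which is the sense in which the modification is transparent.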
Further Reading
The official PHP documentation of how to author classes and streams is pretty sparse. As the saying goes, “Use the force, read the source.” That having been said, there are some resources out there. For OOP extension code and streams, the following are some good resources:
• The Zend Engine2 Reflection API, in the PHP source tree under Zend/reflection_api.c, is a good reference for writing classes in C.
• The streams API is documented in the online PHP manual at http://www.php.net/manual/en/streams.php. In addition, Wez Furlong, the streams API architect, has an excellent talk on the subject, which is available at http://talks.php.net/index.php/Streams.


23
Writing SAPIs and Extending the Zend Engine
THE FLIP SIDE TO WRITING PHP EXTENSIONS in C is writing applications in C that run PHP. There are a number of reasons you might want to do this:
• To allow PHP to efficiently operate on a new Web server platform.
• To harness the ease of use of a scripting language inside an application. PHP provides powerful templating capabilities that can be usefully embedded in many applications. An example of this is the PHP filter SAPI, which provides a PHP interface for writing sendmail mail filters in PHP.
• For easy extensibility. You can allow end users to customize parts of an application with code written in PHP.
Understanding how PHP embeds into applications is also important because it helps you get the most out of the existing SAPI implementations. Do you like mod_php but feel like it’s missing a feature? Understanding how SAPIs work can help you solve your problems. Do you like PHP but wish the Zend Engine had some additional features? Understanding how to modify its behavior can help you solve your problems.
SAPIs
SAPIs provide the glue for interfacing PHP into an application. They define the ways in which data is passed between an application and PHP.
The following sections provide an in-depth look at two SAPIs: the moderately simple PHP CGI SAPI, and the embed SAPI, which is meant for embedding PHP into an application with minimal custom needs.

The CGI SAPI
The CGI SAPI provides a good introduction to how SAPIs are implemented. It is simple, in that it does not have to link against complicated external entities as mod_php does. Despite this relative simplicity, it supports reading in complex environment information, including POST, GET, and cookie data. This import of environmental information is one of the major duties of any SAPI implementation, so it is important to understand it.
The defining structure in a SAPI is sapi_module_struct, which defines all the ways that the SAPI can bridge PHP and the environment so that it can set environment and query variables. sapi_module_struct is a collection of details and function pointers that tell the SAPI how to pass data to and from PHP. It is defined as follows:
struct _sapi_module_struct {
    char *name;
    char *pretty_name;
    int (*startup)(struct _sapi_module_struct *sapi_module);
    int (*shutdown)(struct _sapi_module_struct *sapi_module);
    int (*activate)(TSRMLS_D);
    int (*deactivate)(TSRMLS_D);
    int (*ub_write)(const char *str, unsigned int str_length TSRMLS_DC);
    void (*flush)(void *server_context);
    struct stat *(*get_stat)(TSRMLS_D);
    char *(*getenv)(char *name, size_t name_len TSRMLS_DC);
    void (*sapi_error)(int type, const char *error_msg, ...);
    int (*header_handler)(sapi_header_struct *sapi_header,
                          sapi_headers_struct *sapi_headers TSRMLS_DC);
    int (*send_headers)(sapi_headers_struct *sapi_headers TSRMLS_DC);
    void (*send_header)(sapi_header_struct *sapi_header,
                        void *server_context TSRMLS_DC);
    int (*read_post)(char *buffer, uint count_bytes TSRMLS_DC);
    char *(*read_cookies)(TSRMLS_D);
    void (*register_server_variables)(zval *track_vars_array TSRMLS_DC);
    void (*log_message)(char *message);
    char *php_ini_path_override;
    void (*block_interruptions)(void);
    void (*unblock_interruptions)(void);
    void (*default_post_reader)(TSRMLS_D);
    void (*treat_data)(int arg, char *str, zval *destArray TSRMLS_DC);
    char *executable_location;
    int php_ini_ignore;
    int (*get_fd)(int *fd TSRMLS_DC);
    int (*force_http_10)(TSRMLS_D);
    int (*get_target_uid)(uid_t * TSRMLS_DC);
    int (*get_target_gid)(gid_t * TSRMLS_DC);
    unsigned int (*input_filter)(int arg, char *var, char **val,
                                 unsigned int val_len TSRMLS_DC);
    void (*ini_defaults)(HashTable *configuration_hash);
    int phpinfo_as_text;
};
Here is the module structure for the CGI SAPI:
static sapi_module_struct cgi_sapi_module = {
    "cgi",                          /* name */
    "CGI",                          /* pretty name */
    php_cgi_startup,                /* startup */
    php_module_shutdown_wrapper,    /* shutdown */
    NULL,                           /* activate */
    sapi_cgi_deactivate,            /* deactivate */
    sapi_cgibin_ub_write,           /* unbuffered write */
    sapi_cgibin_flush,              /* flush */
    NULL,                           /* get uid */
    sapi_cgibin_getenv,             /* getenv */
    php_error,                      /* error handler */
    NULL,                           /* header handler */
    sapi_cgi_send_headers,          /* send headers handler */
    NULL,                           /* send header handler */
    sapi_cgi_read_post,             /* read POST data */
    sapi_cgi_read_cookies,          /* read Cookies */
    sapi_cgi_register_variables,    /* register server variables */
    sapi_cgi_log_message,           /* Log message */
    STANDARD_SAPI_MODULE_PROPERTIES
};
Notice that the last 14 fields of the struct have been replaced with the macro STANDARD_SAPI_MODULE_PROPERTIES. This common technique used by SAPI authors takes advantage of the C language rule that struct elements omitted from an initializer are zero-initialized, which for pointers means NULL.
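The initialization rule is easy to check in isolation. In the following sketch, struct demo_module and its members are hypothetical stand-ins for sapi_module_struct; only the first two members are listed in the initializer.

```c
#include <stddef.h>

/* Hypothetical hook table standing in for sapi_module_struct. */
struct demo_module {
    const char *name;
    int (*startup)(void);
    int (*activate)(void);
    void (*log_message)(const char *message);
    int ignore_ini;
};

static int demo_startup(void)
{
    return 0;
}

/* Only the first two members are listed; C zero-initializes the rest,
 * so activate and log_message are NULL and ignore_ini is 0. */
struct demo_module demo = {
    "demo",         /* name */
    demo_startup    /* startup */
};

/* Call the optional hook only if the module supplied one. */
int demo_activate(const struct demo_module *m)
{
    return m->activate ? m->activate() : 0;
}
```

The consequence for callers is that every optional hook must be tested against NULL before it is invoked, as demo_activate does.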
The first two fields in the struct are the name of the SAPI. These are what is returned when you call phpinfo() or php_sapi_name() from a script.
The third field is the function pointer sapi_module_struct.startup. When an application implementing a PHP SAPI is started, this function is called. An important task for this function is to bootstrap the rest of the loading by calling php_module_startup() on its module details. In the CGI module, only the bootstrapping procedure is performed, as shown here:
static int php_cgi_startup(sapi_module_struct *sapi_module)
{
    if (php_module_startup(sapi_module, NULL, 0) == FAILURE) {
        return FAILURE;
    }
    return SUCCESS;
}

The fourth element, sapi_module_struct.shutdown, is the corresponding function called when the SAPI is shut down (usually when the application is terminating). The CGI SAPI (like most of the SAPIs that ship with PHP) calls php_module_shutdown_wrapper as its shutdown function. This simply calls php_module_shutdown, as shown here:
int php_module_shutdown_wrapper(sapi_module_struct *sapi_globals)
{
    TSRMLS_FETCH();
    php_module_shutdown(TSRMLS_C);
    return SUCCESS;
}
As described in Chapter 20, “PHP and Zend Engine Internals,” on every request, the SAPI performs startup and shutdown calls to clean up its running environment and to reset any resources it may require. These are the fifth and sixth sapi_module_struct elements. The CGI SAPI does not define sapi_module_struct.activate, meaning that it registers no generic request-startup code, but it does register sapi_module_struct.deactivate. In deactivate, the CGI SAPI flushes its output file streams to guarantee that the end user gets all the data before the SAPI closes its end of the socket. The following are the deactivation code and the flush helper function:
static void sapi_cgibin_flush(void *server_context)
{
    if (fflush(stdout) == EOF) {
        php_handle_aborted_connection();
    }
}

static int sapi_cgi_deactivate(TSRMLS_D)
{
    sapi_cgibin_flush(SG(server_context));
    return SUCCESS;
}
Note that stdout is explicitly flushed; this is because the CGI SAPI is hard-coded to send output to stdout.
A SAPI that implements more complex activate and deactivate functions is the Apache module mod_php. Its activate function registers memory cleanup functions in case Apache terminates the script prematurely (for instance, if the client clicks the Stop button in the browser or the script exceeds Apache’s timeout setting).
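The order in which the first four hooks fire can be sketched with a toy driver. Everything here is hypothetical (struct mini_sapi, serve, and the trace buffer are illustration only, and the real engine passes arguments that are omitted here); the point is that startup and shutdown run once, while activate and deactivate bracket each request and may be NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced hook table; the real struct has many more slots. */
struct mini_sapi {
    int (*startup)(void);
    int (*shutdown)(void);
    int (*activate)(void);      /* may be NULL, as in the CGI SAPI */
    int (*deactivate)(void);    /* may be NULL */
};

char trace[64];                 /* records the order of hook calls */
static size_t trace_len;

static void note(char c)
{
    if (trace_len + 1 < sizeof(trace)) {
        trace[trace_len++] = c;
        trace[trace_len] = '\0';
    }
}

static int on_startup(void)    { note('S'); return 0; }
static int on_shutdown(void)   { note('X'); return 0; }
static int on_deactivate(void) { note('d'); return 0; }

/* Like the CGI SAPI, this module leaves activate unset. */
struct mini_sapi cgi_like = { on_startup, on_shutdown, NULL, on_deactivate };

/* Startup once, activate/deactivate around each request, shutdown once. */
int serve(const struct mini_sapi *m, int requests)
{
    int i;

    if (m->startup() != 0) {
        return -1;
    }
    for (i = 0; i < requests; i++) {
        if (m->activate) {
            m->activate();
        }
        note('r');              /* the request itself would run here */
        if (m->deactivate) {
            m->deactivate();
        }
    }
    return m->shutdown();
}
```

Calling serve(&cgi_like, 2) leaves the string SrdrdX in trace: one startup, two requests each followed by deactivate, and one shutdown.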
The seventh element, sapi_module_struct.ub_write, provides a callback for how PHP should write data to the user when output buffering is not on. This is the function that will actually send the data when you use print or echo on something in a PHP script. As mentioned earlier, the CGI SAPI writes directly to stdout. Here is its implementation, which writes data in 16KB chunks:

static inline size_t sapi_cgibin_single_write(const char *str, uint str_length TSRMLS_DC)
{
    size_t ret;

    ret = fwrite(str, 1, MIN(str_length, 16384), stdout);
    return ret;
}

static int sapi_cgibin_ub_write(const char *str, uint str_length TSRMLS_DC)
{
    const char *ptr = str;
    uint remaining = str_length;
    size_t ret;

    while (remaining > 0) {
        ret = sapi_cgibin_single_write(ptr, remaining TSRMLS_CC);
        if (!ret) {
            php_handle_aborted_connection();
            return str_length - remaining;
        }
        ptr += ret;
        remaining -= ret;
    }
    return str_length;
}
This method hands the data to the stdio function fwrite() in chunks of at most 16KB, looping until everything has been written. That is not the most direct route to the output, but it is very portable across platforms. On systems that support POSIX input/output, you could as easily consolidate this function into the following:
static int sapi_cgibin_ub_write(const char *str, uint str_length TSRMLS_DC)
{
    ssize_t ret;    /* signed, so that a -1 error return is detectable */

    ret = write(fileno(stdout), str, str_length);
    return (ret >= 0) ? ret : 0;
}
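One caveat with a single write() call is that POSIX allows it to write fewer bytes than requested. A more defensive variant loops until the buffer is drained. This is a sketch under that assumption; write_all is a hypothetical helper, not a function from the PHP source.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to fd, retrying short writes.
 * Returns the number of bytes written, which is less than len only
 * if an unrecoverable error occurred. */
size_t write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted by a signal: retry */
            }
            break;              /* real error: report what was sent */
        }
        if (n == 0) {
            break;              /* no progress: avoid spinning forever */
        }
        done += (size_t)n;
    }
    return done;
}
```

A ub_write callback built on this helper would call write_all(fileno(stdout), str, str_length) and treat a short count as an aborted connection. Mixing it with buffered stdio output on the same descriptor could reorder data, so a real SAPI would pick one mechanism.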
The eighth element is sapi_module_struct.flush, which gives PHP a way to flush its stream buffers (for example, when you call flush() within a PHP script). This uses the function sapi_cgibin_flush, which you saw called earlier from within the deactivate function.
The ninth element is sapi_module_struct.get_stat. This provides a callback to override the default stat() of the file performed to ensure that the script can be run in safe mode. The CGI SAPI does not implement this hook.
The tenth element is sapi_module_struct.getenv. getenv provides an interface to look up environment variables by name. Because the CGI SAPI runs akin to a regular user shell script, its sapi_cgibin_getenv() function is just a simple gateway to the C function getenv(), as shown here:
static char *sapi_cgibin_getenv(char *name, size_t name_len TSRMLS_DC)
{
    return getenv(name);
}
In more complex applications, such as mod_php, the SAPI should implement sapi_module_struct.getenv on top of the application’s internal environment facilities.
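As an illustration of that idea, the following sketch answers lookups from a table owned by the host application rather than from the process environment. The table, its contents, and the name app_getenv are all hypothetical; a real hook would also take the TSRM argument and return char *.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-request environment kept by the host application. */
struct env_entry {
    const char *name;
    const char *value;
};

static const struct env_entry request_env[] = {
    { "REQUEST_METHOD", "GET" },
    { "QUERY_STRING",   "id=42" },
};

/* Look the name up in the application's own table instead of calling
 * the C library getenv(). name_len is the length of name. */
const char *app_getenv(const char *name, size_t name_len)
{
    size_t i;

    for (i = 0; i < sizeof(request_env) / sizeof(request_env[0]); i++) {
        const char *key = request_env[i].name;

        if (strlen(key) == name_len && memcmp(key, name, name_len) == 0) {
            return request_env[i].value;
        }
    }
    return NULL;    /* unknown variable */
}
```

A Web server module would typically populate such a table from the current request, so that two requests handled by the same process never see each other's variables.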
The eleventh element is the callback sapi_module_struct.sapi_error. This sets the function to be called whenever a userspace error or an internal call to zend_error() occurs. Most SAPIs set this to php_error, which is the built-in PHP error handler.
The twelfth element is sapi_module_struct.header_handler. This function is called anytime you call header() inside code or when PHP sets its own internal headers. The CGI SAPI does not set its own header_handler, which means that it falls back on the default SAPI behavior, which is to append the header to an internal list that PHP manages. This callback is mainly used in Web server SAPIs such as mod_php, where the Web server wants to maintain the headers itself instead of having PHP do so.
The thirteenth element is sapi_module_struct.send_headers. This is called when it is time to send all the headers that have been set in PHP (that is, immediately before the first content is sent). This callback can choose to send all the headers itself, in which case it returns SAPI_HEADER_SENT_SUCCESSFULLY, or it can delegate the task of sending individual headers to the fourteenth sapi_module_struct element, send_header, in which case it should return SAPI_HEADER_DO_SEND. The CGI SAPI chooses the first methodology and writes all its headers in a send_headers function, defined as follows:
static int sapi_cgi_send_headers(sapi_headers_struct *sapi_headers TSRMLS_DC)
{
    char buf[SAPI_CGI_MAX_HEADER_LENGTH];
    sapi_header_struct *h;
    zend_llist_position pos;
    long rfc2616_headers = 0;

    if(SG(request_info).no_headers == 1) {
        return SAPI_HEADER_SENT_SUCCESSFULLY;
    }
    if (SG(sapi_headers).http_response_code != 200) {
        int len;

        len = sprintf(buf, "Status: %d\r\n", SG(sapi_headers).http_response_code);
        PHPWRITE_H(buf, len);
    }
    if (SG(sapi_headers).send_default_content_type) {
        char *hd;

        hd = sapi_get_default_content_type(TSRMLS_C);
        PHPWRITE_H("Content-type: ", sizeof("Content-type: ")-1);
        PHPWRITE_H(hd, strlen(hd));
        PHPWRITE_H("\r\n", 2);
        efree(hd);
    }
    h = zend_llist_get_first_ex(&sapi_headers->headers, &pos);
    while (h) {
        PHPWRITE_H(h->header, h->header_len);
        PHPWRITE_H("\r\n", 2);
        h = zend_llist_get_next_ex(&sapi_headers->headers, &pos);
    }
    PHPWRITE_H("\r\n", 2);
    return SAPI_HEADER_SENT_SUCCESSFULLY;
}
PHPWRITE_H is a macro wrapper that handles output buffering, which might potentially be on.
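The shape of the header block this function emits can be reproduced in a standalone sketch. The helper names below (append_line, format_cgi_headers) and the plain string array standing in for PHP's header list are assumptions made for illustration; the sketch writes into a caller-supplied buffer instead of the output layer.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append "prefix value CRLF" to out; fail if it would not fit. */
static int append_line(char *out, size_t cap, size_t *used,
                       const char *prefix, const char *value)
{
    int n = snprintf(out + *used, cap - *used, "%s%s\r\n", prefix, value);

    if (n < 0 || (size_t)n >= cap - *used) {
        return -1;
    }
    *used += (size_t)n;
    return 0;
}

/* Hypothetical helper: build a CGI header block in out (capacity cap).
 * Returns the length written, or -1 if the buffer is too small. */
int format_cgi_headers(char *out, size_t cap, int status,
                       const char *content_type,
                       const char *const *headers, size_t nheaders)
{
    char code[16];
    size_t used = 0, i;

    if (cap == 0) {
        return -1;
    }
    out[0] = '\0';
    if (status != 200) {        /* a Status line only for non-200 codes */
        snprintf(code, sizeof(code), "%d", status);
        if (append_line(out, cap, &used, "Status: ", code) != 0) {
            return -1;
        }
    }
    if (content_type != NULL &&
        append_line(out, cap, &used, "Content-type: ", content_type) != 0) {
        return -1;
    }
    for (i = 0; i < nheaders; i++) {
        if (append_line(out, cap, &used, "", headers[i]) != 0) {
            return -1;
        }
    }
    if (append_line(out, cap, &used, "", "") != 0) {
        return -1;              /* the blank line that ends the headers */
    }
    return (int)used;
}
```

As in the CGI SAPI, a 200 response carries no Status line, and the block always ends with an empty line separating headers from content.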
The fifteenth element is sapi_module_struct.read_post, which specifies how POST data should be read. The function is passed a buffer and a buffer size, and it is expected to fill out the buffer and return the length of the data within. Here is the CGI SAPI’s implementation, which simply reads up to the specified buffer size of data from stdin (file descriptor 0):
static int sapi_cgi_read_post(char *buffer, uint count_bytes TSRMLS_DC)
{
    uint read_bytes = 0;
    int tmp_read_bytes;     /* signed, so that a -1 error from read() is caught */

    count_bytes = MIN(count_bytes,
                      (uint)SG(request_info).content_length - SG(read_post_bytes));
    while (read_bytes < count_bytes) {
        tmp_read_bytes = read(0, buffer + read_bytes, count_bytes - read_bytes);
        if (tmp_read_bytes <= 0) {
            break;
        }
        read_bytes += tmp_read_bytes;
    }
    return read_bytes;
}
Note that no parsing is done here: read_post only provides the facility to read in raw post data. If you want to modify the way PHP parses POST data, you can do so in sapi_module_struct.default_post_reader, which is covered later in this chapter, in the section “SAPI Input Filters.”
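The read loop itself can be exercised outside PHP. In this sketch, read_body is a hypothetical helper: the remaining parameter plays the role of the content length minus the bytes already consumed, and the descriptor is passed in rather than fixed at 0.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to want bytes from fd, never going past
 * the remaining bytes of the declared request body. Returns the number
 * of bytes stored in buffer. */
size_t read_body(int fd, char *buffer, size_t want, size_t remaining)
{
    size_t got = 0;

    if (want > remaining) {
        want = remaining;       /* cap at the declared body length */
    }
    while (got < want) {
        ssize_t n = read(fd, buffer + got, want - got);

        if (n < 0 && errno == EINTR) {
            continue;           /* interrupted by a signal: retry */
        }
        if (n <= 0) {
            break;              /* end of input or a real error */
        }
        got += (size_t)n;
    }
    return got;
}
```

The loop is needed because read() on a pipe or socket may return fewer bytes than requested even when more data is still on its way.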
The sixteenth element is sapi_module_struct.read_cookies. This performs the same function as read_post, except on cookie data. In the CGI specification, cookie data is passed in as an environment variable, so the CGI SAPI cookie reader just uses the