
Advanced PHP Programming
Chapter 23 Writing SAPIs and Extending the Zend Engine
getenv callback to extract it, as shown here:
static char *sapi_cgi_read_cookies(TSRMLS_D)
{
    return sapi_cgibin_getenv((char *)"HTTP_COOKIE", 0 TSRMLS_CC);
}
Again, filtering this data is covered in the section “SAPI Input Filters.”
Next comes sapi_module_struct.register_server_variables. As the name implies, this function is passed what will become the $_SERVER autoglobal array, and the SAPI has the option of adding elements to the array. The following is the top-level register_server_variables callback for the CGI SAPI:
static void sapi_cgi_register_variables(zval *track_vars_array TSRMLS_DC)
{
    php_import_environment_variables(track_vars_array TSRMLS_CC);
    php_register_variable("PHP_SELF",
        (SG(request_info).request_uri ? SG(request_info).request_uri : ""),
        track_vars_array TSRMLS_CC);
}
This calls php_import_environment_variables(), which loops through all the shell environment variables and creates entries for them in $_SERVER. Then it sets $_SERVER['PHP_SELF'] to the request URI of the requested script, or to an empty string if no request URI is set.
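The loop inside php_import_environment_variables() is easy to picture outside of PHP. The following is a minimal standalone sketch, not the engine's code: it assumes a NULL-terminated array of NAME=value strings of the kind a POSIX process receives, and the helpers split_env_entry() and count_importable() are hypothetical names introduced for illustration. Each entry is split at the first '=', which is the point where a real SAPI would register the pair in $_SERVER.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split one "NAME=value" entry at the first '='.
 * Copies the name into name_buf and points *value at the text after '='.
 * Returns 0 on success, -1 if the entry has no '=' or the name does not fit.
 */
static int split_env_entry(const char *entry, char *name_buf, size_t name_sz,
                           const char **value)
{
    const char *eq;
    size_t name_len;

    assert(entry != NULL && name_buf != NULL && value != NULL);
    eq = strchr(entry, '=');
    if (eq == NULL) {
        return -1;              /* malformed entry: skip it */
    }
    name_len = (size_t)(eq - entry);
    if (name_len + 1 > name_sz) {
        return -1;              /* name too long for the caller's buffer */
    }
    memcpy(name_buf, entry, name_len);
    name_buf[name_len] = '\0';
    *value = eq + 1;            /* may itself contain further '=' characters */
    return 0;
}

/*
 * Hypothetical driver: walk a NULL-terminated environment array and count
 * the entries that could be registered. A real SAPI would register each
 * name/value pair here instead of counting.
 */
static int count_importable(char *const *env)
{
    char name[256];
    const char *value;
    int count = 0;

    for (; env != NULL && *env != NULL; env++) {
        if (split_env_entry(*env, name, sizeof(name), &value) == 0) {
            count++;
        }
    }
    return count;
}
```

Splitting at the first '=' rather than the last matters for values such as cookies, which routinely contain '=' themselves.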
The last declared element in the CGI module is sapi_module_struct.log_message. This is the fallback used when no other error-logging facility is specified: if error_log is not set in the php.ini file, this is the function that is called to print any errors you receive. The CGI module implements this by printing to stderr, as follows:
static void sapi_cgi_log_message(char *message)
{
    fprintf(stderr, "%s\n", message);
}
We’ve now covered the standard sapi_module_struct elements. The filtering callbacks default_post_reader, treat_data, and input_filter are covered later in this chapter, in the section “SAPI Input Filters.” The others are special-purpose elements that are not covered here.
The CGI SAPI Application
You need to incorporate the CGI SAPI into an application that can actually run it. The real CGI main() routine is very long because it supports a wide variety of options and flags. Instead of covering that (which could easily take an entire chapter), this section provides a stripped-down version of the CGI main() routine that implements no optional flags:
int main(int argc, char **argv)
{
    int exit_status = SUCCESS;
    zend_file_handle file_handle;
    int retval = FAILURE;

    signal(SIGPIPE, SIG_IGN); /* ignore disconnecting clients */
    sapi_startup(&cgi_sapi_module);
    cgi_sapi_module.executable_location = argv[0];
    if (php_module_startup(&cgi_sapi_module, NULL, 0) == FAILURE) {
        return FAILURE;
    }
    zend_first_try {
        SG(server_context) = (void *) 1; /* avoid server_context==NULL checks */
        init_request_info(TSRMLS_C);
        file_handle.type = ZEND_HANDLE_FILENAME;
        file_handle.filename = SG(request_info).path_translated;
        file_handle.handle.fp = NULL;
        file_handle.opened_path = NULL;
        file_handle.free_filename = 0;
        if (php_request_startup(TSRMLS_C) == FAILURE) {
            php_module_shutdown(TSRMLS_C);
            return FAILURE;
        }
        retval = php_fopen_primary_script(&file_handle TSRMLS_CC);
        if (retval == FAILURE && file_handle.handle.fp == NULL) {
            SG(sapi_headers).http_response_code = 404;
            PUTS("No input file specified.\n");
            php_request_shutdown((void *) 0);
            php_module_shutdown(TSRMLS_C);
            return FAILURE;
        }
        php_execute_script(&file_handle TSRMLS_CC);
        if (SG(request_info).path_translated) {
            char *path_translated;
            path_translated = strdup(SG(request_info).path_translated);
            efree(SG(request_info).path_translated);
            SG(request_info).path_translated = path_translated;
        }
        php_request_shutdown((void *) 0);
        if (exit_status == 0) {
            exit_status = EG(exit_status);
        }
        if (SG(request_info).path_translated) {
            free(SG(request_info).path_translated);
            SG(request_info).path_translated = NULL;
        }
    } zend_catch {
        exit_status = 255;
    } zend_end_try();
    php_module_shutdown(TSRMLS_C);
    sapi_shutdown();
    return exit_status;
}
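The zend_first_try, zend_catch, and zend_end_try() macros wrap the request in a setjmp()-based guard, so that a fatal error deep inside the engine can jump straight back to main(), where exit_status is set to 255. The following standalone sketch illustrates that control flow only. The names run_guarded(), bailout(), and the two request functions are hypothetical, and the real macros do more bookkeeping than this.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for the engine's saved bailout address. */
static jmp_buf *current_bailout = NULL;

/* Hypothetical equivalent of a fatal error: unwind to the nearest guard. */
static void bailout(void)
{
    assert(current_bailout != NULL);    /* must be called inside run_guarded() */
    longjmp(*current_bailout, 1);
}

/* Run body() under a guard; return 0 normally, 255 if it bailed out. */
static int run_guarded(void (*body)(void))
{
    jmp_buf guard;
    jmp_buf *saved = current_bailout;   /* allow nested guards */
    int exit_status;

    current_bailout = &guard;
    if (setjmp(guard) == 0) {
        body();                         /* the "try" block */
        exit_status = 0;
    } else {
        exit_status = 255;              /* the "catch" block */
    }
    current_bailout = saved;
    return exit_status;
}

/* Two hypothetical request bodies: one finishes, one hits a fatal error. */
static void well_behaved_request(void) { /* completes normally */ }
static void failing_request(void)      { bailout(); }
```

Calling run_guarded(well_behaved_request) returns 0, and run_guarded(failing_request) returns 255. Restoring the saved pointer on both paths is what lets guards nest, which matters because a request may itself trigger guarded code.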
The following is the helper function init_request_info(), which sets the SAPI globals for script locations and query string parameters from the environment as per the CGI specification:
static void init_request_info(TSRMLS_D)
{
    char *env_script_filename = sapi_cgibin_getenv("SCRIPT_FILENAME", 0 TSRMLS_CC);
    char *env_path_translated = sapi_cgibin_getenv("PATH_TRANSLATED", 0 TSRMLS_CC);
    char *script_path_translated = env_script_filename;

    /* initialize the defaults */
    SG(request_info).path_translated = NULL;
    SG(request_info).request_method = NULL;
    SG(request_info).query_string = NULL;
    SG(request_info).request_uri = NULL;
    SG(request_info).content_type = NULL;
    SG(request_info).content_length = 0;
    SG(sapi_headers).http_response_code = 200;

    /* script_path_translated being set is a good indication that we are
       running in a cgi environment, since it is always null otherwise.
       otherwise, the filename of the script will be retrieved later via
       argc/argv */
    if (script_path_translated) {
        const char *auth;
        char *content_length = sapi_cgibin_getenv("CONTENT_LENGTH", 0 TSRMLS_CC);
        char *content_type = sapi_cgibin_getenv("CONTENT_TYPE", 0 TSRMLS_CC);

        SG(request_info).request_method =
            sapi_cgibin_getenv("REQUEST_METHOD", 0 TSRMLS_CC);
        SG(request_info).query_string =
            sapi_cgibin_getenv("QUERY_STRING", 0 TSRMLS_CC);
        if (script_path_translated && !strstr(script_path_translated, "..")) {
            SG(request_info).path_translated = estrdup(script_path_translated);
        }
        SG(request_info).content_type = (content_type ? content_type : "");
        SG(request_info).content_length = (content_length ? atoi(content_length) : 0);
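
Two of the checks in the portion of init_request_info() shown above are worth isolating: a script path containing ".." is never accepted, and an unset CONTENT_LENGTH is treated as zero. The following is a minimal sketch of both, using plain C strings instead of the SAPI globals. The helper names accept_script_path() and parse_content_length() are hypothetical and do not appear in the PHP source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the strstr() test above: accept a script
 * path only if it is set and contains no ".." sequence anywhere.
 * Returns 1 if the path is acceptable, 0 otherwise.
 */
static int accept_script_path(const char *path)
{
    return path != NULL && strstr(path, "..") == NULL;
}

/*
 * Hypothetical helper mirroring the CONTENT_LENGTH handling above:
 * an unset variable means "no request body", so it maps to 0.
 */
static int parse_content_length(const char *content_length)
{
    return content_length ? atoi(content_length) : 0;
}
```

The ".." test is deliberately blunt: it also rejects harmless names such as "a..b.php", trading a few false positives for a check that cannot be bypassed by clever path construction.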


allow transparent access to user environment data, and much of that work has to be done in the SAPI implementation.
If your goals are less ambitious than full custom PHP integration and you only want to execute PHP code as part of an application, the embed SAPI may be the right solution for you. The embed SAPI exposes PHP as a shared library that you can link against to run code.
To build the embed library, you need to compile PHP with the following configuration line:
--enable-embed
This creates libphp5.so.
The embed SAPI exposes two macros to the user:
PHP_EMBED_START_BLOCK(int argc, char **argv)
PHP_EMBED_END_BLOCK()
Inside the block defined by those macros is a running PHP environment where you can execute scripts with this:
php_execute_script(zend_file_handle *primary_file TSRMLS_DC);
or this:
zend_eval_string(char *str, zval *retval_ptr, char *string_name TSRMLS_DC);
As an example of just how simple this is, here is a working PHP shell that interactively executes anything you pass to it:
#include <php_embed.h>
#include <stdio.h>
#include <readline/readline.h>
#include <readline/history.h>

int main(int argc, char **argv)
{
    char *code;

    PHP_EMBED_START_BLOCK(argc, argv);
    while ((code = readline("> ")) != NULL) {
        zend_eval_string(code, NULL, argv[0] TSRMLS_CC);
    }
    PHP_EMBED_END_BLOCK();
    return 0;
}
You then compile this, as shown here:
> gcc -pipe -g -O2 -I/usr/local/include/php -I/usr/local/include/php/Zend \
    -I/usr/local/include/php/TSRM -I/usr/local/include/php/main -c psh.c
> gcc -pipe -g -O2 -L/usr/local/lib -lreadline -lncurses -lphp5 psh.o -o psh

Note that the embed SAPI sets the $argc and $argv autoglobals from what is passed to PHP_EMBED_START_BLOCK(). Check out the following psh session:
> ./psh foo bar
> print_r($argv);
Array
(
    [0] => ./psh
    [1] => foo
    [2] => bar
)
> $a = 1;
> print "$a\n";
1
This is a toy example in that psh is pretty featureless, but it demonstrates how you can leverage all of PHP in under 15 lines of C. Later in this chapter you will use the embed SAPI to build a more significant application: the opcode dumper described in Chapter 20.
SAPI Input Filters
In Chapter 13, “User Authentication and Session Security,” you learned a bit about cross-site scripting and SQL injection attacks. Although they manifest differently, both attacks involve getting a Web application to accidentally execute (or in the case of cross-site scripting, getting a third-party user to execute) malicious code in your application’s space.
The solution to all attacks of this sort is simple: You must be fanatical about validating and sanitizing any input a user gives you. The responsibility for this sanitization process lies with the developer, but leaving it at that can be unsatisfactory for two reasons:
• Developers sometimes make mistakes. Cross-site scripting is an extremely serious security issue, and relying on everyone who touches PHP code to always perform the correct security measures may not be good enough.
• Sanitizing all your data in PHP on every request can be slow.
To help address this issue, the SAPI interface provides a set of three callbacks that can be used to automatically sanitize data on every incoming request: input_filter, treat_data, and default_post_reader. Because they are registered at the SAPI level, they are invisible to the developer and are executed automatically. This makes it impossible to forget to apply them on a page. Further, because they are implemented in C and occur before data is inserted into the autoglobal arrays, the implementations can be much faster than anything written in PHP.
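The reason a SAPI-level filter cannot be forgotten is that it sits on the only path by which request data reaches the autoglobal arrays. The following standalone sketch models that arrangement with a single function pointer. Everything in it is hypothetical (register_filter(), import_variable(), and the toy mask_angle_brackets() filter); it illustrates the dispatch pattern and is not the PHP implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter type: rewrite value in place, return its new length. */
typedef size_t (*filter_fn)(char *value, size_t len);

static filter_fn active_filter = NULL;

/* Hypothetical registration hook, called once at startup. */
static void register_filter(filter_fn fn)
{
    active_filter = fn;
}

/*
 * Hypothetical import step: every incoming value passes through here on its
 * way into storage, so the registered filter runs whether or not the page
 * author remembers it. dst must have room for len + 1 bytes.
 */
static size_t import_variable(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
    dst[len] = '\0';
    if (active_filter != NULL) {
        len = active_filter(dst, len);
        dst[len] = '\0';
    }
    return len;
}

/* Toy filter: overwrite '<' and '>' so no markup survives. Length is unchanged. */
static size_t mask_angle_brackets(char *value, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (value[i] == '<' || value[i] == '>') {
            value[i] = '_';
        }
    }
    return len;
}
```

After register_filter(mask_angle_brackets), any value routed through import_variable() arrives in storage already masked, with no cooperation required from the code that later reads it.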


unsigned int raw_filter(int arg, char *var, char **val, unsigned int val_len,
                        unsigned int *new_val_len TSRMLS_DC);

static void php_raw_filter_init_globals(zend_raw_filter_globals *globals)
{
    /* zero the whole globals struct, not just a pointer's worth of it */
    memset(globals, 0, sizeof(zend_raw_filter_globals));
}

PHP_MINIT_FUNCTION(raw_filter)
{
    ZEND_INIT_MODULE_GLOBALS(raw_filter, php_raw_filter_init_globals, NULL);
    zend_register_auto_global("_RAW_GET", sizeof("_RAW_GET")-1, NULL TSRMLS_CC);
    zend_register_auto_global("_RAW_POST", sizeof("_RAW_POST")-1, NULL TSRMLS_CC);
    zend_register_auto_global("_RAW_COOKIE", sizeof("_RAW_COOKIE")-1,
                              NULL TSRMLS_CC);
    sapi_register_input_filter(raw_filter);
    return SUCCESS;
}

PHP_MSHUTDOWN_FUNCTION(raw_filter)
{
    return SUCCESS;
}

PHP_RSHUTDOWN_FUNCTION(raw_filter)
{
    if (IF_G(get_array)) {
        zval_ptr_dtor(&IF_G(get_array));
        IF_G(get_array) = NULL;
    }
    if (IF_G(post_array)) {
        zval_ptr_dtor(&IF_G(post_array));
        IF_G(post_array) = NULL;
    }
    if (IF_G(cookie_array)) {
        zval_ptr_dtor(&IF_G(cookie_array));
        IF_G(cookie_array) = NULL;
    }
    return SUCCESS;
}

PHP_MINFO_FUNCTION(raw_filter)
{
    php_info_print_table_start();
    php_info_print_table_row(2, "strip_tags() Filter Support", "enabled");
    php_info_print_table_end();
}

zend_module_entry raw_filter_module_entry = {
    STANDARD_MODULE_HEADER,
    "raw_filter",
    NULL,
    PHP_MINIT(raw_filter),
    PHP_MSHUTDOWN(raw_filter),
    NULL,
    PHP_RSHUTDOWN(raw_filter),
    PHP_MINFO(raw_filter),
    "0.1",
    STANDARD_MODULE_PROPERTIES
};

#ifdef COMPILE_DL_RAW_FILTER
ZEND_GET_MODULE(raw_filter);
#endif
This is largely a standard module. There are two new things to notice, though. The first is that you call this in the MINIT phase to register the new $_RAW arrays as autoglobals:
zend_register_auto_global("_RAW_GET", sizeof("_RAW_GET")-1, NULL TSRMLS_CC);
The second is that you register raw_filter as a SAPI input filter in MINIT via the following call:
sapi_register_input_filter(raw_filter);
The input filter forward declaration is as follows:
unsigned int raw_filter(int arg, char *var, char **val, unsigned int val_len,
unsigned int *new_val_len TSRMLS_DC);
The arguments to the input filters are as follows:
• arg: The type of the input being processed (either PARSE_POST, PARSE_GET, or PARSE_COOKIE).
• var: The name of the input being processed.
• val: A pointer to the input of the argument being processed.
• val_len: The original length of *val.
• new_val_len: The length of *val after any modification, to be set inside the filter.
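To make the contract between val, val_len, and new_val_len concrete, here is a standalone sketch of a filter with the same parameter list, minus the TSRMLS_DC thread-safety macro so that it compiles without the PHP headers. It is not the raw_filter() implementation from this chapter. The function name drop_angle_brackets_filter() and the DEMO_PARSE_* values are placeholders for illustration, and the meaning of the return value (nonzero means "keep the variable") is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder input types; the real PARSE_* constants come from the PHP headers. */
enum { DEMO_PARSE_POST, DEMO_PARSE_GET, DEMO_PARSE_COOKIE };

/*
 * Sketch of an input filter: remove every '<' and '>' from *val in place.
 * The value can only shrink, so the existing buffer is reused and
 * *new_val_len reports the shortened length.
 * Returns 1 to signal "keep this variable" (an assumption of this sketch).
 */
static unsigned int drop_angle_brackets_filter(int arg, char *var, char **val,
                                               unsigned int val_len,
                                               unsigned int *new_val_len)
{
    char *buf;
    unsigned int in, out = 0;

    (void)arg;  /* this toy filter treats GET, POST, and cookie data alike */
    (void)var;  /* ... and ignores the variable name */
    assert(val != NULL && *val != NULL && new_val_len != NULL);

    buf = *val;
    for (in = 0; in < val_len; in++) {
        if (buf[in] != '<' && buf[in] != '>') {
            buf[out++] = buf[in];
        }
    }
    buf[out] = '\0';    /* assumes the buffer holds val_len + 1 bytes */
    *new_val_len = out;
    return 1;
}
```

Because val is a pointer to a pointer, a filter that needs to grow the value could instead allocate a new buffer and store its address in *val; this sketch avoids that by only ever shrinking the data.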
Here is the code for the raw_filter input filter itself:
unsigned int raw_filter(int arg, char *var, char **val, unsigned int val_len, unsigned int *new_val_len TSRMLS_DC)
