
Advanced PHP Programming
Chapter 21 Extending PHP: Part I
To duplicate this functionality in C, you create a fresh array for george and then add its zval to return_value. Then you repeat this for theo:
PHP_FUNCTION(people)
{
    zval *tmp;

    array_init(return_value);
    MAKE_STD_ZVAL(tmp);
    array_init(tmp);
    add_assoc_string(tmp, "FullName", "George Schlossnagle", 1);
    add_assoc_long(tmp, "uid", 1001);
    add_assoc_long(tmp, "gid", 1000);
    add_assoc_zval(return_value, "george", tmp);
    MAKE_STD_ZVAL(tmp);
    array_init(tmp);
    add_assoc_string(tmp, "FullName", "Theo Schlossnagle", 1);
    add_assoc_long(tmp, "uid", 1002);
    add_assoc_long(tmp, "gid", 1000);
    add_assoc_zval(return_value, "theo", tmp);
    return;
}
Note that you can reuse the pointer tmp; when you call MAKE_STD_ZVAL(), it just allocates a fresh zval for your use.
There is a similar set of functions for dealing with indexed arrays. The following functions work like the PHP function array_push(), adding the new value at the end of the array and assigning it the next available index:
add_next_index_long(zval *arg, long value);
add_next_index_null(zval *arg);
add_next_index_bool(zval *arg, int value);
add_next_index_resource(zval *arg, int value);
add_next_index_double(zval *arg, double value);
add_next_index_string(zval *arg, char *str, int duplicate);
add_next_index_stringl(zval *arg, char *str, uint length, int duplicate);
add_next_index_zval(zval *arg, zval *value);
If you want to insert into the array at a specific index, there are convenience functions for that as well:
add_index_long(zval *arg, uint idx, long value);
add_index_null(zval *arg, uint idx);
add_index_bool(zval *arg, uint idx, int value);
add_index_resource(zval *arg, uint idx, int value);
add_index_double(zval *arg, uint idx, double value);
add_index_string(zval *arg, uint idx, char *string, int duplicate);
add_index_stringl(zval *arg, uint idx, char *string, int string_length, int duplicate);
add_index_zval(zval *arg, uint index, zval *value);
Note that in the case of both the add_assoc_ and add_index_ functions, any existing data with that key or index will be overwritten.
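The difference between appending and inserting at a fixed index can be seen in a small standalone model. This is only a sketch in plain C that does not use the Zend headers; toy_array, toy_add_index(), toy_add_next(), and the fixed capacity SLOT_COUNT are hypothetical names invented for the illustration, and the real HashTable is considerably more general.

```c
#include <stddef.h>

#define SLOT_COUNT 16            /* hypothetical fixed capacity for the sketch */

typedef struct {
    long   values[SLOT_COUNT];
    int    used[SLOT_COUNT];     /* 1 if the slot currently holds a value */
    size_t next_free;            /* index that the next append will receive */
} toy_array;

static void toy_init(toy_array *a)
{
    size_t i;
    for (i = 0; i < SLOT_COUNT; i++) {
        a->values[i] = 0;
        a->used[i] = 0;
    }
    a->next_free = 0;
}

/* Models the add_index_ family: store at an explicit index, overwriting
 * whatever was there. Returns 0 on success, -1 if idx is out of range. */
static int toy_add_index(toy_array *a, size_t idx, long value)
{
    if (idx >= SLOT_COUNT) {
        return -1;
    }
    a->values[idx] = value;
    a->used[idx] = 1;
    if (idx >= a->next_free) {
        a->next_free = idx + 1;  /* appends continue after the highest index */
    }
    return 0;
}

/* Models the add_next_index_ family: append at the next available index. */
static int toy_add_next(toy_array *a, long value)
{
    return toy_add_index(a, a->next_free, value);
}
```

In this model, appending to an empty array uses index 0; after an explicit insert at index 5, the next append lands at index 6, and a second insert at index 5 replaces the earlier value.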
You now know all you need to know to be able to create arrays, but how do you extract data from them in a script? As discussed in Chapter 20, one of the types represented by a zval is the HashTable type. This is used for both associative and indexed arrays in PHP. To gain access to a zval’s hashtable, you use the HASH_OF() macro. Then you utilize the hash iteration functions to handle the resulting hashtable.
Consider the following PHP function, which is designed as a rudimentary version of array_filter():
function array_strncmp($array, $match)
{
    $retval = array();
    foreach ($array as $key => $value) {
        if (substr($key, 0, strlen($match)) == $match) {
            $retval[$key] = $value;
        }
    }
    return $retval;
}
A function of this nature is useful, for example, when you’re trying to extract all the HTTP headers for a request. In C this looks as follows:
PHP_FUNCTION(array_strncmp)
{
    zval *z_array, **data;
    char *match;
    char *key;
    int match_len;
    ulong index;
    HashTable *array;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "as",
                              &z_array, &match, &match_len) == FAILURE) {
        return;
    }
    array_init(return_value);
    array = HASH_OF(z_array);
    zend_hash_internal_pointer_reset(array);
    /* Note: this loop ends at the first element that does not have a string key. */
    while (zend_hash_get_current_key(array, &key, &index, 0) == HASH_KEY_IS_STRING) {
        if (!strncmp(key, match, match_len)) {
            zend_hash_get_current_data(array, (void **) &data);
            zval_add_ref(data);
            add_assoc_zval(return_value, key, *data);
        }
        zend_hash_move_forward(array);
    }
}
There is a good bit of new material in this function. Ignore the zval manipulation for the moment; you’ll learn more on that shortly. The important part of this example for now is the process of iterating over an array. First, you access the array’s internal hashtable, using the HASH_OF() macro. Then you reset the hashtable’s internal iterator by using zend_hash_internal_pointer_reset(). This is akin to calling reset($array); in PHP.
Next, you access the current array’s key with zend_hash_get_current_key(). This takes the HashTable pointer, a char ** for the key name, and a ulong * for the array index. You need to pass both pointers in because PHP uses a unified type for associative and indexed arrays, so an element may be either indexed or keyed. If there is no current key (for instance, if you have iterated through to the end of the array), this function returns HASH_KEY_NON_EXISTENT; otherwise, it returns either HASH_KEY_IS_STRING or HASH_KEY_IS_LONG, depending on whether the current element has a string key or an integer index.
Similarly, to extract the current data element, you use zend_hash_get_current_data(), which takes the HashTable pointer and a zval ** to hold the data value. If an array element matches the condition for copying, the zval’s reference count is incremented with zval_add_ref(), and it is inserted into the return array. To advance to the next key, you use zend_hash_move_forward().
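The reset, current-key, move-forward pattern can be modeled in standalone C. The following is a minimal sketch that does not use the Zend headers; toy_hash, toy_bucket, and every toy_ function are hypothetical stand-ins for the real hash API. Unlike a loop that tests only for string keys, this variant runs until no current key exists and simply skips integer-indexed elements.

```c
#include <stddef.h>
#include <string.h>

enum toy_key_type { TOY_KEY_IS_STRING, TOY_KEY_IS_LONG, TOY_KEY_NON_EXISTENT };

typedef struct {
    const char   *key;    /* NULL when the element is integer-indexed */
    unsigned long index;  /* meaningful only when key is NULL */
    long          value;
} toy_bucket;

typedef struct {
    const toy_bucket *buckets;
    size_t            count;
    size_t            pos;    /* the internal iterator */
} toy_hash;

static void toy_reset(toy_hash *h)
{
    h->pos = 0;
}

static void toy_move_forward(toy_hash *h)
{
    if (h->pos < h->count) {
        h->pos++;
    }
}

/* Reports the kind of key at the iterator and fills in the matching
 * output argument, mirroring the two-pointer calling convention. */
static enum toy_key_type toy_current_key(const toy_hash *h,
                                         const char **key,
                                         unsigned long *index)
{
    const toy_bucket *b;

    if (h->pos >= h->count) {
        return TOY_KEY_NON_EXISTENT;
    }
    b = &h->buckets[h->pos];
    if (b->key != NULL) {
        *key = b->key;
        return TOY_KEY_IS_STRING;
    }
    *index = b->index;
    return TOY_KEY_IS_LONG;
}

/* Counts the elements whose string key begins with prefix. */
static size_t toy_count_prefix(toy_hash *h, const char *prefix)
{
    const char *key = NULL;
    unsigned long index = 0;
    size_t matches = 0;
    size_t prefix_len = strlen(prefix);
    enum toy_key_type type;

    toy_reset(h);
    while ((type = toy_current_key(h, &key, &index)) != TOY_KEY_NON_EXISTENT) {
        if (type == TOY_KEY_IS_STRING && strncmp(key, prefix, prefix_len) == 0) {
            matches++;
        }
        toy_move_forward(h);
    }
    return matches;
}
```

Whether integer-indexed elements should end the loop or merely be skipped is a design decision; the sketch skips them so that mixed arrays are scanned to the end.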
Type Testing Conversions and Accessors
As described in Chapter 20, zvals are actually a composite of primitive C data types represented by the zvalue_value union:
typedef union _zvalue_value {
    long lval;
    double dval;
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;
    zend_object_value obj;
} zvalue_value;
PHP provides accessor macros that allow access to these component values. Because this is a union, only a single representation is valid at one time. This means that if you want to use an accessor to access the zval as a string, you first need to ensure that it is currently represented as a string.
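The need to check the type before reading a union member can be illustrated with a small tagged union. This is a standalone sketch, not the Zend definition: toy_zval, its type tags, and the toy_ helpers are hypothetical names, and only three representations are modeled.

```c
#include <stddef.h>

enum toy_type { TOY_NULL, TOY_LONG, TOY_DOUBLE, TOY_STRING };

typedef struct {
    union {
        long   lval;
        double dval;
        struct {
            const char *val;
            int         len;
        } str;
    } value;
    enum toy_type type;   /* records which union member is currently valid */
} toy_zval;

static toy_zval toy_from_long(long n)
{
    toy_zval z;
    z.type = TOY_LONG;
    z.value.lval = n;
    return z;
}

static toy_zval toy_from_string(const char *s, int len)
{
    toy_zval z;
    z.type = TOY_STRING;
    z.value.str.val = s;
    z.value.str.len = len;
    return z;
}

/* Checked accessor: stores the long in *out and returns 1 only when the
 * tag says the long member is the valid one; otherwise returns 0 and
 * leaves *out untouched. */
static int toy_get_long(const toy_zval *z, long *out)
{
    if (z->type != TOY_LONG) {
        return 0;
    }
    *out = z->value.lval;
    return 1;
}
```

Reading value.lval from a value built with toy_from_string() would reinterpret the bytes of a pointer as a number, which is the kind of mistake the type check guards against.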

To convert a zval to a given type, you can use the following functions:
convert_to_string(zval *value);
convert_to_long(zval *value);
convert_to_double(zval *value);
convert_to_null(zval *value);
convert_to_boolean(zval *value);
convert_to_array(zval *value);
convert_to_object(zval *value);
To test whether your zval needs conversion, you can use the Z_TYPE_P() macro to check the zval’s current type, as demonstrated in the following example:
PHP_FUNCTION(check_type)
{
    zval *value;
    char *result;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &value) == FAILURE) {
        return;
    }
    switch (Z_TYPE_P(value)) {
        case IS_NULL:
            result = "NULL";
            break;
        case IS_LONG:
            result = "LONG";
            break;
        case IS_DOUBLE:
            result = "DOUBLE";
            break;
        case IS_STRING:
            result = "STRING";
            break;
        case IS_ARRAY:
            result = "ARRAY";
            break;
        case IS_OBJECT:
            result = "OBJECT";
            break;
        case IS_BOOL:
            result = "BOOL";
            break;
        case IS_RESOURCE:
            result = "RESOURCE";
            break;
        case IS_CONSTANT:
            result = "CONSTANT";
            break;
        case IS_CONSTANT_ARRAY:
            result = "CONSTANT_ARRAY";
            break;
        default:
            result = "UNKNOWN";
    }
    RETURN_STRING(result, 1);
}
To then access the data in the various types, you can use the macros in Table 21.5, each of which takes a zval.

Table 21.5 zval-to-C Data Type Conversion Macros

Macro       Returns        Description
Z_LVAL      long           Returns a long value
Z_BVAL      zend_bool      Returns a Boolean value
Z_STRVAL    char *         Returns a buffer for the string
Z_STRLEN    int            Returns the length of a string
Z_ARRVAL    HashTable *    Returns an internal hashtable
Z_RESVAL    long           Returns the resource handle

In addition, there are forms of all these macros to accept zval * and zval ** pointers. They are named identically, but with an appended _P or _PP, respectively. For instance, to extract the string buffer for zval **p, you would use Z_STRVAL_PP(p).
When data is passed into a function via the zend_parse_parameters() function, the resulting data is largely safe for use. When you get access to data as a zval, however, all bets are off. The problem lies in the way zvals in PHP are reference counted. The Zend Engine uses a copy-on-write semantic, which means if you have code like the following, you actually only have a single zval with a reference count of two:
$a = 1;
$b = $a;
If you modify $b in your PHP code, $b is automatically separated into its own zval. Inside an extension, though, you need to perform this separation yourself. Separation takes a zval pointer whose reference count is greater than one and copies its content into a new zval. This means that you can manipulate its contents at your whim without worrying about affecting anyone else’s copy. Separating a zval is prudent if you are going to perform type conversion.
Separation is performed with the SEPARATE_ZVAL() macro. Because you often may not want to separate a zval if it is accessed by reference, there is also a SEPARATE_ZVAL_IF_NOT_REF() macro that performs the separation only if the zval is not a reference.
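The mechanics of separation can be sketched with a simplified reference-counted value. This is a standalone model rather than the Zend implementation: toy_zval holds only a long, and toy_new(), toy_share(), toy_separate(), toy_separate_if_not_ref(), and toy_release() are hypothetical helpers that imitate what the engine and the separation macros do.

```c
#include <stdlib.h>

typedef struct {
    long value;
    int  refcount;   /* number of variables sharing this value */
    int  is_ref;     /* nonzero when the sharing is an explicit reference */
} toy_zval;

static toy_zval *toy_new(long value)
{
    toy_zval *z = malloc(sizeof *z);
    if (z != NULL) {
        z->value = value;
        z->refcount = 1;
        z->is_ref = 0;
    }
    return z;
}

/* Models $b = $a: no copy is made, the count just goes up. */
static toy_zval *toy_share(toy_zval *z)
{
    z->refcount++;
    return z;
}

/* If the value is shared, give *pz a private copy and drop one count
 * from the shared original. Returns 0 on success, -1 if allocation fails. */
static int toy_separate(toy_zval **pz)
{
    toy_zval *copy;

    if ((*pz)->refcount <= 1) {
        return 0;                 /* already private, nothing to do */
    }
    copy = toy_new((*pz)->value);
    if (copy == NULL) {
        return -1;
    }
    (*pz)->refcount--;
    *pz = copy;
    return 0;
}

/* Leaves explicit references alone, because writes through a reference
 * are meant to be visible to every holder. */
static int toy_separate_if_not_ref(toy_zval **pz)
{
    if ((*pz)->is_ref) {
        return 0;
    }
    return toy_separate(pz);
}

static void toy_release(toy_zval *z)
{
    if (--z->refcount == 0) {
        free(z);
    }
}
```

After toy_share(), both variables point at one value with a count of two; after toy_separate() on one of them, each holds its own value with a count of one, so a write to one is invisible to the other.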

Finally, sometimes you might want to create a new copy of a variable, as in this example:
$a = $b;
For strings and numeric scalars, this copy might seem silly; after all, it is quite easy to create a brand-new zval from a char * or a long. Copying is especially essential when it comes to complex data types, such as arrays or objects, in which case copying would be a multistep operation.
You might naively assume that if you wanted to write a function that returns its single parameter unchanged, you could use this:
PHP_FUNCTION(return_unchanged)
{
    zval *arg;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &arg) == FAILURE) {
        return;
    }
    *return_value = *arg;
    return;
}
However, performing this sort of copy creates an invalid reference to the data pointed at by arg. To correctly perform this copy, you also need to invoke zval_copy_ctor(). zval_copy_ctor() is modeled after an object-oriented style copy constructor (like the __clone() method in PHP 5) and handles making proper deep copies of zvals, regardless of their type. The preceding return_unchanged() function should correctly be written as follows:
PHP_FUNCTION(return_unchanged)
{
    zval *arg;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &arg) == FAILURE) {
        return;
    }
    *return_value = *arg;
    zval_copy_ctor(return_value);
    return;
}
Similarly, you might from time to time be required to destroy a zval, for example, if you create a temporary zval inside a function that is not returned into PHP. The same complexities that make copying a zval difficult (the deep and variable structures) make destroying a zval difficult as well. For this you should use the zval destructor zval_dtor().
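Why a plain structure assignment is not enough, and what a copy constructor and destructor add, can be shown with a string-only model. This is a standalone sketch under simplifying assumptions: toy_str, toy_init(), toy_copy_ctor(), and toy_dtor() are hypothetical names, and real zvals have many more cases to handle than a single string buffer.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *val;   /* heap buffer owned by this value */
    size_t len;
} toy_str;

/* Creates a value that owns a fresh copy of text.
 * Returns 0 on success, -1 if allocation fails. */
static int toy_init(toy_str *s, const char *text)
{
    s->len = strlen(text);
    s->val = malloc(s->len + 1);
    if (s->val == NULL) {
        s->len = 0;
        return -1;
    }
    memcpy(s->val, text, s->len + 1);
    return 0;
}

/* Call this after a structure assignment such as b = a. The assignment
 * copies only the pointer, so both values share one buffer; this gives
 * s its own duplicate. Returns 0 on success, -1 if allocation fails. */
static int toy_copy_ctor(toy_str *s)
{
    char *dup = malloc(s->len + 1);
    if (dup == NULL) {
        return -1;
    }
    memcpy(dup, s->val, s->len);
    dup[s->len] = '\0';
    s->val = dup;
    return 0;
}

/* Releases the buffer owned by s and leaves s empty. */
static void toy_dtor(toy_str *s)
{
    free(s->val);
    s->val = NULL;
    s->len = 0;
}
```

Without the toy_copy_ctor() call, destroying both values would free the same buffer twice, and a change made through one would show up in the other.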

Using Resources
You use resources when you need to assign an arbitrary data type to a PHP variable. By arbitrary, I don’t mean a string or number or even an array, but a generic C pointer that could correspond to anything. Resources are often used for database connections, file pointers, and other resources that you may want to pass between functions but that do not correspond to any of PHP’s native types.
Creating resources in PHP is a rather complicated process. In PHP, actual resource values are not stored in zvals. Instead, resources are handled similarly to objects: An integer that identifies the resource is stored in the zval and can be used to find the actual data pointer for the resource in a resource data storage list. Object-oriented extensions are covered in Chapter 22, “Extending PHP: Part II.”
To start handling resources, you need to create a list to store the resource values. List registration is performed with the function zend_register_list_destructors_ex(), which has the following prototype:
int zend_register_list_destructors_ex(rsrc_dtor_func_t ld, rsrc_dtor_func_t pld, char *type_name, int module_number);
ld is a function pointer that takes a zend_rsrc_list_entry * structure and handles destruction of a nonpersistent resource. For example, if the resource is a pointer to a database connection, ld would be a function that rolls back any uncommitted transactions, closes the connection, and frees any allocated memory. Nonpersistent resources are destroyed at the end of every request.
The zend_rsrc_list_entry data type looks like this:
typedef struct _zend_rsrc_list_entry {
    void *ptr;
    int type;
    int refcount;
} zend_rsrc_list_entry;
pld is identical to ld, except that it is used for persistent resources. Persistent resources are not automatically destroyed until server shutdown. When registering resource lists in practice, you traditionally create one list for nonpersistent resources and one for persistent resources. This is not technically necessary, but it adds to the orderliness of your extension and is the traditional method for handling resources.
type_name is a string used to identify the type of resource contained in the list. This name is used only for making user errors pretty and serves no technical function for the resources.
module_number is the internal number used to identify the current extension. One of the elements of zend_module_entry is zend_module_entry.module_number. When PHP loads the extension, it sets this module number for you. This value is what you pass as the fourth parameter to zend_register_list_destructors_ex().
If you want to register a POSIX file handle as a resource (similar to what fopen does under PHP 4), you need to create a destructor for it. This destructor would simply close the file handle in question. Here is a destructor function for closing POSIX file handles:
static void posix_fh_dtor(zend_rsrc_list_entry *rsrc TSRMLS_DC)
{
    if (rsrc->ptr) {
        fclose(rsrc->ptr);
        rsrc->ptr = NULL;
    }
}
The actual registration is performed in the PHP_MINIT_FUNCTION() handler. You start by defining a static int for each list you need to create. The int is a handle to the list and how you reference it. The following code creates two lists, one persistent and one not:
static int non_persist;
static int persist;

PHP_MINIT_FUNCTION(example)
{
    non_persist = zend_register_list_destructors_ex(posix_fh_dtor, NULL,
                                                    "non-persistent posix fh",
                                                    module_number);
    persist = zend_register_list_destructors_ex(NULL, posix_fh_dtor,
                                                "persistent posix fh",
                                                module_number);
    return SUCCESS;
}
To actually register a resource you use the following macro:
ZEND_REGISTER_RESOURCE(zval *rsrc_result, void *ptr, int rsrc_list)
This inserts the data pointer ptr into the list rsrc_list, returns the resource ID handle for the new resource, and makes the zval rsrc_result a resource that references that handle. rsrc_result can also be set to NULL if you prefer to assign the handle into something other than an existing zval.
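The idea of storing an integer handle in the variable while the real pointer lives in a list can be modeled in a few lines of standalone C. This sketch does not use the Zend headers and makes simplifying assumptions: toy_list, its fixed capacity TOY_MAX_RESOURCES, the per-entry destructor, and all toy_ functions are hypothetical, whereas the engine associates destructors with list types rather than with individual entries.

```c
#include <stddef.h>

#define TOY_MAX_RESOURCES 8      /* hypothetical capacity for the sketch */

typedef void (*toy_dtor_fn)(void *ptr);

typedef struct {
    void       *ptr;     /* the real data, for example a FILE * */
    toy_dtor_fn dtor;    /* how to release ptr */
    int         in_use;
} toy_entry;

typedef struct {
    toy_entry entries[TOY_MAX_RESOURCES];
} toy_list;

static void toy_list_init(toy_list *l)
{
    int i;
    for (i = 0; i < TOY_MAX_RESOURCES; i++) {
        l->entries[i].ptr = NULL;
        l->entries[i].dtor = NULL;
        l->entries[i].in_use = 0;
    }
}

/* Stores ptr and returns its handle (1 or greater), or 0 if the list is full. */
static int toy_register(toy_list *l, void *ptr, toy_dtor_fn dtor)
{
    int i;
    for (i = 0; i < TOY_MAX_RESOURCES; i++) {
        if (!l->entries[i].in_use) {
            l->entries[i].ptr = ptr;
            l->entries[i].dtor = dtor;
            l->entries[i].in_use = 1;
            return i + 1;
        }
    }
    return 0;
}

/* Maps a handle back to its pointer, or NULL for an unknown handle. */
static void *toy_fetch(const toy_list *l, int handle)
{
    if (handle < 1 || handle > TOY_MAX_RESOURCES || !l->entries[handle - 1].in_use) {
        return NULL;
    }
    return l->entries[handle - 1].ptr;
}

/* Models the end of a request: every live entry is destroyed. */
static void toy_list_destroy(toy_list *l)
{
    int i;
    for (i = 0; i < TOY_MAX_RESOURCES; i++) {
        if (l->entries[i].in_use) {
            if (l->entries[i].dtor != NULL) {
                l->entries[i].dtor(l->entries[i].ptr);
            }
            l->entries[i].ptr = NULL;
            l->entries[i].in_use = 0;
        }
    }
}

/* Example destructor for demonstrations: counts how often it was called. */
static void toy_count_dtor(void *ptr)
{
    ++*(int *) ptr;
}
```

A caller keeps only the small integer returned by toy_register() and exchanges it for the pointer with toy_fetch() whenever the underlying data is needed.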
The following is a function that (very roughly) models fopen() and registers its FILE pointer as a persistent resource:
PHP_FUNCTION(pfopen)
{
    char *path, *mode;
    int path_length, mode_length;
    FILE *fh;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ss", &path, &path_length,
                              &mode, &mode_length) == FAILURE) {
        return;
    }
    fh = fopen(path, mode);
    if (fh) {
        ZEND_REGISTER_RESOURCE(return_value, fh, persist);
        return;
    }
    else {
        RETURN_FALSE;
    }
}
Of course, a function that blindly creates persistent resources isn’t very interesting. What it should be doing is seeing whether a current resource exists, and if so, it should use the preexisting resource instead of creating a new one.
There are two ways you might look for a resource. The first is to look for a resource, given the general initialization parameters. This is the crux of persistent resources. When you begin to establish a new persistent resource, you see whether a similarly declared resource already exists. Of course, the difficulty here is that you have to conceive of a keyed hashing system based on the initialization parameters to find your resource. In contrast, if you have a resource value assigned to a zval, then you already have its resource ID, so retrieval should (hopefully) be much simpler.
To find resources by their initialization parameters, you need both a hash and a key. PHP provides the hash: the global HashTable EG(persistent_list) is used for looking up resources by key. For the key, you are on your own. In general, a resource is uniquely determined by its initialization parameters, so a typical approach is to string together the initialization parameters, perhaps with some namespacing.
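Building such a key is plain string work, as the following standalone sketch shows. The helper toy_make_key(), its namespace argument, and the '|' separator between path and mode are assumptions made for the illustration; the separator is there so that different parameter pairs cannot run together into the same key.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Builds "<ns>_<path>|<mode>" in a freshly allocated buffer.
 * Returns NULL if allocation fails; the caller frees the result. */
static char *toy_make_key(const char *ns, const char *path, const char *mode)
{
    /* namespace + '_' + path + '|' + mode */
    size_t len = strlen(ns) + 1 + strlen(path) + 1 + strlen(mode);
    char *key = malloc(len + 1);

    if (key == NULL) {
        return NULL;
    }
    snprintf(key, len + 1, "%s_%s|%s", ns, path, mode);
    return key;
}
```

Two calls with the same namespace, path, and mode yield equal strings, so the second call can be used to find what the first one stored, while a different mode produces a different key.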
Here is a reimplementation of pfopen(), which proactively looks in EG(persistent_list) for a connection before it creates one:
PHP_FUNCTION(pfopen)
{
    char *path, *mode;
    int path_length, mode_length;
    char *hashed_details;
    int hashed_details_length;
    FILE *fh;
    list_entry *le;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ss", &path, &path_length,
                              &mode, &mode_length) == FAILURE) {
        return;
    }
    hashed_details_length = strlen("example_") + path_length + mode_length;
    hashed_details = emalloc(hashed_details_length + 1);
    snprintf(hashed_details, hashed_details_length + 1,
             "example_%s%s", path, mode);
    if (zend_hash_find(&EG(persistent_list), hashed_details,
                       hashed_details_length + 1, (void **) &le) == SUCCESS) {
        if (Z_TYPE_P(le) != persist) {
            /* not our resource */
            zend_error(E_WARNING, "Not a valid persistent file handle");
            efree(hashed_details);
            RETURN_FALSE;
        }
        fh = le->ptr;
    }
    else {
        fh = fopen(path, mode);
        if (fh) {
            list_entry new_le;
            Z_TYPE(new_le) = persist;
            new_le.ptr = fh;
            zend_hash_update(&EG(persistent_list), hashed_details,
                             hashed_details_length + 1, (void *) &new_le,
                             sizeof(list_entry), NULL);
        }
    }
    efree(hashed_details);
    if (fh) {
        ZEND_REGISTER_RESOURCE(return_value, fh, persist);
        return;
    }
    RETURN_FALSE;
}
You should notice the following about the new pfopen() function:
• You store new_le of type list_entry, which is identical to the type zend_rsrc_list_entry, in EG(persistent_list). This is a convenient structure to use for this purpose.
• You set and check that the type of new_le is the resource list ID. This protects against potential segfaults due to naming conflicts that can occur if another extension chooses an identical namespacing scheme (or you choose not to namespace your hashed_details string).
If you are using neither concurrent access resources (where two initialization calls might correctly return the same resource) nor persistent resources, you do not need to worry about storing information in the persistent list. Accessing data by its instantiation parameters is the hard way of doing things and is necessary only when you are (possibly) creating a new resource.