
Advanced PHP Programming


Chapter 21 Extending PHP: Part I

Then you can configure and build PHP as normal, enabling the extension:

> ./configure --with-apxs=/usr/local/apache/bin/apxs --enable-example
> make
> make install

To build an extension as a dynamically loadable shared object, the sources can be compiled outside the PHP source tree. From the source directory, you run this:

> phpize

This runs the PHP build system on the config.m4 file and creates a configuration script from it.

Then you configure and build the extension:

> ./configure --enable-example
> make
> make install

This builds and installs the extension in the shared extensions directory. Because it is a dynamic extension, it should also be enabled via the php.ini file, using the following:

extension=example.so

If you do not load the extension from the php.ini file, you need to load it at script execution time with the following code:

dl("example.so");

Modules loaded at execution time are unloaded at the end of the request, so they must be reloaded on every request. This is slow, so it should be done only when loading via the php.ini file is impossible for political or policy reasons. If you are uncertain whether an extension will be loaded from the php.ini file, the standard approach is to use the following block of code to detect whether the desired extension is already loaded and dynamically load the extension if it is not:

if (!extension_loaded("example")) {
    dl("example." . PHP_SHLIB_SUFFIX);
}

Using Functions

One of the common tasks in an extension is writing functions. Whether refactoring existing PHP code in C or wrapping a C library for use in PHP, you will be writing functions.

A Function Example

To introduce function writing, let’s go back to my old favorite, the Fibonacci Sequence function. First, you need a C function that can calculate Fibonacci numbers. Chapter 11, “Computational Reuse,” surveys a number of Fibonacci implementations. The tail recursive version is quite fast. Here is a direct port of the PHP auxiliary tail recursion function to C:

int fib_aux(int n, int next, int result)
{
    if (n == 0) {
        return result;
    }
    return fib_aux(n - 1, next + result, next);
}
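C compilers are not required to turn tail calls into loops, so for large n an iterative form may be preferable. The following is a minimal sketch in plain C, not part of the extension; the name fib_iter is an assumption made for illustration, and no overflow checking is attempted.

```c
/* Iterative equivalent of fib_aux(n, 1, 0).
 * fib_iter is a hypothetical name, not used elsewhere in this chapter. */
long fib_iter(long n)
{
    long result = 0;   /* F(0) */
    long next = 1;     /* F(1) */

    while (n-- > 0) {
        long sum = next + result;
        result = next;  /* advance one step along the sequence */
        next = sum;
    }
    return result;      /* negative n falls through and yields 0 */
}
```

Calling fib_iter(10) walks the loop ten times and returns the tenth Fibonacci number using constant stack space.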

After writing the core logic of the functions, you need to write the code that actually defines a PHP function. This happens in two parts. In the first part you define the function, and in the second you register the function with the extension so that it is registered in the global function table when the extension is loaded. Here is the definition of the function fibonacci():

PHP_FUNCTION(fibonacci)
{
    long n;
    long retval;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "l", &n) == FAILURE) {
        return;
    }
    if (n < 0) {
        zend_error(E_WARNING, "Argument must be a positive integer");
        RETURN_FALSE;
    }
    retval = fib_aux(n, 1, 0);
    RETURN_LONG(retval);
}

PHP functions are declared with the PHP_FUNCTION() macro. This macro does some critical name-munging on the function name (to avoid naming conflicts between extensions) and sets up the prototype of the function. (Internally, all functions are prototyped identically.) The only thing you need to know about how this macro works is that one of the parameters passed to the function is this:

zval *return_value

This variable holds the return value from the function. There are macros that you can use to assign to it in common cases, but occasionally you need to assign to it directly; however, the details of direct assignment are unimportant. If you stick with the macros (as you should, and as all the bundled extensions do), you do not need to probe further into the inner workings of this macro.


PHP functions are not passed arguments directly, but instead have to extract them from an argument stack that is set by the functions’ calling scope. zend_parse_parameters() extracts the variables passed into a function from PHP. The first argument passed to it, ZEND_NUM_ARGS() TSRMLS_CC, is actually two arguments. The first is a macro that determines the number of arguments passed on the stack, and the second, TSRMLS_CC, is a macro that passes the correct thread-safety-management data if PHP is compiled for thread safety.

The next argument passed, "l", specifies the type of data that is expected, in this case a long integer. The next argument, &n, is a pointer to the C variable that you want to fill out with the value of the argument. Because you expect a long, you pass in a pointer to a long.

zend_parse_parameters() returns SUCCESS if the number of arguments passed into the function matches the number of arguments searched for and if all the arguments can be successfully coerced into the types specified; otherwise, it returns FAILURE. On failure, it automatically sets the necessary warnings about the incorrect arguments passed to it, so you can simply return.

You should remember from Chapter 20, “PHP and Zend Engine Internals,” that PHP variables are not C types, but instead the special zval type. zend_parse_parameters() tries to handle all the hard work of type conversion for you. For variables that map easily to primitive C types (integers, floats, and character strings), this method works well and saves a lot of hassle. For more complex types, handling the actual zval is necessary.

After the arguments have been pulled into scope, the function is really just a C function. In fibonacci(), the nth Fibonacci value is calculated and set in retval. To return this value to the PHP user, you need to set it into return_value. For simple types there are macros to handle all this. In this case, RETURN_LONG(retval); correctly sets the type of return_value, sets its internal value holder to retval, and returns from the function.

To make this function available when you load the sample extension, you need to add it to the function_entry array, like this:

function_entry example_functions[] = {
    PHP_FE(fibonacci, NULL)
    {NULL, NULL, NULL}
};

The NULL after the PHP_FE() entry specifies argument-passing semantics (whether arguments are to be passed by reference, for example). In this case, the default passing by value is used.

If a function list appears before the functions are declared, you need to make a forward declaration of the function. This is commonly done in the header file php_example.h, as shown here:

PHP_FUNCTION(fibonacci);


Managing Types and Memory

Chapter 18, “Profiling,” provides a cautionary tale about the real-life performance implications of hex-encoding strings in PHP. The hexencode() and hexdecode() functions described in that chapter were designed to represent a character string as a hexadecimal string (for 8-bit-safe data transfer) and to reverse the process. In Chapter 18, I suggest that an alternative solution to using a workaround function would be to implement the encoding and decoding functions in C. This makes a nice function example.

You need a pair of C functions to perform this encoding. Each must take a char * string and its associated length and allocate and return its encoding or decoding. You pass the length into your functions instead of relying on a function such as strlen() so that your functions can be binary safe. In PHP, a string can actually contain arbitrary information, including null characters, so you need to pass in a string’s length so that you know where the string ends.
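The difference between length-based and strlen()-based handling can be seen in a few lines of plain C. This is only an illustrative sketch; count_byte() and visible_to_strlen() are hypothetical helpers, not part of PHP or of the extension.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts bytes equal to `needle` in a buffer whose
 * length is given explicitly, so embedded NUL bytes are ordinary data. */
size_t count_byte(const char *buf, size_t len, char needle)
{
    size_t i;
    size_t hits = 0;

    for (i = 0; i < len; i++) {
        if (buf[i] == needle) {
            hits++;
        }
    }
    return hits;
}

/* What a strlen()-based routine would see: only the bytes that precede
 * the first NUL in the buffer. */
size_t visible_to_strlen(const char *buf)
{
    return strlen(buf);
}
```

For the five-byte buffer "ab\0cd", the strlen()-based view stops after "ab", while the length-based helper still examines the trailing "cd".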

The function hexencode() works by first allocating a buffer twice the size of its input string (because a single character is represented by a two-position hex number). The source buffer is then stepped through character by character, and the first hexadecimal value for the upper 4 bits of the char is written, followed by the value for the lower 4 bits. The string is null-terminated and returned. Here is its implementation:

const char *hexchars = "0123456789ABCDEF";

char *hexencode(char *in, int in_length)
{
    char *result;
    int i;

    result = (char *) emalloc(2 * in_length + 1);
    for (i = 0; i < in_length; i++) {
        result[2*i] = hexchars[(in[i] & 0x000000f0) >> 4];
        result[2*i + 1] = hexchars[in[i] & 0x0000000f];
    }
    result[2*in_length] = '\0';
    return result;
}

Note that the result buffer is allocated using the emalloc() function. PHP and the Zend Engine use their own internal memory-management wrapper functions. Because any data that you eventually assign into a PHP variable will be cleaned up by the Zend Engine memory-management system, that memory must be allocated with the wrapper functions. Further, because using multiple memory managers is confusing and error prone, it is a best practice to always use the Zend Engine memory-management wrappers in PHP extensions.

Table 21.1 shows the memory-management functions you will commonly need.

Table 21.1 Memory Management Functions

Function                                   Usage
void *emalloc(size_t size)                 malloc() replacement
void efree(void *ptr)                      free() replacement
void *erealloc(void *ptr, size_t size)     realloc() replacement
char *estrndup(char *str, uint length)     strndup() replacement

All these functions use the engine’s memory system, which destroys all of its memory pools at the end of every request. This is fine for almost all variables because PHP is extremely well sandboxed and its symbol tables are all destroyed between requests anyway.

Occasionally, you might need to allocate memory that is persistent between requests. A typical reason to do this would be to allocate memory for a persistent resource. To do this, there are counterparts to all the preceding functions:

void *pemalloc(size_t size, int persistent)
void pefree(void *ptr, int persistent)
void *perealloc(void *ptr, size_t size, int persistent)
char *pestrndup(char *str, uint length, int persistent)

In all cases, persistent must be set to a nonzero value for the memory to be allocated as persistent memory. Internally, setting persistent instructs PHP to use malloc() to allocate memory instead of allocating from the PHP memory-management system.
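The distinction can be pictured with a toy allocator. The sketch below is not the Zend implementation; toy_pemalloc(), toy_live_request_blocks(), and toy_request_shutdown() are hypothetical names. Request-scoped blocks are tracked on a list and released together at "request end", while persistent blocks come straight from malloc() and are left alone.

```c
#include <stdlib.h>

/* Header stored in front of every request-scoped block. The long double
 * member only pads the header so the payload stays suitably aligned. */
typedef union toy_block {
    union toy_block *next;
    long double pad;
} toy_block;

static toy_block *toy_request_blocks = NULL;  /* owned by the current request */

/* persistent != 0: plain malloc(), survives toy_request_shutdown().
 * persistent == 0: tracked and freed in bulk at request shutdown. */
void *toy_pemalloc(size_t size, int persistent)
{
    toy_block *block;

    if (persistent) {
        return malloc(size);
    }
    block = (toy_block *) malloc(sizeof(toy_block) + size);
    if (block == NULL) {
        return NULL;
    }
    block->next = toy_request_blocks;
    toy_request_blocks = block;
    return (void *) (block + 1);  /* payload starts right after the header */
}

/* Number of request-scoped blocks that are still alive. */
size_t toy_live_request_blocks(void)
{
    size_t count = 0;
    toy_block *block;

    for (block = toy_request_blocks; block != NULL; block = block->next) {
        count++;
    }
    return count;
}

/* Releases every request-scoped block, as happens at the end of a request. */
void toy_request_shutdown(void)
{
    while (toy_request_blocks != NULL) {
        toy_block *next = toy_request_blocks->next;
        free(toy_request_blocks);
        toy_request_blocks = next;
    }
}
```

In this model, forgetting to free a request-scoped block is harmless because shutdown reclaims it, whereas a persistent block must be released explicitly by whoever owns it.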

You also need a hexdecode() function. This simply reverses the process in hexencode(): The encoded string is read in two characters at a time, and the characters are converted into their corresponding ASCII equivalents. Here is the code to perform hexdecode():

static __inline__ int char2hex(char a)
{
    return (a >= 'A' && a <= 'F') ? (a - 'A' + 10) : (a - '0');
}

char *hexdecode(char *in, int in_length)
{
    char *result;
    int i;

    result = (char *) emalloc(in_length/2 + 1);
    for (i = 0; i < in_length/2; i++) {
        result[i] = char2hex(in[2 * i]) * 16 + char2hex(in[2 * i+1]);
    }
    result[in_length/2] = '\0';
    return result;
}
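The decoder above trusts its input: it assumes an even length and uppercase hex digits. A more defensive variant might validate each digit first. The following is a sketch in plain C under those assumptions; hex_digit_value() and hexdecode_checked() are hypothetical names, and malloc() stands in for emalloc() so that the code compiles outside PHP.

```c
#include <stdlib.h>

/* Hypothetical helper: numeric value of one hex digit, or -1 if `c`
 * is not a hex digit. Accepts both upper- and lowercase letters. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decodes `in_length` hex characters into a freshly allocated buffer of
 * in_length / 2 bytes plus a terminating NUL. Returns NULL on odd length,
 * invalid digits, or allocation failure. The caller frees the result. */
char *hexdecode_checked(const char *in, int in_length)
{
    char *result;
    int i;

    if (in_length < 0 || in_length % 2 != 0) {
        return NULL;
    }
    result = (char *) malloc(in_length / 2 + 1);
    if (result == NULL) {
        return NULL;
    }
    for (i = 0; i < in_length / 2; i++) {
        int hi = hex_digit_value(in[2 * i]);
        int lo = hex_digit_value(in[2 * i + 1]);

        if (hi < 0 || lo < 0) {
            free(result);
            return NULL;
        }
        result[i] = (char) (hi * 16 + lo);
    }
    result[in_length / 2] = '\0';
    return result;
}
```

Inside an extension, the same checks could be kept while swapping malloc() and free() for emalloc() and efree().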


Of course, as with the Fibonacci Sequence example, these C functions are the workhorse routines. You also need PHP_FUNCTION wrappers, such as the following, for them:

PHP_FUNCTION(hexencode)
{
    char *in;
    char *out;
    int in_length;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &in, &in_length) == FAILURE) {
        return;
    }
    out = hexencode(in, in_length);
    RETURN_STRINGL(out, in_length * 2, 0);
}

PHP_FUNCTION(hexdecode)
{
    char *in;
    char *out;
    int in_length;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &in, &in_length) == FAILURE) {
        return;
    }
    out = hexdecode(in, in_length);
    RETURN_STRINGL(out, in_length/2, 0);
}

There are a couple of important details to note in these code examples:

• PHP_FUNCTION(hexencode) calls hexencode(). This is not a naming conflict because the PHP_FUNCTION() macro performs name munging.

• zend_parse_parameters() is set up to expect a string (the format section is "s"). Because string types in PHP are binary safe, when it accepts a string, it converts it into a char * (where the actual contents are allocated) as well as an int (which stores the length of the string).

• return_value is set via the macro RETURN_STRINGL(). This macro takes three parameters. The first is the start of a char * buffer, which holds the string, the second is the length of the string (binary safeness again), and the third is a flag to indicate whether the buffer should be duplicated for use in return_value. Because you allocated out personally, you do not need to duplicate it here (in fact, you would leak memory if you did). In contrast, if you are using a character buffer that does not belong to you, you should specify 1 to duplicate the buffer.


Parsing Strings

The two examples in the preceding section parse only a single parameter each. In fact, zend_parse_parameters() provides great flexibility in parameter parsing by allowing you to specify a format string that describes the complete set of expected parameters. Table 21.2 shows the format characters, the types they describe, and the actual user-defined C variable types each format fills out.

Table 21.2 zend_parse_parameters() Format Strings

Format   Type                          Takes
l        Long integer                  long *
d        Floating-point number         double *
s        String                        (char **, int *)
b        Boolean                       zend_bool *
r        PHP resource                  zval **
a        Array                         zval **
o        Object                        zval **
O        Object (of a specific type)   zval **, type name
z        zval                          zval **

For example, to specify that a function takes two strings and a long, you would use this:

PHP_FUNCTION(strncasecmp)
{
    char *string1, *string2;
    int string_length1, string_length2;
    long comp_length;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ssl",
                              &string1, &string_length1,
                              &string2, &string_length2,
                              &comp_length) == FAILURE) {
        return;
    }
    /* ... */
}

This example specifies a char **/int * pair for each string and a long * for the long. In addition, parameter modifiers in the format string let you mark arguments as optional, among other things (see Table 21.3).

 


Table 21.3 zend_parse_parameters() Parameter Modifiers

Parameter Modifier   Description
|                    Everything after a | is an optional argument.
!                    The preceding parameter can be a specified type or NULL. If NULL
                     is passed, the associated C pointer is also set to NULL. This is
                     valid only for the types that return zvals: types a, o, O, r, and z.
/                    The preceding parameter should be separated, meaning that if its
                     reference count is greater than 1, its data should be copied into
                     a fresh zval. This is good to use if you are modifying a zval (for
                     example, doing a forced-type conversion) and do not want to affect
                     any other users. This modifier is usable only for types a, o, O,
                     r, and z.
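To make the role of the format string concrete, here is a toy parser in plain C. It is not the Zend implementation: toy_arg, toy_parse(), and the supported subset (l, d, s, and the | modifier) are assumptions made purely for illustration, and unlike the real function it performs no type coercion. Each format character consumes one argument and writes it through the next pointer (or pointer pair, for strings) supplied by the caller.

```c
#include <stdarg.h>

typedef enum { TOY_LONG, TOY_DOUBLE, TOY_STRING } toy_type;

/* A greatly simplified stand-in for an argument passed from PHP. */
typedef struct {
    toy_type type;
    long lval;
    double dval;
    const char *str;
    int str_len;
} toy_arg;

/* Returns 0 on success, -1 on failure (too few or too many arguments,
 * a type mismatch, or an unknown format character). */
int toy_parse(int argc, const toy_arg *argv, const char *fmt, ...)
{
    va_list ap;
    int i = 0;         /* index of the next argument to consume */
    int optional = 0;  /* set once '|' has been seen */
    int status = 0;

    va_start(ap, fmt);
    for (; *fmt != '\0' && status == 0; fmt++) {
        if (*fmt == '|') {
            optional = 1;
            continue;
        }
        if (i >= argc) {
            /* Out of arguments: acceptable only if the rest is optional. */
            if (!optional) {
                status = -1;
            }
            break;
        }
        switch (*fmt) {
        case 'l':
            if (argv[i].type != TOY_LONG) { status = -1; break; }
            *va_arg(ap, long *) = argv[i].lval;
            break;
        case 'd':
            if (argv[i].type != TOY_DOUBLE) { status = -1; break; }
            *va_arg(ap, double *) = argv[i].dval;
            break;
        case 's':
            if (argv[i].type != TOY_STRING) { status = -1; break; }
            *va_arg(ap, const char **) = argv[i].str;  /* buffer ... */
            *va_arg(ap, int *) = argv[i].str_len;      /* ... and length */
            break;
        default:
            status = -1;
            break;
        }
        i++;
    }
    if (status == 0 && i < argc) {
        status = -1;  /* more arguments than the format describes */
    }
    va_end(ap);
    return status;
}
```

With the format "s|l", a call that supplies only the string succeeds and leaves the caller's long untouched, which is why optional parameters are normally initialized to their defaults before parsing.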

Other Return Macros

You have already seen two of the return macros, RETURN_STRINGL and RETURN_LONG, which set the value of return_value and return. Table 21.4 shows the full range of return macros.

Table 21.4 Return Macros

Macro                                  Description
RETURN_BOOL(zend_bool value)           Sets return_value from the Boolean value value.
RETURN_NULL()                          Sets return_value to null.
RETURN_TRUE                            Sets return_value to true.
RETURN_FALSE                           Sets return_value to false.
RETURN_LONG(long value)                Sets return_value from the long integer value.
RETURN_DOUBLE(double value)            Sets return_value from the double value.
RETURN_EMPTY_STRING()                  Sets return_value to the empty string "".
RETURN_STRING(char *string,            Sets return_value from the character buffer string and a
    int duplicate)                     flag to indicate whether the buffer memory should be used
                                       directly or copied. This is not binary safe; it uses
                                       strlen() to calculate the length of string.
RETURN_STRINGL(char *string,           Sets return_value from the character buffer string of the
    int length, int duplicate)         specified length length and a flag to indicate whether the
                                       buffer memory should be used directly or copied. This is
                                       binary safe.


Manipulating Types

To understand how to set more complex values for return_value, you need to better understand how to manipulate zvals. As described in Chapter 20, variables in PHP are all represented by the zval type, which is a composite of all the possible PHP base types. This strategy permits PHP’s weak and dynamic typing semantics.
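As a rough mental model, and not the engine's actual definition, a zval can be pictured as a tagged union: a type tag plus storage shared by the base types. In the sketch below, toy_zval, the TOY_ZVAL_* macros, and toy_zval_as_long() are hypothetical names chosen to echo the real ones; strings, arrays, and reference counting are left out.

```c
typedef enum { TOY_IS_NULL, TOY_IS_LONG, TOY_IS_DOUBLE, TOY_IS_BOOL } toy_zval_type;

typedef struct {
    toy_zval_type type;  /* says which union member is currently valid */
    union {
        long lval;       /* used for longs and Booleans */
        double dval;     /* used for doubles */
    } value;
} toy_zval;

/* Assignment macros: each sets the tag and the matching union member. */
#define TOY_ZVAL_NULL(z)      do { (z)->type = TOY_IS_NULL; } while (0)
#define TOY_ZVAL_LONG(z, v)   do { (z)->type = TOY_IS_LONG; (z)->value.lval = (v); } while (0)
#define TOY_ZVAL_DOUBLE(z, v) do { (z)->type = TOY_IS_DOUBLE; (z)->value.dval = (v); } while (0)
#define TOY_ZVAL_BOOL(z, v)   do { (z)->type = TOY_IS_BOOL; (z)->value.lval = ((v) != 0); } while (0)

/* Weak typing in miniature: read any toy_zval as a long by inspecting
 * the tag and converting from whichever member is active. */
long toy_zval_as_long(const toy_zval *z)
{
    switch (z->type) {
    case TOY_IS_LONG:
    case TOY_IS_BOOL:
        return z->value.lval;
    case TOY_IS_DOUBLE:
        return (long) z->value.dval;  /* truncates toward zero */
    default:
        return 0;                     /* null converts to 0 */
    }
}
```

The same variable can hold a long at one moment and a double the next; only the tag decides how its storage is interpreted.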

When you want to create a variable that will be manipulated within PHP, that variable needs to be a zval. The normal creation process is to declare it and allocate it with a built-in macro, as in the following example:

zval *var;
MAKE_STD_ZVAL(var);

This allocates var and correctly sets its reference counters.

After the zval has been created, you can assign to it. For simple types (numbers, strings, Booleans), there are simple macros for this:

ZVAL_NULL(zval *var)
ZVAL_BOOL(zval *var, zend_bool value)
ZVAL_LONG(zval *var, long value)
ZVAL_DOUBLE(zval *var, double value)
ZVAL_EMPTY_STRING(zval *var)
ZVAL_STRINGL(zval *var, char *string, int length, int duplicate)
ZVAL_STRING(zval *var, char *string, int duplicate)

These macros look very similar to the similarly named RETURN_ macros. They share identical assignment semantics. These macros all set scalar variables. To create an array, you use the following code:

zval *array;
MAKE_STD_ZVAL(array);
array_init(array);

Now array is an empty array zval. As with scalar zvals, there are convenience functions for adding simple types to arrays:

add_assoc_long(zval *arg, char *key, long value);
add_assoc_bool(zval *arg, char *key, int value);
add_assoc_resource(zval *arg, char *key, int value);
add_assoc_double(zval *arg, char *key, double value);
add_assoc_string(zval *arg, char *key, char *string, int duplicate);
add_assoc_stringl(zval *arg, char *key, char *string,
                  int string_length, int duplicate);
add_assoc_zval(zval *arg, char *key, zval *value);


All these except the last should be relatively obvious: They support automatically adding base types to an array, keyed by the specified key. These functions uniformly return SUCCESS on success and FAILURE on failure.

For example, to create a C function that is identical to this PHP function:

function colors()
{
    return array('Apple' => 'Red',
                 'Banana' => 'Yellow',
                 'Cranberry' => 'Maroon');
}

you would write this:

PHP_FUNCTION(colors)
{
    array_init(return_value);
    add_assoc_string(return_value, "Apple", "Red", 1);
    add_assoc_string(return_value, "Banana", "Yellow", 1);
    add_assoc_string(return_value, "Cranberry", "Maroon", 1);
    return;
}

Note the following:

• return_value is allocated outside PHP_FUNCTION, so it does not need to be acted on by MAKE_STD_ZVAL.

• Because return_value is passed in, you do not return it at the end of the function; you simply use return.

• Because the string values being used ("Red", "Yellow", "Maroon") are string literals, which are not allocated with emalloc(), you need to duplicate them. Any memory not allocated with emalloc() should be duplicated if used to create a string zval.

The add_assoc_zval() function allows you to add an arbitrary zval to an array. This is useful if you need to add a nonstandard type, to create, for instance, a multidimensional array. The following PHP function generates a simple multidimensional array:

function people()
{
    return array(
        'george' => array('FullName' => 'George Schlossnagle',
                          'uid'      => 1001,
                          'gid'      => 1000),
        'theo'   => array('FullName' => 'Theo Schlossnagle',
                          'uid'      => 1002,
                          'gid'      => 1000));
}
