
Advanced PHP Programming


488 Chapter 20 PHP and Zend Engine Internals

    HashTable function_table;
    HashTable default_properties;
    HashTable properties_info;
    HashTable class_table;
    HashTable *static_members;
    HashTable constants_table;
    zend_function_entry *builtin_functions;

    union _zend_function *constructor;
    union _zend_function *destructor;
    union _zend_function *clone;
    union _zend_function *__get;
    union _zend_function *__set;
    union _zend_function *__call;

    /* handlers */
    zend_object_value (*create_object)(zend_class_entry *class_type TSRMLS_DC);

    zend_class_entry **interfaces;
    zend_uint num_interfaces;

    char *filename;
    zend_uint line_start;
    zend_uint line_end;
    char *doc_comment;
    zend_uint doc_comment_len;
};

Like the main execution scope, a class contains its own function table (for holding class methods) and its own constants table. The class entry also contains a number of other items, including tables for its attributes (for example, default_properties, properties_info, static_members) as well as the interfaces it implements, its constructor, its destructor, its clone, and its overloadable access functions. In addition, there is the create_object function pointer, which, if defined, is used to create a new object and define its handlers, which allow for fine-grained control of how that object is accessed.

One of the major changes in PHP 5 is the object model. In PHP 4, when you create an object, you are returned a zval whose zvalue_value looks like this:

typedef struct _zend_object {
    zend_class_entry *ce;
    HashTable *properties;
} zend_object;

This means that zend_objects in PHP 4 are little more than hashtables (of attributes) with a zend_class_entry floating around to hold their methods. When objects are passed to functions, they are copied (as all other variable types are), and implementing controls of attribute accessors is extremely hackish.

In PHP 5, an object’s zval contains a zend_object_value, like this:

struct _zend_object_value {
    zend_object_handle handle;
    zend_object_handlers *handlers;
};

The zend_object_value in turn contains a zend_object_handle (an integer that identifies the location of the object in a global object store—effectively a pointer to the object proper) and a set of handlers, which regulate all accesses to the object.

This intrinsically changes the way that objects are handled in PHP. In PHP 5, when an object’s zval is copied (as happens on assignment or when passed into a function), the data is not copied; another reference to the object is created. These semantics are much more standard and correspond to the object semantics in Java, Python, Perl, and other languages.
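The handle-based semantics can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the toy_ names, the fixed-size store, and the integer "value" standing in for an object's properties are all hypothetical and are not part of the Zend Engine source.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STORE_SIZE 16

/* A toy "object proper": it lives in the store and is never copied. */
typedef struct {
    int in_use;
    int refcount;
    int value;          /* stand-in for the object's properties */
} toy_object;

/* A toy global object store, indexed by handle. */
typedef struct {
    toy_object slots[TOY_STORE_SIZE];
} toy_store;

/* What a variable holds: only a handle (compare zend_object_value). */
typedef struct {
    int handle;
} toy_object_value;

/* Allocate a slot; the returned handle is -1 if the store is full. */
toy_object_value toy_new(toy_store *store, int value)
{
    toy_object_value v = { -1 };
    for (int i = 0; i < TOY_STORE_SIZE; i++) {
        if (!store->slots[i].in_use) {
            store->slots[i].in_use = 1;
            store->slots[i].refcount = 1;
            store->slots[i].value = value;
            v.handle = i;
            break;
        }
    }
    return v;
}

/* Resolve a handle to the object it names. */
toy_object *toy_fetch(toy_store *store, toy_object_value v)
{
    assert(v.handle >= 0 && v.handle < TOY_STORE_SIZE);
    assert(store->slots[v.handle].in_use);
    return &store->slots[v.handle];
}

/* "Assignment": copy the handle and count the new reference.
   The object's data is shared, not duplicated. */
toy_object_value toy_assign(toy_store *store, toy_object_value src)
{
    toy_fetch(store, src)->refcount++;
    return src;
}
```

In this model, a write made through the copy is visible through the original, because both values carry the same handle. Under the PHP 4 scheme described earlier, the equivalent operation would have duplicated the properties instead.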

The Object Handlers

In PHP 5 it is possible (in the extension API) to control almost all access to an object and its properties. A handler API is provided that implements the following access handlers:

typedef struct _zend_object_handlers {
    /* general object functions */
    zend_object_add_ref_t add_ref;
    zend_object_del_ref_t del_ref;
    zend_object_delete_obj_t delete_obj;
    zend_object_clone_obj_t clone_obj;
    /* individual object functions */
    zend_object_read_property_t read_property;
    zend_object_write_property_t write_property;
    zend_object_read_dimension_t read_dimension;
    zend_object_write_dimension_t write_dimension;
    zend_object_get_property_ptr_ptr_t get_property_ptr_ptr;
    zend_object_get_t get;
    zend_object_set_t set;
    zend_object_has_property_t has_property;
    zend_object_unset_property_t unset_property;
    zend_object_has_dimension_t has_dimension;
    zend_object_unset_dimension_t unset_dimension;
    zend_object_get_properties_t get_properties;
    zend_object_get_method_t get_method;
    zend_object_call_method_t call_method;
    zend_object_get_constructor_t get_constructor;
    zend_object_get_class_entry_t get_class_entry;
    zend_object_get_class_name_t get_class_name;
    zend_object_compare_t compare_objects;
    zend_object_cast_t cast_object;
} zend_object_handlers;

We’ll explore each handler in greater depth in Chapter 22, “Extending PHP: Part II,” where you’ll actually implement extension classes. In the meantime, you just need to know that the handler names offer a relatively clear indication as to what they do. For example, add_ref is called whenever a reference to an object is added:

$object2 = $object;

and compare_objects is called whenever two objects are compared by using the == (is_equal) operator:

if($object2 == $object) {}
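The way a handler table redirects these two operations can be sketched in plain C. The following is a reduced model, not engine code: it keeps only two of the handler slots, and every toy_ name as well as the "compare by absolute value" handler is a hypothetical illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_obj toy_obj;

/* A reduced handler table with two of the slots discussed above. */
typedef struct {
    void (*add_ref)(toy_obj *obj);
    int  (*compare_objects)(const toy_obj *a, const toy_obj *b);
} toy_handlers;

struct toy_obj {
    const toy_handlers *handlers;
    int refcount;
    int value;
};

static void std_add_ref(toy_obj *obj) { obj->refcount++; }

/* Default comparison: returns <0, 0, or >0 like a three-way compare. */
static int std_compare(const toy_obj *a, const toy_obj *b)
{
    return (a->value > b->value) - (a->value < b->value);
}

/* A custom comparison that ignores the sign of the value. */
static int abs_compare(const toy_obj *a, const toy_obj *b)
{
    int av = a->value < 0 ? -a->value : a->value;
    int bv = b->value < 0 ? -b->value : b->value;
    return (av > bv) - (av < bv);
}

const toy_handlers toy_std_handlers = { std_add_ref, std_compare };
const toy_handlers toy_abs_handlers = { std_add_ref, abs_compare };

/* Models `$object2 = $object;`: the add_ref handler is invoked. */
toy_obj *toy_assign(toy_obj *obj)
{
    assert(obj != NULL && obj->handlers != NULL);
    obj->handlers->add_ref(obj);
    return obj;
}

/* Models `$object2 == $object`: the left operand's handler decides. */
int toy_is_equal(const toy_obj *a, const toy_obj *b)
{
    assert(a != NULL && b != NULL);
    return a->handlers->compare_objects(a, b) == 0;
}
```

Swapping the handler table on an object changes the outcome of the comparison without touching the object's data, which is the kind of control the handler API is meant to give extension authors.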

Object Creation

In the Zend Engine version 2, object creation happens in two phases. When you call this:

$object = new ClassName;

a new zend_object is created and placed in the object store, and a handle to it is assigned to $object. By default (as happens when you instantiate a userspace class), the object is allocated by using the default allocator, and it is assigned the default access handlers. Alternatively, if the class’s zend_class_entry has its create_object function defined, that function is called to handle the allocation of the object and returns the array of zend_object_handlers for that object.

This level of control is especially useful if you need to override the basic operations of an object and if you need to store resource data in an object that should not be touched by the normal memory management mechanisms. The Java and mono extensions both use these facilities to allow PHP to instantiate and access objects from those other languages.

Only after the zend_object_value is created is the constructor called on the object. Even in extensions, the constructor (and destructor and clone) are “normal” zend_functions. They do not alter the object’s access handlers, which have already been established.
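The two phases can be sketched as follows. This is a simplified model under stated assumptions: the toy_ types and functions are hypothetical, the handler tables carry only a label, and heap allocation with calloc stands in for the engine's object store.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_class toy_class;
typedef struct toy_object toy_object;

/* Placeholder handler table; a label is enough to tell two tables apart. */
typedef struct { const char *label; } toy_handlers;

struct toy_object {
    const toy_class *ce;
    const toy_handlers *handlers;
    int constructed;
};

struct toy_class {
    toy_object *(*create_object)(const toy_class *ce);  /* optional override */
    void (*constructor)(toy_object *obj);               /* optional */
};

const toy_handlers toy_default_handlers = { "default" };
const toy_handlers toy_custom_handlers  = { "custom" };

/* Shared allocation helper: binds the class and a handler table. */
static toy_object *toy_alloc(const toy_class *ce, const toy_handlers *h)
{
    toy_object *obj = calloc(1, sizeof *obj);
    if (obj != NULL) {
        obj->ce = ce;
        obj->handlers = h;
    }
    return obj;
}

/* A class-specific create_object that installs its own handlers. */
toy_object *toy_custom_create(const toy_class *ce)
{
    return toy_alloc(ce, &toy_custom_handlers);
}

/* An ordinary constructor: it runs after the handlers are fixed. */
void toy_construct(toy_object *obj) { obj->constructed = 1; }

/* Models `new ClassName`. The caller releases the result with free(). */
toy_object *toy_new(const toy_class *ce)
{
    assert(ce != NULL);
    /* Phase 1: allocate the object and establish its handlers. */
    toy_object *obj = ce->create_object
        ? ce->create_object(ce)
        : toy_alloc(ce, &toy_default_handlers);
    if (obj == NULL) {
        return NULL;
    }
    /* Phase 2: only now call the constructor. */
    if (ce->constructor != NULL) {
        ce->constructor(obj);
    }
    return obj;
}
```

A class with no create_object ends up with the default handlers, a class that supplies one gets whatever table its function installs, and in both cases the constructor sees an object whose handlers are already in place.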

Other Important Structures

In addition to the function and class tables, there are a few other important global data structures worth mentioning. Knowledge of how these work isn’t terribly important for a user of PHP, but it can be useful if you want to modify how the engine itself works. Most of these are elements of either the compiler_globals struct or the executor_globals struct and are most often referenced in the source via the macros CG() and EG(), respectively. These are some of the global data structures you should know about:

- CG(function_table) and EG(function_table): These refer to the function table we’ve talked about up until now. It exists in both the compiler and executor globals. Iterating through this hashtable gives you every callable function.

- CG(class_table) and EG(class_table): These refer to the hashtable in which all the classes are stored.

- EG(symbol_table): This refers to a hashtable that is the main (that is, global) symbol table. This is where all the variables in the global scope are stored.

- EG(active_symbol_table): This refers to a hashtable that contains the symbol table for the current scope.

- EG(zend_constants): This refers to the constants hashtable, where constants set with the function define are stored.

- CG(auto_globals): This refers to the hashtable of autoglobals ($_SERVER, $_ENV, $_POST, and so on) that are used in the script. This is a compiler global so that the autoglobals can be conditionally initialized only if the script utilizes them. This boosts performance because it avoids the work of initializing and populating these variables when they are not needed.

- EG(regular_list): This refers to a hashtable that is used to store “regular” (that is, nonpersistent) resources. Resources here are PHP resource-type variables, such as streams, file pointers, database connections, and so on. You’ll learn more about how these are used in Chapter 22.

- EG(persistent_list): This is like EG(regular_list), but EG(persistent_list) resources are not freed at the end of every request (persistent database connections, for example).

- EG(user_error_handler): This is a pointer to a zval that contains the name of the current user_error_handler function (as set via the set_error_handler function). If no error-handler function is set, it is NULL.

- EG(user_error_handlers): This refers to the stack of error-handler functions.

- EG(user_exception_handler): This is a pointer to a zval that contains the name of the current global exception handler, as set via the function set_exception_handler. If none has been set, it is NULL.

- EG(user_exception_handlers): This refers to the stack of global exception handlers.

- EG(exception): This is an important structure. Whenever an exception is thrown, EG(exception) is set to the zval of the object that is thrown. Whenever a function call returns, EG(exception) is checked. If it is not NULL, execution halts and the script jumps to the op for the appropriate catch block. We will explore throwing exceptions from within extension code in depth in Chapter 21, “Extending PHP: Part I,” and Chapter 22.

- EG(ini_directives): This refers to a hashtable of the php.ini directives that are set in this execution context.

This is just a selection of the globals set in executor_globals and compiler_globals. The globals listed here were chosen either because they are used in interesting optimizations in the engine (the just-in-time population of autoglobals) or because you will want to interact with them in extensions (such as resource lists).
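The accessor-macro pattern and the exception check described for EG(exception) can be modeled in a few lines. This is a sketch only: the toy_ struct, its two fields, and the TOY_EG() macro are hypothetical, and the macro mirrors only the simple case of a single global struct (a thread-safe build would need more than plain member access).

```c
#include <assert.h>
#include <stddef.h>

/* A toy stand-in for executor_globals with just two members. */
typedef struct {
    const char *exception;   /* NULL while nothing has been thrown */
    int calls_made;
} toy_executor_globals;

toy_executor_globals toy_globals;   /* zero-initialized at program start */

/* Accessor macro: callers name a member, not the struct that holds it. */
#define TOY_EG(v) (toy_globals.v)

typedef void (*toy_fn)(void);

void toy_ok(void) { }
void toy_throws(void) { TOY_EG(exception) = "boom"; }

/* Call each function in turn and check the exception slot after every
   return. Yields the index of the call that threw, or -1 if none did. */
int toy_run(toy_fn *fns, int count)
{
    assert(fns != NULL || count == 0);
    TOY_EG(exception) = NULL;
    for (int i = 0; i < count; i++) {
        fns[i]();
        TOY_EG(calls_made)++;
        if (TOY_EG(exception) != NULL) {
            return i;   /* the engine would jump to the catch block here */
        }
    }
    return -1;
}
```

Because every access goes through the macro, the storage behind it can change without touching the call sites, which is one plausible reason the source references these globals through CG() and EG() rather than by naming the structs directly.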

The Principle of Sandboxing

The principle of sandboxing is that nothing that a user does in handling one request should in any way affect a subsequent request. PHP is an extremely well-sandboxed language in that at the end of every request, the interpreter is returned to a clean starting state. This specifically entails the following:

- All function and class tables have all ZEND_USER_FUNCTION and ZEND_USER_CLASS (that is, all userspace-defined functions and classes) removed.

- All op arrays for any parsed files are discarded. (They are actually discarded immediately after use.)

- The symbol tables and constants tables are completely cleaned of all data.

- All resources not on the persistent list are destructed.

Solutions such as mod_perl make it easy to accidentally instantiate global variables that have persistent (and thus potentially unexpected) values between requests. PHP’s request-end sterilization makes that sort of problem almost impossible. It also means that data that is known not to change between requests (for example, the compilation results of a file) needs to be regenerated on every request in which it is used. As we’ve discussed before in relation to compiler caches such as APC, IonCube, and the Zend Accelerator, avoiding certain aspects of this sandboxing can be beneficial from a performance standpoint. We’ll look at some methods for that in Chapter 23.
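The first cleanup step, removing userspace functions while keeping built-in ones, can be sketched with a flat array in place of the engine's hashtable. All names here (toy_function_table, toy_request_shutdown, and the two-value toy_kind enum) are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Two kinds of entry, loosely modeled on internal versus user functions. */
enum toy_kind { TOY_INTERNAL_FUNCTION, TOY_USER_FUNCTION };

typedef struct {
    const char *name;
    enum toy_kind kind;
} toy_function;

#define TOY_MAX_FUNCS 8

typedef struct {
    toy_function entries[TOY_MAX_FUNCS];
    size_t count;
} toy_function_table;

/* Add an entry; returns 0 on success, -1 if the table is full. */
int toy_register(toy_function_table *t, const char *name, enum toy_kind kind)
{
    assert(t != NULL && name != NULL);
    if (t->count == TOY_MAX_FUNCS) {
        return -1;
    }
    t->entries[t->count].name = name;
    t->entries[t->count].kind = kind;
    t->count++;
    return 0;
}

/* End of request: drop every user-defined entry, keep the built-ins in
   their original order, and report how many entries were removed. */
size_t toy_request_shutdown(toy_function_table *t)
{
    assert(t != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].kind == TOY_INTERNAL_FUNCTION) {
            t->entries[kept++] = t->entries[i];
        }
    }
    size_t removed = t->count - kept;
    t->count = kept;
    return removed;
}
```

After the shutdown call the table holds exactly what it held before the request defined anything, so the next request starts from the same state. The same filter-by-kind idea applies to the class table and to resources that are not on the persistent list.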

The PHP Request Life Cycle

Now that you have a decent understanding of how the Zend Engine works, let’s look at how the engine sits inside PHP and how PHP itself sits inside other applications.

Any discussion of the architecture of PHP starts with a diagram such as Figure 20.2, which shows the application layers in PHP.

The outermost layer, where PHP interacts with other applications, is the Server Abstraction API (SAPI) layer. The SAPI layer partially handles the startup and shutdown of PHP inside an application, and it provides hooks for handling data such as cookies and POST data in an application-agnostic manner.

[Figure 20.2 The architecture of PHP. Labels in the figure: Application (apache, thttpd, cli, etc.); SAPI (see Chapter 23); PHP API (streams, output, etc.; see Chapter 22); PHP Extensions (mysql, standard library, etc.; see Chapter 22); Modular Code; Zend API; Zend Extension API (see Chapter 23); Zend Engine.]

Below the SAPI layer lies the PHP engine itself. The core PHP code handles setting up the running environment (populating global variables and setting default .ini options), providing interfaces such as the streams I/O interface, parsing data, and, most importantly, providing an interface for loading extensions (both statically compiled extensions and dynamically loaded extensions).

At the core of PHP lies the Zend Engine, which we have discussed in depth here. As you’ve seen, the Zend Engine fully handles the parsing and execution of scripts. The Zend Engine was also designed for extensibility and allows for entirely overriding its basic functionality (compilation, execution, and error handling), overriding selective portions of its behavior (overriding op_handlers in particular ops), and having functions called on registerable hooks (on every function call, on every opcode, and so on). These features allow for easy integration of caches, profilers, debuggers, and semantics-altering extensions.


The SAPI Layer

The SAPI layer is the abstraction layer that allows for easy embedding of PHP into other applications. Some SAPIs include the following:

- mod_php5: This is the PHP module for Apache, and it is a SAPI that embeds PHP into the Apache Web server.

- fastcgi: This is an implementation of FastCGI that provides a scalable extension to the CGI standard. FastCGI is a persistent CGI daemon that can handle multiple requests. FastCGI is the preferred method of running PHP under IIS and shows performance almost as good as that of mod_php5.

- CLI: This is the standalone interpreter for running PHP scripts from the command line, and it is a thin wrapper around a SAPI layer.

- embed: This is a general-purpose SAPI that is designed to provide a C library interface for embedding a PHP interpreter in an arbitrary application.

The idea is that regardless of the application, PHP needs to communicate with an application in a number of common places, so the SAPI interface provides a hook for each of those places. When an application needs to start up PHP, for instance, it calls the startup hook. Conversely, when PHP wants to output information, it uses the provided ub_write hook, which the SAPI layer author has coded to use the correct output method for the application PHP is running in.
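The output hook can be illustrated with a reduced model in which the "engine" side writes only through a function pointer supplied by the host. This is a sketch, not the real SAPI interface: the toy_sapi struct keeps a single hook, drops the thread-safety arguments, and all toy_ names and the fixed capture buffer are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A one-hook "SAPI": just a name and an output callback. */
typedef struct {
    const char *name;
    int (*ub_write)(const char *str, unsigned int len);
} toy_sapi;

/* Command-line style host: output goes straight to standard output. */
int toy_cli_write(const char *str, unsigned int len)
{
    return (int)fwrite(str, 1, len, stdout);
}

/* Embedding style host: output is captured in memory instead. */
char toy_capture[256];
unsigned int toy_capture_len;

int toy_buffer_write(const char *str, unsigned int len)
{
    unsigned int room = (unsigned int)sizeof toy_capture - 1 - toy_capture_len;
    if (len > room) {
        len = room;                     /* truncate rather than overflow */
    }
    memcpy(toy_capture + toy_capture_len, str, len);
    toy_capture_len += len;
    toy_capture[toy_capture_len] = '\0';
    return (int)len;
}

/* The "engine" side: it never knows where the bytes end up. */
int toy_echo(const toy_sapi *sapi, const char *str)
{
    assert(sapi != NULL && sapi->ub_write != NULL);
    return sapi->ub_write(str, (unsigned int)strlen(str));
}
```

The same toy_echo call produces terminal output with one host and an in-memory string with the other. That separation is what lets a single interpreter core run unchanged inside a Web server, a command-line tool, or an arbitrary embedding application.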

To understand the capabilities of the SAPI layer, it is easiest to look at the hooks it implements. Every SAPI interface registers the following struct with PHP, describing the callbacks it implements:

struct _sapi_module_struct {
    char *name;
    char *pretty_name;
    int (*startup)(struct _sapi_module_struct *sapi_module);
    int (*shutdown)(struct _sapi_module_struct *sapi_module);
    int (*activate)(TSRMLS_D);
    int (*deactivate)(TSRMLS_D);
    int (*ub_write)(const char *str, unsigned int str_length TSRMLS_DC);
    void (*flush)(void *server_context);
    struct stat *(*get_stat)(TSRMLS_D);
    char *(*getenv)(char *name, size_t name_len TSRMLS_DC);
    void (*sapi_error)(int type, const char *error_msg, ...);
    int (*header_handler)(sapi_header_struct *sapi_header,
                          sapi_headers_struct *sapi_headers TSRMLS_DC);
    int (*send_headers)(sapi_headers_struct *sapi_headers TSRMLS_DC);
    void (*send_header)(sapi_header_struct *sapi_header,
                        void *server_context TSRMLS_DC);
    int (*read_post)(char *buffer, uint count_bytes TSRMLS_DC);
    char *(*read_cookies)(TSRMLS_D);
    void (*register_server_variables)(zval *track_vars_array TSRMLS_DC);
    void (*log_message)(char *message);
    char *php_ini_path_override;
    void (*block_interruptions)(void);
    void (*unblock_interruptions)(void);
    void (*default_post_reader)(TSRMLS_D);
    void (*treat_data)(int arg, char *str, zval *destArray TSRMLS_DC);
    char *executable_location;
    int php_ini_ignore;
    int (*get_fd)(int *fd TSRMLS_DC);
    int (*force_http_10)(TSRMLS_D);
    int (*get_target_uid)(uid_t * TSRMLS_DC);
    int (*get_target_gid)(gid_t * TSRMLS_DC);
    unsigned int (*input_filter)(int arg, char *var,
                                 char **val, unsigned int val_len TSRMLS_DC);
    void (*ini_defaults)(HashTable *configuration_hash);
    int phpinfo_as_text;
};

The following are some of the notable elements from this example:

- startup: This is called the first time the SAPI is initialized. In an application that will serve multiple requests, this is performed only once. For example, in mod_php5, this is performed in the parent process before children are forked.

- activate: This is called at the beginning of each request. It reinitializes all the per-request SAPI data structures.

- deactivate: This is called at the end of each request. It ensures that all data has been correctly flushed to the application, and then it destroys all the per-request data structures.

- shutdown: This is called at interpreter shutdown. It destroys all the SAPI structures.

- ub_write: This is what PHP will use to output data to the client. In the CLI SAPI, this is as simple as writing to standard output; in mod_php5, the Apache library call rwrite is called.

- sapi_error: This is a handler for reporting errors to the application. Most SAPIs use php_error, which instructs PHP to use its own internal error system.

- flush: This tells the application to flush its output. In the CLI, this is implemented via the C library call fflush; mod_php5 uses the Apache library rflush.

- send_header: This sends a single specified header to the client. Some servers (such as Apache) have built-in functions for handling header transmission. Others (such as the PHP CGI) require you to manually send them. Others still (such as the CLI) do not handle sending headers at all.

- send_headers: This sends all headers to the client.

- read_cookies: During SAPI activation, if a read_cookies handler is defined, it will be called to populate SG(request_info).cookie_data. This is then used to populate the $_COOKIE autoglobal.

- read_post: During SAPI activation, if the request method is a POST (or if the php.ini variable always_populate_raw_post_data is true), the read_post handler is called to populate $HTTP_RAW_POST_DATA and $_POST.

Chapter 23 takes a closer look at using the SAPI interface to integrate PHP into applications and does a complete walkthrough of the CGI SAPI.
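The ordering of the four life-cycle hooks described above (startup once, activate and deactivate around each request, shutdown once) can be sketched with a toy driver. Everything here is a hypothetical model: the toy_sapi_module struct keeps only those four hooks, and the trace string exists purely to make the call order observable.

```c
#include <assert.h>
#include <string.h>

/* Only the four life-cycle hooks, without the thread-safety arguments. */
typedef struct toy_sapi_module {
    int (*startup)(struct toy_sapi_module *m);
    int (*activate)(void);
    int (*deactivate)(void);
    int (*shutdown)(struct toy_sapi_module *m);
} toy_sapi_module;

/* One character is recorded per hook call, in call order. */
char toy_trace[64];

static void toy_note(char c)
{
    size_t n = strlen(toy_trace);
    if (n + 1 < sizeof toy_trace) {
        toy_trace[n] = c;
        toy_trace[n + 1] = '\0';
    }
}

static int on_startup(toy_sapi_module *m)  { (void)m; toy_note('S'); return 0; }
static int on_activate(void)               { toy_note('a'); return 0; }
static int on_deactivate(void)             { toy_note('d'); return 0; }
static int on_shutdown(toy_sapi_module *m) { (void)m; toy_note('X'); return 0; }

toy_sapi_module toy_module = { on_startup, on_activate, on_deactivate, on_shutdown };

/* A host that serves a fixed number of requests and then exits.
   Returns the shutdown hook's result, or -1 if startup failed. */
int toy_serve(toy_sapi_module *m, int requests)
{
    assert(m != NULL && requests >= 0);
    if (m->startup(m) != 0) {
        return -1;
    }
    for (int i = 0; i < requests; i++) {
        m->activate();
        /* the script for this request would execute here */
        m->deactivate();
    }
    return m->shutdown(m);
}
```

Serving two requests records one startup, two activate/deactivate pairs, and one shutdown, in that order. A long-running host such as a Web server module follows the same shape, with the request loop running for the lifetime of the process.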

The PHP Core

There are several key steps in activating and running a PHP interpreter. When an application wants to start a PHP interpreter, it starts by calling php_module_startup. This function is like the master switch that turns on the interpreter. It activates the registered SAPI, initializes the output buffering system, starts the Zend Engine, reads in and acts on the php.ini file, and prepares the interpreter for its first request. Some important functions that are used in the core are

- php_module_startup: This is the master startup for PHP.

- php_startup_extensions: This runs the initialization function in all registered extensions.

- php_output_startup: This starts the output system.

- php_request_startup: At the beginning of a request, this is the master function, which calls up to the SAPI per-request functions, calls down into the Zend Engine for per-request initialization, and calls the request startup function in all registered modules.

- php_output_activate: This activates the output system, setting the output functions to use the SAPI-specified output functions.

- php_init_config: This reads in the php.ini file and acts on its contents.

- php_request_shutdown: This is the master function to destroy per-request resources.

- php_end_ob_buffers: This is used to flush output buffers, if output buffering has been enabled.

- php_module_shutdown: This is the master shutdown function for PHP, triggering all the rest of the interpreter shutdown functions.

The PHP Extension API

Most of our discussion regarding the PHP extension API will be carried on in Chapter 22, where you will actually implement extensions. Here we’ll only look at the basic callbacks available to extensions and when they are called.

Extensions can be registered in two ways. When an extension is compiled statically into PHP, the configuration system permanently registers that module with PHP. An extension can also be loaded from the .ini file, in which case it is registered during the .ini parsing.

The hooks that an extension can register are contained in its zend_module_entry struct, like so:

struct _zend_module_entry {
    unsigned short size;
    unsigned int zend_api;
    unsigned char zend_debug;
    unsigned char zts;
    struct _zend_ini_entry *ini_entry;
    char *name;
    zend_function_entry *functions;
    int (*module_startup_func)(INIT_FUNC_ARGS);
    int (*module_shutdown_func)(SHUTDOWN_FUNC_ARGS);
    int (*request_startup_func)(INIT_FUNC_ARGS);
    int (*request_shutdown_func)(SHUTDOWN_FUNC_ARGS);
    void (*info_func)(ZEND_MODULE_INFO_FUNC_ARGS);
    char *version;
    int (*global_startup_func)(void);
    int (*global_shutdown_func)(void);
    int globals_id;
    int module_started;
    unsigned char type;
    void *handle;
    int module_number;
};
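How a core might walk a set of registered modules and call their hooks can be sketched as follows. This is a hedged model rather than the real extension API: the toy_module_entry struct keeps only two of the hooks, passes the entry itself instead of the INIT_FUNC_ARGS macro arguments, and adds a hypothetical requests_seen counter so the effect of each call is visible.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_module_entry {
    const char *name;
    int (*module_startup_func)(struct toy_module_entry *self);   /* optional */
    int (*request_startup_func)(struct toy_module_entry *self);  /* optional */
    int module_started;   /* set once module startup has succeeded */
    int requests_seen;    /* toy bookkeeping, not a field of the real struct */
} toy_module_entry;

/* Example hooks for a module that counts the requests it has seen. */
int toy_noop_minit(toy_module_entry *self)     { (void)self; return 0; }
int toy_counting_rinit(toy_module_entry *self) { self->requests_seen++; return 0; }

/* Once per process: run each module's startup hook, if it has one.
   Returns the number of modules that are now marked as started. */
int toy_startup_modules(toy_module_entry **mods, size_t n)
{
    assert(mods != NULL || n == 0);
    int started = 0;
    for (size_t i = 0; i < n; i++) {
        toy_module_entry *m = mods[i];
        if (m->module_startup_func != NULL && m->module_startup_func(m) != 0) {
            continue;   /* startup failed: leave the module unstarted */
        }
        m->module_started = 1;
        started++;
    }
    return started;
}

/* Once per request: call the request hook of every started module.
   Returns the number of hooks that were invoked. */
int toy_request_startup(toy_module_entry **mods, size_t n)
{
    assert(mods != NULL || n == 0);
    int called = 0;
    for (size_t i = 0; i < n; i++) {
        toy_module_entry *m = mods[i];
        if (m->module_started && m->request_startup_func != NULL) {
            m->request_startup_func(m);
            called++;
        }
    }
    return called;
}
```

A module that leaves a hook NULL is simply skipped for that phase, so an extension pays only for the callbacks it registers. The shutdown hooks would mirror these two loops at the end of each request and at process exit.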
