
Advanced PHP Programming


14

Session Handling

In Chapter 13, “User Authentication and Session Security,” we discussed authenticating user sessions. In addition to being able to determine that a sequence of requests is coming from the same user, you very often want to maintain state information for a user between requests. Some applications, such as shopping carts and games, require state in order to function at all, but these are just a subset of the applications that use state.

Handling state in an application can be a challenge, largely due to the mass of data it is possible to accumulate. If I have a shopping cart application, I need for users to be able to put objects into the cart and track the status of that cart throughout their entire session. PHP offers no data persistence between requests, so you need to tuck this data away someplace where you can access it after the current request is complete.

There are a number of ways to track state. You can use cookies, query string munging, DBM-based session caches, RDBMS-backed caches, application server–based caches, PHP’s internal session tools, or something developed in house. With this daunting array of possible choices, you need a strategy for categorizing your techniques. You can bifurcate session-management techniques into two categories, depending on whether you store the bulk of the data client side or server side:

• Client-side sessions: Client-side sessions encompass techniques that require all or most of the session-state data to be passed between the client and server on every request. Client-side sessions may seem rather low-tech, and they are sometimes called heavyweight in reference to the amount of client/server data transmission required. Heavyweight sessions excel where the amount of state data that needs to be maintained is small. They require little to no back-end support. (They have no backing store.) Although they are heavyweight in terms of content transmitted, they are very database/back-end efficient. This also means that they fit with little modification into a distributed system.

• Server-side sessions: Server-side sessions are techniques that involve little client/server data transfer. These techniques typically involve assigning an ID to a session and then simply transmitting that ID. On the server side, state is managed in some sort of session cache (typically in a database or file-based handler), and the session ID is used to associate a particular request with its set of state information. Some server-side session techniques do not extend easily to run in a distributed architecture.

We have looked at many session-caching mechanisms in the previous chapters, caching various portions of a client’s session to mete out performance gains. The principal difference between session caching as we have seen it before and session state is that session caching takes data that is already available in a slow fashion and makes it available in a faster, more convenient format. Session state is information that is not available in any other format. You need the session state for an application to perform correctly.

Client-Side Sessions

When you visit the doctor, the doctor needs to have access to your medical history to effectively treat you. One way to accomplish this is to carry your medical history with you and present it to your doctor at the beginning of your appointment. This method guarantees that the doctor always has your most current medical records because there is a single copy and you possess it. Although this is no longer common practice in the United States, recent advances in storage technology have led some to advocate giving each person a smart card with his or her complete medical history on it. These are akin to our client-side sessions because the user carries all the information that the doctor needs to know about him or her. It eliminates the need for a centralized data store.

The alternative is to leave medical data managed at the doctor’s office or HMO (as is common in the United States now). This is akin to server-side sessions, in which a user carries only an identification card, and his or her records are looked up based on the user’s Social Security number or another identifier.

This analogy highlights some of the vulnerabilities of client-side sessions:

• There is a potential for unauthorized inspection/tampering.

• Client-side sessions are difficult to transport.

• There is a potential for loss.

Client-side sessions get a bad rap. Developers often tend to overengineer solutions, utilizing application servers and database-intensive session management techniques because they seem “more enterprise.” There is also a trend among large-scale software design aficionados to advance server-side managed session caches ahead of heavyweight sessions. The reasoning usually follows the line that a server-based cache retains more of the state information in a place that is accessible to the application and is more easily extensible to include additional session information.


Implementing Sessions via Cookies

In Chapter 13, cookies were an ideal solution for passing session authentication information. Cookies also provide an excellent means for passing larger amounts of session data.

The standard example used to demonstrate sessions is to count the number of times a user has accessed a given page:

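<?php
// Read the session array back from the cookie, undoing magic-quotes escaping.
$MY_SESSION = unserialize(stripslashes($_COOKIE['session_cookie']));
$MY_SESSION['count']++;
// Store the updated array in the cookie for one hour.
setcookie('session_cookie', serialize($MY_SESSION), time() + 3600);
?>
You have visited this page <?= $MY_SESSION['count'] ?> times.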

This example uses a cookie named session_cookie to store the entire state of the $MY_SESSION array, which here is the visit count stored via the key count. setcookie() automatically encodes the cookie value with urlencode(), so the cookie you get from this page looks like this:

Set-Cookie: session_cookie=a%3A1%3A%7Bs%3A5%3A%22count%22%3Bi%3A1%3B%7D;

expires=Mon, 03-Mar-2003 07:07:19 GMT

If you decode the data portion of the cookie, you get this:

a:1:{s:5:"count";i:1;}

This is, exactly as you would expect, the serialization of this:

$MY_SESSION = array('count' => 1);
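The encoding can be checked by hand. The following is a minimal sketch that reproduces the cookie value shown above and then reverses it; the function names encode_session_cookie() and decode_session_cookie() are made up for this illustration and are not part of PHP or of the chapter's code. It also assumes PHP 7 or later for the allowed_classes option, which stops unserialize() from instantiating objects out of user-supplied data.

```php
<?php
// Hypothetical helpers for illustration only.

// Serialize the session array and URL-encode it, mirroring what
// serialize() followed by setcookie() puts on the wire.
function encode_session_cookie(array $session) {
    return urlencode(serialize($session));
}

// Reverse the process: URL-decode, then unserialize. Returns an empty
// array if the data is not a serialized array.
function decode_session_cookie($raw) {
    $data = @unserialize(urldecode($raw), array('allowed_classes' => false));
    return is_array($data) ? $data : array();
}

$wire = encode_session_cookie(array('count' => 1));
echo $wire, "\n";   // a%3A1%3A%7Bs%3A5%3A%22count%22%3Bi%3A1%3B%7D
print_r(decode_session_cookie($wire));
```

Note that PHP already URL-decodes incoming cookies before placing them in $_COOKIE, so the explicit urldecode() here is only needed when working from the raw header value.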

Escaped Data in Cookies

By default PHP runs the equivalent of addslashes() on all data received via the COOKIE, POST, or GET variables. This is a security measure to help clean user-submitted data. Because almost all serialized variables have quotes in them, you need to run stripslashes() on $_COOKIE['session_cookie'] before you deserialize it. If you are comfortable with manually cleaning all your user input and know what you are doing, you can remove this quoting of input data by setting magic_quotes_gpc = Off in your php.ini file.

It would be trivial for a user to alter his or her own cookie to change any of these values. In this example, that would serve no purpose; but in most applications you do not want a user to be able to alter his or her own state. Thus, you should always encrypt session data when you use client-side sessions. The encryption functions from Chapter 13 will work fine for this purpose:

<?php
// Encryption.inc
class Encryption {
    static $cypher = 'blowfish';
    static $mode   = 'cfb';
    static $key    = 'choose a better key';

    // Encrypt $plaintext and prepend the random IV so decrypt() can recover it.
    public static function encrypt($plaintext) {
        $td = mcrypt_module_open(self::$cypher, '', self::$mode, '');
        $iv = mcrypt_create_iv(mcrypt_enc_get_iv_size($td), MCRYPT_RAND);
        mcrypt_generic_init($td, self::$key, $iv);
        $crypttext = mcrypt_generic($td, $plaintext);
        mcrypt_generic_deinit($td);
        return $iv.$crypttext;
    }

    // Split the IV off the front of $crypttext and decrypt the remainder.
    public static function decrypt($crypttext) {
        $td = mcrypt_module_open(self::$cypher, '', self::$mode, '');
        $ivsize = mcrypt_enc_get_iv_size($td);
        $iv = substr($crypttext, 0, $ivsize);
        $crypttext = substr($crypttext, $ivsize);
        $plaintext = "";
        if ($iv) {
            mcrypt_generic_init($td, self::$key, $iv);
            $plaintext = mdecrypt_generic($td, $crypttext);
            mcrypt_generic_deinit($td);
        }
        return $plaintext;
    }
}
?>
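The mcrypt extension used by the class above was deprecated in PHP 7.1 and removed from the PHP core in PHP 7.2. On newer installations, the same encrypt()/decrypt() interface can be approximated with the OpenSSL extension. The following is a sketch under stated assumptions, not the book's implementation: the class name OpenSslEncryption, the choice of AES-256-GCM, the SHA-256 key derivation, and the base64 wrapping are all choices made for this illustration. It requires PHP 7.1 or later with the OpenSSL extension loaded. Because GCM is an authenticated mode, decrypt() also rejects data that has been altered.

```php
<?php
// Sketch only: a stand-in for the mcrypt-based class, using OpenSSL.
// The class name and key handling are assumptions for this example.
class OpenSslEncryption {
    const CIPHER  = 'aes-256-gcm';
    const TAG_LEN = 16;

    // Placeholder secret, as in the original class.
    static $key = 'choose a better key';

    // Derive the 32-byte key that AES-256 expects from the secret.
    private static function key() {
        return hash('sha256', self::$key, true);
    }

    // Returns base64(iv . tag . ciphertext) so the value stays printable.
    public static function encrypt($plaintext) {
        $iv  = random_bytes(openssl_cipher_iv_length(self::CIPHER));
        $tag = '';
        $crypttext = openssl_encrypt($plaintext, self::CIPHER, self::key(),
                                     OPENSSL_RAW_DATA, $iv, $tag, '', self::TAG_LEN);
        return base64_encode($iv . $tag . $crypttext);
    }

    // Returns '' if the data is malformed or has been altered.
    public static function decrypt($encoded) {
        $raw   = base64_decode($encoded, true);
        $ivLen = openssl_cipher_iv_length(self::CIPHER);
        if ($raw === false || strlen($raw) < $ivLen + self::TAG_LEN) {
            return '';
        }
        $iv        = substr($raw, 0, $ivLen);
        $tag       = substr($raw, $ivLen, self::TAG_LEN);
        $crypttext = (string) substr($raw, $ivLen + self::TAG_LEN);
        $plaintext = openssl_decrypt($crypttext, self::CIPHER, self::key(),
                                     OPENSSL_RAW_DATA, $iv, $tag);
        return $plaintext === false ? '' : $plaintext;
    }
}
```

Because the output is already base64 text, a cookie built this way would not need the stripslashes() step except for the rare case of magic quotes being enabled on a very old installation.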

The page needs a simple rewrite to encrypt the serialized data before it is sent via cookie:

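<?php
include_once 'Encryption.inc';
// Strip magic-quotes slashes from the raw cookie first: they were added to
// the encrypted value, so they must be removed before it is decrypted.
$MY_SESSION = unserialize(
    Encryption::decrypt(stripslashes($_COOKIE['session_cookie']))
);
$MY_SESSION['count']++;
setcookie('session_cookie', Encryption::encrypt(serialize($MY_SESSION)), time() + 3600);
?>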

From this example we can make some early observations about heavyweight sessions. The following are the upsides of client-side sessions:


• Low back-end overhead: As a general policy, I try to never use a database when I don’t have to. Database systems are hard to distribute and expensive to scale, and they are frequently the resource bottleneck in a system. Session data tends to be short-term transient data, so the benefit of storing it in a long-term storage medium such as an RDBMS is questionable.

• Easy to apply to distributed systems: Because all session data is carried with the request itself, this technique extends seamlessly to work on clusters of multiple machines.

• Easy to scale to a large number of clients: Client-side session state management is great from a standpoint of client scalability. Although you will still need to add processing power to accommodate any traffic growth, you can add clients without any additional session-storage overhead at all. The burden of managing the volume of session data is placed entirely on the shoulders of the clients and distributed in a perfectly even manner so that the actual client burden is minimal.

Client-side sessions also incur the following downsides:

• Impractical to transfer large amounts of data: Although almost all browsers support cookies, each has its own internal limit for the maximum size of a cookie. In practice, 4KB seems to be the lowest common denominator for browser cookie size support. Even so, a 4KB cookie is very large. Remember, this cookie is passed up from the client on every request that matches the cookie’s domain and path. This can cause noticeably slow transfer on low-speed or high-latency connections, not to mention the bandwidth costs of adding 4KB to every data transfer. I set a soft 1KB limit on cookie sizes for applications I develop. This allows for significant data storage while remaining manageable.

• Difficult to reuse session data out of the session context: Because the data is stored only on the client side, you cannot access the user’s current session data when the user is not making a request.

• All session data must be fixed before generating output: Because cookies must be sent to the client before any content is sent, you need to finish your session manipulations and call setcookie() before you send any data. Of course, if you are using output buffering, you can completely invalidate this point and set cookies at any time you want.
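The size budget mentioned above is easy to enforce in code. The following sketch estimates how many bytes a cookie adds to each request and checks it against a limit; cs_cookie_size() and cs_cookie_fits() are hypothetical helper names, and the estimate counts only the name, the equals sign, and the URL-encoded value, ignoring attributes such as the path and expiration.

```php
<?php
// Hypothetical helper: approximate size of a cookie as sent in the request
// header, counting the name, "=", and the URL-encoded value.
function cs_cookie_size($name, $value) {
    return strlen($name) + 1 + strlen(urlencode($value));
}

// True if the cookie stays within the budget (1KB soft limit by default).
function cs_cookie_fits($name, $value, $limit = 1024) {
    return cs_cookie_size($name, $value) <= $limit;
}

$small = serialize(array('count' => 1));
$large = serialize(array('notes' => str_repeat('x', 5000)));

var_dump(cs_cookie_fits('session_cookie', $small));        // bool(true)
var_dump(cs_cookie_fits('session_cookie', $large));        // bool(false)
var_dump(cs_cookie_fits('session_cookie', $large, 4096));  // bool(false)
```

A check like this belongs just before the call to setcookie(), where an oversized session can still be trimmed or rejected.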

Building a Slightly Better Mousetrap

To render client-side sessions truly useful, you need to create an access library around them. Here’s an example:

// cs_sessions.inc
require_once 'Encryption.inc';

// Load the session array from the named cookie into $MY_SESSION.
function cs_session_read($name = 'MY_SESSION') {
    global $MY_SESSION;
    $MY_SESSION = unserialize(Encryption::decrypt(stripslashes($_COOKIE[$name])));
}

// Encrypt $MY_SESSION and send it back to the client as a cookie.
function cs_session_write($name = 'MY_SESSION', $expiration = 3600) {
    global $MY_SESSION;
    setcookie($name, Encryption::encrypt(serialize($MY_SESSION)), time() + $expiration);
}

// Ask the browser to discard the session cookie.
function cs_session_destroy($name) {
    global $MY_SESSION;
    setcookie($name, "", 0);
}

Then the original page-view counting example looks like this:

<?php
include_once 'cs_sessions.inc';
cs_session_read();
$MY_SESSION['count']++;
cs_session_write();
?>
You have visited this page <?= $MY_SESSION['count'] ?> times.

Server-Side Sessions

In designing a server-side session system that works in a distributed environment, it is critical to guarantee that the machine that receives a request will have access to its session information.

Returning to our analogy of medical records, a server-side, or office-managed, implementation has two options: The user can be brought to the data or the data can be brought to the user. Lacking a centralized data store, we must require the user to always return to the same server. This is like requiring a patient to always return to the same doctor’s office. While this methodology works well for small-town medical practices and single-server setups, it is not very scalable and breaks down when you need to service the population at multiple locations. To handle multiple offices, HMOs implement centralized patient information databases, where any of their doctors can access and update the patient’s record.

In content load balancing, the act of guaranteeing that a particular user is always delivered to a specific server is known as session stickiness. Session stickiness can be achieved by using a number of hardware solutions (almost all the “Layer 7” or “content switching” hardware load balancers support session stickiness) or software solutions (mod_backhand for Apache supports session stickiness). Just because we can do something, however, doesn’t mean we should. While session stickiness can enhance cache locality, too many applications rely on session stickiness to function correctly, which is bad design. Relying on session stickiness exposes an application to a number of vulnerabilities:


• Undermined resource/load balancing: Resource balancing is a difficult task. Every load balancer has its own approach, but all of them attempt to optimize the handling of a given request based on current trends. When you require session stickiness, you are actually committing resources for that session in perpetuity. This can lead to suboptimal load balancing and undermines many of the “smart” algorithms that the load balancer applies to distribute requests.

• More prone to failure: Consider this mathematical riddle: All things being equal, which is safer, a twin-engine plane that requires both engines to fly or a single-engine plane? The single-engine plane is safer because the chance of at least one of two engines failing is greater than the chance of a single engine failing. (If you prefer to think of this in dice, it is more likely that you will get at least one 6 when rolling two dice than a 6 when rolling one die.) Similarly, a distributed system that breaks when any one of its nodes fails is poorly designed. You should instead strive to have a system that is fault tolerant as long as one of its nodes functions correctly. (In terms of airplanes, a dual-engine plane that needs only one engine to fly is probabilistically safer than a single-engine plane.)
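The arithmetic behind the riddle can be worked through in a few lines. This sketch assumes that nodes fail independently and with the same probability, which the riddle's "all things being equal" implies but real clusters do not guarantee; the function names are invented for the example.

```php
<?php
// Probability that a system fails when it needs ALL $n nodes to work,
// each node failing independently with probability $p.
function fails_if_any_node_fails($p, $n) {
    return 1 - pow(1 - $p, $n);
}

// Probability that a system fails when it needs only ONE of $n nodes.
function fails_if_all_nodes_fail($p, $n) {
    return pow($p, $n);
}

$p = 1 / 6;  // the dice example: a "failure" is rolling a 6
printf("one engine:               %.3f\n", fails_if_any_node_fails($p, 1)); // 0.167
printf("two engines, both needed: %.3f\n", fails_if_any_node_fails($p, 2)); // 0.306
printf("two engines, one needed:  %.3f\n", fails_if_all_nodes_fail($p, 2)); // 0.028
```

With two nodes, requiring both nearly doubles the failure probability, while requiring only one cuts it to a sixth of the single-node figure.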

The major disadvantage of ensuring that client data is available wherever it is needed is that it is resource intensive. Session caches by their very nature tend to be updated on every request, so if you are supporting a site with 100 requests per second, you need a storage mechanism that is up to that task. Supporting 100 updates and selects per second is not a difficult task for most modern RDBMS solutions; but when you scale that number to 1,000, many of those solutions will start to break down. Even using replication for this sort of solution does not provide a large scalability gain because it is the cost of the session updates and not the selects that is the bottleneck, and as discussed earlier, replication of inserts and updates is much more difficult than distribution of selects. This should not necessarily deter you from using a database-backed session solution; many applications will never reasonably grow to that level, and it is silly to avoid something that is unscalable if you never intend to use it to the extent that its scalability breaks down. Still, it is good to know these things and design with all the potential limitations in mind.

PHP Sessions and Reinventing the Wheel

While writing this chapter, I will admit that I have vacillated a number of times on whether to focus on custom session management or PHP’s session extension. I have often preferred to reinvent the wheel (under the guise of self-education) rather than use a boxed solution that does much of what I want. For me personally, sessions sit on the cusp of features I would rather implement myself and those that I would prefer to use out of the box. PHP sessions are very robust, and while the default session handlers fail to meet a number of my needs, the ability to set custom handlers enables us to address most of the deficits I find.

The following sections focus on PHP’s session extension for lightweight sessions. Let’s start by reviewing basic use of the session extension.


Tracking the Session ID

The first hurdle you must overcome in tracking the session ID is identifying the requestor. Much as you must present your health insurance or Social Security number when you go to the doctor’s office so that the doctor can retrieve your records, a session must present its session ID to PHP so that the session information can be retrieved. As discussed in Chapter 13, session hijacking is a problem that you must always consider. Because the session extension is designed to operate completely independently of any authentication system, it uses random session ID generation to attempt to deter hijacking.

Native Methods for Tracking the Session ID

The session extension natively supports two methods for transmitting a session ID:

• Cookies

• Query string munging

The cookies method uses a dedicated cookie to manage the session ID. By default the name of the cookie is PHPSESSID, and it is a session cookie (that is, it has an expiration time of 0, meaning that it is destroyed when the browser is shut down). Cookie support is enabled by setting the following in your php.ini file (it defaults to on):

session.use_cookies=1

The query string munging method works by automatically adding a named variable to the query string of URLs in the tags present in the document. Query munging is off by default, but you can enable it by using the following php.ini setting:

session.use_trans_sid=1

In this setting, trans_sid stands for “transparent session ID,” and it is so named because tags are automatically rewritten when it is enabled. For example, when use_trans_sid is true, the following:

<?php

session_start();

?>

<a href="/foo.php">Foo</a>

will be rendered as this:

<a href="/foo.php?PHPSESSID=12345">Foo</a>
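The effect of the rewrite on a single URL can be imitated by hand. The following is only a conceptual sketch: append_session_id() is a hypothetical function, not PHP's actual rewriter, and it ignores details such as URL fragments and absolute URLs pointing at other hosts.

```php
<?php
// Hypothetical helper, not PHP's actual rewriter: append a session ID
// parameter to a URL, respecting an existing query string.
function append_session_id($url, $name, $id) {
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . urlencode($name) . '=' . urlencode($id);
}

echo append_session_id('/foo.php', 'PHPSESSID', '12345'), "\n";      // /foo.php?PHPSESSID=12345
echo append_session_id('/foo.php?a=1', 'PHPSESSID', '12345'), "\n";  // /foo.php?a=1&PHPSESSID=12345
```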

Using cookie-based session ID tracking is preferred to using query string munging for a couple of reasons, which we touched on in Chapter 13:

• Security: It is easy for a user to accidentally mail a friend a URL with his or her active session ID in it, resulting in an unintended hijacking of the session. There are also attacks that trick users into authenticating a bogus session ID by using the same mechanism.


• Aesthetics: Adding yet another parameter to a query string is ugly and produces cryptic-looking URLs.

For both cookie- and query-managed session identifiers, the name of the session identifier can be set with the php.ini parameter session.name. For example, to use MYSESSIONID as the cookie name instead of PHPSESSID, you can simply set this:

session.name=MYSESSIONID

In addition, the following parameters are useful for configuring cookie-based session support:

• session.cookie_lifetime: Defaults to 0 (a pure session cookie). Setting this to a nonzero value enables you to set sessions that expire even while the browser is still open (which is useful for “timing out” sessions) or sessions that span multiple browser sessions. (However, be careful of this both for security reasons and for maintaining the data storage for the session backing.)

• session.cookie_path: Sets the path for the cookie. Defaults to /.

• session.cookie_domain: Sets the domain for the cookie. Defaults to "", which sets the cookie domain to the hostname that was requested by the client browser.

• session.cookie_secure: Defaults to false. Determines whether cookies should only be sent over SSL sessions. This is an anti-hijacking setting that is designed to prevent your session ID from being read, even if your network connection is being monitored. Obviously, this only works if all the traffic for that cookie’s domain is over SSL.

Similarly, the following parameters are useful for configuring query string session support:

• session.use_only_cookies: Disables the reading of session IDs from the query string. This is an additional security parameter that should be set when use_trans_sid is set to false.

• url_rewriter.tags: Defaults to a=href,frame=src,input=src,form=fakeentry. Sets the tags that will be transparently rewritten with the session parameters if use_trans_sid is set to true. For example, to have session IDs also sent for images, you would add img=src to the list of tags to be rewritten.
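The same parameters can also be set at runtime with ini_set(), provided that this happens before the session is started. The following sketch gathers a cookie-only configuration in one place; the two function names are hypothetical, the cookie name MYSESSIONID is the example value used above, and turning on session.cookie_secure assumes that the whole site is served over SSL.

```php
<?php
// Hypothetical helper: the settings discussed above as name => value pairs,
// chosen so that session IDs travel only in cookies.
function cookie_only_session_settings($cookieName = 'MYSESSIONID') {
    return array(
        'session.use_cookies'      => '1',
        'session.use_only_cookies' => '1',
        'session.use_trans_sid'    => '0',
        'session.name'             => $cookieName,
        'session.cookie_lifetime'  => '0',   // pure session cookie
        'session.cookie_path'      => '/',
        'session.cookie_secure'    => '1',   // assumes an all-SSL site
    );
}

// Apply the settings; this must run before session_start().
function apply_session_settings(array $settings) {
    foreach ($settings as $name => $value) {
        ini_set($name, $value);
    }
}
```

A page would call apply_session_settings(cookie_only_session_settings()) and then session_start(). Putting the values in php.ini instead keeps them out of application code and applies them to every script.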

A Brief Introduction to PHP Sessions

To use basic sessions in a script, you simply call session_start() to initialize the session and then add key/value pairs to the $_SESSION autoglobals array. The following code snippet creates a session that counts the number of times you have visited the page and displays it back to you. With default session settings, this will use a cookie to propagate the session information and reset itself when the browser is shut down.
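The snippet itself falls outside this excerpt. A minimal sketch of such a counter, which is not necessarily the author's code, might look like the following. The helper count_visit() is an assumption made for this example; it takes the session array by reference so that the counting logic can be exercised without a live session, and the real session is started only when the script is serving a web request.

```php
<?php
// Increment the visit counter stored under the key 'count' and return it.
// Hypothetical helper for this sketch.
function count_visit(array &$session) {
    if (!isset($session['count'])) {
        $session['count'] = 0;
    }
    return ++$session['count'];
}

// Only start a real session when serving a web request.
if (PHP_SAPI !== 'cli') {
    session_start();
    $visits = count_visit($_SESSION);
    echo "You have visited this page $visits times.";
}
```

Unlike the cookie-based versions earlier in the chapter, nothing here serializes or encrypts the data: only the session ID travels to the client, and the counter itself stays on the server.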
