- •Contents at a Glance
- •Contents
- •Foreword
- •About the Authors
- •About the Technical Reviewers
- •Acknowledgments
- •Introduction
- •How Drupal Works
- •What Is Drupal?
- •Technology Stack
- •Core
- •Administrative Interface
- •Modules
- •Hooks
- •Themes
- •Nodes
- •Fields
- •Blocks
- •File Layout
- •Serving a Request
- •The Web Server’s Role
- •The Bootstrap Process
- •Processing a Request
- •Theming the Data
- •Summary
- •Writing a Module
- •Creating the Files
- •Implementing a Hook
- •Adding Module-Specific Settings
- •Defining Your Own Administration Section
- •Presenting a Settings Form to the User
- •Validating User-Submitted Settings
- •Storing Settings
- •Using Drupal’s variables Table
- •Retrieving Stored Values with variable_get()
- •Further Steps
- •Summary
- •Hooks, Actions, and Triggers
- •Understanding Events and Triggers
- •Understanding Actions
- •The Trigger User Interface
- •Your First Action
- •Assigning the Action
- •Changing Which Triggers an Action Supports
- •Actions That Support Any Trigger
- •Advanced Actions
- •Using the Context in Actions
- •How the Trigger Module Prepares the Context
- •Changing Existing Actions with action_info_alter()
- •Establishing the Context
- •How Actions Are Stored
- •The actions Table
- •Action IDs
- •Calling an Action Directly with actions_do()
- •Defining Your Own Triggers with hook_trigger_info()
- •Adding Triggers to Existing Hooks
- •Summary
- •The Menu System
- •Callback Mapping
- •Mapping URLs to Functions
- •Creating a Menu Item
- •Page Callback Arguments
- •Page Callbacks in Other Files
- •Adding a Link to the Navigation Block
- •Menu Nesting
- •Access Control
- •Title Localization and Customization
- •Defining a Title Callback
- •Wildcards in Menu Items
- •Basic Wildcards
- •Wildcards and Page Callback Parameters
- •Using the Value of a Wildcard
- •Wildcards and Parameter Replacement
- •Passing Additional Arguments to the Load Function
- •Special, Predefined Load Arguments: %map and %index
- •Building Paths from Wildcards Using to_arg() Functions
- •Special Cases for Wildcards and to_arg() Functions
- •Altering Menu Items from Other Modules
- •Altering Menu Links from Other Modules
- •Kinds of Menu Items
- •Common Tasks
- •Assigning Callbacks Without Adding a Link to the Menu
- •Displaying Menu Items As Tabs
- •Hiding Existing Menu Items
- •Using menu.module
- •Common Mistakes
- •Summary
- •Working with Databases
- •Defining Database Parameters
- •Understanding the Database Abstraction Layer
- •Connecting to the Database
- •Performing Simple Queries
- •Retrieving Query Results
- •Getting a Single Value
- •Getting Multiple Rows
- •Using the Query Builder and Query Objects
- •Getting a Limited Range of Results
- •Getting Results for Paged Display
- •Other Common Queries
- •Inserts and Updates with drupal_write_record()
- •The Schema API
- •Using Module .install Files
- •Creating Tables
- •Using the Schema Module
- •Field Type Mapping from Schema to Database
- •Textual
- •Varchar
- •Char
- •Text
- •Numerical
- •Integer
- •Serial
- •Float
- •Numeric
- •Date and Time: Datetime
- •Binary: Blob
- •Declaring a Specific Column Type with mysql_type
- •Maintaining Tables
- •Deleting Tables on Uninstall
- •Changing Existing Schemas with hook_schema_alter()
- •Modifying Other Modules’ Queries with hook_query_alter()
- •Connecting to Multiple Databases Within Drupal
- •Using a Temporary Table
- •Writing Your Own Database Driver
- •Summary
- •Working with Users
- •The $user Object
- •Testing If a User Is Logged In
- •Introduction to user hooks
- •Understanding hook_user_view($account, $view_mode)
- •The User Registration Process
- •Using profile.module to Collect User Information
- •The Login Process
- •Adding Data to the $user Object at Load Time
- •Providing User Information Categories
- •External Login
- •Summary
- •Working with Nodes
- •So What Exactly Is a Node?
- •Not Everything Is a Node
- •Creating a Node Module
- •Creating the .install File
- •Creating the .info File
- •Creating the .module File
- •Providing Information About Our Node Type
- •Modifying the Menu Callback
- •Defining Node-Type–Specific Permissions with hook_permission()
- •Limiting Access to a Node Type with hook__node_access()
- •Customizing the Node Form for Our Node Type
- •Validating Fields with hook_validate()
- •Saving Our Data with hook_insert()
- •Keeping Data Current with hook_update()
- •Cleaning Up with hook_delete()
- •Modifying Nodes of Our Type with hook_load()
- •Using hook_view()
- •Manipulating Nodes That Are Not Our Type with hook_node_xxxxx()
- •How Nodes Are Stored
- •Creating a Node Type with Custom Content Types
- •Restricting Access to Nodes
- •Defining Node Grants
- •What Is a Realm?
- •What Is a Grant ID?
- •The Node Access Process
- •Summary
- •Working with Fields
- •Creating Content Types
- •Adding Fields to a Content Type
- •Creating a Custom Field
- •Adding Fields Programmatically
- •Summary
- •The Theme System
- •Themes
- •Installing an Off-the-Shelf Theme
- •Building a Theme
- •The .info File
- •Adding Regions to Your Theme
- •Adding CSS Files to Your Theme
- •Adding JavaScript Files
- •Adding Settings to Your Theme
- •Understanding Template Files
- •The Big Picture
- •The html.php.tpl File
- •The page.tpl.php File
- •The region.tpl.php File
- •The node.tpl.php File
- •The field.tpl.php File
- •The block.tpl.php File
- •Overriding Template Files
- •Other Template Files
- •Introducing the theme() Function
- •An Overview of How theme() Works
- •Overriding Themable Items
- •Overriding with Template Files
- •Adding and Manipulating Template Variables
- •Using the Theme Developer Module
- •Summary
- •Working with Blocks
- •What Is a Block?
- •Block Configuration Options
- •Block Placement
- •Defining a Block
- •Using the Block Hooks
- •Building a Block
- •Enabling a Block When a Module Is Installed
- •Block Visibility Examples
- •Displaying a Block to Logged-In Users Only
- •Displaying a Block to Anonymous Users Only
- •Summary
- •The Form API
- •Understanding Form Processing
- •Initializing the Process
- •Setting a Token
- •Setting an ID
- •Collecting All Possible Form Element Definitions
- •Looking for a Validation Function
- •Looking for a Submit Function
- •Allowing Modules to Alter the Form Before It’s Built
- •Building the Form
- •Allowing Functions to Alter the Form After It’s Built
- •Checking If the Form Has Been Submitted
- •Finding a Theme Function for the Form
- •Allowing Modules to Modify the Form Before It’s Rendered
- •Rendering the Form
- •Validating the Form
- •Token Validation
- •Built-In Validation
- •Element-Specific Validation
- •Validation Callbacks
- •Submitting the Form
- •Redirecting the User
- •Creating Basic Forms
- •Form Properties
- •Form IDs
- •Fieldsets
- •Theming Forms
- •Using #prefix, #suffix, and #markup
- •Using a Theme Function
- •Telling Drupal Which Theme Function to Use
- •Specifying Validation and Submission Functions with hook_forms()
- •Call Order of Theme, Validation, and Submission Functions
- •Writing a Validation Function
- •Form Rebuilding
- •Writing a Submit Function
- •Changing Forms with hook_form_alter()
- •Altering Any Form
- •Altering a Specific Form
- •Submitting Forms Programmatically with drupal_form_submit()
- •Dynamic Forms
- •Form API Properties
- •Properties for the Root of the Form
- •#action
- •#built
- •#method
- •Properties Added to All Elements
- •#description
- •#attributes
- •#required
- •#tree
- •Properties Allowed in All Elements
- •#type
- •#access
- •#after_build
- •#array_parents
- •#attached
- •#default_value
- •#disabled
- •#element_validate
- •#parents
- •#post_render
- •#prefix
- •#pre_render
- •#process
- •#states
- •#suffix
- •#theme
- •#theme_wrappers
- •#title
- •#tree
- •#weight
- •Form Elements
- •Text Field
- •Password
- •Password with Confirmation
- •Textarea
- •Select
- •Radio Buttons
- •Check Boxes
- •Value
- •Hidden
- •Date
- •Weight
- •File Upload
- •Fieldset
- •Submit
- •Button
- •Image Button
- •Markup
- •Item
- •#ajax Property
- •Summary
- •Filters
- •Filters and Text formats
- •Installing a Filter
- •Knowing When to Use Filters
- •Creating a Custom Filter
- •Implementing hook_filter_info()
- •The Process Function
- •Helper Function
- •Summary
- •Searching and Indexing Content
- •Building a Custom Search Page
- •The Default Search Form
- •The Advanced Search Form
- •Adding to the Search Form
- •Introducing the Search Hooks
- •Formatting Search Results with hook_search_page()
- •Making Path Aliases Searchable
- •Using the Search HTML Indexer
- •When to Use the Indexer
- •How the Indexer Works
- •Adding Metadata to Nodes: hook_node_update_index()
- •Indexing Content That Isn’t a Node: hook_update_index()
- •Summary
- •Working with Files
- •How Drupal Serves Files
- •Managed and Unmanaged Drupal APIs
- •Public Files
- •Private Files
- •PHP Settings
- •Media Handling
- •Upload Field
- •Video and Audio
- •File API
- •Database Schema
- •Common Tasks and Functions
- •Finding the Default Files URI
- •Copying and Moving Files
- •Checking Directories
- •Uploading Files
- •Getting the URL for a File
- •Finding Files in a Directory
- •Finding the Temp Directory
- •Neutralizing Dangerous Files
- •Checking Disk Space
- •Authentication Hooks for Downloading
- •Summary
- •Working with Taxonomy
- •The Structure of Taxonomy
- •Creating a Vocabulary
- •Creating Terms
- •Assigning a Vocabulary to a Content Type
- •Kinds of Taxonomy
- •Flat
- •Hierarchical
- •Multiple Hierarchical
- •Viewing Content by Term
- •Using AND and OR in URLs
- •Specifying Depth for Hierarchical Vocabularies
- •Automatic RSS Feeds
- •Storing Taxonomies
- •Module-Based Vocabularies
- •Creating a Module-Based Vocabulary
- •Keeping Informed of Vocabulary Changes with Taxonomy Hooks
- •Common Tasks
- •Displaying Taxonomy Terms Associated with a Node
- •Building Your Own Taxonomy Queries
- •Using taxonomy_select_nodes()
- •Taxonomy Functions
- •Retrieving Information About Vocabularies
- •taxonomy_vocabulary_load($vid)
- •taxonomy_get_vocabularies()
- •Adding, Modifying, and Deleting Vocabularies
- •taxonomy_vocabulary_save($vocabulary)
- •taxonomy_vocabulary_delete($vid)
- •Retrieving Information About Terms
- •taxonomy_load_term($tid)
- •taxonomy_get_term_by_name($name)
- •Adding, Modifying, and Deleting Terms
- •taxonomy_term_save($term)
- •taxonomy_term_delete($tid)
- •Retrieving Information About Term Hierarchy
- •taxonomy_get_parents($tid, $key)
- •taxonomy_get_parents_all($tid)
- •taxonomy_get_children($tid, $vid, $key)
- •taxonomy_get_tree($vid, $parent, $max_depth, $load_entities = FALSE)
- •Finding Nodes with Certain Terms
- •Additional Resources
- •Summary
- •Caching
- •Knowing When to Cache
- •How Caching Works
- •How Caching Is Used Within Drupal Core
- •Menu System
- •Caching Filtered Text
- •Administration Variables and Module Settings
- •Disabling Caching
- •Page Caching
- •Static Page Caching
- •Blocks
- •Using the Cache API
- •Caching Data with cache_set()
- •Retrieving Cached Data with cache_get() and cache_get_multiple()
- •Checking to See If Cache Is Empty with cache_is_empty()
- •Clearing Cache with cache_clear_all()
- •Summary
- •Sessions
- •What Are Sessions?
- •Usage
- •Session-Related Settings
- •In .htaccess
- •In settings.php
- •In bootstrap.inc
- •Requiring Cookies
- •Storage
- •Session Life Cycle
- •Session Conversations
- •First Visit
- •Second Visit
- •User with an Account
- •Common Tasks
- •Changing the Length of Time Before a Cookie Expires
- •Changing the Name of the Session
- •Storing Data in the Session
- •Summary
- •Using jQuery
- •What Is jQuery?
- •How jQuery Works
- •Using a CSS ID Selector
- •Using a CSS Class Selector
- •jQuery Within Drupal
- •Your First jQuery Code
- •Targeting an Element by ID
- •Method Chaining
- •Adding or Removing a Class
- •Wrapping Existing Elements
- •Changing Values of CSS Elements
- •Where to Put JavaScript
- •Adding JavaScript via a Theme .info File
- •A Module That Uses jQuery
- •Overridable JavaScript
- •Building a jQuery Voting Widget
- •Building the Module
- •Using Drupal.behaviors
- •Ways to Extend This Module
- •Compatibility
- •Next Steps
- •Summary
- •Localization and Translation
- •Enabling the Locale Module
- •User Interface Translation
- •Strings
- •Translating Strings with t()
- •Replacing Built-In Strings with Custom Strings
- •String Overrides in settings.php
- •Replacing Strings with the Locale Module
- •Exporting Your Translation
- •Starting a New Translation
- •Generating .pot Files with Translation Template Extractor
- •Creating a .pot File for Your Module
- •Using the Command Line
- •Using the Web-Based Extractor
- •Creating .pot Files for an Entire Site
- •Installing a Language Translation
- •Setting Up a Translation at Install Time
- •Installing a Translation on an Existing Site
- •Right-to-Left Language Support
- •Language Negotiation
- •Default
- •User-Preferred Language
- •The Global $language Object
- •Path Prefix Only
- •Path Prefix with Language Fallback
- •URL Only
- •Content Translation
- •Introducing the Content Translation Module
- •Multilingual Support
- •Multilingual Support with Translation
- •Localizationand Translation-Related Files
- •Additional Resources
- •Summary
- •What Is XML-RPC?
- •Prerequisites for XML-RPC
- •XML-RPC Clients
- •XML-RPC Client Example: Getting the Time
- •XML-RPC Client Example: Getting the Name of a State
- •Handling XML-RPC Client Errors
- •Network Errors
- •HTTP Errors
- •Call Syntax Errors
- •A Simple XML-RPC Server
- •Mapping Your Method with hook_xmlrpc()
- •Automatic Parameter Type Validation with hook_xmlrpc()
- •Built-In XML-RPC Methods
- •system.listMethods
- •system.methodSignature
- •system.methodHelp
- •system.getCapabilities
- •system.multiCall
- •Summary
- •Writing Secure Code
- •Handling User Input
- •Thinking About Data Types
- •Plain Text
- •HTML Text
- •Rich Text
- •Using check_plain() and t() to Sanitize Output
- •Using filter_xss() to Prevent Cross-Site Scripting Attacks
- •Using filter_xss_admin()
- •Handling URLs Securely
- •Making Queries Secure with db_query()
- •Keeping Private Data Private with hook_query_alter()
- •Dynamic Queries
- •Permissions and Page Callbacks
- •Cross-Site Request Forgeries (CSRF)
- •File Security
- •File Permissions
- •Protected Files
- •File Uploads
- •Filenames and Paths
- •Encoding Mail Headers
- •Files for Production Environments
- •SSL Support
- •Stand-Alone PHP
- •AJAX Security, a.k.a. Request Replay Attack
- •Form API Security
- •Protecting the Superuser Account
- •Summary
- •Development Best Practices
- •Coding Standards
- •Line Indention and Whitespace
- •Operators
- •Casting
- •Control Structures
- •Function Calls
- •Function Declarations
- •Function Names
- •Class Constructor Calls
- •Arrays
- •Quotes
- •String Concatenators
- •Comments
- •Documentation Examples
- •Documenting Constants
- •Documenting Functions
- •Documenting Hook Implementations
- •Including Code
- •PHP Code Tags
- •Semicolons
- •Example URLs
- •Naming Conventions
- •Checking Your Coding Style with Coder Module
- •Finding Your Way Around Code with grep
- •Summary
- •Optimizing Drupal
- •Caching Is the Key to Drupal Performance
- •Optimizing PHP
- •Setting PHP Opcode Cache File to /dev/zero
- •PHP Process Pool Settings
- •Tuning Apache
- •mod_expires
- •Moving Directives from .htaccess to httpd.conf
- •MPM Prefork vs. Apache MPM Worker
- •Balancing the Apache Pool Size
- •Decreasing Apache Timeout
- •Disabling Unused Apache Modules
- •Using Nginx Instead of Apache
- •Using Pressflow
- •Varnish
- •Normalizing incoming requests for better Varnish hits
- •Varnish: finding extraneous cookies
- •Boost
- •Boost vs. Varnish
- •Linux System Tuning for High Traffic Servers
- •Using Fast File Systems
- •Dedicated Servers vs. Virtual Servers
- •Avoiding Calling External Web Services
- •Decreasing Server Timeouts
- •Database Optimization
- •Enabling MySQL’s Query Cache
- •MySQL InnoDB Performance on Windows
- •Drupal Performance
- •Eliminating 404 Errors
- •Disabling Modules You’re Not Using
- •Drupal-Specific Optimizations
- •Page Caching
- •Bandwidth Optimization
- •Pruning the Sessions Table
- •Managing the Traffic of Authenticated Users
- •Logging to the Database
- •Logging to Syslog
- •Running cron
- •Architectures
- •Single Server
- •Separate Database Server
- •Separate Database Server and a Web Server Cluster
- •Load Balancing
- •File Uploads and Synchronization
- •Multiple Database Servers
- •Database Replication
- •Database Partitioning
- •Finding the Bottleneck
- •Web Server Running Out of CPU
- •Web Server Running Out of RAM
- •Identifying Expensive Database Queries
- •Identifying Expensive Pages
- •Identifying Expensive Code
- •Optimizing Tables
- •Caching Queries Manually
- •Changing the Table Type from MyISAM to InnoDB
- •Summary
- •Installation Profiles
- •Creating a New Installation Profile
- •The enhanced.info File
- •The enhanced.profile File
- •The enhanced.install File
- •Using hook_install_tasks and hook_install_tasks_alter
- •Summary
- •Testing
- •Setting Up the Test Environment
- •How Tests Are Defined
- •Test Functions
- •Test Assertions
- •Summary
- •Database Table Reference
- •Resources
- •Code
- •The Drupal Source Code Repository on GIT
- •Examples
- •Drupal API Reference
- •Security Advisories
- •Updating Modules
- •Updating Themes
- •Handbooks
- •Forums
- •Mailing Lists
- •Development
- •Themes
- •Translations
- •User Groups and Interest Groups
- •Internet Relay Chat
- •North America
- •Europe
- •Asia
- •Latin America / Caribbean
- •Oceania
- •Africa
- •Videocasts
- •Weblogs
- •Conferences
- •Contribute
- •Index
- •Numbers
CHAPTER 23 ■ OPTIMIZING DRUPAL
if your server or file system mounts go down, the site is affected unless you also create a cluster of file servers.
If there are many large media files to be served, it may be best to serve these from a separate server using a lightweight web server, such as Nginx, to avoid having a lot of long-running processes on your web servers contending with requests handled by Drupal. An easy way to do this is to use a rewrite rule on your web server to redirect all incoming requests for a certain file type to the static server. Here’s an example rewrite rule for Apache that rewrites all requests for JPEG files:
RewriteCond %{REQUEST_URI} ^/(.*\.jpg)$ [NC]
RewriteRule .* http://static.example.com/%1 [R]
The disadvantage of this approach is that the web servers are still performing the extra work of redirecting traffic to the file server. An improved solution is to rewrite all file URLs within Drupal, so the web servers are no longer involved in static file requests.
Beyond a Single File System
If the amount of storage is going to exceed a single file system, chances are you’ll be doing some custom coding to implement storage abstraction. One option would be to use an outsourced storage system like Amazon’s S3 service.
Multiple Database Servers
Multiple database servers introduce additional complexity, because the data being inserted and updated must be replicated or partitioned across servers.
Database Replication
In MySQL database replication, a single master database receives all writes. These writes are then replicated to one or more slaves. Reads can be done on any master or slave. Slaves can also be masters in a multitiered architecture.
Database Partitioning
Since Drupal can handle multiple database connections, another strategy for scaling your database architecture is to put some tables in one database on one machine, and other tables in a different database on another machine. For example, moving all cache tables to a separate database on a separate machine and aliasing all queries on these tables using Drupal’s table prefixing mechanism can help your site scale.
Finding the Bottleneck
If your Drupal site is not performing as well as expected, the first step is to analyze where the problem lies. Possibilities include the web server, the operating system, the database, file system, and the network.
518
CHAPTER 23 ■ OPTIMIZING DRUPAL
Knowing how to evaluate the performance and scalability of a system allows you to quickly isolate and respond to system bottlenecks with confidence, even amid a crisis. You can discover where bottlenecks lie with a few simple tools and by asking questions along the way. Here’s one way to approach a badly performing server. We begin with the knowledge that performance is going to be bound by one of the following variables: CPU, RAM, I/O, or bandwidth. So begin by asking yourself the following questions:
Is the CPU maxed out? If examining CPU usage with top on Unix or the Task Manager on Windows shows CPU(s) at 100 percent, your mission is to find out what’s causing all that processing. Looking at the process list will let you know whether it’s the web server or the database eating up processor cycles. Both of these problems are solvable.
Is the server paging excessively? If the server lacks enough physical memory to handle the allocated task, the operating system will use virtual memory (disk) to handle the load. Reading and writing from disk is significantly slower than reading and writing to physical memory. If your server is paging excessively, you’ll need to figure out why.
Are the disks maxed out? If examining the disk subsystem with a tool like vmstat on Unix or the Performance Monitor on Windows shows that disk activity cannot keep up with the demands of the system while plenty of free RAM remains, you’ve got an I/O problem. Possibilities include excessively verbose logging, an improperly configured database that is creating many temporary tables on disk, background script execution, improper use of a RAID level for a write-heavy application, and so on.
Is the network link saturated? If the network pipe is filled up, there are only two solutions. One is to get a bigger pipe. The other is to send less information while making sure the information that is being sent is properly compressed.
■ Tip Investigating your page serving performance from outside your server is also useful. A tool like YSlow (http://developer.yahoo.com/yslow/help/) can be helpful when pinpointing why your pages are not downloading as quickly as you’d like when you haven’t yet hit a wall with CPU, RAM, or I/O. A helpful article on YSlow and Drupal can be found at http://wimleers.com/article/improving-drupals-page-loading- performance.
Web Server Running Out of CPU
If your CPU is maxed out and the process list shows that the resources are being consumed by the web server and not the database (which is covered later), you should look into reducing the web server overhead incurred to serve a request. Often the execution of PHP code is the culprit. See the description of PHP optimizations earlier in the chapter.
Often custom code and modules that have performed reasonably well for small-scale sites can become a bottleneck when moved into production. CPU-intensive code loops, memory-hungry algorithms, and large database retrievals can be identified by profiling your code to determine where PHP is spending most of its time and thus where you ought to spend most of your time debugging.
519
Download from Wow! eBook <www.wowebook.com>
CHAPTER 23 ■ OPTIMIZING DRUPAL
If, even after adding an opcode cache and optimizing your code, your web server cannot handle the load, it is time to get a beefier box with more or faster CPUs or to move to a different architecture with multiple web server front ends.
Web Server Running Out of RAM
The RAM footprint of the web server process serving the request includes all of the modules loaded by the web server (such as Apache’s mod_mime, mod_rewrite, etc.) as well as the memory used by the PHP interpreter. The more web server and Drupal modules that are enabled, the more RAM used per request.
Because RAM is a finite resource, you should determine how much is being used on each request and how many requests your web server is configured to handle. To see how much real RAM is being used on average for each request, use a program like top (on Linux) to see your list of processes. In Apache, the maximum number of simultaneous requests that will be served is set using the MaxClients directive. A common mistake is thinking the solution to a saturated web server is to increase the value of MaxClients. This only complicates the problem, since you’ll be hit by too many requests at once. That means RAM will be exhausted, and your server will start disk swapping and become unresponsive. Let’s assume, for example, that your web server has 2GB of RAM and each Apache request is using roughly 20MB (you can check the actual value by using top on Linux or Task Manager on Windows). You can calculate a good value for MaxClients by using the following formula; keep in mind the fact that you will need to reserve memory for your operating system and other processes:
2GB RAM / 20MB per process = 100 MaxClients
If your server consistently runs out of RAM even after disabling unneeded web server modules and profiling any custom modules or code, your next step is to make sure the database and the operating system are not the causes of the bottleneck. If they are, then add more RAM. If the database and operating system are not causing the bottlenecks, you simply have more requests than you can serve; the solution is to add more web server boxes.
■ Tip Since memory usage of Apache processes tends to increase to the level of the most memory-hungry page served by that child process, memory can be regained by setting the MaxRequestsPerChild value to a low number, such as 300 (the actual number will depend on your situation). Apache will work a little harder to generate new children, but the new children will use less RAM than the older ones they replace, so you can serve more requests in less RAM. The default setting for MaxRequestsPerChild is 0, meaning the processes will never expire.
Identifying Expensive Database Queries
If you need to get a sense of what is happening when a given page is generated, devel.module is invaluable. It has an option to display all the queries that are required to generate the page along with the execution time of each query.
Another way to find out which queries are taking too long is to enable slow query logging in MySQL. This is done in the MySQL option file (my.cnf) as follows:
520
CHAPTER 23 ■ OPTIMIZING DRUPAL
# The MySQL server [mysqld] log-slow-queries
This will log all queries that take longer than ten seconds to a log file at example.com-slow.log in MySQL’s data directory. You can change the number of seconds and the log location as shown in this code, where we set the slow query threshold to five seconds and the file name to example-slow.log:
# The MySQL server [mysqld] long_query_time = 5
log-slow-queries = /var/log/mysql/example-slow.log
Identifying Expensive Pages
To find out which pages are the most resource intensive, enable the statistics module that is included with Drupal. Although the statistics module increases the load on your server (since it records access statistics for your site into your database), it can be useful to see which pages are the most frequently viewed and thus the most ripe for query optimization. It also tracks total page generation time over a period, which you can specify in Configuration -> Statistics. This is useful for identifying out-of-control web crawlers that are eating up system resources, which you can then ban on the spot by visiting Reports -> Top visitors and clicking “ban.” Be careful, though—it’s just as easy to ban a good crawler that drives traffic to your site as a bad one. Make sure you investigate the origin of the crawler before banning it.
Identifying Expensive Code
Consider the following resource-hogging code:
//Very expensive, silly way to get node titles. First we get the node IDs
//of all published nodes.
$query = db_select('node', 'n'); $query->fields('n', array('nid')); $query->condition("n.status", 1); $query->addTag('node_access'); $result = $query->execute();
// Now we do a node_load() on each individual node and save the title.
foreach($result as $row) { $node = node_load($row->nid);
$titles[] = check_plain($node->title);
}
Fully loading a node is an expensive operation: hooks run, modules perform database queries to add or modify the node, and memory is used to cache the node in node_load()’s internal cache. If you are not depending on modification to the node by a module, it’s much faster to do your own query of the node table directly. Certainly this is a contrived example, but the same pattern can often be found, that
521