Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Beginning Python (2005)

.pdf
Скачиваний:
159
Добавлен:
17.08.2013
Размер:
15.78 Mб
Скачать

Accessing Databases

Try It Out

Creating a Gadfly Database

Enter the following script and name the file createdb.py:

import os import gadfly

connection = gadfly.gadfly()

os.mkdir(‘db’)

connection.startup(‘pydb’, ‘db’)

cursor = connection.cursor()

# Create tables. cursor.execute(“”” create table employee

(empid integer, firstname varchar, lastname varchar, dept integer, manager integer, phone varchar)

“””)

cursor.execute(“”” create table department

(departmentid integer, name varchar,

manager integer)

“””)

cursor.execute(“”” create table user

(userid integer, username varchar, employeeid integer)

“””)

# Create indices.

cursor.execute(“””create index userid on user (userid)”””) cursor.execute(“””create index empid on employee (empid)”””) cursor.execute(“””create index deptid on department (departmentid)”””) cursor.execute(“””create index deptfk on employee (dept)”””) cursor.execute(“””create index mgr on employee (manager)”””) cursor.execute(“””create index emplid on user (employeeid)”””) cursor.execute(“””create index deptmgr on department (manager)”””)

connection.commit()

cursor.close()

connection.close()

261

TEAM LinG

Chapter 14

When you run this script, you should see no output unless the script raised an error:

$ python createdb.py

$

How It Works

Gadfly has its own API along with the standard Python DB API. This script uses the Gadfly API, but you’ll notice that this API is very similar to the DB API covered in the following section, “Using the Python Database APIs.” This section briefly describes the Gadfly-specific code in the creatdb.py script.

Among the Gadfly-specific code, you need to create a Connection object using the gadfly function on the gadfly module. For example:

connection = gadfly.gadfly()

connection.startup(‘pydb’, ‘db’)

Note that the gadfly module has the Gadfly-specific API. You need to use the gadfly.dbapi20 module to work with the DB API 2.0.

Once you get a Connection object, you need to start up the database. Pass the name of the database, pydb here, and the path to the directory to use, db in this example. (This script creates the db directory using the standard Python os module.)

From there, the script gets a Cursor object, covered in the section “Working with Cursors.” The Cursor object is used to create three tables and define indexes on these tables.

The script calls the commit method on the Connection to save all the changes to disk.

Gadfly stores all of its data in a directory you define, db in this case. After running the createdb.py script, you should see the following in the db directory, where each .grl file is a gadfly table:

$ ls db

DEPARTMENT.grl EMPLOYEE.grl pydb.gfd USER.grl

You are now ready to start working with the Python database APIs.

Using the Python Database APIs

First, some history about Python and relational databases. Python’s support for relational databases started out with ad hoc solutions, with one solution written to interface with each particular database, such as Oracle. Each database module created its own API, which was highly specific to that database because each database vendor evolved its own API based on its own needs. This is hard to support, because coding for one database and trying to move it to the other gives a programmer severe heartburn, as everything needs to be completely rewritten and retested.

Over the years, though, Python has matured to support a common database, or DB, API, that’s called the DB API. Specific modules enable your Python scripts to communicate with different databases, such as

262

TEAM LinG

Accessing Databases

DB/2, PostgreSQL, and so on. All of these modules, however, support the common API, making your job a lot easier when you write scripts to access databases. This section covers this common DB API.

The DB API provides a minimal standard for working with databases, using Python structures and syntax wherever possible. This API includes the following:

Connections, which cover guidelines for how to connect to databases

Executing statements and stored procedures to query, update, insert, and delete data with cursors

Transactions, with support for committing or rolling back a transaction

Examining metadata on the database module as well as on database and table structure

Defining the types of errors

The following sections take you step by step through the Python database APIs.

Downloading Modules

You must download a separate DB API module for each database you need to access. For example, if you need to access an Oracle database as well as a MySQL database, you must download both the Oracle and the MySQL database modules.

See www.python.org/topics/database/modules.html for a listing of database modules.

Modules exist for most major databases with the notable exception of Microsoft’s SQL Server. You can access MQL Server using an ODBC module, though. In fact, the mxODBC module can communicate with most databases using ODBC on Windows or an ODBC bridge on Unix (including Mac OS X) or Linux. If you need to do this, you can search for more information on these terms online to find out how other people are doing it.

Download the modules you need. Follow the instructions that come with the modules to install them.

You may need a C compiler and build environment to install some of the database modules. If you do, this will be described in the module’s own documentation, which you’ll need to read.

For some databases, such as Oracle, you can choose among a number of different modules that are slightly different. You should choose the module that seems to best fit your needs or go to the web site for this book and ask the authors for any recommendations if you’re not sure.

Once you have verified that the necessary modules are installed, you can start working with Connections.

Creating Connections

A Connection object provides the means to communicate from your script to a database program. Note the major assumption here that the database is running in a separate process (or processes). The Python database modules connect to the database. They do not include the database application itself.

263

TEAM LinG

Chapter 14

Each database module needs to provide a connect function that returns a Connection object. The parameters that are passed to connect vary by the module and what is required to communicate with the database. The following table lists the most common parameters.

 

Parameter

Usage

 

 

 

 

dsn

Data source name, from ODBC terminology. This usually includes

 

 

the name of your database and the server where it’s running.

 

host

Host, or network system name, on which the database runs

 

database

Name of the database

 

user

User name for connecting to the database

 

password

Password for the given user name

 

 

 

For example, you can use the following code as a guide:

connection = dbmodule.connect(dsn=’localhost:MYDB’,user=’tiger’,password=’scott’)

Use your database module documentation to determine which parameters are needed.

With a Connection object, you can work with transactions, covered later in this chapter; close the connection to free system resources, especially on the database; and get a cursor.

Working with Cursors

A Cursor is a Python object that enables you to work with the database. In database terms, the cursor is positioned at a particular location within a table or tables in the database, sort of like the cursor on your screen when you’re editing a document, which is positioned at a pixel location.

To get a Cursor, you need to call the cursor method on the Connection object:

cursor = connection.cursor()

Once you have a cursor, you can perform operations on the database, such as inserting records.

Try It Out

Inserting Records

Enter the following script and name the file insertdata.py:

import gadfly.dbapi20

connection = gadfly.dbapi20.connect(‘pydb’, ‘db’)

cursor = connection.cursor()

# Create employees. cursor.execute(“””

insert into employee (empid,firstname,lastname,manager,dept,phone) values (1,’Eric’,’Foster-Johnson’,1,1,’555-5555’)”””)

264

TEAM LinG

Accessing Databases

cursor.execute(“””

insert into employee (empid,firstname,lastname,manager,dept,phone) values (2,’Peter’,’Tosh’,2,3,’555-5554’)”””)

cursor.execute(“””

insert into employee (empid,firstname,lastname,manager,dept,phone) values (3,’Bunny’,’Wailer’,2,2,’555-5553’)”””)

# Create departments. cursor.execute(“””

insert into department (departmentid,name,manager) values (1,’development’,1)”””)

cursor.execute(“””

insert into department (departmentid,name,manager) values (2,’qa’,2)”””)

cursor.execute(“””

insert into department (departmentid,name,manager) values (3,’operations’,2)”””)

# Create users. cursor.execute(“””

insert into user (userid,username,employeeid) values (1,’ericfj’,1)”””)

cursor.execute(“””

insert into user (userid,username,employeeid) values (2,’tosh’,2)”””)

cursor.execute(“””

insert into user (userid,username,employeeid) values (3,’bunny’,3)”””)

connection.commit()

cursor.close()

connection.close()

When you run this script, you will see no output unless the script raises an error:

$ python insertdata.py

How It Works

The first few lines of this script set up the database connection and create a cursor object:

import gadfly.dbapi20

connection = gadfly.dbapi20.connect(‘pydb’, ‘db’)

cursor = connection.cursor()

265

TEAM LinG

Chapter 14

Note the use of the gadfly.dbapi20 module, which connects to a Gadfly database. To connect to a different database, replace this with your database-specific module, and modify the call to use the connect function from that database module, as needed.

The next several lines execute a number of SQL statements to insert rows into the three tables set up earlier: employee, department, and user. The execute method on the cursor object executes the SQL statement:

cursor.execute(“””

insert into employee (empid,firstname,lastname,manager,dept,phone) values (2,’Peter’,’Tosh’,2,3,’555-5554’)”””)

This example uses a triple-quoted string to cross a number of lines as needed. You’ll find that SQL commands, especially those embedded within Python scripts, are easier to understand if you can format the commands over a number of lines. This becomes more important with complex queries covered in examples later in this chapter.

To save your changes to the database, you must commit the transaction:

connection.commit()

Note that this method is called on the connection, not the cursor.

When you are done with the script, close the Cursor and then the Connection to free up resources. In short scripts like this, it may not seem important, but this helps the database program free its resources, as well as your Python script:

cursor.close()

connection.close()

You now have a very small amount of sample data to work with using other parts of the DB API, such as querying for data.

Try It Out

Writing a Simple Query

The following script implements a simple query that performs a join on the employee and department tables:

import gadfly.dbapi20

connection = gadfly.dbapi20.connect(‘pydb’, ‘db’)

cursor = connection.cursor()

cursor.execute(“””

select employee.firstname, employee.lastname, department.name from employee, department

where employee.dept = department.departmentid order by employee.lastname desc

“””)

266

TEAM LinG

Accessing Databases

for row in cursor.fetchall(): print row

cursor.close()

connection.close()

Save this script under the name simplequery.py.

When you run this script, you will see output like the following:

$ python simplequery.py (‘Bunny’, ‘Wailer’, ‘qa’)

(‘Peter’, ‘Tosh’, ‘operations’)

(‘Eric’, ‘Foster-Johnson’, ‘development’)

How It Works

This script initializes the connection and cursor in the same manner as the previous script. This script, though, passes a simple join query to the cursor execute method. This query selects two columns from the employee table and one from the department table.

This is truly a simple query, but, even so, you’ll want to format your queries so they are readable, similar to what is shown here.

When working with user interfaces, you will often need to expand IDs stored in the database to humanreadable values. In this case, for example, the query expands the department ID, querying for the department name. You simply cannot expect people to remember the meaning of strange numeric IDs.

The query also orders the results by the employees’ last names, in descending order. (This means that it starts at the beginning of the alphabet, which is what you’d normally expect. However, you can reverse this and have them sorted in ascending order.)

After calling the execute method, the data, if any was found, is stored in the cursor object. You can use the fetchall method to extract the data.

You can also use the fetchone method to fetch one row at a time from the results.

Note how the data appears as Python tuples:

(‘Bunny’, ‘Wailer’, ‘qa’) (‘Peter’, ‘Tosh’, ‘operations’)

(‘Eric’, ‘Foster-Johnson’, ‘development’)

You can use this example as a template to create other queries, such as the more complex join shown in the following Try It Out.

Try It Out

Writing a Complex Join

Enter this script and name the file finduser.py:

267

TEAM LinG

Chapter 14

import sys

import gadfly.dbapi20

connection = gadfly.dbapi20.connect(‘pydb’, ‘db’)

cursor = connection.cursor()

username = sys.argv[1]

query = “””

select u.username,e.firstname,e.lastname,m.firstname,m.lastname, d.name from user u, employee e, employee m, department d where username=?

and u.employeeid = e.empid

and e.manager = m.empid and e.dept = d.departmentid “””

cursor.execute(query, (username,)) for row in cursor.fetchall():

(username,firstname,lastname,mgr_firstname,mgr_lastname,dept) = row name=firstname + “ “ + lastname

manager=mgr_firstname + “ “ + mgr_lastname

print username,”:”,name,”managed by”,manager,”in”,dept

cursor.close()

connection.close()

When you run this script, you will see results like the following:

$ python finduser.py bunny

bunny : Bunny Wailer managed by Peter Tosh in qa

You need to pass the user name of a person to query from the database. This must be a valid user name of a person in the database. In this example, bunny is a user name previously inserted into the database.

How It Works

This script performs a join on all three example tables, using table-name aliases to create a shorter query. The purpose is to find a given user in the database by searching for that user name. This script also shows an example of expanding both the manager’s ID to the manager’s name and the department’s ID to the department’s name. All of this makes for more readable output.

This example also shows how you can extract data from each row into Python variables. For example:

(username,firstname,lastname,mgr_firstname,mgr_lastname,dept) = row

Note that this is really nothing new. See Chapter 3 for more on Python tuples, which is all row is.

An important new feature of this script, though, is the use of a question mark to enable you to build a query using dynamic data. When you call the execute method on the Cursor, you can pass a tuple of dynamic data, which the execute method will fill in for the question marks in the SQL statement. (This example uses a tuple of one element.) Each element in the tuple is used, in order, to replace the question

268

TEAM LinG

Accessing Databases

marks. Thus, it is very important to have as many dynamic values as you do question marks in the SQL statement, as shown in the following example:

query = “””

select u.username,e.firstname,e.lastname,m.firstname,m.lastname, d.name from user u, employee e, employee m, department d where username=?

and u.employeeid = e.empid and e.manager = m.empid and e.dept = d.departmentid “””

cursor.execute(query, (username,))

The query used in this example is very helpful when you want to start updating rows in the tables. That’s because users will want to enter meaningful values. It is up to you, with your SQL statements, to translate the user input into the necessary IDs.

For example, the following script enables you to change the manager for an employee:

Personally, I’d like to make myself my own manager.

Try It Out

Updating an Employee’s Manager

Enter the following script and name the file updatemgr.py:

import sys

import gadfly.dbapi20

connection = gadfly.dbapi20.connect(‘pydb’, ‘db’)

cursor = connection.cursor()

newmgr = sys.argv[2] employee = sys.argv[1]

# Query to find the employee ID. query = “””

select e.empid

from user u, employee e

where username=? and u.employeeid = e.empid “””

cursor.execute(query,(newmgr,)); for row in cursor.fetchone():

if (row != None): mgrid = row

# Note how we use the same query, but with a different name. cursor.execute(query,(employee,));

for row in cursor.fetchone(): if (row != None):

empid = row

269

TEAM LinG

Chapter 14

# Now, modify the employee.

cursor.execute(“update employee set manager=? where empid=?”, (mgrid,empid))

connection.commit()

cursor.close()

connection.close()

When you run this script, you need to pass the name of the user to update, as well as the name of the manager. Both names are user names from the user table. For example:

$ python finduser.py bunny

bunny : Bunny Wailer managed by Peter Tosh in qa $ python updatemgr.py bunny ericfj

$ python finduser.py bunny

bunny : Bunny Wailer managed by Eric Foster-Johnson in qa

How It Works

The example output shows the before and after picture of the employee row, verifying that the updatemgr.py script worked.

The updatemgr.py script expects two values from the user: the user name of the employee to update and the user name of the new manager. Both of these names must be user names stored in the database. Both names are converted into IDs using a simple query. This is not very efficient, as it involves two extra round-trips to the database. A more efficient means would be to perform an inner select statement on the update statement. For simplicity, though, the separate queries are far easier to understand.

This example also shows the use of the fetchone method on the Cursor. The final SQL statement then updates the employee row for the given user to have a new manager.

The next example uses a similar technique to terminate an employee. You can really have fun with this one (terminate your friends, your enemies, and so on).

Try It Out

Removing Employees

Enter the following script and name the file terminate.py:

import sys

import gadfly.dbapi20

connection = gadfly.dbapi20.connect(‘pydb’, ‘db’)

cursor = connection.cursor()

employee = sys.argv[1]

# Query to find the employee ID. query = “””

select e.empid

from user u, employee e

where username=? and u.employeeid = e.empid “””

270

TEAM LinG