
Beginning Python (2005)


Python in the Enterprise

Try It Out

Simple LDAP Search

OK, now that you either have an LDAP server installed or one already available in your organization, try some basic LDAP accesses from Python so you can see how it all works. When this book went to press, it was difficult to get python-ldap working under Windows, so this presumes you’re scripting on a Linux machine:

1. Create a file named simpleldap.py and enter the following:

import ldap

l = ldap.open('127.0.0.1')
l.simple_bind_s('', '')

print "Search for everything:"
ldap_result = l.search_s("dc=vivtek,dc=com", ldap.SCOPE_SUBTREE, "cn=*", None)
print ldap_result
print

print "Search for objects with names containing 'Michael':"
ldap_result = l.search_s("dc=vivtek,dc=com", ldap.SCOPE_SUBTREE, "cn=*Michael*", None)
print ldap_result
print

print "Retrieve organizational role 'wfstarter':"
ldap_result = l.search_s("dc=vivtek,dc=com", ldap.SCOPE_SUBTREE, "cn=wfstarter", ["organizationalRole"])
print ldap_result
print

print "Search for everything again, but this time with an asynchronous search:"
ldap_result_id = l.search("dc=vivtek,dc=com", ldap.SCOPE_SUBTREE, "cn=*", None)
while 1:
    result_type, result_data = l.result(ldap_result_id, 0)
    if (result_data == []):
        break
    else:
        if result_type == ldap.RES_SEARCH_ENTRY:
            print result_data

2. Now run it:

[michael@me michael]$ python simpleldap.py
Search for everything:
[('cn=Different Person,dc=vivtek,dc=com', {'objectClass': ['person'], 'sn': ['Different Person'], 'cn': ['Different Person']}), ('cn=Michael Roberts,dc=vivtek,dc=com', {'objectClass': ['person'], 'sn': ['Roberts'], 'cn': ['Michael Roberts']}), ('cn=wfstarter,dc=vivtek,dc=com', {'objectClass': ['organizationalRole'], 'roleOccupant': ['cn=Michael Roberts', 'cn=Different Person'], 'cn': ['wfstarter']})]

Search for objects with names containing 'Michael':
[('cn=Michael Roberts,dc=vivtek,dc=com', {'objectClass': ['person'], 'sn': ['Roberts'], 'cn': ['Michael Roberts']})]


Chapter 20

Retrieve organizational role 'wfstarter':
[('cn=wfstarter,dc=vivtek,dc=com', {'objectClass': ['organizationalRole'], 'roleOccupant': ['cn=Michael Roberts', 'cn=Different Person'], 'cn': ['wfstarter']})]

Search for everything again, but this time with an asynchronous search:
[('cn=Different Person,dc=vivtek,dc=com', {'objectClass': ['person'], 'sn': ['Different Person'], 'cn': ['Different Person']})]
[('cn=Michael Roberts,dc=vivtek,dc=com', {'objectClass': ['person'], 'sn': ['Roberts'], 'cn': ['Michael Roberts']})]
[('cn=wfstarter,dc=vivtek,dc=com', {'objectClass': ['organizationalRole'], 'roleOccupant': ['cn=Michael Roberts', 'cn=Different Person'], 'cn': ['wfstarter']})]

How It Works

This simple example really only scratches the surface of the python-ldap module. The intent is just to get a feel for how the module is used and to look at the data structures used by the module to represent LDAP results.

The first line to note is the call to the simple_bind_s method. This method logs into the server, using — in this case — an anonymous connection. Here you would normally give the distinguished name of your user record and the password associated with that record. Note also the “_s” ending on the function. In the LDAP API, most functions come in both synchronous and asynchronous forms, as there are no guarantees that the answer from the server will come back quickly. For simplicity, it’s often easiest to use the synchronous API for utilities, but in any environment where the user needs responsiveness, you’ll want to look into the asynchronous varieties. The asynchronous API sends off requests to the server and returns immediately with a handle you can use to call monitoring functions. The monitoring functions can then tell you whether the server has answered. More on this later.

In the meantime, look at the next three blocks, which illustrate various things you can search on using the search_s method. The four inputs to the search are the distinguished name of the object at which to start searching, the scope of the search (whether you’re interested in just that object, its immediate children, or all of its descendants), a filter for the search, and a list naming the attributes to retrieve from each matching object. You can see how the filters work fairly easily from the examples; use the asterisk (*) as a wildcard in the filter parameter. If you pass None for the last parameter, the server returns all of the regular attributes of each object matching the filter.
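To make the wildcard behavior concrete, here is a minimal sketch (Python 3 syntax) that imitates the three filters from the example against an in-memory list of names. It is not how an LDAP server evaluates filters; the helper name matches_cn is invented for this illustration, it relies on the standard fnmatch module, and it assumes the case-insensitive comparison normally used for cn values.

```python
from fnmatch import fnmatchcase

# Names taken from the example directory used in this section.
NAMES = ["Different Person", "Michael Roberts", "wfstarter"]


def matches_cn(filter_text, name):
    """Return True if name satisfies a simple 'cn=<pattern>' filter.

    Hypothetical helper: it only handles the one-attribute filters shown
    in the example, with * as the wildcard.
    """
    attribute, pattern = filter_text.split("=", 1)
    if attribute != "cn":
        raise ValueError("this sketch only understands cn filters")
    # Lower-case both sides to imitate a case-insensitive comparison.
    return fnmatchcase(name.lower(), pattern.lower())


for filter_text in ("cn=*", "cn=*Michael*", "cn=wfstarter"):
    hits = [name for name in NAMES if matches_cn(filter_text, name)]
    print(filter_text, "->", hits)
```

The first filter selects every name, the second only the name containing "Michael", and the third only the exact match, which mirrors the shape of the first three results printed by simpleldap.py.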

The final LDAP call is an example of how to call an asynchronous search and then poll the API to get all of the search results. This search performs the same search as the first (synchronous) search in the example.
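The polling pattern can be tried without a server. The sketch below (Python 3 syntax) uses a made-up FakeConnection class whose search and result methods only mimic the call shapes used in simpleldap.py; the class, the collect helper, and the two constants are stand-ins for illustration and are not part of python-ldap.

```python
# Stand-in values; in python-ldap the corresponding constants are
# ldap.RES_SEARCH_ENTRY and ldap.RES_SEARCH_RESULT.
RES_SEARCH_ENTRY = "entry"
RES_SEARCH_RESULT = "result"


class FakeConnection:
    """Hypothetical stand-in that hands out queued results one at a time."""

    def __init__(self, entries):
        self._entries = entries
        self._pending = {}
        self._next_id = 1

    def search(self, base, scope, filter_text, attrs):
        # Return immediately with a handle, as an asynchronous call would.
        handle = self._next_id
        self._next_id += 1
        self._pending[handle] = list(self._entries)
        return handle

    def result(self, handle, all_results):
        queue = self._pending[handle]
        if queue:
            return RES_SEARCH_ENTRY, [queue.pop(0)]
        # An empty data list marks the end of the search.
        return RES_SEARCH_RESULT, []


def collect(connection, handle):
    """Poll until the empty end-of-search result, gathering the entries."""
    found = []
    while True:
        result_type, result_data = connection.result(handle, 0)
        if result_data == []:
            break
        if result_type == RES_SEARCH_ENTRY:
            found.extend(result_data)
    return found


conn = FakeConnection([
    ("cn=Different Person,dc=vivtek,dc=com", {"cn": ["Different Person"]}),
    ("cn=Michael Roberts,dc=vivtek,dc=com", {"cn": ["Michael Roberts"]}),
])
handle = conn.search("dc=vivtek,dc=com", "subtree", "cn=*", None)
entries = collect(conn, handle)
print(len(entries), "entries collected")
```

Collecting the entries into a list, rather than printing each one as it arrives, gives the caller the same data structure that the synchronous search returns.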

The output is basically just the return values from all four searches. The normal formatting (or lack thereof) of the output is hard to see, so rearrange the first search result a little:

Search for everything:
[('cn=Different Person,dc=vivtek,dc=com',
  {'objectClass': ['person'],
   'sn'         : ['Different Person'],
   'cn'         : ['Different Person']
  }
 ),
 ('cn=Michael Roberts,dc=vivtek,dc=com',
  {'objectClass': ['person'],
   'sn'         : ['Roberts'],
   'cn'         : ['Michael Roberts']
  }
 ),
 ('cn=wfstarter,dc=vivtek,dc=com',
  {'objectClass' : ['organizationalRole'],
   'roleOccupant': ['cn=Michael Roberts',
                    'cn=Different Person'],
   'cn'          : ['wfstarter']
  }
 )
]

As you can see, the structure of the search return is a list of tuples. Each tuple has two elements: The first element is the distinguished name of the object returned, and the second element is a dictionary containing data fields from the object. Note that in LDAP, any field may have multiple values, so the value half of each data field is itself a list. You can see this most clearly in the value for roleOccupant in the wfstarter object; this field has two values.
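A short sketch (Python 3 syntax) shows how this structure can be walked. The data is copied by hand from the output above rather than fetched from a server, and the describe function is a name made up for this illustration.

```python
# Search result copied from the example output: a list of (dn, fields) tuples.
results = [
    ("cn=Different Person,dc=vivtek,dc=com",
     {"objectClass": ["person"],
      "sn": ["Different Person"],
      "cn": ["Different Person"]}),
    ("cn=Michael Roberts,dc=vivtek,dc=com",
     {"objectClass": ["person"],
      "sn": ["Roberts"],
      "cn": ["Michael Roberts"]}),
    ("cn=wfstarter,dc=vivtek,dc=com",
     {"objectClass": ["organizationalRole"],
      "roleOccupant": ["cn=Michael Roberts", "cn=Different Person"],
      "cn": ["wfstarter"]}),
]


def describe(results):
    """Return one line per distinguished name and one per field value."""
    lines = []
    for dn, fields in results:
        lines.append(dn)
        for name in sorted(fields):
            # Every field holds a list, so loop even over single values.
            for value in fields[name]:
                lines.append("    %s: %s" % (name, value))
    return lines


for line in describe(results):
    print(line)
```

Because every value is a list, the inner loop treats single-valued and multivalued fields the same way; roleOccupant simply produces two lines.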

More LDAP

This little introduction to the wonders of directory programming has naturally barely scratched the surface of LDAP programming. You can find several good books on working with LDAP, but as long as it’s fresh in your mind, it might be a good idea to take in a couple of pointers.

First, while the core LDAP schema used in the preceding example can do a lot of basic things, nearly any LDAP application of any scope will define its own schema. An LDAP schema does the same thing as a schema in any database system: It defines objects and the attributes they must have (and in LDAP, you can also define optional attributes) and the types of values those attributes can contain. Thus, if you’re working with an existing LDAP server, you will almost certainly encounter various schema definitions with which you need to work.

You can still store useful information in the core schema. For instance, the example database shown in the preceding example defines a couple of organizational roles (object type organizationalRole), each of which has multiple occupants. This is a simple (although not very scalable) way to define user groups, which requires absolutely no schema design or definition work at all. Therefore, it’s possible to use an LDAP installation to note roles and responsibilities for a small-to-medium organization or group.
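As a rough illustration of using organizationalRole objects as user groups, the sketch below (Python 3 syntax) checks role membership against a search result shaped like the one shown earlier. The function names occupants and has_role are invented for this example, and the data is copied from the example directory rather than read from a live server.

```python
def occupants(results, role_name):
    """Return the roleOccupant values of the named organizationalRole."""
    for dn, fields in results:
        is_role = "organizationalRole" in fields.get("objectClass", [])
        if is_role and role_name in fields.get("cn", []):
            return list(fields.get("roleOccupant", []))
    return []


def has_role(results, role_name, person):
    """Return True if person is listed as an occupant of the role."""
    return person in occupants(results, role_name)


# One entry copied from the example search output.
results = [
    ("cn=wfstarter,dc=vivtek,dc=com",
     {"objectClass": ["organizationalRole"],
      "roleOccupant": ["cn=Michael Roberts", "cn=Different Person"],
      "cn": ["wfstarter"]}),
]

print(has_role(results, "wfstarter", "cn=Michael Roberts"))
print(has_role(results, "wfstarter", "cn=Somebody Else"))
```

Scanning every occupant like this is one reason the approach does not scale well; it is adequate for the small groups this paragraph has in mind.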

For the reasons of flexibility discussed earlier (writing a truly flexible mapping from LDAP functionality onto the wftk’s language is more complicated than it looks), the wftk does not yet have a finished LDAP adaptor, at least as this book goes to press. But if it did, or if you wrote a simplified one customized for your own organizational needs, LDAP would be a good choice to store the user and permissions information necessary for any real-world enterprise application.

Back to the wftk

For the third section of example programming, look at the wftk again, this time in its originally intended capacity as a workflow toolkit. There are two things to keep in mind about applications of the wftk’s workflow functionality. First, workflow is activated when events occur in the system, which basically means the addition or modification of objects. After a list is defined, you can define a workflow to be activated when an object is added. For instance, the addition could contain data from a form submission on a web site, or it could be triggered by something as simple as a cron job on a Unix server firing at a particular time of day in order to cause some workflow action to take place.


Second, the entire point of workflow is to coordinate the actions of a diverse group of people and/or programs. If an action to be taken for data involves any kind of scheduling of resources, for instance, this is an ideal application for workflow. The wftk excels in modeling this kind of system.

This chapter demonstrates two examples of wftk workflow programming in Python: a basic approval workflow used to intercept and process records submitted (perhaps originally from a web page — where they’re coming from is really not terribly important), and an action processing queue in which a periodic process checks for work to be done and then runs a program to do it, storing the information back into the original task object. Naturally, either of these examples could be ramified endlessly, but they’re good for getting you started.

Try It Out

Simple Workflow Trigger

For the first real workflow in this chapter, you’ll add a simple state machine and approval workflow to the repository from the very first example. This will accept submissions into a staging area list and then start a workflow process to get approval from an administrator. When the submission is approved, the record is written to the main list; if approval isn’t granted, the code deletes it and forgets the whole thing:

1. In the repository directory from the first example, open server.defn and add the following list definition; this defines a list to be used as a staging area for proposed additions to the list “simple”:

<list id="staging">
  <field id="field1" special="key"/>
  <field id="field2"/>
  <state id="proposed"/>
  <state id="approved" archive-to="simple"/>
  <state id="rejected" archive-to="_trash"/>
  <on action="add">
    <task role="me" label="Check anonymous submission">
      <data id="state"/>
    </task>
  </on>
</list>

2. Add subdirectories to the repository for the task index and the staging area; these must be named “_tasks” and “staging”, respectively. Both are treated as perfectly normal lists, so they can just as easily be stored in MySQL or wherever is appropriate for your needs, but it’s a little simpler for now to leave them in the default directory-based storage.

3. Now add some scripts. First, use your favorite text editor to add a file called submit.py:

import wftk

repos = wftk.repository()

e = wftk.entry(repos, 'staging')
e.parse("""
<rec>
<field id="field2">this is an anonymous submission</field>
</rec>
""")
e.save()


4. Now add another file called approve.py:

import wftk

repos = wftk.repository()
repos.user_auth('me', 'x')

l = wftk.list(repos, '_todo')
l.query()

e = repos.get('_todo', l.keys()[0])
e.set("state", "approved")
e.save()

How It Works

This example is deceptively simple, but it actually does some useful work. In the first step, you did the greatest part of the work, by defining the staging area list. That definition has exactly the same fields as the list “simple”, because it stores suggested additions to the “simple” list. However, the staging area is a little smarter, because it has a state machine.

A state machine is a data structure that defines the different states an object can be in and the valid transitions between those states. In the wftk context, you can also specify actions to be taken by the workflow engine when objects actually transition between states, and that’s where things get interesting. The three “state” elements define the start state (“proposed”) and two other states, “approved” and “rejected”. Each of these two states is treated as an archival state; that is, entering it signals that workflow is complete and triggers an archiving action. The archival target for the “approved” state is the “simple” list, which means that if an object in the staging area enters the approved state, all outstanding workflow is removed, the object is deleted from the staging area, and it is added automatically to the “simple” list.

If, on the other hand, it enters the “rejected” state, it’s still cleaned up, but it is “added” to the _trash list — a virtual list, which simply means it is deleted. This is a convenient way to specify that a record should be discarded.

After the state definitions, there is a simple workflow process definition in the “on” element to ensure that it’s activated as soon as a record is entered into the staging area. All this workflow does is to create one task on the active task list; that task is then given the state value as input/output value. This is not strictly necessary from the point of view of the code, but it’s useful to document the fact that the task is going to be used to change the object’s state.
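The behavior described in the last few paragraphs can be modeled in a few lines of plain Python (Python 3 syntax). This is only a toy model written for illustration: the StagingArea class and its methods are invented here and say nothing about how the wftk is implemented or what its API looks like.

```python
class StagingArea:
    """Toy model of a list whose states may name an archive target."""

    def __init__(self, states, lists):
        self.states = states    # state name -> archive target, or None
        self.lists = lists      # list name -> records archived there
        self.records = {}       # key -> record still in the staging area
        self.tasks = []         # keys that have an outstanding task

    def add(self, key, record):
        # Adding a record puts it in the first state and creates a task.
        start_state = next(iter(self.states))
        self.records[key] = dict(record, state=start_state)
        self.tasks.append(key)

    def set_state(self, key, new_state):
        if new_state not in self.states:
            raise ValueError("unknown state: %s" % new_state)
        record = self.records[key]
        record["state"] = new_state
        target = self.states[new_state]
        if target is None:
            return              # not an archival state, nothing more to do
        # Archival: drop outstanding tasks and leave the staging area.
        if key in self.tasks:
            self.tasks.remove(key)
        del self.records[key]
        if target != "_trash":  # "_trash" simply discards the record
            self.lists[target].append(record)


lists = {"simple": []}
staging = StagingArea(
    {"proposed": None, "approved": "simple", "rejected": "_trash"}, lists)
staging.add("a", {"field2": "first submission"})
staging.add("b", {"field2": "second submission"})
staging.set_state("a", "approved")
staging.set_state("b", "rejected")
print(lists["simple"])
print(staging.records, staging.tasks)
```

After both transitions, the approved record sits in the “simple” list, the rejected one is gone, and no tasks remain, which is the cleanup the archive-to attribute is described as performing.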

Looking at the submit.py script, you can see that it is essentially identical to the example script to add an entry to the simple list shown earlier in the chapter. The only difference is that it specifies the “staging” list, and it has been cleaned up to remove demonstration printouts. When you run it, it will create an entry in the staging area, but there’s an important difference: Because workflow is involved, an entry is also added to the _tasks list.

Now look at the approve.py script, which is where everything interesting happens. Here, after importing the wftk and opening the repository, the script immediately authenticates itself as the user “me”. This allows the next line to retrieve the “_todo” list (another magical list, which contains all active tasks for the current user) and get the first key from that list. Naturally, in a real-world application, you would do something other than blindly approve the first task in the list (like actually ask a human being to make a decision), but this illustrates the basic operation of workflow.

Once the first task on the list is retrieved, its state is changed to “approved” and it is saved again. This simple action hides a great deal of work on the part of the system — once the state is changed, this triggers the state transition, and the record is now moved into the “simple” list. Thus, the function of the staging area is fulfilled.

You could easily write an equivalent script reject.py that would reject the first submission on the task list — simply copy approve.py exactly, and then replace the new state value with “rejected”, and you have written a simple rejection script that will delete the submission. Another interesting variant would be a separate archival list for rejected submissions (perhaps you want to give people a second chance). To do this, you would simply define the list and name it in the archive-to attribute of the rejected state.

The point is that with the wftk, you don’t have to write much code to specify many rather complicated workflow tasks. You simply define the system, say how it should behave, and you’re finished.

Try It Out

Action Queue Handler

The previous example illustrated the involvement of a human being in a programmed process, by creating a task that could be inspected by a human and allowing that person to make a decision regarding further processing. Another useful application of workflow, though, is as glue for disparate software systems, or as an organizational framework for programmatic tasks. In this type of system, you can imagine both humans and programs as agents; and automatic task processing programs can be called action queue handlers, because their to-do list is effectively an action queue.

Actually, the task completion script you wrote in the previous example can already be seen as an action queue handler! It checks a task index for actions it needs to take and then completes tasks as directed, so it already fits the bill. In addition, however, it can do whatever else you need. The following exercise cobbles together a simple task for such a handler to complete, and it works with a modified version of the previous example’s system.

1. In the repository directory from the first example, open server.defn and add or modify the staging area definition from the last example (the additions are the “incoming” state and the task assigned to the role “you”):

<list id="staging">
  <field id="field1" special="key"/>
  <field id="field2"/>
  <state id="incoming"/>
  <state id="proposed"/>
  <state id="approved" archive-to="simple"/>
  <state id="rejected" archive-to="_trash"/>
  <on action="add">
    <task role="you" label="Automatic incoming task">
      <data id="state"/>
      <data id="field2"/>
    </task>
    <task role="me" label="Check anonymous submission">
      <data id="state"/>
    </task>
  </on>
</list>


2. If you skipped the last example, add subdirectories to the repository for the task index and the staging area; these must be named “_tasks” and “staging”, respectively. You’ll also need the scripts defined there, so enter them as well.

3. Use your favorite text editor to add a file called autocheck.py:

import wftk

repos = wftk.repository()
repos.user_auth('you', 'x')

l = wftk.list(repos, '_todo')
l.query()

e = repos.get('_todo', l.keys()[0])
e.set("field2", e.get("field2") + " (checked by automatic processor)")
e.set("state", "proposed")
e.save()

How It Works

Again, this example only illustrates general possibilities for using the wftk; in any real-world situation, you would be doing something more elaborate, but take a look at what this setup does. The two key differences in the definition of the staging area are that a new state has been added before the “proposed” state, and that a new task has been added to the workflow sequence before human approval is necessary. That new first task is your action queue, and the autocheck.py script is the action queue processor.

As promised, the autocheck.py script is very similar to the approval script of the previous example, with one significant difference: It makes a data change to “field2” before updating the entry’s state to “proposed”. The actual data change is trivial in this case, but it could be anything you need it to be.

The script becomes an action queue processor when you set up some scheduler in the operating system (under Unix, this is called a cron job, and under Windows, a scheduled task) to execute it periodically. This kind of queue processor can serve several related purposes: Because it effectively forces records to be handled one by one, it can be used to set a limit on the number of requests processed in a given time frame. In addition, because the submission and autocheck processes are completely separate, it enables the system to respond very quickly to the submitter, and then execute a (possibly lengthy) autocheck procedure later. Only after the autocheck procedure is finished is a human asked to make a decision about the submission.
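The rate-limiting effect is easy to see in a small simulation (Python 3 syntax). The sketch below replaces the wftk task list with a plain Python list; run_once and the limit parameter are names made up for this illustration, and each call to run_once stands for one firing of the scheduler.

```python
def autocheck(record):
    """Placeholder for a possibly lengthy check; mirrors autocheck.py."""
    record["field2"] += " (checked by automatic processor)"
    record["state"] = "proposed"


def run_once(queue, limit=1):
    """Handle at most `limit` queued records and return those handled."""
    done = []
    while queue and len(done) < limit:
        record = queue.pop(0)
        autocheck(record)
        done.append(record)
    return done


# Three submissions arrive quickly; the submitter never waits for the check.
queue = [{"field2": "submission %d" % n, "state": "incoming"}
         for n in range(3)]

handled = run_once(queue)
print(len(handled), "handled,", len(queue), "still waiting")
```

With the default limit of one record per run, a scheduler that fires once a minute can never process more than sixty submissions an hour, however fast they arrive.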

Note that there’s nothing in this definition that forbids the autocheck procedure from changing the entry’s state to “approved” or “rejected” — this means that the autocheck procedure might also be used to handle certain automatic cases on its own, and the remainder of the workflow (that is, the human involvement) thus becomes unnecessary.
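A decision function for such automatic cases might look like the following sketch (Python 3 syntax). The rules, the max_length parameter, and the “trusted” field are all invented for illustration; a real autocheck procedure would apply whatever tests your organization needs.

```python
def next_state(record, max_length=200):
    """Pick the state an automatic check might set (hypothetical rules)."""
    text = record.get("field2", "").strip()
    if not text:
        return "rejected"   # nothing to review, so discard it
    if len(text) > max_length:
        return "rejected"   # too long for this imaginary list
    if record.get("trusted"):
        return "approved"   # clear-cut case, no human task needed
    return "proposed"       # everything else goes to the human reviewer


print(next_state({"field2": ""}))
print(next_state({"field2": "from a known source", "trusted": True}))
print(next_state({"field2": "this is an anonymous submission"}))
```

Only the last case leaves work for the human task; the other two would complete the workflow on their own by entering an archival state.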


Summary

Enterprise applications make use of software infrastructure to model and support business processes. This chapter covered three general categories of enterprise software:

Document management, to keep track of the documents that make up the knowledge of a business

Directories, which store information about the people who run a business

Workflow systems, which model and store information about the ongoing processes of a business

You were introduced to some of the current state of the auditing art so that you can put your programming efforts into perspective and see how these categories of software are related, not only to the needs of business organizations, but to the new requirements for documentation and validation imposed by regulatory frameworks.

Then you saw two open-source packages: the python-ldap module for talking to LDAP directories and the wftk open-source workflow toolkit for document management and workflow applications. Some useful snippets of Python showed how easy it is to write code to keep your organization organized, and your auditors as happy as auditors can be expected to be. However, the key to keeping auditors happy, just as with any other category of software user, is to ensure that they specify their own needs and that they have as much input as possible in the evolution of the software that makes their lives easy. Now, though, you should have a few ways to make your own life easy in that whole process.

Exercises

1. What documentation of existing business processes does your organization already have in place? Is this documentation machine-usable, or does it simply consist of English descriptions of the way you do business? (The latter is enough to satisfy ISO 9000 requirements, and is already a great deal better than a system in which people just “know” what should be done.) Find out how your company does its auditing, and think about how the techniques outlined in this chapter could make it easier for the employees to do their jobs.

2. Think of some processes you can model in your life outside the office or in other businesses. If you have a small business and skipped question 1 because it didn’t apply to you, you’re in luck: You can start modeling your business processes right now!

3. The document retention framework stores its rules in a list in wftk. Devise a set of workflow and document management tools based on the wftk that would enable you to modify those rules in a controlled way.


21

Web Applications and Web Services

If you’ve ever surfed the web, you’ve probably used web applications: to do research, to pay your bills, to send e-mail, or to buy from an online store. As a programmer, you may even have written web applications in other languages. If you have, you’ll find the experience of doing so in Python comfortingly familiar, and probably easier. If you’re just starting out, then rest assured there’s no better way to enter this field than with Python.

When the World Wide Web was invented in the early 1990s, the Internet was used mainly by university students, researchers, and employees of technology companies. Within a few years, the web had brought the Internet into popular use and culture, co-opting proprietary online services or driving them into bankruptcy. Its triumph is so complete that for many people, the web is synonymous with the Internet, a technology that predates it by more than 20 years.

Our culture became dependent on the web so quickly that it hardly seems necessary to evangelize the benefits for the user of web applications over traditional client-server or standalone applications. Web applications are accessible from almost anywhere in the world. Installing one piece of software on your computer — a web browser — gives you access to all of them. Web applications present a simple user interface using a limited set of widgets. They are (usually) platform independent, usable from any web browser on any operating system — including ones not yet created when the application was written.

If you haven’t yet written your own web applications, however, you might not know about the benefits of developing for the web platform. In many respects, the benefits for the developer are the flip side of the benefits for the user. A web application doesn’t need to be distributed; its users come to it. Updates don’t have to be distributed either: When you upgrade the copy of the program on your server, all of your users start using the new version. Web applications are by convention easy to pick up and use, and because others can link to a web application from their own web sites, driving traffic there, buzz and word-of-mouth spread much more quickly. As the developer, you also have more freedom to experiment and more control over the environment in which your software runs.

The virtues of the web are the virtues of Python: its flexibility, its simplicity, and its inclusive spirit. Python applications are written on Internet time; a hobbyist’s idea can be explored in an evening and become a web fad the next day.


Chapter 21

Python also comes packaged with simple, useful modules for interacting with web clients and servers: urlparse, urllib and its big brother urllib2, htmllib, cgi, and even SimpleHTTPServer. There are also many (some would say too many) open-source frameworks that make it easy to build a complex Python web application. Frameworks such as Zope, Quixote, CherryPy, and Subway provide templating, authentication, access control, and more, freeing you up to work on the code that makes your application special.
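As a small taste of these standard modules, the sketch below takes a URL apart and builds a query string. It is written for Python 3, where the functions of the urlparse module live in urllib.parse; the www.example.com address is just a placeholder.

```python
from urllib.parse import urlparse, parse_qs, urlencode

# Placeholder URL, used only to have something to take apart.
url = "http://www.example.com/search?q=python&page=2"

parts = urlparse(url)
print(parts.scheme, parts.netloc, parts.path)

# parse_qs maps each parameter name to a list of its values.
query = parse_qs(parts.query)
print(query)

# urlencode goes the other way and escapes values for use in a URL.
print(urlencode({"q": "web services", "page": 1}))
```

Note that parse_qs returns a list for every parameter, because a name may legitimately appear more than once in a query string.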

It’s a huge field, perhaps the most active in the Python community, but this chapter gets you started. You’ll learn how to use basic, standard Python modules to make web applications people will find useful. You’ll also learn how to make them even more useful by creating “web service” interfaces that make it possible for your users to use your applications as elements in their own programs. In addition, you will learn how to write scripts of your own to consume popular web services and turn the knowledge gained to your advantage.

If you’re reading this chapter, you’ve probably used web applications before and perhaps have written a web page or two, but you probably don’t know how the web is designed or how web applications work behind the scenes. If your experience is greater, feel free to skip ahead, although you may find the next section interesting. If you’ve been writing web applications, you might not have realized that the web actually implements a specific architecture, and that keeping the architecture in mind leads to better, simpler applications.

REST: The Architecture of the Web

It might seem strange to think of the web as having an architecture at all, especially for anyone who started programming as or after the web became popular. Because it’s so tightly integrated into your daily life, the assumptions that drive the web might seem invisible or have the flavor of defaults. They are out there, though, differing from what came before and arranged into a coherent architecture. The architecture of the web was formally defined in 2000 by Roy Fielding, one of its founders. He calls the web architecture Representational State Transfer, or REST. This section briefly summarizes the most important concepts behind REST, while connecting them to the workings of HTTP (the protocol that implements REST) and providing examples of architectures that made the same decisions differently.

Characteristics of REST

Much of this chapter is dedicated to writing applications that use the features of the REST architecture to best advantage. As a first step toward learning about those features, here’s a brief look at some of the main aspects of REST.

REST Resources

Fielding’s dissertation on architectural styles and REST is available at www.ics.uci.edu/~fielding/pubs/dissertation/top.htm. Chapter 5 describes REST. Introductions that are more informal are available at the REST Wiki, at http://rest.blueoxen.net/, and at the Wikipedia entry for REST, at http://en.wikipedia.org/wiki/REST.
