Beginning Python (2005)

Web Applications and Web Services

While you’re at Google’s site, you can also download the API developers’ kit. The kit is oriented towards Java and .NET programmers, but it does contain a useful human-readable API reference, as well as a WSDL file, which describes the API in machine-parseable form. (See the “WSDL” section later for more information on WSDL.) After you’ve got an API key, you’re ready to write some scripts. Here’s a simple one, GoogleSearch.py, that does a Google search from the command line and prints the results:

    import SOAPpy

    class GoogleAPI:
        "Implements part of the Google Web API as a simple Python class."

        URL = 'http://api.google.com/search/beta2'
        NAMESPACE = 'urn:GoogleSearch'

        def __init__(self):
            self.server = SOAPpy.SOAPProxy(self.URL, self.NAMESPACE)

SOAPProxy acts a lot like xmlrpclib’s ServerProxy class: It overrides __call__ to transform method calls into web service calls. The only difference so far is that whereas a ServerProxy operates against a server URL, a SOAPProxy also operates against a namespace. SOAP is big on namespaces, as you’ll see. Here, the namespace is used to identify a provider of web service methods. One server might provide two completely different web services, both of which implement a method called doFoo. Each web service has its own namespace, though, which enables the server to distinguish between your doFoo and your neighbor’s doFoo.

Recall that in XML-RPC interfaces, this problem is traditionally solved by qualifying the method names with package names, as in “bittywiki.getPage”. In the SOAP interface, that method would be called just plain “getPage” but it would be executed in a namespace like ‘urn:BittyWiki’ to distinguish it from the “getPage” methods provided by other web services:

            #These two commands will make SOAPpy print the raw request and
            #response for each SOAP call, letting you see the internal
            #workings of the protocol.
            #self.server.config.dumpSOAPOut=1
            #self.server.config.dumpSOAPIn=1

        def doGoogleSearch(self, key, searchString, resultOffset=0,
                           maxResults=10, filter=True, restrict="",
                           safeSearch=True, languageRestrict="en"):
            """A convenience method to hide the fact that a call to
            doGoogleSearch requires ten arguments, two of which are
            deprecated and shouldn't be used. By calling this method
            you can do a search by providing only your Google API key
            and the search string. For the meanings of the other
            arguments, see the reference for the Google Web APIs."""
            return self.server.doGoogleSearch(key, searchString,
                                              resultOffset, maxResults,
                                              filter, restrict,
                                              safeSearch, languageRestrict,
                                              "", "")
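The proxy-and-namespace idea described above can be sketched without any SOAP library. The following is a toy, not SOAPpy's implementation: ToyProxy, toy_server, and the two urn: namespaces are hypothetical names invented for illustration, and __getattr__ is simply one common way to turn attribute access into remote calls in Python.

```python
class ToyProxy:
    """Turns attribute access into (namespace, method, args) 'requests'."""

    def __init__(self, url, namespace, transport):
        self.url = url
        self.namespace = namespace
        # Hypothetical callable standing in for the HTTP POST.
        self.transport = transport

    def __getattr__(self, method_name):
        # Python calls __getattr__ only for names it cannot find normally,
        # so every unknown attribute is treated as a remote method.
        def remote_call(*args):
            return self.transport(self.url, self.namespace, method_name, args)
        return remote_call


def toy_server(url, namespace, method_name, args):
    """A stand-in server hosting two services that both define doFoo."""
    services = {
        ("urn:Yours", "doFoo"): lambda x: "your doFoo got %r" % (x,),
        ("urn:Neighbors", "doFoo"): lambda x: "neighbor's doFoo got %r" % (x,),
    }
    # The namespace, not the method name alone, selects the implementation.
    return services[(namespace, method_name)](*args)


yours = ToyProxy("http://localhost:8002/", "urn:Yours", toy_server)
neighbors = ToyProxy("http://localhost:8002/", "urn:Neighbors", toy_server)
print(yours.doFoo(1))
print(neighbors.doFoo(1))
```

Both proxies point at the same URL and call the same method name; only the namespace tells the server which doFoo is meant.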


Chapter 21

Just as with XML-RPC, a method call on the server object silently spawns an XML document depicting the method you want to call, and the values you’re providing for that method’s arguments. This XML document is submitted to the server via HTTP POST. The HTTP response is a second XML document depicting a data structure, which is parsed and used to construct an actual Python data structure. The data structure is what is returned to you:

    if __name__ == '__main__':
        import sys
        if len(sys.argv) != 3:
            print "Usage: %s [Google API key] [Search term]" % sys.argv[0]
            sys.exit(1)

        key, term = sys.argv[1:3]
        resultObj = GoogleAPI().doGoogleSearch(key, term)
        results = resultObj.resultElements
        print 'First %s result(s) for "%s":' % (len(results), term)
        for result in results:
            print " %s: %s" % (result.title, result.URL)

Running this script gives you search results on standard output:

    $ python GoogleSearch.py [Your Google API key] "Python and SOAP"
    First 10 result(s) for "Python and SOAP":
     The <b>Python</b> Web services developer: <b>Python</b> <b>SOAP</b> libraries http://www-106.ibm.com/developerworks/library/ws-pyth5/
     The <b>Python</b> Web services developer: <b>Python</b> <b>SOAP</b> libraries, Part 2 http://www-106.ibm.com/developerworks/library/ws-pyth6/
    ...
     Scripting the Web with <b>Python</b> : Specs > <b>SOAP</b> > Implementations http://python.scripting.com/directory/13/specs/soap/implementations
     [PythonCE] using <b>Python</b> 2.2 + <b>SOAP</b> on Windows CE http://mail.python.org/pipermail/pythonce/2002-May/000069.html

It’s useful for didactic purposes to show you how to make SOAP calls directly, and often you may have no other choice. If you’re serious about using the Google API with Python, though, it’s better to use pygoogle, a Google-specific SOAP client library for Python. It’s available at http://pygoogle.sourceforge.net/, and it’s a lot simpler than calling the SOAP methods directly.

The SOAP Request

Here’s a transcript of a hypothetical SOAP RPC call that tries to sort a list in ascending order; compare it to the XML-RPC transcript earlier that called an XML-RPC version of the same method:

    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope
      SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
      xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
      xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
      xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/1999/XMLSchema">
    <SOAP-ENV:Body>
    <ns1:sortList xmlns:ns1="urn:SearchSort" SOAP-ENC:root="1">
    <v1 SOAP-ENC:arrayType="xsd:int[2]" xsi:type="SOAP-ENC:Array">
    <item>10</item>
    <item>2</item>
    </v1>
    <v2 xsi:type="xsd:boolean">True</v2>
    </ns1:sortList>
    </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>

The first thing to notice is all those xmlns declarations. SOAP is very particular about XML namespaces, whereas XML-RPC is much more informal and serves standalone XML documents. SOAP uses XML namespaces to define the format of the SOAP message itself (SOAP-ENV), the data types (such as xsd:boolean and the SOAP-specific SOAP-ENC:Array), and the very concept of a data type (xsi:type). This gives SOAP a lot more flexibility in how its data is encoded, but between XML Schema (xsd) and the SOAP data encoding schema (SOAP-ENC), most of the basic data types are already defined for you. Only in more complicated cases will you need to define custom data types.

The other namespace mentioned in this message is urn:SearchSort. That’s the namespace of the method you’re trying to call. As mentioned before, this is like the way the XML-RPC version of this request named its method searchsort.sortList, instead of just sortList. SOAP has formalized the XML-RPC convention, and uses XML namespaces to distinguish between different methods with the same name. Your SOAP call must be executed in a particular XML namespace. If you use a Python SOAP library to make SOAP calls, this is probably the only namespace you’ll actually have to worry about.

If you ignore the namespaces, this message looks a lot like the XML-RPC message you saw earlier. There’s a method call tag that contains a list of tags for the arguments to be passed into the method. Instead of the method call tag containing a child tag with the method name, here the tag is simply named after the method to be called. In XML-RPC, the arguments were listed inside a separate params tag. Here, they’re direct children of the method call tag. The SOAP message is a little more concise, but (again, disregarding the namespace declarations) just as easy to read.
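To make the structure described above concrete, here is a minimal sketch that pulls the namespace, the method name, and the arguments out of the sortList request using the standard library's xml.etree.ElementTree (Python 3). The helper name parse_call is an assumption made for this example, and the argument values are deliberately left as strings rather than converted according to their xsi:type.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

REQUEST = """<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
  xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
  xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP-ENV:Body>
<ns1:sortList xmlns:ns1="urn:SearchSort" SOAP-ENC:root="1">
<v1 SOAP-ENC:arrayType="xsd:int[2]" xsi:type="SOAP-ENC:Array">
<item>10</item>
<item>2</item>
</v1>
<v2 xsi:type="xsd:boolean">True</v2>
</ns1:sortList>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def parse_call(xml_text):
    """Return (namespace, method name, [(argument tag, raw value), ...])."""
    envelope = ET.fromstring(xml_text.encode("utf-8"))
    body = envelope.find("{%s}Body" % SOAP_ENV)
    call = body[0]  # the first child of Body is the method call tag
    # ElementTree spells a qualified tag as "{namespace}localname".
    namespace, method = call.tag[1:].split("}")
    args = []
    for child in call:
        # An array argument has <item> children; a scalar has only text.
        items = [item.text for item in child]
        args.append((child.tag, items or child.text))
    return namespace, method, args


print(parse_call(REQUEST))
```

The arguments are direct children of the method call tag, so a single loop over that tag recovers them; there is no params wrapper to unwrap first.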

Compare the XML-RPC representation of the array to be sorted, which you saw earlier, to the SOAP representation of the same array:

    <array>
    <data>
    <value><i4>2</i4></value>
    <value><i4>10</i4></value>
    </data>
    </array>

    <v1 SOAP-ENC:arrayType="xsd:int[2]" xsi:type="SOAP-ENC:Array">
    <item>10</item>
    <item>2</item>
    </v1>


This difference between the two protocols is typical. There’s more up-front definition in SOAP and more references to external documents that formally define the data types. The upside of that is that once the definition is done, it takes fewer bytes to actually define a data structure. It doesn’t make much difference with a small array like this, but consider an array with thousands or millions of elements. SOAP is more efficient than XML-RPC at representing large data structures.
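The size claim can be checked roughly with a short experiment. This is a sketch under stated assumptions: the XML-RPC side is produced by Python 3's xmlrpc.client.dumps, while the SOAP side is a hand-built string modeled on the transcript above, not the output of a real SOAP library. The function names are hypothetical.

```python
import xmlrpc.client

# Hand-written envelope modeled on the sortList transcript (an assumption;
# a real SOAP library may emit somewhat different markup).
SOAP_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<SOAP-ENV:Envelope'
    ' SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"'
    ' xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"'
    ' xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"'
    ' xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"'
    ' xmlns:xsd="http://www.w3.org/1999/XMLSchema">\n'
    '<SOAP-ENV:Body>\n'
    '<ns1:sortList xmlns:ns1="urn:SearchSort" SOAP-ENC:root="1">\n'
    '<v1 SOAP-ENC:arrayType="xsd:int[%d]" xsi:type="SOAP-ENC:Array">\n'
    '%s'
    '</v1>\n'
    '</ns1:sortList>\n'
    '</SOAP-ENV:Body>\n'
    '</SOAP-ENV:Envelope>\n'
)


def xmlrpc_message(numbers):
    """A complete XML-RPC methodCall document carrying one array argument."""
    return xmlrpc.client.dumps((numbers,), methodname="searchsort.sortList")


def soap_message(numbers):
    """A complete SOAP envelope carrying the same array."""
    items = "".join("<item>%d</item>\n" % n for n in numbers)
    return SOAP_TEMPLATE % (len(numbers), items)


for size in (2, 1000):
    numbers = list(range(size))
    print(size, len(xmlrpc_message(numbers)), len(soap_message(numbers)))
```

With two elements the SOAP message is the longer one, because the envelope's namespace declarations dominate. With a thousand elements the shorter per-item markup wins.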

The SOAP Response

Here’s a possible response you might get from a SOAP server after sending it the sortList request:

    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope
      SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
      xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
      xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
      xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/1999/XMLSchema">
    <SOAP-ENV:Body>
    <ns1:sortList xmlns:ns1="urn:SearchSort" SOAP-ENC:root="1">
    <return SOAP-ENC:arrayType="xsd:int[2]" xsi:type="SOAP-ENC:Array">
    <item>2</item>
    <item>10</item>
    </return>
    </ns1:sortList>
    </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>

Just as with XML-RPC, a SOAP response has the same basic structure as a SOAP request. Where the SOAP request had a list of arguments, the SOAP response has a single return value. This, too, is similar to XML-RPC: Recall that an XML-RPC response contained a params list, which was only allowed to contain one param — the return value. SOAP makes this convention more natural by eliminating the params tag and just returning the return value.
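As a rough illustration of how a client library might turn that single return value back into a Python object, the sketch below reads the return element and uses its arrayType attribute to decide on a conversion. The function name parse_return is hypothetical, and matching on the literal prefix "xsd:int[" is a shortcut: a real implementation would resolve the xsd prefix to its namespace.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"

RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
  xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
  xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP-ENV:Body>
<ns1:sortList xmlns:ns1="urn:SearchSort" SOAP-ENC:root="1">
<return SOAP-ENC:arrayType="xsd:int[2]" xsi:type="SOAP-ENC:Array">
<item>2</item>
<item>10</item>
</return>
</ns1:sortList>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def parse_return(xml_text):
    """Extract the single return value from a SOAP RPC response."""
    envelope = ET.fromstring(xml_text.encode("utf-8"))
    method_tag = envelope.find("{%s}Body" % SOAP_ENV)[0]
    result = method_tag.find("return")
    array_type = result.get("{%s}arrayType" % SOAP_ENC)  # e.g. "xsd:int[2]"
    # Shortcut: only integer arrays are converted in this sketch.
    if array_type is not None and array_type.startswith("xsd:int["):
        return [int(item.text) for item in result]
    return result.text


print(parse_return(RESPONSE))
```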

If Something Goes Wrong

If you make a SOAP request that makes the server code throw an exception, the Body of the response you get back will contain a Fault element. It might look something like this:

    <SOAP-ENV:Body>
    <SOAP-ENV:Fault SOAP-ENC:root="1">
    <faultcode>SOAP-ENV:Client</faultcode>
    <faultstring>No method urn:SearchSort:sortList found</faultstring>
    <detail xsi:type="xsd:string">
    There's no method "sortList" in the urn:SearchSort namespace.
    </detail>
    </SOAP-ENV:Fault>
    </SOAP-ENV:Body>

The faultstring and detail sub-elements of Fault are for human-readable descriptions, and the faultcode element describes the type of error. Whereas XML-RPC says nothing about the fault code except that it must be an integer, SOAP defines four standard strings to serve as fault codes. Two of them (MustUnderstand and VersionMismatch) you probably won’t encounter in basic SOAP use. The other two fault codes serve, appropriately enough, to identify who caused the fault. If you’re writing a SOAP client and you get a faultcode of Client (Sender in SOAP version 1.2), that means you caused the error (for instance, in the preceding example, by calling a method that doesn’t exist in the namespace you specified). If the faultcode is Server (Receiver in SOAP version 1.2), that means there’s nothing wrong with your request but the server can’t fulfill it at the moment; perhaps the server code can’t access a database or some other necessary resource.

Within a Python interface, the details of a response with a Fault are hidden from you, pretty much as in XML-RPC. If a Python method you’ve exposed through SOAP throws an exception, the SOAP server will automatically transform the exception into a SOAP response with a Fault element. If you’re using SOAPpy and you call a remote method that responds with a Fault, it’ll be transformed into a subclass of Error: SOAPpy.Types.faultType.
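The following sketch shows, in outline, what a client library might do with a Fault: find the element, read its three sub-elements, and raise a Python exception. It is not SOAPpy's code. SoapFault, its blame property, and check_for_fault are hypothetical names, and the envelope around the Fault is reconstructed from the earlier transcripts.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

FAULT = """<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
  xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
  xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP-ENV:Body>
<SOAP-ENV:Fault SOAP-ENC:root="1">
<faultcode>SOAP-ENV:Client</faultcode>
<faultstring>No method urn:SearchSort:sortList found</faultstring>
<detail xsi:type="xsd:string">
There's no method "sortList" in the urn:SearchSort namespace.
</detail>
</SOAP-ENV:Fault>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


class SoapFault(Exception):
    """Hypothetical exception type standing in for a library's fault class."""

    def __init__(self, code, message, detail):
        Exception.__init__(self, "%s: %s" % (code, message))
        self.code = code
        self.message = message
        self.detail = detail

    @property
    def blame(self):
        """Say which side the fault code points at."""
        local_name = self.code.split(":")[-1]  # drop the namespace prefix
        if local_name in ("Client", "Sender"):
            return "client"
        if local_name in ("Server", "Receiver"):
            return "server"
        return "other"


def check_for_fault(xml_text):
    """Raise SoapFault if the response body holds a Fault element."""
    envelope = ET.fromstring(xml_text)
    fault = envelope.find("{%s}Body/{%s}Fault" % (SOAP_ENV, SOAP_ENV))
    if fault is None:
        return
    raise SoapFault(fault.findtext("faultcode"),
                    fault.findtext("faultstring"),
                    (fault.findtext("detail") or "").strip())


try:
    check_for_fault(FAULT)
except SoapFault as error:
    print(error.blame, "-", error)
```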

Exposing a SOAP Interface to BittyWiki

In principle, there’s no reason why you shouldn’t be able to run a SOAP server from a CGI script: Remember that despite all the additional complexity and mystique of SOAP, it’s just like REST and XML-RPC in that it’s just a document being POSTed to a URL and another document being sent in return. Unfortunately, SOAPpy doesn’t provide a CGI script that serves SOAP requests, only a standalone server, SOAPServer.

ZSI, the other SOAP implementation for Python, does offer a CGI-based server.

The following sample script, BittyWiki-SOAPServer.py, exposes the BittyWiki interface to SOAP using a standalone server. This file should go into the same directory as the file BittyWiki.py, so that you can use the core BittyWiki engine. Alternatively, you can put BittyWiki.py into one of the directories in your PYTHONPATH so you can use it from anywhere:

    #!/usr/bin/python
    import sys
    import SOAPpy
    from BittyWiki import Wiki

    class BittyWikiAPI:
        """A simple wrapper around the basic BittyWiki functionality we
        want to expose to the API."""

        def __init__(self, wikiBase):
            "Initialize a wiki located in the given directory."
            self.wiki = Wiki(wikiBase)

        def getPage(self, pageName):
            "Returns the text of the given page."
            page = self.wiki.getPage(pageName)
            if not page.exists():
                raise NoSuchPage, page.name
            return page.getText()

        def save(self, pageName, newText):
            "Saves a page of the wiki."
            page = self.wiki.getPage(pageName)
            page.text = newText
            page.save()
            return "Page saved."

        def delete(self, pageName):
            "Deletes a page of the wiki."
            page = self.wiki.getPage(pageName)
            if not page.exists():
                raise NoSuchPage, page.name
            page.delete()
            return "Page deleted."

    class NoSuchPage(Exception):
        """An exception thrown when a caller tries to access a page that
        doesn't exist."""
        pass

The actual API code is exactly the same as for the XML-RPC server; it could even be moved into a common library. The only difference is that now we register it with a SOAPServer instead of a SimpleXMLRPCServer:

    DEFAULT_PORT = 8002
    NAMESPACE = 'urn:BittyWiki'
    WIKI_BASE = 'wiki/'

    if __name__ == '__main__':
        api = BittyWikiAPI(WIKI_BASE)
        port = DEFAULT_PORT
        if len(sys.argv) > 1:
            port = sys.argv[1]
        try:
            port = int(port)
        except ValueError:
            #Oops, that wasn't a port number. Chide the user and exit.
            print 'Usage: "%s [optional port number]"' % sys.argv[0]
            sys.exit(1)

        print "Starting up standalone SOAP server on port %s." % port
        handler = SOAPpy.SOAPServer(('localhost', port))
        handler.registerObject(api, NAMESPACE)
        handler.serve_forever()

Try It Out Manipulating BittyWiki through SOAP

In one window, start the standalone SOAP server:

    $ python BittyWiki-SOAPServer.py 8002
    Starting up standalone SOAP server on port 8002.

In another, start an interactive Python session:


    >>> import SOAPpy
    >>> bittywiki = SOAPpy.SOAPProxy("http://localhost:8002/", "urn:BittyWiki")
    >>> bittywiki.getPage("CreatedBySOAP")
    <Fault SOAP-ENV:Server: Method urn:BittyWiki:getPage failed.: __main__.NoSuchPage CreatedBySOAP>
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ...
    SOAPpy.Types.faultType: <Fault SOAP-ENV:Server: Method urn:BittyWiki:getPage failed.: __main__.NoSuchPage CreatedBySOAP>
    >>> bittywiki.save("CreatedBySOAP", "This page was created through the SOAP interface.")
    'Page saved.'
    >>> bittywiki.getPage("CreatedBySOAP")
    'This page was created through the SOAP interface.'

The experience of using SOAP, hidden behind SOAPpy, is similar to the experience of using XML-RPC, hidden behind xmlrpclib. You can make method calls, passing in standard Python objects, and let the library take care of all the details.

Wiki Search-and-Replace Using the SOAP Web Service

Here’s WikiSpiderSOAP.py, another wiki search-and-replace client similar to the ones described earlier for BittyWiki’s REST and XML-RPC interfaces. By now, this code should be familiar. The pattern is always the same: Set up some reference to the basic BittyWiki API and run the basic search-and-replace spider algorithm using it. The only major difference between this version and the XML-RPC version is the exception handling: xmlrpclib and SOAPpy act differently when something goes wrong on the server side, so the exception handling code must be different. Other than that, the SOAP-based search-and-replace spider looks more or less the same as the XML-RPC one:

    #!/usr/bin/python
    import re
    import SOAPpy

    class WikiReplaceSpider:
        "A class for running search-and-replace against a web of wiki pages."

        WIKI_WORD = re.compile('(([A-Z][a-z0-9]*){2,})')

        def __init__(self, rpcURL):
            "Accepts a URL to a BittyWiki SOAP API."
            self.api = SOAPpy.SOAPProxy(rpcURL, "urn:BittyWiki")
            self.api.config.dumpSOAPIn=1

        def replace(self, find, replace):
            """Spider wiki pages starting at the front page, accessing them
            and changing them via the SOAP API."""
            processed = {} #Keep track of the pages already processed.
            todo = ['HomePage'] #Start at the front page of the wiki.
            while todo:
                for pageName in todo:
                    print 'Checking "%s"' % pageName
                    try:
                        pageText = self.api.getPage(pageName)
                    except SOAPpy.Types.faultType, fault:
                        if fault.detail.find("NoSuchPage") != -1:
                            #Some page mentioned a WikiWord that doesn't exist
                            #yet; not a big deal.
                            pass
                        else:
                            #Some other problem; pass it on up.
                            raise SOAPpy.Types.faultType, fault
                    else:
                        #This page actually exists; process it.

                        #First, find any WikiWords in this page: they may
                        #reference other existing pages.
                        for wikiWord in self.WIKI_WORD.findall(pageText):
                            linkPage = wikiWord[0]
                            if not processed.get(linkPage) and linkPage not in todo:
                                #We haven't processed this page yet: put it on
                                #the to-do list.
                                todo.append(linkPage)

                        #Run the search-and-replace on the page text to get the
                        #new text of the page.
                        newText = pageText.replace(find, replace)

                        #Check to see if this page name matches the search
                        #string. If it does, delete it and recreate it
                        #with the new text; otherwise, just save the new
                        #text in the existing page.
                        newPageName = pageName.replace(find, replace)
                        if newPageName != pageName:
                            print ' Deleting "%s", will recreate as "%s"' \
                                  % (pageName, newPageName)
                            self.api.delete(pageName)
                        if newPageName != pageName or newText != pageText:
                            print ' Saving "%s"' % newPageName
                            self.api.save(newPageName, newText)

                        #Mark the new page as processed so we don't go through
                        #it a second time.
                        if newPageName != pageName:
                            processed[newPageName] = True
                    processed[pageName] = True
                    todo.remove(pageName)

    if __name__ == '__main__':
        import sys
        if len(sys.argv) == 4:
            rpcURL, find, replace = sys.argv[1:]
        else:
            print 'Usage: %s [URL to BittyWiki SOAP API] [find] [replace]' \
                  % sys.argv[0]
            sys.exit(1)
        WikiReplaceSpider(rpcURL).replace(find, replace)


This spider works just like the REST and the XML-RPC versions described earlier in this chapter:

    $ python WikiSpiderSOAP.py http://localhost:8002/ Foo Bar
    Checking "HomePage"
     Saving "HomePage"
    Checking "FooCaseStudies"
    ...

Note that because BittyWiki-SOAPServer.py runs its own web server, there’s no need to point to a script somewhere on the web server that handles the SOAP interface. The entire web server is the SOAP interface.
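The spider algorithm itself does not depend on SOAP, so it can be tried without a server. The sketch below is a variation, not the book's code: it runs in Python 3, it uses a queue instead of modifying the to-do list while iterating over it, and FakeWikiAPI is a hypothetical in-memory stand-in for the three BittyWiki methods that raises KeyError where the real service would return a fault.

```python
import re
from collections import deque

# Same pattern as the spider's WIKI_WORD, with a non-capturing group so
# that findall returns the whole WikiWord as a string.
WIKI_WORD = re.compile(r"(?:[A-Z][a-z0-9]*){2,}")


class FakeWikiAPI:
    """In-memory stand-in (hypothetical) for getPage, save, and delete."""

    def __init__(self, pages):
        self.pages = dict(pages)

    def getPage(self, name):
        return self.pages[name]  # KeyError plays the role of NoSuchPage

    def save(self, name, text):
        self.pages[name] = text
        return "Page saved."

    def delete(self, name):
        del self.pages[name]
        return "Page deleted."


def replace_all(api, find, replace, start="HomePage"):
    """Breadth-first search-and-replace over every reachable page."""
    seen = {start}
    queue = deque([start])
    while queue:
        name = queue.popleft()
        try:
            text = api.getPage(name)
        except KeyError:
            continue  # a link to a page that doesn't exist yet
        # Queue every linked page that has not been seen before.
        for link in WIKI_WORD.findall(text):
            if link not in seen:
                seen.add(link)
                queue.append(link)
        new_name = name.replace(find, replace)
        new_text = text.replace(find, replace)
        if new_name != name:
            api.delete(name)
            seen.add(new_name)  # don't revisit the page under its new name
        if new_name != name or new_text != text:
            api.save(new_name, new_text)


wiki = FakeWikiAPI({
    "HomePage": "See FooCaseStudies and Foo.",
    "FooCaseStudies": "Foo is great.",
})
replace_all(wiki, "Foo", "Bar")
print(wiki.pages)
```

Swapping FakeWikiAPI for a real proxy object would only require catching that library's fault exception in place of KeyError.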

Documenting Your Web Service API

Exposing a web service API won’t do any good unless the people who want to write robots can figure out how to use it. If you were to distribute a Python module with inadequate documentation (shame on you), a determined user could try to figure out the API by looking at the source code and, if necessary, making experimental changes, learning through trial and error. That isn’t possible when you expose a web service, so it’s especially important that you have a real way of getting the API information to your users.

Human-Readable API Documentation

In my opinion, no matter which web service protocol you’re using, nothing beats an up-to-date human-readable description of an API. This can be written manually or generated through introspection and the use of Python docstrings. Up next are three sample documents that describe the three web service APIs for the BittyWiki application created in this chapter. They’re all extremely short, but they contain all the information a user needs to write an application using any of them.

The BittyWiki REST API Document

To get the raw wiki markup for the page “WikiPage”, GET the URL http://localhost:8000/cgi-bin/bittywiki-rest.cgi/WikiPage. You’ll get an XML data structure in which the <data> tag contains the wiki markup of the WikiPage page. If the WikiPage page doesn’t exist, you’ll get an error.

To modify the contents of the page “WikiPage”, POST to the URL http://localhost:8000/cgi-bin/bittywiki-rest.cgi/WikiPage. Set data equal to the wiki markup you want to write to the page, and operation to the string write. You’ll receive an XML data structure in which the <message> tag contains a status message. If the WikiPage page doesn’t exist, it will be automatically created.

To delete the page “WikiPage”, POST to the URL http://localhost:8000/cgi-bin/bittywiki-rest.cgi/WikiPage. Set “operation” to the string delete. You’ll receive an XML data structure in which the <message> tag contains a status message. If the WikiPage page doesn’t exist, you’ll get an error.
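A reader of this document could translate it into requests along the following lines. This is a sketch using Python 3's urllib. It assumes the data and operation values are sent as ordinary form-encoded POST fields, which the document implies but does not state, and the three helper names are hypothetical. The requests are only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:8000/cgi-bin/bittywiki-rest.cgi/"


def read_request(page):
    """GET request for the raw markup of a page."""
    return Request(BASE + page)  # no body, so urllib will use GET


def write_request(page, markup):
    """POST request that writes new markup to a page."""
    body = urlencode({"operation": "write", "data": markup}).encode("ascii")
    return Request(BASE + page, data=body)  # a body makes this a POST


def delete_request(page):
    """POST request that deletes a page."""
    body = urlencode({"operation": "delete"}).encode("ascii")
    return Request(BASE + page, data=body)


for request in (read_request("WikiPage"),
                write_request("WikiPage", "Hello world"),
                delete_request("WikiPage")):
    print(request.get_method(), request.full_url, request.data)
```

Passing any of these objects to urllib.request.urlopen would send it to a running server; the response body would then be the XML structure the document describes.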

The BittyWiki XML-RPC API Document

The BittyWiki API server is located at http://localhost:8001/. It exposes three methods:

bittywiki.getPage(string pageName): Returns the text of the named page. Passing an empty string designates the wiki home page. This will throw a fault if you request a page that doesn’t exist.


bittywiki.save(string pageName, string text): Sets the text of the named page. If the page doesn’t already exist, it’ll be automatically created.

bittywiki.delete(string pageName): Deletes the named page. This will throw a fault if you try to delete a page that doesn’t exist.

The BittyWiki SOAP API Document

The BittyWiki SOAP server is located at http://localhost:8002/. It exposes three methods in the namespace “urn:BittyWiki”:

getPage(string pageName): Returns the text of the named page. Passing an empty string designates the wiki homepage. This will throw a fault if you request a page that doesn’t exist.

save(string pageName, string text): Sets the text of the named page. If the page doesn’t already exist, it will be automatically created.

delete(string pageName): Deletes the named page. This will throw a fault if you try to delete a page that doesn’t exist.
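Method listings like the ones above can also be generated from docstrings through introspection. Here is a minimal Python 3 sketch using the standard inspect module. The BittyWikiAPI class is a trimmed stand-in with the method bodies omitted, and describe_api is a hypothetical helper; Python signatures carry no type information, so the output lacks the "string" annotations of the hand-written documents.

```python
import inspect


class BittyWikiAPI:
    """Trimmed stand-in for the API class; only the docstrings matter here."""

    def getPage(self, pageName):
        "Returns the text of the given page."

    def save(self, pageName, newText):
        "Saves a page of the wiki."

    def delete(self, pageName):
        "Deletes a page of the wiki."


def describe_api(cls, prefix=""):
    """Build one 'name(args): docstring' line per public method."""
    lines = []
    # getmembers returns the members sorted by name.
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and special methods
        params = [p for p in inspect.signature(member).parameters
                  if p != "self"]
        doc = inspect.getdoc(member) or "(undocumented)"
        lines.append("%s%s(%s): %s" % (prefix, name, ", ".join(params), doc))
    return "\n".join(lines)


print(describe_api(BittyWikiAPI, prefix="bittywiki."))
```

Calling describe_api with an empty prefix gives the SOAP-style listing, where the namespace is stated once rather than prepended to each method name.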

The XML-RPC Introspection API

An unofficial addendum to the XML-RPC specification defines three special functions in the “system” namespace, as a convenience to users who might not know which functions an XML-RPC server supports, or what those functions might do. These special functions are the web service equivalent of Python’s ever-useful dir and help commands. Both SimpleXMLRPCServer and CGIXMLRPCRequestHandler support two of the three introspection functions, assuming you call the register_introspection_functions method on the server or handler object after instantiating it:

handler = SimpleXMLRPCServer.SimpleXMLRPCServer((host, port))

handler.register_introspection_functions()

 

| Method Name | What It Does |
| --- | --- |
| system.listMethods() | Returns the names of all the functions the server makes available. |
| system.methodHelp(string funcName) | Returns a string with documentation for the named function. Implemented in Python by returning the function’s Python docstring. |
| system.methodSignature(string funcName) | Returns the signature and return type of the named function. Not automatically supported by the Python implementation because Python function definitions don’t include type information. |
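The introspection functions can be exercised without opening a network port. The sketch below uses Python 3, where the server classes live in xmlrpc.server rather than in a SimpleXMLRPCServer module. It feeds requests straight into a SimpleXMLRPCDispatcher through its _marshaled_dispatch method, an internal helper that the standard library's own request handlers rely on. The get_page function and the call helper are hypothetical.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher


def get_page(pageName):
    "Returns the text of the given page."
    return "(text of %s)" % pageName


dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(get_page, "bittywiki.getPage")
dispatcher.register_introspection_functions()


def call(method, *args):
    """Marshal a call, dispatch it in-process, and unmarshal the result."""
    request = xmlrpc.client.dumps(args, methodname=method)
    response = dispatcher._marshaled_dispatch(request.encode("utf-8"))
    (result,), _ = xmlrpc.client.loads(response)
    return result


print(call("system.listMethods"))
print(call("system.methodHelp", "bittywiki.getPage"))
print(call("bittywiki.getPage", "HomePage"))
```

Against a real server, the same three calls would be made as attribute accesses on an xmlrpc.client.ServerProxy object.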

 

 

 

 

 

Try It Out Using the XML-RPC Introspection API

Start up and connect to the BittyWiki XML-RPC server (or CGI) as before. In addition to the BittyWiki methods shown earlier, you can use the XML-RPC introspection methods:
