
Beginning Python (2005)
Web Applications and Web Services
Sample rules include the use of a blank line to signify a new paragraph, and the use of *asterisks* to bold a selection. Unfortunately, these conventions are only informal, and there are no hard-and-fast rules, so the specific rules differ widely across the various wiki implementations.
See http://c2.com/cgi/wiki?WikiDesignPrinciples for Cunningham’s original Wiki design principles.
Sample applications often lack important features necessary to make the application fit for actual use. An online store application presented within the context of this chapter would be too complex to be easily understood, yet not complete enough to actually use to run an online store. Because the defining features of a wiki are so few and simple, it’s possible to design, build, and explain a fully fledged wiki in just a few pages. BittyWiki, the application designed and built in this chapter according to the principles just described, weighs in at under 10 kilobytes, but it’s not the shortest wiki written in Python.
See http://infomesh.net/2003/wypy/wypy.txt for a wiki written in only 814 characters and 11 lines of Python. It’s acutely painful to behold.
The BittyWiki Core Library
Before writing any code, you need to make a couple of design decisions about the nature of the wiki you want to create. In the following examples, the design decisions made are the ones that lead to the simplest wiki back-end: after all, for the purposes of this discussion, the important part of BittyWiki is the interface it presents to the web, not the back-end.
Back-end Storage
Wiki implementations store their pages in a variety of ways. Some keep their files on disk, some in a database, and some in a version control repository so that users can easily revert vandalism. For simplicity’s sake, a BittyWiki installation will keep a page on a disk file named after that page. All of a given wiki’s pages will be kept in the same directory. Because the wiki namespace is flat, no subdirectories are needed.
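The storage scheme just described can be sketched in a few lines. This is only an illustration, not the chapter's implementation: the helper names page_path, write_page, and read_page are hypothetical, and the demonstration uses a temporary directory as the wiki's base directory.

```python
import os
import tempfile


def page_path(base, name):
    """Return the path of the file that stores the page called `name`."""
    # The namespace is flat, so the page name doubles as the file name.
    return os.path.join(base, name)


def write_page(base, name, text):
    """Store `text` as the contents of the page called `name`."""
    with open(page_path(base, name), "w") as out:
        out.write(text)


def read_page(base, name):
    """Return the stored text, or an empty string if the page is missing."""
    path = page_path(base, name)
    if not os.path.isfile(path):
        return ""
    with open(path) as source:
        return source.read()


# Demonstration in a throwaway directory (an assumption of this sketch).
demo_base = tempfile.mkdtemp()
write_page(demo_base, "HomePage", "Here's the home page.")
home_text = read_page(demo_base, "HomePage")
missing_text = read_page(demo_base, "PageTwo")
```

Because every page is a single file in one directory, listing the wiki's pages amounts to listing that directory.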
WikiWords
Each wiki implementation that uses WikiWords must decide which strings are valid names of wiki pages, so that it can automatically link citations of those pages. BittyWiki will use one of the simplest WikiWord definitions: It will treat as a WikiWord any string of letters and numbers that begins with a capital letter and contains at least two capitals. “WikiWord” is itself a WikiWord, as are “WikiWord2,” “WikiworD,” “WWW,” and “AI.”
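One way to express this definition is a regular expression that requires two or more runs, each consisting of a capital letter followed by any number of lowercase letters or digits. The sketch below is an illustration under that reading; the function name is_wiki_word and the sample lists are hypothetical.

```python
import re

# Two or more runs of: one capital letter, then lowercase letters or digits.
# Anchoring with ^ and $ makes the whole string match, not just a part of it.
WIKI_WORD_ALONE = re.compile(r"^(?:[A-Z][a-z0-9]*){2,}$")


def is_wiki_word(candidate):
    """Return True if the whole string is a WikiWord."""
    return WIKI_WORD_ALONE.match(candidate) is not None


# The examples from the text, plus a few strings that should be rejected.
valid = ["WikiWord", "WikiWord2", "WikiworD", "WWW", "AI"]
invalid = ["Wiki", "wikiWord", "Wiki_Word", "2Fast", ""]
```

"Wiki" fails because it contains only one capital letter, and "wikiWord" fails because it does not begin with one.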
Any wiki page can be retrieved by name, but you also need a default page for when no name is specified. The default page will be the one called “HomePage.”
Writing the BittyWiki Core
On the basis of those design decisions, it’s now possible to write the core of BittyWiki: the code that reads from and writes to the back-end, and that processes the WikiWord links. Put this code into BittyWiki.py, in your cgi-bin/ directory or somewhere in your PYTHONPATH:
Chapter 21
"""This module implements the BittyWiki core code: that which is not
bound to any particular interface."""

import re
import os

class Wiki:
    "A class representing a wiki as a whole."
    HOME_PAGE_NAME = "HomePage"

    def __init__(self, base):
        "Initializes a wiki that uses the provided base directory."
        self.base = base
        if not os.path.exists(self.base):
            os.makedirs(self.base)
        elif not os.path.isdir(self.base):
            raise IOError('Wiki base "%s" is not a directory!' % self.base)

    def getPage(self, name=None):
        """Retrieves the given page for this wiki, which may or may not
        currently exist."""
        if not name:
            name = self.HOME_PAGE_NAME
        return Page(self, name)

class Page:
    """A class representing one page of a wiki, containing all the
    logic necessary to manipulate that page and to determine which
    other pages it references."""

    #We consider a WikiWord any word beginning with a capital letter,
    #containing at least one other capital letter, and containing only
    #alphanumerics.
    WIKI_WORD_MATCH = "(([A-Z][a-z0-9]*){2,})"
    WIKI_WORD = re.compile(WIKI_WORD_MATCH)
    WIKI_WORD_ALONE = re.compile('^%s$' % WIKI_WORD_MATCH)

    def __init__(self, wiki, name):
        """Initializes the page for the given wiki with the given
        name, making sure the name is valid. The page may or may not
        actually exist right now in the wiki."""
        #WIKI_WORD matches a WikiWord anywhere in the string. We want to make
        #sure the page is a WikiWord and nothing else.
        if not self.WIKI_WORD_ALONE.match(name):
            raise NotWikiWord, name
        self.wiki = wiki
        self.name = name
        self.path = os.path.join(self.wiki.base, name)

    def exists(self):
        "Returns true if there's a page for the wiki with this name."
        return os.path.isfile(self.path)
    def load(self):
        "Loads this page from disk, if it exists."
        if not hasattr(self, 'text'):
            self.text = ''
            if self.exists():
                self.text = open(self.path, 'r').read()

    def save(self):
        "Saves this page. If it didn't exist before, it does now."
        if not hasattr(self, 'text'):
            self.text = ''
        out = open(self.path, 'w')
        out.write(self.text)
        out.close()

    def delete(self):
        "Deletes this page, assuming it currently exists."
        if self.exists():
            os.remove(self.path)

    def getText(self):
        "Returns the raw text of this page."
        self.load()
        return self.text

class NotWikiWord(Exception):
    """Exception thrown when someone tries to pass off a non-WikiWord
    as a WikiWord."""
    pass
Try It Out
Creating Wiki Pages from an Interactive Python Session
In just a bit, you’re going to give BittyWiki a web interface, and spend much of the rest of the chapter accessing it via HTTP. The easiest way to get used to the basic API, however, is to play with BittyWiki from an interactive Python session, with no web interface needed:
>>> from BittyWiki import Wiki
>>> wiki = Wiki("localwiki")
>>> homePage = wiki.getPage()
>>> homePage.text = "Here's the home page.\n\nIt links to PageTwo and PageThree."
>>> homePage.save()
The localwiki directory now contains your wiki’s files:
>>> #The "localwiki" directory now contains your wiki's files.
>>> import os
>>> open(os.path.join("localwiki", "HomePage")).read()
"Here's the home page.\n\nIt links to PageTwo and PageThree."
HomePage references other pages in the wiki, but none of them exist yet:
>>> page2 = wiki.getPage("PageTwo")
>>> page2.exists()
False
Of course, we can create one of those pages:
>>> page2.text = "Here's page 2.\n\nIt links back to HomePage."
>>> page2.save()
>>> page2.exists()
True
Finally, a look at the NotWikiWord exception:
>>> wiki.getPage("Wiki")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "BittyWiki.py", line 25, in getPage
    return Page(self, name)
  File "BittyWiki.py", line 47, in __init__
    raise NotWikiWord, name
BittyWiki.NotWikiWord: Wiki
The BittyWiki Web Interface
The BittyWiki library provides a way to manipulate the wiki, but it has no user interface. You can write standalone scripts to manipulate the repository, or create pages from an interactive prompt, but wikis were intended to be used over the web. Another set of design decisions awaits, related to how BittyWiki should expose the wiki pages and operations as REST resources.
Resources
Because REST is based on resources, the first thing to consider when designing a web application is the nature of the resources to provide. A wiki provides only one type of resource: pages out of a flat namespace. Information in the URL path is easier to read than information in the query string, so a wiki page should be retrieved by sending a GET request to the CGI, with the page name appended to the CGI path. The resulting resource identifier looks like /bittywiki.cgi/PageName. To modify a page, a POST request should be sent to its resource identifier.
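When a CGI is requested as /bittywiki.cgi/PageName, the web server passes the part after the script name to the script in the PATH_INFO environment variable. The following sketch shows one way to turn that value into a page name; the function name page_name_from_path_info is hypothetical, and falling back to "HomePage" follows the default page chosen earlier.

```python
def page_name_from_path_info(path_info, default="HomePage"):
    """Return the page name carried in a CGI PATH_INFO value."""
    # PATH_INFO is whatever follows the script name, for example "/PageName".
    # Stripping the leading slash leaves the bare page name.
    name = path_info.lstrip("/")
    # An empty PATH_INFO (or a lone slash) means no page was named.
    return name or default


requested = page_name_from_path_info("/PageTwo")
fallback = page_name_from_path_info("")
```

In a real CGI script, the argument would come from os.environ.get('PATH_INFO', '').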
The allowable operations on a wiki page are as follows: creating one, reading one, updating one, and deleting one. These four operations are so common to different types of resource that they have their own acronym (CRUD), used to describe the many applications designed for performing those operations. A wiki is a web-based CRUD application for named pages of text kept in a flat namespace.
Most wikis either implement page delete as a special administrator command, or don’t implement it at all; this is because a page delete command makes vandalism very easy. BittyWiki’s naivete with respect to the delete command is perhaps its least realistic feature.
Request Structure
Not by coincidence, the CRUD operations correspond to the four main HTTP verbs (GET, POST, PUT, and DELETE). Recall that the same four operations show up repeatedly, whether the subject is databases, file system access, or web resources. Ideally, one CRUD operation would map to one HTTP verb.
When users request a page for reading, the only information they must provide is the page name. Therefore, for the read operation, no additional information must be tacked on to the resource identifier defined in the previous section. A simple GET to the resource identifier will suffice.
When modifying a page, it’s necessary to send not only the name of the page but its desired new contents. POSTing the data to the resource identifier should suffice to do that.
Now you run into a problem: You have two more operations (create and delete), but only one HTTP method (POST) is both suitable for those operations and also supported by the HTML forms that will make up your interface. These operations must be consolidated somehow.
It makes no sense to “create” a page that already exists or to “edit” a nonexistent page, so those two operations could be combined into a single write operation. There are still two actions (write and delete) to go through POST, so the problem remains.
The solution is to have users put a marker in their POST data to indicate which operation they want to perform, rather than just post the data they want to use in the operation. The key for this marker will be operation, and the allowable values will be write and delete.
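As a rough illustration of the marker idea, the sketch below pulls the operation key out of a URL-encoded POST body. It is written for Python 3 (the chapter's own listings target Python 2), and the name operation_from_post is hypothetical. Treating a missing marker as a write is an assumption of this sketch.

```python
from urllib.parse import parse_qs

ALLOWED_OPERATIONS = ("write", "delete")


def operation_from_post(body, default="write"):
    """Return the operation named in a URL-encoded POST body.

    Returns None if the marker holds a value that is not allowed.
    """
    fields = parse_qs(body)
    # parse_qs maps each key to a list of values; take the first one.
    operation = fields.get("operation", [default])[0]
    if operation not in ALLOWED_OPERATIONS:
        return None
    return operation


delete_request = operation_from_post("operation=delete")
write_request = operation_from_post("operation=write&data=Hello+world")
unmarked_request = operation_from_post("data=Hello+world")
bad_request = operation_from_post("operation=rename")
```

The data to be written travels in the same body under its own key, so a single POST carries both the instruction and its payload.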
But Wait — There’s More (Resources)
So far, the design assumes that the write and delete actions are triggered in response to HTML form submissions. Where are those HTML forms going to come from? Because the forms need to be dynamically generated based on the name of the page they’re modifying, they must be generated by the wiki program. This makes them a new type of resource. Contrary to what was stated earlier, BittyWiki actually serves two types of resources. Its primary job is to serve pages, but it must also serve HTML forms for manipulating those pages.
Unlike pages, forms can’t be created, updated, or deleted by the user: they can only be read. (After they’re read, however, they can be used to create, update or delete pages.) The forms should therefore be accessible through GET URLs.
Because the user will be requesting a form to write or delete a particular page, it makes sense to base the resource identifier for the form on that of the page. There are two ways of doing this. The first is to continue to append to the PATH_INFO of the identifier, so that the form to delete the page at /bittywiki.cgi/MyPage is located at /bittywiki.cgi/MyPage/delete. The other way is to use the QUERY_STRING, so that the form is located at /bittywiki.cgi/MyPage?operation=delete.
There’s no general right or wrong solution. However, because the “operation” keyword is already in use for the POST form submissions, and because the pages (not the forms) are the real point of a wiki, BittyWiki will implement the second strategy. The possible values will be the same as for the POST commands: write and delete.
To summarize: Each wiki page in BittyWiki boasts three associated resources. Each resource might behave differently in response to a GET and a POST, as shown in the following table.
Resource                                 | What GET does                                               | What POST does
-----------------------------------------+-------------------------------------------------------------+------------------------------
/bittywiki.cgi/PageName                  | Displays the page if it exists; displays create form if not | Nothing
/bittywiki.cgi/PageName?operation=write  | Displays edit form                                          | Writes page, provides status
/bittywiki.cgi/PageName?operation=delete | Displays delete form                                        | Deletes page, provides status
If no page name is specified (that is, someone GETs the bare resource /bittywiki.cgi/), the CGI will ask the core wiki code to retrieve the default page.
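The table can be read as a lookup keyed on the HTTP method and the value of the operation parameter. The sketch below, written for Python 3, encodes it that way purely as an illustration. The names BEHAVIOR and describe_request are hypothetical, and this is not how the chapter's CGI script dispatches requests.

```python
from urllib.parse import parse_qs

# (HTTP method, operation) -> what the table says should happen.
BEHAVIOR = {
    ("GET", ""): "display page, or create form if the page does not exist",
    ("GET", "write"): "display edit form",
    ("GET", "delete"): "display delete form",
    ("POST", ""): "nothing",
    ("POST", "write"): "write page, provide status",
    ("POST", "delete"): "delete page, provide status",
}


def describe_request(method, query_string=""):
    """Look up the documented behavior for a request to a page resource."""
    # An absent operation parameter is represented by the empty string.
    operation = parse_qs(query_string).get("operation", [""])[0]
    return BEHAVIOR.get((method, operation), "error: unsupported request")


plain_get = describe_request("GET")
edit_form = describe_request("GET", "operation=delete")
save = describe_request("POST", "operation=write")
```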
There are tradeoffs to consider when you’re designing your resource identifiers and weighing PATH_INFO against QUERY_STRING. Both “/foo.cgi/clients/MegaCorp” and “/foo.cgi?client=MegaCorp” are legitimate REST identifiers for the same resource. The advantage of the first one is that it looks a lot nicer, more like a “real” resource. If you want to give the appearance of hierarchy in your data structure, nothing does it as well as a PATH_INFO-based identifier scheme.
The problem is that you can’t use that scheme in conjunction with an HTML form that lets you, for example, select MegaCorp from a list of clients. The destination of an HTML form needs to be defined at the time the form is printed, so the best you can do ahead of time would be /foo.cgi/, letting the web browser tack on “?client=MegaCorp” when the user submits the form. If your application has this problem, you might consider defining two resource identifiers for each of your resources: an identifier that uses PATH_INFO, and one that uses QUERY_STRING.
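A script that accepts both identifier styles might resolve them as in the following Python 3 sketch. The function name client_from_request is hypothetical, and the "clients" path segment and "client" parameter are taken from the example identifiers above.

```python
from urllib.parse import parse_qs


def client_from_request(path_info, query_string):
    """Return the client named by either identifier style, or None."""
    # Style 1: /foo.cgi/clients/MegaCorp gives PATH_INFO "/clients/MegaCorp".
    parts = [part for part in path_info.split("/") if part]
    if len(parts) == 2 and parts[0] == "clients":
        return parts[1]
    # Style 2: /foo.cgi?client=MegaCorp gives QUERY_STRING "client=MegaCorp".
    values = parse_qs(query_string).get("client")
    if values:
        return values[0]
    return None


from_path = client_from_request("/clients/MegaCorp", "")
from_query = client_from_request("", "client=MegaCorp")
```

With both styles supported, an HTML form can submit to /foo.cgi and let the browser add the query string, while hand-written links can use the tidier path form.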
Wiki Markup
The final question is how to transform the plaintext typed in by writers into the HTML displayed to readers. Some wikis are extravagant and let writers do things like draw tables and upload images. BittyWiki will support only a few very basic types of text-to-HTML markup:
To ensure valid HTML, all pages will be placed within paragraph (<p>) tags.
Two consecutive newlines will be treated as a paragraph break.
Any HTML manually typed into a wiki page will be escaped, so that it’s displayed to the viewer instead of being interpreted by the web browser.
Because there are so few markup rules, BittyWiki pages will look a little bland, but prohibiting raw HTML will limit the capabilities of any vandals that happen along.
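The three markup rules listed above can be sketched as a short function. This is an illustration written for Python 3, not the chapter's own rendering code. The name render_text is hypothetical, and WikiWord linking is left out.

```python
import html


def render_text(text):
    """Convert wiki plaintext to HTML using the three rules above."""
    # Escape any HTML the writer typed so it is displayed, not interpreted.
    escaped = html.escape(text, quote=False)
    # Two consecutive newlines separate paragraphs.
    paragraphs = escaped.split("\n\n")
    # Every paragraph is wrapped in <p> tags.
    return "\n".join("<p>%s</p>" % paragraph for paragraph in paragraphs)


rendered = render_text("Hello <b>world</b>.\n\nSecond paragraph.")
```

Escaping has to happen before the <p> tags are added. Otherwise the tags generated by the wiki itself would be escaped too.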
With these design decisions made, it’s now possible to create the CGI web interface to BittyWiki. This code should go into bittywiki.cgi, in the same cgi-bin/ directory where you put BittyWiki.py:
#!/usr/bin/python
import cgi
import cgitb
import os
import re
from BittyWiki import Wiki, Page, NotWikiWord
cgitb.enable()

#First, some HTML templates.
MAIN_TEMPLATE = '''<html>
<head><title>%(title)s</title></head>
<body>%(body)s<hr />%(navLinks)s</body>
</html>'''

VIEW_TEMPLATE = '''%(banner)s <h1>%(name)s</h1> %(processedText)s'''

WRITE_TEMPLATE = '''%(banner)s <h1>%(title)s</h1>
<form method="POST" action="%(pageURL)s">
<input type="hidden" name="operation" value="write">
<textarea rows="15" cols="80" name="data">%(text)s</textarea><br />
<input type="submit" value="Save">
</form>'''

DELETE_TEMPLATE = '''<h1>%(title)s</h1>
<p>Are you sure %(name)s is the page you want to delete?</p>
<form method="POST" action="%(pageURL)s">
<input type="hidden" name="operation" value="delete">
<input type="submit" value="Delete %(name)s!">
</form>'''

ERROR_TEMPLATE = '<h1>Error: %(error)s</h1>'

BANNER_TEMPLATE = '<p style="color:red;">%s</p><hr />'

#A snippet for linking a WikiWord to the corresponding wiki page.
VIEW_LINK = '<a href="%s">%%(wikiword)s</a>'

#A snippet for linking a WikiWord with no corresponding page to a
#form for creating that page.
ADD_LINK = '%%(wikiword)s<a href="%s">?</a>'
Rather than print out HTML pages from inside the CGI script, it’s often useful to define HTML templates as strings ahead of time and use Python’s string interpolation to fill them with dynamic values. This helps to separate presentation and content, making it much easier to customize the HTML. Separating the HTML out from the Python code makes it possible to hand the templates over to a web designer who doesn’t know Python.
One feature of Python that deserves wider recognition is its capability to do string interpolation with a map (a dictionary) instead of a tuple. If you have a string “A %(foo)s string”, and a map containing an item keyed to foo, then interpolating the string with the map will replace “%(foo)s” with the string value of the item keyed to foo.
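A short demonstration of map-based interpolation follows. The second half shows the doubled percent sign used by snippets such as VIEW_LINK, which lets a template be filled in two stages. The sample values and the URL are made up for illustration.

```python
# Interpolating with a dictionary: each %(key)s is replaced by the
# string value stored under that key.
template = "A %(foo)s string"
values = {"foo": "short"}
result = template % values

# A doubled percent sign survives one round of interpolation as a single
# percent sign, so this template can be filled in two stages.
link = '<a href="%s">%%(wikiword)s</a>' % "/bittywiki.cgi/PageTwo"
# link still contains the placeholder %(wikiword)s at this point.
final_link = link % {"wikiword": "PageTwo"}
```

Keys in the dictionary that the template never mentions are simply ignored, which makes it convenient to pass one map to several templates.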
class WikiCGI:

    #The possible operations on a wiki page.
    VIEW = ''
    WRITE = 'write'
    DELETE = 'delete'

    def __init__(self, wikiRoot):
        self.wiki = Wiki(wikiRoot)

    def run(self):
        toDisplay = None
        try:
            #Retrieve the wiki page the user wants.
            page = os.environ.get('PATH_INFO', '')
            if page:
                page = page[1:]
            page = self.wiki.getPage(page)
        except NotWikiWord, badName:
            page = None
            error = '"%s" is not a valid wiki page name.' % badName
            toDisplay = self.makeError(error)

        if page:
            #Determine what the user wants to do with the page they
            #requested.
            makeChange = os.environ['REQUEST_METHOD'] == 'POST'
            if makeChange:
                defaultOperation = self.WRITE
            else:
                defaultOperation = ''
            form = cgi.FieldStorage()
            operation = form.getfirst('operation', defaultOperation)

            #We now know which resource the user was trying to access
            #("page" in conjunction with "operation"), and "form"
            #contains any representation they were submitting. Now we
            #delegate to the appropriate method to handle the operation
            #they requested.
            operationMethod = self.OPERATION_METHODS.get(operation)
            if not operationMethod:
                error = '"%s" is not a valid operation.' % operation
                toDisplay = self.makeError(error)
            if not page.exists() and operation and not \
               (makeChange and operation == self.WRITE):
                #It's okay to request a resource based on a page that
                #doesn't exist, but only if you're asking for the form to
                #create it, or actually trying to create it.
                toDisplay = self.makeError('No such page: "%s"' % page.name)
            if operationMethod:
                toDisplay = operationMethod(self, page, makeChange, form)

        #All the operation methods, as well as makeError, are expected
        #to return a set of values that can be used to render the HTML
        #response: the title of the page, the body template to use, a
        #map of variables to interpolate into the body template, and a
        #set of navigation links to put at the bottom of the page.
        title, bodyTemplate, bodyArgs, navLinks = toDisplay
        if page and page.name != Wiki.HOME_PAGE_NAME:
            backLink = '<a href="%s">Back to wiki homepage</a>'
            navLinks.append(backLink % self.makeURL())
        print "Content-type: text/html\n"
        print MAIN_TEMPLATE % {'title' : title,
                               'body' : bodyTemplate % bodyArgs,
                               'navLinks' : ' | '.join(navLinks)}
When the run method of the WikiCGI class is called, it finds out which resource is being requested, and what the user wants to do with that resource. It delegates to one of a number of methods (yet to be defined) that handle the various possible operations.
Each of these methods is expected to return the skeleton of a web page: the title, a template string (one of the templates defined earlier: VIEW_TEMPLATE, WRITE_TEMPLATE, etc.), a map of variables to use when interpolating that template, and a set of links to help the user navigate the wiki.
The last act of the run method is to fill out this skeleton: to interpolate the provided variable map into the page-specific template string and then to interpolate that into the overarching main template. The result, a complete HTML page, is simply printed to standard output.
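The two-step interpolation can be tried in isolation. The sketch below uses simplified stand-ins for the templates and a hypothetical helper named fill_skeleton. The sample skeleton is invented for illustration and is not output from the real CGI.

```python
# Simplified stand-ins for the chapter's templates (assumed for this sketch).
MAIN = ("<html><head><title>%(title)s</title></head>"
        "<body>%(body)s<hr />%(navLinks)s</body></html>")
VIEW = "<h1>%(name)s</h1> %(processedText)s"


def fill_skeleton(skeleton):
    """Turn (title, body template, body args, nav links) into a full page."""
    title, body_template, body_args, nav_links = skeleton
    # First interpolation: the variable map goes into the body template.
    body = body_template % body_args
    # Second interpolation: the finished body goes into the main template.
    return MAIN % {"title": title,
                   "body": body,
                   "navLinks": " | ".join(nav_links)}


skeleton = ("PageTwo", VIEW,
            {"name": "PageTwo", "processedText": "<p>Here's page 2.</p>"},
            ['<a href="/bittywiki.cgi/">Back to wiki homepage</a>'])
page_html = fill_skeleton(skeleton)
```

Because the finished body is passed to the main template as a value, any percent signs it contains are not interpolated a second time.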
The next part of the CGI defines the three operation-specific methods, which take a page and (possibly) a resource representation stored in form data, make any appropriate changes, and return the raw materials for a document:
    def viewOperation(self, page, makeChange, form=None, banner=None):
        """Renders a page as HTML, either as the result of a request for
        it as a resource, or as a side effect of some other operation."""
        if banner:
            banner = BANNER_TEMPLATE % banner
        else:
            banner = ''
        if not page.exists():
            title = 'Creating %s' % page.name
            toDisplay = (title, WRITE_TEMPLATE,
                         {'title' : title,
                          'banner' : banner,
                          'pageURL' : self.makeURL(page),
                          'text' : ''},
                         [])
        else:
            writeLink = '<a href="%s">Edit this page</a>' \
                        % self.makeURL(page, self.WRITE)
            deleteLink = '<a href="%s">Delete this page</a>' \
                         % self.makeURL(page, self.DELETE)
            toDisplay = (page.name, VIEW_TEMPLATE,
                         {'name' : page.name,
                          'banner' : banner,
                          'processedText' : self.renderPage(page)},
                         [writeLink, deleteLink])
        return toDisplay

    def writeOperation(self, page, makeChange, form):
        "Saves a page, or displays its create or edit form."
        if makeChange:
            data = form.getfirst('data')
            page.text = data
            page.save()
            #The operation is done, but we still need a document to
            #return to the user. Display the new version of this page,
            #with a banner.
            toDisplay = self.viewOperation(page, 0, banner='Page saved.')
        else:
            navLinks = []
            pageURL = self.makeURL(page)
            if page.exists():
                title = 'Editing ' + page.name
                navLinks.append('<a href="%s">Back to %s</a>'
                                % (pageURL, page.name))
            else:
                title = 'Creating ' + page.name
            toDisplay = (title, WRITE_TEMPLATE,
                         {'title' : title,
                          'banner' : '',
                          'pageURL' : pageURL,
                          'text' : page.getText()},
                         navLinks)
        return toDisplay

    def deleteOperation(self, page, makeChange, form=None):
        "Deletes a page, or displays its delete form."
        if makeChange:
            page.delete()
            banner = 'Page "%s" deleted.' % page.name
            #The page is deleted, but we still need a document to
            #return to the user. Display the wiki homepage, with a banner.
            toDisplay = self.viewOperation(self.wiki.getPage(), 0,
                                           banner=banner)
        else:
            if page.exists():
                title = 'Deleting ' + page.name
                pageURL = self.makeURL(page)
                backLink = '<a href="%s">Back to %s</a>'
                toDisplay = (title, DELETE_TEMPLATE,
                             {'title' : title,
                              'name' : page.name,
                              'pageURL' : pageURL},
                             [backLink % (pageURL, page.name)])
            else:
                error = "You can't delete a page that doesn't exist."