
Beginning Python (2005)
Network Programming
Subject: Your picture
From: Me <me@example.com>
To: You <you@example.com>
Here’s that picture I took of you.
>>> msg.addAttachment(open("photo.jpg", "rb").read(), "photo.jpg")
>>> print str(msg)
Content-Type: multipart/mixed; boundary="===============1077328303=="
MIME-Version: 1.0
Subject: Your picture
From: Me <me@example.com>
To: You <you@example.com>

--===============1077328303==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

Here's that picture I took of you.
--===============1077328303==
Content-Type: image/jpeg
MIME-Version: 1.0
Content-Transfer-Encoding: base64

/9j/4AAQSkZJRgABAQEASABIAAD//gAXQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q/9sAQwAIBgYHBgUI
...
[Once again, much base64 text omitted.]
...
3f7kklh4dg+UTZ1TsAAv1F69UklmZ9hrzogZibOqSSA8gZySSSJI/9k=
--===============1077328303==--
How It Works
SmartMessage wraps the classes in Python’s email module. When the SmartMessage object is first created, it keeps its internal representation in a Message object. This message has a simple string representation.
When a file is attached to the SmartMessage, though, a Message object won’t do the job anymore. Message objects know only about RFC 2822, nothing about the MIME extensions. At this point, SmartMessage transparently swaps out the Message object for a MIMEMultipart object with the same headers and payload.
This transparent swap avoids forcing the user to decide ahead of time whether or not a message should be MIME encoded. It also avoids a lowest-common-denominator strategy of MIME-encoding each and every message, which is a wasteful operation for messages that are just one text part.
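The swap described above can be sketched in a few lines. This is only an illustration written for Python 3 (the listings in this chapter are Python 2), and SwappingMessage and add_attachment are hypothetical names, not the SmartMessage class itself. The attachment is stored as a generic application/octet-stream part, which is an assumption made to keep the sketch short.

```python
from email import encoders
from email.message import Message
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


class SwappingMessage:
    """Hypothetical stand-in for SmartMessage; not the book's class."""

    def __init__(self, from_addr, to_addr, subject, body):
        # Start with a plain RFC 2822 message: headers plus a text payload.
        self.msg = Message()
        self.msg["From"] = from_addr
        self.msg["To"] = to_addr
        self.msg["Subject"] = subject
        self.msg.set_payload(body)

    def add_attachment(self, data, filename):
        if not self.msg.is_multipart():
            # Swap the plain Message for a multipart container that carries
            # the same headers, with the old body as its first part.
            multipart = MIMEMultipart()
            for name, value in self.msg.items():
                multipart[name] = value
            multipart.attach(MIMEText(self.msg.get_payload()))
            self.msg = multipart
        # Assumption: treat every attachment as opaque binary data.
        part = MIMEBase("application", "octet-stream")
        part.set_payload(data)
        encoders.encode_base64(part)
        part.add_header("Content-Disposition", "attachment", filename=filename)
        self.msg.attach(part)

    def __str__(self):
        return self.msg.as_string()
```

A message built this way stays a single text part until add_attachment is first called, so simple messages never pay the cost of MIME encoding.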
Sending Mail with SMTP and smtplib
Now that you know how to construct e-mail messages, it’s appropriate to revisit in a little more detail the protocol used to send them. This is SMTP, another TCP/IP-based protocol, defined in RFC 2821.

Chapter 16
Let’s look at the original example one more time:
>>> fromAddress = 'sender@example.com'
>>> toAddress = [your email address]
>>> msg = "Subject: Hello\n\nThis is the body of the message."
>>> import smtplib
>>> server = smtplib.SMTP("localhost", 25)
>>> server.sendmail(fromAddress, toAddress, msg)
{}
You connect to an SMTP server (at port 25 on localhost) and send a string message from one address to another. Of course, the location of the SMTP server shouldn’t be hard-coded, and because some servers require authentication, it would be nice to be able to accept authentication information when creating the SMTP object. Here’s a class that works with the SmartMessage class defined in the previous section to make it easier to send mail. Because the two classes go together, add this class to SendMail.py, the file that also contains the SmartMessage class:
from smtplib import SMTP

class MailServer(SMTP):
    "A more user-friendly interface to the default SMTP class."

    def __init__(self, server, serverUser=None, serverPassword=None, port=25):
        "Connect to the given SMTP server."
        SMTP.__init__(self, server, port)
        self.user = serverUser
        self.password = serverPassword
        #Uncomment this line to see the SMTP exchange in detail.
        #self.set_debuglevel(True)

    def sendMessage(self, message):
        "Sends the given message through the SMTP server."
        #Some SMTP servers require authentication.
        if self.user:
            self.login(self.user, self.password)

        #The message contains a list of destination addresses that
        #might have names associated with them. For instance,
        #"J. Random Hacker <jhacker@example.com>". Some mail servers
        #will only accept bare email addresses, so we need to create a
        #version of this list that doesn't have any names associated
        #with it.
        destinations = message.to
        if hasattr(destinations, '__iter__'):
            destinations = map(self._cleanAddress, destinations)
        else:
            destinations = self._cleanAddress(destinations)
        self.sendmail(message['From'], destinations, str(message))

    def _cleanAddress(self, address):
        "Transforms 'Name <email@domain>' into 'email@domain'."
        parts = address.split('<', 1)
        if len(parts) > 1:
            #This address is actually a real name plus an address:
            newAddress = parts[1]
            endAddress = newAddress.find('>')
            if endAddress != -1:
                address = newAddress[:endAddress]
        return address
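As an aside, the standard library can do the same address cleaning. The following is a minimal sketch written for Python 3 (unlike the Python 2 listing above), using email.utils.parseaddr in place of the hand-written split; clean_address is a hypothetical helper name, not part of SendMail.py.

```python
from email.utils import parseaddr


def clean_address(address):
    """Return only the mailbox part of 'Name <email@domain>'."""
    _name, mailbox = parseaddr(address)
    # parseaddr returns ('', '') when it cannot parse its input; in that
    # case fall back to the original string, as _cleanAddress does.
    return mailbox or address


bare = clean_address("J. Random Hacker <jhacker@example.com>")
```

Here bare holds the address without the real name, and an address that is already bare passes through unchanged.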
Try It Out
Sending Mail with MailServer
This chapter’s initial example constructed a message as a string and sent it through smtplib. With the SmartMessage and MailServer classes, you can send a much more complex message, using simpler Python code:
>>> from SendMail import SmartMessage, MailServer
>>> msg = SmartMessage("Me <me@example.com>",
...                    "You <you@example.com>", "Your picture",
...                    "Here's that picture I took of you.")
>>> msg.addAttachment(open("photo.jpg", "rb").read(), "photo.jpg")
>>> MailServer("localhost").sendMessage(msg)
>>>
Run this code (substituting the appropriate e-mail addresses and server hostname), and you’ll be able to send mail with MIME attachments to anyone.
How It Works
SmartMessage wraps the classes in Python’s email module. As before, the underlying representation starts out as a simple Message object but becomes a MIMEMultipart object once photo.jpg is attached.
This time, the message is actually sent through an SMTP server. The MailServer class hides the fact that smtplib expects you to specify the “To” and “From” addresses twice: once in the call to the sendmail method and again in the headers of the mail message. It also takes care of sanitizing the destination addresses, putting them into a form that all SMTP servers can deal with. Between the two wrapper classes, you can send complex e-mail messages from a Python script almost as easily as from a mail client.
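To make the “twice” point concrete, here is a small sketch, written for Python 3, that derives the arguments for sendmail from the headers already present in a message. The helper name envelope_for is hypothetical, and including the Cc header among the recipients is an assumption that goes beyond what MailServer does.

```python
from email.message import Message
from email.utils import getaddresses, parseaddr


def envelope_for(message):
    """Return (sender, recipients) suitable for passing to SMTP.sendmail()."""
    sender = parseaddr(message["From"])[1]
    # Assumption: both To and Cc addressees should receive the message.
    header_values = message.get_all("To", []) + message.get_all("Cc", [])
    recipients = [addr for _name, addr in getaddresses(header_values) if addr]
    return sender, recipients


msg = Message()
msg["From"] = "Me <me@example.com>"
msg["To"] = "You <you@example.com>, Other <other@example.com>"
msg["Subject"] = "Your picture"
msg.set_payload("Here's that picture I took of you.")

sender, recipients = envelope_for(msg)
# With a reachable server, one could then call, for example:
#   smtplib.SMTP("localhost").sendmail(sender, recipients, msg.as_string())
```

The addresses in the headers are what the recipient’s mail client displays; the addresses passed to sendmail are what the SMTP server actually uses for delivery.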
Retrieving Internet E-mail
Now that you’ve seen how to send mail, it’s time to go all the way toward fulfilling Jamie Zawinski’s prophecy and expand your programs so that they can read mail. There are three main ways to do this, and the choice is probably not up to you. How you retrieve mail depends on your relationship with the organization that provides your Internet access.
Parsing a Local Mail Spool with mailbox
If you have a Unix shell account on your mail server (because, for instance, you run a mail server on your own computer), mail for you is appended to a file (probably /var/spool/mail/[your username]) as it comes in. If this is how your mail setup works, your existing mail client is probably set up to parse that file. It may also be set up to move messages out of the spool file and into your home directory as they come in.
The incoming mailbox in /var/spool/mail/ is kept in a particular format called “mbox format”. You can parse these files (as well as mailboxes in other formats such as MH or Maildir) by using the classes in the mailbox module.
Here’s a simple script, MailboxSubjectLister.py, that iterates over the messages in a mailbox file, printing out the subject of each one:
#!/usr/bin/python
import email
import mailbox
import sys

if len(sys.argv) < 2:
    print 'Usage: %s [path to mailbox file]' % sys.argv[0]
    sys.exit(1)

path = sys.argv[1]
fp = open(path, 'rb')
subjects = []
for message in mailbox.PortableUnixMailbox(fp, email.message_from_file):
    subjects.append(message['Subject'])

print '%s message(s) in mailbox "%s":' % (len(subjects), path)
for subject in subjects:
    print '', subject
UnixMailbox (and the other Mailbox classes in the mailbox module) takes two constructor arguments: a file object (the mailbox file) and a function that reads the next message from that file-like object. In this case, the function is the email module’s message_from_file. The output of this useful function is a Message object, or one of its MIME* subclasses, such as MIMEMultipart. This and the email.message_from_string function are the most common ways of creating Python representations of messages you receive.
You can work on these Message objects just as you could with the Message objects created from scratch in earlier examples, where the point was to send e-mail messages. Python uses the same classes to represent incoming and outgoing messages.
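For readers on a newer interpreter, the same idea might look like the sketch below. It is written for Python 3, where the mailbox module provides an mbox class that is opened by path rather than by file object. The list_subjects helper and the throwaway demo mailbox are illustrative assumptions, not part of MailboxSubjectLister.py.

```python
import mailbox
import os
import tempfile
from email.message import Message


def list_subjects(path):
    """Return the Subject header of every message in an mbox file."""
    box = mailbox.mbox(path, create=False)
    try:
        return [message["Subject"] for message in box]
    finally:
        box.close()


# Build a small throwaway mbox so the sketch runs without a real spool file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.mbox")
demo_box = mailbox.mbox(demo_path)
for number in (1, 2, 3):
    message = Message()
    message["From"] = "me@example.com"
    message["Subject"] = "This is a test message #%d" % number
    message.set_payload("Body of message %d.\n" % number)
    demo_box.add(message)
demo_box.close()

subjects = list_subjects(demo_path)
```

Each item yielded by the mailbox is still a Message-like object, so header access with message["Subject"] works the same way as in the script above.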
Try It Out
Printing a Summary of Your Mailbox
If you have a Unix account on your e-mail server, you can run the mailbox subject lister against your mail spool file, and get a list of subjects. If you don’t have a Unix account on your e-mail server, or if you use a web-based mail service, you won’t be able to get your mail this way:
$ python MailboxSubjectLister.py /var/spool/mail/leonardr
4 message(s) in mailbox "/var/spool/mail/leonardr":
 DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA
 This is a test message #1
 This is a test message #2
 This is a test message #3
The first message isn’t a real message; it’s a dummy message sometimes created when you use a mail client to read your spool file. If your application works on spool files that are sometimes accessed through other means, you’ll need to recognize and deal with that kind of message.
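One simple way to deal with that dummy message is to filter it out by its subject line. The sketch below is written for Python 3 and assumes the placeholder always carries exactly the subject shown in the output above; real_subjects is a hypothetical helper name.

```python
# Assumption: the placeholder is identified by this exact subject text.
PLACEHOLDER_SUBJECT = "DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA"


def real_subjects(subjects):
    """Drop the mail client's internal placeholder from a list of subjects."""
    # A message with no Subject header yields None; keep it in the result.
    return [s for s in subjects if (s or "").strip() != PLACEHOLDER_SUBJECT]


listing = real_subjects([
    "DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA",
    "This is a test message #1",
    "This is a test message #2",
])
```

Matching on the subject alone is a heuristic; a stricter check might also look at other headers of the dummy message.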
Fetching Mail from a POP3 Server with poplib
Parsing a local mail spool didn’t require going over the network, because you ran the script on the same machine that had the mail spool. There was no need to involve a network protocol, only a file format (the format of Unix mailboxes, derived mainly from RFC 2822).
However, most people don’t have a Unix shell account on their mail server (or if they do, they want to read mail on their own machine instead of on the server). To fetch mail from your mail server, you need to go over a network, which means you must use a protocol. There are two popular protocols for doing this. The first, which was once near-universal though now waning in popularity, is POP3, the third revision of the Post Office Protocol.
POP3 is defined in RFC 1939, but as with most popular Internet protocols, you don’t need to delve very deeply into the details, because Python includes a module that wraps the protocol in a Python interface.
Here’s POP3SubjectLister, a POP3-based implementation of the same idea as the mailbox parser script. This script prints the subject line of each message on the server:
#!/usr/bin/python
from poplib import POP3
import email.Parser    #Makes sure the email.Parser submodule is loaded.

class SubjectLister(POP3):
    """Connect to a POP3 mailbox and list the subject of every message
    in the mailbox."""

    def __init__(self, server, username, password):
        "Connect to the POP3 server."
        POP3.__init__(self, server, 110)
        #Uncomment this line to see the details of the POP3 protocol.
        #self.set_debuglevel(2)
        self.user(username)
        response = self.pass_(password)
        if response[:3] != '+OK':
            #There was a problem connecting to the server.
            raise Exception(response)

    def summarize(self):
        "Retrieve each message, parse it, and print the subject."
        numMessages = self.stat()[0]
        print '%d message(s) in this mailbox.' % numMessages
        parser = email.Parser.Parser()
        for messageNum in range(1, numMessages+1):
            #retr fetches the whole message as a list of lines.
            messageString = '\n'.join(self.retr(messageNum)[1])
            message = parser.parsestr(messageString)
            #Passing in True to parser.parsestr() will only parse the headers
            #of the message, not the body. Since all we care about is the
            #headers, this will save some time. However, this is only
            #supported in Python 2.2.2 and up.
            #message = parser.parsestr(messageString, True)
            print '', message['Subject']
After the data is on this side of the network, there’s no fundamental difference between the way it’s handled with this script and the one based on the UnixMailbox class. As with the UnixMailbox script, we use the email module to parse each message into a Python data structure (although here, we use the Parser class, defined in the email.Parser module, instead of the message_from_file convenience function).
The downside of using POP3 for this purpose is that the POP3.retr method has side effects. When you call retr on a message on the server, the server marks that message as having been read. If you use a mail client or a program like fetchmail to retrieve new mail from the POP3 server, then running this script might confuse the other program. The message will still be on the server, but your client might not download it if it thinks the message has already been read.
POP3 also defines a command called top, which doesn’t mark a message as having been read and which retrieves only the headers of a message. Both of these features are ideal for the purposes of this script: we’ll save bandwidth (not having to retrieve the whole message just to get the subject) and the script won’t interfere with the operation of other programs that use the same POP3 mailbox. Unfortunately, not all POP3 servers implement the top command correctly. Because it’s so useful when implemented correctly, though, here’s a subclass of the SubjectLister class that uses the top command to get message headers instead of retrieving the whole message. If you know your server supports top correctly, this is a better implementation:
class TopBasedSubjectLister(SubjectLister):

    def summarize(self):
        """Retrieve the first part of the message and find the
        'Subject:' header."""
        numMessages = self.stat()[0]
        print '%d message(s) in this mailbox.' % numMessages
        for messageNum in range(1, numMessages+1):
            #Just get the headers of each message. Scan the headers
            #looking for the subject.
            for header in self.top(messageNum, 0)[1]:
                if header.find('Subject:') == 0:
                    print header[len('Subject:'):]
                    break
Both SubjectLister and TopBasedSubjectLister will yield the same output, but you’ll find that TopBasedSubjectLister runs a lot faster (assuming your POP3 server implements top correctly).
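The string scan in TopBasedSubjectLister matches only a header spelled exactly “Subject:” and sees only the first line of a subject that has been folded across several lines. As a hedged alternative, the lines returned by top can be handed to the email parser instead. The sketch below is written for Python 3, where poplib returns each line as a bytes object; subject_from_top and the sample lines are illustrative, not taken from a real server.

```python
import email
from email import policy


def subject_from_top(lines):
    """Return the Subject from header lines as returned by POP3.top()."""
    raw = b"\r\n".join(lines) + b"\r\n"
    # policy.default unfolds continuation lines when a header is read.
    message = email.message_from_bytes(raw, policy=policy.default)
    value = message["Subject"]
    return None if value is None else str(value)


# Hand-written sample: lower-case header name and a folded subject line.
sample_lines = [
    b"From: Me <me@example.com>",
    b"subject: This is a test",
    b" message #1",
    b"To: You <you@example.com>",
    b"",
]
subject = subject_from_top(sample_lines)
```

This costs a little parsing time per message but still transfers only the headers, so the bandwidth saving of top is kept.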
Finally, we’ll create a simple command-line interface to the POP3-based SubjectLister class, just as we did for the MailboxSubjectLister.py. This time, however, you need to provide a POP3 server and credentials on the command line, instead of the path to a file on disk:
if __name__ == '__main__':
    import sys
    if len(sys.argv) < 4:
        print 'Usage: %s [POP3 hostname] [POP3 user] [POP3 password]' % sys.argv[0]
        sys.exit(0)
    lister = TopBasedSubjectLister(sys.argv[1], sys.argv[2], sys.argv[3])
    lister.summarize()
Try It Out
Printing a Summary of Your POP3 Mailbox
Run POP3SubjectLister.py with the credentials for a POP server, and you’ll get a list of subjects:
$ python POP3SubjectLister.py pop.example.com [username] [password]
3 message(s) in this mailbox.
 This is a test message #1
 This is a test message #2
 This is a test message #3
When you go through the POP3 server, you won’t get the dummy message you might get when parsing a raw Unix mailbox file, as shown previously. Mail servers know that that message isn’t really a message; the Unix mailbox parser treats it as one.
How It Works
The SubjectLister object (or its TopBasedSubjectLister subclass) connects to the POP3 server and sends a “stat” command to get the number of messages in the mailbox. A call to stat returns a tuple containing the number of messages in the mailbox, and the total size of the mailbox in bytes. The lister then iterates up to this number, retrieving every message (or just the headers of every message) as it goes.
If SubjectLister is in use, the message is parsed with the email module’s Parser utility class, and the Subject header is extracted from the resulting Message or MIMEMultipart object. If TopBasedSubjectLister is in use, no parsing is done: The headers are retrieved from the server as a list and scanned for a “Subject” header.
Fetching Mail from an IMAP Server with imaplib
The other protocol for accessing a mailbox on a remote server is IMAP, the Internet Message Access Protocol. The most recent revision of IMAP is defined in RFC 3501, and it has significantly more features than POP3. It’s also gaining in popularity over POP3.
The main difference between POP3 and IMAP is that POP3 is designed to act like a mailbox: It just holds your mail for a while until you collect it. IMAP is designed to keep your mail permanently stored on the server. Among other things, you can create folders on the server, sort mail into them, and search them.
These are more complex features that are typically associated with end-user mail clients. With IMAP, a mail client only needs to expose these features of IMAP; it doesn’t need to implement them on its own.
Keeping your mail on the server makes it easier to keep the same mail setup while moving from computer to computer. Of course, you can still download mail to your computer and then delete it from the server, as with POP3.
Here’s IMAPSubjectLister.py, an IMAP version of the script we’ve already written twice, which prints out the subject lines of all mail on the server. IMAP has more features than POP3, so this script exercises proportionately fewer of them. However, even for the same functionality, it’s a great improvement over the POP3 version of the script. IMAP saves bandwidth by retrieving the message subjects and nothing else: a single subject header per message. Even when POP3’s top command is implemented correctly, it can’t do better than fetching all of the headers as a group.
What’s the catch? As the imaplib module says of itself, “to use this module, you must read the RFCs pertaining to the IMAP4 protocol.” The imaplib module provides a function corresponding to each of the IMAP commands, but it doesn’t do many transformations between the Python data structures you’re used to creating and the formatted strings used by the IMAP protocol. You’ll need to keep a copy of RFC 3501 on hand or you won’t know what to pass into the imaplib methods.
For instance, to pass a list of message IDs into imaplib, you need to pass in a string like “1,2,3”, not the Python list [1, 2, 3]. To make sure only the subject is pulled from the server, IMAPSubjectLister.py passes the string “(BODY[HEADER.FIELDS (SUBJECT)])” as an argument to an imaplib method. The result of that command is a nested list of formatted strings, only some of which are actually useful to the script.
This is not exactly the kind of intuitiveness one comes to expect from Python. imaplib is certainly useful, but it doesn’t do a very good job of hiding the details of IMAP from the programmer:
#!/usr/bin/python
from imaplib import IMAP4

class SubjectLister(IMAP4):
    """Connect to an IMAP4 mailbox and list the subject of every message
    in the mailbox."""

    def __init__(self, server, username, password):
        "Connect to the IMAP server."
        IMAP4.__init__(self, server)
        #Uncomment this line to see the details of the IMAP4 protocol.
        #self.debug = 4
        self.login(username, password)

    def summarize(self, mailbox='Inbox'):
        "Retrieve the subject of each message in the given mailbox."
        #The SELECT command makes the given mailbox the 'current' one,
        #and returns the number of messages in that mailbox. Each message
        #is accessible via its message number. If there are 10 messages
        #in the mailbox, the messages are numbered from 1 to 10.
        numberOfMessages = int(self._result(self.select(mailbox)))
        print '%s message(s) in mailbox "%s":' % (numberOfMessages, mailbox)

        #The FETCH command takes a comma-separated list of message
        #numbers, and a string designating what parts of the
        #message you want. In this case, we want only the
        #'Subject' header of the message, so we'll use an argument
        #string of '(BODY[HEADER.FIELDS (SUBJECT)])'.
        #
        #See section 6.4.5 of RFC3501 for more information on the
        #format of the string used to designate which part of the
        #message you want. To get the entire message, in a form
        #acceptable to the email parser, ask for '(RFC822)'.
        subjects = self._result(self.fetch('1:%d' % numberOfMessages,
                                           '(BODY[HEADER.FIELDS (SUBJECT)])'))
        for subject in subjects:
            #Only the tuple entries carry header data; skip the rest.
            if hasattr(subject, '__iter__'):
                subject = subject[1]
                print '', subject[:subject.find('\n')]

    def _result(self, result):
        """Every method of imaplib returns a list containing a status
        code and a set of the actual result data. This convenience
        method throws an exception if the status code is other than
        "OK", and returns the result data if everything went all
        right."""
        status, result = result
        if status != 'OK':
            raise Exception(status, result)
        if len(result) == 1:
            result = result[0]
        return result

if __name__ == '__main__':
    import sys
    if len(sys.argv) < 4:
        print 'Usage: %s [IMAP hostname] [IMAP user] [IMAP password]' % sys.argv[0]
        sys.exit(0)
    lister = SubjectLister(sys.argv[1], sys.argv[2], sys.argv[3])
    lister.summarize()
Try It Out
Printing a Summary of Your IMAP Mailbox
Just execute IMAPSubjectLister.py with your IMAP credentials (just as with POP3SubjectLister), and you’ll get a summary similar to the two shown earlier in this chapter:
$ python IMAPSubjectLister.py imap.example.com [username] [password]
3 message(s) in mailbox "Inbox":
 This is a test message #1
 This is a test message #2
 This is a test message #3
How It Works
As with the POP3 example, the first thing to do is connect to the server. POP3 servers provide only one mailbox per user, but IMAP allows one user any number of mailboxes, so the next step is to select a mailbox.
The default mailbox is called “Inbox”, and selecting a mailbox yields the number of messages in that mailbox (some POP3 servers, but not all, return the number of messages in the mailbox when you connect to the server).
Unlike POP3, IMAP lets you retrieve more than one message at once. It also gives you a lot of flexibility in defining which parts of a message you want. The IMAP-based SubjectLister makes just one IMAP call to retrieve the subjects (and only the subjects) of every message in the mailbox. Then it’s just a matter of iterating over the list and printing out each subject. The real trick is knowing what arguments to pass into imaplib and how to interpret the results.
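To show what “interpreting the results” involves, here is a sketch written for Python 3, where imaplib returns bytes rather than strings. The function name subjects_from_fetch is hypothetical, and sample_data is a hand-written illustration of the shape of a fetch result, not output captured from a real server.

```python
def subjects_from_fetch(data):
    """Extract subject text from the data list returned by IMAP4.fetch()."""
    subjects = []
    for item in data:
        # Useful entries are (envelope, payload) tuples; the bare b')'
        # entries between them only close each response and are skipped.
        if not isinstance(item, tuple):
            continue
        header = item[1].decode("ascii", "replace").strip()
        if header.lower().startswith("subject:"):
            header = header[len("subject:"):].strip()
        subjects.append(header)
    return subjects


# Illustrative data only: two messages, each followed by a closing b')'.
sample_data = [
    (b"1 (BODY[HEADER.FIELDS (SUBJECT)] {38}",
     b"Subject: This is a test message #1\r\n\r\n"),
    b")",
    (b"2 (BODY[HEADER.FIELDS (SUBJECT)] {38}",
     b"Subject: This is a test message #2\r\n\r\n"),
    b")",
]
subjects = subjects_from_fetch(sample_data)
```

Separating the parsing from the network call in this way also makes it possible to test the parsing without an IMAP server.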
IMAP’s Unique Message IDs
Complaints about imaplib’s user-friendliness aside, you might have problems writing IMAP scripts if you assume that the message numbers don’t change over time. If another IMAP client deletes messages from a mailbox while this script is running against it (suppose you have your mail client running, and you use it to delete some spam while this script is running), the message numbers will be out of sync from that point on.
The IMAP-based SubjectLister class minimizes this risk by getting the subject of every message in one operation, immediately after selecting the mailbox:
self.fetch('1:%d' % numberOfMessages, '(BODY[HEADER.FIELDS (SUBJECT)])')
If there are 10 messages in the inbox, the first argument to fetch will be “1:10”. This is a slice of the mailbox, similar to a slice of a Python list, which returns all of the messages: message 1 through message 10 (IMAP and POP3 messages are numbered starting from 1).
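Because imaplib expects these ranges as formatted strings, a small helper can build them from ordinary Python numbers. The sketch below is written for Python 3; message_set is a hypothetical helper, not something imaplib provides.

```python
def message_set(numbers):
    """Format message numbers as an IMAP sequence set such as '1:3,5'."""
    ordered = sorted(set(numbers))
    if not ordered:
        raise ValueError("an IMAP sequence set cannot be empty")
    ranges = []
    start = previous = ordered[0]
    for number in ordered[1:]:
        if number != previous + 1:
            # A gap ends the current run of consecutive numbers.
            ranges.append((start, previous))
            start = number
        previous = number
    ranges.append((start, previous))
    return ",".join(
        str(lo) if lo == hi else "%d:%d" % (lo, hi) for lo, hi in ranges
    )


whole_inbox = message_set(range(1, 11))
```

For a ten-message inbox, whole_inbox is the same “1:10” string that SubjectLister passes to fetch.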
Getting the data you need as soon as you connect to the server minimizes the risk that you’ll pass a no-longer-valid message number on to the server, but you can’t always do that. You may write a script that deletes a mailbox’s messages, or that files them in a second mailbox. After you change a mailbox, you may not be able to trust the message numbers you originally got.
Try It Out
Fetching a Message by Unique ID
To help you avoid this problem, IMAP keeps a unique ID (UID) for every message under its control. You can fetch the unique IDs from the server and use them in subsequent calls using imaplib’s uid method. Unfortunately, this brings you even closer to the details of the IMAP protocol. The IMAP4 class defines a separate method for each IMAP command (e.g. IMAP4.fetch, IMAP4.search, etc.), but when you’re dealing with unique IDs, you can’t use those methods. You can use only the IMAP4.uid method, and you must pass the IMAP command you want as the first argument. For instance, instead of calling IMAP4.fetch([arguments]), you must call IMAP4.uid('FETCH', [arguments]).
>>> import imaplib
>>> import email
>>> imap = imaplib.IMAP4('imap.example.com')
>>> imap.login('[username]', '[password]')
('OK', ['Logged in.'])
>>> imap.select('Inbox')[1][0]
'3'
>>>
>>> #Get the unique IDs for the messages in this folder.
... uids = imap.uid('SEARCH', 'ALL')
>>> print uids
('OK', ['49532 49541 49563'])
>>>
>>> #Get the first message.
... uids = uids[1][0].split(' ')
>>> messageText = imap.uid('FETCH', uids[0], "(RFC822)")[1][0][1]
>>> message = email.message_from_string(messageText)
>>> print message['Subject']
This is a test message #1
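The session above unpacks the search response by hand with [1][0].split(' '). A slightly more defensive version might look like the following sketch, written for Python 3, where the data arrives as bytes. The helper name uids_from_search is hypothetical, and the sample response simply reuses the UIDs shown in the session.

```python
def uids_from_search(response):
    """Turn the (status, data) pair from imap.uid('SEARCH', ...) into UIDs."""
    status, data = response
    if status != "OK":
        raise RuntimeError("UID SEARCH failed: %r" % (data,))
    # An empty mailbox yields an empty (or missing) first element.
    if not data or not data[0]:
        return []
    first = data[0]
    if isinstance(first, bytes):
        first = first.decode("ascii")
    return first.split()


uids = uids_from_search(("OK", [b"49532 49541 49563"]))
```

Each UID in the resulting list can then be passed as the second argument to imap.uid('FETCH', ...), as in the session above.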