
Beginning Python (2005)
Writing Shareware and Commercial Programs
gets there. Most users prefer having an executable that handles the heavy lifting like installation, verification that the install works, and that all system dependencies — things that are needed to work — are present. Although Immunity hasn’t chosen to go this route, it might be something to consider if your customer base is not as technical as the security community tends to be.
Having a redistributable Python program means more than just having portable Python code: It means being able to deliver that code and the software stack it relies on portably. Ideally, this functionality would be built into the platform itself with nice GUI installers, but most organizations use either py2exe or cx_freeze, which are much better. However, rest assured that distributing your Python program as one large binary is possible, as long as you’re willing to pay the size penalty.
www.python.org/moin/Freeze — The original Python Freeze.
http://starship.python.net/crew/atuining/cx_Freeze/ cx_freeze — This does not require a compiler to modify your source Python distribution.
http://starship.python.net/crew/theller/py2exe/ — The more Windows-centric py2exe has been used by people looking for a more complex setup.
The option of the future for some applications may well be Live-CDs with built-in Linux and Python distributions. With these, you could even control the operating system and save clients the costs of OS licensing, patching, and maintenance.
As virtual systems become more powerful (QEMU comes to mind), and Cooperative Linux (www.colinux.org/) makes running a full Linux distribution under Windows more mainstream, you’ll find it potentially ideal to distribute your product as an ISO or virtual machine package.
Of course, if you have a very technical customer base and you’re technical too, tarballs with INSTALL.TXT files work well enough to get you started, though most customers will demand more later.
Essential Libraries
Occasionally, you’ll want to import modules into your commercial program as if they were part of the base Python distribution. The following sections describe modules that Immunity has found invaluable. As your project matures, you’ll no doubt have a list of external, free modules you just can’t do without.
Timeoutsocket
Timeoutsocket is the first indispensable module. In addition to including some neat functionality for wrapping the socket module, it adds timeouts to nearly all socket operations. Using timeoutsocket is as simple as adding an import timeoutsocket and then calling mysocket.set_timeout(4) on any newly created TCP sockets. This even affects sockets used from within libraries that know nothing about timeoutsocket. When a socket operation times out (which is something they may not do by default, but that you always want them to do, and preferably in a way that you control), it will throw an exception, which you can catch. Of course, mysocket.set_timeout(None) will emulate the standard behavior and never time out.
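As a rough illustration of the same idea without the third-party module, the standard socket module has its own settimeout method. The sketch below is not timeoutsocket itself: it assumes a platform where socket.socketpair() is available, and it uses a connected pair of local sockets so that nothing touches the network. The 0.2-second timeout is an arbitrary value chosen for the demonstration.

```python
import socket

# socketpair() returns two connected sockets without using the network.
left, right = socket.socketpair()
left.settimeout(0.2)  # seconds; settimeout(None) restores blocking behavior

try:
    left.recv(1024)   # nothing was sent, so this waits and then times out
    timed_out = False
except socket.timeout:
    timed_out = True  # the exception you would catch in real code
finally:
    left.close()
    right.close()

print("timed out: %s" % timed_out)
```

Catching the timeout exception close to the call that can block keeps the recovery logic (retry, report, or give up) next to the operation it protects.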

Chapter 18
This fragment from timeoutsocket.py demonstrates how you can do something similar with your code:
# From timeoutsocket.py
# Silently replace the standard socket module
import sys
if sys.modules["socket"].__name__ != __name__:
    me = sys.modules[__name__]
    # Keep the real socket module reachable under a private name
    sys.modules["_timeoutsocket"] = sys.modules["socket"]
    sys.modules["socket"] = me
    # Repoint modules that have already imported socket
    for mod in sys.modules.values():
        if hasattr(mod, "socket") and type(mod.socket) == type(me):
            mod.socket = me
Being able to call s.set_timeout(5) has prevented quite a few painful sections of code inside of CANVAS. Again, if socket operations and network protocols are something on which your product relies, consider taking a good, strong look at the Twisted Python architecture, an entire framework that gives you a way of approaching complex network application designs. It can be found at http://twistedmatrix.com/projects/twisted/.
PyGTK
This module is cross-platform, free, and of an extremely high quality. As mentioned in Chapter 13, separating your GUI from your code is a key factor in rapid application development (RAD). Immunity wishes to spend as little time as possible writing GUI code, and as much time as possible writing application code. PyGTK is a natural fit if you can use it.
GEOip
GEOip is a free library you can install on your server that enables you to programmatically map IP addresses to countries. The following code block shows the basic usage of this in a CGI script, but Immunity uses it inside CANVAS as well. Having an IP-to-country mapping is useful in many cases, and for some extra cash, GEOip can offer you the city and state level.
#!/usr/bin/python
import os, cgi, sys, md5

os.putenv("LD_LIBRARY_PATH", "/usr/local/lib")  # for GeoIP
sys.path.append("/usr/local/lib")
os.environ["LD_LIBRARY_PATH"] = "/usr/local/lib"
#print os.environ

def getresult(who, ip):
    import GeoIP
    gi = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)
    country = gi.country_code_by_addr(ip)
    if country != "US":
        error(ip = ip)
    # [ ... ]
Summary
Using Python in a commercial setting as part of consumer software can be trying. While Python is great on one computer, supporting it on thousands of computers requires a level of infrastructure you may not have expected. In addition, unlike thicker software stacks such as Java, operating system and platform differences leak through to the developer.
In the end, though, being able to develop your application ten times faster than the Java farm next door may mean the difference between success and failure. Immunity has found that even a small difference in the amount of Python you use can make a huge difference in time-to-market.
As with any technology, it helps to have your business model oriented correctly around the limitations of the platform. Copy protection is made harder, but customer support is made easier. The trade-offs are there, but as long as you understand them, you can use Python to deadly effect in the business world.
19
Numerical Programming
In this chapter, you will learn how to use Python to work with numbers. You’ve already seen some arithmetic examples, but after reading this chapter, you’ll have a better understanding of the different ways you can represent numbers in Python, of how to perform mathematical computations, and of efficient ways of working with large numerical data sets.
Numerical code lies at the heart of technical software, and is used widely in science, engineering, finance, and related fields. Almost any substantial program does some nontrivial numerical computation, so it pays to be familiar with some of the contents of this chapter even if you are not working in one of these fields. For instance, if you are writing a script to analyze web logs, you might want to compute statistics on the rate of hits on your web server; if you are writing a program with a graphical user interface, you might need math functions to compute the coordinates of the graphics in your GUI.
Parts of this chapter require some understanding of math beyond simple arithmetic. Feel free to skip over these if you have forgotten the math being used. The last section of this chapter, which discusses numerical arrays, is technically more advanced than most of the material in this book, but it’s important reading if you plan to use Python for handling large sets of numbers.
Designing software that performs complex numerical computation, known as numerical analysis, is both a science and an art. There are often many ways of doing a computation, and numerical analysis tells you which of these will produce an answer closest to the correct result. Things can get tricky, especially when working with floating-point numbers, because, as you will see, a floating-point number is merely an approximation of a real number. This chapter mentions numerical precision but doesn’t go into the finer points, so if you are embarking on writing software that performs extensive floating-point computations, consider flipping through a book on numerical analysis to get a sense of the kind of problems you might run into.
Numbers in Python
A number, like any object in Python, has a type. Python has four basic numerical types. Two of these, int and long, represent integers, and float represents floating-point numbers. The fourth numeric type, which is covered later in this chapter, represents complex floating-point numbers.

Chapter 19
Integers
You’ve already seen the simplest integer type, int. If you write an ordinary number in your program like 42, called a literal number, Python creates an int object for it:
>>> x = 42
>>> type(x)
<type 'int'>
You didn’t have to construct the int explicitly, but you could if you want, like this:
>>> x = int(42)
You can also use the int constructor to convert other types, such as strings or other numerical types, to integers:
>>> x = int("17")
>>> y = int(4.8)
>>> print x, y, x - y
17 4 13
In the first line, Python converts a string representing a number to the number itself; you can’t do math with “17” (a string), but you can with 17 (an integer). In the second line, Python converts the floating-point value 4.8 to the integer 4 by truncating it, that is, chopping off the part after the decimal point to make it an integer.
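The conversions described here can be sketched as follows. The variable names are illustrative only, and the code is written so that it also runs on Python 3, which is an assumption beyond the interpreter used in this chapter. The negative case is worth noting: truncation moves toward zero, so it is not the same as rounding down.

```python
# int() accepts strings and other numeric types.
from_string = int("17")
from_float = int(4.8)       # truncated to 4
from_negative = int(-4.8)   # truncated toward zero: -4, not -5

print("%d %d %d" % (from_string, from_float, from_negative))

# A string that does not spell an integer is rejected with ValueError.
try:
    int("4.8")
    rejected = False
except ValueError:
    rejected = True

print("int('4.8') rejected: %s" % rejected)
```

If you need to turn a string such as "4.8" into an integer, convert it to a float first and then truncate, as in int(float("4.8")).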
When you convert a string to an int, Python assumes the number is represented in base 10. You can specify another base as the second argument. For instance, if you pass 16, the number is assumed to be hexadecimal:
>>> hex_number = "a1"
>>> print int(hex_number, 16)
161
You can specify hexadecimal literals by prefixing the number with 0x. For example, hexadecimal 0xa1 is equivalent to decimal 161. Similarly, literals starting with just a 0 are assumed to be octal (base 8), so octal 0105 is equivalent to decimal 69. These conventions are used in many other programming languages, too.
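A short sketch of base conversion follows. It passes the base explicitly to int rather than relying on a leading-zero octal literal, because that literal form is specific to the Python version used in this book (later versions spell it 0o105). The binary example is an addition for illustration.

```python
# Parse strings written in different bases by passing the base to int().
from_hex = int("a1", 16)     # hexadecimal digits
from_octal = int("105", 8)   # octal digits
from_binary = int("1010", 2) # binary digits

# A hexadecimal literal produces the same int as the parsed string.
hex_literal = 0xa1

print("%d %d %d %d" % (from_hex, from_octal, from_binary, hex_literal))
```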
Long Integers
What’s the largest number Python can store in an int? Python uses at least 32 bits to represent integers, which means that you can store numbers at least as large as 2³¹ - 1 and negative numbers as small as -2³¹. If you need to store a larger number, Python provides the long type, which represents arbitrarily large integers.
For example, long before the search engine Google existed, mathematicians defined a googol, a one followed by 100 zeros. To represent this number in Python, you could type out the hundred zeros, or you can save yourself the trouble by using the exponentiation operator, **:
>>> googol = 10 ** 100
>>> print googol
10000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000
This is an example of a long object:
>>> type(googol)
<type 'long'>
Note that when you computed the value of googol, you used only int literals — namely, 10 and 100. Python converted the result to a long automatically because it didn’t fit in an int.
If you enter a literal that is too large for an int, Python uses a long automatically:
>>> type(12345678900)
<type 'long'>
You can also construct a long object for a number that would fit in an int. Either call the long constructor explicitly, as in long(42), or append an L to the literal, as in 42L.
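The automatic switch to arbitrary precision can be checked with a few lines. This sketch avoids the long constructor and the L suffix so that it also runs on Python 3, where int and long were merged into a single int type; that compatibility goal is an assumption added here, not something the text requires.

```python
# Integer arithmetic stays exact no matter how large the values get.
googol = 10 ** 100
digit_count = len(str(googol))   # a one followed by 100 zeros

# Adding 1 is not lost to rounding, as it would be with a float.
still_exact = (googol + 1) - googol

# Products that overflow 32 bits are promoted automatically.
big = 2 ** 31
product_ok = (big * big == 2 ** 62)

print("%d digits, difference %d, product ok: %s"
      % (digit_count, still_exact, product_ok))
```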
Floating-point Numbers
In Python, a floating-point number is represented by a float object. A floating-point number is only an approximation to a real number, so you may sometimes see results that look strange. For example:
>>> x = 1.1
>>> x
1.1000000000000001
>>> print x
1.1
What’s going on here? You assigned to x the floating-point approximation to the number 1.1. The floating-point number that Python can represent that is closest to 1.1 is actually a tiny bit different, and Python is honest with you and shows this number when you ask for the full representation of x. When you print x, however, Python provides you with a “nice” depiction of the number, which doesn’t show enough decimal places to illustrate the floating-point approximation.
Simply entering x at the command prompt prints what you would get by calling repr(x). Entering print x prints what you would get by calling str(x).
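Because the exact text produced by repr and str for floats differs between Python versions, the following sketch requests the digits explicitly with the % operator instead. It assumes the usual 64-bit double representation; 17 significant digits are enough to expose the approximation, while 12 hide it. The tolerance of 1e-9 in the last comparison is an arbitrary value chosen for illustration.

```python
x = 1.1

full = "%.17g" % x    # enough digits to show the stored approximation
nice = "%.12g" % x    # fewer digits, so the error is rounded away

print(full)
print(nice)

# The same approximation is why exact equality tests on floats can surprise you.
exactly_equal = (0.1 + 0.2 == 0.3)
close_enough = abs((0.1 + 0.2) - 0.3) < 1e-9

print("equal: %s, close: %s" % (exactly_equal, close_enough))
```

Comparing with a small tolerance, as in the last test, is the usual way to sidestep the approximation when checking floating-point results.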
Representation of int and long
Internally, Python uses the C type long to represent int objects. If you are using a 64-bit architecture, Python can represent numbers between -2⁶³ and 2⁶³ - 1 as int objects. However, it’s best to assume that an int is only 32 bits, in case you later decide to run your program on another architecture.
Use the Python long type for larger integers. For these, Python uses an internal representation that isn’t fixed in size, so there’s no limit. Be aware, however, that long objects take up more memory than int objects, and computations involving them are much slower than those using only int objects.
Floating-point Precision
A floating-point number is an approximation. As you have seen, it can carry only a limited number of digits of precision.
Formally, Python does not make any promises about the number of digits of precision retained in float variables. However, internally Python uses the C type double to store the contents of float objects, so if you know the precision of a C double variable on a platform, you’ll know the precision of a Python float when running on that platform.
Most systems store a double in 64 bits and provide about 16 digits of precision.
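If you would rather ask the interpreter than look up the platform's C double, later Python releases expose the details as sys.float_info. That attribute postdates the Python version this chapter was written for, so treat the sketch below as an assumption about your interpreter; the values mentioned in the comments hold for the common 64-bit IEEE 754 double.

```python
import sys

info = sys.float_info

mantissa_bits = info.mant_dig   # 53 for a 64-bit IEEE 754 double
safe_digits = info.dig          # decimal digits that always survive a round trip
epsilon = info.epsilon          # gap between 1.0 and the next larger float

# Anything smaller than half of epsilon is lost when added to 1.0.
lost = (1.0 + epsilon / 2 == 1.0)

print("bits=%d digits=%d epsilon=%g lost=%s"
      % (mantissa_bits, safe_digits, epsilon, lost))
```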
As with integers, you can use the float constructor to convert strings to numbers (but only in base 10). For example:
>>> x = float(“16.4”)
Very large and very small floating-point numbers are represented with exponential notation, which separates out the power of ten. A googol as a floating-point number would be 1e+100, which means the number 1 times ten raised to the power 100. The U.S. national debt at the time this was written, according to the Treasury Department web site, was:
>>> debt = 7784834892156.63
Python prefers exponential notation to print a number this large:
>>> print debt
7.78483489216e+012
You can also enter literals with exponential notation.
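A brief sketch of exponential literals follows. The particular values are made up for illustration, and the %e conversion shown at the end is the format-string counterpart of the notation; the number of exponent digits it prints can vary by platform on older interpreters.

```python
# Exponential notation in literals: mantissa, then e, then the power of ten.
googol_float = 1e100
small = 2.5e-3

same_googol = (googol_float == float(10 ** 100))
same_small = (small == 0.0025)

# %e asks for exponential notation when formatting.
formatted = "%.2e" % 12345.678

print("%s %s %s" % (same_googol, same_small, formatted))
```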
Formatting Numbers
You can convert any Python number to a string using the str constructor. This produces the text that would be printed by the print statement, as a string object. For simple applications, this is adequate.
For better control of the output format, use Python’s built-in string formatting operator, %.
Note that this has nothing to do with the remainder operator. If you use % after a string, that’s the string formatting operator. If you use % between two numbers, then you get the remainder operator.
Following are some details on formatting numbers. If you are familiar with the printf function in C, you already know much of the syntax for formatting numbers in Python.
To format an integer (int or long), use the %d conversion in the format string. For a floating-point number, use %f. If you use %d with a floating-point number or %f with an integer, Python will convert the number to the type indicated by the conversion. For example:
>>> print "%d" % 100
100
>>> print "%d" % 101.6
101
You probably didn’t really notice, since it’s so obvious, that Python formatted these integers in base 10. For some applications, you might prefer your output in hexadecimal. Use the %x conversion to produce this. If you use %#x, Python puts 0x before the output to make it look just like a hexadecimal literal value, like so:
>>> print "%#x" % 100
0x64
Similarly, %o (that’s the letter “o,” not a zero) produces output in octal, and %#o produces octal output preceded by a 0.
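The hexadecimal and octal conversions can be tried side by side. The %X conversion (uppercase digits) is an addition to what the text describes. The %#o form is left out of the sketch because the prefix it prints depends on the Python version: a bare 0 in the version used here, 0o in Python 3.

```python
value = 100

plain_hex = "%x" % value       # lowercase hexadecimal digits
prefixed_hex = "%#x" % value   # same digits with a 0x prefix
upper_hex = "%X" % 255         # uppercase hexadecimal digits
plain_octal = "%o" % 69        # octal digits

print("%s %s %s %s" % (plain_hex, prefixed_hex, upper_hex, plain_octal))
```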
For integers, you can specify the minimum width (number of characters) of the output by placing a number after the % in the format string. If the number starts with 0, the output will be left-padded with zeros; otherwise, it will be padded with spaces. In the examples that follow, the output is surrounded with parentheses so you can see exactly what Python generates for the %d conversions:
>>> print "z is (%6d)" % 175
z is (   175)
>>> print "z is (%06d)" % 175
z is (000175)
When you format floating-point numbers, you can specify the total width of the output, and/or the number of digits displayed after the decimal point. If you want the output to have total width w and to display p decimal places, use the conversion %w.pf in the format string. The total width includes the decimal point and digits after the decimal point. Unlike converting a float to an integer value, Python rounds to the nearest digit in the last decimal place:
>>> x = 20.0 / 3
>>> print "(%6.2f)" % x
(  6.67)
If you omit the number before the decimal point, Python uses as much room as necessary to print the integer part and the decimal places you asked for:
>>> print "(%.4f)" % x
(6.6667)
You can demand as many digits as you want, but remember that a float carries a limited precision and, therefore, contains information for only 16 digits or so. Python will add zero digits to fill out the rest:
>>> two_thirds = 2.0 / 3
>>> print "%.40f" % two_thirds
0.6666666666666666300000000000000000000000
The number you see may be slightly different, as architectures handle the details of floating-point computations differently.
If you omit the number after the decimal point (or specify zero decimal places), Python doesn’t show any decimal places and omits the decimal point, too:
>>> print "(%4.f)" % x
(   7)
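The width and precision rules covered so far can be collected into one runnable sketch. The parentheses make the padding visible, and the variable names are illustrative only.

```python
x = 20.0 / 3

padded_int = "(%6d)" % 175      # width 6, padded with spaces
zero_padded = "(%06d)" % 175    # width 6, padded with zeros
two_places = "(%6.2f)" % x      # width 6, two decimal places, rounded
four_places = "(%.4f)" % x      # no width given, four decimal places
no_places = "(%4.f)" % x        # width 4, no decimal places or point

for text in (padded_int, zero_padded, two_places, four_places, no_places):
    print(text)
```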
For example, the following function formats the ratio of its arguments, num and den, as a percentage, showing one digit after the decimal point:
>>> def as_percent(num, den):
...     if den == 0:
...         ratio = 0
...     else:
...         ratio = float(num) / den
...     return "%5.1f%%" % (100 * ratio)
...
>>> print "ratio = " + as_percent(6839, 13895)
ratio =  49.2%
One nice thing about this function is that it confirms that the denominator is not zero, to avoid division-by-zero errors. Moreover, look closely at the format string. The first % goes with the f as part of the floating-point conversion. The %% at the end is converted to a single % in the output: Because the percent symbol is used to indicate a conversion, Python requires you to use two of them in a format string if you want one in your output.
You don’t have to hard-code the width or number of decimal places in the format string. If you use an asterisk instead of a number in the conversion, Python takes the value from an extra integer argument in the argument tuple (positioned before the number that’s being formatted). Using this feature, you can write a function that formats U.S. dollars. Its arguments are an amount of money and the number of digits to use for the dollars part, not including the two digits for cents:
>>> def format_dollars(dollars, places):
...     return "$%*.2f" % (places + 3, dollars)
...
>>> print format_dollars(499.98, 5)
$  499.98
In the format string, you use * instead of the total width in the floating-point conversion. Python looks at the argument tuple and uses the first value as the total width of the conversion. In this case, you specify three more than the desired number of digits for dollars, to leave room for the decimal point and the two digits for cents.
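The asterisk can stand in for the precision as well as the width. The helper below, format_fixed, is a hypothetical name introduced only for this sketch; it takes both values from the argument tuple, in the order width, precision, then the number being formatted.

```python
def format_fixed(value, width, places):
    """Format value in a field of the given width with the given decimal places."""
    # Each * consumes one integer from the tuple before the value itself.
    return "%*.*f" % (width, places, value)

pi_text = format_fixed(3.14159, 8, 3)
price_text = format_fixed(499.98, 8, 2)

print("[" + pi_text + "]")
print("[" + price_text + "]")
```

Passing the width and precision as arguments is handy when the layout is decided at run time, such as when aligning a column to its widest entry.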
Even more options are available for controlling the output of numbers with the string formatting operator. Consult the Python documentation for details, under the section on sequence types (because strings are sequences) in the Python Library Reference.
Characters as Numbers
What about characters? C and C++ programmers are used to manipulating characters as numbers, as C’s char type is just another integer numeric type. Python doesn’t work like this, though. In Python, a character is just a string of length one, and cannot be used as a number.
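A minimal sketch of the difference: adding a number to a one-character string fails, and converting between a character and its numeric code takes an explicit call. The built-in functions ord and chr used here are standard, though whether they are the approach this chapter goes on to describe is an assumption.

```python
letter = "a"

# Mixing a string and a number in arithmetic raises TypeError.
try:
    letter + 1
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Explicit conversions between a character and its code.
code = ord(letter)           # character to number
next_letter = chr(code + 1)  # number back to character

print("%s %d %s" % (mixed_ok, code, next_letter))
```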