
Beginning Python (2005)

Extension Programming with C

 

 

    0,                              /* tp_iter */
    0,                              /* tp_iternext */
    0,                              /* tp_methods */
    0,                              /* tp_members */
    0,                              /* tp_getset */
    0,                              /* tp_base */
    0,                              /* tp_dict */
    0,                              /* tp_descr_get */
    0,                              /* tp_descr_set */
    0,                              /* tp_dictoffset */
    0,                              /* tp_init */
    0,                              /* tp_alloc */
    Encoder_new,                    /* tp_new */
    0,                              /* tp_free */
};

This is the type object for your encoder: the structure your Encoder_new function receives a pointer to when it is called. There’s a lot to that structure (and even more that you can’t see yet), but you’re letting most of the members default to NULL for now. You’ll go over the important bits before moving on.

The PyObject_HEAD_INIT macro adds the members that are common to all types. It must be the first member in the structure. It’s like PyObject_HEAD except that it initializes the type pointer to whatever you pass in as an argument.

Remember: In Python, types are objects, too, so they also have types. You could call a type’s type a “type type”. The Python API calls it PyType_Type. It’s the type of type objects. You really want to be able to pass &PyType_Type into this macro but some compilers won’t let you statically initialize a structure member with a symbol defined in some other module, so you’ll have to fill that in later.
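The “type type” idea can be checked from the interpreter. The following is a small sketch written for Python 3; the Encoder class in it is a hypothetical pure-Python stand-in, not the C type built in this chapter.

```python
# Every object has a type, and types are objects too.
class Encoder(object):
    """Hypothetical stand-in for the C-level Encoder type."""

instance = Encoder()

type_of_instance = type(instance)     # the Encoder class itself
type_of_class = type(Encoder)         # the "type type" (PyType_Type in the C API)
type_of_type = type(type_of_class)    # type is its own type

print(type_of_instance.__name__, type_of_class.__name__, type_of_type.__name__)
```

The chain stops at type, which is why a single PyType_Type is enough on the C side.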

The next member, ob_size, might look important but it’s a remnant from an older version of the Python API and should be ignored. The member after the name of your type, tp_basicsize, is the size in bytes of each instance of your object. When the interpreter needs to allocate storage space for a new instance, it will request tp_basicsize bytes.
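CPython exposes tp_basicsize to Python code as the __basicsize__ attribute of a type, so the idea can be explored without compiling anything. This is a sketch for Python 3 on CPython only. The two classes are hypothetical, and the slot names outfp and gfp simply anticipate the fields added to the C structure later in this chapter.

```python
import struct

POINTER_SIZE = struct.calcsize("P")   # size of a C pointer on this platform

class Stateless(object):
    __slots__ = ()                    # nothing beyond the common object header

class WithState(object):
    __slots__ = ("outfp", "gfp")      # two pointer-sized fields per instance

header_size = Stateless.__basicsize__
extra_bytes = WithState.__basicsize__ - header_size

print(header_size, WithState.__basicsize__, extra_bytes)
```

Each slot adds one pointer to the instance size, much as each pointer member after PyObject_HEAD adds to sizeof() of the C structure.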

Most of the rest of the members are currently NULL, but you’ll be filling them in later. They’ll hold function pointers for some of the more common operations that many objects support.

The tp_flags member specifies some default flags for the type object, which all type objects need; and the tp_doc member holds a pointer to the docstring for the type, which you always want to provide because we’re good Python citizens.

Notice the tp_alloc and tp_free members, which are set to NULL. Aren’t those the members we’re calling from Encoder_new and Encoder_dealloc? Yes, they are, but you’re going to use a Python API function to fill them in with the appropriate addresses later on because some platforms don’t like it when you statically initialize structure members with addresses of functions in other libraries.

At this point, you’ve defined two structures. To actually make them available via your extension module, you need to add some code to your module’s initialization function:


Chapter 17

PyMODINIT_FUNC initpylame2() {
    PyObject *m;

    if (PyType_Ready(&pylame2_EncoderType) < 0) {
        return;
    }
    m = Py_InitModule3("pylame2", pylame2_methods, "My second LAME module.");
    Py_INCREF(&pylame2_EncoderType);
    PyModule_AddObject(m, "Encoder", (PyObject *)&pylame2_EncoderType);
}

PyType_Ready gets a type object “ready” for use by the interpreter. It sets the type of the object to PyType_Type and sets a number of the function pointer members that you had previously left NULL, along with a number of other bookkeeping tasks necessary in order to hook everything up properly, including setting your tp_alloc and tp_free members to suitable functions.

After you get your type object ready, you create your module as usual, but this time you’re saving the return value (a pointer to a module object) so you can add your new type object to the module. Previously, you had been ignoring the return value and letting the method table define all of the members of the module. Because there’s no way to fit a PyObject pointer into a method table, you need to use the PyModule_AddObject function to add your type object to the module. This function takes in the pointer to the module returned from Py_InitModule3, the name of your new object as it should be known in the module, and the pointer to the new object itself.
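Seen from the Python side, the initialization function creates a module object and binds one extra name inside it. The sketch below is only a loose pure-Python analogy for Python 3, not what the C API literally executes, and the Encoder class is a hypothetical stand-in for pylame2_EncoderType.

```python
import types

class Encoder(object):
    """Hypothetical stand-in for pylame2_EncoderType."""

# Roughly what Py_InitModule3 hands back: a module object with a docstring.
m = types.ModuleType("pylame2", "My second LAME module.")

# Roughly what PyModule_AddObject does: bind a name inside that module.
setattr(m, "Encoder", Encoder)

e = m.Encoder()
print(m.__name__, m.__doc__, type(e).__name__)
```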

If you were to compile what you had so far, you’d be able to create new Encoder instances:

>>> import pylame2
>>> e = pylame2.Encoder()

That object doesn’t do you much good, however, as it doesn’t have any useful behavior yet.

To make these objects useful, you have to allow for some information to be passed into their initialization functions. That information could simply be the path to the file to which you want to write. Your initialization function could use that path to open a file handle that would enable you to write to it, but there’ll be no writing until somebody invokes the encode method on your object. Therefore, your object needs to retain the handle for the file it opened.

You’re also going to be invoking functions defined in the LAME library, so your objects will also need to remember the pointer to the lame_global_flags structure returned by lame_init.

Here’s your structure with state and a modified Encoder_new function to initialize it:

typedef struct {
    PyObject_HEAD
    FILE *outfp;
    lame_global_flags *gfp;
} pylame2_EncoderObject;

static PyObject *Encoder_new(PyTypeObject *type, PyObject *args, PyObject *kw) {
    pylame2_EncoderObject *self = (pylame2_EncoderObject *)type->tp_alloc(type, 0);
    self->outfp = NULL;
    self->gfp = NULL;
    return (PyObject *)self;
}


You’re not checking args and kw here, because this is the equivalent of Python’s __new__ method, not __init__. It’s in your C implementation of __init__ that you’ll be opening the file and initializing the LAME library:

static int Encoder_init(pylame2_EncoderObject *self, PyObject *args, PyObject *kw) {
    char *outpath;

    if (!PyArg_ParseTuple(args, "s", &outpath)) {
        return -1;
    }
    if (self->outfp || self->gfp) {
        PyErr_SetString(PyExc_Exception, "__init__ already called");
        return -1;
    }
    self->outfp = fopen(outpath, "wb");
    self->gfp = lame_init();
    lame_init_params(self->gfp);
    return 0;
}

Your __init__ implementation is checking two things. The first you’ve already seen. You’re using PyArg_ParseTuple to ensure that you were passed in one string parameter. The second check is ensuring that the outfp and gfp members of your instance are NULL. If they’re not, this function must already have been called for this object, so return the appropriate error code for this function after using the PyErr_SetString function to “set” an exception. After you return into the Python interpreter, an exception will be raised and your caller is going to have to catch it or suffer the consequences. You need to do this because it’s always possible to call __init__ twice on an object. With this code in place, calling __init__ twice on your objects might look like this:

>>> import pylame2
>>> e = pylame2.Encoder("foo.mp3")
>>> e.__init__("bar.mp3")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
Exception: __init__ already called

Of course, you could be nice and reinitialize the object, but that’s not necessary for what you want to get done today. You should also be checking for errors, of course.
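The split between __new__ and __init__, and the guard against a second __init__ call, can be modeled in pure Python. This is a sketch for Python 3 under the assumption that remembering the path is enough to stand in for the real state. The class opens no file and calls no LAME function, and the outpath attribute is a hypothetical substitute for outfp and gfp.

```python
class Encoder(object):
    """Pure-Python model of the C type; it opens no file and calls no LAME code."""

    def __new__(cls, *args, **kw):
        # Like Encoder_new: allocate, clear the state, ignore the arguments.
        self = super().__new__(cls)
        self.outpath = None
        return self

    def __init__(self, outpath):
        # Like Encoder_init: refuse to run twice on the same instance.
        if self.outpath is not None:
            raise Exception("__init__ already called")
        self.outpath = outpath

e = Encoder("foo.mp3")
try:
    e.__init__("bar.mp3")
    second_call_error = None
except Exception as exc:
    second_call_error = str(exc)

print(second_call_error)
```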

To indicate that you want this initialization function to be called for each new instance of your class, you need to add the address of this function to your type object:

    (initproc)Encoder_init,         /* tp_init */

You’re casting it here because we cheated and declared that Encoder_init accepted a pylame2_EncoderObject * as its first argument instead of the more generic PyObject *. You can get away with this type of stuff in C, but you have to be absolutely certain that you know what you’re doing.

Because your instances now contain state that references resources, you need to ensure that those resources are properly disposed of when the object is released. To do this, update your Encoder_dealloc function:


static void Encoder_dealloc(pylame2_EncoderObject *self) {
    if (self->gfp) {
        lame_close(self->gfp);
    }
    if (self->outfp) {
        fclose(self->outfp);
    }
    self->ob_type->tp_free(self);
}

If you were to build your module with the code you have so far, import it, create an encoder object, and then delete it (using the del keyword or rebinding the variable referencing your object to None or some other object), you would end up with an empty file in the current directory because all you did was open and then close it without writing anything to it. You’re getting closer!

You now need to add support for the encode and close methods to your type. Previously, you had created what was called a method table, but that was really defining module-level functions. Defining methods for a type is just as easy, though the table is attached differently. You define the methods just like the module-level functions and then create a table listing them:

static PyObject *Encoder_encode(PyObject *self, PyObject *args) {
    Py_RETURN_NONE;
}

static PyObject *Encoder_close(PyObject *self) {
    Py_RETURN_NONE;
}

static PyMethodDef Encoder_methods[] = {
    { "encode", Encoder_encode, METH_VARARGS,
      "Encodes and writes data to the output file." },
    { "close", (PyCFunction)Encoder_close, METH_NOARGS,
      "Closes the output file." },
    { NULL, NULL, 0, NULL }
};

Then the address of the table is used to initialize the tp_methods member of your type object:

    Encoder_methods,                /* tp_methods */

With those stubs in place, you could build the module and see the methods and even call them on your objects:

>>> import pylame2
>>> e = pylame2.Encoder('foo.mp3')
>>> dir(e)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__',
'__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__str__', 'close', 'encode']
>>> e.encode()
>>> e.close()


All you have to do now is implement the functions. Here’s Encoder_encode (sans complete error checking):

static PyObject *Encoder_encode(pylame2_EncoderObject *self, PyObject *args) {
    char *in_buffer;
    int in_length;
    int mp3_length;
    char *mp3_buffer;
    int mp3_bytes;

    if (!(self->outfp && self->gfp)) {
        PyErr_SetString(PyExc_Exception, "encoder not open");
        return NULL;
    }
    if (!PyArg_ParseTuple(args, "s#", &in_buffer, &in_length)) {
        return NULL;
    }
    in_length /= 2;
    mp3_length = (int)(1.25 * in_length) + 7200;
    mp3_buffer = (char *)malloc(mp3_length);
    if (in_length > 0) {
        mp3_bytes = lame_encode_buffer_interleaved(
            self->gfp,
            (short *)in_buffer,
            in_length / 2,
            mp3_buffer,
            mp3_length
        );
        if (mp3_bytes > 0) {
            fwrite(mp3_buffer, 1, mp3_bytes, self->outfp);
        }
    }
    free(mp3_buffer);
    Py_RETURN_NONE;
}
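The length arithmetic in Encoder_encode is easy to lose track of, so here it is traced in Python 3. This is only a sketch of the size calculations, assuming 16-bit interleaved stereo input as the C code does. The function name encode_sizes is hypothetical, and 4096 is the read size used by the driver script later in this chapter.

```python
def encode_sizes(in_bytes):
    """Return (samples_per_channel, mp3_buffer_bytes) for one encode() call.

    Assumes 16-bit interleaved stereo input, as Encoder_encode does.
    """
    total_samples = in_bytes // 2               # in_length /= 2: two bytes per sample
    samples_per_channel = total_samples // 2    # in_length / 2: two channels
    mp3_buffer_bytes = int(1.25 * total_samples) + 7200
    return samples_per_channel, mp3_buffer_bytes

samples, buffer_bytes = encode_sizes(4096)
print(samples, buffer_bytes)
```

The output buffer is sized from the total sample count rather than the per-channel count, which errs on the side of a larger buffer.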

You expect this function to be passed a string. Unlike strings in C, which are simple NUL-terminated arrays of characters, this string will contain embedded NUL characters. (The NUL character, which simply marks the end of a string in C, is written '\0'. Note the single quotes: in C, single quotes denote a character constant and double quotes a string literal. The empty string literal "" consists of nothing but that terminating NUL.) Therefore, instead of using the "s" indicator when parsing the arguments, you use "s#", which allows for embedded NUL characters. PyArg_ParseTuple will return both a pointer to the bytes and the length of the buffer instead of relying on a NUL character at the end. Other than that, this function is pretty straightforward.
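To see why raw audio cannot be treated as a C string, it helps to look at the bytes of a few samples. The sketch below, written for Python 3, packs four made-up 16-bit sample values and compares the real length with the point at which a NUL-terminated reading would stop.

```python
import struct

# Four 16-bit little-endian PCM samples; small values produce zero bytes.
pcm = struct.pack("<4h", 0, 1, -1, 256)

byte_count = len(pcm)                  # the length that "s#" would report
c_string_length = pcm.find(b"\x00")    # where a NUL-terminated reading would stop

print(byte_count, c_string_length)
```

A NUL-terminated reading would stop at the very first byte here, while the explicit length covers all eight.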

Here’s Encoder_close:

static PyObject *Encoder_close(pylame2_EncoderObject *self) {
    int mp3_length;
    char *mp3_buffer;
    int mp3_bytes;

    if (!(self->outfp && self->gfp)) {
        PyErr_SetString(PyExc_Exception, "encoder not open");
        return NULL;
    }
    mp3_length = 7200;
    mp3_buffer = (char *)malloc(mp3_length);
    /* Pass the allocated size; sizeof(mp3_buffer) is only the size of the pointer. */
    mp3_bytes = lame_encode_flush(self->gfp, mp3_buffer, mp3_length);
    if (mp3_bytes > 0) {
        fwrite(mp3_buffer, 1, mp3_bytes, self->outfp);
    }
    free(mp3_buffer);
    lame_close(self->gfp);
    self->gfp = NULL;
    fclose(self->outfp);
    self->outfp = NULL;
    Py_RETURN_NONE;
}

You need to make sure you set outfp and gfp to NULL here to prevent Encoder_dealloc from trying to close them again.

For both Encoder_encode and Encoder_close, you’re checking to make sure your object is in a valid state for encoding and closing. Somebody could always call close and then follow that up with another call to close or even a call to encode. It’s better to raise an exception than to bring down the process hosting your extension module.
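The state checks can be modeled without any C at all. The following Python 3 sketch uses a hypothetical EncoderModel class whose single is_open flag stands in for outfp and gfp being non-NULL. It encodes nothing and only counts the bytes it is given.

```python
class EncoderModel(object):
    """Hypothetical model of the state checks; it encodes nothing."""

    def __init__(self):
        self.is_open = True        # stands in for outfp and gfp being non-NULL
        self.bytes_seen = 0

    def _require_open(self):
        if not self.is_open:
            raise Exception("encoder not open")

    def encode(self, data):
        self._require_open()
        self.bytes_seen += len(data)

    def close(self):
        self._require_open()
        self.is_open = False       # like setting outfp and gfp to NULL

def error_message(call, *args):
    """Return the exception text raised by call(*args), or None."""
    try:
        call(*args)
    except Exception as exc:
        return str(exc)
    return None

model = EncoderModel()
first_encode = error_message(model.encode, b"\x00\x01")
first_close = error_message(model.close)
second_close = error_message(model.close)
late_encode = error_message(model.encode, b"\x00\x01")
print(first_encode, first_close, second_close, late_encode)
```

The caller gets an ordinary exception for the misuse, which is the behavior the C checks are there to guarantee.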

We went over a lot to get to this point, so it would probably help if you could see the entire extension module in one large example:

#include <Python.h>
#include <lame.h>

typedef struct {
    PyObject_HEAD
    FILE *outfp;
    lame_global_flags *gfp;
} pylame2_EncoderObject;

static PyObject *Encoder_new(PyTypeObject *type, PyObject *args, PyObject *kw) {
    pylame2_EncoderObject *self = (pylame2_EncoderObject *)type->tp_alloc(type, 0);
    self->outfp = NULL;
    self->gfp = NULL;
    return (PyObject *)self;
}

static void Encoder_dealloc(pylame2_EncoderObject *self) {
    if (self->gfp) {
        lame_close(self->gfp);
    }
    if (self->outfp) {
        fclose(self->outfp);
    }
    self->ob_type->tp_free(self);
}

static int Encoder_init(pylame2_EncoderObject *self, PyObject *args, PyObject *kw) {
    char *outpath;

    if (!PyArg_ParseTuple(args, "s", &outpath)) {
        return -1;
    }
    if (self->outfp || self->gfp) {
        PyErr_SetString(PyExc_Exception, "__init__ already called");
        return -1;
    }
    self->outfp = fopen(outpath, "wb");
    self->gfp = lame_init();
    lame_init_params(self->gfp);
    return 0;
}

static PyObject *Encoder_encode(pylame2_EncoderObject *self, PyObject *args) {
    char *in_buffer;
    int in_length;
    int mp3_length;
    char *mp3_buffer;
    int mp3_bytes;

    if (!(self->outfp && self->gfp)) {
        PyErr_SetString(PyExc_Exception, "encoder not open");
        return NULL;
    }
    if (!PyArg_ParseTuple(args, "s#", &in_buffer, &in_length)) {
        return NULL;
    }
    in_length /= 2;
    mp3_length = (int)(1.25 * in_length) + 7200;
    mp3_buffer = (char *)malloc(mp3_length);
    if (in_length > 0) {
        mp3_bytes = lame_encode_buffer_interleaved(
            self->gfp,
            (short *)in_buffer,
            in_length / 2,
            mp3_buffer,
            mp3_length
        );
        if (mp3_bytes > 0) {
            fwrite(mp3_buffer, 1, mp3_bytes, self->outfp);
        }
    }
    free(mp3_buffer);
    Py_RETURN_NONE;
}

static PyObject *Encoder_close(pylame2_EncoderObject *self) {
    int mp3_length;
    char *mp3_buffer;
    int mp3_bytes;

    if (!(self->outfp && self->gfp)) {
        PyErr_SetString(PyExc_Exception, "encoder not open");
        return NULL;
    }
    mp3_length = 7200;
    mp3_buffer = (char *)malloc(mp3_length);
    /* Pass the allocated size; sizeof(mp3_buffer) is only the size of the pointer. */
    mp3_bytes = lame_encode_flush(self->gfp, mp3_buffer, mp3_length);
    if (mp3_bytes > 0) {
        fwrite(mp3_buffer, 1, mp3_bytes, self->outfp);
    }
    free(mp3_buffer);
    lame_close(self->gfp);
    self->gfp = NULL;
    fclose(self->outfp);
    self->outfp = NULL;
    Py_RETURN_NONE;
}

static PyMethodDef Encoder_methods[] = {
    { "encode", (PyCFunction)Encoder_encode, METH_VARARGS,
      "Encodes and writes data to the output file." },
    { "close", (PyCFunction)Encoder_close, METH_NOARGS,
      "Closes the output file." },
    { NULL, NULL, 0, NULL }
};

 

static PyTypeObject pylame2_EncoderType = {
    PyObject_HEAD_INIT(NULL)
    0,                              /* ob_size */
    "pylame2.Encoder",              /* tp_name */
    sizeof(pylame2_EncoderObject),  /* tp_basicsize */
    0,                              /* tp_itemsize */
    (destructor)Encoder_dealloc,    /* tp_dealloc */
    0,                              /* tp_print */
    0,                              /* tp_getattr */
    0,                              /* tp_setattr */
    0,                              /* tp_compare */
    0,                              /* tp_repr */
    0,                              /* tp_as_number */
    0,                              /* tp_as_sequence */
    0,                              /* tp_as_mapping */
    0,                              /* tp_hash */
    0,                              /* tp_call */
    0,                              /* tp_str */
    0,                              /* tp_getattro */
    0,                              /* tp_setattro */
    0,                              /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT,             /* tp_flags */
    "My first encoder object.",     /* tp_doc */
    0,                              /* tp_traverse */
    0,                              /* tp_clear */
    0,                              /* tp_richcompare */
    0,                              /* tp_weaklistoffset */
    0,                              /* tp_iter */
    0,                              /* tp_iternext */
    Encoder_methods,                /* tp_methods */
    0,                              /* tp_members */
    0,                              /* tp_getset */
    0,                              /* tp_base */
    0,                              /* tp_dict */
    0,                              /* tp_descr_get */
    0,                              /* tp_descr_set */
    0,                              /* tp_dictoffset */
    (initproc)Encoder_init,         /* tp_init */
    0,                              /* tp_alloc */
    Encoder_new,                    /* tp_new */
    0,                              /* tp_free */
};

static PyMethodDef pylame2_methods[] = {
    { NULL, NULL, 0, NULL }
};

PyMODINIT_FUNC initpylame2() {
    PyObject *m;

    if (PyType_Ready(&pylame2_EncoderType) < 0) {
        return;
    }
    m = Py_InitModule3("pylame2", pylame2_methods, "My second LAME module.");
    Py_INCREF(&pylame2_EncoderType);
    PyModule_AddObject(m, "Encoder", (PyObject *)&pylame2_EncoderType);
}

You can now save this file as pylame2.c and compile it.

On Linux:

gcc -shared -I/usr/include/python2.4 -I/usr/include/lame pylame2.c \
    -lmp3lame -o pylame2.so

On Windows:

cl /LD /IC:\Python24\include /IC:\lame-3.96.1\include pylame2.c ^
    C:\Python24\libs\python24.lib ^
    C:\lame-3.96.1\libmp3lame\Release\libmp3lame.lib ^
    C:\lame-3.96.1\mpglib\Release\mpglib.lib

Once that’s done, you can exercise your new extension module with a simple driver script written entirely in Python:

import pylame2

INBUFSIZE = 4096

encoder = pylame2.Encoder('test.mp3')
input = file('test.raw', 'rb')
data = input.read(INBUFSIZE)
while data != '':
    encoder.encode(data)
    data = input.read(INBUFSIZE)
input.close()
encoder.close()


That completes version 2 of your extension module. You’re able to read data from anywhere. Your sample driver is still reading from the raw input file you created earlier, but there’s nothing stopping it from extracting that information out of a WAV file or reading it from a socket.
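As an illustration of feeding the encoder from a WAV file, here is a sketch for Python 3 that uses the standard wave module. Because the compiled pylame2 module may not be at hand, it uses a hypothetical RecordingEncoder class that only records what it is given. The helper encode_wav and the in-memory test file are likewise assumptions made to keep the example self-contained.

```python
import io
import wave

class RecordingEncoder(object):
    """Hypothetical stand-in for pylame2.Encoder that only records its input."""
    def __init__(self):
        self.chunks = []
        self.closed = False
    def encode(self, data):
        self.chunks.append(data)
    def close(self):
        self.closed = True

def encode_wav(wav_file, encoder, frames_per_read=1024):
    """Feed the PCM frames of a WAV file (path or file object) to encoder."""
    reader = wave.open(wav_file, "rb")
    try:
        data = reader.readframes(frames_per_read)
        while data:
            encoder.encode(data)
            data = reader.readframes(frames_per_read)
    finally:
        reader.close()
    encoder.close()

# Build a tiny 16-bit stereo WAV in memory so the sketch needs no input file.
pcm = bytes(range(256)) * 32          # 8192 bytes = 2048 stereo frames
wav_bytes = io.BytesIO()
writer = wave.open(wav_bytes, "wb")
writer.setnchannels(2)
writer.setsampwidth(2)
writer.setframerate(44100)
writer.writeframes(pcm)
writer.close()
wav_bytes.seek(0)

encoder = RecordingEncoder()
encode_wav(wav_bytes, encoder)
print(len(encoder.chunks), sum(len(c) for c in encoder.chunks))
```

With the real module, the stand-in would be replaced by an Encoder instance and the loop would stay the same.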

The only deficiency with this version of the module is that you can’t customize how the encoded data is written. You’re going to fix that in the next revision of the module by “writing” to an object and not directly to the file system. Intrigued? Read on.

Using Python Objects from C Code

Python’s a dynamically typed language, so it doesn’t have a formal concept of interfaces even though we use them all the time. The most common interface is the “file” interface. Terms like “file-like object” describe this interface. It’s really nothing more than an object that “looks like” a file object. Usually, it can get by with only either a read or write method and nothing more.

For the next version of your extension module, you’re going to allow your users to pass in any file-like object (supporting a write method) when constructing new encoder objects. Your encoder object will simply call the write method with the MP3-encoded bytes. You don’t have to be concerned about whether it’s a real file object or a socket or anything else your users can dream up. This is polymorphism at its finest.
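The file interface described above can be shown in a few lines of Python 3. In this sketch the emit function and the CountingSink class are hypothetical, and the byte strings are placeholders rather than real MP3 data. The point is only that the writer needs nothing from its target beyond a write method.

```python
import io

class CountingSink(object):
    """A minimal file-like object: all it offers is write()."""
    def __init__(self):
        self.total = 0
    def write(self, data):
        self.total += len(data)

def emit(outfp, chunks):
    """Write each chunk to outfp, which only needs a write() method."""
    for chunk in chunks:
        outfp.write(chunk)

mp3_chunks = [b"\xff\xfb\x90\x00", b"\x00" * 12]   # placeholder bytes, not real MP3 data

memory_file = io.BytesIO()
counter = CountingSink()
emit(memory_file, mp3_chunks)
emit(counter, mp3_chunks)
print(len(memory_file.getvalue()), counter.total)
```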

In the last version of the module, your object held a FILE *. You need to change this by adding a reference to a PyObject and removing the FILE *:

typedef struct {
    PyObject_HEAD
    PyObject *outfp;
    lame_global_flags *gfp;
} pylame3_EncoderObject;

Encoder_new can stay the same because all it does is set outfp to NULL. Encoder_dealloc, however, needs to be modified:

static void Encoder_dealloc(pylame3_EncoderObject *self) {
    if (self->gfp) {
        lame_close(self->gfp);
    }
    Py_XDECREF(self->outfp);
    self->ob_type->tp_free(self);
}

Instead of calling fclose, you use the Py_XDECREF macro to decrement the reference count by one. You can’t delete the object, because there might be other references to it. In fact, other references to this object are likely because the object came from outside of this module. You didn’t create it, but somebody else did and passed it in to you. They probably still have a variable bound to that object.
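Reference counts can be observed from Python with sys.getrefcount, which makes the shared-ownership point concrete. This is a CPython-specific sketch for Python 3. The Holder class is a hypothetical stand-in for an encoder that keeps its outfp, and only the differences between the counts are meaningful, not their absolute values.

```python
import sys

class Holder(object):
    """Hypothetical stand-in for an encoder that keeps a reference to outfp."""
    def __init__(self, outfp):
        self.outfp = outfp         # storing the object takes a new reference

outfp = object()                   # the caller's own reference
before = sys.getrefcount(outfp)

holder = Holder(outfp)
while_held = sys.getrefcount(outfp)

del holder                         # the holder goes away and drops its reference
after = sys.getrefcount(outfp)

print(while_held - before, after - before)
```

The object outlives the holder because the caller still has a name bound to it.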

If you’re decrementing the reference count here in Encoder_dealloc, you must be incrementing it someplace else. You’re doing that in Encoder_init:
