In this post, I’ll show you how to get started with interacting with C code from Python using Cython.

This is by no means a complete guide. For that, see the official documentation.

You can download the complete source code for this Cython demo here.


Dependencies

  • This Cython demo has been tested on Python3.


The example application

For the purpose of this post, let us build a Python application that can compress a user-specified string using miniLZO and write the compressed output to a file.

You’ll want to download the source code for miniLZO which we’ll refer to in the subsequent sections.


Exporting C API’s in Cython

In this case, we’ll be writing a Cython wrapper over the LZO1X-1 compression API to compress a given string.

The file testmini.c in the miniLZO source code contains example code for how to use the LZO compression and decompression API’s.

We’ll need to export two API’s - lzo_init() and lzo1x_1_compress() for use by our Python application.

lzo_init() is declared in lzoconf.h and the core compression API - lzo1x_1_compress() - is declared in minilzo.h.

To export the above two API’s to our Python application, we write the following Cython code in miniLZO.pxd.

cdef extern from "minilzo-2.10/minilzo.h":
    int lzo1x_1_compress(char *lzo_in,
                         unsigned int len_in,
                         char *lzo_out,
                         unsigned int *len_out,
                         void *workmem)

cdef extern from "minilzo-2.10/lzoconf.h":
    int lzo_init()

The code is pretty self-explanatory.

You can compare the declarations of the above two API’s with their corresponding declarations in the lzoconf.h and minilzo.h header files to see that they match.

miniLZO.pxd can be thought of as a Cython header file which we’ll import into the main Cython code in the next section.


Calling the exported C API’s in Cython

Put the following code in lzo_compress.pyx.

from libc.stdint cimport uint32_t, uint16_t
from libc.stdlib cimport malloc, free
from libc.stdio cimport FILE, fopen, fwrite, fclose
cimport miniLZO

cdef lzo_compress(char *lzo_in, uint32_t len_in):
    cdef char *lzo_out = <char *>malloc(1 * 1024 * 1024)
    cdef void *workmem = malloc(1 * 1024 * 1024)
    cdef uint32_t len_out = 0
    cdef int result = 0
    cdef FILE *fp

    assert miniLZO.lzo_init() == 0, "LZO init failed."

    result = miniLZO.lzo1x_1_compress(lzo_in, len_in, lzo_out, &len_out, workmem)

    assert result == 0, "Compression failed."

    print("Compressed output length = {}".format(len_out))

    fp = fopen('lzo_out', 'wb')
    fwrite(lzo_out, len_out, 1, fp)
    fclose(fp)

    free(lzo_out)
    free(workmem)

    return result

def do_lzo_compress(lzo_in):
    encoded_input = lzo_in.encode('ascii')
    print("Input length = {}".format(len(encoded_input)))

    return lzo_compress(encoded_input, len(encoded_input))


Let’s go through snippets of the code in more detail below:

from libc.stdint cimport uint32_t, uint16_t
from libc.stdlib cimport malloc, free
from libc.stdio cimport FILE, fopen, fwrite, fclose
cimport miniLZO

What we’re doing here is importing certain .pxd files (can be thought of as Cython header files). Note that you’re also importing the miniLZO.pxd file you created in the previous step. The stdint, stdlib and stdio are also just .pxd files like the one we’ve written in the previous section. You’ll find these .pxd files in Cython/Includes/libc/.

It is important to understand that, although the code looks like Python, we are actually writing C code. The cythonize command will autogenrate C code from this lzo_compress.pyx file which we’ll see in a later section.


cdef lzo_compress(char *lzo_in, uint32_t len_in):

This is the beginning of the Cython function which accepts a string and its length (from the main Python application which we’ll see later), compresses it using LZO and writes the compressed output to a file - lzo_out.


cdef char *lzo_out = <char *>malloc(1 * 1024 * 1024)
cdef void *workmem = malloc(1 * 1024 * 1024)
cdef uint32_t len_out = 0
cdef int result = 0
cdef FILE *fp

These are some of the local variables we’ll be using in this Cython function. Note the <> (angular brackets) for typecasting. Again, this looks like Python code, but we’re actually writing C code.


assert miniLZO.lzo_init() == 0, "LZO init failed."

result = miniLZO.lzo1x_1_compress(lzo_in, len_in, lzo_out, &len_out, workmem)

assert result == 0, "Compression failed."

print("Compressed output length = {}".format(len_out))

This is the part where we do LZO compression. Before calling the core compression API - lzo1x_1_compress() we have to call lzo_init(). Both these API’s were exported earlier in the miniLZO.pxd file that we wrote.

Please see testmini.c in the miniLZO source code for an example of how to use these API’s to perform compression.

lzo_in is the input string specified by the user and len_in - its length. lzo_out is memory we’ve allocated to store the compressed output. workmem is memory we’ve allocated that LZO uses as scratch space during compression.

Once compression is complete we check the return value and print the length of the compressed output.


fp = fopen('lzo_out', 'wb')
fwrite(lzo_out, len_out, 1, fp)
fclose(fp)

free(lzo_out)
free(workmem)

The final part is C code which writes the compressed output stored in lzo_out to a file named lzo_out and frees up all temporary memory allocated locally.


You may be wondering what the pure Python function - do_lzo_compress() is doing. Functions defined with cdef are not visible to Python applications when the .pyx file is imported. So, lzo_compress() is not callable by a Python application which imports lzo_compress.pyx. That is the reason, we define a pure Python function - do_lzo_compress() - which is callable by the Python application - and which in-turn calls the Cython function - lzo_compress().

Python3 - on which this demo has been tested - requires us to pass


The Python application

Finally, we write a Python program which accepts a string from the user and compresses it using the do_lzo_compress() API we wrote in the previous section. Put the following code in compress_string.py.

import lzo_compress

user_str = input()
lzo_compress.do_lzo_compress(user_str)


Putting it all together

The Cython code we wrote earlier - lzo_compress.pyx - does nothing unless it is compiled. Let us see how to do that.

Put the following code in Makefile.

.DEFAULT_GOAL := all

CFLAGS=-Wall -fPIC -I/home/varun/anaconda3/include/python3.6m -I.

minilzo.o:
	gcc $(CFLAGS) -c minilzo-2.10/minilzo.c -o minilzo.o

lzo_compress.c:
	cythonize lzo_compress.pyx

lzo_compress.o: lzo_compress.c
	gcc $(CFLAGS) -c lzo_compress.c -o lzo_compress.o

lzo_compress: lzo_compress.o minilzo.o
	gcc -shared $(CFLAGS) -o lzo_compress.so lzo_compress.o minilzo.o

all: lzo_compress

clean:
	rm -rf *.o *.so *.c


This is a bog-standard Makefile except for a few lines which we’ll discuss below.

CFLAGS=-Wall -fPIC -I/home/varun/anaconda3/include/python3.6m -I.

This line has to be changed with the include directory path on your machine. How do you find it out ?

Run the cythonize command with the -i option on lzo_compress.pyx like below.

$ cythonize -i lzo_compress.pyx

The compilation may fail, but that’s okay. What we’re interested in is a line that shows the compilation options; something like -

gcc -pthread -B /home/varun/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I. 
-I/home/varun/anaconda3/include/python3.6m -c /home/varun/cython_demo/lzo_compress.c -o /home/varun/cython_demo/tmppjsm8ivu/home/varun/cython_dem
o/lzo_compress.o

Copy the include directory path from the above output (may be different on your machine) and paste it into the Makefile.


lzo_compress.c:
	cythonize lzo_compress.pyx

This is the execution of the cythonize command to auto-generate C code from the Cython code - lzo_compress.pyx.


lzo_compress: lzo_compress.o minilzo.o
	gcc -shared $(CFLAGS) -o lzo_compress.so lzo_compress.o minilzo.o

This is the generation of the shared object which can then be imported into the Python application - compress_string.py.


Example run

[lenovo ~/cython_demo ]$ ls
minilzo-2.10/  compress_string.py  lzo_compress.pyx  Makefile  miniLZO.pxd

[lenovo ~/cython_demo ]$ make clean && make
rm -rf *.o *.so *.c lzo_out
cythonize lzo_compress.pyx
Compiling /home/varun/cython_demo/lzo_compress.pyx because it changed.
[1/1] Cythonizing /home/varun/cython_demo/lzo_compress.pyx
...
gcc -Wall -fPIC -I/home/varun/anaconda3/include/python3.6m -I. -c minilzo-2.10/minilzo.c -o minilzo.o
gcc -shared -Wall -fPIC -I/home/varun/anaconda3/include/python3.6m -I. -o lzo_compress.so lzo_compress.o minilzo.o

[lenovo ~/cython_demo ]$ python3 compress_string.py 
This could be useful when plotting a multidimensional Pareto Front.  If you see a solution of interest in your scatter plot,  you can directly access its exact values and its index by a simple mouse click.  This index may be useful to track the corresponding decision vector in your decision matrix.  Also, note that even if I am only plotting a 5D scatter plot, I used a 6 D data matrix, hence, I can see what the sixth value is on the terminal.  The following code was used to generate the previous figure. Parts 2.1.  through 2.5 were adapted from the Visualization strategies for multidimensional data post, refer to this link for a detailed explanation of these fragments.
Input length = 677
Compressed output length = 593

[lenovo ~/cython_demo ]$ # Voila ! Compression was successful.