[Cython] How to interact with C code from Python?
In this post, I’ll show you how to get started with calling C code from Python using Cython.
This is by no means a complete guide. For that, see the official documentation.
You can download the complete source code for this Cython demo here.
Dependencies
- This Cython demo has been tested on Python3.
The example application
For the purpose of this post, let us build a Python application that can compress a user-specified string using miniLZO and write the compressed output to a file.
You’ll want to download the source code for miniLZO which we’ll refer to in the subsequent sections.
Exporting C APIs in Cython
In this case, we’ll be writing a Cython wrapper over the LZO1X-1 compression API to compress a given string.
The file testmini.c in the miniLZO source code contains example code for how to use the LZO compression and decompression APIs.
We’ll need to export two APIs, lzo_init() and lzo1x_1_compress(), for use by our Python application. lzo_init() is declared in lzoconf.h and the core compression API, lzo1x_1_compress(), is declared in minilzo.h.
To export these two APIs to our Python application, we write the following Cython code in miniLZO.pxd.
cdef extern from "minilzo-2.10/minilzo.h":
    int lzo1x_1_compress(char *lzo_in,
                         unsigned int len_in,
                         char *lzo_out,
                         unsigned int *len_out,
                         void *workmem)

cdef extern from "minilzo-2.10/lzoconf.h":
    int lzo_init()
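The code is pretty self-explanatory. You can compare the declarations of these two APIs with their corresponding declarations in the lzoconf.h and minilzo.h header files. The headers use LZO's own typedefs (such as lzo_bytep and lzo_uint), which are written here as plain C types; if lzo_uint is wider than unsigned int on your platform, adjust the declarations so that they match.
miniLZO.pxd can be thought of as a Cython header file which we’ll cimport into the main Cython code in the next section.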
Calling the exported C APIs in Cython
Put the following code in lzo_compress.pyx.
from libc.stdint cimport uint32_t, uint16_t
from libc.stdlib cimport malloc, free
from libc.stdio cimport FILE, fopen, fwrite, fclose
cimport miniLZO

cdef lzo_compress(char *lzo_in, uint32_t len_in):
    cdef char *lzo_out = <char *>malloc(1 * 1024 * 1024)
    cdef void *workmem = malloc(1 * 1024 * 1024)
    cdef uint32_t len_out = 0
    cdef int result = 0
    cdef FILE *fp

    assert miniLZO.lzo_init() == 0, "LZO init failed."
    result = miniLZO.lzo1x_1_compress(lzo_in, len_in, lzo_out, &len_out, workmem)
    assert result == 0, "Compression failed."
    print("Compressed output length = {}".format(len_out))

    fp = fopen('lzo_out', 'wb')
    fwrite(lzo_out, len_out, 1, fp)
    fclose(fp)
    free(lzo_out)
    free(workmem)
    return result

def do_lzo_compress(lzo_in):
    encoded_input = lzo_in.encode('ascii')
    print("Input length = {}".format(len(encoded_input)))
    return lzo_compress(encoded_input, len(encoded_input))
Let’s go through snippets of the code in more detail below:
from libc.stdint cimport uint32_t, uint16_t
from libc.stdlib cimport malloc, free
from libc.stdio cimport FILE, fopen, fwrite, fclose
cimport miniLZO
What we’re doing here is cimporting certain .pxd files (which can be thought of as Cython header files). Note that we’re also cimporting the miniLZO.pxd file created in the previous step. stdint, stdlib and stdio are also just .pxd files like the one we wrote in the previous section. You’ll find these .pxd files in Cython/Includes/libc/.
It is important to understand that, although the code looks like Python, we are actually writing C code. The cythonize command will auto-generate C code from this lzo_compress.pyx file, which we’ll see in a later section.
cdef lzo_compress(char *lzo_in, uint32_t len_in):
This is the beginning of the Cython function which accepts a string and its length (from the main Python application which we’ll see later), compresses it using LZO and writes the compressed output to a file named lzo_out.
cdef char *lzo_out = <char *>malloc(1 * 1024 * 1024)
cdef void *workmem = malloc(1 * 1024 * 1024)
cdef uint32_t len_out = 0
cdef int result = 0
cdef FILE *fp
These are some of the local variables we’ll be using in this Cython function. Note the <> (angle brackets) used for typecasting. Again, this looks like Python code, but we’re actually writing C code.
assert miniLZO.lzo_init() == 0, "LZO init failed."
result = miniLZO.lzo1x_1_compress(lzo_in, len_in, lzo_out, &len_out, workmem)
assert result == 0, "Compression failed."
print("Compressed output length = {}".format(len_out))
This is the part where we do LZO compression. Before calling the core compression API, lzo1x_1_compress(), we have to call lzo_init(). Both these APIs were exported earlier in the miniLZO.pxd file that we wrote. Please see testmini.c in the miniLZO source code for an example of how to use these APIs to perform compression.
lzo_in is the input string specified by the user and len_in is its length.
lzo_out is memory we’ve allocated to store the compressed output.
workmem is memory we’ve allocated that LZO uses as scratch space during compression.
Once compression is complete, we check the return value and print the length of the compressed output.
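The 1 MiB allocated for lzo_out is comfortably large for short strings, but LZO can slightly expand input that does not compress. As a rough sketch, the helper below follows the worst-case output size that testmini.c uses for its own output buffer (the OUT_LEN macro, in_len + in_len / 16 + 64 + 3). The function name worst_case_lzo1x_output_len is hypothetical and is not part of miniLZO or of this demo.

```python
def worst_case_lzo1x_output_len(in_len: int) -> int:
    """Upper bound on LZO1X-1 output size, following OUT_LEN in testmini.c.

    Hypothetical helper for illustration; not part of miniLZO.
    """
    if in_len < 0:
        raise ValueError("in_len must be non-negative")
    # Integer division mirrors the C expression IN_LEN / 16.
    return in_len + in_len // 16 + 64 + 3


# Output space needed in the worst case for a 677-byte input.
print(worst_case_lzo1x_output_len(677))
```

Any input whose worst-case size fits within 1 MiB is safe with the fixed allocation used in this demo; larger inputs would need a buffer sized from the input length.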
fp = fopen('lzo_out', 'wb')
fwrite(lzo_out, len_out, 1, fp)
fclose(fp)
free(lzo_out)
free(workmem)
The final part is C code which writes the compressed output stored in lzo_out to a file named lzo_out and frees up all temporary memory allocated locally.
You may be wondering what the pure Python function do_lzo_compress() is doing.
Functions defined with cdef are not visible to Python applications when the compiled .pyx module is imported. So, lzo_compress() is not callable by a Python application which imports lzo_compress. That is why we define a pure Python function, do_lzo_compress(), which is callable by the Python application and which in turn calls the Cython function lzo_compress().
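Python3, on which this demo has been tested, requires us to pass a bytes object (rather than a str) for a char * argument. That is why do_lzo_compress() encodes the input string before passing it to lzo_compress().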
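The encoding step can be tried in plain Python without compiling anything. This small sketch shows what do_lzo_compress() hands to the C-level function: a bytes object whose length is the byte count. It also shows what happens when the input is not pure ASCII. The helper name to_ascii_bytes is made up for illustration.

```python
def to_ascii_bytes(text: str) -> bytes:
    """Encode a str the same way do_lzo_compress() does before calling into C."""
    return text.encode("ascii")


encoded = to_ascii_bytes("hello LZO")
print(type(encoded).__name__, len(encoded))

try:
    # "caf\u00e9" contains a non-ASCII character (e with an acute accent).
    to_ascii_bytes("caf\u00e9")
except UnicodeEncodeError as exc:
    # The 'ascii' codec cannot represent characters outside the ASCII range.
    print("cannot encode:", exc.reason)
```

In other words, the demo as written only accepts ASCII input; a different codec such as 'utf-8' would be needed for other text.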
The Python application
Finally, we write a Python program which accepts a string from the user and compresses it using the do_lzo_compress() API we wrote in the previous section. Put the following code in compress_string.py.
import lzo_compress
user_str = input()
lzo_compress.do_lzo_compress(user_str)
Putting it all together
The Cython code we wrote earlier, lzo_compress.pyx, does nothing unless it is compiled. Let us see how to do that.
Put the following code in Makefile.
.DEFAULT_GOAL := all
CFLAGS=-Wall -fPIC -I/home/varun/anaconda3/include/python3.6m -I.

minilzo.o:
	gcc $(CFLAGS) -c minilzo-2.10/minilzo.c -o minilzo.o

lzo_compress.c:
	cythonize lzo_compress.pyx

lzo_compress.o: lzo_compress.c
	gcc $(CFLAGS) -c lzo_compress.c -o lzo_compress.o

lzo_compress: lzo_compress.o minilzo.o
	gcc -shared $(CFLAGS) -o lzo_compress.so lzo_compress.o minilzo.o

all: lzo_compress

clean:
	rm -rf *.o *.so *.c
This is a bog-standard Makefile except for a few lines which we’ll discuss below.
CFLAGS=-Wall -fPIC -I/home/varun/anaconda3/include/python3.6m -I.
This line has to be changed to use the Python include directory path on your machine. How do you find it?
Run the cythonize command with the -i option on lzo_compress.pyx as shown below.
$ cythonize -i lzo_compress.pyx
The compilation may fail, but that’s okay. What we’re interested in is the line that shows the compilation options, which looks something like this:
gcc -pthread -B /home/varun/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/varun/anaconda3/include/python3.6m -c /home/varun/cython_demo/lzo_compress.c -o /home/varun/cython_demo/tmppjsm8ivu/home/varun/cython_demo/lzo_compress.o
Copy the include directory path from the above output (may be different on your machine) and paste it into the Makefile.
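If running cythonize -i is inconvenient, the include directory can also be queried from the Python interpreter itself. The sketch below uses the standard library's sysconfig module. It is an alternative to the approach above rather than something the original Makefile relies on, and the path it prints depends on which interpreter runs it.

```python
import sysconfig

# Directory that holds Python.h for the interpreter running this script.
include_dir = sysconfig.get_paths()["include"]

# Print it in the form expected by the CFLAGS line of the Makefile.
print("-I" + include_dir)
```

Run it with the same python3 that will import the extension, so that the headers match the interpreter.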
lzo_compress.c:
	cythonize lzo_compress.pyx
This runs the cythonize command to auto-generate C code from the Cython code in lzo_compress.pyx.
lzo_compress: lzo_compress.o minilzo.o
	gcc -shared $(CFLAGS) -o lzo_compress.so lzo_compress.o minilzo.o
This generates the shared object, which can then be imported into the Python application, compress_string.py.
Example run
[lenovo ~/cython_demo ]$ ls
minilzo-2.10/ compress_string.py lzo_compress.pyx Makefile miniLZO.pxd
[lenovo ~/cython_demo ]$ make clean && make
rm -rf *.o *.so *.c lzo_out
cythonize lzo_compress.pyx
Compiling /home/varun/cython_demo/lzo_compress.pyx because it changed.
[1/1] Cythonizing /home/varun/cython_demo/lzo_compress.pyx
...
gcc -Wall -fPIC -I/home/varun/anaconda3/include/python3.6m -I. -c minilzo-2.10/minilzo.c -o minilzo.o
gcc -shared -Wall -fPIC -I/home/varun/anaconda3/include/python3.6m -I. -o lzo_compress.so lzo_compress.o minilzo.o
[lenovo ~/cython_demo ]$ python3 compress_string.py
This could be useful when plotting a multidimensional Pareto Front. If you see a solution of interest in your scatter plot, you can directly access its exact values and its index by a simple mouse click. This index may be useful to track the corresponding decision vector in your decision matrix. Also, note that even if I am only plotting a 5D scatter plot, I used a 6 D data matrix, hence, I can see what the sixth value is on the terminal. The following code was used to generate the previous figure. Parts 2.1. through 2.5 were adapted from the Visualization strategies for multidimensional data post, refer to this link for a detailed explanation of these fragments.
Input length = 677
Compressed output length = 593
[lenovo ~/cython_demo ]$ # Voila ! Compression was successful.
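To put the two numbers from the run in context, the snippet below computes how much space was saved. The function name space_saved_percent is hypothetical; the lengths are the ones printed in the run above.

```python
def space_saved_percent(input_len: int, output_len: int) -> float:
    """Percentage by which the compressed output is smaller than the input."""
    if input_len <= 0:
        raise ValueError("input_len must be positive")
    return (1 - output_len / input_len) * 100


# Lengths reported by the example run: 677 bytes in, 593 bytes out.
saved = space_saved_percent(677, 593)
print("Space saved: {:.1f}%".format(saved))
```

A saving of roughly 12% is modest, which is unsurprising for a short piece of English text compressed with a speed-oriented algorithm.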