
Binary modules for Python


Python is cool. We say "pip install" and most likely the right library will be installed. But sometimes the answer is "compilation failed", because there are binary modules involved. Binary modules are a pain point in almost every modern language: there are many architectures, some things have to be built for a specific machine, and some have to be linked against other libraries. In general, it is an interesting but little-studied question: how do you build them and what problems come up? Dmitry Zhiltsov (zaabjuda) tried to answer this question at MoscowPython Conf last year.
Here is the text version of Dmitry’s talk. We will briefly cover when binary modules are needed and when it is better to do without them, discuss the rules to follow when writing them, and look at five possible implementation options:

  • Native C/C++ Extension
  • SWIG
  • Cython
  • Ctypes
  • Rust

About the speaker: Dmitry Zhiltsov has been in development for more than 10 years. He works at CIAN as a system architect, i.e. he is responsible for technical decisions and deadline control. Over the years he has tried Assembler, Haskell and C, and for the last 5 years he has been actively programming in Python.

About the company

Many people who live in Moscow and rent apartments probably know about CIAN. CIAN is 7 million buyers and renters a month. All of these users find a place to live every month with the help of our service.
75% of Muscovites know about our company, which is very cool. In St. Petersburg and Moscow we are practically considered a monopoly. At the moment we are expanding into the regions, and over the last 3 years the development team has grown 8-fold. That means 8 times the team and 8 times the speed of delivering value to the user, i.e. from the product idea to the moment an engineer rolls the build out to production. In our large team we have learned to develop very quickly and to understand very quickly what is going on at any moment, but today we are going to talk about something a little different.
I’m going to talk about binary modules. Nowadays almost 50% of Python libraries contain some kind of binary module. As it turns out, a lot of people are not familiar with them and think it is something arcane, dark and unnecessary. Other people suggest that it is better to write a separate microservice than to use binary modules.
This article will be in two parts.

  1. My experience: what they are for, when it is best to use them, and when not.
  2. Tools and techniques with which to implement a binary module for Python.

Why we need binary modules

We all know very well that Python is an interpreted language. It is almost the fastest of the interpreted languages, but unfortunately its speed is not always enough for heavy mathematical calculations. That is where the idea comes from that it would be faster in C.
But Python has another pain point – the GIL. A huge number of articles have been written and talks given on how to work around it.
We also need binary extensions to reuse logic. For example, we found a C library that has all the functionality we need, so why not use it? We don’t have to rewrite the code; we just take the ready-made code and reuse it.
Many people believe that binary extensions can be used to hide source code. That is a very, very controversial point: with some wild contortions it can be achieved, but there is no 100% guarantee. The best you can hope for is to keep the client from decompiling and seeing what is going on in the code you shipped.

When are binary extensions really needed?

With speed and Python it is clear: when some function is very slow and takes 80% of the execution time of all the code, we start thinking about writing a binary extension. But to make such decisions we need, as one famous speaker said, to think with our brains.
To write C extensions, you have to accept that it will take a long time. First you have to polish your algorithms, i.e. check whether there are any bugs.

In 90% of cases, after a thorough check of the algorithm, there is no need to write any extensions.

The second case where binary extensions are really needed is multithreading for simple operations. It is not so relevant now, but it is still around in bloody enterprise and in some systems integrators where they still write in Python 2.6. There is no asynchrony there, and even for simple things like downloading a bunch of pictures, multithreading comes up. It seems at first that there are no costs beyond the network, but as soon as we write a picture into a buffer, the unfortunate GIL comes along and the slowdown starts. As practice shows, such things are better solved with libraries that Python knows nothing about.
If you need to implement some specific protocol, it can be convenient to write a simple C/C++ implementation and save yourself a lot of pain. I did that once at a telecom operator, because there was no ready-made library – I had to write it myself. But again, it is not very relevant now, because there is asyncio, and it is enough for most tasks.
I already mentioned heavy operations. When you have number crunching, large matrices and the like, it is logical to write an extension in C/C++. I want to note that some people think binary extensions are not needed here – that it is better to make a microservice in some "super fast language" and send huge matrices over the network. No, it is better not to do that.
Another good example of when they can and even should be used is when you have established module logic. If your company has a Python module or library that has been around for 3 years with only 2 lines of changes per year, why not turn it into a normal C library, if you have the time and resources to spare? At the very least you will get a performance gain. You will also come to understand that if you ever need dramatic changes in this library, it will not be so easy, and maybe you should be using the library in some other way.

5 golden rules

I derived these rules from my practice. They apply not only to Python, but also to other languages for which you can use binary extensions. You can argue with them, but you can also think about them and derive your own.

  1. Export only functions. Building Python classes in binary libraries is quite time-consuming: you have to describe a lot of interfaces and take care of a lot of referential integrity inside the module itself. It is easier to write a small interface for a function.
  2. Use wrapper classes. Some people really like OOP and badly want classes. Even so, it is better to just write a Python wrapper: you create a class, define a class method or a regular method, and call the C/C++ functions from it. At the very least this helps maintain the integrity of the data architecture. And if you use some third-party C/C++ extension that you cannot fix, you can patch around it in the wrapper to make everything work. (A minimal sketch of such a wrapper follows this list.)
  3. Do not pass Python arguments straight into native code. This is not even a rule but a requirement. It may work in some cases, but it is usually a bad idea. In your C code you must first write a handler that converts the Python types to C types, and only then call the native function that works with C types. The same handler takes the result of the native function, converts it back to Python data types, and returns it to the Python code.
  4. Keep garbage collection in mind. Python has its well-known GC, and we should not forget about it. For example, we pass a big chunk of text by reference and try to find some word in it with a C library. We want to parallelize this, so we pass a reference to that memory area and start several threads. At that moment the GC may simply decide that nothing refers to this object anymore and remove it from memory. In the C code we are left with a dangling pointer, which usually means a segmentation fault. Do not forget about this behavior of the garbage collector, and pass the simplest data types into C libraries: char, integer, and so on.
    On the other hand, the language in which the extension is written can have its own garbage collector. The combination of Python and a C# library is a pain in that sense.
  5. Explicitly declare the arguments of exported functions. What I mean is that these functions have to be annotated carefully. Even if we accept a PyObject in our C library anyway, we need to explicitly specify which arguments have which types. This is useful because if we pass the wrong type of data, we get a clear error from the C library. So this is for your own convenience.
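To make rules 2, 4 and 5 a little more concrete, here is a minimal sketch of such a Python-side wrapper. The _textsearch extension module and its count_word(bytes, bytes) function are hypothetical, invented purely for illustration:

# wrapper.py – a minimal sketch of rules 2, 4 and 5.
# _textsearch is a hypothetical compiled extension exporting one function:
#     count_word(text: bytes, word: bytes) -> int
import _textsearch


class TextSearcher:
    """The binary module only exports a function (rule 1); the OOP
    convenience lives entirely on the Python side (rule 2)."""

    def __init__(self, text: str) -> None:
        # Keep a reference to the encoded buffer for the object's lifetime,
        # so the GC cannot collect it while native code may still need it (rule 4).
        self._buffer = text.encode('utf-8')

    def count(self, word: str) -> int:
        # Cross the boundary with the simplest possible types (rules 3 and 5).
        return _textsearch.count_word(self._buffer, word.encode('utf-8'))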

Architecture of binary extensions

Actually, there is nothing complicated about the architecture of binary extensions. There is Python, there is a calling function that lands in a wrapper, which natively calls the C code. That call, in turn, lands in a function that is exported to Python and that Python can call directly. It is in this function that you convert the Python data types into the data types of your language. Only after everything has been converted do we call the native function that does the main logic; on the way back it returns the result, which is converted back to Python data types and handed to the Python code.

Technologies and tools

The best-known way to write binary extensions is the native C/C++ extension, if only because it is the standard Python technology.

Native C/C++ extension

Python itself is implemented in C, and when writing extensions you use methods and structures from Python.h. By the way, another good thing about this approach is that it is very easy to add to an existing project. Simply specify ext_modules in setup.py and say that to build the project you need to compile such-and-such sources with such-and-such compilation flags. Below is an example.

name = 'DateTime.mxDateTime.mxDateTime'
src = 'mxDateTime/mxDateTime.c'
extra_compile_args = ['-g3', '-O0', '-DDEBUG=2', '-UNDEBUG', '-std=c++11', '-Wall', '-Wextra']

setup(
    ...
    ext_modules=[
        (name, {
            'sources': [src],
            'include_dirs': ['mxDateTime'],
            'extra_compile_args': extra_compile_args,
        })
    ],
)

Pros of Native C/C++ Extension

  • Native Technology.
  • Easy to integrate into a project build.
  • The largest amount of documentation.
  • Allows you to create your own data types.

Disadvantages of Native C/C++ Extension

  • High entry threshold.
  • Knowledge of C is required.
  • Boost.Python.
  • Segmentation Fault.
  • Difficulties in debugging.

There is a huge amount of documentation written about this technology, both the official docs and posts on all sorts of blogs. It is also a huge plus that we can create our own Python data types and construct our own classes.
This approach has big disadvantages. First, there is the entry threshold – not everyone knows C well enough to write production code. You have to understand that it is not enough to read a book and run off to write native extensions. If you want to do it, start by learning C, then write some command-line utilities, and only after that move on to writing extensions.
Boost.Python is very good for C++: it lets you abstract away almost all the wrappers we write for Python. But the downside, I think, is that pulling a piece of it into a project without downloading all of Boost takes a lot of sweat.
When I list debugging difficulties as a minus, I mean that everyone is used to a graphical debugger nowadays, but with binary modules that will not work. Most likely you will need GDB with a Python plugin.
Let’s look at an example of how we create this in the first place.

#include <Python.h>

static PyObject* addList_add(PyObject* self, PyObject* args)
{
    PyObject* listObj;
    if (!PyArg_ParseTuple(args, "O", &listObj))
        return NULL;

    long length = PyList_Size(listObj);
    int i, sum = 0;
    // Omit the implementation
    return Py_BuildValue("i", sum);
}

First, we include the Python header file. After that we describe the addList_add function that Python will use. The most important thing is to name the function correctly: in this case addList is the name of the C module and _add is the name of the function that Python will call. The module itself and the arguments come in as PyObject pointers. After that we do the standard checks. Here we try to parse the argument tuple and say it contains an object – the "O" literal has to be specified explicitly. Now we know that listObj was passed as an object, and we find out its length with a standard Python call, PyList_Size. Notice that at this point we cannot use plain C calls to find out the length of this vector; we use Python’s functionality. We omit the implementation, after which we need to return the value back to Python. To do this we call Py_BuildValue, specify what type of data we are returning – in this case "i" for integer – and the sum variable itself.
In this case it is clear to everyone – we find the sum of all the elements in the list. Let’s go a little further.

for (i = 0; i < length; i++) {
    // get an item from the list
    // it's also a Python object
    PyObject* temp = PyList_GetItem(listObj, i);
    // we know that the item is an integer
    // cast it to a C type
    long elem = PyLong_AsLong(temp);
    sum += elem;
}

It’s the same here, at this point listObj is a Python object. And in this case, we’re trying to take the elements of the list. Python.h has everything we need for that.
After we get temp, we try to cast it to the long type. And only after that we can do something in C.

// Documentation
static char addList_docs[] = "add( ): add all elements of the list\n";

// Registering module functions
static PyMethodDef addList_funcs[] = {
    {"add", (PyCFunction)addList_add, METH_VARARGS, addList_docs},
    {NULL, NULL, 0, NULL}
};

After we have implemented the whole function, we need to write the documentation. Documentation is always a good thing, and this toolkit has everything to make it easy to maintain. Following the naming convention, we call the docstring array addList_docs and store the description there. Now we need to register the module’s functions; there is a special structure, PyMethodDef, for this. Describing its entries, we say that the function is exported to Python under the name "add" and that it is called as a PyCFunction. METH_VARARGS means that the function can potentially accept any number of arguments. We also added the sentinel entry with NULLs as a standard safeguard, so that nothing falls over if we just import the module and do not call any method.
After we have declared all this, we create the module itself. We define a moduledef structure and put everything we have already done into it.

static struct PyModuleDef moduledef = {
    PyModuleDef_HEAD_INIT,
    "addList",          /* module name */
    "example module",   /* module documentation */
    -1,
    addList_funcs,
    NULL, NULL, NULL, NULL
};

PyModuleDef_HEAD_INIT is a standard Python constant that should always be used here. The -1 means that no additional memory needs to be allocated for module state at import time.
Once we have created the module definition itself, we need to initialize it. Python always looks for a function named PyInit_<module>, so for addList we create PyInit_addList. In it we call PyModule_Create on the structure we assembled and finally create the module itself. Then we add the meta information and return the module.

PyMODINIT_FUNC PyInit_addList(void)
{
    PyObject* module = PyModule_Create(&moduledef);
    if (module == NULL)
        return NULL;

    PyModule_AddStringConstant(module, "__author__", "Bruse Lee <brus@kf.ch>");
    PyModule_AddStringConstant(module, "__version__", "1.0.0");
    return module;
}

As you have noticed, there is a lot of converting going on here. We always have to keep Python in mind while writing C/C++.
That is why, to make life easier for the average mortal programmer, SWIG technology appeared about 15 years ago.

SWIG

This tool lets you abstract away the Python bindings and write plain native C code. It has the same pros and cons as the native C/C++ extension, with a few exceptions.
Pros of SWIG:

  • Stable technology.
  • Lots of documentation.
  • Abstracts from Python binding.

SWIG Minuses:

  • Long setup.
  • Knowledge of C.
  • Segmentation Fault.
  • Difficulties in debugging.
  • Difficulty integrating into a project build.

The first disadvantage is that you will go crazy before you get it set up. When I set it up the first time, it took me a day and a half to get it running at all. Then, of course, it got easier, and in SWIG 3.x things are simpler.
To avoid going into more code, let’s look at the general scheme of how SWIG works.
example.c is a C module that knows nothing about Python at all. There is an interface file, example.i, written in the SWIG format. We then run the swig utility, which generates example_wrap.c from the interface file – the same kind of wrapper we used to write by hand. In other words, SWIG just creates the bridge file for us. After that we compile the two files with GCC, get two object files (example.o and example_wrap.o), and link them into our library. It is very simple and straightforward.
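As for build integration, setuptools can actually drive swig itself: any .i file listed in the sources of an Extension is run through swig during build_ext. A minimal sketch of a setup.py for the scheme above (the module and option names here are assumptions, not from the talk):

# setup.py – a sketch of letting setuptools run SWIG for us.
# The module/file names (example, example.c, example.i) are illustrative.
from setuptools import setup, Extension

example_ext = Extension(
    '_example',                           # SWIG convention: the C part is _<module>
    sources=['example.c', 'example.i'],   # build_ext runs swig on the .i file
    swig_opts=['-py3'],                   # ask swig for a Python 3 wrapper
)

setup(
    name='example',
    version='0.1.0',
    ext_modules=[example_ext],
    py_modules=['example'],               # the Python shim that swig generates
)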

Cython

Andrei Svetlov gave an excellent talk about Cython at MoscowPython Conf, so I will just say that it is a popular technology with good documentation.
Pros of Cython:

  • Popular Technology.
  • Pretty stable.
  • Easy to integrate into a project build.
  • Good documentation.

Cython Minuses:

  • Syntax.
  • Knowledge of C.
  • Segmentation Fault.
  • Difficulties in debugging.

There are, as always, disadvantages. The main one is the syntax, which looks like C/C++ and very much like Python at the same time.
But I want to point out that Cython can speed up Python code by compiling it down to native code.
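To give a rough idea of what that annotated code looks like, here is a sketch in Cython’s "pure Python" mode (the function and file names are illustrative, not the example from the talk’s slide):

# fast_sum.py – a sketch in Cython's "pure Python" mode; names are illustrative.
# Compiled with cythonize(), the typed loop runs at C speed; imported as plain
# Python, it still works, just without the speed-up.
import cython

@cython.cfunc                    # becomes a C-level function when compiled
@cython.returns(cython.long)
@cython.locals(n=cython.long, i=cython.long, total=cython.long)
def _sum_to(n):
    total = 0
    for i in range(n):
        total += i
    return total

def sum_to(n: int) -> int:
    """Python-visible wrapper around the C-level helper."""
    return _sum_to(n)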
As you can see, there are a lot of decorators, and that is not great. If you want to use Cython, check out Andrei Svetlov’s talk.

CTypes

CTypes is a standard Python library that works through a foreign function interface (FFI), a low-level library. It is a native technology, it is used in code very often, and it makes cross-platform support easy.
But FFI has a noticeable overhead, because all the bridges and handlers are created dynamically at runtime. That is, we load a dynamic library, and at that moment Python has no idea what kind of library it is. Only when the library is called are these bridges dynamically constructed in memory.
Pros of CTypes:

  • Native Technology.
  • Easy to use in code.
  • Easy to implement cross-platform.
  • You can use almost any language.

Minuses of CTypes:

  • Incurs overhead costs.
  • Difficulties in debugging.

from ctypes import *

# load the shared object file
adder = CDLL('./adder.so')

# Calculate factorial
res_int = adder.fact(4)
print("Fact of 4 = " + str(res_int))

We took adder.so and called it natively at runtime. We can even pass native Python types.
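One caveat worth remembering here (it ties back to golden rule 5): by default ctypes assumes every foreign function returns a C int, so for anything else you should declare the argument and return types explicitly. A hedged sketch, assuming the same adder.so also exported a hypothetical double add_floats(double, double):

from ctypes import CDLL, c_double

adder = CDLL('./adder.so')

# Hypothetical function: double add_floats(double, double).
# Without these declarations ctypes would mangle the floating-point result.
adder.add_floats.argtypes = [c_double, c_double]
adder.add_floats.restype = c_double

print(adder.add_floats(1.5, 2.25))   # 3.75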
After all this, the question arises: "Somehow everything is complicated, there is C everywhere – what should we do?"

Rust

At one time I didn’t give this language the attention it deserved, but now I’m practically switching to it.
Pros of Rust:

  • Safe Language.
  • Powerful static guarantees of correct behavior.
  • Easy to integrate into a project build ( PyO3 ).

Minuses of Rust:

  • High entry threshold.
  • Long setup.
  • Difficult to debug.
  • Documentation is scarce.
  • Overhead costs in some cases.

Rust is a safe language with strong compile-time guarantees of correct behavior. The syntax itself and the language’s compiler make it hard to make an obvious mistake. At the same time, it is designed around variants, i.e. you must handle every possible result of executing a branch of code.
Thanks to the PyO3 team, there are good Python bindings for Rust and a toolkit for integrating them into a project.
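As a rough illustration of that build integration, here is a sketch of a setup.py using the setuptools-rust package; the package and crate names are assumptions for illustration:

# setup.py – a sketch of wiring a PyO3 crate into a Python build
# with setuptools-rust. The package/crate names are illustrative.
from setuptools import setup
from setuptools_rust import Binding, RustExtension

setup(
    name='addlist',
    version='0.1.0',
    rust_extensions=[
        RustExtension('addlist._addList',   # importable as addlist._addList
                      path='Cargo.toml',    # the crate that uses PyO3
                      binding=Binding.PyO3),
    ],
    packages=['addlist'],
    zip_safe=False,                         # binary extensions cannot be zipped
)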
On the downside, it takes an untrained programmer a very long time to set it up, and there is not much documentation. But notice that segmentation faults are missing from the cons list: in Rust, in 99% of cases a programmer can only get a segmentation fault by explicitly calling unwrap and simply forgetting about it.
A small code sample, the same module we were looking at before.

#![feature(proc_macro)]
#[macro_use] extern crate pyo3;
use pyo3::prelude::*;

/// Module documentation string 1
#[py::modinit(_addList)]
fn init(py: Python, m: PyModule) -> PyResult<()> {
    py_exception!(_addList, EmptyListError);

    /// Function documentation string 1
    #[pyfn(m, "run", args = "*", kwargs = "**")]
    fn run_py(_py: Python, args: PyTuple, kwargs: Option<PyDict>) -> PyResult<()> {
        run(args, kwargs)
    }

    #[pyfn(m, "add")]
    fn add(_py: Python, py_list: PyList) -> PyResult<i32> {
        let mut sum: i32 = 0;
        match py_list.len() {
            // Some code
            _ => Ok(sum),
        }
    }

    Ok(())
}

The code has a specific syntax, but you get used to it very quickly. In fact, it is the same thing here. We use a macro to do modinit, which does all the extra work of generating all sorts of Python bindings for us. Remember I said you need to write a handler wrapper – it is the same here: run_py converts the types and then calls the native code.
As you can see, there is syntactic sugar for exporting a function. We just say we need an add function and do not describe any interfaces. We take a list, which is precisely a PyList, not a generic object, because Rust will set up the necessary bindings itself at compile time. If we pass the wrong data type, a TypeError will occur, just as in the C extensions. After we get the list, we start processing it.
Let’s take a closer look at what it actually does.

#[pyfn(m, "add", py_list = "*")]
fn add(_py: Python, py_list: PyList) -> PyResult<i32> {
    match py_list.len() {
        0 => Err(EmptyListError::new("List is empty")),
        _ => {
            let mut sum: i32 = 0;
            for item in py_list.iter() {
                let temp: i32 = match item.extract() {
                    Ok(v) => v,
                    Err(_) => {
                        let err_msg: String = format!("List item {} is not int", item);
                        return Err(ItemListError::new(err_msg));
                    }
                };
                sum += temp;
            }
            Ok(sum)
        }
    }
}

This is the same code as in the C/C++ and ctypes versions, but in Rust. There we were trying to convert a PyObject to a long. What would happen if the list contained a string as well as numbers? We would get a SystemError. Here we also try to take a value from the list and convert it to i32, and we cannot write this code without item.extract(), which performs the conversion to the right type. Because we wrote i32, in case of an error Rust says at compile time: "Handle the case when it is not an i32." So if we got an i32 we return the value; if it is an error, we raise an exception.
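For completeness, here is roughly what using such a module from Python could look like (a sketch; the _addList module name comes from the code above, and the error behavior is as just described):

# A sketch of calling the compiled extension from Python.
# _addList is the module name used in the Rust example above.
import _addList

print(_addList.add([1, 2, 3]))    # -> 6

try:
    _addList.add([1, "two", 3])   # a non-int item raises an error, as described
except Exception as exc:
    print(type(exc).__name__, exc)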

What to choose

After this little excursion, let’s think about what to choose in the end.
The honest answer is that it is a matter of taste.
I will not advocate any particular technology.
Just to summarize :

  • In the case of SWIG and C/C++, you need to know C/C++ very well and understand that developing such a module incurs some additional overhead. But it requires a minimum of tools, and we work with native Python technology that is supported by the core developers.
  • With Cython we get a low entry threshold and great development speed, and at its core it is simply a code generator.
  • With CTypes, I want to warn you about the high overhead. Dynamically loading libraries, when we do not know what kind of library it is, can cause a lot of trouble.
  • I would suggest Rust for someone who does not know C/C++ very well. Rust in production is really the least problematic.

Useful Links
https://github.com/zaabjuda/moscowpythonconf2017
https://docs.python.org/3/extending/building.html
https://cython.org
https://docs.python.org/3/library/ctypes.html
https://www.swig.org
https://www.rust-lang.org/en-US/
https://github.com/PyO3
https://www.youtube.com/watch?v=5-WoT4X17sk
https://packaging.python.org/tutorials/distributing-packages/#platformwheels
https://github.com/PushAMP/pamagent (a real-world example of use)

Call for Papers
We are accepting applications for Moscow Python Conf++ until September 7th – write in this simple form what you know about Python that you really need to share with the community.
For those who are more interested in listening, here are some of the cool talks.

  • Donald Whyte likes to talk about speeding up math in Python and is preparing a new story for us: how to make math 10x faster and the code understandable and maintainable, using popular libraries, tricks, and cunning.
  • Artyom Malyshev has collected all his years of Django development experience and is presenting a guide talk on the framework! Everything that happens between receiving an HTTP request and sending the finished web page: exposing the magic, a map of the framework’s inner workings, and lots of useful tips for your projects.
