Speed up Python (using Cython)

Introduction

CPython is the most common and widely used Python implementation. Generally, when developers refer to Python, they are talking about CPython, the C implementation of Python. There are other implementations of Python too; here’s a list straight from Python.org.

1. IronPython (Python running on .NET)
2. Jython (Python running on the Java Virtual Machine)
3. PyPy (A fast Python implementation with a JIT compiler)
4. Stackless Python (Branch of CPython supporting microthreads)
5. MicroPython (Python running on microcontrollers)

There are multiple ways of speeding up CPython, from JIT-based compilation to writing C extensions for Python. We are going to explore the latter.

What is Cython?

Cython is a programming language that aims to be a superset of Python. With Cython, it is possible to write C extensions for Python while staying as readable and approachable as Python itself. Cython is fast, yet it keeps the flexibility of an object-oriented, functional, and dynamic programming language. One of its key features is optional static type declarations, which come out of the box. The source code gets translated into optimized C/C++ code and compiled as Python extension modules.

This allows for both very fast program execution and tight integration with external C libraries, while keeping up the high programmer productivity for which the Python language is well known.

Let’s test Cython speedups with a good old CS textbook program: generating the Fibonacci sequence up to an integer value given by the user.

Code –

Python version :

Cython-ized version :

In the above example, the only syntactic difference is that Cython uses static typing (just as you would in C/C++/Go or another statically typed language). Static typing is the key to the speedup. Cython is quite forgiving about its declarations compared to C/C++, so one could get away with writing pure Python syntax in Cython (but you wouldn’t see much of a speedup…duh!).
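To make the static typing point concrete, here is a minimal sketch of what a pure-Python Fibonacci function might look like. The name `fib_upto` is hypothetical and not taken from the article's repo; the comments only indicate where a Cython version would typically add C type declarations.

```python
def fib_upto(limit):
    """Return the Fibonacci numbers that are <= limit (hypothetical helper)."""
    # In a .pyx file the argument could be typed, e.g. `def fib_upto(long limit):`,
    # and the loop variables declared with `cdef long a = 0, b = 1`.
    a, b = 0, 1
    result = []
    while a <= limit:
        result.append(a)
        a, b = b, a + b
    return result


print(fib_upto(50))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

One thing to keep in mind with the typed variant: C integer types have a fixed width, so very large Fibonacci numbers can overflow, whereas Python integers grow as needed.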

You said compiled?

Yes, you do have to compile Cython code.

  • Step 1: Write your Cython code in a *.pyx file
  • Step 2: In your main.py or app.py (whichever file you use as an entry point), add the following before importing your *.pyx module:
import pyximport
pyximport.install(build_dir="./build")

This creates the build files in the ./build directory, including the transpiled code in a file with a *.c extension. This is actual C code (roughly 2,750 lines in our case)! Don’t worry, it is generated by the Cython transpiler and optimized for the code you have written (pretty cool, isn’t it!?).

  • Step 3: Run your program as you would run any normal Python code
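Putting the three steps together, an entry point could look roughly like the sketch below. The module name `fib_cy` (a `fib_cy.pyx` file next to the script) and the function `fib_upto` are assumptions for illustration. The pure-Python fallback is also my addition, so the script still runs when Cython is not installed.

```python
import importlib


def fib_upto_py(limit):
    """Pure-Python fallback (hypothetical, for illustration only)."""
    a, b, out = 0, 1, []
    while a <= limit:
        out.append(a)
        a, b = b, a + b
    return out


def load_fib(module_name="fib_cy"):
    """Try to import a compiled .pyx module; fall back to pure Python.

    `fib_cy` is an assumed module name, not one from the article's repo.
    """
    try:
        import pyximport  # ships with Cython
        pyximport.install(build_dir="./build")  # compile .pyx files on import
        module = importlib.import_module(module_name)
        return module.fib_upto
    except ImportError:
        # Cython is not installed, or the module could not be found or built.
        return fib_upto_py


if __name__ == "__main__":
    fib = load_fib()
    print(fib(100))
```

The important detail is the ordering: `pyximport.install()` has to run before the first import of the `.pyx` module, otherwise Python's normal import machinery will not find it.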

Benchmark results –

Cython without static typing

python run_test.py 10000
Python test took 4.7995266050000005 secs
Cython test took 3.575472554000001 secs
Cython speed up over Python : 1.3423474890418636 times

Cython with static typing

python run_test.py 10000
Python test took 4.829826579 secs
Cython test took 0.28026456299999936 secs
Cython speed up over Python : 17.233097639247422 times

Disclaimer: Please note that the tests exercise logic written entirely with Python primitives; the code doesn’t rely on any third-party libraries. Results may differ based on the underlying library or framework and the structure of the code.
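For anyone who wants to reproduce this kind of comparison, here is a minimal timing sketch. It is not the actual run_test.py; the helper names `time_call` and `speedup` are hypothetical. The last two lines simply recompute the ratios from the timings reported above.

```python
import time


def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_secs, optimized_secs):
    """How many times faster the optimized run is than the baseline."""
    return baseline_secs / optimized_secs


# Ratios recomputed from the timings reported above.
print(round(speedup(4.7995266050000005, 3.575472554000001), 2))  # 1.34
print(round(speedup(4.829826579, 0.28026456299999936), 2))  # 17.23
```

Taking the best of a few repeats is a design choice on my part; it reduces noise from whatever else the machine is doing during the run.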

When would I use Cython?

Many Python libraries that demand high performance are written in Cython or C/C++.

Some libraries that use Cython (and there are many more…)

– NumPy (Numerical computing)
– SciPy (Scientific computing)
– scikit-learn (Machine learning library)
– pandas (Data analysis)
– spaCy (Production-grade Natural Language Processing (NLP) toolkit)

These are some advanced use cases for Cython. I typically use Cython for speeding up loops and conditionals (for, while, if/else, etc.). Yes, that is the low-hanging fruit.

Possible Cython alternatives

While Cython has good compatibility with the CPython ecosystem, you need to rewrite parts of your Python implementation, because Cython introduces new syntax and a different approach to programming in Python. The developer may feel overwhelmed at times (at least I was while reading some advanced Cython code).

1. Mypy & Mypyc

Mypy is an optional static type checker for Python (it leverages type annotations in Python 3.6 and above). Mypyc, on the other hand, aims to compile mypy type-checked code into C extensions.
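The appeal here is that the input is ordinary annotated Python. As a rough sketch, a function like the one below runs unchanged under CPython, can be checked with mypy, and is the kind of code mypyc is meant to compile (for example with `mypyc fib_typed.py`, where the file name is hypothetical).

```python
from typing import List


def fib_upto_typed(limit: int) -> List[int]:
    """Return the Fibonacci numbers that are <= limit, with type annotations."""
    a: int = 0
    b: int = 1
    result: List[int] = []
    while a <= limit:
        result.append(a)
        a, b = b, a + b
    return result


print(fib_upto_typed(20))  # [0, 1, 1, 2, 3, 5, 8, 13]
```

Unlike the Cython route, no new syntax is involved; the annotations are the only extra information the compiler gets.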

All this sounds great, but the project is still in its infancy and can’t be used in production yet. Follow the project closely if you are interested; I know I will 🙂

2. Numba

Numba takes a different approach altogether: it uses JIT compilation techniques and the LLVM infrastructure to speed things up. Numba is better suited to NumPy-based code, since it is optimized and built specifically for scientific computing.
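To give a feel for the JIT approach, here is a minimal sketch using Numba's `njit` decorator on a plain numeric loop. The function `sum_of_squares` is a made-up example, and the no-op fallback decorator is only there so the snippet still runs where Numba is not installed.

```python
try:
    from numba import njit  # JIT-compiles the function on its first call
except ImportError:
    def njit(func):
        """Fallback no-op decorator so the sketch runs without Numba."""
        return func


@njit
def sum_of_squares(n):
    """Sum i*i for i in range(n) using a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))  # 285
```

Note that the function body is unchanged Python; the decorator is the only addition, and compilation happens at run time rather than in a separate build step.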

We would have to go in depth to understand these approaches, which is out of scope for this article (maybe another time).

Where do I find the demo source code?

Here is the link to the GitHub repo for this article.

What next?

I have covered maybe less than a percent of what you could do with Cython, and I may have missed some aspects, such as running Cython in a Jupyter notebook. I highly recommend that you read the official documentation.

I have listed a few links in the resources section for you to explore. Feel free to share your thoughts.

Resources –

  1. Python Alternatives
  2. Cython Documentation
  3. mypyc or mypy compiler
  4. mypy GitHub
  5. mypy docs
  6. Pyjion: JIT compiler based on CoreCLR
  7. Writing C in Cython (Article by creators of spaCy)