A Quick Dive into FFI in Python
FFI (Foreign Function Interface) is a classic abstraction that allows you to invoke functions written in other programming languages from the code in your target language. In this article, we’ll explore what FFI is, how it works, and how to use it in Python.
This article is by no means a comprehensive guide and may contain inaccuracies. My aim is simply to cover the essentials and give you a nudge in the right direction for further study. I’ve also intentionally simplified some examples for clarity. If you spot an error, feel free to let me know.
The term foreign refers to functions that originate in a different language or runtime. For instance, with FFI you can invoke a function written in C from Python; in this scenario, the C functions are the “foreign functions.” This proves especially useful when you want to reuse a library or component written in another language in your current project. FFI is also common in languages that have no direct access to the operating system, like JavaScript, which uses it to connect with C++ libraries.
FFI can either be built into the language or come as a separate library. Typically, it includes a set of functions that enable developers to define how to call functions from other languages and convert data between different types.
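To make the two halves of that description concrete (describing a foreign function, then converting data on the way in and out), here is a minimal sketch using Python's standard ctypes module to call strlen from the C library. It assumes a Unix-like system where the C library can be found or is already linked into the Python process; the variable names are just for illustration.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. On Unix-like systems, if
# find_library returns None, CDLL(None) exposes the symbols already
# linked into the Python process, which include the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Describe the foreign function: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# ctypes converts the Python bytes object to a C string for the call
# and converts the returned size_t back to a Python int.
length = libc.strlen(b"foreign")
print(length)  # 7
```

The declaration step is the "how to call it" part, and the automatic bytes-to-pointer and size_t-to-int handling is the "convert data" part.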
FFI is widely used across various programming domains, including systems programming, machine learning, and scientific computing, where integration with libraries written in other languages is required.
A Brief History of FFI
The term Foreign Function Interface is usually traced to the Lisp community, and to Common Lisp in particular, which took shape in the 1980s. The idea gained broader traction as C became the common language of operating system interfaces, dynamic libraries, and shared object files, which made C the usual target of an FFI. The term FFI also appears in the documentation of languages like Haskell, Python, and Perl, though other languages adopt different terminology, such as Java’s “JNI” (Java Native Interface). In some languages, it’s simply known as “language bindings.”
Initially, FFI was mainly associated with system programming and device driver development. Over time, it expanded into other fields, including application and game development, where invoking functions written in other languages became essential.
FFI Support Across Programming Languages
Many programming languages have adopted FFI as a common abstraction. Some languages have built-in support, while others rely on third-party libraries to handle FFI.
Languages with native FFI support include:
- C and C++: Both have direct operating system access and can invoke functions from dynamic libraries and shared object files.
- Rust: Provides FFI support through extern functions that can call C functions.
- Swift: Enables calling C functions from dynamic libraries and supports Objective-C as an alternative FFI mechanism.
Other languages rely on add-on modules or libraries for FFI, such as:
- Python: The standard-library ctypes module and the third-party cffi package allow calling functions from dynamic libraries that expose a C interface.
- Java: Besides the built-in JNI, the Java Native Access (JNA) library provides the ability to call functions from C libraries without writing JNI glue code.
- JavaScript: In Node.js, libraries like ffi and ref offer the means to invoke functions written in C and C++.
FFI implementations generally include mechanisms for working with pointers, memory alignment, data type conversions, and other low-level tasks. The exact support for FFI varies depending on the programming language and FFI library used.
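As a small sketch of what those low-level tasks look like in practice, the following uses Python's ctypes module to mirror a C struct. The Record type and its fields are made up for this example; the point is only to show alignment padding, value conversion, and pointer access.

```python
import ctypes


class Record(ctypes.Structure):
    # Mirrors a hypothetical C declaration:
    #   struct record { char tag; int32_t value; };
    _fields_ = [
        ("tag", ctypes.c_char),
        ("value", ctypes.c_int32),
    ]


# Alignment: padding is inserted after `tag` so that `value`
# starts on an int32 boundary, just as a C compiler would lay it out.
value_offset = Record.value.offset
record_size = ctypes.sizeof(Record)

# Type conversion: Python values are converted to C values on assignment.
rec = Record(b"A", 41)

# Pointers: the pointer refers to the same memory as `rec`,
# so a write through the pointer is visible on the original object.
ptr = ctypes.pointer(rec)
ptr.contents.value += 1

print(value_offset, record_size, rec.value)  # commonly prints: 4 8 42
```

Getting such a layout wrong is one of the classic FFI bugs, which is why most FFI libraries compute offsets and sizes for you.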
Popular Use Cases
FFI is widely used in scenarios that require interaction between programming languages or the shared use of code across different languages. Some of the most common use cases include:
- Embedding C or C++ code in other languages, like Python, Ruby, or Java, to boost performance or access specialized libraries.
- Creating bindings for existing C or C++ libraries to make them usable in other programming languages.
- Developing extensions or plugins for applications written in different languages, expanding their functionality.
- Creating libraries in C or C++ that can be used across multiple programming languages to enhance portability and cross-platform compatibility.
- Game and multimedia development, where FFI enables the use of powerful C or C++ libraries for graphics and sound.
When Building an Extensible Library
When creating a library intended for use through FFI, you should keep a few key points in mind:
- FFI Compatibility: Ensure the library exposes a C-compatible interface. C does this naturally; C++ functions should be declared extern "C" so that name mangling doesn’t hide them from FFI callers.
- Calling Conventions: The library should adhere to the calling conventions of its target architecture to allow FFI to invoke its functions correctly.
- Function Exporting: Export the functions you want to expose through FFI by using the appropriate directives when compiling the library.
- Function Parameters: Ensure the functions use data types supported by FFI.
- Language Features: Limit language features to those compatible with the target language. For example, C++ functions callable from C should not throw exceptions or use reference parameters.
- Security: Pay attention to security concerns to avoid vulnerabilities when using FFI.
- Documentation: Provide thorough documentation for the functions, their parameters, and return values to make the library easy to use through FFI.
FFI in Python
Python gained FFI support in its standard library with version 2.5, released in 2006, via the ctypes module. The module was written by Thomas Heller and had previously been distributed as a separate package. Built on top of the libffi library, ctypes allows you to call functions exported from dynamic libraries that expose a C interface, which includes C++ code declared extern "C". It provides a way to load dynamic libraries and invoke functions from them in pure Python, making FFI accessible to Python developers with minimal setup.
Over time, more FFI libraries have emerged for Python, most notably cffi, a third-party package first released in 2012 and installed from PyPI rather than shipped with Python. cffi offers a higher-level interface: you describe functions and data types with ordinary C declarations, typically copied from the library’s header files, instead of rebuilding each signature by hand from Python type objects.
One of the key differences between cffi and ctypes lies in how they interact with C libraries. ctypes works only at the binary (ABI) level: it interfaces with dynamic libraries that are already compiled and ready for use. cffi can do the same in its ABI mode, but it also has an API mode in which it generates a small C extension module and compiles it with a real C compiler. This usually happens as a build step, although it can also be triggered at runtime, and it lets the compiler check your declarations against the real headers.
Another important distinction is that cffi provides a higher-level interface for interacting with C libraries. Specifically, cffi can automatically handle C data types and allows you to define and use structures, unions, and arrays with ease. This makes working with C libraries simpler and less prone to error.
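Here is a minimal sketch of that struct and array handling. It assumes the cffi package is installed (for example via pip), and point_t is a made-up type for this example, not part of any real library.

```python
from cffi import FFI

ffi = FFI()

# Declarations use ordinary C syntax; point_t is hypothetical.
ffi.cdef("""
    typedef struct {
        int x;
        int y;
    } point_t;
""")

# Allocate and initialise a struct. cffi owns the memory and
# releases it when `point` is garbage-collected.
point = ffi.new("point_t *", {"x": 3, "y": 4})
point.x += 10

# Allocate a C array of ints from a Python list.
numbers = ffi.new("int[]", [1, 2, 3])
total = sum(numbers[i] for i in range(len(numbers)))

print(point.x, point.y, total)  # 13 4 6
```

No library is loaded here at all: cffi only needs the declarations to allocate and manipulate C data from Python.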
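Note that cffi, like ctypes, targets C rather than C++: C++ code has to be exposed through an extern "C" interface before either library can call it. cffi also does no JIT (Just-In-Time) compilation of its own, but it is the recommended way to call C code from PyPy, whose JIT compiler can optimize cffi calls well.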
Use Cases
Today, FFI plays a vital role in Python applications across a wide range of domains, from scientific computing to game development and web applications. Some of the most common uses of FFI in Python include:
- Scientific Computing: Libraries like NumPy, SciPy, and TensorFlow implement their computation-heavy cores in C/C++ and expose them to Python, mostly through compiled extension modules rather than ctypes or cffi.
- Game Development: Many game engines are written in C or C++, and FFI allows Python scripts to interface with these engines.
- Database Interfacing: Toolkits like SQLAlchemy, and the database drivers beneath them, can use compiled C code for faster database interactions.
- Networking: Networking libraries like Twisted may use FFI to optimize network operations by calling C/C++ functions.
- Graphical User Interfaces (GUIs): Libraries such as PyQt and PyGTK are bindings to the Qt (C++) and GTK (C) toolkits, which do the actual rendering of graphical elements.
- Cryptography: Cryptography libraries, like PyCrypto and M2Crypto, can use FFI to accelerate encryption and signature operations by invoking C/C++ functions.
Overall, FFI’s presence in Python is driven by the need for speed in areas where Python alone may not perform optimally, providing access to the power of C and C++.
Examples
Simple examples of using FFI in Python often involve calling functions from dynamic libraries written in C or C++. For example, suppose you have a dynamic library mylib.so that contains a function add which adds two numbers and returns the result.
Example with ctypes:
Here’s a basic example of how you might invoke this function using Python’s ctypes module:
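/* Declaration, as it would appear in a header file */
int add(int a, int b);

/* Definition; build with e.g.: gcc -shared -fPIC -o mylib.so mylib.c */
int add(int a, int b) {
    return a + b;
}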
import ctypes
# Load the library
mylib = ctypes.cdll.LoadLibrary('./mylib.so')
# Define argument and return types
mylib.add.argtypes = (ctypes.c_int, ctypes.c_int)
mylib.add.restype = ctypes.c_int
# Call the function
result = mylib.add(2, 3)
print(result) # Output: 5
In this example, we load the mylib.so library, define the argument and return types for the add function, and call it with the values 2 and 3, resulting in an output of 5.
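Why bother declaring argument and return types? The sketch below tries to show what the declarations buy you, using two functions from the C standard library instead of mylib.so so that it runs without a custom build. It assumes a Unix-like system where the C library is available to the Python process.

```python
import ctypes
import ctypes.util

# Falls back to the symbols of the running process if no path is found.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# double atof(const char *nptr);
# Without restype, ctypes would assume the function returns a C int,
# so the double result would be misread.
libc.atof.argtypes = [ctypes.c_char_p]
libc.atof.restype = ctypes.c_double
parsed = libc.atof(b"2.5")

# int abs(int j);
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

# With argtypes declared, a wrong argument type is rejected by ctypes
# before the C function is ever called.
try:
    libc.abs("not a number")
    rejected = False
except ctypes.ArgumentError:
    rejected = True

print(parsed, libc.abs(-5), rejected)  # 2.5 5 True
```

Without the declarations, ctypes guesses from the Python values you pass, and a wrong guess typically shows up as a garbage result or a crash rather than a Python exception.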
Example with cffi:
Now let’s see how to call a C function using the cffi library. This example calls printf from the C standard library instead of add, so it needs no custom library:
import cffi
# Define the C library interface
ffi = cffi.FFI()
ffi.cdef("""
int printf(const char *format, ...);
""")
# Load the standard C namespace (works on Unix-like systems, not on Windows)
lib = ffi.dlopen(None)
# Equivalent C code: char arg[] = "world";
arg = ffi.new("char[]", b"world")
# Call the printf function from the C library
lib.printf(b"Hello, %s!\n", arg)
In this example, the printf function from the C library is invoked to print a formatted string. cffi handles the translation between Python and C types and manages the data being passed to the C function.
Performance
To demonstrate the performance difference between native Python code and code optimized via FFI, let’s consider the classic example of calculating the Fibonacci sequence.
Pure Python Example:
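def fibonacci(n: int) -> int:
    # This variant starts the sequence at 1, 1 (fibonacci(0) == 1)
    if n < 2:
        return 1
    return fibonacci(n - 2) + fibonacci(n - 1)

# Call the function in a loop
for _ in range(1000000):
    fibonacci(12)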
Running this code yields the following performance:
$ /usr/bin/time nice python fibonacci.py
29.66 real 29.52 user 0.06 sys
In this case, the pure Python implementation took around 29.66 seconds to compute the result. Now, let’s implement the same Fibonacci function in C and call it from Python using FFI.
C Implementation:
int fibonacci(int n);

int fibonacci(int n) {
    if (n < 2) {
        return 1;
    }
    return fibonacci(n - 2) + fibonacci(n - 1);
}
Python Call to C Function with ctypes:
import ctypes
# Load the C library
C = ctypes.cdll.LoadLibrary('./fibonacci.so')
# Define argument and return types
C.fibonacci.argtypes = (ctypes.c_int,)
C.fibonacci.restype = ctypes.c_int
# Call the function in a loop
for _ in range(1000000):
    C.fibonacci(12)
Performance results after calling the C version:
$ /usr/bin/time nice python fibonacci-ffi.py
1.09 real 1.01 user 0.01 sys
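As you can see, the C version via FFI took only 1.09 seconds, which is roughly 27 times faster than the Python-only version. The speedup comes from the fact that C is compiled to machine code, so the recursive calls run without the overhead of Python’s interpreter, such as dynamic dispatch and object creation for every intermediate integer. The Global Interpreter Lock (GIL) plays no real role here, since the benchmark is single-threaded. Only the one million calls across the ctypes boundary still pay Python-level overhead. While Python has its strengths, for tasks like this, C will almost always outperform it.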
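Two details are worth checking for yourself: the speedup implied by the timings above, and the fixed cost of every call that crosses the FFI boundary. The sketch below computes the first and measures the second by timing the C library's abs through ctypes against Python's built-in abs. It assumes a Unix-like system; absolute timings vary by machine, so none are shown.

```python
import ctypes
import ctypes.util
import timeit

# Speedup implied by the wall-clock ("real") times reported above.
python_seconds = 29.66
ffi_seconds = 1.09
speedup = python_seconds / ffi_seconds
print(f"speedup: {speedup:.1f}x")  # speedup: 27.2x

# Per-call overhead: every call through ctypes converts arguments and
# results, so for a trivial function the boundary crossing dominates.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

calls = 100_000
builtin_time = timeit.timeit(lambda: abs(-5), number=calls)
ctypes_time = timeit.timeit(lambda: libc.abs(-5), number=calls)
print(f"built-in abs: {builtin_time:.3f}s, libc abs via ctypes: {ctypes_time:.3f}s")
```

If the foreign function does very little work per call, this overhead can eat the gain, which is why FFI pays off best when each call does a substantial amount of work, as the recursive fibonacci(12) does.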
When Not to Use FFI
Despite its obvious advantages, FFI isn’t always the right tool. Here are some scenarios where using FFI might not be appropriate:
- Performance isn’t critical: If you’re not dealing with large amounts of data or complex computations, FFI might be overkill.
- Code complexity: If the foreign code is too complex, using FFI could make your project harder to maintain.
- Development hurdles: Integrating FFI can introduce additional challenges, such as managing data between languages or dealing with differing calling conventions.
- Limited portability: Foreign code may not be portable across platforms or operating systems, which could hinder your project’s portability.
- Security concerns: Code invoked via FFI could introduce vulnerabilities, especially if it’s from untrusted sources or poorly written.
In Python specifically, employing FFI can do more harm than good in the following cases:
- If a native Python library is available: If there’s already a Python library that solves your problem, using FFI might be overkill. In such cases, it’s usually better to rely on the native Python solution, which is likely more straightforward and tailored to Python’s ecosystem.
- If performance isn’t a concern: When you’re not dealing with large datasets or computationally expensive tasks, FFI can introduce unnecessary complexity. Native Python code can be simpler, easier to debug, and sufficient for many use cases.
- If you lack FFI experience: FFI demands knowledge of low-level languages like C or C++. Without experience in these areas, working with FFI can become a time-consuming and error-prone process.
- If your environment doesn’t support FFI: Some environments, such as browsers or certain mobile platforms, may not support FFI. In these cases, it’s best to rely on the native capabilities of the environment rather than attempting FFI, which may not work or could require complicated workarounds.
- If security is a concern: Using FFI could introduce security risks, particularly if you’re calling unknown libraries or external code. In such situations, it’s crucial to use trusted and well-vetted libraries, and to carefully review any external code for potential vulnerabilities.
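To illustrate the point about native Python solutions, here is a sketch that speeds up the Fibonacci benchmark from the Performance section without leaving Python, by caching results with functools.lru_cache from the standard library. It is not an equivalent benchmark, since the cache changes the algorithm rather than the language, but it shows that an algorithmic fix is sometimes all you need.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fibonacci(n: int) -> int:
    # Same convention as the benchmark: fibonacci(0) == fibonacci(1) == 1.
    if n < 2:
        return 1
    return fibonacci(n - 2) + fibonacci(n - 1)


# After the first call, repeated calls are answered from the cache.
for _ in range(1000000):
    result = fibonacci(12)

print(result)  # 233
```

With the cache in place, each distinct argument is computed once, so the million-iteration loop mostly measures cache lookups.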
In general, the decision to use FFI in Python should be a careful, considered one based on the needs of your project and whether Python-native libraries can solve the problem just as effectively.