Some experiments with asynchronous programming in Python, largely inspired by the following great resources:
- A book chapter by Jesse Jiryu Davis and Guido van Rossum introducing the ideas underlying the asyncio library
- David Beazley’s amazing tutorials
The general idea is to take some very simple network application scenarios, implement synchronous versions, and gradually turn them asynchronous, exploring various techniques and tradeoffs along the way (threads, callback style, coroutines…), before providing corresponding variants built on asyncio's event loop and APIs.
The fibservers directory contains incremental versions of a simple TCP server computing Fibonacci numbers, gradually going from synchronous to asynchronous. Each version of the server comes with an equivalent implementation using asyncio's event loop and APIs.
- fibservsync.py
Synchronous version. Handles one connection at a time (a minimal sketch follows this list).
- fibservthrd.py
Introduces concurrency by handling each incoming connection in a separate thread.
- fibservcb.py
First single-threaded asynchronous version, using callbacks and I/O polling with the standard library's selectors module (sketched after the list).
- fibservcoro.py
Uses coroutines to implement tasks, driven by a simple event loop (sketched after the list).
- fibservcorothrd.py
Asynchronous I/O handled by an event loop in the main thread, with the computation of Fibonacci numbers delegated to a ThreadPoolExecutor.
- fibservaiocb.py
Modifies the callback-style server to use asyncio's API for watching file descriptors.
- fibservaiocorothrd.py
Modifies the coroutine-based server to use asyncio's event loop and low-level socket operations (sketched after the list).
- fibservaiostreams.py
Final and shortest version, using `asyncio`'s TCP server and streams (the last sketch after the list combines this with the thread-pool idea).
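To give an idea of the starting point, here is a minimal sketch of the synchronous flavor. It is an illustration rather than the exact file contents: the port (25000) and the wire protocol (newline-terminated ASCII integers) are assumptions made for the example.

```python
import socket

def fib(n):
    # Deliberately naive, CPU-bound recursive implementation.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def handle(conn):
    # Serve one client until it disconnects: read a number, reply with fib(n).
    while True:
        data = conn.recv(100)
        if not data:
            break
        conn.sendall(str(fib(int(data))).encode() + b'\n')
    conn.close()

def serve(address=('', 25000)):       # port chosen for the example
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(address)
    sock.listen(5)
    while True:
        conn, _ = sock.accept()       # blocks: only one client is served at a time
        handle(conn)

if __name__ == '__main__':
    serve()
```

The threaded variant keeps `handle` unchanged and simply wraps the call in `Thread(target=handle, args=(conn,)).start()`, so a slow client no longer blocks new connections.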
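The callback style replaces the blocking calls with a polling loop built on the selectors module, dispatching to a per-socket callback when I/O is possible. A minimal sketch, under the same protocol assumptions and with error handling omitted:

```python
import socket
import selectors

sel = selectors.DefaultSelector()

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def accept(sock):
    conn, _ = sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, read)   # data slot holds the callback

def read(conn):
    data = conn.recv(100)
    if not data:
        sel.unregister(conn)
        conn.close()
        return
    # The CPU-bound fib call still runs inline and stalls the whole loop.
    conn.send(str(fib(int(data))).encode() + b'\n')

def serve(address=('', 25000)):
    sock = socket.socket()
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(address)
    sock.listen(5)
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ, accept)
    while True:
        for key, _ in sel.select():
            callback = key.data
            callback(key.fileobj)

serve()
```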
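The coroutine version expresses each connection as a generator that yields whenever it has to wait, and a small scheduler resumes it once the selector reports the socket ready. A condensed sketch of the mechanism (the `('recv', sock)` convention is an illustrative choice in the spirit of David Beazley's tutorials):

```python
import socket
from collections import deque
from selectors import DefaultSelector, EVENT_READ, EVENT_WRITE

tasks = deque()               # coroutines ready to run
selector = DefaultSelector()  # sockets being waited on, mapped to their task

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def server(address):
    sock = socket.socket()
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(address)
    sock.listen(5)
    while True:
        yield ('recv', sock)          # wait for an incoming connection
        conn, _ = sock.accept()
        tasks.append(handler(conn))   # spawn a task for the new client

def handler(conn):
    while True:
        yield ('recv', conn)          # wait until data is available
        data = conn.recv(100)
        if not data:
            break
        result = fib(int(data))       # CPU-bound: blocks every other task
        yield ('send', conn)          # wait until the socket is writable
        conn.send(str(result).encode() + b'\n')
    conn.close()

def run():
    while tasks or selector.get_map():
        while not tasks:
            # Nothing is ready: block until a watched socket fires.
            for key, _ in selector.select():
                selector.unregister(key.fileobj)
                tasks.append(key.data)
        task = tasks.popleft()
        try:
            why, sock = next(task)    # run the task up to its next yield
        except StopIteration:
            continue                  # the task finished
        selector.register(sock, EVENT_READ if why == 'recv' else EVENT_WRITE, task)

tasks.append(server(('', 25000)))
run()
```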
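With asyncio, the hand-rolled scheduler goes away: the loop's low-level socket coroutines take over the role of the yield points, and the callback variant similarly trades the selectors calls for `loop.add_reader`/`loop.remove_reader`. A sketch using the modern `asyncio.run` API:

```python
import asyncio
import socket

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

async def handler(loop, conn):
    # Each await hands control back to the event loop while waiting on I/O.
    while True:
        data = await loop.sock_recv(conn, 100)
        if not data:
            break
        await loop.sock_sendall(conn, str(fib(int(data))).encode() + b'\n')
    conn.close()

async def server(address=('', 25000)):
    loop = asyncio.get_running_loop()
    sock = socket.socket()
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(address)
    sock.listen(5)
    sock.setblocking(False)           # required by the loop's socket methods
    while True:
        conn, _ = await loop.sock_accept(sock)
        asyncio.create_task(handler(loop, conn))   # one task per client

asyncio.run(server())
```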
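Finally, `asyncio`'s streams hide the sockets entirely, and `loop.run_in_executor` provides the same thread-pool escape hatch for the CPU-bound computation as the coroutines-and-threads variants. A sketch combining both ideas (the pool size is arbitrary):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

pool = ThreadPoolExecutor(4)          # worker threads for the CPU-bound part

async def handle(reader, writer):
    loop = asyncio.get_running_loop()
    while True:
        data = await reader.readline()
        if not data:
            break
        # Offload fib to the pool so the event loop stays responsive.
        result = await loop.run_in_executor(pool, fib, int(data))
        writer.write(str(result).encode() + b'\n')
        await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, '', 25000)
    async with server:
        await server.serve_forever()

asyncio.run(main())
```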
In addition to these, I adapted David Beazley’s client implementations to play around and test performance:
- fibclientfast.py
Repeatedly sends cheap requests and monitors the rate in requests per second (sketched after the list).
- fibclientslow.py
Sends comparatively CPU-demanding requests (computing fib(33)) and monitors the time per request.
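The fast client boils down to a tight request loop plus a monitoring thread; a sketch of the idea (the address and the `b'1\n'` payload are assumptions matching the server sketches above):

```python
import socket
import time
from threading import Thread

def monitor(counter):
    # Once per second, print and reset the shared request count.
    while True:
        time.sleep(1)
        print(counter[0], 'reqs/sec')
        counter[0] = 0

def run(address=('localhost', 25000)):
    sock = socket.create_connection(address)
    counter = [0]                     # one-element list as a mutable counter
    Thread(target=monitor, args=(counter,), daemon=True).start()
    while True:
        sock.sendall(b'1\n')          # cheap request: fib(1)
        sock.recv(100)
        counter[0] += 1

run()
```

The slow client follows the same pattern, sending fib(33) requests and timing each round trip instead of counting them.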
Running various combinations of these servers and clients gives an idea of some of the tradeoffs involved in asynchronous programming in Python.
Here are the results of the tests performed on my machine:
| Server version | 1 fast client (req/s) | 2 fast clients (req/s) | 1 slow client (s/req) | 2 slow clients (s/req) | fast + slow: fast (req/s) | fast + slow: slow (s/req) |
|---|---|---|---|---|---|---|
| Synchronous | 38000 | X | 1 | X | X | X |
| Threads | 34000 | 17000 | 1 | 2.1 | 100 | 1 |
| Callbacks | 33000 | 20000 | 1.1 | 2.1 | 1 | 1.1 |
| Coroutines | 32000 | 16000 | 1.1 | 2.2 | 1 | 1.1 |
| Coroutines and threads | 8000 | 4000 | 1.1 | 2.2 | 30 | 1.1 |
| **Asyncio:** | | | | | | |
| Callbacks | 30000 | 18000 | 1 | 2 | 1 | 1 |
| Coroutines | 20000 | 9000 | 1 | 2 | 1 | 1 |
| Coroutines and threads | 5000 | 3000 | 1 | 2 | 50 | 1 |
| Streams | 4500 | 2600 | 1 | 2 | 40 | 1 |

(An X means the scenario is not supported: the synchronous server handles one connection at a time, so a second client is never served.)
Notes (in French) from a short presentation I gave on the topic, together with the corresponding reveal.js export (generated by org-reveal), are in the presentation directory, along with some drawings.
While reading the asyncio code, I set out to implement a simplified version of the event loop, tasks, and futures, mimicking the original API. It tries to capture the essential logic while overlooking error handling and many other subtleties. This work in progress is in the eventloop directory.
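To give the flavor of it, here is a condensed sketch of the future/task mechanism at the heart of that reimplementation, with error handling and the loop itself left out (the real asyncio classes do considerably more):

```python
class Future:
    # Bare-bones stand-in for asyncio.Future: a result slot plus callbacks.
    def __init__(self):
        self._result = None
        self._done = False
        self._callbacks = []

    def add_done_callback(self, fn):
        if self._done:
            fn(self)
        else:
            self._callbacks.append(fn)

    def set_result(self, result):
        self._result = result
        self._done = True
        for fn in self._callbacks:
            fn(self)

    def result(self):
        return self._result

    def __iter__(self):
        # Lets a coroutine write 'result = yield from future'.
        yield self
        return self._result

class Task:
    # Drives a generator-based coroutine: each time the future it is
    # waiting on completes, step the coroutine forward with the result.
    def __init__(self, coro):
        self.coro = coro
        self.step(None)

    def step(self, value):
        try:
            fut = self.coro.send(value)
        except StopIteration:
            return
        fut.add_done_callback(lambda f: self.step(f.result()))

# Tiny demo: a coroutine suspended on a future that we resolve by hand
# (in the real thing, the event loop resolves futures when I/O completes).
def wait_for(fut):
    value = yield from fut
    print('got', value)

f = Future()
Task(wait_for(f))
f.set_result(42)   # resumes the task, printing 'got 42'
```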
The webclients directory follows the same incremental approach, focusing on the simple task of concurrently fetching a list of URLs (see the sketch below). It is a work in progress, also integrating some Tornado examples.
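As a taste of where that is heading, a minimal stdlib-only sketch that fetches several URLs concurrently with `asyncio.gather` (the crude HTTP/1.0 handling and the example URLs are illustrative, plain HTTP only):

```python
import asyncio
from urllib.parse import urlsplit

async def fetch(url):
    # Very crude HTTP/1.0 GET over asyncio streams.
    parts = urlsplit(url)
    reader, writer = await asyncio.open_connection(parts.hostname, parts.port or 80)
    request = f'GET {parts.path or "/"} HTTP/1.0\r\nHost: {parts.hostname}\r\n\r\n'
    writer.write(request.encode())
    await writer.drain()
    body = await reader.read()        # HTTP/1.0: the server closes, read to EOF
    writer.close()
    return url, len(body)

async def main(urls):
    # gather runs all the fetches concurrently on a single event loop.
    for url, size in await asyncio.gather(*(fetch(u) for u in urls)):
        print(url, size, 'bytes')

asyncio.run(main(['http://example.com/', 'http://example.org/']))
```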