For any non-trivial code which is considered "slow", I bet at some point one of your engineers has wondered how can we speed this up by executing in parallel or concurrently. Setting aside solutions to this at an infrastructure or system design level (scaling out or mapreduce respectively, for example), this article looks at the options at a code execution level using Python.
In the Python world, there are 3 main ways to achieve concurrent code execution (or at least "near-concurrent"):
Approach | Summary | Best For |
---|---|---|
Multiprocessing |
| CPU bound jobs |
Multithreading |
| I/O Bound, Fast I/O, Limited Number of Connections |
Asyncio / coroutines |
| I/O Bound, Slow I/O, Many connection |
With multiprocessing, you start multiple processes which are run in their own separate memory space.
The processes can either be spawned (ie a fresh process is started) or forked (the new process is identical to the parent process up to the point of being created). Your start method (either spawning or forking) can be configured to overwrite the default method which differs by operating system.
Since processes each run in their own memory space, the advantage is that you don't have to worry about data corruption and deadlocks, ie the usual problems associated with threading.
However the drawbacks to using multiprocessing are:
Here's what a simple implementation of multiprocessing looks like:
import multiprocessing
def add_two(number):
print('Addition:' , number + 2)
def subtract_two(number):
print('Subtraction:' , number - 2)
if __name__ == "__main__":
number = 7
p1 = multiprocessing.Process(target=add_two, args=(number,))
p2 = multiprocessing.Process(target=subtract_two, args=(number,))
p1.start()
p2.start()
p1.join()
p2.join()
The join() you see here is the conceptual opposite to fork, in that it asks the master process to wait until the child processes have been "joined" back to the master process. Without this, the master process could terminate early, leaving zombie child processes.
If you really needed to, you could also exchange data between processes using queues and pipes.
Threading uses multiple threads in the same process, with each thread sharing the same memory space.
This shared memory space is both the source of threading's main advantage over multiprocessing (faster start!) and its main disadvantage - data synchronisation and the numerous headaches caused when the order of code execution in a function is not guaranteed (ie no more "local reasoning").
Still, threading might be a suitable solution for cases where your code is I/O bound and spends a lot of time waiting for results from a remote source. Like retrieving data from an external API for example, where you could have multiple threads downloading from different API endpoints "near concurrently".
Note: I mention "near concurrently" because in Python the threads are not all actually working in parallel at all times. In the most common "default" Python (ie CPython), the Global Interpreter Lock is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. Effectively, this means that threads yield control to other threads at certain points (uh... I/O!) so that overall your tasks are completed faster, but the CPU steps are not actually being processed concurrently.
Here's what a threading implementation might look like:
import threading
def add_two(number):
print('Addition:', number + 2)
return (number + 2)
if __name__ == "__main__":
number = 7
threads = 2 # Number of threads to create
for i in range(threads):
thread = threading.Thread(target=add_two, args=(number,))
thread.start()
print("List processing complete.")
Asyncio is a part of the standard library from Python 3.5 onwards, and uses coroutines to handle similar problems as threading - without the issues of threading of course!
Coroutines are run on single threads, but allow the engineer to stipulate when control is yielded back to the main task. I think of it as threading but you (the engineer) has greater control over when control is yielded, as compared to threading where the system mostly handles this.
Asyncio is therefore useful for situations where you have relatively slow I/O and want to explicitly specify when control is yielded.
Here's what it might look like:
import time
import asyncio
async def add_two(number):
print('Addition started work: {}'.format(time.time()))
await asyncio.sleep(5)
print("Addition: %s" % (number + 2))
print('Addition ended work: {}'.format(time.time()))
async def subtract_two(number):
print('Subtraction started work: {}'.format(time.time()))
await asyncio.sleep(5)
print("Subtraction: %s" % (number - 2))
print('Subtraction ended work: {}'.format(time.time()))
if __name__ == "__main__":
# In Python 3.7+, you would use this instead:
# asyncio.run(main())
number = 7
loop = asyncio.get_event_loop()
futures = [add_two(number), subtract_two(number)]
result = loop.run_until_complete(asyncio.wait(futures))
Hopefully this article has served as an introduction to a fairly complex subject in Python programming.
The general consensus for when to use each of these seems to be: