After all the theory on Python concurrency and the Python GIL, we are now ready for some example code and experiments. Let’s get to work!
Our test function
Let’s first define a function that we can use to benchmark our different options. All the following examples use the same function, called heavy:
def heavy(n, myid):
    for x in range(1, n):
        for y in range(1, n):
            x**y
    print(myid, "is done")
The heavy function is a nested Python for-loop that does exponentiation (x**y) and throws the result away. It is a CPU-bound function. If you observe your system while running this, you’ll see CPU usage close to 100% (for one core). You can replace it with anything you want, but beware of race conditions: don’t use shared objects or variables.
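If you'd rather check the CPU-bound claim from inside Python than from a system monitor, here is a minimal sketch. It compares wall-clock time (time.perf_counter) with the CPU time consumed by the process (time.process_time) for a single call. The helper name cpu_vs_wall and the size 300 are made up for this illustration.

```python
import time


def heavy(n, myid):
    # Same test function as above
    for x in range(1, n):
        for y in range(1, n):
            x**y
    print(myid, "is done")


def cpu_vs_wall(n):
    """Return (wall_seconds, cpu_seconds) for one call to heavy(n, ...).

    cpu_vs_wall is a hypothetical helper, not part of the original example.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    heavy(n, "probe")
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = cpu_vs_wall(300)
    print(f"wall: {wall:.3f} s, cpu: {cpu:.3f} s")
```

On an otherwise idle machine the two numbers should be close to each other, which is what you'd expect from CPU-bound work. For a function that mostly waits on I/O, the CPU time would be far below the wall-clock time.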
We’ll run this function in different ways and explore the differences between a regular, single-threaded Python program, multithreading, and multiprocessing.
The baseline: single-threaded execution
Each Python program has at least one thread: the main thread. Below you’ll find the single-threaded version, which serves as our baseline in terms of speed. It runs our heavy function 80 times, sequentially:
import time


# A CPU heavy calculation, just
# as an example. This can be
# anything you like
def heavy(n, myid):
    for x in range(1, n):
        for y in range(1, n):
            x**y
    print(myid, "is done")


def sequential(n):
    for i in range(n):
        heavy(500, i)


if __name__ == "__main__":
    start = time.time()
    sequential(80)
    end = time.time()
    print("Took: ", end - start)
On my system, this takes about 46 seconds to run to completion.
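Since we'll be comparing this number against other versions, it can help to wrap the timing in a small reusable helper. The sketch below is one possible way to do that, not part of the original benchmark: timed is a hypothetical helper, and the size parameter on sequential is an added assumption so you can try scaled-down runs. It uses time.perf_counter, which is intended for measuring short durations, whereas time.time follows the system clock and can jump if that clock is adjusted.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds).

    Hypothetical helper for this illustration.
    """
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def heavy(n, myid):
    for x in range(1, n):
        for y in range(1, n):
            x**y
    print(myid, "is done")


def sequential(n, size=500):
    # size is an added parameter (assumption) so smaller runs are possible
    for i in range(n):
        heavy(size, i)


if __name__ == "__main__":
    calls = 8
    _, elapsed = timed(sequential, calls, size=200)
    print(f"Took: {elapsed:.2f} s ({elapsed / calls:.3f} s per call)")
```

Your numbers will differ from mine depending on your hardware and Python version, so compare versions only against a baseline measured on the same machine.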
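Note that the if __name__ == "__main__": guard is not strictly needed for this single-threaded script. It becomes necessary once we use multiprocessing on platforms that start worker processes by re-importing the main module, such as Windows, and it’s good form to always use it.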
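To see what the guard actually does, here is a small sketch that writes a throwaway module to a temporary directory and loads it twice: once as a script and once through import. The module name guard_demo and the function run_both_ways are made up for this illustration.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of a throwaway module (hypothetical name: guard_demo)
MODULE_SOURCE = textwrap.dedent('''
    print("loaded with __name__ =", __name__)

    if __name__ == "__main__":
        print("guarded block ran")
''')


def run_both_ways():
    """Run the demo module as a script, then import it; return both outputs."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "guard_demo.py"
        path.write_text(MODULE_SOURCE)

        # 1. Executed directly: __name__ is "__main__"
        as_script = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, check=True,
        ).stdout

        # 2. Imported from another program: __name__ is the module name
        as_import = subprocess.run(
            [sys.executable, "-c", "import guard_demo"],
            capture_output=True, text=True, check=True, cwd=tmp,
        ).stdout
    return as_script, as_import


if __name__ == "__main__":
    as_script, as_import = run_both_ways()
    print("Run as a script:\n" + as_script)
    print("Imported:\n" + as_import)
```

The guarded block only runs in the first case. That matters for benchmarks like ours: any code that imports the file gets the function definitions without kicking off the whole run again.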
In the following articles, we’ll explore a threaded version and a multiprocessing version and learn the difference between these two ways of writing concurrent code.