Now that we’ve set the baseline, we can try to improve the speed of our code by using the Python threading library. As you’ll soon find out, threading doesn’t have much use when your software is CPU-bound. That’s why I’ll also demonstrate how IO-bound software can benefit tremendously from threading. If you need a reminder on the difference between IO-bound and CPU-bound, head over to my article on Python concurrency first.
Table of contents
A CPU-Bound Python threading example
In the following example, we use multiple threads to run our heavy function, again 80 times. Each invocation gets its own thread:
import threading import time # A CPU heavy calculation, just # as an example. This can be # anything you like def heavy(n, myid): for x in range(1, n): for y in range(1, n): x**y print(myid, "is done") def threaded(n): threads =  for i in range(n): t = threading.Thread(target=heavy, args=(500,i,)) threads.append(t) t.start() for t in threads: t.join() if __name__ == "__main__": start = time.time() threaded(80) end = time.time() print("Took: ", end - start)
This threaded version takes about 47s to run on my system. If the heavy function had a lot of blocking IO, like network calls or filesystem operations, this version would be a big optimization. The reason this is not an optimization for CPU-bound functions is the Python GIL!
If Python didn’t have the GIL, this would be much faster. However, despite having 80 threads, this runs roughly as fast as the baseline. The baseline is, in fact, even a little faster since it does not have all the overhead of thread creation and switching between threads.
An IO-Bound Python threading example
This is the GIL at work. Each thread takes turns instead of running all at once. To demonstrate that, if heavy would have been an I/O bound function, this would have given us a tremendous speed increase, we’ll create a little test! We can simulate I/O bound software by using
time.sleep(). Sleep has the same effect as blocking IO: it allows the CPU to do other stuff, and return once a certain time has passed.
I modified the heavy function in the next code fragment:
import threading import time # An I/O intensive calculation. # We simulate it with sleep. def heavy(n, myid): time.sleep(2) print(myid, "is done") def threaded(n): threads =  for i in range(n): t = threading.Thread(target=heavy, args=(500,i,)) threads.append(t) t.start() for t in threads: t.join() if __name__ == "__main__": start = time.time() threaded(80) end = time.time() print("Took: ", end - start)
Even though we have 80 Python threads all sleeping for two seconds, this code still finishes in a little over two seconds. While sleeping, the Python threading library can schedule other threads to run. Sweet!
If you’d like to learn more about Python threading, make sure to read the official documentation as well. You’re probably curious if, and how, we can optimize the CPU-bound code. That’s exactly what we’ll find out in the next article about Python multiprocessing.