The story of the GIL
What if someone told you that, even though you have created a multithreaded program, only a single thread can execute at a time? This situation used to be true when systems consisted of a single core that could execute only one thread at a time, and the illusion of multiple running threads was created by the CPU switching between threads frequently.
But this situation also holds in one of the implementations of Python. CPython, the original and reference implementation of Python, contains a global mutex known as the Global Interpreter Lock (GIL), which allows only one thread to execute Python bytecode at a time. This effectively limits a multithreaded application to running one thread at a time, no matter how many cores are available.
The GIL was introduced because the CPython interpreter wasn't thread-safe. It proved to be an effective way to work around the thread-safety issues, at the cost of giving up the ability to run multiple threads in parallel.
The existence of the GIL has been a highly debated topic in the Python community, and many proposals have been made to eliminate it. For most of CPython's history, none of them made it into a production version of Python, for reasons that include the performance impact on single-threaded applications and the risk of breaking backward compatibility for features that have grown to depend on the presence of the GIL. (CPython 3.13 introduced an optional free-threaded build that can run without the GIL, but it is experimental, and the default build still includes the GIL.)
So, what does the presence of the GIL mean for your multithreaded application? If your application uses multithreading for I/O workloads, the GIL is unlikely to cost you much performance: CPython releases the GIL while a thread waits on blocking I/O, so other threads can run in the meantime. The impact of the GIL is felt when the application uses multiple threads for CPU-intensive tasks that require heavy manipulation of application-specific data structures. Since all such manipulation involves executing Python bytecode, the GIL severely limits the performance of a multithreaded application by not allowing more than one thread to execute in parallel.
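The difference between the two kinds of workload can be observed with a small experiment. The following is a minimal sketch, not a rigorous benchmark: the helper names (`io_task`, `cpu_task`, `run_in_threads`, `run_sequentially`) are made up for illustration, `time.sleep` stands in for blocking I/O, and the exact timings will vary from machine to machine.

```python
import threading
import time


def io_task(seconds):
    """Stand-in for blocking I/O: time.sleep releases the GIL while waiting."""
    time.sleep(seconds)


def cpu_task(iterations):
    """Pure-Python loop: the running thread needs the GIL for every bytecode."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def run_in_threads(target, arg, thread_count):
    """Run target(arg) in thread_count threads; return the elapsed seconds."""
    threads = [
        threading.Thread(target=target, args=(arg,)) for _ in range(thread_count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


def run_sequentially(target, arg, repeat):
    """Call target(arg) repeat times in one thread; return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeat):
        target(arg)
    return time.perf_counter() - start


if __name__ == "__main__":
    # I/O-bound: the threaded run should take roughly as long as one task.
    print("I/O sequential:", run_sequentially(io_task, 0.2, 4))
    print("I/O threaded:  ", run_in_threads(io_task, 0.2, 4))

    # CPU-bound: on a build with the GIL, threads are unlikely to be faster.
    print("CPU sequential:", run_sequentially(cpu_task, 200_000, 4))
    print("CPU threaded:  ", run_in_threads(cpu_task, 200_000, 4))
```

On a standard CPython build, the threaded I/O run typically finishes in about the time of a single task, while the threaded CPU run usually takes about as long as the sequential one, or longer.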
So, is there a workaround for the problems that the GIL causes? The answer is yes, but which solution to adopt depends entirely on the use case of the application. The following options can help you avoid the GIL:
- Switching the Python implementation: If your application does not depend on a specific Python implementation and a switch to another one is feasible, you can choose an implementation that does not have a GIL. Jython and IronPython are two such implementations; both can fully exploit multiprocessor systems when executing multithreaded applications.
- Utilizing multiprocessing: Python offers several options for building programs with concurrency in mind. We explored multithreading, which is one of those options but is limited by the GIL. Another option is Python's multiprocessing capabilities, which allow you to launch multiple processes that execute tasks in parallel. Since every process runs in its own instance of the Python interpreter, with its own GIL, the GIL is no longer a bottleneck, and the application can fully exploit multiprocessor systems.
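As a first taste of the second option, here is a minimal sketch that spreads a CPU-bound task across worker processes using `multiprocessing.Pool` from the standard library. The functions `count_primes` and `count_primes_parallel` and the chosen limits are illustrative assumptions rather than part of any library. The script also prints the running implementation through `platform.python_implementation()`, which is relevant to the first option.

```python
import platform
from multiprocessing import Pool


def count_primes(limit):
    """Count the primes below limit by trial division (deliberately CPU-heavy)."""
    count = 0
    for candidate in range(2, limit):
        is_prime = True
        divisor = 2
        while divisor * divisor <= candidate:
            if candidate % divisor == 0:
                is_prime = False
                break
            divisor += 1
        if is_prime:
            count += 1
    return count


def count_primes_parallel(limits, processes=None):
    """Run count_primes for each limit in a pool of worker processes.

    Each worker is a separate interpreter process with its own GIL.
    With processes=None, Pool picks a default based on the CPU count.
    """
    with Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    print("Interpreter:", platform.python_implementation())
    limits = [20_000, 30_000, 40_000, 50_000]  # hypothetical workload sizes
    print(count_primes_parallel(limits))
```

The `if __name__ == "__main__":` guard matters because, on platforms that start workers by spawning a fresh interpreter, each worker imports the main module again. Note also that arguments and results are pickled to travel between processes, so this approach pays off mainly when the computation outweighs that transfer cost.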
Now that we know how the GIL impacts multithreaded applications, let's discuss how multiprocessing can help you overcome these limitations.