Controlling parallelism
We know that spawned goroutines start executing as soon as possible and run concurrently. However, there is an inherent risk when these goroutines need to work on a shared resource that can handle only a limited number of simultaneous tasks. Overloading it might cause the shared resource to slow down significantly or, in some cases, even fail. As you might guess, this is not a new problem in computer science, and there are many ways to handle it. As we shall see throughout the chapter, Go provides mechanisms to control parallelism in a simple and intuitive fashion. Let's start by looking at an example that simulates the problem of an overburdened shared resource, and then proceed to solve it.
Imagine a cashier who has to process orders but can handle only 10 orders in a day. Let's look at how to represent this as a program:
// cashier.go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup

	// ordersProcessed and cashier are declared in the main function
	// so that cashier has access to the shared state variable 'ordersProcessed'.
	// If we were to declare the variable inside the 'cashier' function,
	// then its value would be set to zero with every function call.
	ordersProcessed := 0

	cashier := func(orderNum int) {
		if ordersProcessed < 10 {
			// Cashier is ready to serve!
			fmt.Println("Processing order", orderNum)
			ordersProcessed++
		} else {
			// Cashier has reached the max capacity of processing orders.
			fmt.Println("I am tired! I want to take rest!", orderNum)
		}
		wg.Done()
	}

	for i := 0; i < 30; i++ {
		// Note that instead of calling wg.Add(30) once, we add 1
		// per loop iteration. Both are valid ways to add to the WaitGroup
		// as long as we can ensure the right number of calls.
		wg.Add(1)
		go func(orderNum int) {
			// Making an order
			cashier(orderNum)
		}(i)
	}
	wg.Wait()
}
A possible output of the program might be as follows:
Processing order 29
Processing order 22
Processing order 23
Processing order 13
Processing order 24
Processing order 25
Processing order 21
Processing order 26
Processing order 0
Processing order 27
Processing order 14
I am tired! I want to take rest! 28
I am tired! I want to take rest! 1
I am tired! I want to take rest! 7
I am tired! I want to take rest! 8
I am tired! I want to take rest! 2
I am tired! I want to take rest! 15
...
The preceding output shows a cashier who became overwhelmed after roughly 10 orders; notice that this particular run actually processed 11. If you run the preceding code multiple times, you might get different outputs. In some runs, for example, all 30 orders might be processed!
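This is happening because of what is known as a race condition. A data race, a common cause of race conditions, occurs when multiple actors (goroutines, in our case) access a common shared state at the same time without synchronization and at least one of them modifies it, which results in incorrect reads and writes. Here, every goroutine reads and increments ordersProcessed, so several goroutines can pass the ordersProcessed < 10 check before any of them has recorded its increment.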
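To make the race concrete, here is a minimal sketch of one way to remove it: guarding the shared counter with a sync.Mutex so that the limit check and the increment happen as a single step. This is an illustration only and not necessarily the approach taken later in the chapter; the helper processOrders, its parameters, and the rejected counter are hypothetical names introduced for this example.

```go
// cashier_mutex.go (hypothetical file name for this sketch)
package main

import (
	"fmt"
	"sync"
)

// processOrders is a hypothetical helper, not part of the original example.
// It launches one goroutine per order and guards the shared counters with a
// sync.Mutex, so only one goroutine at a time can check and update them.
func processOrders(totalOrders, limit int) (processed, rejected int) {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
	)

	cashier := func(orderNum int) {
		defer wg.Done()

		// Hold the lock for the whole check-then-increment sequence.
		mu.Lock()
		defer mu.Unlock()

		if processed < limit {
			fmt.Println("Processing order", orderNum)
			processed++
		} else {
			fmt.Println("I am tired! I want to take rest!", orderNum)
			rejected++
		}
	}

	for i := 0; i < totalOrders; i++ {
		wg.Add(1)
		go cashier(i)
	}

	// After Wait returns, all goroutines have finished, so reading
	// the counters here is safe.
	wg.Wait()
	return processed, rejected
}

func main() {
	processed, rejected := processOrders(30, 10)
	fmt.Println("processed:", processed, "rejected:", rejected)
}
```

With the lock in place, the order in which orders are served still varies between runs, but the final line should always read "processed: 10 rejected: 20". Running the original cashier.go with Go's race detector (go run -race cashier.go) should flag the unsynchronized access to ordersProcessed.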
We can try to solve this issue in two ways:
- Increasing the limit for processing orders
- Increasing the number of cashiers
Increasing the limit is feasible only to a certain extent, beyond which it would start degrading the system; in the case of the cashier, the work would be neither efficient nor 100% accurate. In contrast, by increasing the number of cashiers, we can process more orders concurrently without changing the limit. There are two approaches to this:
- Distributed work without channels
- Distributed work with channels