I’m sure you’ve had to write code where you have a few hundred thousand items and need to do some work on each of them. Processing the jobs serially would take too long, and launching a goroutine for every item at once isn’t possible because of some other limitation. You need a worker pool.
```go
package main

import (
	"fmt"
	"time"
)

func worker(id int, jobs <-chan int, results chan<- int) {
	for j := range jobs {
		fmt.Println("worker", id, "started job", j)
		time.Sleep(time.Second)
		fmt.Println("worker", id, "finished job", j)
		results <- j * 2
	}
}

func main() {
	jobs := make(chan int, 100)
	results := make(chan int, 100)

	for w := 1; w <= 3; w++ {
		go worker(w, jobs, results)
	}

	for j := 1; j <= 5; j++ {
		jobs <- j
	}
	close(jobs)

	for a := 1; a <= 5; a++ {
		<-results
	}
}
```
I’m a big fan of Go by Example. But can we make that even simpler?
I just had a case where I don’t even need a return value. So what can we remove?
It turns out Go’s channels are a great fit for this problem. Instead of using them to pass values, we can use a buffered channel as a semaphore to limit how many jobs run at once.
```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	// We manage the concurrency by changing the size of the channel buffer.
	buffers := make(chan bool, 2)

	// We use a WaitGroup to wait for all the jobs to finish.
	var wg sync.WaitGroup

	for i := 0; i <= 10; i++ {
		wg.Add(1)
		// We create one goroutine for each job.
		go func(i int) {
			// This send blocks while the buffer is full, i.e. until
			// another job finishes and frees a slot.
			buffers <- true

			time.Sleep(time.Second * 2)
			fmt.Println(i)

			// As soon as we've finished our work, we discard the value,
			// freeing a slot for the next job.
			<-buffers
			wg.Done()
		}(i)
	}
	wg.Wait()
}
```
In this version, handling the workers and the concurrency becomes a lot simpler: increase the buffer size if you want more jobs to run in parallel.
If you have ways to continue improving this code, I would love to hear about them!