Go Simplified: Scheduling and Context Switching

5 min read · Feb 8, 2022

So you recently got introduced to this marvelous language, or maybe you have been using it for a while. You got your hands dirty with goroutines, channels and the whole shebang. However, you would like to know more about how it all works underneath those APIs. Well, say no more, because I've got you covered with this Simplified series article explaining what the scheduler is and how Go manages all the context switches.

What is the Scheduler?

The Go scheduler is part of the Go runtime, which is compiled into every Go executable. It's also called an M:N scheduler. [We'll see why M:N down the road]

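The Go runtime creates GOMAXPROCS logical processors, and GOMAXPROCS defaults to the number of logical CPUs on the machine. At most that many OS threads execute Go code at the same time. Threads that are blocked in system calls do not count towards this limit, so the total number of threads can be higher. As of Go 1.14, the runtime also supports asynchronous preemption of goroutines. [We'll get to preemption in a moment]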
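If you want to see these numbers on your own machine, here is a minimal sketch. The helper name schedulerLimits is made up for this article; runtime.GOMAXPROCS and runtime.NumCPU are the real standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerLimits reports the current GOMAXPROCS value and the number of
// logical CPUs visible to the process.
func schedulerLimits() (procs, cpus int) {
	// GOMAXPROCS(0) only queries the setting; arguments below 1 change nothing.
	return runtime.GOMAXPROCS(0), runtime.NumCPU()
}

func main() {
	procs, cpus := schedulerLimits()
	fmt.Printf("GOMAXPROCS=%d NumCPU=%d\n", procs, cpus)
}
```

Unless you set the GOMAXPROCS environment variable or call runtime.GOMAXPROCS with a positive value yourself, the two numbers are usually the same.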

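The scheduler will also preempt a goroutine based on a time slice, in case you thought that a busy routine could keep hogging the CPU. A goroutine that has been running for roughly 10ms gets asked to yield, and since Go 1.14 this works even for tight loops that make no function calls. [Goroutine leaks can still happen but hey, Go is trying its best. Okay!?]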
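Here is a small experiment to watch preemption happen. It is only a sketch, and it assumes Go 1.14 or newer on a platform that supports asynchronous preemption; on older versions the sleeping goroutine would never get the processor back. The function name sleepNextToBusyLoop is made up for this article.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// sleepNextToBusyLoop pins the program to a single P, starts a goroutine that
// spins without any function calls, and measures how long a short sleep in
// the calling goroutine really takes.
func sleepNextToBusyLoop() time.Duration {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan struct{})
	go func() {
		defer close(done)
		// No function calls in the loop body, so only asynchronous
		// preemption can take the P away from this goroutine.
		for atomic.LoadInt32(&stop) == 0 {
		}
	}()

	start := time.Now()
	time.Sleep(20 * time.Millisecond) // the busy loop owns the P meanwhile
	elapsed := time.Since(start)

	atomic.StoreInt32(&stop, 1) // tell the busy loop to finish
	<-done
	return elapsed
}

func main() {
	fmt.Println("woke up after", sleepNextToBusyLoop())
}
```

The sleep takes a bit longer than 20ms because the calling goroutine has to wait until the spinning one is preempted, but it does wake up.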

Okay, but can you make it more graphic?

Sure, the diagram below shows the states a goroutine can be in.

When the routine is created, it enters the Runnable state.
When it gets scheduled, it moves to the Executing state.
If the routine runs through its time slice while in the Executing state, it is preempted by the scheduler and goes back to the Runnable state.
If the routine gets blocked, e.g. on an I/O event or on a channel, it moves to the Waiting state.
Once the block no longer exists, it goes back to the Runnable state.
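The state changes above can also be written down as a tiny state machine. This is a model of the list, not how the runtime stores goroutine states; all type and constant names are made up for illustration.

```go
package main

import "fmt"

// State and Event are hypothetical names that mirror the states and
// transitions described in the article.
type State string
type Event string

const (
	Runnable  State = "Runnable"
	Executing State = "Executing"
	Waiting   State = "Waiting"

	Scheduled Event = "scheduled"
	Preempted Event = "preempted"
	Blocked   Event = "blocked"
	Unblocked Event = "unblocked"
)

// transitions lists every move described in the article.
var transitions = map[State]map[Event]State{
	Runnable:  {Scheduled: Executing},
	Executing: {Preempted: Runnable, Blocked: Waiting},
	Waiting:   {Unblocked: Runnable},
}

// next returns the state after event e, or false if the move is not allowed.
func next(s State, e Event) (State, bool) {
	to, ok := transitions[s][e]
	return to, ok
}

func main() {
	s := Runnable // a freshly created goroutine starts here
	for _, e := range []Event{Scheduled, Blocked, Unblocked, Scheduled, Preempted} {
		to, ok := next(s, e)
		if !ok {
			fmt.Printf("%s cannot handle %q\n", s, e)
			return
		}
		fmt.Printf("%s --%s--> %s\n", s, e, to)
		s = to
	}
}
```

Note that there is no direct way from Waiting to Executing: a goroutine that gets unblocked always has to queue up as Runnable first.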

Let's try a full rundown of all the elements involved in routine scheduling.

For a system core, an OS thread M is created.
For that thread, a logical processor P is created, which attaches itself to the OS thread and then schedules goroutines in the context of that thread.
R1 here is a goroutine currently executing on the thread M.
R2, R3, R4 form the Local Running Queue of the processor P.
GR1, GR2 form the Global Running Queue.
Once the local running queue is empty, the processor will pull goroutines from the global running queue to execute them.
Whenever a new goroutine is created, it normally gets added to the local running queue of the processor that created it. If that queue is full, goroutines spill over into the global running queue.

So, I can say that at any given moment, I can schedule N routines on M OS threads, using at most GOMAXPROCS processors.

Congratulations! You now know why we called it the M:N scheduler.
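To get a feel for the ratio, here is a minimal sketch that starts far more goroutines than there are processors and lets the scheduler multiplex them. The function name runMany and the count of 10000 are arbitrary choices for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// runMany starts n goroutines, waits for all of them, and returns how many ran.
func runMany(n int) int {
	var (
		wg  sync.WaitGroup
		ran int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&ran, 1)
		}()
	}
	wg.Wait()
	return int(atomic.LoadInt64(&ran))
}

func main() {
	fmt.Printf("%d goroutines ran on at most %d processors\n",
		runMany(10000), runtime.GOMAXPROCS(0))
}
```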

Okay, looks fairly straightforward. How does the blocking work?

Glad you asked. Let's take a look at context switching due to a synchronous system call first:

Context Switching due to synchronous system call

Let's assume R1 starts to read a file synchronously, which then blocks M1.
The runtime then asks for another thread from the thread pool cache or the OS. Let's call this M2.
Now processor P is detached from thread M1 and attached to thread M2. R1 is still attached to M1.
P now schedules R2. R3 remains in the local running queue.
Once R1's system call has returned, R1 is moved back to a running queue, and M1 is put to sleep and returned to the thread pool cache.
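You can watch this hand-off with a raw blocking system call. The sketch below assumes a Unix-like system, because it uses syscall.Pipe and syscall.Read directly; the helper names readBlocking and blockedReadDemo are made up for this article. With only one processor, other goroutines can keep running only if P moves away from the blocked thread.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
	"time"
)

// readBlocking reads from fd with a raw system call, retrying if a signal
// interrupts it.
func readBlocking(fd int, buf []byte) (int, error) {
	for {
		n, err := syscall.Read(fd, buf)
		if err != syscall.EINTR {
			return n, err
		}
	}
}

// blockedReadDemo parks one goroutine in a blocking read(2) while only one P
// exists, then checks that other goroutines still get to run.
func blockedReadDemo() (othersRan bool, n int, err error) {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var fds [2]int // fds[0] is the read end, fds[1] the write end
	if err := syscall.Pipe(fds[:]); err != nil {
		return false, 0, err
	}
	defer syscall.Close(fds[0])
	defer syscall.Close(fds[1])

	type result struct {
		n   int
		err error
	}
	readDone := make(chan result, 1)
	go func() {
		buf := make([]byte, 8)
		n, err := readBlocking(fds[0], buf) // blocks the OS thread, like R1 on M1
		readDone <- result{n, err}
	}()

	time.Sleep(10 * time.Millisecond) // let the reader enter the system call

	other := make(chan struct{})
	go func() { close(other) }() // plays the role of R2
	<-other                      // reached only because P is free again

	if _, err := syscall.Write(fds[1], []byte("hi")); err != nil {
		return true, 0, err
	}
	r := <-readDone
	return true, r.n, r.err
}

func main() {
	ran, n, err := blockedReadDemo()
	fmt.Println("others ran:", ran, "bytes read:", n, "err:", err)
}
```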

Tell me about context switching for asynchronous system calls now!

Asynchronous system calls happen when the file descriptor used for the I/O operation is set to non-blocking mode.

Now if that file descriptor is not ready, the calling thread is not blocked; an error is returned instead, and the application has to retry the operation at a later time. Normally, the application would now need to build an event loop with callbacks, or maintain a table of these mappings and states.
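Here is what that "error instead of blocking" looks like when you talk to the kernel directly. This is a sketch for Unix-like systems only; the function name readWouldBlock is made up, while syscall.Pipe, syscall.SetNonblock and syscall.Read are real standard library calls.

```go
package main

import (
	"fmt"
	"syscall"
)

// readWouldBlock reads from an empty pipe whose read end is in non-blocking
// mode and reports whether the kernel answered with EAGAIN.
func readWouldBlock() (bool, error) {
	var fds [2]int // fds[0] is the read end, fds[1] the write end
	if err := syscall.Pipe(fds[:]); err != nil {
		return false, err
	}
	defer syscall.Close(fds[0])
	defer syscall.Close(fds[1])

	if err := syscall.SetNonblock(fds[0], true); err != nil {
		return false, err
	}

	buf := make([]byte, 8)
	_, err := syscall.Read(fds[0], buf) // nothing has been written yet
	return err == syscall.EAGAIN, nil
}

func main() {
	wouldBlock, err := readWouldBlock()
	fmt.Println("read would block:", wouldBlock, "setup error:", err)
}
```

An application working at this level has to remember that the read is still pending and come back to it later, which is exactly the bookkeeping described above.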

Go uses the netpoller, an abstraction built into the runtime. When a goroutine makes an async call (network I/O being the typical case) and the file descriptor is not ready, the netpoller is used to park the routine.

The netpoller uses the interfaces provided by the operating system (epoll on Linux, kqueue on macOS and the BSDs, I/O completion ports on Windows) to check on the status of the file descriptor. Once it gets a notification for the file descriptor, it in turn makes the goroutine runnable again.

Let's see this in a diagram:

Context Switching due to Asynchronous System Calls

Let's imagine that R1 is making an async call which has now returned an error due to the file descriptor not being ready.
Upon encountering the error, the scheduler moves the goroutine to the netpoller.
The netpoller will now keep checking and notify R1 when it can be resumed.
P, on the side, can start working on R2 while R1 is watched by the netpoller.
Once R1's file descriptor is ready, R1 goes back to the local running queue.

So here, we did not need to ask for another thread from the OS. We used the netpoller to park the time-consuming routine.
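From your own code, all of this is invisible: you just call Read and the goroutine waits. The sketch below assumes the loopback interface is available and uses made-up names (parkedReaders, and the count of 20). It opens a few TCP connections, lets one goroutine per connection wait in Read, and only then sends the data.

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"sync/atomic"
	"time"
)

// parkedReaders opens n loopback TCP connections, starts one goroutine per
// connection that waits in Read, and returns how many of them received data.
func parkedReaders(n int) (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()

	var (
		wg       sync.WaitGroup
		received int64
	)
	wg.Add(n)

	// Server side: every accepted connection gets its own waiting goroutine.
	go func() {
		for i := 0; i < n; i++ {
			conn, err := ln.Accept()
			if err != nil {
				return // listener was closed early
			}
			go func(c net.Conn) {
				defer wg.Done()
				defer c.Close()
				buf := make([]byte, 1)
				// With no data available, the goroutine is parked here.
				if _, err := c.Read(buf); err == nil {
					atomic.AddInt64(&received, 1)
				}
			}(conn)
		}
	}()

	// Client side: connect first, send the data later.
	clients := make([]net.Conn, 0, n)
	defer func() {
		for _, c := range clients {
			c.Close()
		}
	}()
	for i := 0; i < n; i++ {
		c, err := net.Dial("tcp", ln.Addr().String())
		if err != nil {
			return 0, err
		}
		clients = append(clients, c)
	}

	time.Sleep(20 * time.Millisecond) // give the readers time to park
	for _, c := range clients {
		if _, err := c.Write([]byte{'x'}); err != nil {
			return 0, err
		}
	}
	wg.Wait()
	return int(atomic.LoadInt64(&received)), nil
}

func main() {
	got, err := parkedReaders(20)
	fmt.Println("readers woken up:", got, "err:", err)
}
```

None of the waiting goroutines needs a thread of its own while it is parked, which is why Go programs can keep a very large number of idle connections open.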

Cool, but what happens when a processor runs out of routines to work on?

Here is where the concept of work stealing comes into the picture. Unlike most humans, processors just love executing tasks. So below are the rules followed for work stealing:

If no routines are present in the Local Running Queue (LRQ), try to steal from another processor's Local Running Queue.
If no routines are found in the other LRQs, check the Global Running Queue (GRQ).
If no routines are found in the GRQ, check the netpoller.

This provides a good way to make the best use of the time at hand. Note that this is the simplified picture: the exact order in which the runtime checks these places is an implementation detail and has changed between Go versions.
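The rules above fit into a few lines of code. What follows is a toy model and not the runtime's implementation: the types proc and sched, the function findWork and the choice to steal half of the victim's queue are all assumptions made for this sketch.

```go
package main

import "fmt"

// proc and sched are hypothetical types that only model the rules listed in
// the article; the real runtime is far more involved.
type proc struct {
	local []string // Local Running Queue (LRQ)
}

type sched struct {
	procs  []*proc
	global []string // Global Running Queue (GRQ)
	ready  []string // goroutines the netpoller reports as ready
}

// pop removes and returns the first element of q.
func pop(q *[]string) (string, bool) {
	if len(*q) == 0 {
		return "", false
	}
	g := (*q)[0]
	*q = (*q)[1:]
	return g, true
}

// findWork returns the next goroutine for p and where it came from.
func (s *sched) findWork(p *proc) (string, string) {
	if g, ok := pop(&p.local); ok {
		return g, "local"
	}
	for _, victim := range s.procs {
		if victim == p || len(victim.local) == 0 {
			continue
		}
		// Take half of the victim's queue, rounding up.
		n := (len(victim.local) + 1) / 2
		p.local = append(p.local, victim.local[:n]...)
		victim.local = victim.local[n:]
		g, _ := pop(&p.local)
		return g, "stolen"
	}
	if g, ok := pop(&s.global); ok {
		return g, "global"
	}
	if g, ok := pop(&s.ready); ok {
		return g, "netpoller"
	}
	return "", "idle"
}

func main() {
	p1 := &proc{} // has run out of work
	p2 := &proc{local: []string{"R2", "R3", "R4"}}
	s := &sched{procs: []*proc{p1, p2}, global: []string{"GR1"}, ready: []string{"N1"}}
	for i := 0; i < 6; i++ {
		g, from := s.findWork(p1)
		fmt.Printf("p1 picks %q (%s)\n", g, from)
	}
}
```

Stealing several routines at once means the idle processor does not have to come back to the same victim for every single task.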

Hope you liked this article. Show some love in the form of comments, critiques and applause!

If you like the idea of this series, please let me know that as well in the comment section so that I can keep bringing you more of the Simplified series!


Written by Deepak Choudhary
