Justin Pombrio

Notes on Concurrency

These are my wide but not deep notes on concurrency. I am not an expert, so take them with a grain of salt.

Concurrency vs. Parallelism

Roughly speaking, concurrency is when the execution of two tasks is interspersed, and parallelism is when two tasks literally run at the same time.

For example, Javascript is concurrent but not parallel: its event loop intersperses tasks, but they all run on a single thread.

Parallelism can only be achieved on a multi-core CPU (or multiple CPUs), whose hardware can run multiple sequences of assembly instructions at literally the same time.

Threads and Processes

Operating systems expose two basic concurrency abstractions:

- processes, each of which has its own isolated memory (address space), and
- threads, which live inside a process and share its memory.

Normally, when a particular thread in a process is being executed, it will continue to be executed. However, this can change if:

- the thread makes a blocking call (e.g. waiting for I/O),
- the thread voluntarily yields, or
- a timer interrupt fires and the OS preempts the thread.

When switching threads (context switching), the OS must:

- save the running thread’s registers (including the program counter and stack pointer), and
- restore the saved registers of the thread being switched to.

When switching processes (also context switching), the OS must do those things, and additionally:

- switch to the new process’s address space (its page tables), which typically flushes the TLB and leaves the caches cold.

This makes switching processes much more expensive than switching between threads within a process.


Inter-process Communication

If two processes want to talk to each other, how can they? There are a variety of ways, all intermediated by the operating system. They can communicate by:

- files,
- pipes,
- sockets,
- signals, or
- shared memory (which the OS sets up, and the processes then access directly).

Inter-thread Communication

Threads within a process share memory, and typically communicate by reading from and writing to that memory. There’s one big issue with this: if two threads access the same memory at the same time, and at least one of the accesses is a write, the results are unpredictable. When this happens, it’s called a data race.
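To make this concrete, here is a small Python sketch (the names are my own) in which several threads perform a non-atomic read-modify-write on a shared counter. The explicit yield between the read and the write invites a context switch at the worst moment, so updates are routinely lost:

```python
import threading
import time

counter = 0

def unsafe_increment():
    global counter
    for _ in range(100):
        value = counter      # read shared memory
        time.sleep(0)        # yield: invite another thread to run here
        counter = value + 1  # write back, possibly clobbering an update

threads = [threading.Thread(target=unsafe_increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With 4 threads x 100 increments we'd hope for 400, but lost updates
# typically leave the total well short of that.
print(counter)
```

(Strictly speaking, the `sleep(0)` only makes the race easy to observe; the read-modify-write is unsafe with or without it.)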

The Perils of Communication

A race condition is when a program may behave incorrectly due to the order that threads happened to run in. (Race conditions are frequently caused by data races, but either can occur without the other.)

The goal of concurrent code is to avoid race conditions by behaving correctly even if threads are scheduled by a malevolent hell-beast. And the goal of concurrency constructs is to make it as easy as possible for programmers to write concurrent code correctly—which is important because most programmers are less clever than malevolent hell-beasts.

Here are some more specific things that can go wrong:

- Deadlock: two threads each hold a resource the other is waiting for, so both wait forever.
- Livelock: threads keep running, but keep undoing each other’s progress, so nothing ever advances.
- Starvation: a thread never gets the CPU time or resources it needs, while other threads do fine.

A Myriad of Concurrency Constructs

There are a huge variety of concurrency constructs. Here is a sampling of the more salient ones:


A semaphore is a shared variable that is used to communicate some information about what resources are available between threads. The most common semaphore is an integer keeping a count. Obviously the threads can’t simply read and write it in the ordinary way; the semaphore must provide safe atomic operations like .read(), .inc(), and .dec(). Crucially, .dec() blocks while the count is zero, so a thread waiting for a resource sleeps until one is released.
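As a sketch, Python’s threading.Semaphore provides exactly these atomic operations: acquire() decrements (blocking at zero) and release() increments. Here it caps how many threads hold a resource at once (the function and variable names are my own):

```python
import threading
import time

pool = threading.Semaphore(2)  # count = 2: say, two identical resources
lock = threading.Lock()
in_use = 0
peak = 0

def use_resource():
    global in_use, peak
    with pool:                # acquire: atomically decrement; block at zero
        with lock:
            in_use += 1
            peak = max(peak, in_use)
        time.sleep(0.01)      # pretend to use the resource
        with lock:
            in_use -= 1
    # leaving the `with pool` block releases: atomically increment

threads = [threading.Thread(target=use_resource) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak)  # at most 2: the semaphore capped concurrent holders
```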


Lock is a general term, which in typical use includes mutexes and similar constructs.


A mutex is a shared boolean with .lock() and .unlock() operations. When a thread locks a mutex, one of two things happens: (i) the mutex was unlocked, so it becomes locked and the thread continues; or (ii) the mutex was already locked, so the thread goes to sleep until it is unlocked. This can easily result in deadlock if you have multiple mutexes and aren’t very careful about the order in which they’re locked.

It is expected that the thread that locks a mutex is the same one that unlocks it: this is in contrast to a semaphore, which is prototypically used by multiple threads. (A mutex can be thought of as a semaphore that holds a boolean value, though they typically have pretty different uses.)
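For example, Python’s threading.Lock is a mutex. A minimal sketch (names my own) of using one to protect a shared counter:

```python
import threading

counter = 0
mutex = threading.Lock()

def safe_increment():
    global counter
    for _ in range(1000):
        with mutex:       # .lock(): blocks if another thread holds it
            counter += 1  # critical section: only one thread at a time
        # leaving the `with` block is .unlock()

threads = [threading.Thread(target=safe_increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000: no updates were lost
```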

Spin Lock

A spin lock is another kind of lock. When a thread calls .lock() on a spin lock, either (i) it immediately gets the lock and continues running, or (ii) it loops continuously until the lock is released. This is normally a pretty silly thing to do, but OS kernels do it sometimes because it can be fast when you know the wait is short and you don’t want the overhead of context switching.
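A toy sketch of a spin lock in Python (real spin locks use atomic CPU instructions like test-and-set; here a non-blocking acquire on a threading.Lock stands in for the atomic flag, and the class name is my own):

```python
import threading

class SpinLock:
    def __init__(self):
        # A regular lock acquired non-blockingly plays the role of an
        # atomic test-and-set flag.
        self._flag = threading.Lock()

    def lock(self):
        # Spin: keep retrying without going to sleep, burning CPU
        # until the flag is free.
        while not self._flag.acquire(blocking=False):
            pass

    def unlock(self):
        self._flag.release()

spin = SpinLock()
spin.lock()
# ... critical section ...
spin.unlock()
```

In Python this is strictly worse than a plain Lock; it is meant only to illustrate the spinning behavior described above.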


A monitor is a concurrency construct that builds upon the mutex. Famously used in Java.

[FILL: add notes on monitors]
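As a placeholder sketch until those notes exist: Python’s threading.Condition pairs a mutex with wait/notify queues, which is the core of a monitor. Here is a monitor-style bounded buffer (the class and method names are my own):

```python
import threading

class BoundedBuffer:
    """A monitor: one mutex guards the state, and condition variables
    let threads sleep until the state they need becomes true."""

    def __init__(self, capacity):
        self.items = []
        self.capacity = capacity
        lock = threading.Lock()  # the monitor's single mutex
        self.not_full = threading.Condition(lock)
        self.not_empty = threading.Condition(lock)

    def put(self, item):
        with self.not_full:           # take the monitor's mutex
            while len(self.items) >= self.capacity:
                self.not_full.wait()  # release the mutex and sleep
            self.items.append(item)
            self.not_empty.notify()   # wake a waiting consumer

    def get(self):
        with self.not_empty:
            while not self.items:
                self.not_empty.wait()
            item = self.items.pop(0)
            self.not_full.notify()    # wake a waiting producer
            return item
```

Java bakes this pattern into every object via synchronized, wait(), and notify().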


Here’s the API for channels:

make_channel: forall T, () -> (Send<T>, Recv<T>)
send:         Send<T>, T -> ()
receive:      Recv<T> -> T

That’s the gist of it. A common way to use them is to create a channel, then create a new thread (passing it one end of the channel as you make it), and then communicate with the thread through the channel.
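Python has no built-in channel type, but a queue.Queue can play the role of one direction of a channel. A sketch of the pattern just described (the worker function and the None sentinel are my own invention):

```python
import threading
import queue

def worker(inbox, outbox):
    # Receive until a sentinel arrives, sending back squared values.
    while True:
        n = inbox.get()
        if n is None:      # sentinel: the channel is "closed"
            break
        outbox.put(n * n)

# A pair of queues stands in for the send/receive ends of two channels.
to_worker = queue.Queue()
from_worker = queue.Queue()
t = threading.Thread(target=worker, args=(to_worker, from_worker))
t.start()

for n in [1, 2, 3]:
    to_worker.put(n)
to_worker.put(None)
t.join()

results = [from_worker.get() for _ in range(3)]
print(results)  # [1, 4, 9]
```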

You should be careful not to hold on to a value that you sent through a channel, because the thread you sent it to could be mucking with it right now.

The π-calculus formally describes channels.

Cooperative Multitasking

Some concurrency primitives are primarily used in a single-threaded environment, and cooperatively yield control to each other. Thus they provide concurrency but not parallelism.


A generator is like a function, except that instead of returning it may also yield. Yielding returns execution to the caller (just like return). However, after a generator yields, invoking it again will resume its execution from the point at which it yielded.

In Python, any function that contains yield is secretly a generator. Calling it returns a generator object, which can be advanced with next(), like an iterator:

def generator():
    n = 0
    while True:
        yield n
        n += 1

g = generator()
while True:
    print(next(g))  # prints 0, 1, 2, ...

Coroutines are like generators, except that they can yield to each other, as well as passing values as they yield. To quote the example on Wikipedia:

var q := new queue

coroutine produce
        while q is not full
            create some new items
            add the items to q
            yield to consume

coroutine consume
        while q is not empty
            remove some items from q
            use the items
            yield to produce

In modern Python, generators can also accept values when resumed: instead of next(g) you say g.send(val), and then inside of g, the yield expression evaluates to val. This makes Python generators similar to coroutines. (But not quite as powerful: coroutines are supposed to be able to yield control to each other, but Python generators always yield control to their caller in particular.)
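A sketch of send() in action (the accumulator is my own example):

```python
def accumulator():
    total = 0
    while True:
        value = yield total  # `yield` evaluates to whatever was sent
        total += value

acc = accumulator()
next(acc)            # prime it: run to the first yield (which yields 0)
print(acc.send(10))  # prints 10
print(acc.send(5))   # prints 15
```

The initial next() is required: a generator must be run to its first yield before it can receive a value.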

Go has a concurrency construct called goroutines. They communicate over channels like coroutines do, but each goroutine is a lightweight thread that the Go runtime multiplexes across OS threads, making them a source of parallelism in addition to concurrency.

January 25, 2020