These are my wide but not deep notes on concurrency. I am not an expert, so take them with a grain of salt.
Roughly speaking, concurrency is when the execution of two tasks is interleaved, and parallelism is when two tasks literally run at the same time.
For example, JavaScript is concurrent but not parallel: its event loop interleaves many tasks, but only one of them runs at any given instant.
Parallelism can only be achieved on a multi-core CPU, whose hardware can run multiple sequences of assembly instructions at literally the same time.
Operating systems expose two basic concurrency abstractions: processes and threads.
Normally, when a particular thread in a process is being executed, it will continue to be executed. However, this can change if the thread blocks (e.g., waiting on I/O), voluntarily yields, or is preempted by the OS (for example, when its time slice expires or an interrupt arrives).
When switching threads (context switching), the OS must save the outgoing thread’s execution state (its registers, program counter, and stack pointer) and restore the incoming thread’s.
When switching processes (also context switching), the OS must do those things, and additionally switch to the new process’s address space (swapping page tables, which typically invalidates cached address translations in the TLB).
This makes switching processes much more expensive than switching between threads within a process.
If two processes want to talk to each other, how can they? There are a variety of ways, all intermediated by the operating system. They can communicate through files, pipes (in the shell, | makes a pipe), sockets, signals, or shared memory.
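For example, here’s a minimal POSIX-only sketch of two processes talking through a pipe (os.fork isn’t available on Windows, and the message text is arbitrary):

import os

r, w = os.pipe()                  # ask the OS for a pipe: (read end, write end)
pid = os.fork()                   # split into two processes

if pid == 0:                      # child process
    os.close(r)
    os.write(w, b"hello from the child")
    os._exit(0)
else:                             # parent process
    os.close(w)
    print(os.read(r, 1024))       # b'hello from the child'
    os.waitpid(pid, 0)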
Threads within a process share memory, and typically communicate by reading from and writing to that memory. There’s one big issue with this: if two threads read & write or write & write to the same memory at the same time, the results are unpredictable. When this happens, it’s called a data race.
A race condition is when a program may behave incorrectly due to the order that threads happened to run in. (Race conditions are frequently caused by data races, but either can occur without the other.)
The goal of concurrent code is to avoid race conditions by behaving correctly even if threads are scheduled by a malevolent hell-beast. And the goal of concurrency constructs is to make it as easy as possible for programmers to write concurrent code correctly—which is important because most programmers are less clever than malevolent hell-beasts.
Here are some more specific things that can go wrong: deadlock (threads wait on each other forever), livelock (threads keep reacting to each other without making progress), and starvation (a thread never gets the resource it needs to run).
There are a huge variety of concurrency constructs. Here is a sampling of the more salient ones:
A semaphore is a shared variable that is used to communicate some information about what resources are available between threads. The most common semaphore is an integer keeping a count. Obviously the threads can’t simply read and write it in the ordinary way; the semaphore must provide safe atomic operations like .read(), .inc(), and .dec().
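As a sketch, here’s what using a counting semaphore looks like in Python: threading.Semaphore’s acquire() plays the role of .dec() (blocking when the count is zero) and release() plays the role of .inc(). The limit of 3 and the worker function are made up for illustration.

import threading, time

slots = threading.Semaphore(3)        # counting semaphore: at most 3 holders at once

def worker(i):
    with slots:                       # acquire() on entry (.dec()), release() on exit (.inc())
        time.sleep(0.1)               # pretend to use the limited resource
        print("worker", i, "done")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads: t.start()
for t in threads: t.join()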
Lock is a general term, which in typical use includes mutexes and similar constructs.
A mutex is a shared boolean with .lock() and .unlock() operations. When a thread locks a mutex, one of two things happens: (i) the mutex was unlocked, and now it locks, and the thread can continue; or (ii) the mutex was locked, and the thread goes to sleep until it is unlocked. This can easily result in deadlock if you have multiple mutexes and aren’t very careful about what order they’re locked in.
It is expected that the thread that locks a mutex is the same one that unlocks it: this is in contrast to a semaphore, which is prototypically used by multiple threads. (A mutex can be thought of as a semaphore that holds a boolean value, though they typically have pretty different uses.)
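Here’s a small Python sketch of a mutex protecting a shared counter (threading.Lock is Python’s mutex; the counter and thread count are arbitrary). Without the lock, the read-modify-write of counter += 1 can interleave between threads and lose updates:

import threading

counter = 0
lock = threading.Lock()

def increment_many():
    global counter
    for _ in range(100_000):
        with lock:                    # .lock() on entry, .unlock() on exit
            counter += 1              # the read-modify-write can no longer interleave

threads = [threading.Thread(target=increment_many) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                        # always 400000 with the lock; possibly less without it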
A spin lock is another kind of lock. When a thread calls .lock() on a spin lock, either (i) it immediately gets the lock and continues running, or (ii) it loops continuously until the lock is released. This is normally a pretty silly thing to do, but OS kernels do it sometimes because it can be fast when you know the wait is short and you don’t want the overhead of context switching.
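Here’s a toy Python sketch of the shape of a spin lock. Real spin locks rely on a hardware atomic test-and-set instruction; here a non-blocking Lock.acquire stands in for that atomic flag, and the point is only to show the busy-waiting loop:

import threading

class SpinLock:
    def __init__(self):
        self._flag = threading.Lock()           # stand-in for an atomic test-and-set flag

    def lock(self):
        while not self._flag.acquire(blocking=False):
            pass                                # busy-wait ("spin") instead of going to sleep

    def unlock(self):
        self._flag.release()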
A monitor is a concurrency construct that builds upon the mutex. Famously used in Java.
A monitor pairs a mutex with one or more condition variables: a thread that holds the lock can wait on a condition (releasing the lock while it sleeps) and be woken when another thread signals that the condition may now hold. In Java, every object carries an implicit monitor, used via synchronized blocks together with wait() and notify().
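As a sketch, Python’s threading.Condition bundles a mutex with a condition variable, which is enough to build a little monitor-style object. The one-slot Mailbox below is invented for illustration:

import threading

class Mailbox:
    def __init__(self):
        self._cond = threading.Condition()      # a mutex plus wait/notify
        self._item = None

    def put(self, item):
        with self._cond:                        # take the monitor's lock
            while self._item is not None:       # slot full: wait for a consumer
                self._cond.wait()               # releases the lock while sleeping
            self._item = item
            self._cond.notify_all()             # wake any threads waiting in get()

    def get(self):
        with self._cond:
            while self._item is None:           # slot empty: wait for a producer
                self._cond.wait()
            item, self._item = self._item, None
            self._cond.notify_all()             # wake any threads waiting in put()
            return item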
A channel is a conduit that one thread can send values into and another can receive values from. Here’s the API for channels:
make_channel: forall T, () -> (Send<T>, Recv<T>)
send: Send<T>, T -> ()
receive: Recv<T> -> T
That’s the gist of it. A common way to use them is to create a channel, then create a new thread (passing it one end of the channel as you make it), and then communicate with the thread through the channel.
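Python doesn’t have channels as such, but queue.Queue can stand in for one (it isn’t split into separate Send and Recv ends, though). Here’s a sketch of the pattern just described, where the worker function and the None “stop” sentinel are made up for illustration:

import threading, queue

def worker(inbox, outbox):
    while True:
        n = inbox.get()              # receive: block until a value arrives
        if n is None:                # sentinel: no more work
            return
        outbox.put(n * n)            # send the result back

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

for n in range(5):
    inbox.put(n)                     # send work to the other thread
    print(outbox.get())              # receive: prints 0, 1, 4, 9, 16

inbox.put(None)                      # ask the worker to stop
t.join()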
You should be careful not to hold on to a value that you sent through a channel, because the thread you sent it to could be mucking with it right now.
The π-calculus formally describes channels.
Some concurrency primitives are primarily used in a single-threaded environment, and cooperatively yield control to each other. Thus they provide concurrency but not parallelism.
A generator is like a function, except that instead of returning it may also yield. Yielding returns execution to the caller (just like return). However, after a generator yields, invoking it again will resume its execution from the point at which it yielded.
In Python, any function that contains yield is secretly a generator, and can be “invoked” with next(), like an iterator:
def generator():
    n = 0
    while True:
        n += 1
        yield n

g = generator()
while True:
    print(next(g))  # prints 1, 2, 3, ...
Coroutines are like generators, except that they can yield to each other, as well as passing values as they yield. To quote the example on Wikipedia:
var q := new queue

coroutine produce
    loop
        while q is not full
            create some new items
            add the items to q
        yield to consume

coroutine consume
    loop
        while q is not empty
            remove some items from q
            use the items
        yield to produce
In the latest versions of Python, generators can accept values when re-invoked: instead of next(g) you say g.send(val), and then inside of g, the yield expression evaluates to val. This makes Python generators similar to coroutines. (But not quite as powerful: coroutines are supposed to be able to yield control to each other, but Python generators always yield control to their caller in particular.)
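A tiny sketch of send() (the running-total generator is invented for illustration):

def running_total():
    total = 0
    while True:
        x = yield total              # hand back the total, then wait for the next send()
        total += x

g = running_total()
next(g)                              # prime the generator: run it to the first yield
print(g.send(3))                     # 3
print(g.send(4))                     # 7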
Go has a concurrency construct called goroutines. They are coroutine-like, but the Go runtime also schedules them across multiple OS threads, making them a source of parallelism in addition to concurrency.