Threading Models

What do I mean by threading model?

When writing a multithreaded program, you have to consider how these threads will operate on shared data. Because this is a relatively complex thing, it's very useful to have good mental models and a set of associated rules for how to write code that avoid surprises.

One thing that I'm not focusing on here - because this is already long enough - is what the exact upside is for constraining the programming model. In general, the more constraints you have, the broader scope of consistency you get (so, less locking and less chance of seeing an inconsistent view of the world).

Object-Oriented Constructs

While data access (or more generally 'resource access') lies at the core of threading models, it's useful to describe this in terms of object-oriented constructs, so there's a natural 'scope' (the object) to which we can apply rules.

Here are some aspects worth discussing:

Can my object be called from multiple threads?
Can my object be reentered (so, same thread and same stack, or different thread and same 'logical' stack)?
How do I call an API?
How do I get a callback event from an API?

C++ Model (Free-for-all)

In C++, other than constructs such as thread_local or initialization of static locals for example, really provide no constraints on how objects may be accessed.

This means any object can refer to objects from other threads, invoke methods directly, reenter, etc. The developer is really in charge of making sure that everything works out fine.

In even medium-sized projects, however, a more constrained threading model allows us to reason over the code more easily. By constraining this free-for-all model, we can get some guarantees that simplify implementation.

Windows Message Queues

Windows has had a message queue mechanism for the longest time. It has evolved over the years and has grown very flexible, but it has historically been intertwined with window messages.

The basic model here is as follows.

Objects (windows) have thread affinity and are accessed via messages. The thread has an associated message queue that's created as needed.
Callers can post or send a message. This nomenclature will come up in other models. Sending a message will block the sender until it's processed, while posting will return after adding the message to the queue.
The receiver threads aren't interrupted or anything - they must be pumping messages by regularly inspecting and dispatching messages in a loop.
There is fanciness involved:

Queues can coalesce or synthesize messages at query time.
There is some form of broadcasting (to all top-level windows and a few other ways).
There's support for peeking.
There's support for filtering message types.
There's a window-class.
There's support for 'sending a reply' explicitly - having the receiver unblock the sender if it's waiting.
There is built-in support for marshalling for system messages.

This mechanism is great, but a window message queue often has a lot of traffic, and so you may find that you may incur some undesirable latency.

The big advantage of this model is that these objects with thread affinity need not worry about races in their state.

.NET Design Guidelines

.NET design guidelines are actually quite simple if you look past the various recommendations on using locks and such.

Make static data thread safe by default.
Do not make instance data thread safe by default.

Now, the framework itself gives you a richer set of tools.

Synchronization mechanisms: monitors, mutexes, etc.
Concurrency collections (so, instances that are thread safe).
Synchronization contexts via the SynchronizationContext family of classes. These implement a similar Send/Post mechanism as the Windows message queue.

As a historical note, things got quite a bit messier/richer in the early days when there was a focus on remoting as integral part of the programming platform, but it's been obsoleted for a while and we won't get into it here.

COM Apartments

A full discussion of how COM apartments work would take a long, long read, so instead I'll summarize here.

Objects are affinitized to an 'apartment', which is an abstraction for a synchronization context of sorts. Objects within an apartment get to call each other directly. Objects in different appartments go through a proxy, where the runtime gets to do interesting things.

'Interesting things' usually boils down to marshalling arguments and synchronize via messaging.

There are two main kinds of apartments.

Single-Threaded Apartment. If calling from a different apartment, those calls get synchronized via queuing. These apartments live in a single thread, which must dispatch messages. There are never multiple concurrent calls in the object. You may get reentrancy if someone dispatches messages or the object makes a call to another thread!
Multithreaded Apartment. There is up-to-one of these in a process, and it can run on any thread.

Objects themselves can be single-threaded (and thus need to live in a STA), free-threaded (and live in MTA), both (STA or MTA, but sticky to whatever decision is made).

As an extra bonus: agile objects are free-threaded but are recognized by the global interface table (the thing used for marshalling across apartments) as being able to be called directly from one apartment or another.

Two other apartment types are the application single-threaded apartment (ASTA), which block reentrancy, and the neutral type apartment (NTA), which piggy backs on others in a way.

For a more colorful history of apartments, Raymond's got you covered.

WebRTC Threads

The models I have discussed so far are quite general. You can have different configurations in practice depending on the objects and threads that you have at runtime.

I want to cover the WebRTC threading model in Google's library as an example of a different approach. I'm going to quote directly from the documentation:

WebRTC Native APIs use two globally available threads: the signaling thread and the worker thread. Depending on how the PeerConnection factory is created, the application can either provide those two threads or just let them be created internally.

Calls to the Stream APIs and the PeerConnection APIs will be proxied to the signaling thread, which means that an application can call those APIs from whatever thread.

All callbacks will be made on the signaling thread. The application should return the callback as quickly as possible to avoid blocking the signaling thread. Resource-intensive processes should be posted to a different thread.

So, this presents a free-threaded API ("an application can call those APIs from whatever thread"), but internally provide single-threaded guaranteed ("... will be proxied to the signaling thread ...").

In this sort of scheme, there are a limited number of threads with well-defined responsabilities, and a mechanism to simplify access (message queueing in this case). This is common on other kinds of applications; I've seen one or two other real-time media processing systems that also follow a similar scheme.

Unity Game Loop

In a similar spirit as WebRTC's, let's take a game loop example. I'll use Unity in this case to be as precise as I can, but most game engines will have a similar scheme.

Unity has a scheme similar to the following (I haven't found an authoritative source, mind you!).

A main thread that runs some core engine bits as well as the Update methods. GameObjects have affinity to it and can only be called from them.
The loop on the main thread has a well-defined execution order.
The runtime generally allows multithreaded usage (although the engine itself can't be accessed from other threads). This can make fast/deterministic unloading/reloading of AppDomains complicated though once you start to interop with managed code.

Internally, the game likely has other dedicated threads, but generally they don't affect the programming surface for game developers.

NodeJS

NodeJS uses the same threading model as that of browsers. There's a main thread that runs a largely event-driven program where event pumping is hidden from the developer, and auxiliary threads are completely isolated except for very specific, very general mechanisms (in the case of web workers, for example, via message passing of serialized objects, although you can also do a hand-off).

That's how you end up with scripts that run when loaded and thereafter only in response to callbacks and events. All objects have thread affinity and need not be made thread-safe (not for the sake of the scripts, anyway).

Because of the single thread of execution model, you'll note there are no synchronization primitives - no semaphores, no locks, no condition variables, etc.

There are variations on this scheme especially with regards to the boundaries of the isolated threads, for example Audio worklets allow you to do processing on audio packets within the audio processing pipeline, which has very low latency requirements and so things are tuned to be very lightweight.

If you happen to find yourself writing native code to run within NodeJS, you should bear the model very much in mind - you'll notice that there is quite a bit of work to be done to properly honor the threading model that scripts rely on.

libuv

libuv was written to power NodeJS, and so it's not surprising that it also reflects a programming model that is based on events, callbacks and promises ("event-driven asynchronous I/O model").

The documentation provides more details on how the event loop works (aka "the I/O loop"), such as defining the time for closing handles and such.

Again, because of the usage approach, the even loop and object handles are generally not thread-safe.

PostSharp Models

I'm including the PostSharp threading models in this post because I find it an interesting way of explicitly modeling the threading contract at the type level in a runtime like .NET, where as we saw in .NET Design Guidelines above, has at best some guidelines over largely unenforced/unverified content.

The models include common patterns like immutable, freezable, thread-affinitized, synchronized, and a few others.

In Summary

Here is a nice table with a summary of some of the aspects discussed.

Model	Object Access	Reentry	Calling	Callbacks
Native free-for-all	Direct	Yes	No constraints	No constraints
Message Queues	Messages	Possible	Send or post message	Message-based or direct invocation
.NET	Direct	Yes	No constraints	No constraints
COM	Direct or proxy	Yes (except ASTA)	Via interfaces, apartment-aware	Via interfaces, apartment-aware
Fixed (eg WebRTC)	Direct	Possible	Direct (internally queued)	On specific thread
NodeJS/libuv	Direct on loop thread	Yes	Method call or message	Only on loop thread

Happy threading!

References:

Storage class specifiers in cppreference.com, as well as initialization of non-local variables, for dicussions on the few occasions where the language cares/honors threading behavior.
Writing Thread-Safe Code with Threading Models in PostSharp documentation.
Processes, Threads, and Apartments in the COM documentation, describing the architecutre and apartment models.
Descriptions and workings of OLE threading models in the Windows SDK for even more information on how apartments interact.
WebRTC Native API, with a description of the threading model.
Using Messages and Message Queues in the Windows desktop documentation, with a description of message queue usage and behavior.
Synchronizing data for multithreading in the .NET fundamentals documentation, describing ye olde guidelines.
I/O Completion Ports, not covered here for brevity, to efficiently manage multiple asynchronous I/O requests with a pre-allocated threadpool.
Order of execution for event functions in the Unity documentation, describing in detail the game loop stages.
The Node.js Event Loop, Timers, and process.nextTick(), with a NodeJS-colored-glasses view of its event loop.
The I/O loop, with a libuv-colored-glasses view of its event loop.
Raymond's series on COM Contexts, which are above all a nifty way to yank out proxies to your objects.
Using web workers, describing the web worker usage and communication mechanism.

Tags: design

Home