February 2026

Message Passing Is Shared Mutable State

What Go, Java, and Erlang reveal about the state of concurrency

Something about the way we write concurrent programs has always felt wrong to me. When I pick up a new language and look at its concurrency model I get the same uneasy feeling. The APIs change, the terminology changes, but the underlying patterns look strangely familiar.

Maybe you’ve felt this too. The tools get better, the abstractions get nicer, but the core problem never seems to go away.

Any software developer who has tackled concurrency in a serious project has the battle scars of dealing with the pitfalls of multi-threaded and concurrent programs: the touchy, often clunky APIs and synchronization mechanisms, the dread of debugging data races and deadlocks, and the brain-bending non-locality of it all.

It’s taken me a while to understand what feels so off about these models well enough to articulate it, but I think I’m finally ready. Let’s start with a somewhat recent language: Go.

The Prediction

In 2006, Edward Lee published The Problem with Threads. His argument was stark: threads are “wildly nondeterministic,” and the programmer’s job becomes pruning that nondeterminism rather than expressing computation.

But Lee went further than criticizing threads: he argued that the shared-memory vs. message-passing debate was a false dichotomy. Both approaches model concurrency as threads of execution that need to be coordinated. Switching the coordination mechanism from locks to messages doesn’t change the underlying model, it changes the syntax of failure.

At the time, this was a contrarian position, and the mainstream languages were moving the other way.

The Experiment

Three years later Go launched with a concurrency philosophy built on the opposite bet. “Do not communicate by sharing memory,” the Go documentation urged. “Instead, share memory by communicating.” Channels (typed, first-class message-passing primitives) were Go’s answer to the concurrency mess.

Go wasn’t a toy. Backed by Google, it was adopted by the infrastructure that runs the modern internet: Docker, Kubernetes, etcd, gRPC, CockroachDB. These systems are among the most heavily used Go codebases in existence and are maintained by experienced teams with extensive code review and testing practices. Tens of thousands of developers wrote concurrent code following Go’s guidance, using channels instead of mutexes or locks, sharing memory by communicating.

It was the most prominent, well-resourced, real-world test of the message-passing hypothesis the industry has ever run.

The Results

In 2019 Tengfei Tu and colleagues studied 171 real concurrency bugs across these flagship Go projects and published Understanding Real-World Concurrency Bugs in Go. The findings were striking: message-passing bugs were at least as common as shared-memory bugs.

Around 58% of blocking bugs (i.e. goroutines stuck, unable to make progress) were caused by message passing, not shared memory. The thing that was supposed to be the cure was producing the same problems as the disease.

To be clear, message passing does eliminate one important class of concurrency bugs: unsynchronized memory access. If two goroutines communicate only through channels, they cannot simultaneously mutate the same variable. But eliminating data races does not eliminate coordination failures. Deadlocks, leaks, protocol violations, and nondeterministic scheduling remain.

Go ships with a built-in deadlock detector, but it only caught 2 of the 21 blocking bugs the researchers tested. Two. The race detector fared better on non-blocking bugs, catching roughly half, which still means half the concurrency bugs in production Go code are invisible to the tools that were designed to find them.

Most of these bugs had long lifetimes. They were committed, shipped, ran in production, and weren’t discovered until someone happened to trigger the right interleaving. Testing didn’t find them, and code review didn’t find them. Instead they hid in some of the most heavily scrutinized Go codebases.

Lee’s prediction that switching the coordination mechanism wouldn’t address the root cause was confirmed.

The Code

Here’s a simplified bug in Kubernetes from the paper. A function spawns a goroutine to handle a request with a timeout:

func finishReq(timeout time.Duration) ob {
    ch := make(chan ob)
    go func() {
        result := fn()
        ch <- result  // blocks forever if timeout wins
    }()
    select {
    case result := <-ch:
        return result
    case <-time.After(timeout):
        return nil
    }
}

If fn() takes longer than the timeout then the parent returns nil and nobody ever reads from ch. The child goroutine blocks on ch <- result and will never be cleaned up. Go garbage-collects objects, but it doesn’t garbage-collect goroutines blocked on channels that will never be read.

In Kubernetes (the system managing your production container infrastructure) every one of these leaked goroutines hangs onto references and never lets the memory go. Under load, they accumulate, and the process will slowly degrade until it crashes or gets OOM-killed. This is a reliability failure in the software responsible for keeping your other software running, caused by a single missing buffer in a channel.

The fix is a single buffer slot: change make(chan ob) to make(chan ob, 1).

Now look at the same logic in Java:

BlockingQueue<Result> queue = new ArrayBlockingQueue<>(1);

new Thread(() -> {
    Result result = computeResult();
    try { queue.put(result); }   // blocks if queue is full
    catch (InterruptedException e) { }
}).start();

try {
    Result result = queue.poll(timeout, TimeUnit.SECONDS);
    if (result != null) {
        return result;
    } else {
        return null;
        // thread still running, still blocked on put()
        // queue object still holds a reference
        // nothing will ever clean this up
    }
} catch (InterruptedException e) { return null; }

No Java developer would look at this and say “I’m doing message passing.” They’d say “I’m using a shared concurrent queue,” because BlockingQueue lives in java.util.concurrent, right next to Semaphore and ReentrantLock. They’d know it carries all the risks of shared mutable state.

But this is the Go channel code. Same shared mutable data structure, same blocking semantics, same bug. If the timeout fires then nobody consumes from the queue and the producer blocks forever. The thread leaks. The structure is identical, the only thing that changed is the vocabulary.

In Java, we call this a shared concurrent queue and we understand the risks. In Go, we call it a channel and pretend it’s something different.

Message passing is often presented as an alternative to shared mutable state, but in practice it frequently reintroduces shared coordination structures under another name.

Why This Keeps Happening

Arthur O’Dwyer, writing about the paper, identified what he called the “original sin” of Go channels: they aren’t really channels at all. A true channel has two distinct endpoints, a producer end and a consumer end, with different types and capabilities. If the last consumer disappears, the runtime can detect it, unblock producers, and clean up.

A Go channel has none of this. It’s a single object, a concurrent queue, shared between however many goroutines happen to hold a reference. Any goroutine can send, and any goroutine can receive. There are no distinct endpoints, no directional typing, no way for the runtime to detect when one side is gone. It is a mutable data structure shared between multiple threads, where any thread can mutate the shared state by pushing or popping.

Once you see this, the bug categories in the study become predictable rather than surprising. Every classic failure mode of shared mutable state has a channel equivalent:

Deadlock. Goroutine A sends to a channel and waits for a response on another. Goroutine B does the reverse. Both block. This is a circular dependency on shared state, i.e. the same structure as a mutex deadlock but expressed through queues instead of locks. These issues were found in Docker, Kubernetes, and gRPC.

Leak. Nobody reads from a channel, so the sender blocks forever. The shared queue retains a reference to the goroutine, preventing cleanup. The Kubernetes bug above is this pattern: a resource leak caused by a dangling reference to shared state.

Race. If multiple goroutines read from the same channel, which one gets each message? The answer is nondeterministic: the runtime’s scheduler picks one. This is concurrent access to a shared resource, with the nondeterminism mediated by the scheduler instead of explicit locking. The paper documents these in etcd and CockroachDB.

Protocol violation. A goroutine sends a message the receiver doesn’t expect, or sends on a closed channel (which panics in Go), or closes an already-closed channel. The shared object’s implicit contract was violated, the same category of bug that shared mutable state has always produced.

Every one of these is a classic shared-mutable-state bug wearing a message-passing costume.

And this isn’t just a Go problem. Message passing as a concurrency model doesn’t eliminate shared state, it relocates it. The data being communicated may transfer cleanly from sender to receiver, but the communication mechanism itself (channel, mailbox, or message queue) is a shared mutable resource. And that resource inherits every problem shared mutable state has always had.

Even Erlang demonstrates this. Erlang processes are genuinely isolated with separate heaps, no shared references, and messages copied between processes. These are the strongest form of the message-passing guarantee available anywhere, and yet researchers found previously unknown race conditions hiding in Erlang’s own heavily-tested standard library.

The races clustered around ETS tables, Erlang’s escape hatch from pure actor isolation, which are shared mutable storage that exists because the pure actor model didn’t meet performance requirements. The safety model promised isolation, yet reality demanded a shared mutable escape hatch. The escape hatch reintroduced exactly the bugs the model was supposed to prevent.

Message passing solves concurrency bugs the way moving your mess from one room to another solves clutter.

So Now What?

When a Go programmer hits a channel deadlock and considers reaching for a mutex, they’re choosing between two approaches that fail for the same structural reason. “Go channels are fine if you use them correctly” is a true statement. So is “mutexes are fine if you use them correctly.” They’re the same statement.

Lee saw this in 2006. The shared-memory vs. message-passing debate is an argument about which coordination mechanism to use. It has never questioned whether we’re even asking the right question.

If both sides of the dichotomy fail then maybe the dichotomy itself is wrong. Maybe the problem isn’t which tool we use to coordinate concurrent execution. Maybe there’s something deeper about the foundation that both approaches share, something we haven’t questioned yet.

I think there is. Some languages have tried different foundations and attacked aspects of the problem with real insight, but none of them have fully broken through to the mainstream. It’s worth exploring why, so that’s where we’re headed.


References

  • Lee, Edward A. “The Problem with Threads.” IEEE Computer 39.5 (2006): 33–42.
  • Tu, Tengfei, et al. “Understanding Real-World Concurrency Bugs in Go.” ASPLOS 2019.
  • O’Dwyer, Arthur. “Understanding Real-World Concurrency Bugs in Go.” Blog post, June 6, 2019.
  • Christakis, Maria, and Konstantinos Sagonas. “Static Detection of Race Conditions in Erlang.” PADL 2010.