Thinking about Time

Concurrency is hard because thinking about multiple things happening at once defies our human perceptions of linear time. Despite our limited cognitive abilities, we’ve managed to get chunks of silicon to perform computations in parallel, allowing us to speed up computations.

When taking advantage of this, it’s important to understand the guarantees provided by the concurrency model you’re using and avoid making assumptions unless they’re supported by the model. Let’s take a look at an example to make things concrete.

In the banking world, accounts are constantly subject to deposits and withdrawal. A large business account may have multiple users, each using the account at different locations. Let’s see what can go wrong when a single account has to support concurrent deposits and withdrawals. Here’s some Go code for managing our bank accounts.

type Account struct {
  balance int
}

func (a *Account) Deposit(amount int) {
  currTotal := a.balance
  a.balance = currTotal + amount
}

func (a *Account) Withdraw(amount int) error {
  currTotal := a.balance
  if currTotal < amount {
    return errors.New("insufficient funds")
  }
  a.balance = currTotal - amount
  return nil
}

Pretty straight-forward, if a little verbose; but I’ve used this implementation on purpose. We increment the balance when we deposit and decrement when we withdraw (if we have enough funds). If we were to run this code as-is using multiple goroutines, we might be able to see a problem.

Let’s imagine two threads are calling Deposit at the same time. Both are operating on the same *Account, so they’re each using values stored in the same block of memory. We receive no guarantees from the Go runtime that the lines involved execute in any particular order. Below is an example of one thing that can go wrong. In the below diagram I’ve tried to show the lines executed one at a time in order as time flows down the page.

a = &Account{ balance: 100 }

Thread A:                                       Thread B:
  a.Deposit(100)                                  a.Deposit(100)
  currTotal := a.balance          // read 100  
                                                  currTotal := a.balance         // read 100
  a.balance = currTotal + amount  // write 200  
                                                  a.balance = currTotal + amount // write 200
                                                  
final balance: $200 (?!) but I deposited $100 twice!

We made two $100 deposits; but our account will only show $200. If we work at a bank and put this in production, we can expect angry calls from panicked customers.

Since we have no guarantees about the order of execution, any order is possible when goroutines are running concurrently. Each goroutine can read the current balance onto their stack, add the amount it’s been asked to deposit, and then write that amount back to memory. With this particular order of operations, that results in a lost update.

In general, any order of interleavings between two executing threads is possible. This results in an exponentially increasing number of combinations to think about. In large programs, there’s no way to check them all or even think about them, so in practice, we take mental shortcuts.

One useful tool for taking shortcuts is to think about which lines must happen-before which other lines. For instance, when using a sync.Mutex, a Lock() operation on one goroutine blocks until the corresponding Unlock() happens on another thread. Lock() always happens-before Unlock() operations. This gives us a chance to cut down on the sorts of possibilities that can occur.

Let’s look at our bank Account again and fix things. Below is the same code, but I’ve added a sync.Mutex which guards writes.

type Account struct {
  balance int
  mut sync.Mutex
}

func (a *Account) Deposit(amount int) {
  a.mut.Lock() // potentially blocks
  defer a.mut.Unlock()
  
  currTotal := a.balance
  a.balance = currTotal + amount
}

func (a *Account) Withdraw(amount int) error {
  a.mut.Lock() // potentially blocks
  defer a.mut.Unlock()
   
  currTotal := a.balance
  if currTotal < amount {
    return errors.New("insufficient funds")
  }
  a.balance = currTotal - amount
  return nil
}

Best-practice: Use a defer statement when calling Unlock(), wherever possible, to protect against panics in code. Even when you can prove your code can’t possibly panic today, later changes may introduce that possibility.

sync.Mutex implements a mututal exclusion lock. Only one goroutine is allowed to “hold” the lock at any given time. With our changes above, the first goroutine to call Lock() will be given the lock. Any further goroutines which call Lock() are blocked until Unlock() is called.

If we’ve done our jobs as programmers right, the only goroutine which can possibly call Unlock() is the goroutine which called Lock() in the first place. That’s the case for the implementation above and it should always be true in your programs as well. In fact we can prove it: since we only call Unlock() in a defer, and our code above always defers the call toUnlock() immediately after calling Lock(), the only way a thread can call Unlock() is if they previously called Lock(). Since calls to Lock() block until the goroutine holds the lock, the only way a func calling Lock()can complete is if the goroutine running the func holds the lock, which means that Unlock() is only ever called by a thread which has previously called Lock().

If you didn’t follow the proof, don’t worry, we’re about to revisit our example. Let’s look at the bad execution above again and see if it’s possible when we’ve included the locks.

a = &Account{ balance: 100 }

Thread A:                                       Thread B:
  a.Deposit(100)                                  a.Deposit(100)
  a.mut.Lock()                                   
                                                  a.mut.Lock() // blocks!
  currTotal := a.balance          // read 100  
  a.balance = currTotal + amount  // write 200
  a.mut.Unlock() // was deferred                               // unblocked!
                                                  currTotal := a.balance           // read 200
                                                  a.balance = currTotal + amount   // write 300
                                                  a.mut.Unlock()  // was deferred
                                                  
final balance: $300!

When one thread holds the lock, any other threads calling Lock() block until Unlock() is called. Calls to Lock() are also synchronized, meaning that the Go runtime ensures that calls can’t complete simultaneously. Whichever goroutine completes its call first gets the lock and is able to proceed. All other goroutines have to wait until that thread calls Unlock(), at which point only one of the waiting goroutines will be granted the lock.

So there’s a “happens-before” relationship between the Lock() and Unlock() calls in our code. The Unlock() in each goroutine must “happen before” any call to Lock() completes in another goroutine. This guarantees in our code that writes are sequential. But it also exponentially cuts down on the number of interleavings we have to think about. For any pair of goroutines, there are only two possible orders; either A gets the lock first, or B does.

When working with concurrency, it’s crucial to identify (and prove) which lines have to happen before which others. Trying to think through all the possibilities otherwise will drive you crazy.

Next time we’ll actually show some concurrent code, covering goroutines and the go keyword.