Skip to content

Latest commit

 

History

History
365 lines (295 loc) · 16.3 KB

6 - Memory and Concurrency Model.rst

File metadata and controls

365 lines (295 loc) · 16.3 KB

Swift Memory and Concurrency Model

The goal of this writeup is to provide a safe and efficient way to model, design, and implement concurrent applications in Swift. It is believed that it will completely eliminate data races and reduce deadlocks in swift apps, and will allow for important performance wins as well. This happens by eliminating shared mutable data, locks, and the need for most atomic memory accesses. The model is quite different from what traditional unix folks are used to though. :-)

As usual, Swift will eventually support unsafe constructs, so if it turns out that the model isn't strong enough for some particular use case, that coder can always fall back to the hunting for concurrency with arrows and bone knives instead of using the native language support. Ideally this will never be required though.

This starts by describing the basic model, then talks about issues and some possible extensions to improve the model.

Language Concepts

The unit of concurrency in Swift is an Actor. As an execution context, an actor corresponds to a lightweight thread or dispatch queue that can respond to a declared list of async messages. Each actor has its own private set of mutable data, can communicate with other actors by sending them async messages, and can lock each other to perform synchronous actions.

Only one message can be active at a time (you can also extend the model to allow read/writer locks as a refinement) in an actor. Through the type system and runtime, we make it impossible for an actor to read or change another actor's mutable data: all mutable data is private to some actor. When a message is sent from one actor to another, the argument is deep copied to ensure that two actors can never share mutable data.

To achieve this, there are three important kinds of memory object in Swift. Given a static type, it is obvious what kind it is from its definition. These kinds are:

  1. Immutable Data - Immutable data (which can have a constructor, but whose value cannot be changed after it completes) is sharable across actors, and it would make sense to unique them where possible. Immutable data can (transitively) point to other immutable data, but it isn't valid (and the compiler rejects) immutable data that is pointing to mutable data. For example:

    struct [immutable,byref] IList { data : int, next : List }
    ...
    var a = IList(1, IList(2, IList(3, nil)))
    ...
    a.data = 42 // error, can't mutate field of immutable 'IList' type!
    a.next = nil // error, same deal
    a = IList(2, IList(3, nil)) // ok, "a" itself is mutable, it now points to something new.
    a = IList(1, a) // ok
    
Strings are a very important example of (byref) immutable data.
  1. Normal mutable data - Data (objects, structs, etc) are mutable by default, and are thus local to the containing actor. Mutable data can point to immutable data with no problem, and since it is local to an actor, no synchronization or atomic accesses are ever required for any mutable data. For example:

    struct [byref] MEntry { x : int, y : int, list : IList }
    ...
    var b = MEntry(4, 2, IList(1, nil))
    b.x = 4 // ok
    b.y = 71 // ok
    b.list = nil // ok
    b.list = IList(2, nil) // ok
    b.list.data = 42 // error, can't change immutable data.
    

    As part of mutable data, it is worth pointing out that mutable "global variables" in swift are not truly global, they are local to the current actor (somewhat similar to "thread local storage", or perhaps to "an ivar on the actor"). Immutable global variables (like lookup tables) are simple immutable data just like today. Global variables with "static constructors / initializers" in the C++ sense have their constructor lazily run on the first use of the variable.

    Not having mutable shared state also prevents race conditions from turning into safety violations. Here's a description of the go issue which causes them to be unsafe when running multiple threads: http://research.swtch.com/2010/02/off-to-races.html

    Note that a highly desirable feature of the language is the ability to take an instance of a mutable type and make it immutable somehow, allowing it to be passed around by-value. It is unclear how to achieve this, but perhaps typestate can magically make it possible.

  2. Actors - In addition to being the unit of concurrency, Actors have an address and are thus memory objects. Actors themselves can have mutable fields, one actor can point to another, and mutable data can point to an actor. However, any mutable data in an actor can only be directly accessed by that actor, it isn't possible for one actor to access another actor's mutable data. Also note that actors and objects are completely unrelated: it is not possible for an object to inherit from an actor, and it is not possible for an actor to inherit from an object. Either can contain the other as an ivar though.

    Syntax for actors is TBD and not super important for now, but here is a silly multithreaded mandelbrot example, that does each pixel in "parallel", to illustrate some ideas:

    func do_mandelbrot(x : float, y : float) -> int {
      // details elided
    }
    
    actor MandelbrotCalculator {
      func compute(x : float, y : float, Driver D) {
        var num_iters = do_mandelbrot(x, y)
        D.collect_point(x, y, num_iters)
      }
    }
    
    actor Driver {
      var result : image; // result and numpoints are mutable per-actor data.
      var numpoints : int;
      func main() {
        result = new image()
        foreach i in -2.0 .. 2.0 by 0.001 {
          // Arbitrarily, create one MandelbrotCalculator for each row.
          var MC = new MandelbrotCalculator()
          foreach j in -2.0 .. 2.0 by 0.001 {
            MC.compute(i, j, self)
            ++numpoints;
          }
        }
      }
    
      func collect_point(x : float, y : float, num_iters : int) {
        result.setPoint(x, y, Color(num_iters, num_iters, num_iters))
        if (--numpoints == 0)
        draw(result)
      }
    }
    

    Though actors have mutable data (like 'result' and 'numpoints'), there is no need for any synchronization on that mutable data.

    One of the great things about this model (in my opinion) is that it gives programmers a way to reason about granularity, and the data copy/sharing issue gives them something very concrete and understandable that they can use to make design decisions when building their app. While it is a common pattern to have one class that corresponds to a thread in C++ and ObjC, this is an informal pattern -- baking this into the language with actors and giving a semantic difference between objects and actors makes the tradeoffs crisp and easy to understand and reason about.

Communicating with Actors

As the example above shows, the primary and preferred way to communicate with actors is through one-way asynchronous messages. Asynchronous message sensed are nice because they cannot block, deadlock, or have other bad effects. However, they aren't great for two things: 1) invoking multiple methods on an actor that need to be synchronized together, and 2) getting a value back from the actor.

Sending multiple messages asynchronously

With the basic approach above, you can only perform actions on actors that are built into the actor. For example, if you had an actor with two methods:

actor MyActor {
  func foo() {…}
  func bar() {…}
  func getvalue() -> double {… }
}

Then there is no way to perform a composite operation that needs to "atomically" perform foo() and bar() without any other operations getting in between. If you had code like this:

var a : MyActor = …
a.foo()
a.bar()

Then the foo/bar methods are both sent asynchronously, and (while they would be ordered with respect to each other) there is no guarantee that some other method wouldn't be run in between them. To handle this, the async block structure can be used to submit a sequence of code that is atomically run in the actor's context, e.g.:

var a : MyActor = …
async a {
  a.foo()
  a.bar()
}

This conceptually submits a closure to run in the context of the actor. If you look at it this way, an async message send is conceptually equivalent to an async block. As such, the original example was equivalent to:

var a : MyActor = …
async a { a.foo() }
async a { a.bar() }

which makes it pretty clear that the two sends are separate from each other. We could optionally require all accesses to an actor to be in an async block, which would make this behavior really clear at the cost of coding clarity.

It is worth pointing out that you can't asynchronously call a message and get its return value back. However, if the return value is ignored, a message send can be performed. For example, "a.getvalue()" would be fine so long as the result is ignored or if the value is in an explicit async block structure.

From an implementation perspective, the code above corresponds directly to GCD's dispatch_asynch on a per-actor queue.

Performing synchronous operations

Asynchronous calls are nice and define away the possibility of deadlock, but at some point you need to get a return value back and async programming is very awkward. To handle this, a 'synch' block is used. For example, the following is valid:

var x : double
synch a {
  x = a.getvalue();
}

but this is not:

var x = a.getvalue();

A synch block statement is directly related to dispatch_sync and conceptually locks the specified actor's lock/queue and performs the block within its context.

Memory Ownership Model

Within an actor there is a question of how ownership is handled. It's not in the scope of this document to say what the "one true model" is, but here are a couple of interesting observations:

  1. Automated reference counting would be much more efficient in this model than in ObjC, because the compiler statically knows whether something is mutable data or is shared. Mutable data (e.g. normal objects) can be ref counted with non-atomic reference counting, which is 20-30x faster than atomic adjustments. Actors are shared, so they'd have to have atomic ref counts, but they should be much much less common than the normal objects in the program. Immutable data is shared (and thus needs atomic reference counts) but there are optimizations that can be performed since the edges in the pointer graph can never change and cycles aren't possible in immutable data.
  2. Garbage collection for mutable data becomes a lot more attractive than in ObjC for four reasons: 1) all GC is local to an actor, so you don't need to stop the world to do a collection. 2) actors have natural local quiescent points: when they have finished servicing a message, if their dispatch queue is empty, they go to sleep. If nothing else in the CPU needs the thread, it would be a natural time to collect. 3) GC would be fully precise in swift, unlike in ObjC, no conservative stack scanning or other hacks are needed. 4) If GC is used for mutable data, it would make sense to still use reference counting for actors themselves and especially for immutable data, meaning that you'd have no "whole process" GC.
  3. Each actor can use a different memory management policy: it is completely fine for one actor to be GC and one actor to be ARC, and another to be manually malloc/freed (and thus unsafe) because actors can't reach each other's pointers. However, realistically, we will still have to pick "the right" model, because different actors can share the same code (e.g. they can instantiate the same objects) and the compiled code has to implement the model the actor wants.

Issues with this Model

There are two significant issues with this model: 1) the amount of data copying may be excessive if you have lots of messages each passing lots of mutable data that is deep copied, and 2) the awkward nature of async programming for some (common) classes of programming. For example, the "branch and rejoin" pattern in the example requires a counter to know when everyone rejoined, and we really want a "parallel for loop".

I'd advocate implementing the simple model first, but once it is there, there are several extensions that can help with these two problems:

No copy is needed for some important cases: If you can prove (through the type system) that an object graph has a single (unique) pointer to it, the pointer value can be sent in the message and nil'd out in the sender. In this way you're "transferring" ownership of the subgraph from one actor to the other. It's not fully clear how to do this though. Another similar example: if we add some way for an actor to self destruct along with a message send, then it is safe for an actor to transfer any and all of its mutable state to another actor when it destroys itself, avoiding a copy.

Getters for trivial immutable actor fields: If an actor has an ivar with an immutable type, then we can make all stores to it atomic, and allow other actors to access the ivar. Silly example:

actor Window {
  var title : string; // string is an immutable by-ref type.
  ...
}

...
var x = new Window;
print(x.title) // ok, all stores will be atomic, an (recursively) immutable data is valid in all actors, so this is fine to load.
...

Parallel for loops and other constructs that don't guarantee that each "thread" has its own non-shared mutable memory are very important and not covered by this model at all. For example, having multiple threads execute on different slices of the same array would require copying the array to temporary disjoint memory spaces to do operations, then recopy it back into place. This data copying can be awkward and reduce the benefits of parallelism to make it non- profitable.

There are multiple different ways to tackle this. We can just throw it back into the programmer's lap and tell them that the behavior is undefined if they get a race condition. This is fine for some systems levels stuff, but defeats the purpose of having a safe language and is clearly not good enough for mobile code.

Another (more aggressive) approach is to provide a parallel for loop, and use it as a hint that each iteration can be executed in parallel. It would then be up to the implementation to try to prove the safety of this (e.g. using dependence analysis), and if provable everything is good. If not provable, then the implementation would have to compile it as serial code, or use something like an STM approach to guarantee that the program is correct or the error is detected. There is much work in academia that can be tapped for this sort of thing. One nice thing about this approach is that you'd always get full parallel performance if you "disable checking", which could be done in a production build or something.

Some blue sky kinds of random thoughts

Distributed Programming - Since deep copy is part of the language and "deep copy" is so similar to "serialization", it would be easy to do a simple implementation of something like "Distributed Objects". The primary additional thing that is required is for messages sent to actors to be able to fail, which is required anyway. The granularity issues that come up are similar in these two domains.

Immutable Data w/Synch and Lazy Faulting - Not a fully baked idea, but if you're heavily using immutable data to avoid copies, a "distributed objects" implementation would suffer because it would have to deep copy all the immutable data that the receiver doesn't have, defeating the optimization. One approach to handling this is to treat this as a data synch problem, and have the client fault pieces of the immutable data subgraph in on demand, instead of eagerly copying it.

OpenCL Integration with this model could be really natural: the GPU is an inherently async device to talk to.

UNIX processes: Actors in a shared address space with no shared mutable data are related to processes in a unix app that share by communicating with mmap etc.