| 1 | +--- |
| 2 | +author: "Dave Abrahams" |
| 3 | +date: "2014-04-10" |
| 4 | +--- |
| 5 | + |
| 6 | +# The Swift Array Design |
| 7 | + |
| 8 | +## Goals |
| 9 | + |
| 10 | +1. Performance equivalent to C arrays for subscript get/set of |
| 11 | + non-class element types is the most important performance goal. |
| 12 | +2. It should be possible to receive an `NSArray` from Cocoa, represent |
| 13 | + it as an `Array<AnyObject>`, and pass it right back to Cocoa as an |
| 14 | + `NSArray` in O(1) and with no memory allocations. |
| 15 | +3. Arrays should be usable as stacks, so we want amortized O(1) append |
| 16 | + and O(1) popBack. Together with goal #1, this implies a |
| 17 | + `std::vector`-like layout, with a reserved tail memory capacity that |
| 18 | + can exceed the number of actual stored elements. |
| 19 | + |
| 20 | +To achieve goals 1 and 2 together, we use static knowledge of the |
| 21 | +element type: when it is statically known that the element type is not a |
| 22 | +class, code and checks accounting for the possibility of wrapping an |
| 23 | +`NSArray` are eliminated. An `Array` of Swift value types always uses |
| 24 | +the most efficient possible representation, identical to that of |
| 25 | +`ContiguousArray`. |
| 26 | + |
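As a rough illustration of goal 3, the sketch below uses an `Array` as a stack. It assumes the current standard-library spellings `append`, `popLast`, and `reserveCapacity`; the text above calls the pop operation `popBack`.

```swift
// Using an Array as a stack: amortized O(1) push, O(1) pop.
var stack: [Int] = []

// Reserving tail capacity up front avoids reallocation while pushing;
// the array may reserve more than was requested.
stack.reserveCapacity(16)

for value in 1...5 {
    stack.append(value)        // push
}

let top = stack.popLast()      // pop; returns nil when the stack is empty
print(top as Any, stack.count, stack.capacity >= stack.count)
```

Popping does not shrink the reserved capacity, which is what keeps later pushes cheap.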
| 27 | +## Components |
| 28 | + |
| 29 | +Swift provides three generic array types, all of which have amortized |
| 30 | +O(1) growth. In this document, statements about **ArrayType** apply to |
| 31 | +all three of the components. |
| 32 | + |
| 33 | +- `ContiguousArray<Element>` is the fastest and simplest of the |
| 34 | + three\--use this when you need \"C array\" performance. The elements |
| 35 | + of a `ContiguousArray` are always stored contiguously in memory. |
| 36 | + |
| 37 | +  |
| 38 | + |
| 39 | +- `Array<Element>` is like `ContiguousArray<Element>`, but optimized |
| 40 | + for efficient conversions from Cocoa and back\--when `Element` can |
| 41 | + be a class type, `Array<Element>` can be backed by the (potentially |
| 42 | + non-contiguous) storage of an arbitrary `NSArray` rather than by a |
| 43 | + Swift `ContiguousArray`. `Array<Element>` also supports up- and |
| 44 | + downcasts between arrays of related class types. When `Element` is |
| 45 | + known to be a non-class type, the performance of `Array<Element>` is |
| 46 | + identical to that of `ContiguousArray<Element>`. |
| 47 | + |
| 48 | +  |
| 49 | + |
| 50 | +- `ArraySlice<Element>` is a subrange of some `Array<Element>` or |
| 51 | + `ContiguousArray<Element>`; it\'s the result of using slice |
| 52 | + notation, e.g. `a[7...21]` on any Swift array `a`. A slice always |
| 53 | + has contiguous storage and \"C array\" performance. Slicing an |
| 54 | + *ArrayType* is O(1) unless the source is an `Array<Element>` backed |
| 55 | + by an `NSArray` that doesn\'t supply contiguous storage. |
| 56 | + |
| 57 | + `ArraySlice` is recommended for transient computations but not for |
| 58 | + long-term storage. Since it references a sub-range of some shared |
| 59 | + backing buffer, an `ArraySlice` may artificially prolong the lifetime |
| 60 | + of elements outside the `ArraySlice` itself. |
| 61 | + |
| 62 | +  |
| 63 | + |
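The following sketch shows the slicing behavior described above, assuming the current standard-library API. The names `numbers`, `window`, and `stored` are illustrative only.

```swift
let numbers = Array(0..<100)

// Slicing yields an ArraySlice that shares the array's buffer.
let window = numbers[7...21]

// A slice keeps the indices of its base collection, so it does not start at 0.
let first = window[window.startIndex]

// Copying into a new Array gives the 15 elements their own storage,
// so the other 85 elements are no longer kept alive by this value.
let stored = Array(window)

print(window.startIndex, window.count, first, stored.count)
```

Converting a slice with `Array(_:)` before storing it long-term is one way to follow the recommendation above.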
| 64 | +## Mutation Semantics |
| 65 | + |
| 66 | +The *ArrayType*s have full value semantics via copy-on-write (COW): |
| 67 | + |
| 68 | +```swift |
| 69 | +var a = [1, 2, 3] |
| 70 | +let b = a |
| 71 | +a[1] = 42 |
| 72 | +print(b[1]) // prints "2" |
| 73 | +``` |
| 74 | + |
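One common way to get copy-on-write behavior is to check for unique ownership of the storage before each mutation. The sketch below is a simplified illustration of that idea, not the actual array implementation: `IntStorage` and `COWInts` are hypothetical names, and the real buffer types are more involved.

```swift
// Hypothetical reference-counted storage; the real buffer types differ.
final class IntStorage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

// Hypothetical value type with copy-on-write behavior.
struct COWInts {
    private var storage: IntStorage

    init(_ values: [Int]) { storage = IntStorage(values) }

    subscript(index: Int) -> Int {
        get { storage.values[index] }
        set {
            // Copy the storage only if another value still shares it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = IntStorage(storage.values)
            }
            storage.values[index] = newValue
        }
    }
}

var x = COWInts([1, 2, 3])
let y = x          // shares storage with x; no copy yet
x[1] = 42          // storage is shared, so x copies before writing
print(x[1], y[1])  // prints "42 2"
```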
| 75 | +## Bridging Rules and Terminology for all Types |
| 76 | + |
| 77 | +- Every class type or `@objc` existential (such as `AnyObject`) is |
| 78 | + **bridged** to Objective-C and **bridged back** to Swift via the |
| 79 | + identity transformation, i.e. it is **bridged verbatim**. |
| 80 | + |
| 81 | +- A type `T` that is not [bridged verbatim](#bridging-rules-and-terminology-for-all-types) |
| 82 | + can conform to `_BridgedToObjectiveC`, which specifies its conversions to |
| 83 | + and from Objective-C: |
| 84 | + |
| 85 | + ```swift |
| 86 | + protocol _BridgedToObjectiveC { |
| 87 | + associatedtype _ObjectiveCType: AnyObject |
| 88 | + func _bridgeToObjectiveC() -> _ObjectiveCType |
| 89 | + static func _forceBridgeFromObjectiveC(_: _ObjectiveCType) -> Self |
| 90 | + } |
| 91 | + ``` |
| 92 | + |
| 93 | + ### Note |
| 94 | + |
| 95 | + > Classes and `@objc` existentials shall not conform to |
| 96 | + `_BridgedToObjectiveC`, a restriction that\'s not currently |
| 97 | + enforceable at compile-time. |
| 98 | + |
| 99 | +- Some generic types (`Array<T>` in particular) bridge to |
| 100 | + Objective-C only if their element types bridge. These types conform |
| 101 | + to `_ConditionallyBridgedToObjectiveC`: |
| 102 | + |
| 103 | + ```swift |
| 104 | + protocol _ConditionallyBridgedToObjectiveC : _BridgedToObjectiveC { |
| 105 | + static func _isBridgedToObjectiveC() -> Bool |
| 106 | + static func _conditionallyBridgeFromObjectiveC(_: _ObjectiveCType) -> Self? |
| 107 | + } |
| 108 | + ``` |
| 109 | + |
| 110 | + Bridging from, or *bridging back* to, a type `T` conforming to |
| 111 | + `_ConditionallyBridgedToObjectiveC` when |
| 112 | + `T._isBridgedToObjectiveC()` is `false` is a user programming error |
| 113 | + that may be diagnosed at runtime. |
| 114 | + `_conditionallyBridgeFromObjectiveC` can be used to attempt to |
| 115 | + bridge back, and return `nil` if the entire object cannot be |
| 116 | + bridged. |
| 117 | + |
| 118 | + ### Implementation Note |
| 119 | + |
| 120 | + There are various ways to move this detection to compile-time. |
| 121 | + |
| 122 | + - For a type `T` that is not [bridged verbatim](#bridging-rules-and-terminology-for-all-types), |
| 123 | + |
| 124 | + - if `T` conforms to `_BridgedToObjectiveC` and either |
| 125 | + |
| 126 | + - `T` does not conform to `_ConditionallyBridgedToObjectiveC` |
| 127 | + - or, `T._isBridgedToObjectiveC()` |
| 128 | + |
| 129 | + then a value `x` of type `T` is **bridged** as |
| 130 | + `T._ObjectiveCType` via `x._bridgeToObjectiveC()`, and an object |
| 131 | + `y` of `T._ObjectiveCType` is **bridged back** to `T` via |
| 132 | + `T._forceBridgeFromObjectiveC(y)`. |
| 133 | + |
| 134 | + - Otherwise, `T` **does not bridge** to Objective-C. |
| 135 | + |
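To make the bridging rules concrete, here is a minimal pure-Swift sketch. `BridgedSketch` is a hypothetical stand-in for `_BridgedToObjectiveC` written in current protocol syntax. `BoxedInt`, `Meters`, `bridgeAll`, and `bridgeBackAll` are also invented for illustration, and no Objective-C runtime is involved.

```swift
// Hypothetical stand-in for `_BridgedToObjectiveC`; not the standard
// library's declaration.
protocol BridgedSketch {
    associatedtype ObjectiveCType: AnyObject
    func bridgeToObjectiveC() -> ObjectiveCType
    static func forceBridgeFromObjectiveC(_ source: ObjectiveCType) -> Self
}

// Hypothetical class playing the role of the Objective-C side.
final class BoxedInt {
    let value: Int
    init(_ value: Int) { self.value = value }
}

// A value type that is not bridged verbatim, so it supplies conversions.
struct Meters: BridgedSketch {
    var value: Int

    func bridgeToObjectiveC() -> BoxedInt { BoxedInt(value) }

    static func forceBridgeFromObjectiveC(_ source: BoxedInt) -> Meters {
        Meters(value: source.value)
    }
}

// Bridging an array visits every element, hence O(N).
func bridgeAll<T: BridgedSketch>(_ values: [T]) -> [T.ObjectiveCType] {
    values.map { $0.bridgeToObjectiveC() }
}

// Bridging back is likewise element-by-element.
func bridgeBackAll<T: BridgedSketch>(_ objects: [T.ObjectiveCType]) -> [T] {
    objects.map { T.forceBridgeFromObjectiveC($0) }
}

let objects = bridgeAll([Meters(value: 1), Meters(value: 2)])   // [BoxedInt]
let meters: [Meters] = bridgeBackAll(objects)
print(objects.count, meters.count)
```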
| 136 | +## `Array` Type Conversions |
| 137 | + |
| 138 | +From here on, this document deals only with `Array` itself, and not |
| 139 | +`ArraySlice` or `ContiguousArray`, which support a subset of `Array`\'s |
| 140 | +conversions. Future revisions will add descriptions of `ArraySlice` and |
| 141 | +`ContiguousArray` conversions. |
| 142 | + |
| 143 | +### Kinds of Conversions |
| 144 | + |
| 145 | +In these definitions, `Base` is `AnyObject` or a trivial subtype |
| 146 | +thereof, `Derived` is a trivial subtype of `Base`, and `X` conforms to |
| 147 | +`_BridgedToObjectiveC`: |
| 148 | + |
| 149 | +- **Trivial bridging** implicitly converts `[Base]` to `NSArray` in |
| 150 | + O(1). This is simply a matter of returning the Array\'s internal |
| 151 | + buffer, which is an `NSArray`. |
| 152 | + |
| 153 | +- **Trivial bridging back** implicitly converts `NSArray` to |
| 154 | + `[AnyObject]` in O(1) plus the cost of calling `copy()` on the |
| 155 | + `NSArray`.[^1] |
| 156 | + |
| 157 | +- **Implicit conversions** between `Array` types |
| 158 | + |
| 159 | + - **Implicit upcasting** implicitly converts `[Derived]` to |
| 160 | + `[Base]` in O(1). |
| 161 | + - **Implicit bridging** implicitly converts `[X]` to |
| 162 | + `[X._ObjectiveCType]` in O(N). |
| 163 | + |
| 164 | + ### Note |
| 165 | + |
| 166 | + > Either type of implicit conversion may be combined with [trivial |
| 167 | + bridging](#kinds-of-conversions) in an implicit conversion to `NSArray`. |
| 168 | + |
| 169 | +- **Checked conversions** convert `[T]` to `[U]?` in O(N) via |
| 170 | + `a as? [U]`. |
| 171 | + |
| 172 | + - **Checked downcasting** converts `[Base]` to `[Derived]?`. |
| 173 | + - **Checked bridging back** converts `[T]` to `[X]?` where |
| 174 | + `X._ObjectiveCType` is `T` or a trivial subtype thereof. |
| 175 | + |
| 176 | +- **Forced conversions** convert `[AnyObject]` or `NSArray` to `[T]` |
| 177 | + implicitly, in bridging thunks between Swift and Objective-C. |
| 178 | + |
| 179 | + For example, when a user writes a Swift method taking `[NSView]`, it |
| 180 | + is exposed to Objective-C as a method taking `NSArray`, which is |
| 181 | + force-converted to `[NSView]` when called from Objective-C. |
| 182 | + |
| 183 | + - **Forced downcasting** converts `[AnyObject]` to `[Derived]` in |
| 184 | + O(1). |
| 185 | + - **Forced bridging back** converts `[AnyObject]` to `[X]` in |
| 186 | + O(N). |
| 187 | + |
| 188 | + A forced conversion where any element fails to convert is considered |
| 189 | + a user programming error that may trap. In the case of forced |
| 190 | + downcasts, the trap may be [deferred](#deferred-checking-for-forced-downcasts) |
| 191 | + to the point where an offending element is accessed. |
| 192 | + |
| 193 | +### Note |
| 194 | + |
| 195 | +Both checked and forced downcasts may be combined with [trivial bridging |
| 196 | +back](#kinds-of-conversions) in conversions from `NSArray`. |
| 197 | + |
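The class-to-class conversions above can be tried with plain Swift classes, as in the following sketch. `Base` and `Derived` here are ordinary illustrative classes rather than Objective-C types, and the example uses the current cast spellings `as?` (checked) and `as!` (forced).

```swift
class Base {}
final class Derived: Base {}

let derived: [Derived] = [Derived(), Derived()]

// Implicit upcasting: [Derived] converts to [Base] without a cast.
let bases: [Base] = derived

// Checked downcasting yields an optional array; it is nil if any element fails.
let roundTrip = bases as? [Derived]

// One element is a plain Base, so the checked downcast is rejected.
let mixed: [Base] = [Derived(), Base()]
let rejected = mixed as? [Derived]

// Forced downcasting traps if an element is not a Derived.
let forced = bases as! [Derived]

print(roundTrip != nil, rejected == nil, forced.count)
```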
| 198 | +### Maintaining Type-Safety |
| 199 | + |
| 200 | +Both upcasts and forced downcasts raise type-safety issues. |
| 201 | + |
| 202 | +#### Upcasts |
| 203 | + |
| 204 | +TODO: this section is outdated. |
| 205 | + |
| 206 | +When up-casting a `[Derived]` to `[Base]`, a buffer of `Derived` objects |
| 207 | +can simply be `unsafeBitCast`\'ed to a buffer of elements of type |
| 208 | +`Base`\--as long as the resulting buffer is never mutated. For example, |
| 209 | +we cannot allow a `Base` element to be inserted in the buffer, because |
| 210 | +the buffer\'s destructor will destroy the elements with the (incorrect) |
| 211 | +static presumption that they have `Derived` type. |
| 212 | + |
| 213 | +Furthermore, we can\'t (logically) copy the buffer just prior to |
| 214 | +mutation, since the `[Base]` may be copied prior to mutation, and our |
| 215 | +shared subscript assignment semantics imply that all copies must observe |
| 216 | +its subscript assignments. |
| 217 | + |
| 218 | +Therefore, converting `[T]` to `[U]` is akin to resizing: the new |
| 219 | +`Array` becomes logically independent. To avoid an immediate O(N) |
| 220 | +conversion cost, and preserve shared subscript assignment semantics, we |
| 221 | +use a layer of indirection in the data structure. Further, when `T` is a |
| 222 | +subclass of `U`, the intermediate object is marked to prevent in-place |
| 223 | +mutation of the buffer; it will be copied upon its first mutation: |
| 224 | + |
| 225 | + |
| 226 | + |
| 227 | +#### Deferred Checking for Forced Downcasts |
| 228 | + |
| 229 | +In forced downcasts, if any element fails to have dynamic type |
| 230 | +`Derived`, it is considered a programming error that may cause a trap. |
| 231 | +Sometimes we can do this check in O(1) because the source holds a known |
| 232 | +buffer type. Rather than incur O(N) checking for the other cases, the |
| 233 | +new intermediate object is marked for deferred checking, and all element |
| 234 | +accesses through that object are dynamically typechecked, with a trap |
| 235 | +upon failure (except in `-Ounchecked` builds). |
| 236 | + |
| 237 | +When the resulting array is later up-cast (other than to a type that can |
| 238 | +be validated in O(1) by checking the type of the underlying buffer), the |
| 239 | +result is also marked for deferred checking. |
| 240 | + |
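The idea of deferring the element check to the point of access can be sketched as follows. `DeferredDowncastView` is a hypothetical wrapper written only to illustrate the behavior; the actual array representation uses the intermediate object described above, not this type.

```swift
// Hypothetical wrapper illustrating deferred checking; not the real Array internals.
struct DeferredDowncastView<Target: AnyObject> {
    private let elements: [AnyObject]

    // O(1): the elements are not inspected at conversion time.
    init(unchecked elements: [AnyObject]) { self.elements = elements }

    // Each access performs the dynamic type check and traps on failure.
    subscript(index: Int) -> Target {
        guard let element = elements[index] as? Target else {
            preconditionFailure("element \(index) is not a \(Target.self)")
        }
        return element
    }
}

class Animal {}
final class Dog: Animal {}

let dog = Dog()
let view = DeferredDowncastView<Dog>(unchecked: [dog, Animal()])
let firstDog = view[0]   // succeeds; view[1] would trap when accessed
```

Creating the view never inspects the offending second element, so the error only surfaces if that element is actually read.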
| 241 | +------------------------------------------------------------------------ |
| 242 | + |
| 243 | +[^1]: This `copy()` may amount to a retain if the `NSArray` is already |
| 244 | + known to be immutable. We could eventually optimize out the copy if |
| 245 | + we can detect that the `NSArray` is uniquely referenced. Our current |
| 246 | + unique-reference detection applies only to Swift objects, though. |