Just as `Tensor` is our fundamental building block for accelerated parallel computation, most machine learning models and operations will be expressed in terms of the `Layer` protocol. `Layer` defines an interface for types that take a differentiable input, process it, and produce a differentiable output. A `Layer` can contain state, such as trainable weights.
`Layer` is a refinement of the `Module` protocol, with `Module` defining the more general case where the input to the type is not necessarily differentiable. Most components in a model will deal with differentiable inputs, but there are cases where types may need to conform to `Module` instead, such as when the input is a tensor of non-differentiable integer indices. If you create an operation that has no trainable parameters within it, you'll want to define it in terms of `ParameterlessLayer` instead of `Layer`.
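As a minimal sketch (the `Doubling` type here is hypothetical, not a layer from the library), a parameterless operation can conform to `ParameterlessLayer` and implement only `callAsFunction(_:)`:

```swift
import TensorFlow

/// A hypothetical parameterless operation that doubles its input.
/// Because it has no trainable parameters, its tangent vector is empty.
public struct Doubling: ParameterlessLayer {
    public typealias TangentVector = EmptyTangentVector

    public init() {}

    @differentiable
    public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        return input * 2
    }
}
```

With `EmptyTangentVector` as the tangent vector, an optimizer has no parameters to update for this type.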
Models themselves are often defined as `Layer`s, and are regularly composed of other `Layer`s. A model or subunit that has been defined as a `Layer` can be treated just like any other `Layer`, allowing for the construction of arbitrarily complex models from other models or subunits.
To define a custom `Layer` for a model or operation of your own, you generally will follow a template similar to this:
```swift
public struct MyModel: Layer {
    // Define your layers or other properties here.

    // A custom initializer may be desired to configure the model.
    public init() {}

    @differentiable
    public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        // Define the sequence of operations performed on model input to arrive at the output.
        return ...
    }
}
```
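Once defined, a model is applied like a function by calling it on an input tensor. A minimal usage sketch (the input shape here is arbitrary, and assumes the body of `MyModel` has been filled in):

```swift
let model = MyModel()
// Calling the model invokes callAsFunction(_:) on the input tensor.
let output = model(Tensor<Float>(zeros: [1, 10]))
```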
Trainable components of `Layer`s, such as weights and biases, as well as other `Layer`s, can be declared as properties. A custom initializer is a good place to expose customizable parameters for a model, such as a variable number of layers or the output size of a classification model. Finally, the core of the `Layer` is `callAsFunction()`, where you will define the types for the input and output as well as the transformation that takes in one and returns the other.
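For example, a small classifier built from the provided `Dense` layer might look like this sketch (the `SimpleClassifier` name and its configuration are illustrative, not from the library):

```swift
import TensorFlow

public struct SimpleClassifier: Layer {
    // Trainable sublayers declared as properties.
    public var hidden: Dense<Float>
    public var output: Dense<Float>

    // The initializer exposes customizable parameters, such as the output size.
    public init(inputSize: Int, hiddenSize: Int, classCount: Int) {
        hidden = Dense<Float>(inputSize: inputSize, outputSize: hiddenSize, activation: relu)
        output = Dense<Float>(inputSize: hiddenSize, outputSize: classCount)
    }

    @differentiable
    public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        // Sequence the input through the hidden layer, then the output layer.
        return input.sequenced(through: hidden, output)
    }
}

// Example configuration: a 784-input, 10-class model.
let classifier = SimpleClassifier(inputSize: 784, hiddenSize: 128, classCount: 10)
```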
Many common machine learning operations have been encapsulated as `Layer`s for you to use when defining models or subunits. The following is a list of the layers provided by Swift for TensorFlow, grouped by functional areas:
- Conv1D
- Conv2D
- Conv3D
- Dense
- DepthwiseConv2D
- SeparableConv1D
- SeparableConv2D
- TransposedConv1D
- TransposedConv2D
- TransposedConv3D
- ZeroPadding1D
- ZeroPadding2D
- ZeroPadding3D
- AvgPool1D
- AvgPool2D
- AvgPool3D
- MaxPool1D
- MaxPool2D
- MaxPool3D
- FractionalMaxPool2D
- GlobalAvgPool1D
- GlobalAvgPool2D
- GlobalAvgPool3D
- GlobalMaxPool1D
- GlobalMaxPool2D
- GlobalMaxPool3D
Optimizers are a key component of the training of a machine learning model, updating the model based on a calculated gradient. Ideally, these updates adjust the parameters of a model in a direction that reduces its loss.
To use an optimizer, first initialize it for a target model with appropriate training parameters:

```swift
let optimizer = RMSProp(for: model, learningRate: 0.0001, decay: 1e-6)
```
Train a model by obtaining the gradient of a loss function with respect to the model, and then update the model along that gradient using your optimizer:

```swift
optimizer.update(&model, along: gradients)
```
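Put together, a single training step might look like the following sketch, assuming a labeled batch `x` and `y` (hypothetical names) and the provided `softmaxCrossEntropy` loss:

```swift
let (loss, gradients) = valueWithGradient(at: model) { model -> Tensor<Float> in
    // Compute predictions and the loss for this batch.
    let logits = model(x)
    return softmaxCrossEntropy(logits: logits, labels: y)
}
// Apply the optimizer's update rule along the calculated gradient.
optimizer.update(&model, along: gradients)
```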
Several common optimizers are provided by Swift for TensorFlow. These include the following: