Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AutoDiff] Derive Differentiable.zeroTangentVectorInitializer. #31823

Merged
merged 3 commits into from
May 29, 2020

Conversation

dan-zheng
Copy link
Contributor

@dan-zheng dan-zheng commented May 15, 2020

Differentiable conformance derivation now supports
Differentiable.zeroTangentVectorInitializer.

zeroTangentVectorInitializer is a closure that produces a zero tangent vector,
capturing minimal necessary information from self.

It is an instance property, unlike the static property AdditiveArithmetic.zero,
and should be used by the differentiation transform for correctness.

Remove zeroTangentVectorInitializer dummy default implementation.
Primitive Differentiable-conforming types now must implement it.

Update stdlib Differentiable conformances and tests.

Resolves TF-1007.
Progress towards TF-1008: differentiation correctness for projection operations.


Examples:

import _Differentiation

struct Struct<T: Differentiable, U: Differentiable>: Differentiable {
  var x: T
  var y: U

  // Compiler synthesizes:
  // var zeroTangentVectorInitializer: () -> TangentVector {
  //   { [xZeroTanInit = x.zeroTangentVectorInitializer,
  //      yZeroTanInit = y.zeroTangentVectorInitializer] in
  //     return TangentVector(x: xZeroTanInit(), y: yZeroTanInit())
  //   }
  // }
}

let s = Struct(x: [1, 2, 3], y: [[4, 5, 6], [], [2]])
// `Differentiable.zeroTangentVector` default implementation calls
// `zeroTangentVectorInitializer`.
print(s.zeroTangentVector)

// Before (via dummy `zeroTangentVectorInitializer` default implementation):
// TangentVector(x: [], y: [])

// After (via derived conformances):
// TangentVector(x: [0.0, 0.0, 0.0], y: [[0.0, 0.0, 0.0], [], [0.0]])
struct CustomTangentVector<T: Differentiable, U: Differentiable>: Differentiable {
  var x: T
  var y: U

  typealias TangentVector = T.TangentVector
  mutating func move(along direction: TangentVector) {}

  // Memberwise synthesis is not possible for custom `TangentVector` type.
  // Fallback synthesis is done instead:
  // var zeroTangentVectorInitializer: () -> TangentVector {
  //   { TangentVector.zero }
  // }
}

@dan-zheng dan-zheng requested review from rxwei and marcrasi May 15, 2020 18:53
///
/// The parser typically takes care of assigning unique discriminators to
/// closures, but the parser is unavailable to this transform.
class DiscriminatorFinder : public ASTWalker {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I moved DiscriminatorFinder to a shared location lib/Sema/CodeSynthesis.h.

@dan-zheng
Copy link
Contributor Author

@CodaFi: I'll add you as reviewer in case you'd like to look at any of these derived conformances changes. lib/Sema/DerivedConformanceDifferentiable.cpp has some room for cleanup that I'd like to investigate later.

@dan-zheng dan-zheng requested a review from CodaFi May 15, 2020 19:02
@dan-zheng
Copy link
Contributor Author

@swift-ci Please test

@dan-zheng dan-zheng requested a review from rxwei May 15, 2020 22:40
Copy link
Contributor

@rxwei rxwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you derive a default implementation for zeroTangentVectorInitializer even when the user has a custom tangent vector? If you do that then this PR won't be source-breaking.

@dan-zheng
Copy link
Contributor Author

Can you derive a default implementation for zeroTangentVectorInitializer even when the user has a custom tangent vector? If you do that then this PR won't be source-breaking.

I asked offline for clarification. What @rxwei has in mind is: deriving zeroTangentVectorInitializer as { TangentVector.zero } for types defining a custom TangentVector.

@marcrasi: do you agree whether this makes sense and is desirable? I think you have experience with custom TangentVector types from borglab/SwiftFusion.

This behavior should be possible but requires more work. I'd like to reach consensus before exploring.

@marcrasi
Copy link

I have some thoughts about default synthesis of zeroTangentVectorInitializer.

The ultimate ideal behavior

  • We clarify the semantics of AdditiveArithmetic to require that static var zero always "makes sense". (Depending on how you interpret the documentation, this is already how the semantics are, and it is wrong for Tensor to conform to AdditiveArithmetic).
  • Therefore we remove the conformances of Array, Tensor, etc to AdditiveArithmetic.
  • We introduce a new protocol X that captures the concept of an additive arithmetic without zero. AdditiveArithmetic refines X.
  • We relax the Differentiable requirement that TangentVector: AdditiveArithmetic. Make it TangentVector: X instead.
  • When Differentiable synthesis sees a custom TangentVector that conforms to AdditiveArithmetic, it synthesizes a zeroTangentVectorInitializer. This is completely correct and safe because of the semantics of AdditiveArithmetic.
  • When Differentiable synthesis sees a custom TangentVector that does not conform to AdditiveArithmetic, it is unable to synthesize zeroTangentVectorInitializer.

What to do today

Yes, derive zeroTangentVectorInitializer as { TangentVector.zero } for types defining a custom TangentVector. This is consistent with the ultimate ideal behavior, and therefore a step in the right direction.

@marcrasi
Copy link

From a usability point of view

It's already pretty difficult to define a custom TangentVector. If we don't automatically synthesize zeroTangentVectorInitialzer for custom TangentVectors, then it becomes even more difficult. This is a step in the wrong direction for usability.

Therefore the usability point of view also says that, it's good to derive zeroTangentVectorInitializer as { TangentVector.zero } for types defining a custom TangentVector.

@rxwei
Copy link
Contributor

rxwei commented May 16, 2020

The root problem is that types like Tensor have dynamic ranks, so the tangent vector type of a Tensor is in fact value-dependent. I wish Swift supports things like path-dependent types one day.

@derivative(of: foo)
func _(x: T) -> (value: U, differential: @differentiable(linear) (x.TangentVector) -> value.TangentVector)

However, zeroTangentVectorInitializer() will be a required workaround in the foreseeable future...

@dan-zheng dan-zheng force-pushed the zero-tangent-vector-initializer branch from 97f6739 to cf81768 Compare May 28, 2020 02:17
@dan-zheng
Copy link
Contributor Author

Thanks for the feedback above!

Yes, derive zeroTangentVectorInitializer as { TangentVector.zero } for types defining a custom TangentVector. This is consistent with the ultimate ideal behavior, and therefore a step in the right direction.

I implemented this behavior. There are now two cases for Differentiable.zeroTangentVectorInitializer derivation:

  1. Memberwise derivation, when TangentVector type can be initialized memberwise.
  2. Fallback { TangentVector.zero } derivation.

The result is improved usability: almost no types (including "primitive" Differentiable-conforming ones) are required to define Differentiable.zeroTangentVectorInitializer.


Differentiable.zeroTangentVectorInitializer derivation is now also fixed so that synthesized closures never capture self.

import _Differentiation
struct Struct<T: Differentiable, U: Differentiable>: Differentiable {
  var x: T
  var y: U

  // Previously synthesized, incorrectly captures `self`:
  // var zeroTangentVectorInitializer: () -> TangentVector {
  //   {
  //     // Accesses to `self.x` and `self.y` cause `self` to be captured.
  //     TangentVector(x: x.zeroTangentVectorInitializer(),
  //                   y: y.zeroTangentVectorInitializer())
  //   }
  // }

  // Now synthesized, correct:
  // var zeroTangentVectorInitializer: () -> TangentVector {
  //   { [xZeroTanInit = x.zeroTangentVectorInitializer,
  //      yZeroTanInit = y.zeroTangentVectorInitializer] in
  //     TangentVector(x: xZeroTanInit(), y: yZeroTanInit())
  //   }
  // }
}

Ready for re-review.

@dan-zheng dan-zheng requested review from marcrasi and rxwei May 28, 2020 02:42
`Differentiable` conformance derivation now supports
`Differentiable.zeroTangentVectorInitializer`.

There are two potential cases:
1. Memberwise derivation: done when `TangentVector` can be initialized memberwise.
2. `{ TangentVector.zero }` derivation: done as a fallback.

`zeroTangentVectorInitializer` is a closure that produces a zero tangent vector,
capturing minimal necessary information from `self`.

It is an instance property, unlike the static property `AdditiveArithmetic.zero`,
and should be used by the differentiation transform for correctness.

Remove `Differentiable.zeroTangentVectorInitializer` dummy default implementation.

Update stdlib `Differentiable` conformances and tests.
Clean up DerivedConformanceDifferentiable.cpp cruft.

Resolves TF-1007.
Progress towards TF-1008: differentiation correctness for projection operations.
@dan-zheng dan-zheng force-pushed the zero-tangent-vector-initializer branch from cf81768 to be54faa Compare May 28, 2020 02:45
@dan-zheng
Copy link
Contributor Author

@swift-ci Please smoke test

Add test for enum with custom non-`Self` `TangentVector` type.
Fallback `{ TangentVector.zero }` synthesis is expected and correct.
@dan-zheng
Copy link
Contributor Author

@swift-ci Please smoke test

@dan-zheng
Copy link
Contributor Author

@swift-ci Please clean smoke test Linux

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants