This repository was archived by the owner on Jul 1, 2023. It is now read-only.

[Layers] Created GRU layer #417

Merged
merged 6 commits into from
Dec 5, 2019

Conversation

dhasl002
Contributor

@dhasl002 dhasl002 commented Aug 6, 2019

Based on the fully gated unit seen below.

[Image: diagram of the fully gated GRU unit]

These are the formulas that I used.

[Image: screenshot of the GRU formulas]
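
For reference, the standard fully gated GRU formulas (the screenshot itself is not preserved; these follow the usual convention for the fully gated unit):

```latex
\begin{aligned}
z_t       &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) && \text{(update gate)} \\
r_t       &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
\hat{h}_t &= \tanh\left(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\right) && \text{(candidate state)} \\
h_t       &= (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t && \text{(new hidden state)}
\end{aligned}
```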

@dhasl002 dhasl002 changed the title from "created GRU layer" to "[Layers] Created GRU layer" on Aug 6, 2019
@Shashi456
Contributor

@dhasl002 there's already #110 on this, which was blocked by an SIL error at the time. Does this pass locally for you?

@dhasl002
Contributor Author

dhasl002 commented Aug 6, 2019

@Shashi456 This does pass locally for me

Contributor

@rxwei rxwei left a comment


Thanks. This looks great!

inputSize: Int,
hiddenSize: Int,
seed: TensorFlowSeed = Context.local.randomSeed
) {
Contributor


Suggested change:

```diff
-) {
+) {
```

public var updateBias, outputBias, resetBias: Tensor<Scalar>

@noDerivative public var stateShape: TensorShape {
TensorShape([1, updateWeight.shape[0]])
Contributor


Suggested change:

```diff
-TensorShape([1, updateWeight.shape[0]])
+[1, updateWeight.shape[0]]
```

Use literal initialization when a contextual type exists.
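
A quick illustration of the point (TensorShape conforms to ExpressibleByArrayLiteral, so the literal works wherever the contextual type is known):

```swift
@noDerivative public var stateShape: TensorShape {
    // The return type fixes the contextual type to TensorShape,
    // so a plain array literal is enough.
    [1, updateWeight.shape[0]]
}
```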

@@ -200,6 +200,75 @@ public struct LSTMCell<Scalar: TensorFlowFloatingPoint>: RNNCell {
}
}

/// A GRU cell.
public struct GRUCell<Scalar: TensorFlowFloatingPoint>: RNNCell {
public var updateWeight, updateWeight2, resetWeight, resetWeight2, outputWeight, outputWeight2: Tensor<Scalar>
Contributor


Break this into multiple var declarations so that it fits within 100 columns.
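
For illustration, a split that fits within 100 columns looks roughly like this (matching the declarations quoted later in the thread):

```swift
public var updateWeight, updateWeight2: Tensor<Scalar>
public var resetWeight, resetWeight2: Tensor<Scalar>
public var outputWeight, outputWeight2: Tensor<Scalar>
```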

/// - Returns: The hidden state.
@differentiable
public func callAsFunction(_ input: Input) -> Output {
let resetGate = sigmoid(matmul(input.input, resetWeight) + matmul(input.state.hidden, resetWeight2) + resetBias)
Contributor


Make sure all lines fit within 100 columns.

let x = Tensor<Float>(rangeFrom: 0.0, to: 0.4, stride: 0.1).rankLifted()
let inputs: [Tensor<Float>] = Array(repeating: x, count: 4)
let rnn = RNN(GRUCell<Float>(inputSize: 4, hiddenSize: 4,
seed: (0xFeed, 0xBeef)))
Contributor


Move this to the end of the previous line.
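
That is, the construction collapses to a single line:

```swift
let rnn = RNN(GRUCell<Float>(inputSize: 4, hiddenSize: 4, seed: (0xFeed, 0xBeef)))
```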

@dhasl002
Contributor Author

dhasl002 commented Aug 6, 2019

@rxwei Updated the PR. I used this style guide for line wrapping.

@differentiable
public func callAsFunction(_ input: Input) -> Output {
let resetGate = sigmoid(matmul(input.input, resetWeight) +
matmul(input.state.hidden, resetWeight2) + resetBias)
Contributor


Indent each line wrapping by 4 from the previous line. The only difference from the Google Swift Style Guide is that we use 4-space indentation instead of 2-space.

Suggested change:

```diff
-            matmul(input.state.hidden, resetWeight2) + resetBias)
+    matmul(input.state.hidden, resetWeight2) + resetBias)
```

Same for other occurrences below.
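
Applied to the reset gate, the wrapped statement would read (a sketch, ignoring the surrounding indentation):

```swift
let resetGate = sigmoid(matmul(input.input, resetWeight) +
    matmul(input.state.hidden, resetWeight2) + resetBias)
```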

Contributor Author


fixed 🙂

@@ -200,6 +200,80 @@ public struct LSTMCell<Scalar: TensorFlowFloatingPoint>: RNNCell {
}
}

/// A GRU cell.
public struct GRUCell<Scalar: TensorFlowFloatingPoint>: RNNCell {
public var updateWeight, updateWeight2: Tensor<Scalar>
Contributor


One last thing: I think it's better to rename variables that have a 2 variant to have a 1 suffix. So updateWeight1, resetWeight1, and outputWeight1. What do you think?

Contributor Author


I don't feel strongly at all, so I made the change.

Contributor


Sorry, I meant updating updateWeight to be updateWeight1 so that you'll have updateWeight1 and updateWeight2.
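
So the declarations would end up as:

```swift
public var updateWeight1, updateWeight2: Tensor<Scalar>
public var resetWeight1, resetWeight2: Tensor<Scalar>
public var outputWeight1, outputWeight2: Tensor<Scalar>
```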

Contributor Author


Haha, that makes way more sense. I agree this is an improvement. Made the changes accordingly.

Contributor


Thanks!

@dhasl002 dhasl002 force-pushed the layer_gru branch 2 times, most recently from 5936f1f to 81046d8 Compare August 7, 2019 23:24
@rxwei
Contributor

rxwei commented Aug 8, 2019

CI has been broken recently and we are looking into it. The PR looks good to me so we'll merge it when CI gets fixed.

@Shashi456
Contributor

@dhasl002 Could you pull master? The errors are from the PR changes; I guess this should pass after that.

inputSize: Int,
hiddenSize: Int,
seed: TensorFlowSeed = Context.local.randomSeed
) {
Contributor


Could you switch to using the initialization convention we use for other layers? See, for example, how the Dense layer initializers are defined.

Contributor Author


Could you be more specific? I see some small differences between the initializers, but I'm not exactly sure which conventions you are referring to.

Contributor


Sorry for the confusion. I was referring to the initialization method being used (e.g., zeros vs glorotUniform vs others). The approach followed for other layers allows the user to provide a custom initialization method for the layer parameters if they want to. You should also modify this one to support custom initialization methods.

Contributor Author


@eaplatanios Sorry, I am still confused. Could you tell me if I am understanding you correctly?

In other words, you would like me to add the weights and biases to the initializer so that someone could initialize them differently?

Contributor Author

@dhasl002 dhasl002 Nov 21, 2019


@eaplatanios @sgugger @marcrasi This is the last request on this PR. Could you give me the requested info so that we can get this merged? Thanks 😄

Contributor


Sorry I missed that comment. The requested change is to pass a weightInitializer and biasInitializer in this init, on top of the sizes, and use them (no need to take a seed then, since you can seed your initializer). Here is an example from the initialization of Dense:

```swift
init(
    inputSize: Int,
    outputSize: Int,
    activation: @escaping Activation = identity,
    weightInitializer: ParameterInitializer<Scalar> = glorotUniform(),
    biasInitializer: ParameterInitializer<Scalar> = zeros()
) {
    self.init(
        weight: weightInitializer([inputSize, outputSize]),
        bias: biasInitializer([outputSize]),
        activation: activation)
}
```
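
Translated to this PR, the GRUCell initializer would look roughly like this (a sketch using the parameter names discussed above; the exact shapes are assumptions, not the merged code):

```swift
init(
    inputSize: Int,
    hiddenSize: Int,
    weightInitializer: ParameterInitializer<Scalar> = glorotUniform(),
    biasInitializer: ParameterInitializer<Scalar> = zeros()
) {
    // Input-to-hidden weights.
    self.updateWeight1 = weightInitializer([inputSize, hiddenSize])
    self.resetWeight1 = weightInitializer([inputSize, hiddenSize])
    self.outputWeight1 = weightInitializer([inputSize, hiddenSize])
    // Hidden-to-hidden weights.
    self.updateWeight2 = weightInitializer([hiddenSize, hiddenSize])
    self.resetWeight2 = weightInitializer([hiddenSize, hiddenSize])
    self.outputWeight2 = weightInitializer([hiddenSize, hiddenSize])
    // Biases.
    self.updateBias = biasInitializer([hiddenSize])
    self.resetBias = biasInitializer([hiddenSize])
    self.outputBias = biasInitializer([hiddenSize])
}
```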

Contributor Author


@sgugger Thanks for the help! Everything should be finished now.

@saeta
Contributor

saeta commented Nov 7, 2019

Hi @dhasl002 ! Thank you so much for this PR. Do you think you might be able to make some of the changes @eaplatanios suggested? If not, no worries! -Brennan

@dhasl002
Contributor Author

> Hi @dhasl002 ! Thank you so much for this PR. Do you think you might be able to make some of the changes @eaplatanios suggested? If not, no worries! -Brennan

@saeta Thank you for reminding me, I will work on this today.

@sgugger sgugger self-assigned this Nov 14, 2019
@sgugger
Contributor

sgugger commented Nov 18, 2019

Hi again @dhasl002 . Did you have time to work on those changes? Please let us know if you have any questions or if you don't have any time for this.

@Shashi456 Shashi456 mentioned this pull request Nov 19, 2019
fixed spacing

variable renaming

correct variable renaming
Delete contents.xcworkspacedata
@sgugger sgugger merged commit f56c59e into tensorflow:master Dec 5, 2019
@sgugger
Contributor

sgugger commented Dec 5, 2019

Thanks a lot for your help!
