**Author**: `Jerry Zhang <https://github.com/jerryzh168>`_

FX Graph Mode Quantization requires a symbolically traceable model.
- We use the FX framework (TODO: link) to convert a symbolically traceable nn.Module instance to IR,
+ We use the FX framework to convert a symbolically traceable nn.Module instance to IR,
and we operate on the IR to execute the quantization passes.
Please post your question about symbolically tracing your model in `PyTorch Discussion Forum <https://discuss.pytorch.org/c/quantization/17>`_
@@ -22,16 +22,19 @@ You can use any combination of these options:
b. Write your own observed and quantized submodule

- ####################################################################
If the code that is not symbolically traceable does not need to be quantized, we have the following two options
to run FX Graph Mode Quantization:
- 1.a. Symbolically trace only the code that needs to be quantized
+
+
+ Symbolically trace only the code that needs to be quantized
-----------------------------------------------------------------
When the whole model is not symbolically traceable but the submodule we want to quantize is
symbolically traceable, we can run quantization only on that submodule.
+
before:

.. code:: python
+
  class M(nn.Module):
      def forward(self, x):
          x = non_traceable_code_1(x)
@@ -42,6 +45,7 @@ before:
after:

.. code:: python
+
  class FP32Traceable(nn.Module):
      def forward(self, x):
          x = traceable_code(x)
@@ -69,8 +73,7 @@ Note if original model needs to be preserved, you will have to
copy it yourself before calling the quantization APIs.

- #####################################################
- 1.b. Skip symbolically trace the non-traceable code
+ Skip symbolically trace the non-traceable code
---------------------------------------------------
When we have some non-traceable code in the module, and this part of code doesn’t need to be quantized,
we can factor out this part of the code into a submodule and skip symbolically trace that submodule.
@@ -134,8 +137,7 @@ quantization code:

If the code that is not symbolically traceable needs to be quantized, we have the following two options:

- ##########################################################
- 2.a Refactor your code to make it symbolically traceable
+ Refactor your code to make it symbolically traceable
--------------------------------------------------------
If it is easy to refactor the code and make the code symbolically traceable,
we can refactor the code and remove the use of non-traceable constructs in python.
@@ -167,15 +169,10 @@ after:
return x.permute(0, 2, 1, 3)


- quantization code:
-
This can be combined with other approaches and the quantization code
depends on the model.

-
-
- #######################################################
- 2.b. Write your own observed and quantized submodule
+ Write your own observed and quantized submodule
-----------------------------------------------------

If the non-traceable code can’t be refactored to be symbolically traceable,
@@ -207,8 +204,8 @@ non-traceable logic, wrapped in a module
  class FP32NonTraceable:
      ...

-
- 2. Define observed version of FP32NonTraceable
+ 2. Define observed version of
+ FP32NonTraceable

.. code:: python
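The "symbolically trace only the code that needs to be quantized" option described in the diff can be sketched as follows. This is a minimal illustration, not the tutorial's own code: the module bodies (the `nn.Linear` layer, the data-dependent `if` standing in for `non_traceable_code_1`) and the attribute name `traceable_submodule` are assumptions made up for the example. It only checks traceability with `torch.fx.symbolic_trace`; the actual quantization calls are left out because they depend on the model and the PyTorch version.

```python
import torch
import torch.fx
from torch import nn
from torch.fx.proxy import TraceError


class FP32Traceable(nn.Module):
    """Hypothetical traceable part: no data-dependent Python control flow."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))


class M(nn.Module):
    """Hypothetical parent module whose forward is not symbolically traceable."""

    def __init__(self):
        super().__init__()
        self.traceable_submodule = FP32Traceable()

    def forward(self, x):
        # Data-dependent branch (assumed stand-in for non_traceable_code_1):
        # FX cannot evaluate a Proxy as a Python bool.
        if x.sum() > 0:
            x = x * 2
        return self.traceable_submodule(x)


model = M().eval()
x = torch.ones(2, 4)
expected = model(x)

# Tracing the whole model fails because of the data-dependent branch.
try:
    torch.fx.symbolic_trace(model)
    whole_model_traceable = True
except TraceError:
    whole_model_traceable = False

# Tracing only the submodule succeeds; the parent keeps running eagerly.
graph_module = torch.fx.symbolic_trace(model.traceable_submodule)
model.traceable_submodule = graph_module
actual = model(x)
```

In a real workflow the traced submodule would be handed to the FX Graph Mode Quantization APIs instead of being swapped back in unchanged; the swap here only shows that the parent module can keep its non-traceable code while the submodule is handled as an FX graph.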