Commit 30f824ad, authored by Frederic

Speed up compilation of graphs with many IncSubtensor ops (the gradient of Subtensor).

Parent 471f7cdc
@@ -2116,6 +2116,14 @@ def local_IncSubtensor_serialize(node):
#print incsub_inputs, [id(i.owner.inputs[0]) for i in incsub_inputs]
# We register it in a TopoOptimizer inside the canonizer EQ optimizer.
# Otherwise, in some cases it made the EQ optimizer use 45 passes.
# With the TopoOptimizer, the EQ optimizer only uses 6 passes.
compile.optdb.register('pre_local_IncSubtensor_serialize',
                       in2out(local_IncSubtensor_serialize),
                       # Just before canonizer
                       0.99, 'fast_run')
#after priority 50 Destructive inplace operations
#gemm is the first one now, at priority 70
......
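
The registration in this diff relies on optimizers being run in ascending order of their numeric position, so that 0.99 lands just before the canonizer. The following is a minimal toy sketch of that ordering idea only. `ToyOptDB` is a hypothetical stand-in and not Theano's `optdb` API, the canonizer position of 1 is an assumption inferred from the "just before canonizer" comment, and `'gemm'` is a hypothetical entry name for the priority-70 optimizer mentioned in the comments.

```python
# Toy stand-in for a position-ordered optimizer registry.
# This is NOT Theano's optdb; it only illustrates ordering by position.
class ToyOptDB:
    def __init__(self):
        self._entries = []  # list of (position, name, tags)

    def register(self, name, position, *tags):
        # Store the entry; ordering is resolved at query time.
        self._entries.append((position, name, frozenset(tags)))

    def query(self, tag):
        """Return names of entries carrying `tag`, by ascending position."""
        selected = [(pos, name) for pos, name, tags in self._entries
                    if tag in tags]
        return [name for pos, name in sorted(selected)]


db = ToyOptDB()
db.register('canonicalize', 1, 'fast_run')  # assumed canonizer position
db.register('gemm', 70, 'fast_run')         # hypothetical name, priority 70
db.register('pre_local_IncSubtensor_serialize', 0.99, 'fast_run')

order = db.query('fast_run')
print(order)
```

Registration order does not matter in this sketch: the entry at 0.99 is returned ahead of the assumed canonizer entry because the list is sorted by position at query time.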