Commit 6f4542f8, authored Jul 27, 2015 by carriepl
Merge pull request #3083 from caglar/minor_scan_opt_optimizations
Speed up Scan optimizations
Parents: 755ba97a, 808da4a5
Showing 1 changed file with 226 additions and 164 deletions
theano/scan_module/scan_opt.py
@@ -26,13 +26,16 @@ scan_eqopt2 -> They are all global optimizer. (in2out convert local to global).
     registered. (So don't change the order we register them!)
     If we convert to local optimizer, we must convert all of them
     to local optimizer. But:
-    1) can ScanMerge be made local? Can we keep only this one global?
+    1) can ScanMerge be made local? Can we keep only this one
+    global?
     2) ScanSaveMem assert that we remove all nodes outputs,
        we need to keep this.
     3) It is ScanSaveMem suppose the the others ran before.
-       I added an assert at one place, but didn't looked for other place.
+       I added an assert at one place, but didn't looked for
+       other place.
     4) Moving this to local opt could speed up significant this opt,
-       as we pass frequently on all nodes in the graph for no good reason.
+       as we pass frequently on all nodes in the graph for no
+       good reason.
     5) We register remove_constant_* many places, as some
        opt create them and let this one clean up the mess.
        Doing it that way, make things simpler for those already
@@ -70,14 +73,16 @@ from theano.compat import OrderedDict
 from six import integer_types, iteritems
 from six.moves import xrange
 from theano.gof.opt import Optimizer
+from theano.gof.opt import pre_constant_merge, pre_greedy_local_optimizer
 from theano.gof import toolbox, DestroyHandler, InconsistencyError
 from theano.compile import optdb
 from theano.compile.function_module import deep_copy_op
 from theano.scan_module import scan_op
 from theano.scan_module import scan_utils
-from theano.scan_module.scan_utils import equal_computations, find_up, scan_args
-from theano.gof.opt import pre_constant_merge, pre_greedy_local_optimizer
+from theano.scan_module.scan_utils import equal_computations, find_up, \
+    scan_args
 
 # Logging function for sending warning or info
 _logger = logging.getLogger('theano.scan_module.scan_opt')
@@ -113,7 +118,7 @@ def remove_constants_and_unused_inputs_scan(node):
     op = node.op
     # We only need to take care of sequences and other arguments
     st = op.n_seqs
-    st += int(numpy.sum([len(x) for x in
+    st += int(sum([len(x) for x in
                op.tap_array[:(op.n_mit_mot + op.n_mit_sot)]]))
     st += op.n_sit_sot
     st += op.n_shared_outs
@@ -162,8 +167,8 @@ def remove_constants_and_unused_inputs_scan(node):
                 index = node.inputs.index(identical_seqs[0]) - 1
                 givens[op_ins[idx]] = op_ins[index]
             else:
-                nw_inner += [op_ins[idx]]
-                nw_outer += [node_inp]
+                nw_inner.append(op_ins[idx])
+                nw_outer.append(node_inp)
 
     nw_n_seqs = len(nw_inner)
     # Add outputs stuff
@@ -185,8 +190,9 @@ def remove_constants_and_unused_inputs_scan(node):
             if identical_nonseq_idx:
                 givens[nw_in] = nw_inner_nonseq[identical_nonseq_idx[0]]
             else:
-                nw_inner_nonseq += [nw_in]
-                nw_outer_nonseq += [nw_out]
+                nw_inner_nonseq.append(nw_in)
+                nw_outer_nonseq.append(nw_out)
 
     nw_inner.extend(nw_inner_nonseq)
     nw_outer.extend(nw_outer_nonseq)
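Review note: the `+= [x]` to `.append(x)` rewrites in the hunks above should not change behavior. A minimal sketch of the equivalence follows; the function and variable names are illustrative and do not come from `scan_opt.py`.

```python
# Illustrative only: both forms mutate the list in place and give the same
# result; append() simply avoids building a temporary one-element list.
def collect_with_plus(items):
    out = []
    for x in items:
        out += [x]
    return out


def collect_with_append(items):
    out = []
    for x in items:
        out.append(x)
    return out


sample = ["a", "b", "c"]
assert collect_with_plus(sample) == collect_with_append(sample)
```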
@@ -205,7 +211,10 @@ def remove_constants_and_unused_inputs_scan(node):
 # This is a global opt for historical reason
 # It should be possible to change it to a local opt.
 class PushOutNonSeqScan(gof.Optimizer):
+    """
+    A global optimizer for pushing out the variables inside the scan that
+    are not used by the scan.
+    """
 
     def __init__(self):
         gof.Optimizer.__init__(self)
@@ -219,57 +228,78 @@ class PushOutNonSeqScan(gof.Optimizer):
                 self.process_node(fgraph, node)
 
     def process_node(self, fgraph, node):
+        """
+        IMPORTANT NOTE: This function uses set and dictionary data structures.
+        By default they are not ordered for efficiency reasons. Take care
+        and make sure of changing them with their Ordered counterparts if you
+        need to iterate over these variables.
+        """
         # this flag tells if there was any change during the last iterations
-        changed = True
         clean_inputs, clean_outputs = scan_utils.reconstruct_graph(
             node.op.inputs, node.op.outputs)
 
-        local_fgraph = gof.FunctionGraph(clean_inputs, clean_outputs, clone=False)
-        max_iterations = 2 * len(local_fgraph.toposort()) + 3
-        counts = 0
-        to_remove = []
-        to_replace = []
+        local_fgraph = gof.FunctionGraph(clean_inputs,
+                                         clean_outputs,
+                                         clone=False)
+        local_fgraph_topo = local_fgraph.toposort()
+        local_fgraph_outs_set = set(local_fgraph.outputs)
+        local_fgraph_outs_map = dict([(v, k) for k, v in \
+                                      enumerate(local_fgraph.outputs)])
+
+        to_remove_set = set()
+        to_replace_set = set()
+        to_replace_map = OrderedDict()
+        nto_replace = 0
+
+        def add_to_replace(y):
+            to_replace_set.add(y)
+            to_replace_map[y] = add_to_replace.n
+            add_to_replace.n += 1
+        add_to_replace.n = 0
+
         replace_with_in = []
         replace_with_out = []
         op = node.op
         # Construct the list of non_sequences to simplify a few things
         inner_non_seqs = op.inner_non_seqs(clean_inputs)
+        inner_non_seqs_set = set(inner_non_seqs)
+        inner_non_seqs_map = dict([(v, k) for k, v in enumerate(inner_non_seqs)])
+
         outer_non_seqs = op.outer_non_seqs(node.inputs)
         inner_seqs = op.inner_seqs(clean_inputs)
         outer_seqs = op.outer_seqs(node.inputs)
         assert len(inner_non_seqs) == len(outer_non_seqs)
         assert len(inner_seqs) == len(outer_seqs)
-        while changed and counts < max_iterations:
-            counts += 1
-            changed = False
-
-            for nd in local_fgraph.toposort():
-                if (numpy.all([(x in inner_non_seqs) or
-                               (x.owner in to_remove) or
-                               isinstance(x, tensor.Constant)
-                               for x in nd.inputs]) and
-                        # we can do this because the assumption is that a
-                        # viewOp or deepCopyOp will be just at the end of the
-                        # function and not somewhere in the middle ..
-                        not isinstance(nd.op, theano.compile.ViewOp) and
-                        not isinstance(nd.op, theano.compile.DeepCopyOp) and
-                        # and we didn't already looked at this node
-                        not nd in to_remove):
-
-                    # We have a candidate node to removable
-                    # Step 1. Reconstruct it on outside
-                    to_remove.append(nd)
-                    outside_ins = []
-                    for x in nd.inputs:
-                        if x in inner_non_seqs:
-                            _idx = inner_non_seqs.index(x)
-                            outside_ins += [outer_non_seqs[_idx]]
-                        elif x in to_replace:
-                            outside_ins += [
-                                replace_with_out[to_replace.index(x)]]
-                        elif isinstance(x, theano.Constant):
-                            outside_ins += [x.clone()]
-                        else:
-                            raise Exception(
-                                ('Error in the `scan_pushout_non_seq_'
+
+        for nd in local_fgraph_topo:
+            if (    # we haven't already looked at this node
+                    nd not in to_remove_set and
+                    all([((x in inner_non_seqs_set) or
+                          (x.owner in to_remove_set) or
+                          isinstance(x, tensor.Constant))
+                         for x in nd.inputs]) and
+                    # we can do this because the assumption is that a
+                    # viewOp or deepCopyOp will be just at the end of the
+                    # function and not somewhere in the middle ..
+                    not isinstance(nd.op, theano.compile.ViewOp) and
+                    not isinstance(nd.op, theano.compile.DeepCopyOp)):
+
+                # We have a candidate node to removable
+                # Step 1. Reconstruct it on outside
+                to_remove_set.add(nd)
+                outside_ins = []
+                for x in nd.inputs:
+                    if x in inner_non_seqs_set:
+                        _idx = inner_non_seqs_map[x]
+                        outside_ins.append(outer_non_seqs[_idx])
+                    elif x in to_replace_set:
+                        outside_ins.append(
+                            replace_with_out[to_replace_map[x]])
+                    elif isinstance(x, theano.Constant):
+                        outside_ins.append(x.clone())
+                    else:
+                        raise Exception(
+                            ('Error in the `scan_pushout_non_seq_'
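Review note: the core of the speed-up in this hunk is replacing `x in some_list` and `some_list.index(x)` (both linear scans) with a set for membership and a dict for positions. The sketch below mirrors the `add_to_replace` helper, including the counter kept as a function attribute. `make_registry` and the string items are hypothetical stand-ins, and the equivalence with the old list-based code assumes each item is registered only once, which appears to hold in the diff because each output `y` is added a single time.

```python
from collections import OrderedDict


def make_registry():
    """Return (add, members, index_of), mirroring to_replace_set / to_replace_map."""
    members = set()
    index_of = OrderedDict()

    def add(item):
        members.add(item)
        index_of[item] = add.n  # position the item would have had in a list
        add.n += 1
    add.n = 0  # counter stored as a function attribute, as in the diff
    return add, members, index_of


add, members, index_of = make_registry()
old_list = []
for name in ["y0", "y1", "y2"]:
    old_list.append(name)  # old approach: plain list
    add(name)              # new approach: set plus ordered map

# Constant-time lookups agree with the linear-time list operations.
assert all(index_of[n] == old_list.index(n) for n in old_list)
assert ("y1" in members) == ("y1" in old_list)
```

Keeping the counter on the function object avoids a `nonlocal` declaration, which is not available in Python 2, the version this code base also targeted through `six`.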
@@ -286,39 +316,36 @@ class PushOutNonSeqScan(gof.Optimizer):
-                    # Step 2. Create variables for replacements
-                    for idx, y in enumerate(nd.outputs):
-                        y_place_holder = scan_utils.safe_new(y, '_replace')
-                        to_replace += [y]
-                        replace_with_in += [y_place_holder]
-                        assert type(y) == type(nw_outer_node.outputs[idx])
-                        replace_with_out += [nw_outer_node.outputs[idx]]
-                    changed = True
-        if counts >= max_iterations:
-            raise Exception('Error in the `scan_pushout_non_seq_operations`.'
-                            ' The optimization exhausted the maximal number '
-                            'of iterations allowed!')
+                # Step 2. Create variables for replacements
+                for idx, y in enumerate(nd.outputs):
+                    y_place_holder = scan_utils.safe_new(y, '_replace')
+                    add_to_replace(y)
+                    replace_with_in.append(y_place_holder)
+                    assert isinstance(y, type(nw_outer_node.outputs[idx]))
+                    replace_with_out.append(nw_outer_node.outputs[idx])
+
         # We need to check all candidate replacements and choose those that
         # make sense for us
         # Step 1. which elements of `to_replace` are used by remaining
         # components of the inner function
         clean_to_replace = []
         clean_replace_with_in = []
         clean_replace_with_out = []
-        existent_nodes = [nd for nd in local_fgraph.toposort()
-                          if nd not in to_remove]
-        to_keep = []
+        existent_nodes = [nd for nd in local_fgraph_topo
+                          if nd not in to_remove_set]
+        existent_nodes_set = set(existent_nodes)
+
+        to_keep_set = set([])
         for nd in existent_nodes:
-            to_keep += nd.inputs
-        for idx, out in enumerate(to_replace):
-            if (out in to_keep
-                and out.owner not in existent_nodes
-                # If types are different, conversion Op will be inserted,
-                # and it may trigger an infinite loop.
-                and replace_with_in[idx].type == out.type):
-                clean_to_replace += [out]
-                clean_replace_with_in += [replace_with_in[idx]]
-                clean_replace_with_out += [replace_with_out[idx]]
+            to_keep_set.update(nd.inputs)
+
+        for out, idx in to_replace_map.items():
+            if (# If types are different, conversion Op will be inserted,
+                # and it may trigger an infinite loop.
+                replace_with_in[idx].type == out.type and
+                out in to_keep_set and
+                out.owner not in existent_nodes_set):
+                clean_to_replace.append(out)
+                clean_replace_with_in.append(replace_with_in[idx])
+                clean_replace_with_out.append(replace_with_out[idx])
 
         if len(clean_to_replace) > 0:
             # We can finally put an end to all this madness
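Review note: the loop header changes from `for idx, out in enumerate(to_replace)` to `for out, idx in to_replace_map.items()`. Because `to_replace_map` is an `OrderedDict`, the iteration order should match the old list order. A small sketch with placeholder string values (not Theano variables), assuming unique keys:

```python
from collections import OrderedDict

to_replace = ["a", "b", "c"]  # old representation: a list
# New representation: item -> insertion index, in insertion order.
to_replace_map = OrderedDict((out, idx) for idx, out in enumerate(to_replace))

old_pairs = [(idx, out) for idx, out in enumerate(to_replace)]
new_pairs = [(idx, out) for out, idx in to_replace_map.items()]

# Same (index, item) pairs in the same order.
assert old_pairs == new_pairs
```

A plain `dict` would not have guaranteed this order on the Python versions in use in 2015, which is presumably why the docstring added in this commit warns about using the ordered counterparts when iterating.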
@@ -331,12 +358,13 @@ class PushOutNonSeqScan(gof.Optimizer):
                 if isinstance(repl_out, theano.Constant):
                     repl_in = repl_out.clone()
                 else:
-                    nw_inner += [repl_in]
-                    nw_outer += [repl_out]
+                    nw_inner.append(repl_in)
+                    nw_outer.append(repl_out)
                 givens[to_repl] = repl_in
 
             _op_outs = scan_utils.clone(clean_outputs,
                                         replace=givens)
             _op_ins = clean_inputs + nw_inner
             op_ins, op_outs = scan_utils.reconstruct_graph(_op_ins, _op_outs)
             # Reconstruct node
@@ -351,14 +379,14 @@ class PushOutNonSeqScan(gof.Optimizer):
                 remove=[node],
                 reason='scanOp_pushout_nonseqs_ops')
             return True
-        elif to_keep == []:
+        elif not to_keep_set:
             # Nothing in the inner graph should be kept
             replace_with = OrderedDict()
-            for idx, out in enumerate(to_replace):
-                if out in local_fgraph.outputs:
-                    x = node.outputs[local_fgraph.outputs.index(out)]
+            for out, idx in to_replace_map.items():
+                if out in local_fgraph_outs_set:
+                    x = node.outputs[local_fgraph_outs_map[out]]
                     y = replace_with_out[idx]
-                    shape = [y.shape[idx] for idx in xrange(y.ndim)]
+                    shape = [shp for shp in y.shape]
                     replace_with[x] = tensor.alloc(y,
                                                    node.inputs[0],
                                                    *shape)
@@ -379,7 +407,10 @@ class PushOutNonSeqScan(gof.Optimizer):
 # This is a global opt for historical reason
 # It should be possible to change it to a local opt.
 class PushOutSeqScan(gof.Optimizer):
+    """
+    A global optimizer for pushing out the input variables that are not being
+    used inside the scan and provided in the sequences.
+    """
 
     def __init__(self):
         gof.Optimizer.__init__(self)
@@ -393,57 +424,77 @@ class PushOutSeqScan(gof.Optimizer):
                 self.process_node(fgraph, node)
 
     def process_node(self, fgraph, node):
+        """
+        IMPORTANT NOTE: This function uses set and dictionary data structure.
+        By default they are not ordered for efficiency reasons. Take care
+        and make sure of changing them to Ordered versions if you need to
+        iterate over those variables.
+        """
         # this flag tells if there was any change during the last iterations
-        changed = True
         clean_inputs, clean_outputs = scan_utils.reconstruct_graph(
             node.op.inputs, node.op.outputs)
 
-        local_fgraph = gof.FunctionGraph(clean_inputs, clean_outputs, clone=False)
-        max_iterations = 2 * len(local_fgraph.toposort()) + 3
-        counts = 0
-        to_remove = []
-        to_replace = []
+        local_fgraph = gof.FunctionGraph(clean_inputs, clean_outputs,
+                                         clone=False)
+        local_fgraph_topo = local_fgraph.toposort()
+        local_fgraph_outs_set = set(local_fgraph.outputs)
+        local_fgraph_outs_map = dict([(v, k) for k, v in \
+                                      enumerate(local_fgraph.outputs)])
+
+        to_remove_set = set()
+        to_replace_set = set()
+        to_replace_map = OrderedDict()
+        nto_replace = 0
+
+        def add_to_replace(y):
+            to_replace_set.add(y)
+            to_replace_map[y] = add_to_replace.n
+            add_to_replace.n += 1
+        add_to_replace.n = 0
+
         replace_with_in = []
         replace_with_out = []
         op = node.op
         # Construct the list of non_sequences to simplify a few things
         inner_non_seqs = op.inner_non_seqs(clean_inputs)
+        inner_non_seqs_set = set(inner_non_seqs)
+        inner_non_seqs_map = dict([(v, k) for k, v in enumerate(inner_non_seqs)])
+
         outer_non_seqs = op.outer_non_seqs(node.inputs)
         inner_seqs = op.inner_seqs(clean_inputs)
+        inner_seqs_set = set(inner_seqs)
+        inner_seqs_map = dict([(v, k) for k, v in enumerate(inner_seqs)])
+
         outer_seqs = op.outer_seqs(node.inputs)
         assert len(inner_non_seqs) == len(outer_non_seqs)
         assert len(inner_seqs) == len(outer_seqs)
-        while changed and counts < max_iterations:
-            counts += 1
-            changed = False
-
-            for nd in local_fgraph.toposort():
-                if (isinstance(nd.op, theano.tensor.Elemwise) and
-                    numpy.all([(x in inner_non_seqs) or
-                               (x.owner in to_remove) or
-                               isinstance(x, tensor.Constant) or
-                               (x in inner_seqs)
-                               for x in nd.inputs]) and
-                        not nd in to_remove):
-                    to_remove.append(nd)
-                    outside_ins = []
-                    depends_on_seqs = False
-                    for x in nd.inputs:
-                        if x in inner_non_seqs:
-                            _idx = inner_non_seqs.index(x)
-                            outside_ins += [outer_non_seqs[_idx]]
-                        elif x in inner_seqs:
-                            outside_ins += [outer_seqs[inner_seqs.index(x)]]
-                            depends_on_seqs = True
-                        elif x in to_replace:
-                            outside_ins += [replace_with_out[
-                                to_replace.index(x)]]
-                            depends_on_seqs = True
-                        elif isinstance(x, theano.Constant):
-                            outside_ins += [x.clone()]
-                        else:
-                            raise Exception(
-                                ('Error in the `scan_pushout_seq_'
+
+        for nd in local_fgraph_topo:
+            if (nd not in to_remove_set and
+                all([(x in inner_non_seqs_set) or
+                     (x.owner in to_remove_set) or
+                     isinstance(x, tensor.Constant) or
+                     (x in inner_seqs_set)
+                     for x in nd.inputs]) and
+                    isinstance(nd.op, theano.tensor.Elemwise)):
+                to_remove_set.add(nd)
+                outside_ins = []
+                depends_on_seqs = False
+                for x in nd.inputs:
+                    if x in inner_non_seqs_set:
+                        _idx = inner_non_seqs_map[x]
+                        outside_ins.append(outer_non_seqs[_idx])
+                    elif x in inner_seqs_set:
+                        outside_ins.append(outer_seqs[inner_seqs_map[x]])
+                        depends_on_seqs = True
+                    elif x in to_replace_set:
+                        outside_ins.append(replace_with_out[
+                            to_replace_map[x]])
+                        depends_on_seqs = True
+                    elif isinstance(x, theano.Constant):
+                        outside_ins.append(x.clone())
+                    else:
+                        raise Exception(
+                            ('Error in the `scan_pushout_seq_'
@@ -466,24 +517,22 @@ class PushOutSeqScan(gof.Optimizer):
-                    # Step 2. Create variables for replacements
-                    for idx, y in enumerate(nd.outputs):
-                        y_place_holder = scan_utils.safe_new(y, '_replace')
-                        to_replace += [y]
-                        replace_with_in += [y_place_holder]
-                        replace_with_out += [nw_outer_node.outputs[idx]]
-                    changed = True
-
-                elif (isinstance(nd.op, theano.tensor.DimShuffle) and
-                      (nd.inputs[0] in inner_seqs or
-                       nd.inputs[0].owner in to_remove) and
-                      not nd in to_remove):
-
-                    to_remove.append(nd)
-                    x = nd.inputs[0]
-                    if x in inner_seqs:
-                        outside_ins = outer_seqs[inner_seqs.index(x)]
-                    elif x in to_replace:
-                        outside_ins = replace_with_out[to_replace.index(x)]
-                    new_ord = (0,)
-                    for old_ord in nd.op.new_order:
-                        if (old_ord == 'x'):
+                # Step 2. Create variables for replacements
+                for idx, y in enumerate(nd.outputs):
+                    y_place_holder = scan_utils.safe_new(y, '_replace')
+                    add_to_replace(y)
+                    replace_with_in.append(y_place_holder)
+                    replace_with_out.append(nw_outer_node.outputs[idx])
+
+            elif (nd not in to_remove_set and
+                  isinstance(nd.op, theano.tensor.DimShuffle) and
+                  (nd.inputs[0] in inner_seqs_set or
+                   nd.inputs[0].owner in to_remove_set)):
+
+                to_remove_set.add(nd)
+                x = nd.inputs[0]
+                if x in inner_seqs_set:
+                    outside_ins = outer_seqs[inner_seqs_map[x]]
+                elif x in to_replace_set:
+                    outside_ins = replace_with_out[to_replace_map[x]]
+                new_ord = (0,)
+                for old_ord in nd.op.new_order:
+                    if (old_ord == 'x'):
@@ -493,43 +542,42 @@ class PushOutSeqScan(gof.Optimizer):
-                    new_outer = outside_ins.dimshuffle(new_ord)
-                    y = nd.outputs[0]
-                    y_place_holder = scan_utils.safe_new(y, '_replace')
-                    to_replace += [y]
-                    replace_with_in += [y_place_holder]
-                    replace_with_out += [new_outer]
-
-                    if hasattr(new_outer.tag, "test_value"):
-                        new_sh = new_outer.tag.test_value.shape
-                        ref_sh = (outside_ins.tag.test_value.shape[0],)
-                        ref_sh += nd.outputs[0].tag.test_value.shape
-                        assert new_sh == ref_sh
-                    changed = True
-        if counts >= max_iterations:
-            raise Exception('Error in the `scan_pushout_seq_operations`.'
-                            ' The optimization exhausted the maximal number '
-                            'of iterations allowed!')
+                new_outer = outside_ins.dimshuffle(new_ord)
+                y = nd.outputs[0]
+                y_place_holder = scan_utils.safe_new(y, '_replace')
+                add_to_replace(y)
+                replace_with_in.append(y_place_holder)
+                replace_with_out.append(new_outer)
+
+                if hasattr(new_outer.tag, "test_value"):
+                    new_sh = new_outer.tag.test_value.shape
+                    ref_sh = (outside_ins.tag.test_value.shape[0],)
+                    ref_sh += nd.outputs[0].tag.test_value.shape
+                    assert new_sh == ref_sh
+
         # We need to check all candidate replacements and choose those that
         # make sense for us
         # Step 1. which elements of `to_replace` are used by remaining
         # components of the inner function
         clean_to_replace = []
         clean_replace_with_in = []
         clean_replace_with_out = []
-        existent_nodes = [nd for nd in local_fgraph.toposort()
-                          if nd not in to_remove]
-        to_keep = []
+        existent_nodes = [nd for nd in local_fgraph_topo
+                          if nd not in to_remove_set]
+        existent_nodes_set = set(existent_nodes)
+
+        to_keep_set = set([])
         for nd in existent_nodes:
-            to_keep += nd.inputs
-        for idx, out in enumerate(to_replace):
-            if (out in to_keep
-                and out.owner not in existent_nodes
+            to_keep_set.update(nd.inputs)
+
+        for out, idx in to_replace_map.items():
+            if (out in to_keep_set
+                and out.owner not in existent_nodes_set
                 # If types are different, conversion Op will be inserted,
                 # and it may trigger an infinite loop.
                 and replace_with_in[idx].type == out.type):
-                clean_to_replace += [out]
-                clean_replace_with_in += [replace_with_in[idx]]
-                clean_replace_with_out += [replace_with_out[idx]]
+
+                clean_to_replace.append(out)
+                clean_replace_with_in.append(replace_with_in[idx])
+                clean_replace_with_out.append(replace_with_out[idx])
 
         if len(clean_to_replace) > 0:
             # We can finally put an end to all this madness
@@ -542,8 +590,9 @@ class PushOutSeqScan(gof.Optimizer):
                 if isinstance(repl_out, theano.Constant):
                     repl_in = repl_out.clone()
                 else:
-                    nw_inner += [repl_in]
-                    nw_outer += [repl_out]
+                    nw_inner.append(repl_in)
+                    nw_outer.append(repl_out)
                 givens[to_repl] = repl_in
 
             _op_outs = scan_utils.clone(clean_outputs,
@@ -563,14 +612,14 @@ class PushOutSeqScan(gof.Optimizer):
                 remove=[node],
                 reason='scanOp_pushout_seqs_ops')
             return True
-        elif (to_keep == [] and
+        elif (not to_keep_set and
               not op.as_while and
               not op.outer_mitmot(node)):
             # Nothing in the inner graph should be kept
             replace_with = OrderedDict()
-            for idx, out in enumerate(to_replace):
-                if out in local_fgraph.outputs:
-                    x = node.outputs[local_fgraph.outputs.index(out)]
+            for out, idx in to_replace_map.items():
+                if out in local_fgraph_outs_set:
+                    x = node.outputs[local_fgraph_outs_map[out]]
                     _y = replace_with_out[idx]
                     ls = local_fgraph.outputs
                     if out in op.inner_mitsot_outs(ls):
@@ -601,10 +650,9 @@ class PushOutSeqScan(gof.Optimizer):
 class PushOutScanOutput(gof.Optimizer):
     """
-    This optimization can push operations performed at the end of the inner
-    graph of scan to outside of scan
+    This is an optimization that can push operations performed
+    at the end of the inner graph of scan to outside of scan.
     """
 
     def __init__(self):
         gof.Optimizer.__init__(self)
@@ -631,19 +679,17 @@ class PushOutScanOutput(gof.Optimizer):
         # Use scan_args to parse the inputs and outputs of scan for ease of
         # use
         args = scan_args(node.inputs, node.outputs,
-                         node.op.inputs, node.op.outputs, node.op.info)
+                         op.inputs, op.outputs, op.info)
 
         local_fgraph = gof.FunctionGraph(args.inner_inputs,
                                          args.inner_outputs,
                                          clone=False)
 
         new_scan_node = None
-        for nd in local_fgraph.toposort():
+        local_fgraph_topo = local_fgraph.toposort()
+        for nd in local_fgraph_topo:
             if (isinstance(nd.op, theano.tensor.Dot) and
                 nd.out in args.inner_out_nit_sot):
                 """
                 The following optimization involves pushing out, after the
                 scan, a Dot whose output is nitsot (not feed back to the inner
@@ -719,7 +765,8 @@ class PushOutScanOutput(gof.Optimizer):
                 # Modify the outer graph to add the outer Dot
                 fgraph.replace_all([
                     (new_scan_args.outer_out_nit_sot[dot_out_nitsot_idx],
                      outer_dot_output)],
                     reason="scanOp_pushout_output")
@@ -743,7 +790,8 @@ class PushOutScanOutput(gof.Optimizer):
                     # otherwise doing a Dot in the outer graph will only
                     # duplicate computation.
                     sitsot_in_idx = nd.inputs.index(args.inner_in_sit_sot[sitsot_idx])
                     dot_in_idx = 1 - sitsot_in_idx  # 0 if sitsot_in_idx==1,
                                                     # 1 if sitsot_in_idx==0
@@ -754,8 +802,10 @@ class PushOutScanOutput(gof.Optimizer):
                         len(dot_input.clients) == 1 and
                         dot_input.owner.inputs[0].ndim == 2 and
                         dot_input.owner.inputs[1].ndim == 2 and
-                        self.get_outer_ndim(dot_input.owner.inputs[0], args) == 3 and
-                        self.get_outer_ndim(dot_input.owner.inputs[1], args) == 3):
+                        self.get_outer_ndim(dot_input.owner.inputs[0], args) \
+                        == 3 and
+                        self.get_outer_ndim(dot_input.owner.inputs[1], args) \
+                        == 3):
 
                         # The optimization can be be applied in this case.
@@ -764,7 +814,8 @@ class PushOutScanOutput(gof.Optimizer):
                         inner_dot_inputs = nd.inputs[dot_in_idx].owner.inputs
                         (outer_dot_inputs,
                          new_scan_node,
-                         new_scan_args) = self.push_out_inner_vars(fgraph,
+                         new_scan_args) = \
+                            self.push_out_inner_vars(fgraph,
                                                      inner_dot_inputs,
                                                      node, args)
@@ -777,20 +828,23 @@ class PushOutScanOutput(gof.Optimizer):
                                      outdim=2)
     shape_input1 = theano.tensor.shape(outer_dot_inputs[1])
-    outer_dot_inputs[1] = outer_dot_inputs[1].reshape((shape_input1[0] *
+    outer_dot_inputs[1] = \
+        outer_dot_inputs[1].reshape((shape_input1[0] *
                                      shape_input1[1],
                                      shape_input1[2]))
     # Perform the dot on the newly obtained matrices and
     # add the initial value
     outer_dot_output = theano.tensor.dot(*outer_dot_inputs)
-    init_value = new_scan_args.outer_in_sit_sot[sitsot_idx][0]
+    init_value = \
+        new_scan_args.outer_in_sit_sot[sitsot_idx][0]
     replacement = outer_dot_output + init_value
     # Alter the outer graph to use the output of the
     # external Dot instead of the output of scan
     # Modify the outer graph to add the outer Dot
-    outer_sitsot = new_scan_args.outer_out_sit_sot[sitsot_idx]
+    outer_sitsot = \
+        new_scan_args.outer_out_sit_sot[sitsot_idx]
     subtensor_node = outer_sitsot.clients[0][0]
     outer_sitsot_last_step = subtensor_node.outputs[0]
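Review note on the hunk above: replacing the per-step Dot inside scan with one outer Dot rests on the identity sum_t dot(A[t], B[t]) == dot(A_flat, B_flat), where B_flat stacks the steps of B row-wise (the reshape shown in this hunk) and A_flat concatenates the steps of A column-wise. The flattening of the first operand sits outside the lines shown here, so its exact form is an assumption. A minimal pure-Python sketch with made-up data; `matmul`, `matadd`, `A` and `B` are illustrative names, not part of scan_opt.py:

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def matadd(a, b):
    """Element-wise sum of two equally shaped nested lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# A has shape (T, n, k) and B has shape (T, k, m), with T = 2 steps.
A = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
B = [[[1, 0], [0, 1]], [[2, 1], [1, 2]]]
n = len(A[0])

# What the scan computes: accumulate one small dot per step.
stepwise = [[0, 0], [0, 0]]
for a_t, b_t in zip(A, B):
    stepwise = matadd(stepwise, matmul(a_t, b_t))

# What the pushed-out graph computes: one large dot.
# A_flat has shape (n, T*k): row i concatenates A[t][i] over all steps
# (assumed equivalent of dimshuffle + flatten on the first operand).
A_flat = [[x for a_t in A for x in a_t[i]] for i in range(n)]
# B_flat has shape (T*k, m): the rows of every step stacked
# (equivalent of reshape((shape[0] * shape[1], shape[2]))).
B_flat = [row for b_t in B for row in b_t]
single = matmul(A_flat, B_flat)

print(stepwise == single)  # True
```

The initial value of the accumulator is added once after the outer Dot, which matches `replacement = outer_dot_output + init_value` in the hunk.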
@@ -813,9 +867,7 @@ class PushOutScanOutput(gof.Optimizer):
     outer_var = scan_args.outer_out_sit_sot[idx]
     if len(outer_var.clients) == 1:
         client = outer_var.clients[0][0]
         if (client != 'output' and
             isinstance(client.op, theano.tensor.Subtensor)):
             lst = theano.tensor.subtensor.get_idx_list(
@@ -963,6 +1015,7 @@ class ScanInplaceOptimizer(Optimizer):
     info = copy.deepcopy(op.info)
     if not 'destroy_map' in info:
         info['destroy_map'] = OrderedDict()
     info['destroy_map'][pos] = [pos + 1 + op.info['n_seqs']]
     # inputs corresponding to sequences and n_steps
     ls_begin = node.inputs[:1 + op.n_seqs]
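Review note on the index arithmetic above: the `1 + n_seqs` offset skips n_steps and the sequences, which the hunk's own comment and `ls_begin` slice identify as the leading inputs of the scan node. Output `pos` is therefore marked as overwriting input `pos + 1 + n_seqs`. A small sketch with a plain dict standing in for `op.info` (the values are made up):

```python
import copy
from collections import OrderedDict

# Hypothetical stand-in for op.info: a scan with two sequences.
op_info = {"n_seqs": 2}
pos = 1  # index of the output to be computed in place

info = copy.deepcopy(op_info)
if "destroy_map" not in info:
    info["destroy_map"] = OrderedDict()
# Inputs are laid out as [n_steps, seq_0, seq_1, state_0, state_1, ...],
# so the state feeding output `pos` sits at index pos + 1 + n_seqs.
info["destroy_map"][pos] = [pos + 1 + op_info["n_seqs"]]

print(dict(info["destroy_map"]))  # {1: [4]}
print("destroy_map" in op_info)   # False
```

Working on a deep copy leaves the original op's info untouched, so the in-place variant can be discarded if validation of the new graph fails.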
@@ -1048,7 +1101,7 @@ class ScanSaveMem(gof.Optimizer):
     c_outs = op.n_mit_mot + op.n_mit_sot + op.n_sit_sot + op.n_nit_sot
     init_l = [0 for x in xrange(op.n_mit_mot)]
-    init_l += [abs(numpy.min(v)) for v in op.tap_array[op.n_mit_mot:]]
+    init_l += [abs(min(v)) for v in op.tap_array[op.n_mit_mot:]]
     init_l += [0 for x in xrange(op.n_nit_sot)]
     # 2. Check the clients of each output and see for how many steps
     # does scan need to run
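Review note on the `numpy.min` to `min` change: each entry of `tap_array` is a short list of integer taps, so the builtin gives the same result and presumably avoids converting a tiny Python list to an array on every call. A sketch of how `init_l` comes out for a hypothetical tap layout (the counts and taps below are made up, and `range` replaces the Python 2 `xrange` of the original):

```python
# Hypothetical scan layout: 1 mit_mot, then one mit_sot and one sit_sot,
# followed by 2 nit_sot outputs (which have no taps).
n_mit_mot = 1
n_nit_sot = 2
tap_array = [[-2, 0], [-3, -1], [-1]]

# mit_mot outputs need no initial buffer length here.
init_l = [0 for _ in range(n_mit_mot)]
# Recurrent outputs need as many initial steps as their deepest tap.
init_l += [abs(min(v)) for v in tap_array[n_mit_mot:]]
# nit_sot outputs start empty.
init_l += [0 for _ in range(n_nit_sot)]

print(init_l)  # [0, 3, 1, 0, 0]
```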
@@ -1259,7 +1312,8 @@ class ScanSaveMem(gof.Optimizer):
     # for mitsots and sitsots (because mitmots are not
     # currently supported by the mechanism) and only if
     # the pre-allocation mechanism is activated.
-    prealloc_outs = theano.config.scan.allow_output_prealloc
+    prealloc_outs = \
+        theano.config.scan.allow_output_prealloc
     first_mitsot_idx = node.op.n_mit_mot
     last_sitsot_idx = (node.op.n_mit_mot +
@@ -1281,11 +1335,13 @@ class ScanSaveMem(gof.Optimizer):
     # TODO: Simplify the number of steps needed.
     # FB: This need good testing, left to later.
     # call get_scalar_constant_value()? it can
-    # return python/numpy scalar or numpy.ndarray currently.
+    # return python/numpy scalar or numpy.ndarray
+    # currently.
     # pval = pre_greedy_local_optimizer(list_opt_slice,
     # pval)
     #pval = pre_constant_merge([pval])[0]
-    # if (isinstance(pval, theano.tensor.TensorConstant) and
+    # if (isinstance(pval, theano.tensor.TensorConstant)
+    # and
     # pval.dtype.startswith('int')):
     # try:
     # pval = int(pval.data)
@@ -1329,7 +1385,6 @@ class ScanSaveMem(gof.Optimizer):
     # a) the input is a set_subtensor, in that case we
     # can replace the initial tensor by a slice,
     # b) it is not, and we simply take a slice of it.
     # TODO: commit change below with Razvan
     if (nw_inputs[offset + idx].owner and
         isinstance(nw_inputs[offset + idx].owner.op,
@@ -1513,7 +1568,8 @@ class ScanSaveMem(gof.Optimizer):
     # 3.9. Get replace pairs for all other nodes
     if flag_store or global_nsteps is not None:
         for idx, o in enumerate(node.outputs):
-            if not (idx in replaced_outs) and not idx in not_required:
+            if not (idx in replaced_outs) and \
+                    not idx in not_required:
                 nw_pos = compress_map[idx]
                 old_new += [(o, new_outs[nw_pos])]
     # Check if the new outputs depend on the old scan node
@@ -2054,12 +2110,16 @@ class PushOutDot1(gof.Optimizer):
     new_info = op.info.copy()
     st = len(op.mitmot_taps()) + len(op.mitsot_taps())
-    new_info['tap_array'] = (new_info['tap_array'][:st + idx] +
-                             new_info['tap_array'][st + idx + 1:])
+    new_info['tap_array'] = (\
+        new_info['tap_array'][:st + idx] +
+        new_info['tap_array'][st + idx + 1:])
     new_info['n_sit_sot'] -= 1
     new_info['n_nit_sot'] += 1
-    inner_sitsot = inner_sitsot[:idx] + inner_sitsot[idx + 1:]
-    outer_sitsot = outer_sitsot[:idx] + outer_sitsot[idx + 1:]
+    inner_sitsot = inner_sitsot[:idx] + \
+        inner_sitsot[idx + 1:]
+    outer_sitsot = outer_sitsot[:idx] + \
+        outer_sitsot[idx + 1:]
     inner_sitsot_outs = inner_sitsot_outs[:idx] + \
         inner_sitsot_outs[idx + 1:]
     # add n_steps as the length
@@ -2095,8 +2155,9 @@ class PushOutDot1(gof.Optimizer):
     if type(new_outs) not in (list, tuple):
         new_outs = [new_outs]
-    # We need now to pair correctly the new outputs with the
-    # old ones
+    # We need now to pair correctly the new outputs
+    # with the old ones
     outer_mitmot_outs = new_op.outer_mitmot_outs(new_outs)
     outer_mitsot_outs = new_op.outer_mitsot_outs(new_outs)
     outer_sitsot_outs = new_op.outer_sitsot_outs(new_outs)
@@ -2135,7 +2196,8 @@ class PushOutDot1(gof.Optimizer):
     old_new = list(zip(node.outputs[:pos], new_outs[:pos]))
     old = node.outputs[pos].clients[0][0].outputs[0]
     old_new.append((old, new_out))
     old_new += list(zip(node.outputs[pos + 1:], new_outs[pos:]))
     fgraph.replace_all_validate_remove(
         old_new, remove=[node], reason='scan_pushout_dot1')
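Review note on the pairing above: once the output at `pos` has been dropped from its old slot, every later output of the old node lines up with the new node's output one position earlier, which is why `node.outputs[pos + 1:]` is zipped with `new_outs[pos:]`. A sketch with strings standing in for graph variables (all names are illustrative, and the `clients[0][0].outputs[0]` lookup is reduced to a plain string):

```python
# Old scan node with four outputs; the one at `pos` is replaced by a
# Dot computed outside the loop.
old_outputs = ["o0", "o1", "o2", "o3"]
pos = 2
# Outputs of the rebuilt scan: the old slot `pos` is gone, and the value
# that feeds the outer Dot is appended at the end.
new_outs = ["n0", "n1", "n3", "n_extra"]
# Stand-ins for the last-step variable read from the old output and for
# the externally computed replacement.
old = "o2_last_step"
new_out = "outer_dot_result"

old_new = list(zip(old_outputs[:pos], new_outs[:pos]))
old_new.append((old, new_out))
# zip stops at the shorter sequence, so the appended output is not paired.
old_new += list(zip(old_outputs[pos + 1:], new_outs[pos:]))

for pair in old_new:
    print(pair)
```

The resulting list is the kind of (old, new) pair list that the hunk passes to `fgraph.replace_all_validate_remove`.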
...
...