testgroup / pytensor · Commits

Commit 77f6b2be, authored Aug 28, 2015 by abergeron

Merge pull request #3334 from abergeron/delete_old_crap

Delete old stuff

Parents: 7320e1b1, bcfe70c7

Showing 12 changed files with 0 additions and 3959 deletions (+0 −3959)
theano/sandbox/scan.py                           +0 −708
theano/sandbox/scan_module/__init__.py           +0 −42
theano/sandbox/scan_module/scan.py               +0 −584
theano/sandbox/scan_module/scan_op.py            +0 −388
theano/sandbox/scan_module/scan_utils.py         +0 −448
theano/sandbox/scan_module/tests/__init__.py     +0 −0
theano/sandbox/scan_module/tests/test_scan.py    +0 −572
theano/sandbox/scan_module/tests/test_utils.py   +0 −282
theano/sandbox/symbolic_module.py                +0 −469
theano/sandbox/tests/test_scan.py                +0 −85
theano/sandbox/tests/test_theano_object.py       +0 −108
theano/sandbox/theano_object.py                  +0 −273
theano/sandbox/scan.py
deleted 100644 → 0
"""
This module provides a different interface for the Scan Op.
This is a slightly more advanced interface that helps avoid certain
issues that scan can cause.
"""
__docformat__ = 'restructuredtext en'
__authors__ = "Razvan Pascanu "
__copyright__ = "(c) 2010, Universite de Montreal"
__contact__ = "Razvan Pascanu <r.pascanu@gmail>"
import logging
import numpy
import warnings

from theano.compile import SharedVariable, function
from six import iteritems
from six.moves import xrange
from theano import compile
from theano import gof
from theano.compat import izip
from theano.compat import OrderedDict, ifilter
from theano.tensor import opt
from theano import tensor
from theano import config
from theano.updates import OrderedUpdates
from theano.scan_module import scan_op
from theano.scan_module import scan_utils
from theano.scan_module.scan_utils import safe_new

# Logging function for sending warning or info
_logger = logging.getLogger('theano.scan_module.scan')
def scan(fn,
         sequences=None,
         states=None,
         params=None,
         n_steps=None,
         mode=None,
         name=None,
         profile=False,
         allow_gc=None):
    """
    Similar to Theano's official scan, this function gives the user more
    control over the scan op, avoiding certain difficulties that arose from
    missing optimizations.

    Parameters
    ----------
    fn
        Lambda function that describes one step of scan (see the
        official Theano scan function).
    sequences
        Similar to the official Theano scan. This version
        of scan does not support taps for the sequences (it can only be a
        list of tensors). Scan assumes that sequences have the right length
        and it does not check for this.
    states
        Similar to outputs_info of the official scan function.
        There is one crucial difference though, namely that the `initial`
        key in the dictionary has been replaced by the 'membuf' key. This
        reflects the change of meaning. Instead of passing to scan just
        the missing initial steps, one now has to pass a memory buffer in
        which scan will try to store its output. In this memory buffer the
        first entries should be set to the initial states of the
        corresponding states.
        Providing a memory buffer that has fewer entries than the number of
        steps means scan will only use that amount of memory. The user has
        to match the memory buffer size with the number of steps, otherwise
        scan will produce wrong results. Also, if gradients are to be
        computed through the scan, the memory buffer should have the same
        length as the number of steps.
        For states that do not require an initial state, one has to provide
        a dictionary with a single key 'steps' that says how many
        intermediate results to store. See examples below for more insight.
    n_steps
        This parameter is mandatory and it represents the
        number of steps scan will do (scan will not check sequences or any
        other source of information to figure out how many steps it needs
        to do).
    mode
        Same as for the official scan.
    name
        Same as for the official scan.
    profile
        Same as for the official scan.

    Notes
    -----
    - There is no truncate / go_backwards anymore!
    - The outputs returned by scan contain the initial states as well (i.e.
      if I loop over k steps, with my smallest tap for an output -3 and keep
      all intermediate results, my output will be of length k+3).

    Examples
    --------
    (a) if you do not want to store any intermediate results (just the
    last one):

        # The memory buffer can be the initial state, just that we need to
        # add one extra dimension in front of it
        state = TT.unbroadcast(TT.shape_padleft(x0), 0)
        out, _ = scan(lambda x: x + 1, states=state, n_steps=5)
        # Once we got our result we need to remove the extra dimension
        out = out[0]

    (b) if you want to keep every intermediate result:

        state = TT.alloc(TT.constant(0), 6, x0.shape[0])
        state = TT.set_subtensor(state[0], x0)
        out, _ = scan(lambda x: x + 1, states=state, n_steps=5)
        out = out[1:]
    """
    def wrap_into_list(x):
        '''
        Wrap the input into a list if it is not already a list.
        '''
        if x is None:
            return []
        elif not isinstance(x, (list, tuple)):
            return [x]
        else:
            return list(x)

    seqs = wrap_into_list(sequences)
    outs_info = wrap_into_list(states)
    if allow_gc is None:
        allow_gc = config.scan.allow_gc
    # Make sure we get rid of numpy arrays or ints or anything like that
    # passed as inputs to scan
    non_seqs = []
    for elem in wrap_into_list(params):
        if not isinstance(elem, gof.Variable):
            non_seqs.append(tensor.as_tensor_variable(elem))
        else:
            non_seqs.append(elem)
    # If we provided a known number of steps (before compilation)
    # and if that number is 1 or -1, then we can skip the Scan Op,
    # and just apply the inner function once
    # To do that we check here to see the nature of n_steps
    n_fixed_steps = None

    if isinstance(n_steps, (float, int)):
        n_fixed_steps = int(n_steps)
    else:
        try:
            n_fixed_steps = opt.get_scalar_constant_value(n_steps)
        except tensor.basic.NotScalarConstantError:
            n_fixed_steps = None

    # Check n_steps is an int
    if (hasattr(n_steps, 'dtype') and
            str(n_steps.dtype)[:3] not in ('uin', 'int')):
        raise ValueError(' n_steps must be an int. dtype provided '
                         'is %s' % n_steps.dtype)
    # compute number of sequences and number of outputs
    n_seqs = len(seqs)
    n_outs = len(outs_info)

    return_steps = OrderedDict()
    # wrap outputs info in a dictionary if they are not already in one
    for i in xrange(n_outs):
        if outs_info[i] is not None:
            if not isinstance(outs_info[i], dict):
                # by default any output has a tap value of -1
                outs_info[i] = dict(membuf=outs_info[i], taps=[-1])
            elif (not outs_info[i].get('membuf', None) and
                    outs_info[i].get('taps', None)):
                # ^ no initial state but taps provided
                raise ValueError(('If you are using slices of an output '
                                  'you need to provide a memory buffer for '
                                  'the state '), outs_info[i])
            elif (outs_info[i].get('membuf', None) and
                    not outs_info[i].get('taps', None)):
                # ^ initial state but taps not provided
                if 'taps' in outs_info[i]:
                    # ^ explicitly provided a None for taps
                    _logger.warning(
                        'Output %s (index %d) has a memory '
                        'buffer but taps is explicitly set to None ',
                        getattr(outs_info[i]['membuf'], 'name', 'None'),
                        i)
                outs_info[i]['taps'] = [-1]
        else:
            # if a None is provided as the output info we replace it
            # with a dict(steps=n_steps) to simplify handling
            outs_info[i] = dict(steps=n_steps)
    ##
    # Step 2. Generate inputs and outputs of the inner functions
    # for compiling a dummy function (Iteration #1)
    ##

    # create theano inputs for the recursive function
    # note : this is a first batch of possible inputs that will
    #        be compiled in a dummy function; we used this dummy
    #        function to detect shared variables and their updates
    #        and to construct a new and complete list of inputs and
    #        outputs

    n_seqs = 0
    scan_seqs = []     # Variables passed as inputs to the scan op
    inner_seqs = []    # Variables passed as inputs to the inner function
    inner_slices = []  # Actual slices if scan is removed from the picture
    # go through sequences picking up time slices as needed
    for i, seq in enumerate(seqs):
        if isinstance(seq, dict):
            seq = seq['input']
        actual_slice = seq[0]
        _seq_val = tensor.as_tensor_variable(seq)
        _seq_val_slice = _seq_val[0]
        nw_slice = _seq_val_slice.type()

        # Try to transfer test_value to the new variable
        if config.compute_test_value != 'off':
            try:
                nw_slice.tag.test_value = gof.Op._get_test_value(
                    _seq_val_slice)
            except AttributeError as e:
                if config.compute_test_value != 'ignore':
                    # No need to print a warning or raise an error now,
                    # it will be done when fn will be called.
                    _logger.info(('Cannot compute test value for '
                                  'the inner function of scan, input value '
                                  'missing %s'), e)

        if seq.name:
            nw_slice.name = seq.name + '[t]'
        scan_seqs.append(_seq_val)
        inner_seqs.append(nw_slice)
        inner_slices.append(actual_slice)

        n_seqs += 1

    actual_n_steps = tensor.as_tensor(n_steps)
    # Conventions :
    #   mit_mot = multiple input taps, multiple output taps ( only provided
    #             by the gradient function )
    #   mit_sot = multiple input taps, single output tap (t + 0)
    #   sit_sot = single input tap, single output tap (t + 0)
    #   nit_sot = no input tap, single output tap (t + 0)

    # MIT_MOT -- not provided by the user, only by the grad function
    n_mit_mot = 0
    n_mit_mot_outs = 0
    mit_mot_scan_inputs = []
    mit_mot_inner_inputs = []
    mit_mot_inner_outputs = []
    mit_mot_out_slices = []
    mit_mot_rightOrder = []

    # MIT_SOT -- provided by the user
    n_mit_sot = 0
    mit_sot_scan_inputs = []
    mit_sot_inner_inputs = []
    mit_sot_inner_slices = []
    mit_sot_inner_outputs = []
    mit_sot_return_steps = OrderedDict()
    mit_sot_tap_array = []
    mit_sot_rightOrder = []

    n_sit_sot = 0
    sit_sot_scan_inputs = []
    sit_sot_inner_inputs = []
    sit_sot_inner_slices = []
    sit_sot_inner_outputs = []
    sit_sot_return_steps = OrderedDict()
    sit_sot_rightOrder = []

    nit_sot_steps = []
    # go through outputs picking up time slices as needed
    for i, init_out in enumerate(outs_info):
        # Note that our convention dictates that if an output uses
        # just the previous time step, as an initial state we will only
        # provide a tensor of the same dimension as one time step; This
        # makes code much cleaner for those who do not use taps. Otherwise
        # they would always have had to shape_padleft the initial state,
        # which is ugly

        # Note, 'taps' might not be in the dictionary
        if 'taps' in init_out and init_out['taps'] == [-1]:
            actual_arg = init_out['membuf']
            arg = safe_new(init_out['membuf'][0])
            if isinstance(arg, tensor.Constant):
                # safe_new returns a clone of the constants, but that is not
                # what we need for initial states
                arg = arg.type()

            # Try to transfer test_value to the new variable
            if config.compute_test_value != 'off':
                try:
                    arg.tag.test_value = gof.Op._get_test_value(actual_arg)
                except AttributeError as e:
                    if config.compute_test_value != 'ignore':
                        # No need to print a warning or raise an error now,
                        # it will be done when fn will be called.
                        _logger.info(('Cannot compute test value for the '
                                      'inner function of scan, input value '
                                      'missing %s'), e)

            if getattr(init_out['membuf'], 'name', None) is not None:
                arg.name = init_out['membuf'].name + '[t-1]'

            # We need now to allocate space for storing the output and copy
            # the initial state over. We do this using the expand function
            # defined in scan utils
            sit_sot_scan_inputs.append(actual_arg)
            sit_sot_inner_slices.append(actual_arg[0])
            if i in return_steps:
                sit_sot_return_steps[n_sit_sot] = return_steps[i]
            sit_sot_inner_inputs.append(arg)
            sit_sot_rightOrder.append(i)
            n_sit_sot += 1
        elif init_out.get('taps', None):
            if numpy.any(numpy.array(init_out.get('taps', [])) > 0):
                # Make sure we do not have requests for future values of an
                # output; we cannot provide such values
                raise ValueError('Can not use future taps of outputs',
                                 init_out)
            # go through the taps
            mintap = abs(numpy.min(init_out['taps']))
            mit_sot_tap_array.append(init_out['taps'])
            idx_offset = abs(numpy.min(init_out['taps']))
            # Sequence
            mit_sot_scan_inputs.append(init_out['membuf'])

            if i in return_steps:
                mit_sot_return_steps[n_mit_sot] = return_steps[i]
            mit_sot_rightOrder.append(i)
            n_mit_sot += 1
            for k in init_out['taps']:
                # create a new slice
                actual_nw_slice = init_out['membuf'][k + mintap]
                _init_out_var = tensor.as_tensor_variable(
                    init_out['membuf'])
                _init_out_var_slice = _init_out_var[k + mintap]
                nw_slice = _init_out_var_slice.type()

                # Try to transfer test_value to the new variable
                if config.compute_test_value != 'off':
                    try:
                        nw_slice.tag.test_value = gof.Op._get_test_value(
                            _init_out_var_slice)
                    except AttributeError as e:
                        if config.compute_test_value != 'ignore':
                            # No need to print a warning or raise an error
                            # now, it will be done when fn will be called.
                            _logger.info(('Cannot compute test value for '
                                          'the inner function of scan, '
                                          'input value missing. %s'), e)

                # give it a name for debugging and pretty printing
                if getattr(init_out['membuf'], 'name', None) is not None:
                    if k > 0:
                        nw_slice.name = (init_out['membuf'].name +
                                         '[t+%d]' % k)
                    elif k == 0:
                        nw_slice.name = init_out['membuf'].name + '[t]'
                    else:
                        nw_slice.name = (init_out['membuf'].name +
                                         '[t%d]' % k)
                mit_sot_inner_inputs.append(nw_slice)
                mit_sot_inner_slices.append(actual_nw_slice)
        else:
            pass
    # Re-order args
    max_mit_sot = numpy.max([-1] + mit_sot_rightOrder) + 1
    max_sit_sot = numpy.max([-1] + sit_sot_rightOrder) + 1
    n_elems = numpy.max([max_mit_sot, max_sit_sot])
    _ordered_args = [[] for x in xrange(n_elems)]
    offset = 0
    for idx in xrange(n_mit_sot):
        n_inputs = len(mit_sot_tap_array[idx])
        if n_fixed_steps == 1:
            _ordered_args[mit_sot_rightOrder[idx]] = \
                mit_sot_inner_slices[offset:offset + n_inputs]
        else:
            _ordered_args[mit_sot_rightOrder[idx]] = \
                mit_sot_inner_inputs[offset:offset + n_inputs]
        offset += n_inputs

    for idx in xrange(n_sit_sot):
        if n_fixed_steps == 1:
            _ordered_args[sit_sot_rightOrder[idx]] = \
                [sit_sot_inner_slices[idx]]
        else:
            _ordered_args[sit_sot_rightOrder[idx]] = \
                [sit_sot_inner_inputs[idx]]

    ordered_args = []
    for ls in _ordered_args:
        ordered_args += ls
    if n_fixed_steps == 1:
        args = (inner_slices + ordered_args + non_seqs)
    else:
        args = (inner_seqs + ordered_args + non_seqs)
    # add only the non-shared variables and non-constants to the arguments
    # of the dummy function [ a function should not get shared variables or
    # constants as input ]
    dummy_args = [arg for arg in args
                  if (not isinstance(arg, SharedVariable) and
                      not isinstance(arg, tensor.Constant))]

    # when we apply the lambda expression we get a mixture of update rules
    # and outputs that needs to be separated
    lambda_result = fn(*args)
    condition, outputs, updates = scan_utils.get_updates_and_outputs(
        lambda_result)
    if condition is not None:
        as_while = True
    else:
        as_while = False
    ##
    # Step 3. Check if we actually need scan and remove it if we don't
    ##

    if n_fixed_steps == 1:
        # We do not need to use the scan op anymore, so we can just return
        # the outputs and updates we have
        if condition is not None:
            _logger.warning(('When the number of steps is fixed and equal '
                             'to 1, the provided stopping condition, ',
                             str(condition), ' is ignored'))

        for pos, inner_out in enumerate(outputs):
            # we need to see if we need to pad our sequences with an
            # unbroadcastable dimension; case example : we return an
            # output for which we want all intermediate. If n_steps is 1
            # then, if we return the output as given by the inner function
            # this will represent only a slice and it will have one
            # dimension less.
            if (isinstance(inner_out.type, tensor.TensorType) and
                    return_steps.get(pos, 0) != 1):
                outputs[pos] = tensor.unbroadcast(
                    tensor.shape_padleft(inner_out), 0)
        if len(outputs) == 1:
            outputs = outputs[0]

        return (outputs, updates)
    ##
    # Step 4. Compile the dummy function
    ##

    # We can now compile a dummy function just to see what shared variable
    # we have and what are their update rules (note that the user has
    # the option not to pass the shared variable to scan, so we need to
    # pick them manually and add them to scan)
    # make the compilation as fast as possible by not applying any
    # optimization or conversion to C [ note this region is not important
    # for performance so we can do stuff as unoptimal as we wish ]

    # extract still missing inputs (there still might be so) and add them
    # as non sequences at the end of our args
    fake_nonseqs = [x.type() for x in non_seqs]
    fake_outputs = scan_utils.clone(outputs + list(updates.values()),
                                    replace=dict(izip(non_seqs,
                                                      fake_nonseqs)))
    all_inputs = ifilter(
        lambda x: (isinstance(x, gof.Variable) and
                   not isinstance(x, SharedVariable) and
                   not isinstance(x, gof.Constant)),
        gof.graph.inputs(fake_outputs))
    extra_inputs = [x for x in all_inputs
                    if x not in args + fake_nonseqs]
    non_seqs += extra_inputs
    # Note we do not use all_inputs directly since the order of variables
    # in args is quite important
    dummy_args += extra_inputs

    dummy_outs = outputs
    if condition is not None:
        dummy_outs.append(condition)

    # If we use a regular dict here, the results are non-deterministic
    if not isinstance(updates, (list, tuple)):
        if isinstance(updates, dict) and \
                not isinstance(updates, OrderedDict):
            warnings.warn("Using non-deterministic dictionary.")

    dummy_f = function(dummy_args,
                       dummy_outs,
                       updates=updates,
                       mode=compile.mode.Mode(linker='py',
                                              optimizer=None),
                       on_unused_input='ignore')
    ##
    # Step 5. Re-arrange inputs of scan into a more strict order
    ##

    # Step 5.0 Check the outputs of the dummy function to see if they
    # match with user provided data

    # if the number of outputs to the function does not match the number of
    # assumed outputs until now (provided by the user) there can be
    # only one explanation: No information is provided for any of the
    # outputs (i.e. we are dealing with a map)
    tmp_dummy_f_outs = len(dummy_f.maker.outputs)
    if as_while:
        tmp_dummy_f_outs -= 1
    if not (tmp_dummy_f_outs == n_outs or outs_info == []):
        raise ValueError('Please provide None as outputs_info for '
                         'any output that does not feed back into '
                         'scan (i.e. it behaves like a map) ')

    if outs_info == []:
        n_outs = len(dummy_f.maker.outputs)
        if as_while:
            n_outs = n_outs - 1
        outs_info = [dict(steps=n_steps) for x in xrange(n_outs)]
    # Step 5.1 Outputs with taps different than -1
    for i, out in enumerate(outs_info):
        if 'taps' in out and out['taps'] != [-1]:
            mit_sot_inner_outputs.append(outputs[i])

    # Step 5.2 Outputs with tap equal to -1
    for i, out in enumerate(outs_info):
        if 'taps' in out and out['taps'] == [-1]:
            sit_sot_inner_outputs.append(outputs[i])
    # Step 5.3 Outputs that correspond to update rules of shared variables
    givens = OrderedDict()
    n_shared_outs = 0
    shared_scan_inputs = []
    shared_inner_inputs = []
    shared_inner_outputs = []
    for input in dummy_f.maker.expanded_inputs:
        if isinstance(input.variable, SharedVariable) and input.update:
            new_var = safe_new(input.variable)
            if getattr(input.variable, 'name', None) is not None:
                new_var.name = input.variable.name + '_copy'
            shared_inner_inputs.append(new_var)
            shared_scan_inputs.append(input.variable)
            shared_inner_outputs.append(input.update)
            givens[input.variable] = new_var
            n_shared_outs += 1
    # Step 5.4 Outputs with no taps used in the input
    n_nit_sot = 0
    nit_sot_inner_outputs = []
    nit_sot_return_steps = OrderedDict()
    nit_sot_rightOrder = []
    for i, out in enumerate(outs_info):
        if 'taps' not in out:
            nit_sot_inner_outputs.append(outputs[i])
            if i in return_steps:
                nit_sot_return_steps[n_nit_sot] = return_steps[i]
            nit_sot_rightOrder.append(i)
            nit_sot_steps.append(out['steps'])
            n_nit_sot += 1
    # Step 5.5 all other arguments including extra inputs
    other_scan_args = []
    other_inner_args = []

    other_scan_args += [arg for arg in non_seqs
                        if (not isinstance(arg, SharedVariable) and
                            not isinstance(arg, tensor.Constant))]

    # Step 5.6 all shared variables with no update rules
    other_inner_args += [safe_new(arg, '_copy') for arg in non_seqs
                         if (not isinstance(arg, SharedVariable) and
                             not isinstance(arg, tensor.Constant))]

    givens.update(dict(izip(other_scan_args, other_inner_args)))

    other_shared_scan_args = [
        arg.variable for arg in dummy_f.maker.expanded_inputs
        if (isinstance(arg.variable, SharedVariable) and
            not arg.update)]
    other_shared_inner_args = [
        safe_new(arg.variable, '_copy')
        for arg in dummy_f.maker.expanded_inputs
        if (isinstance(arg.variable, SharedVariable) and
            not arg.update)]
    givens.update(dict(izip(other_shared_scan_args,
                            other_shared_inner_args)))
    ##
    # Step 6. Re-order the outputs and clone them replacing things
    # using the givens
    ##
    inner_inputs = (inner_seqs +
                    mit_mot_inner_inputs +
                    mit_sot_inner_inputs +
                    sit_sot_inner_inputs +
                    shared_inner_inputs +
                    other_shared_inner_args +
                    other_inner_args)

    inner_outs = (mit_mot_inner_outputs +
                  mit_sot_inner_outputs +
                  sit_sot_inner_outputs +
                  nit_sot_inner_outputs +
                  shared_inner_outputs)
    if condition is not None:
        inner_outs.append(condition)
    new_givens = OrderedDict()

    for w, w_copy in iteritems(givens):
        new_givens[w] = w.type.filter_variable(w_copy)

    new_outs = scan_utils.clone(inner_outs, replace=new_givens)
    ##
    # Step 7. Create the Scan Op
    ##
    tap_array = mit_sot_tap_array + [[-1] for x in xrange(n_sit_sot)]
    info = OrderedDict()
    info['tap_array'] = tap_array
    info['n_seqs'] = n_seqs
    info['n_mit_mot'] = n_mit_mot
    info['n_mit_mot_outs'] = n_mit_mot_outs
    info['mit_mot_out_slices'] = mit_mot_out_slices
    info['n_mit_sot'] = n_mit_sot
    info['n_sit_sot'] = n_sit_sot
    info['n_shared_outs'] = n_shared_outs
    info['n_nit_sot'] = n_nit_sot
    info['truncate_gradient'] = -1
    info['name'] = name
    info['mode'] = mode
    info['destroy_map'] = OrderedDict()
    info['inplace'] = False
    info['gpu'] = False
    info['as_while'] = as_while
    info['profile'] = profile
    info['_scan_savemem_visited'] = True
    info['allow_gc'] = allow_gc

    local_op = scan_op.Scan(inner_inputs, new_outs, info)
    ##
    # Step 8. Compute the outputs using the scan op
    ##
    _scan_inputs = (scan_seqs +
                    mit_mot_scan_inputs +
                    mit_sot_scan_inputs +
                    sit_sot_scan_inputs +
                    shared_scan_inputs +
                    nit_sot_steps +
                    other_shared_scan_args +
                    other_scan_args)

    scan_inputs = []
    for arg in [actual_n_steps] + _scan_inputs:
        if not isinstance(arg, gof.Variable):
            arg = tensor.as_tensor_variable(arg)
        scan_inputs += [arg]
    scan_outs = local_op(*scan_inputs)
    if type(scan_outs) not in (list, tuple):
        scan_outs = [scan_outs]
    ##
    # Step 9. Figure out which outs are update rules for shared variables
    # and so on ...
    ##
    update_map = OrderedUpdates()
    offset = n_mit_mot
    offsets = [abs(numpy.min(x)) for x in mit_sot_tap_array]
    mit_sot_outs = scan_outs[offset:offset + n_mit_sot]

    offset += n_mit_sot
    offsets = [1 for x in xrange(n_sit_sot)]
    sit_sot_outs = scan_outs[offset:offset + n_sit_sot]

    offset += n_sit_sot
    nit_sot_outs = scan_outs[offset:offset + n_nit_sot]

    offset += n_nit_sot
    for idx, update_rule in enumerate(
            scan_outs[offset:offset + n_shared_outs]):
        update_map[shared_scan_inputs[idx]] = update_rule

    _scan_out_list = (mit_sot_outs +
                      sit_sot_outs +
                      nit_sot_outs)
    # Step 10. I need to reorder the outputs to be in the order expected by
    # the user
    rightOrder = (mit_sot_rightOrder +
                  sit_sot_rightOrder +
                  nit_sot_rightOrder)
    scan_out_list = [None] * len(rightOrder)
    for idx, pos in enumerate(rightOrder):
        scan_out_list[pos] = _scan_out_list[idx]
    if len(scan_out_list) == 1:
        scan_out_list = scan_out_list[0]
    elif len(scan_out_list) == 0:
        scan_out_list = None

    assert isinstance(update_map, OrderedDict)
    return (scan_out_list, update_map)
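The membuf convention documented in this deleted file can be emulated outside Theano. The sketch below is a plain-Python illustration of the buffer semantics only; the `emulate_scan` helper is an assumption of this note, not part of the deleted module:

```python
def emulate_scan(fn, states, n_steps):
    """Minimal emulation of the membuf convention: `states` is a
    preallocated buffer (a list) whose first entry holds the initial
    state; outputs are written in place, wrapping around when the
    buffer is shorter than n_steps, so only the most recent entries
    survive."""
    buf = list(states)
    size = len(buf)
    for t in range(n_steps):
        buf[(t + 1) % size] = fn(buf[t % size])
    return buf

# Analogue of docstring example (b): a buffer of 6 entries keeps the
# initial state and all 5 intermediate results.
out = emulate_scan(lambda x: x + 1, [0, None, None, None, None, None],
                   n_steps=5)
# out == [0, 1, 2, 3, 4, 5]

# Analogue of example (a): a single-entry buffer keeps only the last
# result, trading history for memory.
last = emulate_scan(lambda x: x + 1, [0], n_steps=5)
# last == [5]
```

This mirrors the docstring's warning: a buffer shorter than `n_steps` overwrites older entries, so the caller must size the buffer to match how much history is needed.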
theano/sandbox/scan_module/__init__.py
deleted 100644 → 0
"""
This module provides the Scan Op.
Scanning is a general form of recurrence, which can be used for looping.
The idea is that you *scan* a function along some input sequence, producing
an output at each time-step that can be seen (but not modified) by the
function at the next time-step. Technically, the function can see the
previous K time-steps of your outputs and L time steps (from the past and
future) of your inputs.
So for example, ``sum()`` could be computed by scanning the ``z+x_i``
function over a list, given an initial state of ``z=0``.
Special cases:
* A *reduce* operation can be performed by returning only the last
output of a ``scan``.
* A *map* operation can be performed by applying a function that
ignores previous steps of the outputs.
Often a for-loop can be expressed as a ``scan()`` operation, and ``scan`` is
the closest that theano comes to looping. The advantage of using ``scan``
over for loops is that it allows the number of iterations to be a part of
the symbolic graph.
The Scan Op should typically be used by calling any of the following
functions: ``scan()``, ``map()``, ``reduce()``, ``foldl()``,
``foldr()``.
"""
__docformat__ = 'restructuredtext en'
__authors__ = ("Razvan Pascanu "
               "Frederic Bastien "
               "James Bergstra "
               "Pascal Lamblin "
               "Arnaud Bergeron ")
__copyright__ = "(c) 2010, Universite de Montreal"
__contact__ = "Razvan Pascanu <r.pascanu@gmail>"

from .scan import scan
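The docstring above notes that `sum()` is a scan of `z + x_i` with initial state `z = 0`, that reduce keeps only the last output, and that map ignores previous outputs. A pure-Python sketch of those relationships (the `scan_list` helper is illustrative, not Theano's API):

```python
def scan_list(fn, sequence, initial):
    """Plain-Python analogue of scan: apply `fn` along `sequence`,
    threading the previous output through as state, and return every
    intermediate output."""
    outputs = []
    state = initial
    for x in sequence:
        state = fn(state, x)
        outputs.append(state)
    return outputs

# sum() by scanning z + x_i with initial state z = 0
partials = scan_list(lambda z, x: z + x, [1, 2, 3, 4], initial=0)
# partials == [1, 3, 6, 10]

# reduce: keep only the last output of the scan
total = partials[-1]  # 10

# map: a step function that ignores the previous output
squares = scan_list(lambda _, x: x * x, [1, 2, 3, 4], initial=None)
# squares == [1, 4, 9, 16]
```

Theano's real `scan` builds this loop into the symbolic graph instead of executing it eagerly, which is what makes the iteration count symbolic and the gradient computable.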
theano/sandbox/scan_module/scan.py
deleted 100644 → 0
"""
This module provides the Scan Op.
Scanning is a general form of recurrence, which can be used for looping.
The idea is that you *scan* a function along some input sequence, producing
an output at each time-step that can be seen (but not modified) by the
function at the next time-step. Technically, the function can see the
previous K time-steps of your outputs and L time steps (from past and
future) of your inputs.
So for example, ``sum()`` could be computed by scanning the ``z+x_i``
function over a list, given an initial state of ``z=0``.
Special cases:
* A *reduce* operation can be performed by using only the last
output of a ``scan``.
* A *map* operation can be performed by applying a function that
ignores previous steps of the outputs.
Often a for-loop or while-loop can be expressed as a ``scan()`` operation,
and ``scan`` is the closest that theano comes to looping. The advantages
of using ``scan`` over `for` loops in python (among others) are:
* it allows the number of iterations to be part of the symbolic graph
* it allows computing gradients through the for loop
* there exists a set of optimizations that help rewrite your loop
such that less memory is used and it runs faster
* it ensures that data is not copied from host to gpu and gpu to
host at each step
The Scan Op should typically be used by calling any of the following
functions: ``scan()``, ``map()``, ``reduce()``, ``foldl()``,
``foldr()``.
"""
__docformat__ = 'restructuredtext en'
__authors__ = ("Razvan Pascanu "
               "Frederic Bastien "
               "James Bergstra "
               "Pascal Lamblin ")
__copyright__ = "(c) 2010, Universite de Montreal"
__contact__ = "Razvan Pascanu <r.pascanu@gmail>"

import logging
import numpy

from six.moves import xrange

from theano import gof
from theano.compat import izip
from theano.tensor import opt, TensorVariable
from theano.tensor.sharedvar import TensorSharedVariable
from theano import tensor
from theano.scalar.sharedvar import shared as scalar_shared
from theano.compile.pfunc import rebuild_collect_shared

from . import scan_op
from . import scan_utils

# Logging function for sending warning or info
_logger = logging.getLogger('theano.scan_module.scan')
def scan(fn,
         sequences=None,
         outputs_info=None,
         non_sequences=None,
         n_steps=None,
         truncate_gradient=-1,
         go_backwards=False,
         mode=None,
         name=None,
         options=None,
         profile=False):
"""
This function constructs and applies a Scan op to the provided
arguments.
Parameters
----------
fn
``fn`` is a function that describes the operations involved in one
step of ``scan``. ``fn`` should construct variables describing the
output of one iteration step. It should expect as input theano
variables representing all the slices of the input sequences
and previous values of the outputs, as well as all other arguments
given to scan as ``non_sequences``. The order in which scan passes
these variables to ``fn`` is the following :
* all time slices of the first sequence
* all time slices of the second sequence
* ...
* all time slices of the last sequence
* all past slices of the first output
* all past slices of the second output
* ...
* all past slices of the last output
* all other arguments (the list given as `non_sequences` to
scan)
The order of the sequences is the same as the one in the list
`sequences` given to scan. The order of the outputs is the same
as the order of ``outputs_info``. For any sequence or output the
order of the time slices is the same as the one in which they have
been given as taps. For example if one writes the following :
.. code-block:: python
scan(fn, sequences = [ dict(input= Sequence1, taps = [-3,2,-1])
, Sequence2
, dict(input = Sequence3, taps = 3) ]
, outputs_info = [ dict(initial = Output1, taps = [-3,-5])
, dict(initial = Output2, taps = None)
, Output3 ]
, non_sequences = [ Argument1, Argument 2])
``fn`` should expect the following arguments in this given order:
#. ``Sequence1[t-3]``
#. ``Sequence1[t+2]``
#. ``Sequence1[t-1]``
#. ``Sequence2[t]``
#. ``Sequence3[t+3]``
#. ``Output1[t-3]``
#. ``Output1[t-5]``
#. ``Output3[t-1]``
#. ``Argument1``
#. ``Argument2``
The list of ``non_sequences`` can also contain shared variables
used in the function, though ``scan`` is able to figure those
out on its own so they can be skipped. For the clarity of the
code we nevertheless recommend providing them to scan. To some extent
``scan`` can also figure out other ``non sequences`` (not shared)
even if not passed to scan (but used by `fn`). A simple example of
this would be :
.. code-block:: python
import theano.tensor as TT
W = TT.matrix()
W_2 = W**2
def f(x):
return TT.dot(x,W_2)
The function is expected to return two things. One is a list of
outputs ordered in the same order as ``outputs_info``, with the
difference that there should be only one output variable per
output initial state (even if no tap value is used). Secondly
`fn` should return an update dictionary (that tells how to
update any shared variable after each iteration step). The
dictionary can optionally be given as a list of tuples. There is
no constraint on the order of these two lists, ``fn`` can return
either ``(outputs_list, update_dictionary)`` or
``(update_dictionary, outputs_list)`` or just one of the two (in
case the other is empty).
To use ``scan`` as a while loop, the user needs to change the
function ``fn`` such that also a stopping condition is returned.
To do so, he/she needs to wrap the condition in an ``until`` class.
The condition should be returned as a third element, for example:
.. code-block:: python
...
return [y1_t, y2_t], {x:x+1}, theano.scan_module.until(x < 50)
Note that a number of steps (considered here as the maximum
number of steps) is still required even though a condition is
passed (and it is used to allocate memory if needed).
sequences
``sequences`` is the list of Theano variables or dictionaries
describing the sequences ``scan`` has to iterate over. If a
sequence is given as wrapped in a dictionary, then a set of optional
information can be provided about the sequence. The dictionary
should have the following keys:
* ``input`` (*mandatory*) -- Theano variable representing the
sequence.
* ``taps`` -- Temporal taps of the sequence required by ``fn``.
They are provided as a list of integers, where a value ``k``
implies that at iteration step ``t`` scan will pass to ``fn``
the slice ``t+k``. Default value is ``[0]``.
Any Theano variable in the list ``sequences`` is automatically
wrapped into a dictionary where ``taps`` is set to ``[0]``
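The tap semantics above can be illustrated with a plain-Python sketch (this is not part of the scan API; ``tap_slices``, ``seq`` and the tap values are made-up example names):

```python
# A tap value k selects seq[t + k] at iteration step t.
def tap_slices(seq, t, taps):
    return [seq[t + k] for k in taps]

seq = [10, 20, 30, 40, 50]
# at step t=2, taps [-1, 0, 1] select seq[1], seq[2], seq[3]
result = tap_slices(seq, 2, [-1, 0, 1])  # [20, 30, 40]
```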
outputs_info
``outputs_info`` is the list of Theano variables or dictionaries
describing the initial state of the outputs computed
recurrently. When these initial states are given as dictionaries,
optional information can be provided about the outputs
corresponding to them. The dictionary should have the following
keys:
* ``initial`` -- Theano variable that represents the initial
state of a given output. In case the output is not computed
recursively (think of a map) and does not require an initial
state, this field can be skipped. Given that only the previous
time step of the output is used by ``fn``, the initial state
should have the same shape as the output. If multiple time
taps are used, the initial state should have one extra
dimension that should cover all the possible taps. For example
if we use ``-5``, ``-2`` and ``-1`` as past taps, at step 0,
``fn`` will require (by an abuse of notation) ``output[-5]``,
``output[-2]`` and ``output[-1]``. This will be given by
the initial state, which in this case should have the shape
(5,)+output.shape. If this variable containing the initial
state is called ``init_y`` then ``init_y[0]`` *corresponds to*
``output[-5]``, ``init_y[1]`` *corresponds to*
``output[-4]``, ``init_y[2]`` corresponds to ``output[-3]``,
``init_y[3]`` corresponds to ``output[-2]``, and ``init_y[4]``
corresponds to ``output[-1]``. While this order might seem
strange, it comes naturally from splitting an array at a given
point. Assume that we have an array ``x``, and we choose ``k``
to be time step ``0``. Then our initial state would be ``x[:k]``,
while the output will be ``x[k:]``. Looking at this split,
elements in ``x[:k]`` are ordered exactly like those in ``init_y``.
* ``taps`` -- Temporal taps of the output that will be passed to
``fn``. They are provided as a list of *negative* integers,
where a value ``k`` implies that at iteration step ``t`` scan
will pass to ``fn`` the slice ``t+k``.
``scan`` will follow this logic if partial information is given:
* If an output is not wrapped in a dictionary, ``scan`` will wrap
it in one assuming that you use only the last step of the output
(i.e. it makes your tap value list equal to [-1]).
* If you wrap an output in a dictionary and you do not provide any
taps but you provide an initial state it will assume that you are
using only a tap value of -1.
* If you wrap an output in a dictionary but you do not provide any
initial state, it assumes that you are not using any form of
taps.
* If you provide a ``None`` instead of a variable or an empty
dictionary, ``scan`` assumes that you will not use any taps for
this output (as, for example, in the case of a map).
If ``outputs_info`` is an empty list or None, ``scan`` assumes
that no tap is used for any of the outputs. If information is
provided just for a subset of the outputs an exception is
raised (because there is no convention on how scan should map
the provided information to the outputs of ``fn``)
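The alignment between initial-state rows and past taps described above can be sketched in plain Python (illustrative only; ``taps``, ``mintap`` and ``indices`` are example names, not scan API):

```python
# With taps [-5, -2, -1] the minimal tap is 5, so the initial state
# has 5 rows and the row used for tap k at step 0 is row (k + mintap).
taps = [-5, -2, -1]
mintap = abs(min(taps))              # 5
indices = [k + mintap for k in taps]  # rows 0, 3 and 4 of init_y
```

This reproduces the mapping in the text: ``init_y[0]`` stands for ``output[-5]`` and ``init_y[4]`` for ``output[-1]``.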
non_sequences
``non_sequences`` is the list of arguments that are passed to
``fn`` at each step. One can opt to exclude variables
used in ``fn`` from this list as long as they are part of the
computational graph, though for clarity we encourage not to do so.
n_steps
``n_steps`` is the number of steps to iterate given as an int
or Theano scalar. If any of the input sequences do not have
enough elements, scan will raise an error. If the *value is 0* the
outputs will have *0 rows*. If the value is negative, ``scan``
will run backwards in time. If the ``go_backwards`` flag is already
set and also ``n_steps`` is negative, ``scan`` will run forward
in time. If ``n_steps`` is not provided, ``scan`` will figure
out the number of steps it should run given its input sequences.
truncate_gradient
``truncate_gradient`` is the number of steps to use in truncated
BPTT. If you compute gradients through a scan op, they are
computed using backpropagation through time. By providing a
value different than -1, you choose to use truncated BPTT instead
of classical BPTT, where the gradient goes back only
``truncate_gradient`` steps in time.
go_backwards
``go_backwards`` is a flag indicating if ``scan`` should go
backwards through the sequences. If you think of each sequence
as indexed by time, making this flag True would mean that
``scan`` goes back in time, namely that for any sequence it
starts from the end and goes towards 0.
name
When profiling ``scan``, it is crucial to provide a name for any
instance of ``scan``. The profiler will produce an overall
profile of your code as well as profiles for the computation of
one step of each instance of ``scan``. The ``name`` of the instance
appears in those profiles and can greatly help to disambiguate
information.
mode
It is recommended to leave this argument to None, especially
when profiling ``scan`` (otherwise the results are not going to
be accurate). If you prefer the computations of one step of
``scan`` to be done differently than the entire function, you
can use this parameter to describe how the computations in this
loop are done (see ``theano.function`` for details about
possible values and their meaning).
profile
Flag or string. If true, or different from the empty string, a
profile object will be created and attached to the inner graph of
scan. In case ``profile`` is True, the profile object will have the
name of the scan instance, otherwise it will have the passed string.
The profile object collects (and prints) information only when the
inner graph is run with the new CVM linker (this is the case with
the default modes; with other linkers this argument is ignored).
Returns
-------
tuple
Tuple of the form (outputs, updates); ``outputs`` is either a
Theano variable or a list of Theano variables representing the
outputs of ``scan`` (in the same order as in
``outputs_info``). ``updates`` is a dictionary subclass
specifying the update rules for all shared variables used in
scan. This dictionary should be passed to ``theano.function``
when you compile your function. The change compared to a normal
dictionary is that keys are validated to be SharedVariables and
the addition of two such dictionaries is validated to be
consistent.
"""
    # Note : see the internal documentation of the scan op for naming
    # conventions and all other details
    if options is None:
        options = {}
    rvals = scan_utils.canonical_arguments(sequences,
                                           outputs_info,
                                           non_sequences,
                                           go_backwards,
                                           n_steps)
    inputs, states_and_outputs_info, parameters, T = rvals
    # If we provided a known number of steps (before compilation)
    # and if that number is 1 or -1, then we can skip the Scan Op,
    # and just apply the inner function once.
    # To do that we check here to see the nature of n_steps
    T_value = None
    if isinstance(n_steps, (float, int)):
        T_value = int(n_steps)
    else:
        try:
            T_value = opt.get_scalar_constant_value(n_steps)
        except (TypeError, AttributeError):
            T_value = None

    if T_value in (1, -1):
        return one_step_scan(fn,
                             inputs,
                             states_and_outputs_info,
                             parameters,
                             truncate_gradient)
    # 1. Variable representing the current time step
    t = scalar_shared(numpy.int64(0), name='t')

    # 2. Allocate memory for the states of scan.
    mintaps = []
    lengths = []
    for pos, arg_info in enumerate(states_and_outputs_info):
        if arg_info.get('taps', None) == [-1]:
            mintaps.append(1)
            lengths.append(scalar_shared(numpy.int64(0), name='l%d' % pos))
            arg_info['initial'] = scan_utils.expand(tensor.unbroadcast(
                tensor.shape_padleft(arg_info['initial']), 0), T)
        elif arg_info.get('taps', None):
            if numpy.any(numpy.array(arg_info.get('taps', [])) > 0):
                # Make sure we do not have requests for future values of a
                # sequence we can not provide such values
                raise ValueError('Can not use future taps of outputs',
                                 arg_info)
            mintap = abs(numpy.min(arg_info['taps']))
            lengths.append(scalar_shared(numpy.int64(0), name='l%d' % pos))
            mintaps.append(mintap)
            arg_info['initial'] = scan_utils.expand(
                arg_info['initial'][:mintap], T)
        else:
            mintaps.append(0)
            lengths.append(scalar_shared(numpy.int64(0), name='l%d' % pos))
    # 3. Generate arguments for the function passed to scan. This
    # function will return the outputs that need to be computed at every
    # timestep
    inputs_slices = [input[t] for input in inputs]
    states_slices = []
    for n, state in enumerate(states_and_outputs_info):
        # Check if it is actually a state and not an output
        if mintaps[n] != 0:
            for k in state['taps']:
                states_slices.append(
                    state['initial'][(t + mintaps[n] + k) % lengths[n]])
    # 4. Construct outputs that are to be computed by the inner
    # function of scan
    args = inputs_slices + states_slices + parameters
    cond, states_and_outputs, updates = \
        scan_utils.get_updates_and_outputs(fn(*args))
    # User is allowed to provide no information if it only behaves like a
    # map
    if (len(states_and_outputs) != len(states_and_outputs_info) and
            len(states_and_outputs_info) == 0):
        mintaps = [0] * len(states_and_outputs)
    # 5. Construct the scan op
    # 5.1 Construct list of shared variables with updates (those that
    # can be treated as states (i.e. of TensorType) and those that can
    # not (like Random States)
    if cond is not None:
        _cond = [cond]
    else:
        _cond = []
    rvals = rebuild_collect_shared(states_and_outputs + _cond,
                                   updates=updates,
                                   rebuild_strict=True,
                                   copy_inputs_over=True,
                                   no_default_updates=False)
    # extracting the arguments
    input_variables, cloned_outputs, other_rval = rvals
    clone_d, update_d, update_expr, shared_inputs = other_rval
    additional_input_states = []
    additional_output_states = []
    additional_lengths = []
    additional_mintaps = []
    original_numeric_shared_variables = []
    non_numeric_input_states = []
    non_numeric_output_states = []
    original_non_numeric_shared_variables = []
    pos = len(lengths)
    for sv in shared_inputs:
        if sv in update_d:
            if isinstance(sv, (TensorVariable, TensorSharedVariable)):
                # We can treat it as a sit sot
                nw_state = scan_utils.expand(
                    tensor.unbroadcast(tensor.shape_padleft(sv), 0), T)
                additional_lengths.append(scalar_shared(numpy.int64(0),
                                                        name='l%d' % pos))
                pos = pos + 1
                additional_mintaps.append(1)
                additional_input_states.append(nw_state)
                additional_output_states.append(
                    scan_utils.clone(tensor.set_subtensor(
                        nw_state[(t + 1) % additional_lengths[-1]],
                        update_d[sv])))
                original_numeric_shared_variables.append(sv)
            else:
                non_numeric_input_states.append(sv)
                non_numeric_output_states.append(update_d[sv])
                original_non_numeric_shared_variables.append(sv)
    # Replace shared variables in the update
    _additional_output_states = []
    replace = {}
    for sv, buf in zip(original_numeric_shared_variables,
                       additional_input_states):
        replace[sv] = buf[t]
    for out in additional_output_states:
        _additional_output_states.append(
            scan_utils.clone(out, replace=replace))
    additional_output_states = _additional_output_states
    # 5.2 Collect inputs/outputs of the inner function
    inputs = []
    outputs = []
    for n, mintap in enumerate(mintaps):
        if mintap != 0:
            input_state = states_and_outputs_info[n]['initial']
            inputs.append(input_state)
            outputs.append(tensor.set_subtensor(
                input_state[(t + mintap) % lengths[n]],
                states_and_outputs[n]))
        else:
            mem_buffer = scan_utils.allocate_memory(
                T, states_and_outputs_info[n], states_and_outputs[n])
            inputs.append(mem_buffer)
            outputs.append(tensor.set_subtensor(
                mem_buffer[t % lengths[n]], states_and_outputs[n]))
    inputs.extend(additional_input_states)
    outputs.extend(additional_output_states)
    lengths.extend(additional_lengths)
    mintaps.extend(additional_mintaps)
    inputs.extend(non_numeric_input_states)
    outputs.extend(non_numeric_output_states)
    all_other_inputs = gof.graph.inputs(outputs)
    parameters = [x for x in all_other_inputs
                  if (x not in inputs and
                      x not in lengths and
                      x is not t and
                      isinstance(x, gof.Variable) and
                      not isinstance(x, gof.Constant))]
    inputs.extend(parameters)
    # 5.3 Construct the options dictionary
    options['name'] = name
    options['profile'] = profile
    options['mode'] = mode
    options['inplace'] = False
    options['gpu'] = False
    options['truncate_gradient'] = truncate_gradient
    options['hash_inner_graph'] = 0
    # 5.4 Construct the ScanOp instance
    local_op = scan_op.ScanOp(inputs=inputs,
                              outputs=outputs,
                              lengths=lengths,
                              switches=[],
                              mintaps=mintaps,
                              index=t,
                              options=options,
                              as_repeatUntil=cond)
    # Note that we get here all the outputs followed by the update rules
    # to the shared variables we had in our scan.
    # We know that we have (in this given order):
    #   * len(states_and_outputs) real outputs
    #   * len(additional_input_states) updates for numeric shared
    #     variables
    #   * len(non_numeric_input_states) updates for non numeric shared
    #     variables
    scan_inputs = [T] + inputs
    scan_outputs_update_rules = scan_utils.to_list(local_op(*scan_inputs))
    # 5.5 Collect outputs and add permutation object
    scan_outputs = []
    for pos in xrange(len(states_and_outputs)):
        out = scan_utils.ScanPermutation(mintaps[pos])(
            scan_outputs_update_rules[pos], t)
        scan_outputs.append(out[mintaps[pos]:])
    # 5.6 Construct updates dictionary
    update_rules = scan_outputs_update_rules[len(states_and_outputs):]
    updates = {}
    for v, u in izip(original_numeric_shared_variables,
                     update_rules[:len(additional_input_states)]):
        updates[v] = u[-1]
    for v, u in izip(original_non_numeric_shared_variables,
                     update_rules[len(additional_input_states):]):
        updates[v] = u
    # Step 5.7 We are done and can return everything back to the user
    return scan_outputs, updates
def one_step_scan(fn, inputs, states_and_outputs_info, parameters,
                  truncate_gradient):
    """
    This function is evaluated if `n_steps` evaluates to either 1 or -1.
    """
    # 1. Grab slices of sequences
    inputs_slices = [input[0] for input in inputs]
    # 2. Grab slices of states
    states_slices = []
    for n, arg_info in enumerate(states_and_outputs_info):
        if arg_info.get('taps', None) == [-1]:
            states_slices.append(arg_info['initial'])
        elif arg_info.get('taps', None):
            if numpy.any(numpy.array(arg_info.get('taps', [])) > 0):
                # Make sure we do not have requests for future values of a
                # sequence we can not provide such values
                raise ValueError('Can not use future taps of outputs',
                                 arg_info)
            # go through the taps
            mintap = abs(numpy.min(arg_info['taps']))
            states_slices.extend(
                [arg_info['initial'][k + mintap] for k in arg_info['taps']])
    # Re-order args
    args = (inputs_slices + states_slices + parameters)
    cond, states_and_outputs, updates = \
        scan_utils.get_updates_and_outputs(fn(*args))
    # We do not need to use the scan op anymore, so we can just return
    # the outputs and updates we have
    if cond is not None:
        _logger.warning('When the number of steps is fixed and equal '
                        'to 1, the provided stopping condition, %s,'
                        ' is ignored', str(cond))
    states_and_outputs = [tensor.unbroadcast(tensor.shape_padleft(arg), 0)
                          for arg in states_and_outputs]
    if len(states_and_outputs) == 1:
        states_and_outputs = states_and_outputs[0]
    return (states_and_outputs, updates)
theano/sandbox/scan_module/scan_op.py
deleted 100644 → 0
"""
This module provides the Scan Op.

See scan.py for details on scan.
"""
from __future__ import print_function

__docformat__ = 'restructedtext en'
__authors__ = ("Razvan Pascanu "
               "Frederic Bastien "
               "James Bergstra "
               "Pascal Lamblin ")
__copyright__ = "(c) 2010, Universite de Montreal"
__contact__ = "Razvan Pascanu <r.pascanu@gmail>"

import logging

import numpy
from six import iteritems
from six.moves import xrange

import theano
from theano import compile
from theano.compat import izip
from theano.gof import PureOp, Apply
from theano import gof
from theano.tensor import TensorType
from theano.tensor.opt import Shape_i

# Logging function for sending warning or info
_logger = logging.getLogger('theano.scan_module.scan_op')
class ScanOp(PureOp):
    def __init__(self, inputs, outputs, lengths, switches, mintaps,
                 index, options, as_repeatUntil):
        self.inputs = inputs
        self.outputs = outputs
        self.index = index
        self.switches = switches
        self.lengths = lengths
        self.mintaps = mintaps
        self.as_repeatUntil = as_repeatUntil
        self.options = options
        self.name = options['name']
        self.mode = options['mode']
        self.inplace = options['inplace']
        self.gpu = options['gpu']
        self.profile = options['profile']
        self.hash_inner_graph = options['hash_inner_graph']
        # --Construct the destroy map--
        if self.inplace:
            for idx in xrange(len(outputs)):
                self.destroy_map[idx] = [idx + 1]
        # --Decide on the default mode--
        mode_instance = compile.mode.get_mode(self.mode)
        # if the default mode is used, and that mode is ProfileMode
        # then we need to copy the mode otherwise the time for a given
        # op will be counted multiple times
        if (self.mode is None and
                isinstance(mode_instance,
                           compile.profilemode.ProfileMode)):
            mode_instance = compile.profilemode.ProfileMode(
                optimizer=mode_instance.provided_optimizer,
                linker=mode_instance.provided_linker)
            compile.profilemode.prof_mode_instance_to_print.append(
                mode_instance)
            self.mode_instance = mode_instance
            if self.name:
                self.mode_instance.message = self.name + " sub profile"
            else:
                self.mode_instance.message = "Scan sub profile"
        else:
            self.mode_instance = mode_instance
        # --Adding default name--
        if not hasattr(self, 'name') or self.name is None:
            self.name = 'scan_fn'
    def make_node(self, *inputs):
        # Checking if arguments are of the right type is done in the scan
        # function
        out_types = [out.type() for out in self.outputs]
        return Apply(self, inputs, out_types)

    def __eq__(self, other):
        # Check if we are dealing with same type of objects
        if not type(self) == type(other):
            return False
        if self.options != other.options:
            return False
        if self.mintaps != other.mintaps:
            return False
        # Check if the number of different types of arguments is the same
        diff_args = ['inputs', 'outputs', 'lengths', 'mintaps', 'switches']
        for arg in diff_args:
            if len(getattr(self, arg)) != len(getattr(other, arg)):
                return False
        for x, y in izip(self.inputs, other.inputs):
            if x.type != y.type:
                return False
        for x, y in izip(self.lengths, other.lengths):
            if x.type != y.type:
                return False
        s_ins = [self.index] + self.inputs + self.lengths + self.switches
        o_ins = [other.index] + other.inputs + other.lengths + \
            other.switches
        givens = dict(izip(s_ins, o_ins))
        # This part might be slow
        for x, y in izip(self.outputs, other.outputs):
            if not gof.graph.is_same_graph(x, y, givens=givens):
                return False
        return True
    def __str__(self):
        if self.gpu:
            gpu_str = 'gpu'
        else:
            gpu_str = 'cpu'
        if self.as_repeatUntil is not None:
            name = 'repeat/until'
        else:
            name = 'loop'
        if self.inplace:
            aux_txt = '%s{inplace,%s,%s}' % (name, gpu_str, str(self.name))
        else:
            aux_txt = '%s{%s,%s}' % (name, gpu_str, str(self.name))
        return aux_txt

    def __hash__(self):
        rval = hash(type(self)) ^ self.hash_inner_graph
        for val in self.options.values():
            if isinstance(val, (list, tuple)):
                for el in val:
                    rval = rval ^ el
            else:
                rval = rval ^ val
        return rval
    def infer_shape(self, node, input_shapes):
        for inp, inp_shp in izip(node.inputs, input_shapes):
            assert inp_shp is None or len(inp_shp) == inp.type.ndim
        n_outs = len(self.outputs)
        if self.as_repeatUntil is not None:
            return [(Shape_i(0)(o),) + x[1:]
                    for o, x in izip(node.outputs,
                                     input_shapes[1: n_outs + 1])]
        else:
            return input_shapes[1: n_outs + 1]
    def make_thunk(self, node, storage_map, compute_map, no_recycling):
        """
        Parameters
        ----------
        node
            The Apply node returned by the ``make_node`` function of the
            scan op class.
        storage_map
            dict variable -> one-element-list where a computed value for
            this variable may be found.
        compute_map
            dict variable -> one-element-list where a boolean value will
            be found. The boolean indicates whether the variable's
            storage_map container contains a valid value (True) or if it
            has not been computed yet (False).
        no_recycling
            List of variables for which it is forbidden to reuse memory
            allocated by a previous call.

        Notes
        -----
        If the thunk consults the storage_map on every call, it is safe
        for it to ignore the no_recycling argument, because elements of
        the no_recycling list will have a value of None in the storage
        map. If the thunk can potentially cache return values (like
        CLinker does), then it must not do so for variables in the
        no_recycling list.
        """
        # 1. Collect all memory buffers
        node_input_storage = [storage_map[r] for r in node.inputs]
        node_output_storage = [storage_map[r] for r in node.outputs]
        node_input_compute = [compute_map[r] for r in node.inputs]
        node_output_compute = [compute_map[r] for r in node.outputs]
        # 2. Construct fake shared variables around every argument of
        # scan
        givens = {}
        base_inputs = self.inputs[:len(self.outputs)]
        base_buffers = node_input_storage[1: 1 + len(base_inputs)]
        aux_inputs = self.inputs[len(self.outputs):]
        aux_membuffers = node_input_storage[1 + len(base_inputs):]

        # 2.1 First the auxiliary arguments, those that are parameters or
        # input
        def fake_shared(var):
            val = 0
            for dim in xrange(var.ndim):
                val = [val]
            val = numpy.asarray(val, dtype=var.dtype)
            return theano.shared(val, name=var.name)

        non_tensor_args = []
        non_tensor_buffers = []
        aux_buffers = []
        for mem_buf, var in izip(aux_membuffers, aux_inputs):
            if mem_buf[0] is not None:
                givens[var] = theano.shared(mem_buf[0], name=var.name,
                                            borrow=True)
            elif isinstance(var, TensorType):
                givens[var] = fake_shared(var)
                aux_buffers.append((givens[var], mem_buf))
            else:
                givens[var] = var.type()
                non_tensor_args.append(givens[var])
                non_tensor_buffers.append(mem_buf)
        # 2.2 Next the states (numeric) and the outputs
        updates = {}
        state_buffers = []
        n_numeric_values = len(self.lengths)
        for pos in xrange(n_numeric_values):
            var = base_inputs[pos]
            mem_buf = base_buffers[pos]
            expr = self.outputs[pos]
            givens[var] = fake_shared(var)
            state_buffers.append((givens[var], self.lengths[pos], mem_buf))
            updates[givens[var]] = expr
        # 2.3 Non-numeric states
        n_non_numeric = len(self.outputs) - n_numeric_values
        fn_outs = self.outputs[n_numeric_values:]
        for var in base_inputs[n_numeric_values:]:
            givens[var] = var.type()
            non_tensor_args.append(givens[var])
        non_numeric_states_bufs = base_buffers[n_numeric_values:]
        # 2.4 Add the update for the index of scan
        updates[self.index] = self.index + numpy.int64(1)
        # 3.1 Construct the inner function of scan
        if self.as_repeatUntil is not None:
            fn_outs = self.as_repeatUntil
        self.fn = theano.function(non_tensor_args,
                                  fn_outs,
                                  givens=givens,
                                  updates=updates,
                                  mode=self.mode_instance,
                                  name=self.name,
                                  profile=self.profile)
        # 3.2 Construct the perform
        if self.as_repeatUntil is not None:
            # 3.2.1 as a repeat until
            def p(node, args, outs):
                pos = 0
                cont = 1
                # copy inputs if not inplace
                if not self.inplace:
                    for _, _, val in state_buffers:
                        val[0] = val[0].copy()
                    for buf in non_numeric_states_bufs:
                        buf[0] = buf[0].copy()
                # reset all switches if any
                for sw in self.switches:
                    sw.set_value(numpy.int8(0), borrow=True)
                # set aux shared variables
                for var, val in aux_buffers:
                    var.set_value(val[0], borrow=True)
                # set state shared variables
                for var, length, val in state_buffers:
                    var.set_value(val[0], borrow=True)
                    length.set_value(val[0].shape[0], borrow=True)
                self.index.set_value(numpy.int64(0))
                # grab fixed arguments
                fix_args = [x[0] for x in non_tensor_buffers]
                while cont and pos < node_input_storage[0][0]:
                    extra_args = [x[0] for x in non_numeric_states_bufs]
                    rvals = self.fn(*(fix_args + extra_args))
                    for buf, rval in izip(non_numeric_states_bufs, rvals):
                        buf[0] = rval
                    cont = rvals[-1]
                    pos = pos + 1
                # We need to trim the outputs if they are longer
                for pos in xrange(n_numeric_values):
                    buf = state_buffers[pos][2][0]
                    mintap = self.mintaps[pos]
                    if buf.shape[0] > pos + self.mintaps[pos]:
                        node_output_storage[pos][0] = buf[: pos + mintap]
                    else:
                        node_output_storage[pos][0] = buf
                for out_buf, in_buf in izip(
                        node_output_storage[n_numeric_values:],
                        non_numeric_states_bufs):
                    out_buf[0] = in_buf[0]
        else:
            # 3.2.2 as a for
            def p(node, args, outs):
                # copy inputs if not inplace
                if not self.inplace:
                    for _, _, val in state_buffers:
                        val[0] = val[0].copy()
                    for buf in non_numeric_states_bufs:
                        buf[0] = buf[0].copy()
                # reset all switches if any
                for sw in self.switches:
                    sw.set_value(numpy.int8(0), borrow=True)
                # set aux shared variables
                for var, val in aux_buffers:
                    var.set_value(val[0], borrow=True)
                # set state shared variables
                for var, length, val in state_buffers:
                    var.set_value(val[0], borrow=True)
                    length.set_value(val[0].shape[0], borrow=True)
                self.index.set_value(numpy.int64(0))
                # grab fixed arguments
                fix_args = [x[0] for x in non_tensor_buffers]
                for dx in xrange(node_input_storage[0][0]):
                    extra_args = [x[0] for x in non_numeric_states_bufs]
                    rvals = self.fn(*(fix_args + extra_args))
                    for buf, rval in izip(non_numeric_states_bufs, rvals):
                        buf[0] = rval
                for pos in xrange(n_numeric_values):
                    buf = state_buffers[pos][0].get_value(borrow=True)
                    mintap = self.mintaps[pos]
                    node_output_storage[pos][0] = buf
                for out_buf, in_buf in izip(
                        node_output_storage[n_numeric_values:],
                        non_numeric_states_bufs):
                    out_buf[0] = in_buf[0]
        # 3.3 construct the rval function
        def rval(p=p, i=node_input_storage, o=node_output_storage,
                 n=node):
            r = p(n, [x[0] for x in i], o)
            for out in node.outputs:
                compute_map[out][0] = True
            return r

        rval.inputs = node_input_storage
        rval.outputs = node_output_storage
        rval.perform = p
        rval.lazy = False
        return rval
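The thunk machinery above relies on one-element lists acting as storage cells shared between the graph and the runtime. A minimal plain-Python sketch of this pattern (all names here are illustrative, not part of the Theano API):

```python
# Inputs and outputs live in one-element "storage cells"; the thunk
# reads the cells, runs a perform function, and flags outputs computed.
def make_simple_thunk(perform, input_cells, output_cells, computed):
    def thunk():
        args = [cell[0] for cell in input_cells]
        output_cells[0][0] = perform(*args)
        computed[0] = True
    return thunk

in_cells = [[3], [4]]
out_cells = [[None]]
done = [False]
thunk = make_simple_thunk(lambda a, b: a + b, in_cells, out_cells, done)
thunk()  # out_cells[0][0] is now 7 and done[0] is True
```

Because cells are mutable lists, callers can refill `in_cells` and re-run the thunk without rebuilding it, which is exactly why `make_thunk` passes `storage_map` entries around instead of raw values.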
@theano.compile.profilemode.register_profiler_printer
def profile_printer(fct_name, compile_time, fct_call_time, fct_call,
                    apply_time, apply_cimpl, message, outputs_size,
                    other_time):
    # Scan overhead profile
    if any([isinstance(node.op, Scan) and v > 0
            for (_, node), v in apply_time.items()]):
        print()
        print('Scan overhead:')
        print('<Scan op time(s)> <sub scan fct time(s)> <sub scan op '
              'time(s)> <sub scan fct time(% scan op time)> <sub scan '
              'op time(% scan op time)> <node>')
        total_super_scan_time = 0
        total_scan_fct_time = 0
        total_scan_op_time = 0
        for (_, node), v in iteritems(apply_time):
            if isinstance(node.op, Scan):
                if v > 0:
                    scan_fct_time = node.op.mode_instance.fn_time
                    scan_op_time = node.op.mode_instance.local_time
                    total_super_scan_time += v
                    total_scan_fct_time += scan_fct_time
                    total_scan_op_time += scan_op_time
                    print('    %5.1fs  %5.1fs  %5.1fs  %5.1f%%  %5.1f%%' %
                          (v,
                           scan_fct_time,
                           scan_op_time,
                           scan_fct_time / v * 100,
                           scan_op_time / v * 100),
                          node)
                else:
                    print((' The node took 0s, so we can not compute the '
                           'overhead'),
                          node)
        print('    total %5.1fs  %5.1fs  %5.1fs  %5.1f%%  %5.1f%%' %
              (total_super_scan_time,
               total_scan_fct_time,
               total_scan_op_time,
               total_scan_fct_time / total_super_scan_time * 100,
               total_scan_op_time / total_super_scan_time * 100))
theano/sandbox/scan_module/scan_utils.py
deleted 100644 → 0
"""
This module provides utility functions for the Scan Op.

See scan.py for details on scan.
"""
from __future__ import print_function

__docformat__ = 'restructedtext en'
__authors__ = ("Razvan Pascanu "
               "Frederic Bastien "
               "James Bergstra "
               "Pascal Lamblin "
               "Arnaud Bergeron")
__copyright__ = "(c) 2010, Universite de Montreal"
__contact__ = "Razvan Pascanu <r.pascanu@gmail>"

import copy
import logging
import warnings

import numpy
from six.moves import xrange

import theano
from theano.compat import izip
from theano.compile.pfunc import rebuild_collect_shared
from theano import gof
from theano import tensor, scalar
from theano.tensor.basic import get_scalar_constant_value

# Logging function for sending warning or info
_logger = logging.getLogger('theano.scan_utils')
def expand(tensor_var, size):
    """
    Given ``tensor_var``, a Theano tensor of shape (d1, d2, ...), this
    function constructs a rval Theano tensor of shape (d1 + size, d2,
    ...) filled with 0s, except for the first d1 entries, which are
    taken from ``tensor_var``, namely:

        rval[:d1] = tensor_var

    Parameters
    ----------
    tensor_var : Theano tensor variable
    size : int
    """
    # Corner case that I might use in an optimization
    if size == 0:
        return tensor_var
    shapes = [tensor_var.shape[x] for x in xrange(tensor_var.ndim)]
    zeros_shape = [size + shapes[0]] + shapes[1:]
    empty = tensor.zeros(zeros_shape, dtype=tensor_var.dtype)
    return tensor.set_subtensor(empty[:shapes[0]], tensor_var)
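A rough numeric analogue of ``expand`` in plain Python, to make the padding behavior concrete (illustrative only; ``expand_rows`` is a made-up name and this is not the Theano implementation, which operates on symbolic tensors):

```python
# Pad `size` zero-filled rows after the original rows, so the result
# has len(rows) + size rows and result[:len(rows)] == rows.
def expand_rows(rows, size):
    if size == 0:
        return rows
    width = len(rows[0]) if rows else 0
    return rows + [[0] * width for _ in range(size)]

padded = expand_rows([[1, 2], [3, 4]], 2)
# padded == [[1, 2], [3, 4], [0, 0], [0, 0]]
```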
def to_list(ls):
    """
    Converts ``ls`` to a list if it is a tuple, or wraps ``ls`` into a
    list if it is not a list already.
    """
    if isinstance(ls, (list, tuple)):
        return list(ls)
    else:
        return [ls]
class until(object):
    """
    A scan loop can end early on a condition. In order to differentiate
    this condition from the other outputs of scan, this class is used
    to wrap the condition.
    """
    def __init__(self, condition):
        self.condition = tensor.as_tensor_variable(condition)
        assert self.condition.ndim == 0
def get_updates_and_outputs(ls):
    """
    Parses the list ``ls`` into outputs and updates.

    The semantics of ``ls`` is defined by the constructive function of scan.
    The elements of ``ls`` are either a list of expressions representing the
    outputs/states, a dictionary of updates or a condition.
    """
    def is_list_outputs(elem):
        if (isinstance(elem, (list, tuple)) and
                all([isinstance(x, theano.Variable) for x in elem])):
            return True
        if isinstance(elem, theano.Variable):
            return True
        return False

    def is_updates(elem):
        if isinstance(elem, dict):
            return True
        # Dictionaries can be given as lists of tuples
        if (isinstance(elem, (list, tuple)) and
                all([isinstance(x, (list, tuple)) and len(x) == 2
                     for x in elem])):
            return True
        return False

    def is_condition(elem):
        return isinstance(elem, until)

    if is_list_outputs(ls):
        return None, to_list(ls), {}
    if is_updates(ls):
        return None, [], dict(ls)
    if not isinstance(ls, (list, tuple)):
        raise ValueError(('Scan can not parse the return value'
                          ' of your constructive function given to scan'))
    ls = list(ls)
    deprecation_msg = ('The return value of the lambda function'
                       ' has been restricted. You have to always return'
                       ' first the outputs (if any), afterwards the updates'
                       ' (if any) and at the end the condition')
    error_msg = ('Scan can not parse the return value of your constructive '
                 'function given to scan')
    if len(ls) == 2:
        if is_list_outputs(ls[0]):
            if is_updates(ls[1]):
                return (None, to_list(ls[0]), dict(ls[1]))
            elif is_condition(ls[1]):
                return (ls[1].condition, to_list(ls[0]), {})
            else:
                raise ValueError(error_msg)
        elif is_updates(ls[0]):
            if is_list_outputs(ls[1]):
                raise ValueError(deprecation_msg)
            elif is_condition(ls[1]):
                return (ls[1].condition, [], dict(ls[0]))
            else:
                raise ValueError(error_msg)
        else:
            raise ValueError(error_msg)
    elif len(ls) == 3:
        if is_list_outputs(ls[0]):
            if is_updates(ls[1]):
                if is_condition(ls[2]):
                    return (ls[2].condition, to_list(ls[0]), dict(ls[1]))
                else:
                    raise ValueError(error_msg)
            else:
                raise ValueError(error_msg)
        else:
            raise ValueError(error_msg)
    else:
        raise ValueError(error_msg)
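The dispatch rule above (condition, then outputs, then updates) can be sketched without Theano. The names ``Until`` and ``parse`` below are hypothetical stand-ins for ``until`` and ``get_updates_and_outputs``; outputs are modeled as plain strings, and the strict ordering checks are omitted for brevity:

```python
class Until(object):
    """Stand-in for scan's ``until`` condition wrapper."""
    def __init__(self, condition):
        self.condition = condition


def parse(rval):
    """Return (condition, outputs, updates) from an inner-function
    return value, mirroring the parsing rule above (loosely)."""
    if isinstance(rval, Until):
        return rval.condition, [], {}
    if isinstance(rval, dict):
        return None, [], dict(rval)
    if not isinstance(rval, (list, tuple)):
        return None, [rval], {}
    cond, outs, upds = None, [], {}
    for elem in rval:
        if isinstance(elem, Until):
            cond = elem.condition
        elif isinstance(elem, dict):
            upds = dict(elem)
        else:
            outs.append(elem)
    return cond, outs, upds


assert parse('y') == (None, ['y'], {})
assert parse({'a': 1}) == (None, [], {'a': 1})
assert parse(['y', {'s': 2}, Until(True)]) == (True, ['y'], {'s': 2})
```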
def clone(output, replace=None, strict=True, share_inputs=True):
    """
    Function that allows replacing subgraphs of a computational graph.

    It returns a copy of the initial subgraph with the corresponding
    substitutions.

    Parameters
    ----------
    output : Theano variable (or Theano expressions)
        Theano expression that represents the computational graph.
    replace : dict
        Dictionary describing which subgraphs should be replaced by what.
    share_inputs : bool
        If True, use the same inputs (and shared variables) as the original
        graph. If False, clone them. Note that cloned shared variables still
        use the same underlying storage, so they will always have the same
        value.
    """
    inps, outs, other_stuff = rebuild_collect_shared(output,
                                                     [],
                                                     replace,
                                                     [],
                                                     strict,
                                                     share_inputs)
    return outs
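The idea of cloning with substitution can be illustrated on a toy expression representation (nested tuples; this is a pure-Python analogue, not Theano's implementation, and ``clone_expr`` is a hypothetical name):

```python
def clone_expr(expr, replace):
    """Return a copy of ``expr`` with subexpressions substituted
    according to the ``replace`` dict."""
    if expr in replace:
        return replace[expr]
    if isinstance(expr, tuple):
        return tuple(clone_expr(e, replace) for e in expr)
    return expr


f1 = ('mul', 'z', ('add', 'x', 'y'))
f2 = clone_expr(f1, {'y': ('neg', 'x')})
assert f2 == ('mul', 'z', ('add', 'x', ('neg', 'x')))
```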
def canonical_arguments(sequences, outputs_info, non_sequences,
                        go_backwards, n_steps):
    """
    This rewrites the arguments obtained from scan into a more friendly
    form for the scan_op.

    Mainly it makes sure that arguments are given as lists of dictionaries,
    and that the different fields of a dictionary are set to a default
    value if the user has not provided any.
    """
    states_info = to_list(outputs_info)
    parameters = [tensor.as_tensor_variable(x)
                  for x in to_list(non_sequences)]

    inputs = []
    if n_steps is not None:
        negative_n_steps = tensor.lt(tensor.as_tensor_variable(n_steps), 0)
    for input in to_list(sequences):
        if not isinstance(input, dict):
            nw_input = tensor.as_tensor_variable(input)
            if go_backwards:
                nw_input = nw_input[::-1]
            if n_steps is not None:
                nw_input = tensor.switch(negative_n_steps,
                                         nw_input[::-1], nw_input)
            inputs.append(tensor.as_tensor_variable(nw_input))
        elif input.get('taps', True) is None:
            nw_input = tensor.as_tensor_variable(input['input'])
            if go_backwards:
                nw_input = nw_input[::-1]
            if n_steps is not None:
                nw_input = tensor.switch(negative_n_steps,
                                         nw_input[::-1], nw_input)
            inputs.append(nw_input)
        elif input.get('taps', None):
            mintap = numpy.min(input['taps'])
            maxtap = numpy.max(input['taps'])
            orig_input = tensor.as_tensor_variable(input['input'])
            if go_backwards:
                orig_input = orig_input[::-1]
            if n_steps is not None:
                orig_input = tensor.switch(negative_n_steps,
                                           orig_input[::-1], orig_input)
            for k in input['taps']:
                # We cut the sequence such that seq[i] corresponds to
                # seq[i - k]
                if maxtap < 0:
                    offset_max = abs(maxtap)
                else:
                    offset_max = 0
                if mintap < 0:
                    offset_min = abs(mintap)
                else:
                    offset_min = 0
                nw_input = orig_input
                if maxtap == mintap and maxtap != 0:
                    if maxtap > 0:
                        nw_input = nw_input[maxtap:]
                    else:
                        nw_input = nw_input[:maxtap]
                else:
                    st = k + offset_min
                    if maxtap > 0:
                        ed = -(maxtap + offset_min - st)
                    else:
                        ed = -(offset_min - st)
                    if ed != 0:
                        nw_input = nw_input[st:ed]
                    else:
                        nw_input = nw_input[st:]
                inputs.append(nw_input)
        else:
            raise ValueError('Provided sequence makes no sense', str(input))

    # Since we've added all sequences now we need to level them up based on
    # n_steps or their different shapes
    if n_steps is None:
        if len(inputs) == 0:
            # No information about the number of steps
            raise ValueError('You need to provide either at least '
                             'one sequence over which scan should loop '
                             'or a number of steps for scan to loop. '
                             'Neither of the two had been provided!')
        T = inputs[0].shape[0]
        for input in inputs[1:]:
            T = tensor.minimum(T, input.shape[0])
    else:
        T = abs(tensor.as_tensor(n_steps))
    # Level up sequences
    inputs = [input[:T] for input in inputs]

    # Wrap outputs info in a dictionary if they are not already in one
    for i, state in enumerate(states_info):
        if state is not None and not isinstance(state, dict):
            states_info[i] = dict(
                initial=tensor.as_tensor_variable(state), taps=[-1])
        elif isinstance(state, dict):
            if not state.get('initial', None) and state.get('taps', None):
                raise ValueError(('If you are using slices of an output '
                                  'you need to provide an initial state '
                                  'for it'), state)
            elif state.get('initial', None) and not state.get('taps', None):
                # Initial state but taps not provided
                if 'taps' in state:
                    # Explicitly provided a None for taps
                    _logger.warning(
                        ('Output %s (index %d) has an initial '
                         'state but taps is explicitly set to None '),
                        getattr(states_info[i]['initial'], 'name', 'None'),
                        i)
                states_info[i]['taps'] = [-1]
                states_info[i]['initial'] = \
                    tensor.as_tensor_variable(state['initial'])
            elif state.get('initial', None):
                states_info[i]['initial'] = \
                    tensor.as_tensor_variable(state['initial'])
        else:
            # If a None is provided as the output info we replace it
            # with an empty dict() to simplify handling
            states_info[i] = dict()
    return inputs, states_info, parameters, T
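The tap-offset slicing arithmetic used for sequences above can be demonstrated on a plain Python list: for taps ``[-2, 0]``, each tap's slice is aligned so that at step ``i`` the tap ``k`` slice yields the element at absolute position ``i + k - mintap``. A minimal sketch of that arithmetic (no Theano involved):

```python
seq = list(range(10))
taps = [-2, 0]
mintap, maxtap = min(taps), max(taps)
offset_min = abs(mintap) if mintap < 0 else 0

slices = {}
for k in taps:
    # Same start/end computation as in canonical_arguments above
    st = k + offset_min
    ed = -(maxtap + offset_min - st) if maxtap > 0 else -(offset_min - st)
    slices[k] = seq[st:ed] if ed != 0 else seq[st:]

# The tap -2 slice lags the tap 0 slice by exactly two positions.
assert slices[-2] == list(range(8))
assert slices[0] == list(range(2, 10))
```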
def infer_shape(outs, inputs, input_shapes):
    """
    Compute the shape of the outputs given the shape of the inputs of a
    Theano graph.

    We do it this way to avoid compiling the inner function just to get the
    shape. Changes to ShapeFeature could require changes in this function.
    """
    # We use a ShapeFeature because it has all the necessary logic
    # inside. We don't use the full ShapeFeature interface, but we
    # let it initialize itself with an empty fgraph, otherwise we will
    # need to do it manually.
    for inp, inp_shp in izip(inputs, input_shapes):
        if inp_shp is not None and len(inp_shp) != inp.ndim:
            assert len(inp_shp) == inp.ndim

    shape_feature = tensor.opt.ShapeFeature()
    shape_feature.on_attach(theano.gof.FunctionGraph([], []))

    # Initialize shape_of with the input shapes
    for inp, inp_shp in izip(inputs, input_shapes):
        shape_feature.set_shape(inp, inp_shp)

    def local_traverse(out):
        """
        Go back in the graph, from out, adding computable shapes to
        shape_of.
        """
        if out in shape_feature.shape_of:
            # Its shape is already known
            return
        elif out.owner is None:
            # This is an input of the graph
            shape_feature.init_r(out)
        else:
            # Recurse over inputs
            for inp in out.owner.inputs:
                if inp not in shape_feature.shape_of:
                    local_traverse(inp)
            # shape_feature.on_import does not actually use an fgraph.
            # It will call infer_shape and set_shape appropriately.
            dummy_fgraph = None
            shape_feature.on_import(dummy_fgraph, out.owner, reason="dummy")

    ret = []
    for o in outs:
        local_traverse(o)
        ret.append(shape_feature.shape_of[o])
    return ret
def allocate_memory(T, y_info, y):
    """
    Allocates memory for an output of scan.

    Parameters
    ----------
    T : scalar
        Variable representing the number of steps scan will run.
    y_info : dict
        Dictionary describing the output (more specifically describing the
        shape information for the output).
    y : tensor variable
        Expression describing the computation resulting in one entry of y.
        It can be used to infer the shape of y.
    """
    if 'shape' in y_info:
        return tensor.zeros([T, ] + list(y_info['shape']),
                            dtype=y.dtype)
    else:
        inputs = gof.graph.inputs([y])
        ins_shapes = []
        for inp in inputs:
            in_shape = [inp.shape[k] for k in xrange(inp.ndim)]
            ins_shapes.append(in_shape)
        shape = infer_shape([y], inputs, ins_shapes)[0]
        return tensor.zeros([T, ] + shape, dtype=y.dtype)
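In NumPy terms, what ``allocate_memory`` sets up is a preallocated buffer of shape ``(T,) + per_step_shape`` that the loop then fills one time-slice per step; a small sketch:

```python
import numpy as np

T, per_step_shape = 5, (3,)
buf = np.zeros((T,) + per_step_shape)
for t in range(T):
    # Stand-in for the value computed by the inner function at step t
    buf[t] = np.full(per_step_shape, t)

assert buf.shape == (5, 3)
assert list(buf[-1]) == [4.0, 4.0, 4.0]
```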
class ScanPermutation(gof.Op):
    def __init__(self, mintap=0, inplace=False):
        self.inplace = inplace
        self.mintap = mintap
        if inplace:
            self.destroy_map = {0: [0]}

    def __eq__(self, other):
        return (type(self) == type(other) and
                self.inplace == other.inplace)

    def __hash__(self):
        return hash(type(self)) ^ hash(self.inplace)

    def __str__(self):
        if self.inplace:
            return "scan_permutation{inplace}"
        else:
            return "scan_permutation"

    def make_node(self, membuffer, index):
        # index has to be a scalar
        assert index.ndim == 0
        # we need at least one dimension
        assert membuffer.ndim > 0
        return gof.Apply(self,
                         [membuffer, index],
                         [membuffer.type()])

    def perform(self, node, inputs, outputs):
        membuffer = inputs[0]
        index = inputs[1] + self.mintap
        out = outputs[0]
        if index % membuffer.shape[0] == 0:
            if self.inplace:
                out[0] = membuffer
            else:
                out[0] = membuffer.copy()
        else:
            pos = index % membuffer.shape[0]
            if outputs[0] is membuffer:
                membuffer = membuffer.copy()
            out[0][:membuffer.shape[0] - pos] = membuffer[pos:]
            out[0][membuffer.shape[0] - pos:] = membuffer[:pos]

    def R_op(self, inputs, eval_points):
        if eval_points[0] is None:
            return [None]
        return self.make_node(eval_points[0], inputs[1]).outputs

    def grad(self, inputs, grads):
        pos = (inputs[0].shape[0] -
               (inputs[1] % inputs[0].shape[0]))
        return self.make_node(grads[0], pos).outputs
theano/sandbox/scan_module/tests/__init__.py deleted 100644 → 0 (Browse file @ 7320e1b1)
theano/sandbox/scan_module/tests/test_scan.py deleted 100644 → 0 (Browse file @ 7320e1b1)
from __future__ import print_function

import os
import shutil
from tempfile import mkdtemp
import time
import sys
import unittest

import six.moves.cPickle as pickle
from six.moves import xrange
import numpy
from numpy.testing import dec

import theano
import theano.sandbox.rng_mrg
from theano import tensor
from theano.compat import izip
from theano.compile.pfunc import rebuild_collect_shared
from theano.tests import unittest_tools as utt
from theano.tests.unittest_tools import SkipTest
from .test_utils import *
import theano.sandbox.scan_module as scan_module
from theano.sandbox.scan_module.scan_op import ScanOp


class TestScan(unittest.TestCase):

    def setUp(self):
        utt.seed_rng()
    def new_run(self,
                inputs_info,
                states_info,
                parameters_info,
                n_outputs,
                n_shared_updates):
        """Generates a test for scan.

        :param inputs_info: list of lists of dictionaries
            Each list of dictionaries represents one input sequence. Each
            dictionary is one tap of that sequence. The dictionary has two
            keys. ``use`` is either True or False, and it indicates if this
            tap should be used in the inner graph or not. ``tap`` is the
            tap value.
        :param states_info: list of lists of dictionaries
            See param ``inputs_info``. ``states_info`` has the same
            semantics, just that it is for states and not for inputs.
        :param parameters_info: list of dictionaries
            Each dictionary is a different parameter. It has only one key,
            namely ``use``, which says if the parameter should be used
            internally or not.
        :param n_outputs: int
            Number of pure outputs for scan.
        :param n_shared_updates: int
            Number of shared variables with updates. They are all numeric.
        """
        # Check that the scan node has at least one output
        if n_outputs + n_shared_updates + len(states_info) == 0:
            return

        rng = numpy.random.RandomState(utt.fetch_seed())
        n_ins = len(inputs_info)
        inputs = [tensor.matrix('u%d' % k) for k in xrange(n_ins)]
        scan_inputs = []
        for inp, info in zip(inputs, inputs_info):
            scan_inputs.append(dict(input=inp,
                                    taps=[x['tap'] for x in info]))
        n_states = len(states_info)
        scan_states = []
        states = []
        # ``k`` indexes the state (the original code reused the leftover
        # loop variable from the inputs above)
        for k, info in enumerate(states_info):
            if len(info) == 1 and info[0]['tap'] == -1:
                state = tensor.vector('x%d' % k)
                states.append(state)
                scan_states.append(state)
            else:
                state = tensor.matrix('x%d' % k)
                states.append(state)
                scan_states.append(
                    dict(initial=state, taps=[x['tap'] for x in info]))
        n_parameters = len(parameters_info)
        parameters = [tensor.vector('p%d' % k)
                      for k in xrange(n_parameters)]
        original_shared_values = []
        shared_vars = []
        for k in xrange(n_shared_updates):
            data = rng.uniform(size=(4,)).astype(theano.config.floatX)
            original_shared_values.append(data)
            shared_vars.append(theano.shared(data, name='z%d' % k))
        def inner_function(*args):
            """
            Function that constructs the inner graph of scan.
            """
            arg_pos = 0
            to_add = None
            for in_info in inputs_info:
                for info in in_info:
                    arg = args[arg_pos]
                    arg_pos += 1
                    # Construct dummy graph around input
                    if info['use']:
                        if to_add is None:
                            to_add = arg * 2
                        else:
                            to_add = to_add + arg * 2
            states_out = [to_add] * n_states
            for dx, st_info in enumerate(states_info):
                for info in st_info:
                    arg = args[arg_pos]
                    arg_pos += 1
                    if info['use']:
                        if states_out[dx]:
                            states_out[dx] = states_out[dx] + arg * 3
                        else:
                            states_out[dx] = arg * 3
            for info in parameters_info:
                arg = args[arg_pos]
                arg_pos += 1
                if info['use']:
                    if to_add is None:
                        to_add = arg * 4
                    else:
                        to_add = to_add + arg * 4
            if to_add is not None:
                shared_outs = [sh * 5 + to_add for sh in shared_vars]
                rval = []
                for arg in states_out:
                    if arg is None:
                        rval.append(to_add)
                    else:
                        rval.append(arg + to_add)
                states_out = rval
                pure_outs = [to_add ** 2 for x in xrange(n_outputs)]
            else:
                shared_outs = [sh * 5 for sh in shared_vars]
                states_out = [x for x in states_out]
                pure_outs = [2 for x in xrange(n_outputs)]
            return states_out + pure_outs, dict(izip(shared_vars,
                                                     shared_outs))
        def execute_inner_graph(*args):
            """
            Function that computes numerically the values that scan should
            return.
            """
            # Check if you need to go back in time over the sequences (the
            # first argument is n_steps, the second is go_backwards)
            nsteps = args[0]
            invert = False
            if args[1]:
                nsteps = nsteps * -1
            if nsteps < 0:
                new_ins = [x[::-1] for x in args[2: 2 + n_ins]]
            else:
                new_ins = [x for x in args[2: 2 + n_ins]]
            nsteps = abs(nsteps)
            # Simplify the inputs by slicing them according to the taps
            nw_inputs = []
            for inp, info in zip(new_ins, inputs_info):
                taps = [x['tap'] for x in info]
                if numpy.min(taps) < 0:
                    _offset = abs(numpy.min(taps))
                else:
                    _offset = 0
                nw_inputs += [inp[_offset + k:] for k in taps]
            # Simplify the states by slicing them according to the taps.
            # Note that if the memory buffer for the inputs and outputs is
            # the same, by changing the outputs we also change the inputs
            nw_states_inputs = []
            nw_states_outs = []
            for st, info in zip(args[2 + n_ins: 2 + n_ins + n_states],
                                states_info):
                taps = [x['tap'] for x in info]
                membuf = numpy.zeros((nsteps + abs(numpy.min(taps)), 4))
                if abs(numpy.min(taps)) != 1:
                    membuf[:abs(numpy.min(taps))] = \
                        st[:abs(numpy.min(taps))]
                else:
                    membuf[:abs(numpy.min(taps))] = st
                nw_states_inputs += [membuf[abs(numpy.min(taps)) + k:]
                                     for k in taps]
                nw_states_outs.append(membuf[abs(numpy.min(taps)):])

            parameters_vals = args[2 + n_ins + n_states:]
            out_mem_buffers = [numpy.zeros((nsteps, 4))
                               for k in xrange(n_outputs)]
            shared_values = [x.copy() for x in original_shared_values]
            for step in xrange(nsteps):
                arg_pos = 0
                to_add = None
                for in_info in inputs_info:
                    for info in in_info:
                        arg = nw_inputs[arg_pos][step]
                        arg_pos += 1
                        # Construct dummy graph around input
                        if info['use']:
                            if to_add is None:
                                to_add = arg * 2
                            else:
                                to_add = to_add + arg * 2
                arg_pos = 0
                for dx, st_info in enumerate(states_info):
                    if to_add is not None:
                        nw_states_outs[dx][step] = to_add
                    for info in st_info:
                        arg = nw_states_inputs[arg_pos][step]
                        arg_pos += 1
                        if info['use']:
                            nw_states_outs[dx][step] += arg * 3
                for arg, info in zip(parameters_vals, parameters_info):
                    if info['use']:
                        if to_add is None:
                            to_add = arg * 4
                        else:
                            to_add = to_add + arg * 4
                if to_add is not None:
                    shared_values = [sh * 5 + to_add
                                     for sh in shared_values]
                    for state in nw_states_outs:
                        state[step] += to_add
                    for out in out_mem_buffers:
                        out[step] = to_add ** 2
                else:
                    shared_values = [sh * 5 for sh in shared_values]
                    for out in out_mem_buffers:
                        out[step] = 2
            return nw_states_outs + out_mem_buffers, shared_values
        possible_n_steps = [-1, 1, 5, -5]
        if n_ins > 0:
            possible_n_steps.append(None)
        for n_steps in [-1, 1, 5, -5, None]:
            for go_backwards in [True, False]:
                outputs, updates = scan_module.scan(
                    inner_function,
                    sequences=scan_inputs,
                    outputs_info=scan_states,
                    non_sequences=parameters,
                    n_steps=n_steps,
                    go_backwards=go_backwards,
                    truncate_gradient=-1)
                my_f = theano.function(inputs + states + parameters,
                                       outputs,
                                       updates=updates,
                                       allow_input_downcast=True)
                if n_steps is not None and abs(n_steps) == 1:
                    all_nodes = my_f.maker.fgraph.toposort()
                    assert len([x for x in all_nodes
                                if isinstance(x.op, ScanOp)]) == 0

                print('   n_steps', n_steps, file=sys.stderr)
                print('   go_backwards', go_backwards, file=sys.stderr)

                print('       Scenario 1. Correct shape', file=sys.stderr)
                if n_steps is not None:
                    _n_steps = n_steps
                else:
                    _n_steps = 8
                # Generating data
                # Scenario 1: shapes that fit exactly
                input_values = []
                for info in inputs_info:
                    taps = [x['tap'] for x in info]
                    offset = 0
                    if len([x for x in taps if x < 0]) > 0:
                        offset += abs(numpy.min([x for x in taps
                                                 if x < 0]))
                    if len([x for x in taps if x > 0]) > 0:
                        offset += numpy.max([x for x in taps if x > 0])
                    data = rng.uniform(size=(abs(_n_steps) + offset, 4))
                    input_values.append(data)
                state_values = []
                for info in states_info:
                    taps = [x['tap'] for x in info]
                    offset = abs(numpy.min(taps))
                    if offset > 1:
                        data = rng.uniform(size=(offset, 4))
                    else:
                        data = rng.uniform(size=(4,))
                    # The random data is overridden with a deterministic
                    # ramp for easier debugging
                    data = numpy.arange(4)
                    state_values.append(data)
                param_values = [rng.uniform(size=(4,))
                                for k in xrange(n_parameters)]
                param_values = [numpy.arange(4)
                                for k in xrange(n_parameters)]
                for var, val in zip(shared_vars, original_shared_values):
                    var.set_value(val)
                theano_outs = my_f(*(input_values + state_values +
                                     param_values))
                args = ([_n_steps, go_backwards] +
                        input_values +
                        state_values +
                        param_values)
                rvals = execute_inner_graph(*args)
                numpy_outs, numpy_shared = rvals
                assert len(numpy_outs) == len(theano_outs)
                assert len(numpy_shared) == len(shared_vars)
                for th_out, num_out in zip(theano_outs, numpy_outs):
                    assert numpy.allclose(th_out, num_out)
                for th_out, num_out in zip(shared_vars, numpy_shared):
                    assert numpy.allclose(th_out.get_value(), num_out)

                # Scenario 2: loose fit (sequences longer than required)
                print('       Scenario 2. Loose shapes', file=sys.stderr)
                input_values = []
                for pos, info in enumerate(inputs_info):
                    taps = [x['tap'] for x in info]
                    offset = 0
                    if len([x for x in taps if x < 0]) > 0:
                        offset += abs(numpy.min([x for x in taps
                                                 if x < 0]))
                    if len([x for x in taps if x > 0]) > 0:
                        offset += numpy.max([x for x in taps if x > 0])
                    if n_steps is not None:
                        # Loose inputs make sense only when n_steps is
                        # defined
                        data = rng.uniform(
                            size=(abs(_n_steps) + offset + pos + 1, 4))
                    else:
                        data = rng.uniform(
                            size=(abs(_n_steps) + offset, 4))
                    input_values.append(data)
                state_values = []
                for pos, info in enumerate(states_info):
                    taps = [x['tap'] for x in info]
                    offset = abs(numpy.min(taps))
                    if offset > 1:
                        data = rng.uniform(size=(offset + pos + 1, 4))
                    else:
                        data = rng.uniform(size=(4,))
                    state_values.append(data)
                param_values = [rng.uniform(size=(4,))
                                for k in xrange(n_parameters)]
                for var, val in zip(shared_vars, original_shared_values):
                    var.set_value(val)
                theano_outs = my_f(*(input_values + state_values +
                                     param_values))
                args = ([_n_steps, go_backwards] +
                        input_values +
                        state_values +
                        param_values)
                rvals = execute_inner_graph(*args)
                numpy_outs, numpy_shared = rvals
                assert len(numpy_outs) == len(theano_outs)
                assert len(numpy_shared) == len(shared_vars)
                for th_out, num_out in zip(theano_outs, numpy_outs):
                    assert numpy.allclose(th_out, num_out)
                for th_out, num_out in zip(shared_vars, numpy_shared):
                    assert numpy.allclose(th_out.get_value(), num_out)

                # Scenario 3: less data than required
                print('       Scenario 3. Wrong shapes', file=sys.stderr)
                input_values = []
                for pos, info in enumerate(inputs_info):
                    taps = [x['tap'] for x in info]
                    offset = 0
                    if len([x for x in taps if x < 0]) > 0:
                        offset += abs(numpy.min([x for x in taps
                                                 if x < 0]))
                    if len([x for x in taps if x > 0]) > 0:
                        offset += numpy.max([x for x in taps if x > 0])
                    data = rng.uniform(
                        size=(abs(_n_steps) + offset - 1, 4))
                    input_values.append(data)
                state_values = []
                for pos, info in enumerate(states_info):
                    taps = [x['tap'] for x in info]
                    offset = abs(numpy.min(taps))
                    data = rng.uniform(size=(offset - 1, 4))
                    state_values.append(data)
                param_values = [rng.uniform(size=(4,))
                                for k in xrange(n_parameters)]
                for var, val in zip(shared_vars, original_shared_values):
                    var.set_value(val)
                self.assertRaises(Exception,
                                  my_f,
                                  inputs + state_values + param_values)
    def test001_generate_tests(self):
        rng = numpy.random.RandomState(utt.fetch_seed())
        all_inputs_info = [[]]
        possible_taps_use_pairs = [
            [dict(tap=0, use=True)],
            [dict(tap=0, use=False)],
            [dict(tap=-3, use=True), dict(tap=-1, use=True)],
            [dict(tap=-3, use=True), dict(tap=-1, use=False)],
            [dict(tap=-3, use=False), dict(tap=-1, use=False)],
            [dict(tap=-2, use=True), dict(tap=0, use=True)],
            [dict(tap=-2, use=False), dict(tap=0, use=True)],
            [dict(tap=-2, use=False), dict(tap=0, use=False)],
            [dict(tap=0, use=True), dict(tap=3, use=True)],
            [dict(tap=2, use=True), dict(tap=3, use=True)],
            [dict(tap=-2, use=True), dict(tap=3, use=True)]]
        test_nb = 0
        for n_ins in [1, 2]:
            # Randomly pick up 4*n_ins combinations of arguments
            for k in xrange(4 * n_ins):
                inp = []
                for inp_nb in xrange(n_ins):
                    pos = rng.randint(len(possible_taps_use_pairs))
                    inp.append(possible_taps_use_pairs[pos])
                all_inputs_info.append(inp)
        all_states_info = [[]]
        possible_taps_use_pairs = [
            [dict(tap=-1, use=True)],
            [dict(tap=-1, use=False)],
            [dict(tap=-3, use=True)],
            [dict(tap=-3, use=False)],
            [dict(tap=-3, use=True), dict(tap=-1, use=True)],
            [dict(tap=-3, use=True), dict(tap=-1, use=False)],
            [dict(tap=-3, use=False), dict(tap=-1, use=False)],
            [dict(tap=-4, use=True), dict(tap=-2, use=True)],
            [dict(tap=-4, use=False), dict(tap=-2, use=True)]]
        for n_ins in [1, 2]:
            # Randomly pick up 4*n_ins combinations of arguments
            for k in xrange(4 * n_ins):
                state = []
                for state_nb in xrange(n_ins):
                    pos = rng.randint(len(possible_taps_use_pairs))
                    state.append(possible_taps_use_pairs[pos])
                all_states_info.append(state)
        all_parameters_info = [[],
                               [dict(use=False)],
                               [dict(use=True)],
                               [dict(use=True), dict(use=True)],
                               [dict(use=True), dict(use=False)]]
        # This generates errors related to some unfixed bug in the current
        # version of scan. The test will also have to be changed following
        # some further restriction of scan and reduction of the number of
        # corner cases.
        return
        for n_outputs in [0, 1, 2]:
            for n_shared_updates in [0, 1, 2]:
                for n_random_combinations in xrange(1):
                    pos_inp = rng.randint(len(all_inputs_info))
                    pos_st = rng.randint(len(all_states_info))
                    pos_param = rng.randint(len(all_parameters_info))
                    print(file=sys.stderr)
                    print('Test nb', test_nb, file=sys.stderr)
                    print(' inputs', all_inputs_info[pos_inp],
                          file=sys.stderr)
                    print(' states', all_states_info[pos_st],
                          file=sys.stderr)
                    print(' parameters',
                          all_parameters_info[pos_param],
                          file=sys.stderr)
                    print(' n_outputs', n_outputs, file=sys.stderr)
                    print(' n_shared_updates', n_shared_updates,
                          file=sys.stderr)
                    test_nb += 1
                    self.new_run(
                        inputs_info=all_inputs_info[pos_inp],
                        states_info=all_states_info[pos_st],
                        parameters_info=all_parameters_info[pos_param],
                        n_outputs=n_outputs,
                        n_shared_updates=n_shared_updates)
    def test002_generator_one_scalar_output(self):
        # The test fails because the `work-in-progress` ScanOp always runs
        # in place (even when told not to by DebugMode). As this op will
        # change soon, and it is in the sandbox and not for user
        # consumption, the error is marked as KnownFailure.
        raise SkipTest("Work-in-progress sandbox ScanOp is "
                       "not fully functional yet")

        def f_pow2(x_tm1):
            return 2 * x_tm1

        for n_steps in [-1, 1, 5, -5]:
            state = theano.tensor.scalar('state')
            output, updates = scan_module.scan(f_pow2,
                                               [],
                                               state,
                                               [],
                                               n_steps=n_steps,
                                               truncate_gradient=-1,
                                               go_backwards=False)
            my_f = theano.function([state],
                                   output,
                                   updates=updates,
                                   allow_input_downcast=True)
            if abs(n_steps) == 1:
                assert len([x for x in my_f.maker.fgraph.toposort()
                            if isinstance(x.op,
                                          scan_module.scan_op.ScanOp)]) \
                    == 0

            rng = numpy.random.RandomState(utt.fetch_seed())
            state = rng.uniform()
            numpy_values = numpy.array([state * (2 ** (k + 1))
                                        for k in xrange(abs(n_steps))])
            theano_values = my_f(state)
            assert numpy.allclose(numpy_values, theano_values)
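The closed form checked by the test above follows directly from the recurrence: iterating ``x_t = 2 * x_{t-1}`` from ``x_0`` gives ``x_0 * 2**(k+1)`` for the k-th scan output (scan's first output is already one step past the initial state). A pure-Python reference (``pow2_outputs`` is a hypothetical helper, not part of the test):

```python
def pow2_outputs(x0, n_steps):
    """Outputs of iterating x_t = 2 * x_{t-1} for n_steps steps."""
    outs, x = [], x0
    for _ in range(n_steps):
        x = 2 * x
        outs.append(x)
    return outs

assert pow2_outputs(3, 4) == [3 * 2 ** (k + 1) for k in range(4)]
```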
    # Simple RNN; one input, one state, weights for each; input/state
    # are vectors, weights are scalars
    def test003_one_sequence_one_output_and_weights(self):
        # The test fails because the `work-in-progress` ScanOp always runs
        # in place (even when told not to by DebugMode). As this op will
        # change soon, and it is in the sandbox and not for user
        # consumption, the error is marked as KnownFailure.
        raise SkipTest("Work-in-progress sandbox "
                       "ScanOp is not fully functional yet")

        def f_rnn(u_t, x_tm1, W_in, W):
            return u_t * W_in + x_tm1 * W

        u = theano.tensor.vector('u')
        x0 = theano.tensor.scalar('x0')
        W_in = theano.tensor.scalar('win')
        W = theano.tensor.scalar('w')
        n_steps = 5
        output, updates = scan_module.scan(f_rnn,
                                           u,
                                           x0,
                                           [W_in, W],
                                           n_steps=n_steps,
                                           truncate_gradient=-1,
                                           go_backwards=False)
        my_f = theano.function([u, x0, W_in, W],
                               output,
                               updates=updates,
                               allow_input_downcast=True)
        if n_steps is not None and abs(n_steps) == 1:
            assert len([x for x in my_f.maker.fgraph.toposort()
                        if isinstance(x.op,
                                      scan_module.scan_op.ScanOp)]) == 0

        # Get random initial values
        rng = numpy.random.RandomState(utt.fetch_seed())
        v_u = rng.uniform(size=(8,), low=-5., high=5.)
        v_x0 = rng.uniform()
        W = rng.uniform()
        W_in = rng.uniform()

        # Compute the output in numpy
        if n_steps is not None and n_steps < 0:
            _v_u = v_u[::-1]
        else:
            _v_u = v_u
        steps = 8
        if n_steps is not None:
            steps = abs(n_steps)

        v_out = numpy.zeros((8,))
        v_out[0] = _v_u[0] * W_in + v_x0 * W
        for step in xrange(1, steps):
            v_out[step] = _v_u[step] * W_in + v_out[step - 1] * W
        v_out = v_out[:steps]
        theano_values = my_f(v_u, v_x0, W_in, W)
        assert numpy.allclose(theano_values, v_out)
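The NumPy reference loop above implements the simple RNN recurrence ``x_t = u_t * W_in + x_{t-1} * W``; the same computation in plain Python, for a quick sanity check (``rnn_outputs`` is a hypothetical helper name):

```python
def rnn_outputs(u, x0, W_in, W):
    """All states of x_t = u_t * W_in + x_{t-1} * W, starting from x0."""
    outs, x = [], x0
    for u_t in u:
        x = u_t * W_in + x * W
        outs.append(x)
    return outs

# x1 = 1*2 + 0.5*10 = 7;  x2 = 2*2 + 7*10 = 74
assert rnn_outputs([1.0, 2.0], 0.5, 2.0, 10.0) == [7.0, 74.0]
```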
    def test004_multiple_inputs_multiple_outputs(self):
        pass

    def test005_collect_parameters_outer_graph(self):
        pass

    def test006_multiple_taps(self):
        pass

    def test007_updates(self):
        pass
theano/sandbox/scan_module/tests/test_utils.py deleted 100644 → 0 (Browse file @ 7320e1b1)
import six.moves.cPickle as pickle
from six.moves import xrange
import numpy
import unittest

import theano
from theano.compile.pfunc import rebuild_collect_shared
import theano.sandbox.scan_module as scan_module

if theano.config.mode == 'FAST_COMPILE':
    mode_with_opt = theano.compile.mode.get_mode('FAST_RUN')
else:
    mode_with_opt = theano.compile.mode.get_default_mode()
mode_with_gpu = mode_with_opt.including('gpu', 'scan')
# TODO: this should replace the verify_grad in gradient.py
class multiple_outputs_numeric_grad:
    """WRITEME"""
    type_eps = {'float64': 1e-7,
                'float32': 3e-3}

    def __init__(self, f, pt, ndarray_mask=None, eps=None):
        """Return the gradient of f at pt.

        This function computes the gradient by a one-sided finite
        difference of a fixed step size (eps).

        It is assumed that f(...) will return a scalar.

        :param eps: the stepsize for the finite differencing. None means
            input dtype-dependent. See `type_eps`.
        """
        def prod(inputs):
            rval = 1
            for i in inputs:
                rval *= i
            return rval

        packed_pt = False
        if not isinstance(pt, (list, tuple)):
            pt = [pt]
            packed_pt = True
        # This mask tells us if we are dealing with an ndarray input or
        # something else (a random state?) with which we shouldn't really
        # mess up
        if not ndarray_mask:
            ndarray_mask = [True for x in pt]

        dtype_eps = multiple_outputs_numeric_grad.type_eps['float64']

        for i, p in enumerate(pt):
            if ndarray_mask[i]:
                pt[i] = numpy.array(p)
                _eps = multiple_outputs_numeric_grad.type_eps[
                    str(pt[i].dtype)]
                if _eps > dtype_eps:
                    dtype_eps = _eps

        self.ndarray_mask = ndarray_mask
        # Compute clean output:
        f_x = f(*pt)
        gx = []
        # Now iterate over the elements of x and call f on those + delta x
        for i in xrange(len(pt)):
            if ndarray_mask[i]:
                # It is an ndarray that we can tweak
                if eps:
                    _eps = eps
                else:
                    _eps = dtype_eps
                if pt[i].ndim:
                    _g = []
                    # It has several dimensions:
                    for pos in xrange(prod(pt[i].shape)):
                        t = pt[i].copy()
                        t = t.flatten()
                        t[pos] += _eps
                        t = t.reshape(pt[i].shape)
                        f_eps = f(*(pt[:i] + [t] + pt[i + 1:]))
                        _g.append(numpy.asarray((f_eps - f_x) / _eps))
                    gx.append(numpy.asarray(_g).reshape(pt[i].shape))
                else:
                    t = numpy.array(pt[i] + _eps)
                    f_eps = f(*(pt[:i] + [t] + pt[i + 1:]))
                    gx.append(numpy.asarray((f_eps - f_x) / _eps))
        self.gx = gx

    @staticmethod
    def abs_rel_err(a, b, eps=1.0e-10):
        """Return a small number when a and b are close, relative to how
        big they are."""
        return abs(a - b) / (abs(a) + abs(b) + eps)

    def max_err(self, _g_pt):
        """Return the biggest relative error between g_pt and self.gx."""
        g_pt = []
        for i in xrange(len(_g_pt)):
            if self.ndarray_mask[i]:
                g_pt.append(_g_pt[i])
            elif isinstance(_g_pt[i], numpy.ndarray):
                assert numpy.all(_g_pt[i] == 0)
        if len(g_pt) != len(self.gx):
            raise ValueError('argument has wrong number of elements',
                             len(g_pt))
        errs = []
        for i, (a, b) in enumerate(zip(g_pt, self.gx)):
            if a.shape != b.shape:
                raise ValueError('argument element %i has wrong shape %s' %
                                 (i, str((a.shape, b.shape))))
            errs.append(numpy.max(
                multiple_outputs_numeric_grad.abs_rel_err(a, b)))
        if numpy.all(numpy.isfinite(errs)):
            return numpy.max(errs), numpy.argmax(errs)
        else:
            return numpy.inf, 0
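The one-sided finite-difference scheme used by ``multiple_outputs_numeric_grad`` above reduces, in the scalar case, to ``(f(x + eps) - f(x)) / eps``; a minimal sketch (``numeric_grad`` is a hypothetical helper, plain Python):

```python
def numeric_grad(f, x, eps=1e-7):
    """One-sided finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x)) / eps

# The gradient of x**2 at x = 3 is 6; the estimate agrees up to O(eps).
g = numeric_grad(lambda x: x ** 2, 3.0)
assert abs(g - 6.0) < 1e-5
```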
def scan_project_sum(*args, **kwargs):
    rng = theano.tensor.shared_randomstreams.RandomStreams(123)
    scan_outputs, updates = theano.scan(*args, **kwargs)
    if type(scan_outputs) not in [list, tuple]:
        scan_outputs = [scan_outputs]
    # we should ignore the random-state updates so that
    # the uniform numbers are the same every evaluation and on every call
    rng.add_default_updates = False
    factors = [rng.uniform(size=s.shape, low=0.1, high=0.9)
               for s in scan_outputs]
    return (sum([(s * f).sum() for s, f in zip(scan_outputs, factors)]),
            updates)
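The projection trick above reduces every scan output to one scalar whose gradient depends on every output element, which is what gradient checking needs. A numpy-only sketch of the same reduction (the name `project_sum` is illustrative):

```python
import numpy

def project_sum(outputs, seed=123):
    # Weight each output array with fixed random factors in (0.1, 0.9)
    # and sum everything into one scalar, mirroring scan_project_sum;
    # the fixed seed makes the projection deterministic across calls
    rng = numpy.random.RandomState(seed)
    factors = [rng.uniform(low=0.1, high=0.9, size=o.shape)
               for o in outputs]
    return sum((o * f).sum() for o, f in zip(outputs, factors))
```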
def asarrayX(value):
    return theano._asarray(value, dtype=theano.config.floatX)
def clone_optimized_graph(f):
    maker_ins = [x for x in f.maker.fgraph.inputs
                 if not isinstance(x,
                                   theano.tensor.sharedvar.SharedVariable)]
    inps, outs, _ = rebuild_collect_shared(f.maker.fgraph.outputs,
                                           maker_ins,
                                           copy_inputs_over=False)
    ins = [x for x in inps
           if not isinstance(x, theano.tensor.sharedvar.SharedVariable)]
    return (ins, outs)
def grab_scan_node(output):
    if output.owner is None:
        return None
    if output.owner.op.__class__.__name__ == 'Scan':
        return [output.owner]
    rval = []
    for i in output.owner.inputs:
        ri = grab_scan_node(i)
        if ri is not None:
            rval += ri
    if rval == []:
        return None
    else:
        return rval
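`grab_scan_node` is a depth-first collection over the graph's owner/inputs links. The same pattern on a toy DAG, independent of Theano (class and function names here are illustrative):

```python
class Node(object):
    # a toy DAG node standing in for an Apply node's owner/inputs links
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)

def grab_named_nodes(node, target):
    # depth-first collection mirroring grab_scan_node: return a list of
    # all matching nodes, or None when the subtree contains no match
    if node.name == target:
        return [node]
    rval = []
    for i in node.inputs:
        ri = grab_named_nodes(i, target)
        if ri is not None:
            rval += ri
    return rval or None
```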
class TestScanUtils(unittest.TestCase):

    def test001_cloning_no_replace_strict_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.vector('y')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace=None,
                                          strict=True,
                                          share_inputs=True)
        f2_inp = theano.gof.graph.inputs([f2])

        assert z in f2_inp
        assert x in f2_inp
        assert y in f2_inp

    def test002_cloning_no_replace_strict_not_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.vector('y')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace=None,
                                          strict=True,
                                          share_inputs=False)
        f2_inp = theano.gof.graph.inputs([f2])

        assert not z in f2_inp
        assert not x in f2_inp
        assert not y in f2_inp

    def test003_cloning_replace_strict_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.vector('y')
        y2 = theano.tensor.vector('y2')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace={y: y2},
                                          strict=True,
                                          share_inputs=True)
        f2_inp = theano.gof.graph.inputs([f2])

        assert z in f2_inp
        assert x in f2_inp
        assert y2 in f2_inp

    def test004_cloning_replace_not_strict_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.fvector('y')
        y2 = theano.tensor.dvector('y2')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace={y: y2},
                                          strict=False,
                                          share_inputs=True)
        f2_inp = theano.gof.graph.inputs([f2])

        assert z in f2_inp
        assert x in f2_inp
        assert y2 in f2_inp

    def test005_cloning_replace_strict_not_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.vector('y')
        y2 = theano.tensor.vector('y2')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace={y: y2},
                                          strict=True,
                                          share_inputs=False)
        f2_inp = theano.gof.graph.inputs([f2])

        assert not z in f2_inp
        assert not x in f2_inp
        assert not y2 in f2_inp

    def test006_cloning_replace_not_strict_not_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.fvector('y')
        y2 = theano.tensor.dvector('y2')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace={y: y2},
                                          strict=False,
                                          share_inputs=False)
        f2_inp = theano.gof.graph.inputs([f2])

        assert not z in f2_inp
        assert not x in f2_inp
        assert not y2 in f2_inp
theano/sandbox/symbolic_module.py · deleted 100644 → 0 · Browse file @ 7320e1b1
from __future__ import print_function
import copy, inspect

import theano
import theano.tensor as T
from six import string_types, add_metaclass, iteritems
from six.moves import xrange

#import klass


def symbolic(f):
    f.__is_symbolic = True
    return f


class InitGraph(type):
    def __init__(cls, name, bases, dct):
        # print 'INITIALIZING', name
        super(InitGraph, cls).__init__(name, bases, dct)

        def just_symbolic(dct):
            def filter(k, v):
                return True
                if getattr(v, '__is_symbolic', False):
                    return True
                if issubclass(v, SymbolicModule):
                    return True
                return (isinstance(v, theano.Variable) and
                        not k.startswith('_'))
            r = {}
            for key, val in iteritems(dct):
                if filter(key, val):
                    r[key] = val
            return r

        build_graph_rval = cls.build_graph()
        if not isinstance(build_graph_rval, dict):
            raise TypeError('%s.build_graph did not return dictionary' % cls)
        dct = just_symbolic(build_graph_rval)

        for key, val in iteritems(dct):
            # print '  adding class attribute', key
            if isinstance(val, theano.Variable) and val.name is None:
                val.name = key
            if callable(val):
                setattr(cls, key, staticmethod(val))
            else:
                setattr(cls, key, val)


# installs class attributes from build_graph after declaration
@add_metaclass(InitGraph)
class SymbolicModule(object):

    # if we call this function, it will return a new SymbolicModule
    def __new__(self, **kwargs):
        class SymMod(SymbolicModule):
            @staticmethod
            def build_graph(*bg_args, **bg_kwargs):
                # this one is like self.build_graph,
                # except that the kwargs are automatically inserted
                kwcopy = copy.copy(kwargs)
                kwcopy.update(bg_kwargs)
                return self.build_graph(*bg_args, **kwcopy)
        setattr(SymMod, '__name__', self.__name__ + '_derived')
        return SymMod

    @staticmethod
    def build_graph():
        return {}


def issymbolicmodule(thing):
    try:
        return issubclass(thing, SymbolicModule)
    except Exception:
        return False


def issymbolicmethod(thing):
    return getattr(thing, '__symbolic_method', False)


def symbolic_module(f):
    class SymMod(SymbolicModule):
        build_graph = staticmethod(f)
    return SymMod


def symbolicmethod(f):
    f.__symbolic_method = True
    return f


class CompiledModule(object):
    pass


def compile_fn(f, path_locals, common_inputs):
    (args, vararg, kwarg, default) = inspect.getargspec(f)
    if default:
        # this can be handled correctly, in that default arguments trump
        # path_locals
        raise NotImplementedError()

    # make new inputs for the vars named in args
    # this has the effect of creating new storage for these arguments
    # The common storage doesn't get messed with.
    inputs = [In(path_locals.get(name, name)) for name in args]
    inputs.extend([v for k, v in common_inputs.items() if k not in args])
    outputs = f()
    # print 'inputs', inputs
    # print 'outputs', outputs
    compiled_f = theano.function(inputs, outputs)
    updated = []
    return compiled_f, updated


def compile(smod, initial_values=None):
    """
    :type values: dictionary Variable -> value
    """
    if initial_values is None:
        initial_values = {}

    def sym_items(mod):
        for k in mod.__dict__:
            if k in ['__module__', 'build_graph', '__doc__']:
                pass
            else:
                yield k, getattr(mod, k)

    def walker(root):
        def modwalker(path_locals, values):
            for val in values:
                yield path_locals, val
                if isinstance(val, list):
                    for s in modwalker(path_locals, val):
                        yield s
                elif isinstance(val, dict):
                    for s in modwalker(path_locals, val.values()):
                        yield s
                elif issymbolicmodule(val):
                    for s in modwalker(val.__dict__,
                                       [v for k, v in sym_items(val)]):
                        yield s
                elif isinstance(val, (string_types, int, float)):
                    pass
                elif isinstance(val, theano.Variable):
                    pass
                elif issymbolicmethod(val):
                    pass
                else:
                    # check for weird objects that we would like to disallow
                    # not all objects can be transfered by the clone
                    # mechanism below
                    raise TypeError((val, type(val),
                                     getattr(val, '__name__')))
        for blah in modwalker(root.__dict__,
                              [v for k, v in sym_items(root)]):
            yield blah

    # Locate all the starting nodes, and create containers entries for
    # their values
    inputs = {}
    for path_locals, val in walker(smod):
        if (isinstance(val, theano.Variable) and (val.owner is None)
                and (val not in inputs)):
            inputs[val] = theano.In(val,
                                    value=theano.gof.Container(val, ['a']))
    assert len(inputs) == len([v for v in inputs.items()])

    # Locate all the functions to compile, and compile them
    compiled_functions = {}
    for path_locals, val in walker(smod):
        if issymbolicmethod(val):
            f, update_expressions = compile_fn(val, path_locals, inputs)
            compiled_functions[val] = f

    # Now replicate the nested structure of the SymbolicModule smod
    # with CompiledModules instead
    reflected = {}

    def reflect(thing):
        # UNHASHABLE TYPES
        if isinstance(thing, list):
            return [reflect(e) for e in thing]
        if isinstance(thing, dict):
            raise NotImplementedError()
        # HASHABLE TYPES
        if thing not in reflected:
            if issymbolicmodule(thing):
                class CMod(CompiledModule):
                    pass
                setattr(CMod, '__name__', thing.__name__ + '_compiled')
                # TODO: consider an instance of the class, or the class
                # itself?  which is easier for copying?
                cmod = CMod()
                reflected[thing] = cmod
                for key, val in sym_items(thing):
                    setattr(CMod, key, reflect(val))
            elif isinstance(thing, (string_types, int, float)):
                reflected[thing] = thing
            elif isinstance(thing, theano.Variable):
                if thing.owner is None:
                    def getter(s):
                        return inputs[thing].value.value

                    def setter(s, v):
                        inputs[thing].value.storage[0] = v
                    p = property(getter, setter)
                    print(p)
                    reflected[thing] = p
                else:
                    reflected[thing] = None
                    # TODO: how to reflect derived resuls?
            elif issymbolicmethod(thing):
                reflected[thing] = compiled_functions[thing]
            else:
                # check for weird objects that we would like to disallow
                # not all objects can be transfered by the clone mechanism
                # below
                raise TypeError('reflecting not supported for',
                                (thing, type(thing),
                                 getattr(thing, '__name__', None)))
        return reflected[thing]

    rval = reflect(smod)
    rval.__inputs = inputs
    rval.__compiled_functions = compiled_functions
    return rval


@symbolic_module
def LR(x=None, y=None, v=None, c=None, l2_coef=None):
    # our points, one point per row
    if x is None:
        x = T.dmatrix()
    # targets, one per row
    if y is None:
        y = T.dmatrix()
    # first layer weights
    if v is None:
        v = T.dmatrix()
    # first layer biases
    if c is None:
        c = T.dvector()
    if l2_coef is None:
        l2_coef = T.dscalar()

    pred = T.dot(x, v) + c
    sse = T.sum((pred - y) * (pred - y))
    mse = sse / T.shape(y)[0]
    v_l2 = T.sum(T.sum(v * v))
    loss = mse + l2_coef * v_l2

    @symbolicmethod
    def params():
        return [v, c]

    return locals()


@symbolic_module
def Layer(x=None, w=None, b=None):
    # our points, one point per row
    if x is None:
        x = T.dmatrix()
    # first layer weights
    if w is None:
        w = T.dmatrix()
    # first layer bias
    if b is None:
        b = T.dvector()

    y = T.tanh(T.dot(x, w) + b)

    @symbolicmethod
    def params():
        return [w, b]

    return locals()


@symbolic_module
def NNet(x=None, y=None, n_hid_layers=2):
    # our points, one point per row
    if x is None:
        x = T.dmatrix()
    # targets, one per row
    if y is None:
        y = T.dmatrix()

    layers = []
    _x = x
    for i in xrange(n_hid_layers):
        layers.append(Layer(x=_x))
        _x = layers[-1].y
    classif = LR(x=_x)

    @symbolicmethod
    def params():
        rval = classif.params()
        for l in layers:
            rval.extend(l.params())
        print([id(r) for r in rval])
        return rval

    if 0:
        @symbolicmethod
        def update(x, y):
            pp = params()
            gp = T.grad(classif.loss, pp)
            return dict((p, p - 0.01 * g) for p, g in zip(pp, gp))

    return locals()


nnet = compile(NNet)
print(nnet)
print(nnet.params())
print(nnet.params.__dict__['finder'][NNet.layers[0].w])
nnet.params[NNet.layers[0].w] = [[6]]
print(nnet.params())
print(nnet.params())

if 0:
    def deco(f):
        class SymMod(SymbolicModule):
            def __call__(self, *args, **kwargs):
                # return another SymbolicModule built like self
                def dummy(*dargs, **dkwargs):
                    print('args', args, dargs)
                    print('kwargs', kwargs, dkwargs)
                    return f(*args, **kwargs)
                return deco(dummy)
        locals_dict = f()
        for key, val in iteritems(locals_dict):
            if isinstance(val, theano.Variable):
                try:
                    kres = klass.KlassMember(val)
                except Exception:
                    kres = klass.KlassVariable(val)
                setattr(SymMod, key, kres)
            elif callable(val) and getattr(val, '__is_symbolic'):
                setattr(SymMod, key, val)
        return SymMod()

    @deco
    def logistic_regression(x=T.dmatrix(),  # our points, one point per row
                            y=T.dmatrix(),  # our targets
                            v=T.dmatrix(),  # first layer weights
                            c=T.dvector(),  # first layer bias
                            l2_coef=T.dscalar()):
        pred = T.dot(x, v) + c
        sse = T.sum((pred - y) * (pred - y))
        v_l2 = T.sum(T.sum(v * v))
        loss = sse + l2_coef * v_l2

        @symbolic
        def params():
            return [v, c]
        return just_symbolic(locals())

    @deco
    def tanh_layer(top_part=None,
                   x=T.dmatrix(),  # our points, one point per row
                   w=T.dmatrix(),  # first layer weights
                   b=T.dvector(),  # first layer bias
                   **kwargs  # other things from logistic_regression
                   ):
        hid = T.tanh(T.dot(x, w) + b)
        if top_part:
            print('top_part', top_part, 'kwargs', kwargs)
            top = top_part(x=hid, **kwargs)

            # SymbolicModule
            def params():
                return top.params() + [w, b]
        else:
            def params():
                return [w, b]
        return just_symbolic(locals())

if 0:
    print('logistic_regression', logistic_regression)
    print('tanh_layer', tanh_layer)
    print('nnet1', nnet1)
    nnet1 = tanh_layer(logistic_regression)
    nnet2 = tanh_layer(nnet1)
    print('nnet2', nnet2)

if 0:
    class SymbolicModule(object):
        name = "__no_name__"  # name of this module
        variable_table = {}  # map strings (names) to Variables
        method_table = {}  # map strings to compilable functions
        include_list = []
        constructor_fn = None

        def build(self):
            """Run the body of the included modules in order, using the
            current variables and imports
            """

        def include(self, symbolic_module, name=None):
            """This redefines the symbols in the kwargs
            """
            if name is None:
                name = symbolic_module.name

        def __init__(self, constructor_fn=None):
            """ A constructor fn builds
            - a graph on top of the variable table, and
            - compilable methods.
            """

    @SymbolicModule_fromFn
    def neural_net(x=T.dmatrix(),  # our points, one point per row
                   y=T.dmatrix(),  # our targets
                   w=T.dmatrix(),  # first layer weights
                   b=T.dvector(),  # first layer bias
                   v=T.dmatrix(),  # second layer weights
                   c=T.dvector(),  # second layer bias
                   step=T.dscalar(),  # step size for gradient descent
                   l2_coef=T.dscalar()  # l2 regularization amount
                   ):
        """Idea A:
        """
        hid = T.tanh(T.dot(x, w) + b)
        pred = T.dot(hid, v) + c
        sse = T.sum((pred - y) * (pred - y))
        w_l2 = T.sum(T.sum(w * w))
        v_l2 = T.sum(T.sum(v * v))
        loss = sse + l2_coef * (w_l2 + v_l2)

        def symbolic_params(cls):
            return [cls.w, cls.b, cls.v, cls.c]

        def update(cls, x, y, **kwargs):
            params = cls.symbolic_params()
            gp = T.grad(cls.loss, params)
            return [], [In(p, update=p - cls.step * g)
                        for p, g in zip(params, gp)]

        def predict(cls, x, **kwargs):
            return cls.pred, []

        return locals()

    # at this point there is a neural_net module all built and compiled,
    # there is also a neural_net.symbolic_module which can be imported.

    @SymbolicModule_fromFn
    def PCA(x=T.dmatrix(), var_thresh=T.dscalar()):
        # naive version, yes
        s, v, d = T.svd(x)
        acc = T.accumulate(v)
        npc = T.lsearch(acc, var_thresh * T.sum(v))
        y = s[:, :npc]
        # transform will map future points x into the principle
        # components space
        transform = d[:npc, :].T / v[:npc]
        return locals()

    # at this point there is a neural_net module all built and compiled,
    # there is also a neural_net.symbolic_module which can be imported.

    # running this means:
    nnet_on_pca = neural_net(x=PCA.y, submodules=[PCA])

    # nnet_on_pca = SymbolicModule()
    # nnet_on_pca.include(PCA)  # an already-instantiated Module
    # nnet_on_pca.x = nnet_on_pca.PCA.y  # configure this Module
    # nnet_on_pca.build(neural_net)  # instantiate this module

    nnet_on_pca = neural_net(substitute=dict(x=PCA.x),
                             submodules=[PCA],
                             add_symbols=dict(x=PCA.x))

    nnet = logistic_regression(
        redefine={'x': (LogisticLayer.x, LogisticLayer.y)},
        submodule={'hid': LogisticLayer},
        add_symbols={'x': LogisticLayer.x})

    def stats_collector(r, stat_name):
        """stats_collector(nnet_on_pca.x, 'mean')
        """
        return mean_collector(x=r)
theano/sandbox/tests/test_scan.py · deleted 100644 → 0 · Browse file @ 7320e1b1
import theano
import numpy
from theano.sandbox import scan


def test_001():
    x0 = theano.tensor.fvector('x0')
    state = theano.tensor.unbroadcast(
        theano.tensor.shape_padleft(x0), 0)
    out, _ = scan.scan(lambda x: x + numpy.float32(1),
                       states=state,
                       n_steps=5)
    fn = theano.function([x0], out[0])
    val_x0 = numpy.float32([1, 2, 3])
    assert numpy.all(fn(val_x0) == val_x0 + 5)


def test_002():
    x0 = theano.tensor.fvector('x0')
    state = theano.tensor.alloc(
        theano.tensor.constant(numpy.float32(0)),
        6,
        x0.shape[0])
    state = theano.tensor.set_subtensor(state[0], x0)
    out, _ = scan.scan(lambda x: x + numpy.float32(1),
                       states=state,
                       n_steps=5)
    fn = theano.function([x0], out)
    val_x0 = numpy.float32([1, 2, 3])
    assert numpy.all(fn(val_x0)[-1] == val_x0 + 5)
    assert numpy.all(fn(val_x0)[0] == val_x0)


def test_003():
    x0 = theano.tensor.fvector('x0')
    sq = theano.tensor.fvector('sq')
    state = theano.tensor.alloc(
        theano.tensor.constant(numpy.float32(0)),
        6,
        x0.shape[0])
    state = theano.tensor.set_subtensor(state[0], x0)
    out, _ = scan.scan(lambda s, x: x + s,
                       sequences=sq,
                       states=state,
                       n_steps=5)
    fn = theano.function([sq, x0], out)
    val_x0 = numpy.float32([1, 2, 3])
    val_sq = numpy.float32([1, 2, 3, 4, 5])
    assert numpy.all(fn(val_sq, val_x0)[-1] == val_x0 + 15)
    assert numpy.all(fn(val_sq, val_x0)[0] == val_x0)


def test_004():
    sq = theano.tensor.fvector('sq')
    nst = theano.tensor.iscalar('nst')
    out, _ = scan.scan(lambda s: s + numpy.float32(1),
                       sequences=sq,
                       states=[],
                       n_steps=nst)
    fn = theano.function([sq, nst], out)
    val_sq = numpy.float32([1, 2, 3, 4, 5])
    assert numpy.all(fn(val_sq, 5) == val_sq + 1)


def test_005():
    sq = theano.tensor.fvector('sq')
    nst = theano.tensor.iscalar('nst')
    out, _ = scan.scan(lambda s: s + numpy.float32(1),
                       sequences=sq,
                       states=[None],
                       n_steps=nst)
    fn = theano.function([sq, nst], out)
    val_sq = numpy.float32([1, 2, 3, 4, 5])
    assert numpy.all(fn(val_sq, 5) == val_sq + 1)


if __name__ == '__main__':
    test_001()
    test_002()
    test_003()
    test_004()
    test_005()
theano/sandbox/tests/test_theano_object.py · deleted 100644 → 0 · Browse file @ 7320e1b1
from __future__ import print_function
from theano.sandbox.theano_object import *

RUN_TESTS = False


def run(TF):
    def deco(f):
        if TF and RUN_TESTS:
            print('running test', f.__name__)
            f()
        if RUN_TESTS:
            return f
        else:
            return None
    return deco


class MyModule(TheanoObject):
    def __init__(self, a=3, b=9):
        super(MyModule, self).__init__()
        self.a = self.symbolic_member(2)
        self.b = self.symbolic_member(3)
        self.c = 100  # a constant
        self.d = [self.symbolic_member(5), self.symbolic_member(6)]
        self.e = ['a', self.symbolic_member(6)]

    @symbolic_fn
    def add(self, x):
        return RVal(self.a + self.b + x)

    @symbolic_fn_opts(mode='FAST_COMPILE')
    def sub(self, x):
        outputs = (self.a - x, self.b - x)
        updates = {self.b: self.b - x}
        return RVal(outputs, updates)

    def normal_function(self, x):
        return self.add(x) + self.sub(x)  # use numpy addition

    @symbolic_fn
    def use_submodule(self, x):
        return RVal(self.a + x + self.submodule.b)


@run(True)
def test_outputs():
    MM = MyModule(3, 4)
    assert MM.add(5) == 12
    assert MM.b.get() == 4
    MM.sub(3)
    assert MM.b.get() == 1  # test get()
    # test that b's container is shared between add and sub
    assert MM.add(5) == 9

    MM.b.set(2)  # test set
    assert MM.b.get() == 2  # test get()
    # test that b's container is shared between add and sub
    assert MM.add(5) == 10


@run(True)
def test_submodule():
    MM = MyModule(1, 2)
    MM.submodule = MyModule(3, 4)
    assert MM.add(5) == 8
    MM.submodule.sub(7)
    assert MM.submodule.b.get() == -3
    # self.a is 1 + self.submodule.b is -3
    assert MM.use_submodule(0) == -2


@run(False)
def test_misc_prints():
    MM = MyModule()
    print(MM)
    print('add', MM.add(4))
    print('b', MM.value(MM.b))
    print('sub', MM.sub(45))
    print('b', MM.value(MM.b))
    print(MM.sub(23))
    print(MM.add(9))
    print(MM.add(19))
    print('b', MM.value(MM.b))
    print('a', MM.value(MM.a))
    MM.value_set(MM.a, 6)
    MM.value_set(MM.b, 6)
    print(MM.add(6))

    try:
        MM.b = 5
    except Exception as e:
        print(e)

    MM.del_member(MM.b)
    try:
        print('b', MM.value(MM.b))
    except Exception as e:
        print(e)

    MM.b = 'asdffd'
    try:
        print('b', MM.value(MM.b))
    except Exception as e:
        print(e)

    try:
        print('b', MM.value(MM.b))
    except Exception as e:
        print('E', e)

    print(MM.b)
    print('a', MM.value(MM.a))
theano/sandbox/theano_object.py · deleted 100644 → 0 · Browse file @ 7320e1b1
"""
DRAFT: TheanoObject
N.B. the gotcha with this design is listed in the documentation of
`TheanoObject`.
"""
from
__future__
import
print_function
import
theano
from
theano
import
tensor
import
numpy
def
theano_type
(
x
):
"""
Return a theano Type instance suitable for containing value `x`.
"""
if
type
(
x
)
is
int
:
return
tensor
.
lscalar
else
:
raise
NotImplementedError
()
class
symbolic_fn_callable
(
object
):
"""
This is the class whose instance you get when you access a symbolic function
in a `TheanoObject`.
When you call a symbolic function (`symbolic_fn`) of a TheanoObject,
the `__call__` of this class handles your request.
You can also access the symbolic outputs and updates of a symbolic function
through this class.
Examples
--------
class T(TheanoObject):
@symbolic_fn
def add(self, x):
...
add_outputs = ...
add_updates = ...
return RVal(add_outputs, add_updates)
t = T()
t.add.outputs(5) # returns `add_outputs` from when `x=theano_type(5)`
t.add.updates(5) # returns `add_updates` from when `x=theano_type(5)`
t.add.theano_function(5) # returns the `Function` compiled when
# `x=theano_type(5)`
t.add(5) # runs the `Function` compiled when `x=theano_type(5)`
# with arguments `(5,)`
"""
def
__init__
(
self
,
fn
,
mode
):
self
.
fn
=
fn
self
.
mode
=
mode
def
on
(
self
,
o_self
):
"""
Silly method to work with symbolic_fn.__get__.
"""
self
.
o_self
=
o_self
return
self
def
run_symbolic
(
self
,
*
args
,
**
kwargs
):
return
self
.
o_self
.
_get_method_impl
(
self
.
fn
,
self
.
o_self
,
args
,
kwargs
,
mode
=
self
.
mode
)
def
__call__
(
self
,
*
args
,
**
kwargs
):
return
self
.
run_symbolic
(
*
args
,
**
kwargs
)[
'theano_function'
](
*
args
,
**
kwargs
)
def
theano_function
(
self
,
*
args
,
**
kwargs
):
return
self
.
run_symbolic
(
*
args
,
**
kwargs
)[
'theano_function'
]
def
outputs
(
self
,
*
args
,
**
kwargs
):
return
self
.
run_symbolic
(
*
args
,
**
kwargs
)[
'outputs'
]
def
updates
(
self
,
*
args
,
**
kwargs
):
return
self
.
run_symbolic
(
*
args
,
**
kwargs
)[
'updates'
]
class
symbolic_fn
(
object
):
"""
A property-like class for decorating symbolic functions in `TheanoObject`.
"""
def
__init__
(
self
,
fn
,
mode
=
None
):
self
.
fn
=
fn
self
.
callable
=
symbolic_fn_callable
(
fn
,
mode
)
def
__get__
(
self
,
o_self
,
o_cls
):
return
self
.
callable
.
on
(
o_self
)
def
__set__
(
self
,
o_self
,
new_val
):
pass
# return NotImplemented
def
symbolic_fn_opts
(
**
kwargs
):
"""
Return a decorator for symbolic_functions in a `TheanoObject`.
`kwargs` passed here are passed to `theano.function` via `symbolic_fn`.
"""
def
deco
(
f
):
return
symbolic_fn
(
f
,
**
kwargs
)
return
deco
class
RVal
(
object
):
"""
A Return-Value object for a `symbolic_fn`.
"""
outputs
=
[]
"""
The method will compute values for the variables in this list.
"""
updates
=
{}
"""The method will update module variables in this dictionary.
For items ``(k,v)`` in this dictionary, ``k`` must be a `symbolic_member`
of some module.
On each call to this compiled function, the value of ``k`` will be replaced
with the computed value of the Variable ``v``.
"""
def
__init__
(
self
,
outputs
,
updates
=
None
):
if
updates
is
None
:
updates
=
{}
self
.
outputs
=
outputs
assert
type
(
updates
)
is
dict
self
.
updates
=
updates
class
TheanoObject
(
object
):
"""
Base for Theano-supported classes.
This class provides support for symbolic_fn class attributes.
These will be compiled on demand so that they can be used just like normal
(non-symbolic) methods.
The symbolic functions in a TheanoObject can share member variables that
have been created using the `symbolic_member` method.
Notes
-----
Other variables (ones not created using ``self.symbolic_member``) referred
to in the body of a symbolic function will *not* be shared between symbolic
functions, or between symbolic functions and this class. These other
variables will be locked away in the closure of a symbolic function when
that function is compiled.
.. warning:: It is not recommended for code to interleave
(a) changes to non-symbolic instance variables with
(b) calls to symbolic functions that use those instance variables.
A symbolic function may be compiled multiple times because it must be
compiled for each set of argument types.
Each time the function is compiled, the values of non-symbolic variables
will be locked into the compiled function. Subsequent changes to those
non-symbolic instance variables will not have any effect on the behaviour
of the already-compiled symbolic function.
:todo: Is there an efficient way of recognizing when a compiled symbolic
function is stale, wrt the current values of the class's instance variables?
- One option is to re-evaluate symbolic functions symbolically and see if
the graph can be completely merged with the original graph. This is not
fast enough to do all the time by default though.
"""
def
__init__
(
self
):
self
.
module_method_cache
=
{}
def
_get_method_impl
(
self
,
fn
,
o_self
,
args
,
kwargs
,
mode
):
"""
Retrieve information about the symbolic function (`fn`) in TheanoObject
instance `o_self`, being evaluated on arguments `args` and `kwargs`.
Returns
-------
dict with entries 'theano_function', 'outputs', 'updates'
The theano function compiled for these arguments, the symbolic
outputs of that function, and the symbolic updates performed by
that function.
Notes
-----
This function caches return values in self.`module_method_cache`.
:todo: This may at some point become a class-level cache rather than an
instance-level cache.
"""
if
kwargs
:
raise
NotImplementedError
()
cache
=
self
.
module_method_cache
args_types
=
tuple
(
theano_type
(
arg
)
for
arg
in
args
)
key
=
(
fn
,
args_types
)
if
key
not
in
cache
:
inputs
=
[
a
()
for
a
in
args_types
]
print
(
'compiling'
,
fn
,
'for inputs'
,
inputs
)
rval
=
fn
(
o_self
,
*
inputs
)
print
(
'compiling to compute outputs'
,
rval
.
outputs
)
if
isinstance
(
rval
.
outputs
,
(
tuple
,
list
)):
all_required_inputs
=
theano
.
gof
.
graph
.
inputs
(
rval
.
outputs
)
else
:
all_required_inputs
=
theano
.
gof
.
graph
.
inputs
([
rval
.
outputs
])
# construct In instances for the symbolic_member instances that can automatically be
# included here.
module_inputs
=
[
theano
.
compile
.
io
.
In
(
variable
=
v
,
value
=
v
.
_theanoclass_container
,
mutable
=
(
v
in
rval
.
updates
),
update
=
rval
.
updates
.
get
(
v
,
None
))
for
v
in
all_required_inputs
\
if
hasattr
(
v
,
'_theanoclass_container'
)
and
not
(
v
in
inputs
)]
cache
[
key
]
=
dict
(
theano_function
=
theano
.
function
(
inputs
+
module_inputs
,
rval
.
outputs
),
updates
=
rval
.
updates
,
outputs
=
rval
.
outputs
,
mode
=
mode
)
return
cache
[
key
]
def
symbolic_member
(
self
,
ival
,
name
=
None
):
"""
Create a Variable instance to hold value `ival`.
This function also immediately creates a Container object for ival.
When the returned Variable is used as input to a `TheanoObject`
`symbolic_fn`, (but does not appear as an argument to that symbolic_fn),
then this Container will be used to retrieve (and store) values for the
Variable.
This Variable's Container's contents can be retrieved by its `get()`
method.
This Variable's Container's contents can be written using its
`set(newval)` method.
"""
if
type
(
ival
)
is
not
int
:
raise
NotImplementedError
()
v
=
tensor
.
lscalar
(
name
)
v
.
_theanoclass_container
=
\
theano
.
gof
.
Container
(
v
,
storage
=
[
theano
.
_asarray
(
ival
,
dtype
=
'int64'
)],
readonly
=
False
)
assert
not
hasattr
(
v
,
'set'
)
assert
not
hasattr
(
v
,
'get'
)
v
.
get
=
lambda
:
v
.
_theanoclass_container
.
data
def
setval_in_v
(
newval
):
v
.
_theanoclass_container
.
data
=
newval
v
.
set
=
setval_in_v
return
v
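The cache in `_get_method_impl` is keyed on `(fn, args_types)`, so each symbolic function is compiled once per argument-type signature and reused afterwards. That keying pattern can be sketched without Theano (all names here are illustrative):

```python
def compile_per_type(build):
    # Cache one "compiled" callable per tuple of argument types,
    # mirroring module_method_cache keyed on (fn, args_types)
    cache = {}

    def call(*args):
        key = tuple(type(a) for a in args)
        if key not in cache:
            cache[key] = build(key)  # "compile" once per type signature
        return cache[key](*args)
    return call
```

Calling the wrapped function twice with the same argument types reuses the cached callable; a new type signature triggers a fresh build.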