testgroup / pytensor / Commits / 44f47751

Commit 44f47751, authored Jan 11, 2012 by nouiz

Merge pull request #317 from pascanur/new_scan

New scan [work in progress]

Parents: 86aae00b, 6b72779b

Showing 9 changed files with 2316 additions and 1 deletion (+2316, -1)
* doc/developer/index.txt (+1, -0)
* doc/developer/scan.txt (+153, -0)
* theano/sandbox/scan_module/__init__.py (+41, -0)
* theano/sandbox/scan_module/scan.py (+574, -0)
* theano/sandbox/scan_module/scan_op.py (+390, -0)
* theano/sandbox/scan_module/scan_utils.py (+414, -0)
* theano/sandbox/scan_module/tests/test_scan.py (+460, -0)
* theano/sandbox/scan_module/tests/test_utils.py (+281, -0)
* theano/tensor/basic.py (+2, -1)
doc/developer/index.txt

@@ -10,3 +10,4 @@ Theano Design and Implementation Documentation
   :maxdepth: 2

   tensor
   scan
doc/developer/scan.txt (new file, mode 100644)
.. _scan_internals:
Internal documentation of the scan op
=====================================
Top-level description of scan
-----------------------------
The `scan` operation is meant to describe loops, recurrence relations or
dynamical systems symbolically. In general, we will say that the
scan op implements a system of equations of the following form:
.. math::

    \mathbf{x}_1(t) = f_{\mathbf{x}_1}(
        \mathbf{u}_1(t), \mathbf{u}_1(t-1), \ldots, \mathbf{u}_1(t-l_1),
        \mathbf{u}_2(t), \ldots, \mathbf{u}_2(t-l_2),
        \ldots,
        \mathbf{u}_M(t), \ldots, \mathbf{u}_M(t-l_M),
        \mathbf{x}_1(t-1), \ldots, \mathbf{x}_1(t-k_1),
        \ldots,
        \mathbf{x}_N(t-1), \ldots, \mathbf{x}_N(t-k_N),
        \mathbf{w}_1, \ldots, \mathbf{w}_Q)

    \vdots

    \mathbf{x}_N(t) = f_{\mathbf{x}_N}(
        \mathbf{u}_1(t), \mathbf{u}_1(t-1), \ldots, \mathbf{u}_1(t-l_1),
        \mathbf{u}_2(t), \ldots, \mathbf{u}_2(t-l_2),
        \ldots,
        \mathbf{u}_M(t), \ldots, \mathbf{u}_M(t-l_M),
        \mathbf{x}_1(t-1), \ldots, \mathbf{x}_1(t-k_1),
        \ldots,
        \mathbf{x}_N(t-1), \ldots, \mathbf{x}_N(t-k_N),
        \mathbf{w}_1, \ldots, \mathbf{w}_Q)

    \mathbf{y}_1(t) = f_{\mathbf{y}_1}(
        \mathbf{u}_1(t), \mathbf{u}_1(t-1), \ldots, \mathbf{u}_1(t-l_1),
        \mathbf{u}_2(t), \ldots, \mathbf{u}_2(t-l_2),
        \ldots,
        \mathbf{u}_M(t), \ldots, \mathbf{u}_M(t-l_M),
        \mathbf{x}_1(t-1), \ldots, \mathbf{x}_1(t-k_1),
        \ldots,
        \mathbf{x}_N(t-1), \ldots, \mathbf{x}_N(t-k_N),
        \mathbf{w}_1, \ldots, \mathbf{w}_Q)

    \vdots

    \mathbf{y}_M(t) = f_{\mathbf{y}_M}(
        \mathbf{u}_1(t), \mathbf{u}_1(t-1), \ldots, \mathbf{u}_1(t-l_1),
        \mathbf{u}_2(t), \ldots, \mathbf{u}_2(t-l_2),
        \ldots,
        \mathbf{u}_M(t), \ldots, \mathbf{u}_M(t-l_M),
        \mathbf{x}_1(t-1), \ldots, \mathbf{x}_1(t-k_1),
        \ldots,
        \mathbf{x}_N(t-1), \ldots, \mathbf{x}_N(t-k_N),
        \mathbf{w}_1, \ldots, \mathbf{w}_Q)
The equations describe a system evolving in time, where :math:`t` represents the
current step. The system is described by inputs, states, outputs and
parameters.

The inputs, denoted by :math:`\mathbf{u}`, are time-varying quantities,
hence indexed by :math:`t`. They influence the system but are not
influenced by it.

The states :math:`\mathbf{x}` are time-varying quantities whose values at
time :math:`t` depend on their own (or other states') previous values, as
well as on the inputs and parameters. Note that the first few values of the
states are always provided; otherwise we could not employ the recurrent
equation to generate the sequence of values, having no starting point.

The outputs :math:`\mathbf{y}` are values that depend on the previous
values of the states and inputs. The difference between outputs and states
is that outputs do not feed back into the system.

The parameters :math:`\mathbf{w}` are fixed quantities that are re-used at
every time step of the evolution of the system.
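As a concrete instance of the system above, here is a minimal NumPy sketch (not part of the diff; all names illustrative) with one input sequence ``u``, one state ``x`` using the single tap ``x(t-1)``, and one parameter ``w``, where ``step`` plays the role of :math:`f_{\mathbf{x}_1}`:

```python
import numpy as np

def step(u_t, x_tm1, w):
    # One application of f_x: the new state from the current input,
    # the previous state and the fixed parameter.
    return np.tanh(w * x_tm1 + u_t)

def run_system(u, x0, w):
    # Evolve x(t) = f(u(t), x(t-1), w); x0 is the provided starting value.
    x, trajectory = x0, []
    for u_t in u:
        x = step(u_t, x, w)
        trajectory.append(x)
    return trajectory

traj = run_system([0.5, -0.2, 0.1], x0=0.0, w=0.9)
```

The provided ``x0`` is exactly the "first few values of the states" the text says must always be supplied.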
Each of the equations above is implemented by the **inner function** of scan. You
can think of the **inner function** as a theano function that gets executed
at each step to get the new values. This **inner function** should not be
confused with the **constructive function**, which is what the user gives to
the scan function. The **constructive function** is used to construct the
computational graph that is afterwards compiled into the **inner function**.
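The distinction can be sketched in plain Python (illustrative names only): the constructive function runs once, on placeholder arguments, to build a step description, while the returned callable, standing in for the compiled inner function, runs at every iteration:

```python
call_count = {'constructive': 0}

def constructive_fn(x_prev_placeholder):
    # Runs a single time, with a placeholder argument, and returns a
    # description of one step (a plain callable here, standing in for
    # the compiled Theano graph of the inner function).
    call_count['constructive'] += 1
    return lambda state: state + 1

inner_fn = constructive_fn(None)   # graph construction: one call
state = 0
for _ in range(5):                 # execution: many calls of inner_fn
    state = inner_fn(state)
```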
Naming conventions
------------------
* ``input_state`` will stand for a state :math:`\mathbf{x}`, when it is
provided as an input to the recurrent formula (the inner function) that
will generate the new value of the state
* ``output_state`` will stand for a state :math:`\mathbf{x}` when it refers
to the result of the recurrent formula (the output of the inner function)
* ``output`` will stand for an output :math:`\mathbf{y}`
* ``input`` will be an input :math:`\mathbf{u}`
* ``parameter`` will stand for a parameter tensor :math:`\mathbf{w}` that stays
constant at each step of the inner function
* ``non_numeric_input_state`` will stand for states that are not numeric in nature,
more specifically *random states*, when they are provided as an input. The
same holds for ``non_numeric_output_state``.
* ``t`` is the time index (the current step in the evolution of the system).
* ``T`` is the total number of steps in the evolution of the system.
* the suffix ``_slices`` added to either ``x`` or ``u`` will mean the list of
variables representing slices of states or inputs. These are the arguments
given to the constructive function of scan (see above).
* the suffix ``_inner`` added to ``x``, ``y``, ``xy``, ``u``, ``w`` or ``z``
will mean the variables representing the state/output/input/weights in the
inner function
* the suffix ``_outer`` added to ``x``, ``y``, ``xy``, ``u``, ``w`` or ``z``
will mean the variables representing the state/output/input/weights in the
main computational graph (the one containing the scan op).
Files
-----
The implementation of scan is spread over several files. The different
files, and the sections of the code they deal with, are:
* ``scan.py`` implements the ``scan`` function. The ``scan`` function
arranges the arguments of scan correctly, constructs the scan op and
afterwards calls the constructed scan op on the arguments. This function
takes care of figuring out missing inputs and shared variables.
* ``scan_op.py`` implements the ``ScanOp`` class. The ``ScanOp`` respects
the ``Op`` interface, and contains most of the logic of the scan operator.
* ``scan_utils.py`` contains several helpful functions, used throughout the
other files, that are specific to the scan operator.
* ``scan_views.py`` contains different views of the scan op that have
simpler and easier signatures to be used in specific cases.
* ``scan_opt.py`` contains the list of all optimizations for the scan
operator.
The logical flow
----------------
First the scan arguments are parsed by the function ``canonical_arguments``,
which wraps them into lists and adds default values for missing arguments.
One important step that happens in this function is that the input
arguments are converted such that they all have a single tap, namely 0. For
example, if you have ``[{'input':u, 'taps':[0, 4]}]`` as the list of input
arguments to scan, it gets converted into ``[{'input':u, 'taps':[0]},
{'input':u[4:], 'taps':[0]}]``.
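The same conversion can be sketched on plain lists (``normalize_input_taps`` is a hypothetical helper; the real ``canonical_arguments`` operates on Theano variables and also handles negative taps, which are omitted here):

```python
def normalize_input_taps(input_args):
    # Rewrite every (input, taps) pair into single-tap-0 form by
    # shifting the sequence: a non-negative tap k on u becomes tap 0
    # on u[k:].
    out = []
    for arg in input_args:
        for k in arg['taps']:
            out.append({'input': arg['input'][k:], 'taps': [0]})
    return out

u = [10, 11, 12, 13, 14, 15]
normalized = normalize_input_taps([{'input': u, 'taps': [0, 4]}])
# normalized[1]['input'] now starts at u[4]
```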
The second step is to check if ``n_steps`` is a constant and has the value 1
or -1. If that is true then the function ``one_step_scan`` is called which
unwraps the computation of the inner function into the outer graph without
adding any scan op in the graph.
theano/sandbox/scan_module/__init__.py (new file, mode 100644)
"""
This module provides the Scan Op
Scanning is a general form of recurrence, which can be used for looping.
The idea is that you *scan* a function along some input sequence, producing
an output at each time-step that can be seen (but not modified) by the
function at the next time-step. (Technically, the function can see the
previous K time-steps of your outputs and L time steps (from the past and
future) of your inputs.
So for example, ``sum()`` could be computed by scanning the ``z+x_i``
function over a list, given an initial state of ``z=0``.
Special cases:
* A *reduce* operation can be performed by returning only the last
output of a ``scan``.
* A *map* operation can be performed by applying a function that
ignores previous steps of the outputs.
Often a for-loop can be expressed as a ``scan()`` operation, and ``scan`` is
the closest that theano comes to looping. The advantage of using ``scan``
over for loops is that it allows the number of iterations to be a part of
the symbolic graph.
The Scan Op should typically be used by calling any of the following
functions: ``scan()``, ``map()``, ``reduce()``, ``foldl()``,
``foldr()``.
"""
__docformat__
=
'restructedtext en'
__authors__
=
(
"Razvan Pascanu "
"Frederic Bastien "
"James Bergstra "
"Pascal Lamblin "
"Arnaud Bergeron "
)
__copyright__
=
"(c) 2010, Universite de Montreal"
__contact__
=
"Razvan Pascanu <r.pascanu@gmail>"
from
scan
import
scan
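The behaviour described in this docstring can be mimicked by a small pure-Python reference (illustrative only; the real op builds a symbolic graph rather than running a loop eagerly):

```python
def py_scan(fn, sequence, initial):
    # Apply fn along the sequence; each step sees the previous output.
    outputs, z = [], initial
    for x_i in sequence:
        z = fn(z, x_i)
        outputs.append(z)
    return outputs

# sum(): scan the z + x_i function with an initial state of z = 0
partial_sums = py_scan(lambda z, x: z + x, [1, 2, 3, 4], 0)
# reduce: keep only the last output of the scan
total = partial_sums[-1]
# map: the step function simply ignores the previous output
squares = py_scan(lambda z, x: x * x, [1, 2, 3], None)
```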
theano/sandbox/scan_module/scan.py (new file, mode 100644)
"""
This module provides the Scan Op
Scanning is a general form of recurrence, which can be used for looping.
The idea is that you *scan* a function along some input sequence, producing
an output at each time-step that can be seen (but not modified) by the
function at the next time-step. (Technically, the function can see the
previous K time-steps of your outputs and L time steps (from past and
future) of your inputs.
So for example, ``sum()`` could be computed by scanning the ``z+x_i``
function over a list, given an initial state of ``z=0``.
Special cases:
* A *reduce* operation can be performed by using only the last
output of a ``scan``.
* A *map* operation can be performed by applying a function that
ignores previous steps of the outputs.
Often a for-loop or while-loop can be expressed as a ``scan()`` operation,
and ``scan`` is the closest that theano comes to looping. The advantages
of using ``scan`` over `for` loops in python (amongs other) are:
* it allows the number of iterations to be part of the symbolic graph
* it allows computing gradients through the for loop
* there exist a bunch of optimizations that help re-write your loop
such that less memory is used and that it runs faster
* it ensures that data is not copied from host to gpu and gpu to
host at each step
The Scan Op should typically be used by calling any of the following
functions: ``scan()``, ``map()``, ``reduce()``, ``foldl()``,
``foldr()``.
"""
__docformat__
=
'restructedtext en'
__authors__
=
(
"Razvan Pascanu "
"Frederic Bastien "
"James Bergstra "
"Pascal Lamblin "
)
__copyright__
=
"(c) 2010, Universite de Montreal"
__contact__
=
"Razvan Pascanu <r.pascanu@gmail>"
from
itertools
import
izip
import
logging
import
numpy
from
theano.compile
import
SharedVariable
,
function
from
theano
import
compile
from
theano
import
gof
from
theano.tensor
import
opt
from
theano
import
tensor
from
theano
import
config
from
theano.updates
import
Updates
from
theano.scalar.sharedvar
import
shared
as
scalar_shared
from
theano.compile.pfunc
import
rebuild_collect_shared
import
theano
import
scan_op
import
scan_utils
# Logging function for sending warning or info
_logger
=
logging
.
getLogger
(
'theano.scan_module.scan'
)
def
scan
(
fn
,
sequences
=
None
,
outputs_info
=
None
,
non_sequences
=
None
,
n_steps
=
None
,
truncate_gradient
=-
1
,
go_backwards
=
False
,
mode
=
None
,
name
=
None
,
options
=
None
,
profile
=
False
):
"""
This function constructs and applies a Scan op to the provided
arguments.
:param fn:
``fn`` is a function that describes the operations involved in one
step of ``scan``. ``fn`` should construct variables describing the
output of one iteration step. It should expect as input theano
variables representing all the slices of the input sequences
and previous values of the outputs, as well as all other arguments
given to scan as ``non_sequences``. The order in which scan passes
these variables to ``fn`` is the following :
* all time slices of the first sequence
* all time slices of the second sequence
* ...
* all time slices of the last sequence
* all past slices of the first output
* all past slices of the second output
* ...
* all past slices of the last output
* all other arguments (the list given as `non_sequences` to
scan)
The order of the sequences is the same as the one in the list
`sequences` given to scan. The order of the outputs is the same
as the order of ``output_info``. For any sequence or output the
order of the time slices is the same as the one in which they have
been given as taps. For example if one writes the following :
.. code-block:: python

    scan(fn,
         sequences=[dict(input=Sequence1, taps=[-3, 2, -1]),
                    Sequence2,
                    dict(input=Sequence3, taps=3)],
         outputs_info=[dict(initial=Output1, taps=[-3, -5]),
                       dict(initial=Output2, taps=None),
                       Output3],
         non_sequences=[Argument1, Argument2])
``fn`` should expect the following arguments in this given order:
#. ``Sequence1[t-3]``
#. ``Sequence1[t+2]``
#. ``Sequence1[t-1]``
#. ``Sequence2[t]``
#. ``Sequence3[t+3]``
#. ``Output1[t-3]``
#. ``Output1[t-5]``
#. ``Output3[t-1]``
#. ``Argument1``
#. ``Argument2``
The list of ``non_sequences`` can also contain shared variables
used in the function, though ``scan`` is able to figure those
out on its own, so they can be skipped. For the clarity of the
code we recommend, though, to provide them to scan. To some extent
``scan`` can also figure out other ``non sequences`` (not shared)
even if not passed to scan (but used by `fn`). A simple example of
this would be:
.. code-block:: python

    import theano.tensor as TT

    W = TT.matrix()
    W_2 = W ** 2

    def f(x):
        return TT.dot(x, W_2)
The function is expected to return two things. One is a list of
outputs ordered in the same order as ``outputs_info``, with the
difference that there should be only one output variable per
output initial state (even if no tap value is used). Secondly
`fn` should return an update dictionary (that tells how to
update any shared variable after each iteration step). The
dictionary can optionally be given as a list of tuples. There is
no constraint on the order of these two lists, ``fn`` can return
either ``(outputs_list, update_dictionary)`` or
``(update_dictionary, outputs_list)`` or just one of the two (in
case the other is empty).
To use ``scan`` as a while loop, the user needs to change the
function ``fn`` such that it also returns a stopping condition.
To do so, the condition has to be wrapped in an ``until`` class.
The condition should be returned as a third element, for example:
.. code-block:: python

    ...
    return [y1_t, y2_t], {x: x + 1}, theano.scan_module.until(x < 50)
Note that a number of steps (considered in here as the maximum
number of steps) is still required even though a condition is
passed (it is used to allocate memory if needed).
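The interaction between the ``until`` condition and the mandatory step count can be sketched in plain Python (``until`` here is a stand-in for ``theano.scan_module.until``, and ``while_scan`` an illustrative helper, not part of the diff):

```python
class until(object):
    # Stand-in wrapper marking a returned value as a stopping condition.
    def __init__(self, condition):
        self.condition = condition

def while_scan(step, state, n_steps):
    # Run at most n_steps iterations (the cap that lets memory be
    # allocated up front), stopping early once the condition holds.
    outputs = []
    for _ in range(n_steps):
        state, stop = step(state)
        outputs.append(state)
        if stop.condition:
            break
    return outputs

# double the state until it exceeds 50, with an upper bound of 100 steps
outs = while_scan(lambda x: (x * 2, until(x * 2 > 50)), 1, 100)
```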
:param sequences:
``sequences`` is the list of Theano variables or dictionaries
describing the sequences ``scan`` has to iterate over. If a
sequence is given as wrapped in a dictionary, then a set of optional
information can be provided about the sequence. The dictionary
should have the following keys:
* ``input`` (*mandatory*) -- Theano variable representing the
sequence.
* ``taps`` -- Temporal taps of the sequence required by ``fn``.
They are provided as a list of integers, where a value ``k``
implies that at iteration step ``t`` scan will pass to ``fn``
the slice ``t+k``. The default value is ``[0]``.
Any Theano variable in the list ``sequences`` is automatically
wrapped into a dictionary where ``taps`` is set to ``[0]``
:param outputs_info:
``outputs_info`` is the list of Theano variables or dictionaries
describing the initial state of the outputs computed
recurrently. When this initial states are given as dictionary
optional information can be provided about the output corresponding
to these initial states. The dictionary should have the following
keys:
* ``initial`` -- Theano variable that represents the initial
state of a given output. In case the output is not computed
recursively (think of a map) and does not require an initial
state, this field can be skipped. Given that only the previous
time step of the output is used by ``fn``, the initial state
should have the same shape as the output. If multiple time
taps are used, the initial state should have one extra
dimension that covers all the possible taps. For example
if we use ``-5``, ``-2`` and ``-1`` as past taps, at step 0,
``fn`` will require (by an abuse of notation) ``output[-5]``,
``output[-2]`` and ``output[-1]``. These will be given by
the initial state, which in this case should have the shape
``(5,) + output.shape``. If this variable containing the initial
state is called ``init_y``, then ``init_y[0]`` *corresponds to*
``output[-5]``, ``init_y[1]`` *corresponds to* ``output[-4]``,
``init_y[2]`` corresponds to ``output[-3]``, ``init_y[3]``
corresponds to ``output[-2]``, and ``init_y[4]`` corresponds to
``output[-1]``. While this order might seem strange, it comes
naturally from splitting an array at a given point. Assume that
we have an array ``x``, and we choose ``k`` to be time step
``0``. Then our initial state would be ``x[:k]``, while the
output will be ``x[k:]``. Looking at this split, elements in
``x[:k]`` are ordered exactly like those in ``init_y``.
* ``taps`` -- Temporal taps of the output that will be passed to
``fn``. They are provided as a list of *negative* integers,
where a value ``k`` implies that at iteration step ``t`` scan
will pass to ``fn`` the slice ``t+k``.
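The ordering of ``init_y`` described above can be checked with a short NumPy sketch (``past_value`` is a hypothetical helper): with taps ``-5, -2, -1`` the initial state holds five entries, and a negative tap at step ``t`` indexes ``init_y`` whenever ``t + tap`` falls before step 0:

```python
import numpy as np

def past_value(init_y, output, t, tap):
    # Value of output[t + tap] for tap < 0, read from init_y when the
    # index falls before time step 0 (init_y[i] stands for
    # output[i - len(init_y)]).
    idx = t + tap
    return output[idx] if idx >= 0 else init_y[idx + len(init_y)]

x = np.arange(10, 20)          # conceptual full sequence
k = 5                          # time step 0 chosen at index 5
init_y, output = x[:k], x[k:]  # pre-history vs. actual outputs

# At step 0, tap -5 reads init_y[0] and tap -1 reads init_y[4]:
first = past_value(init_y, output, 0, -5)   # == x[0]
last = past_value(init_y, output, 0, -1)    # == x[4]
```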
``scan`` will follow this logic if partial information is given:
* If an output is not wrapped in a dictionary, ``scan`` will wrap
it in one assuming that you use only the last step of the output
(i.e. it makes your tap value list equal to [-1]).
* If you wrap an output in a dictionary and you do not provide any
taps but you provide an initial state it will assume that you are
using only a tap value of -1.
* If you wrap an output in a dictionary but you do not provide any
initial state, it assumes that you are not using any form of
taps.
* If you provide a ``None`` instead of a variable or an empty
dictionary, ``scan`` assumes that you will not use any taps for
this output (like, for example, in the case of a map).
If ``outputs_info`` is an empty list or None, ``scan`` assumes
that no tap is used for any of the outputs. If information is
provided just for a subset of the outputs an exception is
raised (because there is no convention on how scan should map
the provided information to the outputs of ``fn``)
:param non_sequences:
``non_sequences`` is the list of arguments that are passed to
``fn`` at each steps. One can opt to exclude variable
used in ``fn`` from this list as long as they are part of the
computational graph, though for clarity we encourage not to do so.
:param n_steps:
``n_steps`` is the number of steps to iterate given as an int
or Theano scalar. If any of the input sequences do not have
enough elements, scan will raise an error. If the *value is 0* the
outputs will have *0 rows*. If the value is negative, ``scan``
will run backwards in time. If the ``go_backwards`` flag is already
set and also ``n_steps`` is negative, ``scan`` will run forward
in time. If n stpes is not provided, ``scan`` will figure
out the amount of steps it should run given its input sequences.
:param truncate_gradient:
``truncate_gradient`` is the number of steps to use in truncated
BPTT. If you compute gradients through a scan op, they are
computed using backpropagation through time. By providing a
different value then -1, you choose to use truncated BPTT instead
of classical BPTT, where you go for only ``truncate_gradient``
number of steps back in time.
:param go_backwards:
``go_backwards`` is a flag indicating if ``scan`` should go
backwards through the sequences. If you think of each sequence
as indexed by time, making this flag True would mean that
``scan`` goes back in time, namely that for any sequence it
starts from the end and goes towards 0.
:param name:
When profiling ``scan``, it is crucial to provide a name for any
instance of ``scan``. The profiler will produce an overall
profile of your code as well as profiles for the computation of
one step of each instance of ``scan``. The ``name`` of the instance
appears in those profiles and can greatly help to disambiguate
information.
:param mode:
It is recommended to leave this argument to None, especially
when profiling ``scan`` (otherwise the results are not going to
be accurate). If you prefer the computations of one step of
``scan`` to be done differently then the entire function, you
can use this parameter to describe how the computations in this
loop are done (see ``theano.function`` for details about
possible values and their meaning).
:param profile:
Flag or string. If true, or different from the empty string, a
profile object will be created and attached to the inner graph of
scan. In case ``profile`` is True, the profile object will have the
name of the scan instance, otherwise it will have the passed string.
Profile object collect (and print) information only when running the
inner graph with the new cvm linker ( with default modes,
other linkers this argument is useless)
:rtype: tuple
:return: tuple of the form (outputs, updates); ``outputs`` is either a
Theano variable or a list of Theano variables representing the
outputs of ``scan`` (in the same order as in ``outputs_info``).
``updates`` is a subclass of dictionary specifying the
update rules for all shared variables used in scan.
This dictionary should be passed to ``theano.function`` when
you compile your function. The change compared to a normal
dictionary is that the keys are validated to be SharedVariables
and the merging of two such dictionaries is validated to be
consistent.
"""
    # Note : see the internal documentation of the scan op for naming
    # conventions and all other details
    if options is None:
        options = {}
    rvals = scan_utils.canonical_arguments(sequences,
                                           outputs_info,
                                           non_sequences,
                                           go_backwards,
                                           n_steps)
    inputs, states_and_outputs_info, parameters, T = rvals
    # If we provided a known number of steps (before compilation)
    # and if that number is 1 or -1, then we can skip the Scan Op,
    # and just apply the inner function once.
    # To do that we check here the nature of n_steps
    T_value = None
    if isinstance(n_steps, (float, int)):
        T_value = int(n_steps)
    else:
        try:
            T_value = opt.get_constant_value(n_steps)
        except (TypeError, AttributeError):
            T_value = None

    if T_value in (1, -1):
        return one_step_scan(fn,
                             inputs,
                             states_and_outputs_info,
                             parameters,
                             truncate_gradient)

    # 1. Variable representing the current time step
    t = scalar_shared(numpy.int64(0), name='t')

    # 2. Allocate memory for the states of scan.
    mintaps = []
    lengths = []
    for pos, arg_info in enumerate(states_and_outputs_info):
        if arg_info.get('taps', None) == [-1]:
            mintaps.append(1)
            lengths.append(scalar_shared(numpy.int64(0), name='l%d' % pos))
            arg_info['initial'] = scan_utils.expand(
                tensor.unbroadcast(
                    tensor.shape_padleft(arg_info['initial']), 0), T)
        elif arg_info.get('taps', None):
            if numpy.any(numpy.array(arg_info.get('taps', [])) > 0):
                # Make sure we do not have requests for future values of
                # a sequence; we can not provide such values
                raise ValueError('Can not use future taps of outputs',
                                 arg_info)
            mintap = abs(numpy.min(arg_info['taps']))
            lengths.append(scalar_shared(numpy.int64(0), name='l%d' % pos))
            mintaps.append(mintap)
            arg_info['initial'] = scan_utils.expand(
                arg_info['initial'][:mintap], T)
        else:
            mintaps.append(0)
            lengths.append(scalar_shared(numpy.int64(0), name='l%d' % pos))

    # 3. Generate arguments for the function passed to scan. This
    # function will return the outputs that need to be computed at every
    # time step
    inputs_slices = [input[t] for input in inputs]
    states_slices = []
    for n, state in enumerate(states_and_outputs_info):
        # Check if it is actually a state and not an output
        if mintaps[n] != 0:
            for k in state['taps']:
                states_slices.append(
                    state['initial'][(t + mintaps[n] + k) % lengths[n]])

    # 4. Construct outputs that are to be computed by the inner
    # function of scan
    args = inputs_slices + states_slices + parameters
    cond, states_and_outputs, updates = \
            scan_utils.get_updates_and_outputs(fn(*args))

    # The user is allowed to provide no information if scan only behaves
    # like a map
    if (len(states_and_outputs) != len(states_and_outputs_info) and
            len(states_and_outputs_info) == 0):
        mintaps = [0] * len(states_and_outputs)

    # 5. Construct the scan op
    # 5.1 Construct list of shared variables with updates (those that
    # can be treated as states (i.e. of TensorType) and those that can
    # not (like Random States))
    if cond is not None:
        _cond = [cond]
    else:
        _cond = []
    rvals = rebuild_collect_shared(states_and_outputs + _cond,
                                   updates=updates,
                                   rebuild_strict=True,
                                   copy_inputs_over=True,
                                   no_default_updates=False)
    # extracting the arguments
    input_variables, cloned_outputs, other_rval = rvals
    clone_d, update_d, update_expr, shared_inputs = other_rval

    additional_input_states = []
    additional_output_states = []
    additional_lengths = []
    additional_mintaps = []
    original_numeric_shared_variables = []

    non_numeric_input_states = []
    non_numeric_output_states = []
    original_non_numeric_shared_variables = []
    pos = len(lengths)
    for sv in shared_inputs:
        if sv in update_d:
            if isinstance(sv, TensorType):
                # We can treat it as a sit-sot
                nw_state = scan_utils.expand(
                    tensor.unbroadcast(tensor.shape_padleft(sv), 0), T)
                additional_lengths.append(
                    scalar_shared(numpy.int64(0), name='l%d' % pos))
                pos = pos + 1
                additional_mintaps.append(1)
                additional_input_states.append(nw_state)
                additional_output_states.append(
                    scan_utils.clone(tensor.set_subtensor(
                        nw_state[(t + 1) % additional_lengths[-1]],
                        update_d[sv])))
                original_numeric_shared_variables.append(sv)
            else:
                non_numeric_input_states.append(sv)
                non_numeric_output_states.append(update_d[sv])
                original_non_numeric_shared_variables.append(sv)

    # 5.2 Collect inputs/outputs of the inner function
    inputs = []
    outputs = []
    for n, mintap in enumerate(mintaps):
        if mintap != 0:
            input_state = states_and_outputs_info[n]['initial']
            inputs.append(input_state)
            outputs.append(tensor.set_subtensor(
                input_state[(t + mintap) % lengths[n]],
                states_and_outputs[n]))
        else:
            mem_buffer = scan_utils.allocate_memory(
                T, states_and_outputs_info[n], states_and_outputs[n])
            inputs.append(mem_buffer)
            outputs.append(tensor.set_subtensor(
                mem_buffer[t % lengths[n]], states_and_outputs[n]))
    inputs.extend(additional_input_states)
    outputs.extend(additional_output_states)
    lengths.extend(additional_lengths)
    mintaps.extend(additional_mintaps)
    inputs.extend(non_numeric_input_states)
    outputs.extend(non_numeric_output_states)
    all_other_inputs = gof.graph.inputs(outputs)
    parameters = [x for x in all_other_inputs
                  if (x not in inputs and
                      x not in lengths and
                      x is not t and
                      isinstance(x, gof.Variable) and
                      not isinstance(x, gof.Constant))]
    inputs.extend(parameters)

    # 5.3 Construct the options dictionary
    options['name'] = name
    options['profile'] = profile
    options['mode'] = mode
    options['inplace'] = False
    options['gpu'] = False
    options['truncate_gradient'] = truncate_gradient
    options['hash_inner_graph'] = 0

    # 5.4 Construct the ScanOp instance
    local_op = scan_op.ScanOp(inputs=inputs,
                              outputs=outputs,
                              lengths=lengths,
                              switches=[],
                              mintaps=mintaps,
                              index=t,
                              options=options,
                              as_repeatUntil=cond)
    # Note that we get here all the outputs followed by the update rules
    # to the shared variables we had in our scan.
    # We know that we have (in this given order):
    #   * len(states_and_outputs) real outputs
    #   * len(additional_input_states) updates for numeric shared
    #     variables
    #   * len(non_numeric_input_states) updates for non numeric shared
    #     variables
    scan_inputs = [T] + inputs
    scan_outputs_update_rules = scan_utils.to_list(local_op(*scan_inputs))

    # 5.5 Collect outputs and add permutation object
    scan_outputs = []
    for pos in xrange(len(states_and_outputs)):
        out = scan_utils.ScanPermutation(mintaps[pos])(
            scan_outputs_update_rules[pos], t)
        scan_outputs.append(out[mintaps[pos]:])

    # 5.6 Construct updates dictionary
    update_rules = scan_outputs_update_rules[len(states_and_outputs):]
    updates = {}
    for v, u in izip(original_numeric_shared_variables,
                     update_rules[:len(additional_input_states)]):
        updates[v] = u[-1]
    for v, u in izip(original_non_numeric_shared_variables,
                     update_rules[len(additional_input_states):]):
        updates[v] = u

    # Step 5.7 We are done and can return everything back to the user
    return scan_outputs, updates
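The circular indexing ``(t + mintaps[n] + k) % lengths[n]`` used above keeps each state buffer at a fixed size instead of growing it with ``t``. A NumPy-free sketch of the scheme (buffer size and taps chosen purely for illustration):

```python
def state_slot(t, mintap, tap, length):
    # Slot holding output[t + tap] (tap <= 0) in a circular buffer of
    # the given length; slot (t + mintap) % length receives the value
    # written at step t.
    return (t + mintap + tap) % length

length, mintap = 8, 2           # taps -2 and -1 in use
# At t = 0 the two initial values sit in slots 0 and 1:
slot_a = state_slot(0, mintap, -2, length)   # 0
slot_b = state_slot(0, mintap, -1, length)   # 1
# Writes wrap around instead of growing the buffer:
slot_w = state_slot(6, mintap, 0, length)    # (6 + 2) % 8 == 0
```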
def one_step_scan(fn,
                  inputs,
                  states_and_outputs_info,
                  parameters,
                  truncate_gradient):
    """
    This function is evaluated if `n_steps` evaluates to either 1 or -1.
    """
    # 1. Grab slices of sequences
    inputs_slices = [input[0] for input in inputs]

    # 2. Grab slices of states
    states_slices = []
    for n, arg_info in enumerate(states_and_outputs_info):
        if arg_info.get('taps', None) == [-1]:
            states_slices.append(arg_info['initial'])
        elif arg_info.get('taps', None):
            if numpy.any(numpy.array(arg_info.get('taps', [])) > 0):
                # Make sure we do not have requests for future values of
                # a sequence; we can not provide such values
                raise ValueError('Can not use future taps of outputs',
                                 arg_info)
            # go through the taps
            mintap = abs(numpy.min(arg_info['taps']))
            for k in arg_info['taps']:
                states_slices.append(arg_info['initial'][k + mintap])

    # Re-order args
    args = (inputs_slices + states_slices + parameters)
    cond, states_and_outputs, updates = \
            scan_utils.get_updates_and_outputs(fn(*args))

    # We do not need to use the scan op anymore, so we can just return
    # the outputs and updates we have
    if cond is not None:
        _logger.warning('When the number of steps is fixed and equal '
                        'to 1, the provided stopping condition, %s, '
                        'is ignored', str(cond))

    states_and_outputs = [tensor.unbroadcast(
        tensor.shape_padleft(arg), 0) for arg in states_and_outputs]
    if len(states_and_outputs) == 1:
        states_and_outputs = states_and_outputs[0]

    return (states_and_outputs, updates)
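The shape convention of ``one_step_scan`` can be sketched with NumPy (``one_step`` is an illustrative helper, not the diff's code): the inner function runs once and every result gains a leading time axis of length 1, as ``tensor.shape_padleft`` provides:

```python
import numpy as np

def one_step(fn, input_slices):
    # Call fn once and give each result a leading time dimension of
    # length 1, mirroring tensor.shape_padleft in one_step_scan.
    results = fn(*input_slices)
    if not isinstance(results, (list, tuple)):
        results = [results]
    padded = [np.asarray(r)[None, ...] for r in results]
    return padded[0] if len(padded) == 1 else padded

out = one_step(lambda u: u * 2.0, [np.array([1.0, 2.0])])
# a single step still yields outputs with a time axis: shape (1, 2)
```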
theano/sandbox/scan_module/scan_op.py (new file, mode 100644)
"""
This module provides the Scan Op
See scan.py for details on scan
"""
__docformat__
=
'restructedtext en'
__authors__
=
(
"Razvan Pascanu "
"Frederic Bastien "
"James Bergstra "
"Pascal Lamblin "
)
__copyright__
=
"(c) 2010, Universite de Montreal"
__contact__
=
"Razvan Pascanu <r.pascanu@gmail>"
import
itertools
import
logging
import
time
from
itertools
import
izip
import
numpy
import
theano
from
theano.compile
import
function
,
Param
,
Out
from
theano
import
compile
from
theano
import
gradient
from
theano.gof.python25
import
any
from
theano.gof
import
PureOp
,
Apply
from
theano
import
gof
from
theano.tensor
import
TensorType
from
theano
import
tensor
from
theano.tensor.opt
import
Shape_i
#from theano.sandbox import cuda
from
theano.compile.profiling
import
ScanProfileStats
import
scan_utils
# Logging function for sending warning or info
_logger
=
logging
.
getLogger
(
'theano.scan_module.scan_op'
)
class ScanOp(PureOp):
    def __init__(self, inputs, outputs, lengths, switches,
                 mintaps, index, options, as_repeatUntil):
        self.inputs = inputs
        self.outputs = outputs
        self.index = index
        self.switches = switches
        self.lengths = lengths
        self.mintaps = mintaps
        self.as_repeatUntil = as_repeatUntil
        self.options = options
        self.name = options['name']
        self.mode = options['mode']
        self.inplace = options['inplace']
        self.gpu = options['gpu']
        self.profile = options['profile']
        self.hash_inner_graph = options['hash_inner_graph']
        # --Construct the destroy map--
        if self.inplace:
            self.destroy_map = {}
            for idx in xrange(len(outputs)):
                self.destroy_map[idx] = [idx + 1]
        # --Decide on the default mode--
        mode_instance = compile.mode.get_mode(self.mode)
        # if the default mode is used, and that mode is ProfileMode
        # then we need to copy the mode otherwise the time for a given
        # op will be counted multiple times
        if (self.mode is None and
                isinstance(mode_instance, compile.profilemode.ProfileMode)):
            mode_instance = compile.profilemode.ProfileMode(
                optimizer=mode_instance.provided_optimizer,
                linker=mode_instance.provided_linker)
            compile.profilemode.prof_mode_instance_to_print.append(
                mode_instance)
            self.mode_instance = mode_instance
            if self.name:
                self.mode_instance.message = self.name + " sub profile"
            else:
                self.mode_instance.message = "Scan sub profile"
        else:
            self.mode_instance = mode_instance
        # --Adding default name--
        if not hasattr(self, 'name') or self.name is None:
            self.name = 'scan_fn'
    def make_node(self, *inputs):
        # Checking if arguments are of the right type is done in the scan
        # function
        out_types = [out.type() for out in self.outputs]
        return Apply(self, inputs, out_types)

    def __eq__(self, other):
        # Check if we are dealing with same type of objects
        if not type(self) == type(other):
            return False
        if self.options != other.options:
            return False
        if self.mintaps != other.mintaps:
            return False
        # Check if the number of different types of arguments is the same
        diff_args = ['inputs', 'outputs', 'lengths', 'mintaps', 'switches']
        for arg in diff_args:
            if len(getattr(self, arg)) != len(getattr(other, arg)):
                return False
        for x, y in izip(self.inputs, other.inputs):
            if x.type != y.type:
                return False
        for x, y in izip(self.lengths, other.lengths):
            if x.type != y.type:
                return False
        s_ins = [self.index] + self.inputs + self.lengths + self.switches
        o_ins = [other.index] + other.inputs + other.lengths + other.switches
        givens = dict(izip(s_ins, o_ins))
        # This part might be slow
        for x, y in izip(self.outputs, other.outputs):
            if not gof.graph.is_same_graph(x, y, givens=givens):
                return False
        return True

    def __str__(self):
        if self.gpu:
            gpu_str = 'gpu'
        else:
            gpu_str = 'cpu'
        if self.as_repeatUntil is not None:
            name = 'repeat/until'
        else:
            name = 'loop'
        if self.inplace:
            aux_txt = '%s{inplace,%s,%s}' % (name, gpu_str, str(self.name))
        else:
            aux_txt = '%s{%s,%s}' % (name, gpu_str, str(self.name))
        return aux_txt

    def __hash__(self):
        rval = hash(type(self)) ^ self.hash_inner_graph
        for val in self.options.values():
            if isinstance(val, (list, tuple)):
                for el in val:
                    rval = rval ^ el
            else:
                rval = rval ^ val
        return rval
    def infer_shape(self, node, input_shapes):
        for inp, inp_shp in izip(node.inputs, input_shapes):
            assert inp_shp is None or len(inp_shp) == inp.type.ndim
        n_outs = len(self.outputs)
        if self.as_repeatUntil is not None:
            return [(Shape_i(0)(o),) + x[1:]
                    for o, x in izip(node.outputs,
                                     input_shapes[1: n_outs + 1])]
        else:
            return input_shapes[1: n_outs + 1]
    def make_thunk(self, node, storage_map, compute_map, no_recycling):
        """
        :param node: the Apply node returned by the ``make_node`` function
            of the scan op class
        :param storage_map: dict variable -> one-element-list where a computed
            value for this variable may be found.
        :param compute_map: dict variable -> one-element-list where a boolean
            value will be found. The boolean indicates whether the
            variable's storage_map container contains a valid value (True)
            or if it has not been computed yet (False).
        :param no_recycling: list of variables for which it is forbidden to
            reuse memory allocated by a previous call.
        :note: If the thunk consults the storage_map on every call, it is safe
            for it to ignore the no_recycling argument, because elements of the
            no_recycling list will have a value of None in the storage map. If
            the thunk can potentially cache return values (like CLinker does),
            then it must not do so for variables in the no_recycling list.
        """
        # 1. Collect all memory buffers
        node_input_storage = [storage_map[r] for r in node.inputs]
        node_output_storage = [storage_map[r] for r in node.outputs]
        node_input_compute = [compute_map[r] for r in node.inputs]
        node_output_compute = [compute_map[r] for r in node.outputs]

        # 2. Construct fake shared variables around every argument of scan
        givens = {}
        base_inputs = self.inputs[:len(self.outputs)]
        base_buffers = node_input_storage[1: 1 + len(base_inputs)]
        aux_inputs = self.inputs[len(self.outputs):]
        aux_membuffers = node_input_storage[1 + len(base_inputs):]

        # 2.1 First the auxiliary arguments, those that are parameters or
        # inputs
        def fake_shared(var):
            val = 0
            for dim in xrange(var.ndim):
                val = [val]
            val = numpy.asarray(val, dtype=var.dtype)
            return theano.shared(val, name=var.name)

        non_tensor_args = []
        non_tensor_buffers = []
        aux_buffers = []
        for mem_buf, var in izip(aux_membuffers, aux_inputs):
            if mem_buf[0] is not None:
                givens[var] = theano.shared(mem_buf[0], name=var.name,
                                            borrow=True)
            elif isinstance(var, TensorType):
                givens[var] = fake_shared(var)
                aux_buffers.append((givens[var], mem_buf))
            else:
                givens[var] = var.type()
                non_tensor_args.append(givens[var])
                non_tensor_buffers.append(mem_buf)

        # 2.2 Next the states (numeric) and the outputs
        updates = {}
        state_buffers = []
        n_numeric_values = len(self.lengths)
        for pos in xrange(n_numeric_values):
            var = base_inputs[pos]
            mem_buf = base_buffers[pos]
            expr = self.outputs[pos]
            givens[var] = fake_shared(var)
            state_buffers.append((givens[var], self.lengths[pos], mem_buf))
            updates[givens[var]] = expr

        # 2.3 Non-numeric states
        n_non_numeric = len(self.outputs) - n_numeric_values
        fn_outs = self.outputs[n_numeric_values:]
        for var in base_inputs[n_numeric_values:]:
            givens[var] = var.type()
            non_tensor_args.append(givens[var])
        non_numeric_states_bufs = base_buffers[n_numeric_values:]

        # 2.4 Add the update for the index of scan
        updates[self.index] = self.index + numpy.int64(1)

        # 3.1 Construct the inner function of scan
        if self.as_repeatUntil is not None:
            fn_outs = self.as_repeatUntil
        self.fn = theano.function(non_tensor_args,
                                  fn_outs,
                                  givens=givens,
                                  updates=updates,
                                  mode=self.mode_instance,
                                  name=self.name,
                                  profile=self.profile)

        # 3.2 Construct the perform
        if self.as_repeatUntil is not None:
            # 3.2.1 as a repeat until
            def p(node, args, outs):
                pos = 0
                cont = 1
                # copy inputs if not inplace
                if not self.inplace:
                    for _, _, val in state_buffers:
                        val[0] = val[0].copy()
                    for buf in non_numeric_states_bufs:
                        buf[0] = buf[0].copy()
                # reset all switches if any
                for sw in self.switches:
                    sw.set_value(numpy.int8(0), borrow=True)
                # set aux shared variables
                for var, val in aux_buffers:
                    var.set_value(val[0], borrow=True)
                # set state shared variables
                for var, length, val in state_buffers:
                    var.set_value(val[0], borrow=True)
                    length.set_value(val[0].shape[0], borrow=True)
                # grab fixed arguments
                fix_args = [x[0] for x in non_tensor_buffers]
                while cont and pos < node_input_storage[0][0]:
                    extra_args = [x[0] for x in non_numeric_states_bufs]
                    rvals = self.fn(*(fix_args + extra_args))
                    for buf, rval in izip(non_numeric_states_bufs, rvals):
                        buf[0] = rval
                    cont = rvals[-1]
                    pos = pos + 1
                # We need to trim the outputs if they are longer
                for pos in xrange(n_numeric_values):
                    buf = state_buffers[pos][2][0]
                    mintap = self.mintaps[pos]
                    if buf.shape[0] > pos + self.mintaps[pos]:
                        node_output_storage[pos][0] = buf[:pos + mintap]
                    else:
                        node_output_storage[pos][0] = buf
                for out_buf, in_buf in izip(
                        node_output_storage[n_numeric_values:],
                        non_numeric_states_bufs):
                    out_buf[0] = in_buf[0]
        else:
            # 3.2.2 as a for
            def p(node, args, outs):
                # copy inputs if not inplace
                if not self.inplace:
                    for _, _, val in state_buffers:
                        val[0] = val[0].copy()
                    for buf in non_numeric_states_bufs:
                        buf[0] = buf[0].copy()
                # reset all switches if any
                for sw in self.switches:
                    sw.set_value(numpy.int8(0), borrow=True)
                # set aux shared variables
                for var, val in aux_buffers:
                    var.set_value(val[0], borrow=True)
                # set state shared variables
                for var, length, val in state_buffers:
                    var.set_value(val[0], borrow=True)
                    length.set_value(val[0].shape[0], borrow=True)
                # grab fixed arguments
                fix_args = [x[0] for x in non_tensor_buffers]
                for dx in xrange(node_input_storage[0][0]):
                    extra_args = [x[0] for x in non_numeric_states_bufs]
                    rvals = self.fn(*(fix_args + extra_args))
                    for buf, rval in izip(non_numeric_states_bufs, rvals):
                        buf[0] = rval
                for pos in xrange(n_numeric_values):
                    buf = state_buffers[pos][2][0]
                    mintap = self.mintaps[pos]
                    node_output_storage[pos][0] = buf
                for out_buf, in_buf in izip(
                        node_output_storage[n_numeric_values:],
                        non_numeric_states_bufs):
                    out_buf[0] = in_buf[0]

        # 3.3 construct the rval function
        def rval(p=p, i=node_input_storage, o=node_output_storage, n=node):
            r = p(n, [x[0] for x in i], o)
            for out in node.outputs:
                compute_map[out][0] = True
            return r
        rval.inputs = node_input_storage
        rval.outputs = node_output_storage
        rval.perform = p
        rval.lazy = False
        return rval
    def grad(self, args, g_outs):
        pass

    def R_op(self, inputs, eval_points):
        pass


@theano.compile.profilemode.register_profiler_printer
def profile_printer(fct_name, compile_time, fct_call_time, fct_call,
                    apply_time, apply_cimpl, message, outputs_size,
                    other_time):
    # Scan overhead profile
    if any([isinstance(node.op, ScanOp) and v > 0
            for (_, node), v in apply_time.items()]):
        print
        print 'Scan overhead:'
        print ('<Scan op time(s)> <sub scan fct time(s)> <sub scan op '
               'time(s)> <sub scan fct time(% scan op time)> <sub scan '
               'op time(% scan op time)> <node>')
        total_super_scan_time = 0
        total_scan_fct_time = 0
        total_scan_op_time = 0
        for (_, node), v in apply_time.items():
            if isinstance(node.op, ScanOp):
                if v > 0:
                    scan_fct_time = node.op.mode_instance.fn_time
                    scan_op_time = node.op.mode_instance.local_time
                    total_super_scan_time += v
                    total_scan_fct_time += scan_fct_time
                    total_scan_op_time += scan_op_time
                    print '    %5.1fs  %5.1fs  %5.1fs  %5.1f%%  %5.1f%%' % (
                        v,
                        scan_fct_time,
                        scan_op_time,
                        scan_fct_time / v * 100,
                        scan_op_time / v * 100), node
                else:
                    print (' The node took 0s, so we can not compute the '
                           'overhead'), node
        print '    total %5.1fs  %5.1fs  %5.1fs  %5.1f%%  %5.1f%%' % (
            total_super_scan_time,
            total_scan_fct_time,
            total_scan_op_time,
            total_scan_fct_time / total_super_scan_time * 100,
            total_scan_op_time / total_super_scan_time * 100)
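`make_thunk` above follows the Theano linker convention that every variable maps to a one-element list acting as a mutable storage cell, with a parallel `compute_map` of booleans. A minimal pure-Python sketch of that cell protocol (the names here are illustrative, not Theano's API):

```python
# Minimal sketch of the storage_map/compute_map convention used by
# make_thunk: each variable's value lives in a one-element list, and the
# thunk reads input cells, writes the output cell, and flags completion.
def make_add_thunk(in_cells, out_cell, computed):
    def thunk():
        a, b = in_cells[0][0], in_cells[1][0]
        out_cell[0] = a + b   # write the result into the output cell
        computed[0] = True    # mark the output as computed
    return thunk

x, y, out, done = [2], [3], [None], [False]
thunk = make_add_thunk([x, y], out, done)
thunk()
# out[0] == 5 and done[0] is True
```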
theano/sandbox/scan_module/scan_utils.py
0 → 100644
"""
This module provides utility functions for the Scan Op
See scan.py for details on scan
"""
__docformat__
=
'restructedtext en'
__authors__
=
(
"Razvan Pascanu "
"Frederic Bastien "
"James Bergstra "
"Pascal Lamblin "
"Arnaud Bergeron"
)
__copyright__
=
"(c) 2010, Universite de Montreal"
__contact__
=
"Razvan Pascanu <r.pascanu@gmail>"
import
copy
import
logging
from
itertools
import
izip
import
numpy
import
theano
from
theano.compile.pfunc
import
rebuild_collect_shared
from
theano
import
gof
from
theano
import
tensor
,
scalar
from
theano.gof.python25
import
all
from
theano.tensor.basic
import
get_constant_value
# Logging function for sending warning or info
_logger
=
logging
.
getLogger
(
'theano.scan_utils'
)
def expand(tensor_var, size):
    """
    Given ``tensor_var``, a Theano tensor of shape (d1, d2, ..), this
    function constructs a rval Theano tensor of shape (d1 + size, d2, ..)
    filled with 0s, except the first d1 entries which are taken from
    ``tensor_var``, namely:
        rval[:d1] = tensor_var

    :param tensor_var: Theano tensor variable
    :param size: int
    """
    # Corner case that I might use in an optimization
    if size == 0:
        return tensor_var
    shapes = [tensor_var.shape[x] for x in xrange(tensor_var.ndim)]
    zeros_shape = [size + shapes[0]] + shapes[1:]
    empty = tensor.zeros(zeros_shape, dtype=tensor_var.dtype)
    return tensor.set_subtensor(empty[:shapes[0]], tensor_var)
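A NumPy analogue of ``expand`` makes its contract concrete: grow the first dimension by ``size`` zero-filled rows while keeping the original entries at the front (this is a sketch of the semantics, not the Theano implementation):

```python
import numpy as np

# NumPy analogue of scan_utils.expand: grow the leading dimension by
# ``size`` extra zero rows, keeping the original entries at the front.
def expand_np(arr, size):
    out = np.zeros((arr.shape[0] + size,) + arr.shape[1:], dtype=arr.dtype)
    out[:arr.shape[0]] = arr
    return out

a = np.array([[1., 2.], [3., 4.]])
b = expand_np(a, 3)
assert b.shape == (5, 2)
assert (b[:2] == a).all() and (b[2:] == 0).all()
```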
def to_list(ls):
    """
    Converts ``ls`` to a list if it is a tuple, or wraps ``ls`` into a list
    if it is not a list already.
    """
    if isinstance(ls, (list, tuple)):
        return list(ls)
    else:
        return [ls]


class until(object):
    """
    A scan loop can end on a condition. In order to differentiate this
    condition from the other outputs of scan, this class is used to wrap
    the condition.
    """
    def __init__(self, condition):
        self.condition = tensor.as_tensor_variable(condition)
        assert self.condition.ndim == 0
def get_updates_and_outputs(ls):
    """
    Parses the value ``ls`` into outputs and updates. The semantics
    of ``ls`` is defined by the constructive function of scan.
    The elements of ``ls`` are either a list of expressions representing
    the outputs/states, a dictionary of updates or a condition.
    """
    def is_list_outputs(elem):
        if (isinstance(elem, (list, tuple)) and
                all([isinstance(x, theano.Variable) for x in elem])):
            return True
        if isinstance(elem, theano.Variable):
            return True
        return False

    def is_updates(elem):
        if isinstance(elem, dict):
            return True
        # Dictionaries can be given as lists of tuples
        if (isinstance(elem, (list, tuple)) and
                all([isinstance(x, (list, tuple)) and len(x) == 2
                     for x in elem])):
            return True
        return False

    def is_condition(elem):
        return isinstance(elem, until)

    if is_list_outputs(ls):
        return None, to_list(ls), {}
    if is_updates(ls):
        return None, [], dict(ls)
    if not isinstance(ls, (list, tuple)):
        raise ValueError(('Scan can not parse the return value'
                          ' of your constructive function given to scan'))
    ls = list(ls)
    deprecation_msg = ('The return value of the lambda function'
                       ' has been restricted. You have to always return'
                       ' first the outputs (if any), afterwards the updates'
                       ' (if any) and at the end the condition')
    error_msg = ('Scan can not parse the return value of your constructive '
                 'function given to scan')
    if len(ls) == 2:
        if is_list_outputs(ls[0]):
            if is_updates(ls[1]):
                return (None, to_list(ls[0]), dict(ls[1]))
            elif is_condition(ls[1]):
                return (ls[1].condition, to_list(ls[0]), {})
            else:
                raise ValueError(error_msg)
        elif is_updates(ls[0]):
            if is_list_outputs(ls[1]):
                raise ValueError(deprecation_msg)
            elif is_condition(ls[1]):
                return (ls[1].condition, [], dict(ls[0]))
            else:
                raise ValueError(error_msg)
        else:
            raise ValueError(error_msg)
    elif len(ls) == 3:
        if is_list_outputs(ls[0]):
            if is_updates(ls[1]):
                if is_condition(ls[2]):
                    return (ls[2].condition, to_list(ls[0]), dict(ls[1]))
                else:
                    raise ValueError(error_msg)
            else:
                raise ValueError(error_msg)
        else:
            raise ValueError(error_msg)
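The dispatch performed by ``get_updates_and_outputs`` can be sketched in plain Python: the inner function may return outputs, an updates dictionary, a wrapped condition, or an ordered combination of them. ``Until`` and ``parse`` below are illustrative stand-ins, not the Theano classes, and this sketch is deliberately more permissive about ordering than the real function:

```python
# Pure-Python sketch of the (outputs, updates, condition) dispatch.
class Until(object):
    def __init__(self, condition):
        self.condition = condition

def parse(rval):
    cond, outputs, updates = None, [], {}
    if not isinstance(rval, tuple):
        rval = (rval,)
    for elem in rval:
        if isinstance(elem, Until):
            cond = elem.condition      # stopping condition
        elif isinstance(elem, dict):
            updates = elem             # shared-variable updates
        else:
            outputs.append(elem)       # ordinary outputs/states
    return cond, outputs, updates

assert parse(('y', {'w': 'w+1'}, Until('done'))) == \
    ('done', ['y'], {'w': 'w+1'})
assert parse('y') == (None, ['y'], {})
```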
def clone(output, replace=None, strict=True, copy_inputs=True):
    """
    Function that allows replacing subgraphs of a computational
    graph. It returns a copy of the initial subgraph with the corresponding
    substitutions.

    :type output: Theano Variables (or Theano expressions)
    :param output: Theano expression that represents the computational
        graph
    :type replace: dict
    :param replace: dictionary describing which subgraphs should be
        replaced by what
    """
    inps, outs, other_stuff = rebuild_collect_shared(output,
                                                     [],
                                                     replace,
                                                     [],
                                                     strict,
                                                     copy_inputs)
    return outs
def canonical_arguments(sequences, outputs_info, non_sequences,
                        go_backwards, n_steps):
    """
    This re-writes the arguments obtained from scan into a more friendly
    form for the scan_op.

    Mainly it makes sure that arguments are given as lists of dictionaries,
    and that the different fields of a dictionary are set to default
    values if the user has not provided any.
    """
    states_info = to_list(outputs_info)
    parameters = [tensor.as_tensor_variable(x)
                  for x in to_list(non_sequences)]

    inputs = []
    if n_steps is not None:
        negative_n_steps = tensor.lt(tensor.as_tensor_variable(n_steps), 0)
    for input in to_list(sequences):
        if not isinstance(input, dict):
            nw_input = tensor.as_tensor_variable(input)
            if go_backwards:
                nw_input = nw_input[::-1]
            if n_steps is not None:
                nw_input = tensor.switch(negative_n_steps,
                                         nw_input[::-1],
                                         nw_input)
            inputs.append(tensor.as_tensor_variable(nw_input))
        elif input.get('taps', True) is None:
            nw_input = tensor.as_tensor_variable(input['input'])
            if go_backwards:
                nw_input = nw_input[::-1]
            if n_steps is not None:
                nw_input = tensor.switch(negative_n_steps,
                                         nw_input[::-1],
                                         nw_input)
            inputs.append(nw_input)
        elif input.get('taps', None):
            mintap = numpy.min(input['taps'])
            maxtap = numpy.max(input['taps'])
            orig_input = tensor.as_tensor_variable(input['input'])
            if go_backwards:
                orig_input = orig_input[::-1]
            if n_steps is not None:
                orig_input = tensor.switch(negative_n_steps,
                                           orig_input[::-1],
                                           orig_input)
            for k in input['taps']:
                # We cut the sequence such that seq[i] corresponds to
                # seq[i - k]
                if maxtap < 0:
                    offset = abs(maxtap)
                else:
                    offset = 0
                nw_input = orig_input
                if maxtap == mintap and maxtap != 0:
                    nw_input = nw_input[:abs(maxtap)]
                elif maxtap - k != 0:
                    nw_input = nw_input[offset + k - mintap: -(maxtap - k)]
                else:
                    nw_input = nw_input[offset + k - mintap:]
                inputs.append(nw_input)
        else:
            raise ValueError('Provided sequence makes no sense', str(input))

    # Since we've added all sequences now we need to level them up based on
    # n_steps or their different shapes
    if n_steps is None:
        if len(inputs) == 0:
            # No information about the number of steps
            raise ValueError('You need to provide either at least '
                             'one sequence over which scan should loop '
                             'or a number of steps for scan to loop. '
                             'Neither of the two had been provided !')
        T = inputs[0].shape[0]
        for input in inputs[1:]:
            T = tensor.minimum(T, input.shape[0])
    else:
        T = abs(tensor.as_tensor(n_steps))
    # Level up sequences
    inputs = [input[:T] for input in inputs]

    # wrap outputs info in a dictionary if they are not already in one
    for i, state in enumerate(states_info):
        if state is not None and not isinstance(state, dict):
            states_info[i] = dict(
                initial=tensor.as_tensor_variable(state), taps=[-1])
        elif isinstance(state, dict):
            if not state.get('initial', None) and state.get('taps', None):
                raise ValueError(('If you are using slices of an output '
                                  'you need to provide an initial state '
                                  'for it'), state)
            elif state.get('initial', None) and not state.get('taps', None):
                # initial state but taps not provided
                if 'taps' in state:
                    # explicitly provided a None for taps
                    _logger.warning(
                        ('Output %s (index %d) has an initial '
                         'state but taps is explicitly set to None '),
                        getattr(states_info[i]['initial'], 'name', 'None'),
                        i)
                states_info[i]['taps'] = [-1]
                states_info[i]['initial'] = \
                    tensor.as_tensor_variable(state['initial'])
            elif state.get('initial', None):
                states_info[i]['initial'] = \
                    tensor.as_tensor_variable(state['initial'])
        else:
            # if a None is provided as the output info we replace it
            # with an empty dict() to simplify handling
            states_info[i] = dict()
    return inputs, states_info, parameters, T
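The tap-slicing done above can be illustrated with NumPy: for each tap ``k``, the sequence is shifted so that index ``i`` of every view lines up with the loop step, giving the inner function ``seq[i + k]`` relative to the current position (a sketch of the alignment, not the exact Theano slicing code):

```python
import numpy as np

# NumPy illustration of tap alignment: for taps [-1, 0], step i of the
# loop must see seq[i - 1] and seq[i], so each tap k becomes the slice
# seq[k - mintap:], and all views are then leveled to a common length.
seq = np.arange(6)          # [0, 1, 2, 3, 4, 5]
taps = [-1, 0]
mintap = min(taps)          # -1
views = [seq[k - mintap:] for k in taps]
T = min(len(v) for v in views)
for i in range(T):
    assert views[0][i] == seq[i]       # the -1 tap: one step in the past
    assert views[1][i] == seq[i + 1]   # the 0 tap: the current element
```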
def infer_shape(outs, inputs, input_shapes):
    '''
    Compute the shape of the outputs given the shape of the inputs
    of a theano graph.

    We do it this way to avoid compiling the inner function just to get
    the shape. Changes to ShapeFeature could require changes in this
    function.
    '''
    # We use a ShapeFeature because it has all the necessary logic
    # inside. We don't use the full ShapeFeature interface, but we
    # let it initialize itself with an empty env, otherwise we will
    # need to do it manually
    for inp, inp_shp in izip(inputs, input_shapes):
        if inp_shp is not None and len(inp_shp) != inp.ndim:
            assert len(inp_shp) == inp.ndim

    shape_feature = tensor.opt.ShapeFeature()
    shape_feature.on_attach(theano.gof.Env([], []))

    # Initialize shape_of with the input shapes
    for inp, inp_shp in izip(inputs, input_shapes):
        shape_feature.set_shape(inp, inp_shp)

    def local_traverse(out):
        '''
        Go back in the graph, from out, adding computable shapes to
        shape_of.
        '''
        if out in shape_feature.shape_of:
            # Its shape is already known
            return
        elif out.owner is None:
            # This is an input of the graph
            shape_feature.init_r(out)
        else:
            # Recurse over inputs
            for inp in out.owner.inputs:
                if not inp in shape_feature.shape_of:
                    local_traverse(inp)
            # shape_feature.on_import does not actually use an env
            # It will call infer_shape and set_shape appropriately
            dummy_env = None
            shape_feature.on_import(dummy_env, out.owner)

    ret = []
    for o in outs:
        local_traverse(o)
        ret.append(shape_feature.shape_of[o])
    return ret
def allocate_memory(T, y_info, y):
    """
    Allocates memory for an output of scan.

    :param T: scalar
        Variable representing the number of steps scan will run
    :param y_info: dict
        Dictionary describing the output (more specifically describing
        shape information for the output)
    :param y: Tensor variable
        Expression describing the computation resulting in one entry of y.
        It can be used to infer the shape of y.
    """
    if 'shape' in y_info:
        return tensor.zeros([T, ] + list(y_info['shape']),
                            dtype=y.dtype)
    else:
        inputs = gof.graph.inputs([y])
        ins_shapes = []
        for inp in inputs:
            in_shape = [inp.shape[k] for k in xrange(inp.ndim)]
            ins_shapes.append(in_shape)
        shape = infer_shape([y], inputs, ins_shapes)[0]
        return tensor.zeros([T, ] + shape, dtype=y.dtype)
class ScanPermutation(gof.Op):
    def __init__(self, mintap=0, inplace=False):
        self.inplace = inplace
        self.mintap = mintap
        if inplace:
            self.destroy_map = {0: [0]}

    def __eq__(self, other):
        return (type(self) == type(other) and
                self.inplace == other.inplace)

    def __hash__(self):
        return hash(type(self)) ^ hash(self.inplace)

    def __str__(self):
        if self.inplace:
            return "scan_permutation{inplace}"
        else:
            return "scan_permutation"

    def make_node(self, membuffer, index):
        # index has to be a scalar
        assert index.ndim == 0
        # we need at least one dimension
        assert membuffer.ndim > 0
        return gof.Apply(self, [membuffer, index], [membuffer.type()])

    def perform(self, node, inputs, outputs):
        membuffer = inputs[0]
        index = inputs[1] + self.mintap
        out = outputs[0]
        if index % membuffer.shape[0] == 0:
            if self.inplace:
                out[0] = membuffer
            else:
                out[0] = membuffer.copy()
        else:
            pos = index % membuffer.shape[0]
            if outputs[0] is membuffer:
                membuffer = membuffer.copy()
            out[0][:membuffer.shape[0] - pos] = membuffer[pos:]
            out[0][membuffer.shape[0] - pos:] = membuffer[:pos]

    def R_op(self, inputs, eval_points):
        if eval_points[0] is None:
            return [None]
        return self.make_node(eval_points[0], inputs[1]).outputs

    def grad(self, inputs, grads):
        pos = (inputs[0].shape[0] -
               (inputs[1] % inputs[0].shape[0]))
        return self.make_node(grads[0], pos).outputs
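The rotation that ``ScanPermutation.perform`` applies to a circular memory buffer (row ``pos`` moves to the front, the wrapped-around rows move to the back) is equivalent to ``numpy.roll`` along axis 0:

```python
import numpy as np

# The two-slice rotation used by ScanPermutation.perform, checked
# against numpy.roll: row `pos` of the buffer becomes row 0.
buf = np.arange(10).reshape(5, 2)
pos = 2                                   # index % buffer length
out = np.empty_like(buf)
out[:buf.shape[0] - pos] = buf[pos:]      # tail of the buffer first
out[buf.shape[0] - pos:] = buf[:pos]      # wrapped-around head last
assert (out == np.roll(buf, -pos, axis=0)).all()
```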
theano/sandbox/scan_module/tests/test_scan.py
0 → 100644
import os
import shutil
from tempfile import mkdtemp
import time
import unittest

import cPickle
import numpy
from numpy.testing import dec

import theano
import theano.sandbox.rng_mrg
from theano import tensor
from theano.compile.pfunc import rebuild_collect_shared
from theano.gof.python25 import any
from theano.tests import unittest_tools as utt
from numpy.testing.noseclasses import KnownFailureTest

from test_utils import *
import theano.sandbox.scan_module as scan_module


class TestScan(unittest.TestCase):

    def setUp(self):
        utt.seed_rng()
    def new_run(self,
                inputs_info,
                states_info,
                parameters_info,
                n_outputs,
                n_shared_updates):
        """Generates a test for scan.

        :param inputs_info: list of lists of dictionaries
            Each list of dictionaries represents one input sequence. Each
            dictionary is one tap of that sequence. The dictionary has two
            keys. ``use`` is either True or False, and it indicates if this
            tap should be used in the inner graph or not. ``tap`` is the
            tap value.
        :param states_info: list of lists of dictionaries
            See param ``inputs_info``. ``states_info`` has the same
            semantics, just that it is for states and not for inputs.
        :param parameters_info: list of dictionaries
            Each dictionary is a different parameter. It has only one key,
            namely ``use``, which says if the parameter should be used
            internally or not.
        :param n_outputs: int
            Number of pure outputs for scan.
        :param n_shared_updates: int
            Number of shared variables with updates. They are all numeric.
        """
        rng = numpy.random.RandomState(utt.fetch_seed())
        n_ins = len(inputs_info)
        inputs = [tensor.matrix('u%d' % k) for k in xrange(n_ins)]
        scan_inputs = []
        for inp, info in zip(inputs, inputs_info):
            scan_inputs.append(dict(input=inp,
                                    taps=[x['tap'] for x in info]))

        n_states = len(states_info)
        scan_states = []
        states = []
        for k, info in enumerate(states_info):
            if len(info) == 1 and info[0]['tap'] == -1:
                state = tensor.vector('x%d' % k)
                states.append(state)
                scan_states.append(state)
            else:
                state = tensor.matrix('x%d' % k)
                states.append(state)
                scan_states.append(
                    dict(initial=state, taps=[x['tap'] for x in info]))

        n_parameters = len(parameters_info)
        parameters = [tensor.vector('p%d' % k)
                      for k in xrange(n_parameters)]

        original_shared_values = []
        shared_vars = []
        for k in xrange(n_shared_updates):
            data = rng.uniform(size=(4,)).astype(theano.config.floatX)
            original_shared_values.append(data)
            shared_vars.append(theano.shared(data, name='z%d' % k))
        def inner_function(*args):
            """
            Function that constructs the inner graph of scan.
            """
            arg_pos = 0
            to_add = None
            for in_info in inputs_info:
                for info in in_info:
                    arg = args[arg_pos]
                    arg_pos += 1
                    # Construct dummy graph around input
                    if info['use']:
                        if to_add is None:
                            to_add = arg * 2
                        else:
                            to_add = to_add + arg * 2
            states_out = [to_add] * n_states
            for dx, st_info in enumerate(states_info):
                for info in st_info:
                    arg = args[arg_pos]
                    arg_pos += 1
                    if info['use']:
                        states_out[dx] = states_out[dx] + arg * 3
            for info in parameters_info:
                arg = args[arg_pos]
                arg_pos += 1
                if info['use']:
                    if to_add is None:
                        to_add = arg * 4
                    else:
                        to_add = to_add + arg * 4
            shared_outs = [sh * 5 + to_add for sh in shared_vars]
            states_out = [x + to_add for x in states_out]
            pure_outs = [to_add ** 2 for x in xrange(n_outputs)]
            return states_out + pure_outs, dict(zip(shared_vars,
                                                    shared_outs))
        def execute_inner_graph(*args):
            """
            Function that computes numerically the values that scan should
            return.
            """
            # Check if you need to go back in time over the sequences (the
            # first argument is n_steps, the second is go_backwards)
            n_steps = args[0]
            if n_steps < 0 or args[1]:
                new_ins = [x[::-1] for x in args[2: 2 + n_ins]]
            else:
                new_ins = list(args[2: 2 + n_ins])
            n_steps = abs(n_steps)
            # Simplify the inputs by slicing them according to the taps
            nw_inputs = []
            for inp, info in zip(new_ins, inputs_info):
                taps = [x['tap'] for x in info]
                nw_inputs += [inp[abs(numpy.min(taps)) + k:] for k in taps]
            # Simplify the states by slicing them according to the taps.
            # Note that if the memory buffer for the inputs and outputs is
            # the same, by changing the outputs we also change the inputs
            nw_states_inputs = []
            nw_states_outs = []
            for st, info in zip(args[2 + n_ins: 2 + n_ins + n_states],
                                states_info):
                taps = [x['tap'] for x in info]
                membuf = numpy.zeros((n_steps + numpy.max(abs(taps)), 4))
                membuf[:numpy.max(abs(taps))] = st[:numpy.max(abs(taps))]
                nw_states_inputs += [membuf[numpy.max(abs(taps)) + k:]
                                     for k in taps]
                nw_states_outs.append(membuf[numpy.max(abs(taps)):])

            parameters_vals = args[2 + n_ins + n_states:]
            out_mem_buffers = [numpy.zeros((n_steps, 4))
                               for k in xrange(n_outputs)]
            shared_values = [x.copy() for x in original_shared_values]
            for step in xrange(n_steps):
                arg_pos = 0
                to_add = None
                for in_info in inputs_info:
                    for info in in_info:
                        arg = nw_inputs[arg_pos][step]
                        arg_pos += 1
                        # Construct dummy graph around input
                        if info['use']:
                            if to_add is None:
                                to_add = arg * 2
                            else:
                                to_add = to_add + arg * 2
                states_out = [to_add] * n_states
                arg_pos = 0
                for dx, st_info in enumerate(states_info):
                    nw_states_outs[dx][step] = to_add
                    for info in st_info:
                        arg = nw_states_inputs[arg_pos][step]
                        arg_pos += 1
                        if info['use']:
                            nw_states_outs[dx][step] += arg * 3
                for arg, info in zip(parameters_vals, parameters_info):
                    if info['use']:
                        if to_add is None:
                            to_add = arg * 4
                        else:
                            to_add = to_add + arg * 4
                shared_values = [sh * 5 + to_add for sh in shared_values]
                for state in nw_states_outs:
                    state[step] += to_add
                for out in out_mem_buffers:
                    out[step] = to_add ** 2
            return nw_states_outs + out_mem_buffers, shared_values
        for n_steps in [-1, 1, 5, -5, None]:
            for go_backwards in [True, False]:
                outputs, updates = scan_module.scan(
                    inner_function,
                    sequences=scan_inputs,
                    outputs_info=scan_states,
                    non_sequences=parameters,
                    n_steps=n_steps,
                    go_backwards=go_backwards,
                    truncate_gradient=-1)
                my_f = theano.function(inputs + states + parameters,
                                       outputs,
                                       updates=updates,
                                       allow_input_downcast=True)
                if n_steps is not None and abs(n_steps) == 1:
                    assert len([x for x in my_f.maker.env.toposort()
                                if isinstance(x.op,
                                              scan_module.scan_op.ScanOp)
                                ]) == 0
# Scenario 1 : Good fit shapes
inputs_values
=
[]
for
info
in
inputs_info
:
taps
=
[
x
[
'tap'
]
for
x
in
info
]
offset
=
abs
(
numpy
.
min
([
x
for
x
in
taps
if
x
<
0
]))
offset
+=
numpy
.
max
([
x
for
x
in
taps
if
x
>
0
])
data
=
rng
.
uniform
(
size
=
(
n_steps
+
offset
,
4
))
inputs_values
.
append
(
data
)
state_values
=
[]
for
info
in
states_info
:
taps
=
[
x
[
'tap'
]
for
x
in
info
]
offset
=
abs
(
numpy
.
min
(
taps
))
data
=
rng
.
uniform
(
size
=
(
offset
,
4
))
state_values
.
append
(
data
)
param_values
=
[
rng
.
uniform
(
size
=
(
4
,))
for
k
in
xrange
(
n_parameters
)]
for
var
,
val
in
zip
(
shared_vars
,
original_shared_values
):
var
.
set_value
(
val
)
theano_outs
=
my_f
(
*
(
inputs_values
+
state_values
+
param_values
))
args
=
([
n_steps
,
go_backwards
]
+
input_values
+
state_values
+
param_values
)
rvals
=
execute_inner_graph
(
*
args
)
numpy_outs
,
numpy_shared
=
rvals
assert
len
(
numpy_outs
)
==
len
(
theano_outs
)
assert
len
(
numpy_shared
)
==
len
(
shared_vars
)
for
th_out
,
num_out
in
zip
(
theano_outs
,
numpy_outs
):
assert
numpy
.
allclose
(
th_out
,
num_out
)
for
th_out
,
num_out
in
zip
(
shared_outs
,
numpy_shared
):
assert
numpy
.
allclose
(
th_out
.
get_value
(),
num_out
)
                # Scenario 2 : Loose fit (sequences longer than required)
                inputs_values = []
                for pos, info in enumerate(inputs_info):
                    taps = [x['tap'] for x in info]
                    offset = abs(numpy.min([x for x in taps if x < 0]))
                    offset += numpy.max([x for x in taps if x > 0])
                    data = rng.uniform(size=(n_steps + offset + pos + 1, 4))
                    inputs_values.append(data)
                state_values = []
                for pos, info in enumerate(states_info):
                    taps = [x['tap'] for x in info]
                    offset = abs(numpy.min(taps))
                    data = rng.uniform(size=(offset + pos + 1, 4))
                    state_values.append(data)
                param_values = [rng.uniform(size=(4,))
                                for k in xrange(n_parameters)]
                for var, val in zip(shared_vars, original_shared_values):
                    var.set_value(val)
                theano_outs = my_f(*(inputs_values + state_values +
                                     param_values))
                args = ([n_steps, go_backwards] +
                        inputs_values + state_values + param_values)
                rvals = execute_inner_graph(*args)
                numpy_outs, numpy_shared = rvals
                assert len(numpy_outs) == len(theano_outs)
                assert len(numpy_shared) == len(shared_vars)
                for th_out, num_out in zip(theano_outs, numpy_outs):
                    assert numpy.allclose(th_out, num_out)
                for th_out, num_out in zip(shared_outs, numpy_shared):
                    assert numpy.allclose(th_out.get_value(), num_out)
                # Scenario 3 : Less data than required
                inputs_values = []
                for pos, info in enumerate(inputs_info):
                    taps = [x['tap'] for x in info]
                    offset = abs(numpy.min([x for x in taps if x < 0]))
                    offset += numpy.max([x for x in taps if x > 0])
                    data = rng.uniform(size=(n_steps + offset - 1, 4))
                    inputs_values.append(data)
                state_values = []
                for pos, info in enumerate(states_info):
                    taps = [x['tap'] for x in info]
                    offset = abs(numpy.min(taps))
                    data = rng.uniform(size=(offset - 1, 4))
                    state_values.append(data)
                param_values = [rng.uniform(size=(4,))
                                for k in xrange(n_parameters)]
                for var, val in zip(shared_vars, original_shared_values):
                    var.set_value(val)
                self.assertRaises(Exception, my_f,
                                  *(inputs_values + state_values +
                                    param_values))
    def test000_generate_tests(self):
        rng = numpy.random.RandomState(utt.fetch_seed())
        all_inputs_info = [[]]
        possible_taps_use_pairs = [
            [dict(tap=0, use=True)],
            [dict(tap=0, use=False)],
            [dict(tap=-3, use=True), dict(tap=-1, use=True)],
            [dict(tap=-3, use=True), dict(tap=-1, use=False)],
            [dict(tap=-3, use=False), dict(tap=-1, use=False)],
            [dict(tap=-2, use=True), dict(tap=0, use=True)],
            [dict(tap=-2, use=False), dict(tap=0, use=True)],
            [dict(tap=-2, use=False), dict(tap=0, use=False)],
            [dict(tap=0, use=True), dict(tap=3, use=True)],
            [dict(tap=2, use=True), dict(tap=3, use=True)],
            [dict(tap=-2, use=True), dict(tap=3, use=True)]]
        for n_ins in [1, 2]:
            # Randomly pick up 4*n_ins combinations of arguments
            for k in xrange(4 * n_ins):
                inp = []
                for inp_nb in xrange(n_ins):
                    pos = rng.randint(len(possible_taps_use_pairs))
                    inp.append(possible_taps_use_pairs[pos])
                all_inputs_info.append(inp)

        all_states_info = [[]]
        possible_taps_use_pairs = [
            [dict(tap=-1, use=True)],
            [dict(tap=-1, use=False)],
            [dict(tap=-3, use=True)],
            [dict(tap=-3, use=False)],
            [dict(tap=-3, use=True), dict(tap=-1, use=True)],
            [dict(tap=-3, use=True), dict(tap=-1, use=False)],
            [dict(tap=-3, use=False), dict(tap=-1, use=False)],
            [dict(tap=-4, use=True), dict(tap=-2, use=True)],
            [dict(tap=-4, use=False), dict(tap=-2, use=True)]]
        for n_ins in [1, 2]:
            # Randomly pick up 4*n_ins combinations of arguments
            for k in xrange(4 * n_ins):
                state = []
                for state_nb in xrange(n_ins):
                    pos = rng.randint(len(possible_taps_use_pairs))
                    state.append(possible_taps_use_pairs[pos])
                all_states_info.append(state)

        all_parameters_info = [[],
                               [dict(use=False)],
                               [dict(use=True)],
                               [dict(use=True), dict(use=True)],
                               [dict(use=True), dict(use=False)]]
        for n_outputs in [0, 1, 2]:
            for n_shared_updates in [0, 1, 2]:
                for n_random_combinations in xrange(14):
                    pos_inp = rng.randint(len(all_inputs_info))
                    pos_st = rng.randint(len(all_states_info))
                    pos_param = rng.randint(len(all_parameters_info))
                    self.new_run(
                        inputs_info=all_inputs_info[pos_inp],
                        states_info=all_states_info[pos_st],
                        parameters_info=all_parameters_info[pos_param],
                        n_outputs=n_outputs,
                        n_shared_updates=n_shared_updates)
    def test001_generator_one_scalar_output(self):
        def f_pow2(x_tm1):
            return 2 * x_tm1

        for n_steps in [-1, 1, 5, -5]:
            state = theano.tensor.scalar('state')
            output, updates = scan_module.scan(f_pow2,
                                               [],
                                               state,
                                               [],
                                               n_steps=n_steps,
                                               truncate_gradient=-1,
                                               go_backwards=False)
            my_f = theano.function([state],
                                   output,
                                   updates=updates,
                                   allow_input_downcast=True)
            if abs(n_steps) == 1:
                assert len([x for x in my_f.maker.env.toposort()
                            if isinstance(x.op,
                                          scan_module.scan_op.ScanOp)]) == 0

            rng = numpy.random.RandomState(utt.fetch_seed())
            state = rng.uniform()
            numpy_values = numpy.array([state * (2 ** (k + 1))
                                        for k in xrange(abs(n_steps))])
            theano_values = my_f(state)
            assert numpy.allclose(numpy_values, theano_values)
    # simple rnn, one input, one state, weights for each; input/state
    # are vectors, weights are scalars
    def test002_one_sequence_one_output_and_weights(self):
        def f_rnn(u_t, x_tm1, W_in, W):
            return u_t * W_in + x_tm1 * W

        u = theano.tensor.vector('u')
        x0 = theano.tensor.scalar('x0')
        W_in = theano.tensor.scalar('win')
        W = theano.tensor.scalar('w')
        # cover the same step counts as the other tests, including None
        for n_steps in [-1, 1, 5, -5, None]:
            output, updates = scan_module.scan(f_rnn,
                                               u,
                                               x0,
                                               [W_in, W],
                                               n_steps=n_steps,
                                               truncate_gradient=-1,
                                               go_backwards=False)
            my_f = theano.function([u, x0, W_in, W],
                                   output,
                                   updates=updates,
                                   allow_input_downcast=True)
            if n_steps is not None and abs(n_steps) == 1:
                assert len([x for x in my_f.maker.env.toposort()
                            if isinstance(x.op,
                                          scan_module.scan_op.ScanOp)]) == 0

            # get random initial values
            rng = numpy.random.RandomState(utt.fetch_seed())
            v_u = rng.uniform(size=(8,), low=-5., high=5.)
            v_x0 = rng.uniform()
            v_W = rng.uniform()
            v_W_in = rng.uniform()
            # compute the output in numpy
            if n_steps is not None and n_steps < 0:
                _v_u = v_u[::-1]
            else:
                _v_u = v_u
            steps = 8
            if n_steps is not None:
                steps = abs(n_steps)
            v_out = numpy.zeros((8,))
            v_out[0] = _v_u[0] * v_W_in + v_x0 * v_W
            for step in xrange(1, steps):
                v_out[step] = _v_u[step] * v_W_in + v_out[step - 1] * v_W
            v_out = v_out[:steps]
            theano_values = my_f(v_u, v_x0, v_W_in, v_W)
            assert numpy.allclose(theano_values, v_out)
    def test003_multiple_inputs_multiple_outputs(self):
        pass

    def test004_collect_parameters_outer_graph(self):
        pass

    def test005_multiple_taps(self):
        pass

    def test006_updates(self):
        pass
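The recurrence exercised by `test002` above, x_t = W_in * u_t + W * x_{t-1}, can be unrolled in plain numpy; this is a standalone sketch mirroring the reference computation in the test (the concrete values here are made up for illustration):

```python
import numpy

rng = numpy.random.RandomState(0)
u = rng.uniform(size=(8,))       # input sequence
x0, W_in, W = 0.5, 2.0, 0.1      # initial state and scalar weights

x = x0
xs = []
for u_t in u:
    x = u_t * W_in + x * W       # x_t = W_in * u_t + W * x_{t-1}
    xs.append(x)

# the first two steps written out explicitly
assert numpy.allclose(xs[0], u[0] * W_in + x0 * W)
assert numpy.allclose(xs[1], u[1] * W_in + xs[0] * W)
```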
theano/sandbox/scan_module/tests/test_utils.py
0 → 100644
import cPickle
import numpy
import unittest

import theano
from theano.compile.pfunc import rebuild_collect_shared
import theano.sandbox.scan_module as scan_module

if theano.config.mode == 'FAST_COMPILE':
    mode_with_opt = theano.compile.mode.get_mode('FAST_RUN')
else:
    mode_with_opt = theano.compile.mode.get_default_mode()
mode_with_gpu = mode_with_opt.including('gpu', 'scan')


# TODO: this should replace the verify_grad in tensor/tensor_grad.py
class multiple_outputs_numeric_grad:
    """WRITEME"""
    type_eps = {'float64': 1e-7,
                'float32': 3e-3}

    def __init__(self, f, pt, ndarray_mask=None, eps=None):
        """Return the gradient of f at pt.

        This function computes the gradient by one-sided finite
        differences with a fixed step size (eps).

        It is assumed that f(...) will return a scalar.

        :param eps: the stepsize for the finite differencing. None means
            input dtype-dependent. See `type_eps`.
        """
        def prod(inputs):
            rval = 1
            for i in inputs:
                rval *= i
            return rval

        packed_pt = False
        if not isinstance(pt, (list, tuple)):
            pt = [pt]
            packed_pt = True
        # This mask tells us if we are dealing with an ndarray input or
        # something else (a random state?) with which we shouldn't really
        # mess up
        if not ndarray_mask:
            ndarray_mask = [True for x in pt]

        dtype_eps = multiple_outputs_numeric_grad.type_eps['float64']
        for i, p in enumerate(pt):
            if ndarray_mask[i]:
                pt[i] = numpy.array(p)
                _eps = multiple_outputs_numeric_grad.type_eps[
                    str(pt[i].dtype)]
                if _eps > dtype_eps:
                    dtype_eps = _eps

        self.ndarray_mask = ndarray_mask
        # Compute clean output:
        f_x = f(*pt)
        gx = []
        # now iterate over the elements of x and call f on those + delta x
        for i in xrange(len(pt)):
            if ndarray_mask[i]:
                # It is a ndarray that we can tweak
                if eps:
                    _eps = eps
                else:
                    _eps = dtype_eps
                if pt[i].ndim:
                    _g = []
                    # it has several dimensions:
                    for pos in xrange(prod(pt[i].shape)):
                        t = pt[i].copy()
                        t = t.flatten()
                        t[pos] += _eps
                        t = t.reshape(pt[i].shape)
                        f_eps = f(*(pt[:i] + [t] + pt[i + 1:]))
                        _g.append(numpy.asarray((f_eps - f_x) / _eps))
                    gx.append(numpy.asarray(_g).reshape(pt[i].shape))
                else:
                    t = numpy.array(pt[i] + _eps)
                    f_eps = f(*(pt[:i] + [t] + pt[i + 1:]))
                    gx.append(numpy.asarray((f_eps - f_x) / _eps))
        self.gx = gx

    @staticmethod
    def abs_rel_err(a, b, eps=1.0e-10):
        """Return a small number when a and b are close, relative to how
        big they are"""
        return abs(a - b) / (abs(a) + abs(b) + eps)

    def max_err(self, _g_pt):
        """Return the biggest relative error between g_pt and self.gx"""
        g_pt = []
        for i in xrange(len(_g_pt)):
            if self.ndarray_mask[i]:
                g_pt.append(_g_pt[i])
            elif isinstance(_g_pt[i], numpy.ndarray):
                assert numpy.all(_g_pt[i] == 0)
        if len(g_pt) != len(self.gx):
            raise ValueError('argument has wrong number of elements',
                             len(g_pt))
        errs = []
        for i, (a, b) in enumerate(zip(g_pt, self.gx)):
            if a.shape != b.shape:
                raise ValueError('argument element %i has wrong shape %s' %
                                 (i, str((a.shape, b.shape))))
            errs.append(numpy.max(
                multiple_outputs_numeric_grad.abs_rel_err(a, b)))
        if numpy.all(numpy.isfinite(errs)):
            return numpy.max(errs), numpy.argmax(errs)
        else:
            return numpy.inf, 0
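The one-sided finite-difference rule used by `multiple_outputs_numeric_grad`, g_i ≈ (f(x + eps·e_i) − f(x)) / eps, can be illustrated in isolation; `numeric_grad` below is a simplified stand-in written for this sketch, not part of the module:

```python
import numpy

def numeric_grad(f, x, eps=1e-7):
    # one-sided finite differences: g[i] = (f(x + eps*e_i) - f(x)) / eps
    x = numpy.asarray(x, dtype='float64')
    f_x = f(x)
    g = numpy.empty_like(x)
    for i in range(x.size):
        t = x.flatten()          # flatten() returns a copy, safe to modify
        t[i] += eps
        g.flat[i] = (f(t.reshape(x.shape)) - f_x) / eps
    return g

x = numpy.array([1.0, 2.0, 3.0])
g = numeric_grad(lambda v: (v ** 2).sum(), x)
assert numpy.allclose(g, 2 * x, atol=1e-4)   # d/dx sum(x^2) = 2x
```

The one-sided scheme trades accuracy for speed: it needs one extra function evaluation per coordinate instead of two for the centered scheme, at the cost of an O(eps) rather than O(eps^2) truncation error.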
def scan_project_sum(*args, **kwargs):
    rng = theano.tensor.shared_randomstreams.RandomStreams(123)
    scan_outputs, updates = theano.scan(*args, **kwargs)
    if type(scan_outputs) not in [list, tuple]:
        scan_outputs = [scan_outputs]
    # we should ignore the random-state updates so that
    # the uniform numbers are the same every evaluation and on every call
    rng.add_default_updates = False
    factors = [rng.uniform(size=s.shape, low=0.1, high=0.9)
               for s in scan_outputs]
    return (sum([(s * f).sum() for s, f in zip(scan_outputs, factors)]),
            updates)


def asarrayX(value):
    return theano._asarray(value, dtype=theano.config.floatX)
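`scan_project_sum` reduces a list of outputs to one scalar by weighting each output with fixed random factors, so that a scalar-valued gradient check still exercises every output entry. The projection itself is just the following, shown here in plain numpy (the array shapes are chosen arbitrarily for illustration):

```python
import numpy

rng = numpy.random.RandomState(123)
outputs = [numpy.ones((3,)), numpy.arange(4.0)]
# one fixed random factor array per output, reused across evaluations
factors = [rng.uniform(low=0.1, high=0.9, size=o.shape) for o in outputs]
projected = sum((o * f).sum() for o, f in zip(outputs, factors))
assert numpy.ndim(projected) == 0   # a single scalar
```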
def clone_optimized_graph(f):
    maker_ins = [x for x in f.maker.env.inputs
                 if not isinstance(x,
                                   theano.tensor.sharedvar.SharedVariable)]
    inps, outs, _ = rebuild_collect_shared(f.maker.env.outputs,
                                           maker_ins,
                                           copy_inputs_over=False)
    ins = [x for x in inps
           if not isinstance(x, theano.tensor.sharedvar.SharedVariable)]
    return (ins, outs)


def grab_scan_node(output):
    if output.owner is None:
        return None
    if output.owner.op.__class__.__name__ == 'Scan':
        return [output.owner]
    rval = []
    for i in output.owner.inputs:
        ri = grab_scan_node(i)
        if ri is not None:
            rval += ri
    # `rval is []` would compare identity and never be True, so test
    # for emptiness with equality instead
    if rval == []:
        return None
    else:
        return rval
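A Python subtlety relevant to the final test in `grab_scan_node`: `is` compares object identity, and a literal `[]` always creates a fresh object, so `some_list is []` can never be True. Emptiness should be checked with `==` or plain truthiness, as this small demonstration shows:

```python
rval = []
assert (rval is []) is False   # identity: a new [] is a different object
assert rval == []              # equality: contents match
assert not rval                # idiomatic emptiness test
```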
class TestScanUtils(unittest.TestCase):

    def test_cloning_no_replace_strict_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.vector('y')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace=None,
                                          strict=True,
                                          copy_inputs=True)
        f2_inp = theano.gof.graph.inputs([f2])
        assert z in f2_inp
        assert x in f2_inp
        assert y in f2_inp

    def test_cloning_no_replace_strict_not_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.vector('y')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace=None,
                                          strict=True,
                                          copy_inputs=False)
        f2_inp = theano.gof.graph.inputs([f2])
        assert z not in f2_inp
        assert x not in f2_inp
        assert y not in f2_inp

    def test_cloning_replace_strict_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.vector('y')
        y2 = theano.tensor.vector('y2')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace={y: y2},
                                          strict=True,
                                          copy_inputs=True)
        f2_inp = theano.gof.graph.inputs([f2])
        assert z in f2_inp
        assert x in f2_inp
        assert y2 in f2_inp

    def test_cloning_replace_not_strict_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.fvector('y')
        y2 = theano.tensor.dvector('y2')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace={y: y2},
                                          strict=False,
                                          copy_inputs=True)
        f2_inp = theano.gof.graph.inputs([f2])
        assert z in f2_inp
        assert x in f2_inp
        assert y2 in f2_inp

    def test_cloning_replace_strict_not_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.vector('y')
        y2 = theano.tensor.vector('y2')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace={y: y2},
                                          strict=True,
                                          copy_inputs=False)
        f2_inp = theano.gof.graph.inputs([f2])
        assert z not in f2_inp
        assert x not in f2_inp
        assert y2 not in f2_inp

    def test_cloning_replace_not_strict_not_copy_inputs(self):
        # This has nothing to do with scan, but it refers to the clone
        # function that scan uses internally and that pfunc uses now and
        # that users might want to use
        x = theano.tensor.vector('x')
        y = theano.tensor.fvector('y')
        y2 = theano.tensor.dvector('y2')
        z = theano.shared(0.25)

        f1 = z * (x + y) ** 2 + 5
        f2 = scan_module.scan_utils.clone(f1,
                                          replace={y: y2},
                                          strict=False,
                                          copy_inputs=False)
        f2_inp = theano.gof.graph.inputs([f2])
        assert z not in f2_inp
        assert x not in f2_inp
        assert y2 not in f2_inp
theano/tensor/basic.py
...
@@ -2866,7 +2866,8 @@ def extract_constant(x):
         x = get_constant_value(x)
     except Exception:
         pass
-    if isinstance(x, scal.ScalarVariable):
+    if (isinstance(x, scal.ScalarVariable) or
+        isinstance(x, scal.sharedvar.ScalarSharedVariable)):
         if x.owner and isinstance(x.owner.op, ScalarFromTensor):
             x = x.owner.inputs[0]
         else:
...
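The pair of `isinstance` checks introduced by this hunk can equivalently be written with a tuple of classes, which is the usual Python idiom; a self-contained illustration with stand-in classes (the names below only mimic the real ones):

```python
class ScalarVariable(object):
    pass

class ScalarSharedVariable(object):
    pass

for x in (ScalarVariable(), ScalarSharedVariable()):
    # isinstance accepts a tuple of types, equivalent to the chained `or`
    chained = (isinstance(x, ScalarVariable) or
               isinstance(x, ScalarSharedVariable))
    tupled = isinstance(x, (ScalarVariable, ScalarSharedVariable))
    assert chained == tupled == True
```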