testgroup / pytensor / Commits / 839eda94

Commit 839eda94, authored Jan 15, 2010 by Razvan Pascanu

    changes into the tutorials

Parent: 0fc7c8ca

Showing 6 changed files with 638 additions and 878 deletions
doc/tutorial/adding.txt       +4    -6
doc/tutorial/examples.txt     +3    -0
doc/tutorial/index.txt        +1    -1
doc/tutorial/numpy.txt        +25   -5
theano/sandbox/scan.py        +595  -420
theano/sandbox/test_scan.py   +10   -446
doc/tutorial/adding.txt
...
...
@@ -8,8 +8,8 @@ Baby steps - Adding two numbers together
 Adding two scalars
 ==================
-So, to get us started and get a feel of what we're working with, let's
-make a simple function: add two numbers together. Here is how you do
+So, to get us started with Theano and get a feel of what we're working with,
+let's make a simple function: add two numbers together. Here is how you do
 it:
 >>> x = T.dscalar('x')
...
...
@@ -26,7 +26,7 @@ array(28.4)
 Let's break this down into several steps. The first step is to define
-two symbols, or Variables, representing the quantities that you want
+two symbols representing the quantities that you want
 to add. Note that from now on, we will use the term :term:`Variable`
 to mean "symbol" (in other words, ``x``, ``y``, ``z`` are all Variable
 objects). The output of the function ``f`` is a ``numpy.ndarray``
...
...
@@ -36,7 +36,6 @@ If you are following along and typing into an interpreter, you may have
 noticed that there was a slight delay in executing the ``function``
 instruction. Behind the scenes, ``f`` was being compiled into C code.
-.. TODO: help
 -------------------------------------------
...
...
@@ -64,8 +63,7 @@ TensorType(float64, scalar)
 >>> x.type == T.dscalar
 True
-You can learn more about the structures in Theano in
-the :ref:`advtutorial` and in :ref:`graphstructures`.
+You can learn more about the structures in Theano in :ref:`graphstructures`.
 By calling ``T.dscalar`` with a string argument, you create a
 :term:`Variable` representing a floating-point scalar quantity with the
...
...
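Pieced together from the two hunks above, the snippet this tutorial builds reads roughly as follows (a sketch: the imports and the sample inputs are assumptions, chosen so the result matches the ``array(28.4)`` context shown in the second hunk):

>>> import theano.tensor as T
>>> from theano import function
>>> x = T.dscalar('x')        # a symbolic float64 scalar
>>> y = T.dscalar('y')
>>> z = x + y                 # a symbolic expression for the sum
>>> f = function([x, y], z)   # behind the scenes, compiled into C code
>>> f(16.3, 12.1)
array(28.4)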
doc/tutorial/examples.txt
浏览文件 @
839eda94
...
...
@@ -137,6 +137,9 @@ with respect to the second. In this way, Theano can be used for
 `automatic differentiation`_.
+.. note::
+    The second argument of ``T.grad`` can be a list, in which case it will ...
 The variable of ``T.grad`` has the same dimensions as the
 second argument. This is exactly like the first derivative if the
...
...
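As a sketch of what the added note says (the variable names and values here are illustrative assumptions, not part of the tutorial): when the second argument of ``T.grad`` is a list, it returns one gradient per listed variable, each with the same dimensions as that variable:

>>> import theano
>>> import theano.tensor as T
>>> x = T.dscalar('x')
>>> w = T.dscalar('w')
>>> y = w * x ** 2
>>> gx, gw = T.grad(y, [x, w])    # a list in, a list of gradients out
>>> f = theano.function([x, w], [gx, gw])
>>> f(3.0, 2.0)                   # dy/dx = 2*w*x, dy/dw = x**2
[array(12.0), array(9.0)]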
doc/tutorial/index.txt
...
...
@@ -10,7 +10,7 @@ Let's start an interactive session and import Theano.
 >>> from theano import *
 Many of the symbols you will need to use are in the ``tensor`` subpackage
-of theano. Let's import that subpackage under a handy name. I like
+of Theano. Let's import that subpackage under a handy name. I like
 ``T`` (and many tutorials use this convention).
 >>> import theano.tensor as T
...
...
doc/tutorial/numpy.txt
...
...
@@ -8,10 +8,9 @@ NumPy refresher
 Here are some quick guides to NumPy:
 * `Numpy quick guide for Matlab users <http://www.scipy.org/NumPy_for_Matlab_Users>`__
 * `More detailed table showing the NumPy equivalent of Matlab commands <http://www.scribd.com/doc/26685/Matlab-Python-and-R>`__
+* `Numpy User Guide <http://docs.scipy.org/doc/numpy/user/index.html>`__
 * `More detailed Numpy tutorial <http://www.scipy.org/Tentative_NumPy_Tutorial>`__
-.. TODO [DefineBroadcasting Broadcasting]
-.. Broadcastable - Implicitly assume that all previous entries are true.
 .. [TODO: More doc, e.g. see _test_tensor.py]
...
...
@@ -20,8 +19,10 @@ Matrix conventions for machine learning
 Rows are horizontal and columns are vertical.
-Every row is an example. Therefore, inputs[10,5] is a matrix of 10 examples with 5 dimensions per.
-So to make a NN out of it, multiply by a weight matrix of size (5, #hid).
+Every row is an example. Therefore, inputs[10,5] is a matrix of 10 examples
+where each example has dimension 5. If this were the input of a
+neural network then the weights from the input to the first hidden
+layer would represent a matrix of size (5, #hid).
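A quick NumPy illustration of this convention (``n_hid`` is an arbitrary illustrative value, not from the commit):

>>> import numpy
>>> inputs = numpy.random.rand(10, 5)   # 10 examples, 5 dimensions each
>>> n_hid = 7
>>> W = numpy.random.rand(5, n_hid)     # input-to-hidden weight matrix
>>> numpy.dot(inputs, W).shape          # one row of activations per example
(10, 7)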
If I have an array:
...
...
@@ -43,3 +44,22 @@ To access the entry in the 3rd row (row #2) and the 1st column (column #0):
To remember this, keep in mind that we read left-to-right, top-to-bottom,
so each thing that is contiguous is a row. That is, there are 3 rows
and 2 columns.
+Broadcasting
+============
+Numpy does :term:`broadcasting` of numpy arrays of different shapes during
+arithmetic operations. What this means in general is that the smaller
+array is *broadcasted* across the larger array so that they have
+compatible shapes. The example below shows an instance of
+*broadcasting*:
+
+>>> a = numpy.asarray([1.0, 2.0, 3.0])
+>>> b = 2.0
+>>> a * b
+array([2., 4., 6.])
+
+The smaller array ``b`` in this case is *broadcasted* to the same size
+as ``a`` during the multiplication. This trick is often useful in
+simplifying how expressions are written. More details about *broadcasting*
+can be found in the `numpy user guide <http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html>`__ .
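The added section broadcasts a scalar; the same rule applies between arrays of different rank, for instance a vector against a matrix (an illustrative sketch, not part of the commit):

>>> a = numpy.asarray([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])   # shape (2, 3)
>>> b = numpy.asarray([10.0, 20.0, 30.0])                   # shape (3,)
>>> a + b          # b is broadcasted across both rows of a
array([[ 11.,  22.,  33.],
       [ 14.,  25.,  36.]])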
theano/sandbox/scan.py
"""Provide Scan and related functions
Scanning a function over sequential input(s) producing sequential output(s).
Scanning a function over sequential input(s) producing sequential output(s).
Scanning is a general form of recurrence, which can be used for looping.
Scanning is a general form of recurrence, which can be used for looping.
The idea is that you 'scan' a function along some input sequence, producing an output at each
time-step that can be seen (but not modified) by the function at the next time-step.
(Technically, the function can see the previous K time-steps.)
The idea is that you 'scan' a function along some input sequence, producing
an output at each time-step that can be seen (but not modified) by the
function at the next time-step. (Technically, the function can see the
previous K time-steps.)
So for example, ``sum()`` could be computed by scanning the ``z+x_i`` function over a list,
given an initial state of ``z=0``.
So for example, ``sum()`` could be computed by scanning the ``z+x_i``
function over a list,
given an initial state of ``z=0``.
Special cases:
Special cases:
- A ``reduce()`` operation can be performed by returning only the last output of a scan.
- A ``reduce()`` operation can be performed by returning only the last
output of a scan.
- A ``map()`` operation can be performed by applying a function that
ignores each previous
output.
- A ``map()`` operation can be performed by applying a function that
ignores each previous
output.
Often a for loop can be expressed as a scan() operation, and scan is the closest that theano
comes to looping.
Often a for loop can be expressed as a scan() operation, and scan is the
closest that theano
comes to looping.
This module provides scanning functionality with the `Scan` Op.
This module provides scanning functionality with the `Scan` Op.
"""
__docformat__ = 'restructuredtext en'

import traceback
import numpy

import theano
import theano.compile
from theano.tensor import opt
from theano import gof
from theano.compile import optdb

'''
TODO : move out of sandbox !
'''
# Logging functions for sending warnings or info
import logging
_logger = logging.getLogger('theano.scan')

def warning(*msg):
    _logger.warning('WARNING theano.scan: ' + ' '.join(msg))

def info(*msg):
    _logger.info('INFO theano.scan: ' + ' '.join(msg))

# Hashing a list; the lists used by scan are lists of numbers, therefore a
# list can be hashed by hashing all the elements in the list
def hash_list(list):
    hash_value = 0
    for v in list:
        hash_value ^= v
    return hash_value

# Hashing a dictionary; the dictionaries used by scan have numbers as keys
# and either numbers or lists of numbers as values
def hash_dict(dictionary):
    hash_value = 0
    for k, v in dictionary.iteritems():
        # hash the key
        hash_value ^= k
        if type(v) in (list, tuple):
            hash_value ^= hash_list(v)
        else:
            hash_value ^= v
    return hash_value
def scan(fn, sequences, non_sequences, seed_values, inplace_map={},
         sequences_taps={}, outputs_taps={}, len=theano.tensor.zero(),
         force_gradient=False, truncate_gradient=-1, go_backwards=False,
         mode='FAST_RUN'):
    '''The function creates a more intuitive interface to the scan op.

    This function first creates a scan op object, and afterwards applies it
    to the input data. The scan operation iterates over X sequences producing
    Y outputs. The function that is applied recursively may consult several
    previous outputs from the past as well as past values and future values
    of the input. You can see it as having the inputs:

      X sequence inputs x_1, x_2, .. x_X
      W non-sequence inputs w_1, w_2, .. w_W
      Y seeds/initial values (u_1, u_2, .. u_Y) for the outputs

    Outputs:

      Y sequence outputs y_1, y_2, .. y_Y

    Each output y_j is computed one time step at a time according to the
    formula:

    .. code-block:: python

        (y_1[t], y_2[t], .. y_Y[t]) = f(
            x_1[t-K_1], .. x_1[t], x_1[t+1], .. x_1[t+L_1], # x_1 past and future values
            x_2[t-K_2], .. x_2[t], x_2[t+1], .. x_2[t+L_2], # x_2 past and future values
            ...                                             # ...
            y_1[t-1], y_1[t-2], .. y_1[t-T_1],              # past values of y_1
            y_2[t-1], y_2[t-2], .. y_2[t-T_2],              # past values of y_2
            ...
            w_1, w_2, .., w_W)                              # 'timeless' inputs

    :param fn: a lambda expression or a function that, given the list of
        symbolic inputs, returns the symbolic outputs list and the update
        list of the function that shall be applied recursively

    :param sequences: list of sequences over which the scan op should
        iterate; a sequence's length should also cover its past and future
        taps; for example if you use for a sequence the past tap -3 and the
        future tap +4, the total length should be n+7, where the first 3
        values of the sequence are those corresponding to -3 -2 -1 and the
        last 4 values correspond to n+1 n+2 n+3 and n+4

    :param non_sequences: list of inputs over which scan shouldn't iterate

    :param seed_values: seeds (initial values) of the outputs; if past taps
        are used the seeds should contain enough values to cover these past
        values; note that index 0 of a seed belongs to the largest past tap

    :param inplace_map: a dictionary telling which output should be
        computed in place of which input sequence; the input sequence has
        to be of the same shape as the output

    :param sequences_taps: a dictionary telling for each sequence what past
        and future taps it should use; past values should be negative,
        future taps positive; by default 0 (the current value) is added to
        this dictionary if nothing is provided

    :param outputs_taps: a dictionary telling for each output what past
        taps it should use (negative values); by default -1 is added to
        this dictionary if nothing is provided

    :param len: a value (or theano scalar) describing for how many steps
        the scan should iterate; 0 means that it should iterate over the
        entire length of the input sequence(s)

    :param force_gradient: a flag telling the scan op that the gradient can
        be computed even though inplace or updates are used - use this at
        your own risk

    :param truncate_gradient: tells for how many steps scan should go back
        in time on the backward pass of backpropagation through time

    :param go_backwards: a flag indicating if scan should iterate from the
        end of the sequence back to the beginning (if it is true) or from
        0 to the end

    :param mode: indicates the mode that should be used to compile the
        function that will be applied recursively
    '''
    # check if the inputs are just single variables instead of lists
    if not (type(sequences) in (list, tuple)):
        seqs = [sequences]
    else:
        seqs = sequences
    if not (type(seed_values) in (list, tuple)):
        seeds = [seed_values]
    else:
        seeds = seed_values
    if not (type(non_sequences) in (list, tuple)):
        non_seqs = [non_sequences]
    else:
        non_seqs = non_sequences

    # compute the number of sequences and the number of seeds
    n_seqs = len(seqs)

    # see if there are outputs that do not feed anything back to the
    # function applied recursively
    outs_tapkeys = outputs_taps.keys()
    outs_tapkeys.sort()
    for k in outs_tapkeys:
        if outputs_taps[k] == []:
            # add empty lists where you have outputs that do not have
            # past values
            seeds = seeds[:k] + [[]] + seeds[k:]
    n_seeds = len(seeds)

    # update sequences_taps[idx] to contain 0 if it is not defined
    for i in xrange(n_seqs):
        if not sequences_taps.has_key(i):
            sequences_taps.update({i: [0]})
        # if the input sequence is not actually used by the recursive
        # function
        elif sequences_taps[i] == []:
            sequences_taps.__delitem__(i)
        elif not (type(sequences_taps[i]) in (list, tuple)):
            sequences_taps[i] = [sequences_taps[i]]

    # update outputs_taps[idx] to contain -1 if it is not defined
    for i in xrange(n_seeds):
        if not outputs_taps.has_key(i):
            outputs_taps.update({i: [-1]})
        # if the output sequence is not actually used as input to the
        # recursive function
        elif outputs_taps[i] == []:
            outputs_taps.__delitem__(i)
        elif not (type(outputs_taps[i]) in (list, tuple)):
            outputs_taps[i] = [outputs_taps[i]]

    # create the theano inputs of the recursive function
    args = []
    for (i, seq) in enumerate(seqs):
        if sequences_taps.has_key(i):
            for k in xrange(len(sequences_taps[i])):
                args += [seq[0].type()]
    for (i, seed) in enumerate(seeds):
        if outputs_taps.has_key(i):
            for k in xrange(len(outputs_taps[i])):
                args += [seed[0].type()]
    args += non_seqs
    next_outs, updates = fn(*args)

    # create the Scan op object
    local_op = Scan((args, next_outs, updates), n_seqs, n_seeds,
                    inplace_map, sequences_taps, outputs_taps,
                    force_gradient, truncate_gradient, go_backwards, mode)
    # call the op on the input sequences, seeds and non-sequences
    return local_op(*([theano.tensor.as_tensor(len)]
                      + seqs + seeds + non_seqs))
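# Editorial sketch of how the interface above might be called (an
# assumption based only on the signature and docstring of scan(); this is
# sandbox code, so whether it runs as-is is not guaranteed):
#
#     import theano.tensor as T
#     u  = T.dvector('u')     # sequence to iterate over
#     x0 = T.dvector('x0')    # seed; index 0 belongs to the largest past tap
#
#     def step(u_t, x_tm1):
#         # one argument per sequence tap, then the past outputs, then the
#         # non-sequences; returns (outputs, updates) as unpacked by
#         # ``next_outs, updates = fn(*args)`` above
#         return [u_t * x_tm1], {}
#
#     y = scan(step, u, [], x0)    # hypothetical call: cumulative product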
''' The class implementing the scan op
The actual class. I would not recommend using it directly unless you really
know what you are doing.
'''
class Scan(theano.Op):
    def __init__(self, (inputs, outputs, updates), n_seqs, n_seeds,
                 inplace_map={}, seqs_taps={}, outs_taps={},
                 force_gradient=False, truncate_gradient=-1,
                 go_backwards=False, mode='FAST_RUN', inplace=False):
        '''
        :param inputs: list of symbolic inputs of the function that will
            be applied recursively

        :param outputs: list of symbolic outputs for the function applied
            recursively

        :param updates: list of updates for the function applied recursively

        :param n_seqs: number of sequences in the input over which it needs
            to iterate

        :param n_seeds: number of outputs (same as the number of seeds)

        :param inplace_map: dictionary describing which output should be
            computed inplace of which input

        :param seqs_taps: dictionary describing which past and future taps
            of the input sequences are used by the recursive function

        :param outs_taps: dictionary describing which past taps of the
            outputs the recursive function is using

        :param force_gradient: a flag indicating if the gradient is still
            computable even though inplace operations or updates are used

        :param truncate_gradient: if different from -1 it tells after how
            many steps in the backward pass of BPTT to stop

        :param mode: the mode used to compile the function applied
            recursively

        :param inplace: used by the optimizer that enables the inplace
            computation
        '''
        # check the inplace map
        for _out, _in in inplace_map.iteritems():
            if _out > n_seeds:
                raise ValueError(('Inplace map refers to an unexisting '
                                  'output %d') % _out)
            if _in > n_seqs:
                raise ValueError(('Inplace map refers to an unexisting '
                                  'input sequence %d') % _in)
            if (_in >= 0) and (min(seqs_taps[_in]) < 0):
                raise ValueError(('Input sequence %d uses past values that '
                                  'will be overwritten by the inplace '
                                  'operation') % _in)

        # check the sequences' past taps
        for k, v in seqs_taps.iteritems():
            if k > n_seqs:
                raise ValueError(('Sequences past taps dictionary refers '
                                  'to an unexisting sequence %d') % k)

        # check the outputs' past taps
        for k, v in outs_taps.iteritems():
            if k > n_seeds:
                raise ValueError(('Outputs past taps dictionary refers '
                                  'to an unexisting output %d') % k)
            if max(v) > -1:
                raise ValueError(('Can not require future value %d of '
                                  'output %d') % (max(v), k))

        self.destroy_map = {}
        if inplace:
            self.destroy_map = inplace_map
        self.seqs_taps = seqs_taps
        self.outs_taps = outs_taps
        self.n_seqs = n_seqs
        self.n_seeds = n_seeds
        self.n_args = n_seqs + n_seeds + 1
        self.inplace_map = inplace_map
        self.inplace = inplace
        self.inputs = inputs
        self.outputs = outputs
        self.updates = updates
        self.force_gradient = force_gradient
        self.truncate_gradient = truncate_gradient
        self.go_backwards = go_backwards

        # compile the function applied at each time step
        self.fn = theano.function(inputs, outputs, updates=updates,
                                  mode=mode)

        # build the gradient of the one-step function symbolically;
        # for all outputs compute gradients and then sum them up
        g_y = [outputs[0].type()]
        g_args = theano.tensor.grad(outputs[0], inputs, g_cost=g_y[-1])
        for y in outputs[1:]:
            g_y += [y.type()]
            g_args_y = theano.tensor.grad(y, inputs, g_cost=g_y[-1])
            for i in xrange(len(g_args)):
                g_args[i] += g_args_y[i]
        self.g_ins = g_y + inputs
        self.g_outs = g_args
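    # Editorial sketch (illustrative assumption): the triple expected by
    # __init__ is exactly what the scan() wrapper builds -- the symbolic
    # inputs, outputs and updates of one time step:
    #
    #     u_t   = theano.tensor.dscalar('u_t')    # current sequence value
    #     x_tm1 = theano.tensor.dscalar('x_tm1')  # previous output (tap -1)
    #     op = Scan(([u_t, x_tm1], [u_t + x_tm1], {}), n_seqs=1, n_seeds=1)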
    def make_node(self, *inputs):
        n_args = len(inputs)
        if n_args < self.n_args:
            err = ('There should be at least ' + str(self.n_args)
                   + ' arguments')
            raise ValueError(err)
        # create the list of output datatypes
        out_types = []
        for i in xrange(self.n_seqs + 1, self.n_seqs + self.n_seeds + 1):
            out_types += [theano.tensor.Tensor(dtype=inputs[i].dtype,
                          broadcastable=(False,)
                          + inputs[i].broadcastable[1:])()]
        return theano.Apply(self, inputs, out_types)
    def __eq__(self, other):
        rval = type(self) == type(other)
        if rval:
            rval = (self.inputs == other.inputs) and \
                   (self.outputs == other.outputs) and \
                   (self.updates == other.updates) and \
                   (self.g_ins == other.g_ins) and \
                   (self.g_outs == other.g_outs) and \
                   (self.seqs_taps == other.seqs_taps) and \
                   (self.outs_taps == other.outs_taps) and \
                   (self.inplace_map == other.inplace_map) and \
                   (self.n_seqs == other.n_seqs) and \
                   (self.inplace == other.inplace) and \
                   (self.go_backwards == other.go_backwards) and \
                   (self.truncate_gradient == other.truncate_gradient) and \
                   (self.force_gradient == other.force_gradient) and \
                   (self.n_seeds == other.n_seeds) and \
                   (self.n_args == other.n_args)
        return rval
    def __hash__(self):
        return hash(type(self)) ^ \
               hash(self.n_seqs) ^ \
               hash(self.n_seeds) ^ \
               hash(self.force_gradient) ^ \
               hash(self.inplace) ^ \
               hash(self.go_backwards) ^ \
               hash(self.truncate_gradient) ^ \
               hash(self.n_args) ^ \
               hash_list(self.outputs) ^ \
               hash_list(self.inputs) ^ \
               hash_list(self.g_ins) ^ \
               hash_list(self.g_outs) ^ \
               hash_dict(self.seqs_taps) ^ \
               hash_dict(self.outs_taps) ^ \
               hash_dict(self.inplace_map) ^ \
               hash_dict(self.updates)
    def perform(self, node, args, outs):
        # find the number of timesteps; note that a precondition is to
        # have at least one input to iterate over
        n_steps = 0
        if (self.n_seqs == 0) and (args[0] == 0):
            raise ValueError('Scan does not know over how many steps it '
                             'should iterate! No input sequence or number '
                             'of steps to iterate given !')
        if (args[0] != 0):
            n_steps = args[0]
        for i in xrange(self.n_seqs):
            if self.seqs_taps.has_key(i):
                # compute the actual length of the sequence (we need to see
                # what past taps this sequence has, and leave room for them)
                seq_len = args[i + 1].shape[0] + min(self.seqs_taps[i])
                if max(self.seqs_taps[i]) > 0:
                    # using future values, so need to end the sequence
                    # earlier
                    seq_len -= max(self.seqs_taps[i])
                if n_steps == 0:
                    # length of the sequence, leaving room for the taps
                    n_steps = seq_len
                if seq_len != n_steps:
                    warning(('Input sequence %d has a shorter length then '
                             'the expected number of steps %d')
                            % (i, n_steps))
                    n_steps = min(seq_len, n_steps)

        # check if we deal with an inplace operation
        inplace_map = self.inplace_map
        if not self.inplace:
            # if it was not optimized to work inplace
            inplace_map = {}

        # check the lengths of the seeds (initial states)
        for i in xrange(self.n_seqs + 1,
                        self.n_seqs + self.n_seeds + 1):
            if self.outs_taps.has_key(i - self.n_seqs - 1):
                req_size = abs(min(self.outs_taps[i - self.n_seqs - 1])) - 1
                if args[i].shape[0] < req_size:
                    warning(('Initial state for output %d has fewer values '
                             'then required by the maximal past value %d. '
                             'Scan will use 0s for missing values')
                            % (i - self.n_seqs - 1, req_size))

        self.n_steps = n_steps
        y = self.scan(self.fn, args[1:], self.n_seqs, self.n_seeds,
                      self.seqs_taps, self.outs_taps, n_steps,
                      self.go_backwards, inplace_map)

        # write to storage
        for i in xrange(self.n_seeds):
            outs[i][0] = y[i]
    def scan(self, fn, args, n_seqs, n_seeds, seqs_taps, outs_taps,
             n_steps, go_backwards, inplace_map):
        # allocate space for the outputs
        y = []
        for i in xrange(self.n_seeds):
            if inplace_map.has_key(i) and (inplace_map[i] >= 0):
                y += [args[inplace_map[i]]]
            else:
                y_shape = (n_steps,) + args[i + self.n_seqs].shape[1:]
                y += [numpy.empty(y_shape,
                                  dtype=args[i + self.n_seqs].dtype)]
        # iterate
        if go_backwards:
            the_range = xrange(n_steps - 1, -1, -1)
        else:
            the_range = xrange(n_steps)

        seqs_mins = {}
        for j in xrange(self.n_seqs):
            if seqs_taps.has_key(j):
                seqs_mins.update({j: min(seqs_taps[j])})
        outs_mins = {}
        seed_size = {}
        for j in xrange(self.n_seeds):
            if outs_taps.has_key(j):
                outs_mins.update({j: min(outs_taps[j])})
                seed_size.update({j: args[n_seqs + j].shape[0]})

        for i in the_range:
            fn_args = []
            # sequences over which scan iterates
            for j in xrange(self.n_seqs):
                if seqs_taps.has_key(j):
                    ls_taps = seqs_taps[j]
                    min_tap = seqs_mins[j]
                    for tap_value in ls_taps:
                        k = i - min_tap + tap_value
                        fn_args += [args[j][k]]
            # seeds or past values of the outputs
            for j in xrange(self.n_seeds):
                if outs_taps.has_key(j):
                    ls_taps = outs_taps[j]
                    min_tap = outs_mins[j]
                    seed_sz = seed_size[j]
                    for tap_value in ls_taps:
                        if i + tap_value < 0:
                            k = i + seed_sz + tap_value
                            if k < 0:
                                # past value not provided.. issue a warning
                                # and use 0s
                                fn_args += [numpy.zeros(
                                    args[n_seqs + j][0].shape)]
                                warning('Past value %d for output %d not '
                                        'given in seeds' % (j, tap_value))
                            else:
                                fn_args += [args[n_seqs + j][k]]
                        else:
                            fn_args += [y[j][i + tap_value]]
            # get the non-iterable parameters
            fn_args += list(args[(self.n_seqs + self.n_seeds):])
            # compute the output
            something = fn(*fn_args)
            # update the outputs
            for j in xrange(self.n_seeds):
                y[j][i] = something[j]
        return y
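    # Editorial note on the index arithmetic above: a sequence with taps
    # [-3, 0, 2] is stored with extra leading entries for its past values,
    # so at step i a tap t reads stored element k = i - min_tap + t. For
    # example:
    #
    #     i, min_tap = 0, -3
    #     [i - min_tap + t for t in (-3, 0, 2)]   # -> [0, 3, 5]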
    def grad(self, args, g_outs):
        if (not self.force_gradient) and \
                ((self.updates.keys() != []) or
                 (self.inplace_map.keys() != [])):
            warning('Can not compute gradients if inplace or updates '
                    'are used. Use force_gradient if you know for sure '
                    'that the gradient can be computed automatically.')
            return [None for i in args]
        else:
            # forward pass
            y = self(*args)
            if not (type(y) in (list, tuple)):
                y = [y]
            # backwards pass
            for i in xrange(len(y)):
                if g_outs[i] == None:
                    g_outs[i] = theano.tensor.zeros_like(y[i])
            g_args = [self.n_steps] + g_outs + y
            # check if go_backwards is true
            if self.go_backwards:
                for seq in args[1: self.n_seqs]:
                    g_args += [seq[::-1]]
            else:
                g_args += args[1: self.n_seqs]
            g_args += args[1 + self.n_seqs:]
            g_scan = ScanGrad((self.g_ins, self.g_outs), self.n_seqs,
                              self.n_seeds, self.seqs_taps, self.outs_taps,
                              self.truncate_gradient)
            return g_scan(*g_args)
@gof.local_optimizer([None])
def scan_make_inplace(node):
    op = node.op
    if isinstance(op, Scan) and (not op.inplace) \
            and (op.inplace_map.keys() != []):
        return Scan((op.inputs, op.outputs, op.updates), op.n_seqs,
                    op.n_seeds, op.inplace_map, op.seqs_taps,
                    op.outs_taps, op.force_gradient,
                    op.truncate_gradient, op.go_backwards,
                    inplace=True).make_node(*node.inputs).outputs
    return False

optdb.register('scan_make_inplace', opt.in2out(scan_make_inplace,
               ignore_newtrees=True), 75, 'fast_run', 'inplace')
...
...
@@ -428,144 +587,160 @@ optdb.register('scan_make_inplace', opt.in2out(scan_make_inplace,\
class ScanGrad(theano.Op):
    """Gradient Op for Scan"""
    def __init__(self, (g_ins, g_outs), n_seqs, n_outs,
                 seqs_taps={}, outs_taps={}, truncate_gradient=-1):
        self.grad_fn = theano.function(g_ins, g_outs)
        self.inputs = g_ins
        self.outputs = g_outs
        self.n_seqs = n_seqs
        self.truncate_gradient = truncate_gradient
        self.n_outs = n_outs
        self.seqs_taps = seqs_taps
        self.outs_taps = outs_taps
        self.destroy_map = {}
    def __eq__(self, other):
        rval = type(self) == type(other)
        if rval:
            rval = (self.inputs == other.inputs) and \
                   (self.outputs == other.outputs) and \
                   (self.n_seqs == other.n_seqs) and \
                   (self.n_outs == other.n_outs) and \
                   (self.truncate_gradient == other.truncate_gradient) and \
                   (self.seqs_taps == other.seqs_taps) and \
                   (self.outs_taps == other.outs_taps)
        return rval
    def __hash__(self):
        return hash(type(self)) ^ \
               hash(self.n_seqs) ^ \
               hash(self.n_outs) ^ \
               hash(self.truncate_gradient) ^ \
               hash_list(self.inputs) ^ \
               hash_list(self.outputs) ^ \
               hash_dict(self.seqs_taps) ^ \
               hash_dict(self.outs_taps)
    def make_node(self, *args):
        # inputs of the gradient op:
        # | n_steps | g_outs | y      | seqs   | outs   | non_seqs |
        # | 1       | n_outs | n_outs | n_seqs | n_outs | unknown  |
        # returns:
        # | grad of seqs | grad of outs | grad of non_seqs |
        # | n_seqs       | n_outs       | unknown          |
        return theano.Apply(self, list(args),
                            [i.type() for i in
                             args[1 + 2 * self.n_outs:]])
    def perform(self, node, args, storage):
        # get the scan inputs
        n_steps = args[0]
        inputs = args[2 * self.n_outs + 1:]
        seqs = inputs[:self.n_seqs]
        seeds = inputs[self.n_seqs: self.n_seqs + self.n_outs]
        non_seqs = inputs[self.n_seqs + self.n_outs:]

        # generate space for the gradients
        g_seqs = [numpy.zeros_like(k) for k in seqs]
        g_seeds = [numpy.zeros_like(k) for k in seeds]
        g_non_seqs = [numpy.zeros_like(k) for k in non_seqs]

        # get the gradient from above
        g_outs = args[1: self.n_outs + 1]
        # we modify g_outs inplace, so work on a copy
        g_outs = [gout.copy() for gout in g_outs]
        # get the outputs of the scan operation
        outs = args[self.n_outs + 1: 2 * self.n_outs + 1]

        # go back through time to 0 or to n_steps - truncate_gradient
        lower_limit = n_steps - self.truncate_gradient
        if lower_limit > n_steps - 1:
            the_range = xrange(n_steps - 1, -1, -1)
        elif lower_limit < -1:
            the_range = xrange(n_steps - 1, -1, -1)
        else:
            the_range = xrange(n_steps - 1, lower_limit, -1)

        seqs_mins = {}
        for j in xrange(self.n_seqs):
            if self.seqs_taps.has_key(j):
                seqs_mins.update({j: min(self.seqs_taps[j])})
        outs_mins = {}
        seed_size = {}
        for j in xrange(self.n_outs):
            if self.outs_taps.has_key(j):
                outs_mins.update({j: min(self.outs_taps[j])})
                seed_size.update({j: g_seeds[j].shape[0]})

        for i in the_range:
            # time slice of the inputs
            _ins = []
            for j in xrange(self.n_seqs):
                if self.seqs_taps.has_key(j):
                    ls_taps = self.seqs_taps[j]
                    min_tap = seqs_mins[j]
                    for tap_value in ls_taps:
                        k = i - min_tap + tap_value
                        _ins += [seqs[j][k]]
            # time slice of the outputs + taps
            _outs = []
            for j in xrange(self.n_outs):
                if self.outs_taps.has_key(j):
                    ls_taps = self.outs_taps[j]
                    min_tap = outs_mins[j]
                    seed_sz = seed_size[j]
                    for tap_value in ls_taps:
                        if i + tap_value < 0:
                            k = i + seed_sz + tap_value
                            if k < 0:
                                # past value not provided .. issue a
                                # warning and use 0s
                                _outs += [numpy.zeros(seeds[j][0].shape)]
                                warning('Past value %d for output %d not '
                                        'given' % (j, tap_value))
                            else:
                                _outs += [seeds[j][k]]
                        else:
                            _outs += [outs[j][i + tap_value]]
            g_out = [arg[i] for arg in g_outs]
            grad_args = g_out + _ins + _outs + non_seqs
            grads = self.grad_fn(*grad_args)

            # accumulate the gradients of the sequence inputs
            pos = 0
            for j in xrange(self.n_seqs):
                if self.seqs_taps.has_key(j):
                    ls_taps = self.seqs_taps[j]
                    min_tap = seqs_mins[j]
                    for tap_value in ls_taps:
                        k = i - min_tap + tap_value
                        g_seqs[j][k] += grads[pos]
                        pos += 1
            # accumulate the gradients of the outputs
            for j in xrange(self.n_outs):
                if self.outs_taps.has_key(j):
                    ls_taps = self.outs_taps[j]
                    min_tap = outs_mins[j]
                    seed_sz = seed_size[j]
                    for tap_value in ls_taps:
                        if i + tap_value < 0:
                            k = i + seed_sz + tap_value
                            if k > 0:
                                g_seeds[j][k] += grads[pos]
                            pos += 1
                        else:
                            g_outs[j][i + tap_value] += grads[pos]
                            pos += 1
            for j in xrange(len(g_non_seqs)):
                g_non_seqs[j] += grads[j + pos]

        # return the gradient
        for i, v in enumerate(g_seqs + g_seeds + g_non_seqs):
            storage[i][0] = v
@gof.local_optimizer([None])
def grad_scan_make_inplace(node):
    op = node.op
    if isinstance(op, ScanGrad) and (not op.inplace):
        return ScanGrad(op.grad_fn, op.n_ins, op.n_outs, op.taps,
                        inplace=True).make_node(*node.inputs).outputs
    return False

optdb.register('grad_scan_make_inplace',
               opt.in2out(grad_scan_make_inplace, ignore_newtrees=True),
               75, 'fast_run', 'inplace')
theano/sandbox/test_scan.py
...
...
@@ -7,8 +7,6 @@ import random
 import numpy.random
 from theano.tests import unittest_tools as utt

 def verify_grad(op, pt, n_tests=2, rng=None, eps=None, tol=None,
                 mode=None, cast_to_output_type=False):
     pt = [numpy.array(p) for p in pt]
...
...
@@ -75,455 +73,21 @@ def verify_grad(op, pt, n_tests=2, rng=None, eps = None, tol = None,
@@ -75,455 +73,21 @@ def verify_grad(op, pt, n_tests=2, rng=None, eps = None, tol = None,

Removed (the old test suite):

class T_Scan(unittest.TestCase):
    def setUp(self):
        utt.seed_rng()
        x_1 = theano.tensor.dscalar('x_1')
        self.my_f = theano.function([x_1], [x_1])  # dummy function

    # Naming convention :
    # u_1,u_2,..   -> inputs, arrays to iterate over
    # x_1,x_2,..   -> outputs at t-1 that are required in the recurrent
    #                 computation
    # iu_1,iu_2,.. -> inplace inputs, inputs that are being replaced by
    #                 outputs during computation
    # du_1,du_2,.. -> dummy inputs used to do inplace computation, they
    #                 are not passed to my_f
    # ix_1,ix_2,.. -> inplace outputs at t-1
    # x_1_next,..  -> outputs at t
    # ix_1_next,.. -> inplace outputs at time t
    # w_1,w_2,..   -> weights, parameters over which scan does not iterate
    # my_f         -> compiled function that will be applied recurrently
    # my_op        -> operator class
    # final_f      -> compiled function that applies the Scan operation
    # out_1,..     -> outputs of the Scan operation

    def test_numberOfIterableInputs(self):
        def t1():
            my_op = Scan.compiled(self.my_f, -1, 1)
        def t2():
            my_op = Scan.compiled(self.my_f, 0, 1)
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(ValueError, t2)

    def test_numberOfOutputs(self):
        def t1():
            my_op = Scan.compiled(self.my_f, 1, -1)
        def t2():
            my_op = Scan.compiled(self.my_f, 1, 0)
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(ValueError, t2)

    def test_numberOfInplaceOutputs(self):
        def t1():
            my_op = Scan.compiled(self.my_f, 1, 1, n_inplace=-1)
        def t2():
            my_op = Scan.compiled(self.my_f, 1, 1, n_inplace=2)
        def t3():
            my_op = Scan.compiled(self.my_f, 2, 1, n_inplace=2)
        def t4():
            my_op = Scan.compiled(self.my_f, 1, 2, n_inplace=2)
        def t5():
            my_op = Scan.compiled(self.my_f, 1, 1, n_inplace=1,
                                  n_inplace_ignore=2)
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(ValueError, t2)
        self.failUnlessRaises(ValueError, t3)
        self.failUnlessRaises(ValueError, t4)
        self.failUnlessRaises(ValueError, t5)

    def test_taps(self):
        def t1():
            my_op = Scan.compiled(self.my_f, 1, 1, taps={2: [3]})
        def t2():
            my_op = Scan.compiled(self.my_f, 1, 2, taps={0: [0]})
        def t3():
            my_op = Scan.compiled(self.my_f, 1, 2, taps={0: [1]})
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(ValueError, t2)
        self.failUnlessRaises(ValueError, t3)

    def test_makeNode(self):
        def t1():
            # Test inputs of different lengths
            u_1 = theano.tensor.dscalar('u_1')
            u_2 = theano.tensor.dscalar('u_2')
            x_1 = theano.tensor.dscalar('x_1')
            x_1_next = u_1 + u_2 * x_1
            my_f = theano.function([u_1, u_2, x_1], [x_1_next])
            my_op = Scan.compiled(my_f, 2, 1)
            u_1 = theano.tensor.dvector('u_1')
            u_2 = theano.tensor.dvector('u_2')
            x_1 = theano.tensor.dvector('x_1')
            x_1_next = my_op(u_1, u_2, x_1)
            final_f = theano.function([u_1, u_2, x_1], [x_1_next])
            # test the function final_f
            u_1 = numpy.random.rand(3)
            u_2 = numpy.random.rand(2)
            x_1 = [numpy.random.rand()]
            out = final_f(u_1, u_2, x_1)
        def t2():
            # Test function does not return correct number of outputs
            u_1 = theano.tensor.dscalar('u_1')
            x_1 = theano.tensor.dscalar('x_1')
            x_1_next = u_1 * x_1
            my_f = theano.function([u_1, x_1], [x_1_next])
            my_op = Scan.compiled(my_f, 1, 2)
            u_1 = theano.tensor.dvector('u_1')
            x_1 = theano.tensor.dvector('x_1')
            x_2 = theano.tensor.dvector('x_2')
            x_1_next, x_2_next = my_op(u_1, x_1, x_2)
            final_f = theano.function([u_1, x_1, x_2],
                                      [x_1_next, x_2_next])
            # generate data
            u_1 = numpy.random.rand(3)
            x_1 = [numpy.random.rand()]
            x_2 = [numpy.random.rand()]
            out_1, out_2 = final_f(u_1, x_1, x_2)
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(TypeError, t2)

    def test_generator(self):
        # compile my_f
        u_1 = theano.tensor.dscalar('u_1')  # dummy input, required if no
                                            # inplace is used!
        x_1 = theano.tensor.dscalar('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = x_1 * w_1
        my_f = theano.function([u_1, x_1, w_1], [x_1_next])
        # create the operation
        my_op = Scan.compiled(my_f, 1, 1)
        u_1 = theano.tensor.dvector('u_1')  # dummy input, there is no
            # inplace, so output will not be put in place of this u_1!
        x_1 = theano.tensor.dvector('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = my_op(u_1, x_1, w_1)
        final_f = theano.function([u_1, x_1, w_1], [x_1_next])
        # generate data
        x_1 = numpy.ndarray(3)  # dummy input, just tells for how many
                                # time steps to run recursively
        out_1 = final_f(x_1, [2], 2)
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))

    def test_generator_inplace_no_ignore(self):
        # compile my_f
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = x_1 * w_1
        my_f = theano.function([u_1, x_1, w_1], [x_1_next])
        # create the operation
        my_op = Scan.compiled(my_f, 1, 1, n_inplace=1)
        iu_1 = theano.tensor.dvector('iu_1')
        ix_1 = theano.tensor.dvector('ix_1')
        w_1 = theano.tensor.dscalar('w_1')
        ix_1_next = my_op(iu_1, ix_1, w_1)
        final_f = theano.function([theano.In(iu_1, mutable=True),
                                   ix_1, w_1],
                                  [ix_1_next], mode='FAST_RUN')
        # generate data
        iu_1 = numpy.ndarray(3)
        out_1 = final_f(iu_1, [2], 2)
        # not concretely implemented yet ..
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))
        self.failUnless(numpy.all(out_1 == iu_1))

    def test_generator_inplace_no_ignore_2states(self):
        # compile my_f
        u_1 = theano.tensor.dscalar('u_1')
        u_2 = theano.tensor.dscalar('u_2')
        x_1 = theano.tensor.dscalar('x_1')
        x_2 = theano.tensor.dscalar('x_2')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = x_1 * w_1
        x_2_next = x_2 * w_1
        my_f = theano.function([u_1, u_2, x_1, x_2, w_1],
                               [x_1_next, x_2_next])
        # create the operation
        my_op = Scan.compiled(my_f, 2, 2, n_inplace=2)
        iu_1 = theano.tensor.dvector('iu_1')
        iu_2 = theano.tensor.dvector('iu_2')
        ix_1 = theano.tensor.dvector('ix_1')
        ix_2 = theano.tensor.dvector('ix_2')
        w_1 = theano.tensor.dscalar('w_1')
        ix_1_next, ix_2_next = my_op(iu_1, iu_2, ix_1, ix_2, w_1)
        final_f = theano.function([theano.In(iu_1, mutable=True),
                                   theano.In(iu_2, mutable=True),
                                   ix_1, ix_2, w_1],
                                  [ix_1_next, ix_2_next], mode='FAST_RUN')
        # generate data
        iu_1 = numpy.ndarray(3)
        iu_2 = numpy.ndarray(3)
        out_1, out_2 = final_f(iu_1, iu_2, [2], [1], 2)
        # not concretely implemented yet ..
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))
        self.failUnless(numpy.all(out_1 == iu_1))
        self.failUnless(numpy.all(out_2 == numpy.asarray([2, 4, 8])))
        self.failUnless(numpy.all(out_2 == iu_2))

    def test_generator_inplace(self):
        # compile my_f
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_2 = theano.tensor.dscalar('x_2')
        x_1_next = u_1 + x_1
        x_2_next = x_1 * x_2
        my_f = theano.function([u_1, x_1, x_2], [x_1_next, x_2_next])
        # create the operation
        my_op = Scan.compiled(my_f, 2, 2, n_inplace=2, n_inplace_ignore=1)
        du_1 = theano.tensor.dvector('du_1')
        iu_1 = theano.tensor.dvector('iu_1')
        ix_1 = theano.tensor.dvector('ix_1')
        ix_2 = theano.tensor.dvector('ix_2')
        ix_1_next, ix_2_next = my_op(du_1, iu_1, ix_1, ix_2)
        final_f = theano.function([theano.In(du_1, mutable=True),
                                   theano.In(iu_1, mutable=True),
                                   ix_1, ix_2],
                                  [ix_1_next, ix_2_next], mode='FAST_RUN')
        # generate data
        du_1 = numpy.asarray([0., 0., 0.])
        iu_1 = numpy.asarray([1., 1., 1.])
        ix_1 = [1]
        ix_2 = [1]
        out_1, out_2 = final_f(du_1, iu_1, ix_1, ix_2)
        self.failUnless(numpy.all(out_1 == numpy.asarray([2, 3, 4])))
        self.failUnless(numpy.all(out_2 == numpy.asarray([1, 2, 6])))
        self.failUnless(numpy.all(out_1 == du_1))
        self.failUnless(numpy.all(out_2 == iu_1))

    def tets_iterateOnlyOverX(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_next = u_1 * x_1
        my_f = theano.function([u_1, x_1], [x_1_next])
        my_op = Scan.compiled(my_f, 1, 1)
        u_1 = theano.tensor.dvector('u_1')
        x_1 = theano.tensor.dvector('x_1')
        x_1_next = my_op(u_1, x_1)
        final_f = theano.function([x_1, u_1], [x_1_next])
        u_1 = numpy.asarray([2, 2, 2])
        out_1 = final_f(inp, 2)
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))

    def test_iterateOverSeveralInputs(self):
        u_1 = theano.tensor.dscalar('u_1')  # input 1
        u_2 = theano.tensor.dscalar('u_2')  # input 2
        x_1 = theano.tensor.dscalar('x_1')  # output
        x_1_next = (u_1 + u_2) * x_1
        my_f = theano.function([u_1, u_2, x_1], [x_1_next])
        my_op = Scan.compiled(my_f, 2, 1)
        u_1 = theano.tensor.dvector('u_1')
        u_2 = theano.tensor.dvector('u_2')
        x_1 = theano.tensor.dvector('x_1')
        x_1_next = my_op(u_1, u_2, x_1)
        final_f = theano.function([u_1, u_2, x_1], [x_1_next])
        u_1 = numpy.asarray([1, 1, 1])
        u_2 = numpy.asarray([1, 1, 1])
        x_1 = [2]
        out_1 = final_f(u_1, u_2, x_1)
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))

    def test_iterateOverSeveralInputsSeveralInplace(self):
        iu_1 = theano.tensor.dscalar('iu_1')
        u_1 = theano.tensor.dscalar('u_1')
        u_2 = theano.tensor.dscalar('u_2')
        u_3 = theano.tensor.dscalar('u_3')
        u_4 = theano.tensor.dscalar('u_4')
        ix_1 = theano.tensor.dscalar('ix_1')
        ix_2 = theano.tensor.dscalar('ix_2')
        x_1 = theano.tensor.dscalar('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        ix_1_next = u_3 + u_4
        ix_2_next = ix_1 + ix_2
        x_1_next = x_1 + u_3 + u_4 + ix_1 + ix_2
        my_f = theano.function([iu_1, u_1, u_2, u_3, u_4, ix_1, ix_2,
                                x_1, w_1],
                               [ix_1_next, ix_2_next, x_1_next])
        my_op = Scan.compiled(my_f, 6, 3, n_inplace=2, n_inplace_ignore=1)
        du_1 = theano.tensor.dvector('du_1')
        iu_1 = theano.tensor.dvector('iu_1')
        u_1 = theano.tensor.dvector('u_1')
        u_2 = theano.tensor.dvector('u_2')
        u_3 = theano.tensor.dvector('u_3')
        u_4 = theano.tensor.dvector('u_4')
        x_1 = theano.tensor.dvector('x_1')
        ix_1 = theano.tensor.dvector('ix_1')
        ix_2 = theano.tensor.dvector('ix_2')
        w_1 = theano.tensor.dscalar('w_1')
        [ix_1_next, ix_2_next, x_1_next] = \
            my_op(du_1, iu_1, u_1, u_2, u_3, u_4, x_1, ix_1, ix_2, w_1)
        final_f = theano.function([theano.In(du_1, mutable=True),
                                   theano.In(iu_1, mutable=True),
                                   u_1, u_2, u_3, u_4, ix_1, ix_2,
                                   x_1, w_1],
                                  [ix_1_next, ix_2_next, x_1_next],
                                  mode='FAST_RUN')
        # generate data
        du_1 = numpy.asarray([0., 0., 0.])
        iu_1 = numpy.asarray([0., 1., 2.])
        u_1 = numpy.asarray([1., 2., 3.])
        u_2 = numpy.asarray([1., 1., 1.])
        u_3 = numpy.asarray([2., 2., 2.])
        u_4 = numpy.asarray([3., 2., 1.])
        x_1 = [1.]
        ix_1 = [1.]
        ix_2 = [1.]
        w_1 = 2.
        out_1, out_2, out_3 = final_f(du_1, iu_1, u_1, u_2, u_3, u_4,
                                      ix_1, ix_2, x_1, w_1)
        self.failUnless(numpy.all(out_3 == numpy.asarray([8., 19., 33.])))
        self.failUnless(numpy.all(out_1 == numpy.asarray([5., 4., 3.])))
        self.failUnless(numpy.all(out_2 == numpy.asarray([2., 7., 11.])))
        self.failUnless(numpy.all(out_1 == du_1))
        self.failUnless(numpy.all(out_2 == iu_1))

    def test_computeInPlaceArguments(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = u_1 * w_1 + x_1
        my_f = theano.function([u_1, x_1,
                                theano.In(w_1, update=w_1 * 2)],
                               [x_1_next])
        my_op = Scan.compiled(my_f, 1, 1)
        u_1 = theano.tensor.dvector('u_1')
        x_1 = theano.tensor.dvector('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = my_op(u_1, x_1, w_1)
        final_f = theano.function([u_1, x_1, w_1], [x_1_next])
        u_1 = [1., 1., 1.]
        x_1 = [1.]
        w_1 = 1.
        out_1 = final_f(u_1, x_1, w_1)
        self.failUnless(numpy.all(out_1 == numpy.asarray([2, 4, 8])))

    def test_timeTaps(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_t2 = theano.tensor.dscalar('x_1_t2')
        x_1_t4 = theano.tensor.dscalar('x_1_t4')
        x_1_next = u_1 + x_1 + x_1_t2 + x_1_t4
        my_f = theano.function([u_1, x_1, x_1_t2, x_1_t4], [x_1_next])
        my_op = Scan.compiled(my_f, 1, 1, taps={0: [2, 4]})
        u_1 = theano.tensor.dvector('u_1')
        x_1 = theano.tensor.dvector('x_1')
        x_1_next = my_op(u_1, x_1)
        final_f = theano.function([u_1, x_1], [x_1_next])
        u_1 = [1., 1., 1., 1., 1.]
        x_1 = [1., 2., 3., 4.]
        out_1 = final_f(u_1, x_1)
        self.failUnless(numpy.all(out_1 ==
                                  numpy.asarray([9., 16., 29., 50., 89.])))

    def test_constructFunction(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_next = u_1 + x_1
        my_op = Scan.symbolic(([u_1, x_1], x_1_next), 1, 1)
        u_1 = theano.tensor.dvector('u_1')
        x_1 = theano.tensor.dvector('x_1')
        x_1_next = my_op(u_1, x_1)
        final_f = theano.function([u_1, x_1], [x_1_next])
        u_1 = [1., 1., 1.]
        x_1 = [1.]
        out_1 = final_f(u_1, x_1)
        self.failUnless(numpy.all(out_1 == numpy.asarray([2., 3., 4.])))

    def test_gradOneInputOneOutput(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_next = u_1 * x_1
        my_op = Scan.symbolic(([u_1, x_1], x_1_next), 1, 1)
        u_1 = [1., 2., 3.]
        x_1 = [1.]
        verify_grad(my_op, [u_1, x_1])

    def test_gradManyInputsManyOutputs(self):
        u_1 = theano.tensor.dscalar('u_1')
        u_2 = theano.tensor.dscalar('u_2')
        x_1 = theano.tensor.dscalar('x_1')
        x_2 = theano.tensor.dscalar('x_2')
        x_1_next = x_1 * u_1 + x_2
        x_2_next = x_2 * u_2 + x_1
        my_op = Scan.symbolic(([u_1, u_2, x_1, x_2],
                               [x_1_next, x_2_next]), 2, 2)
        u_1 = [1., .2, 3.]
        u_2 = [1.5, 1.25, .35]
        x_1 = [.5]
        x_2 = [.65]
        verify_grad(my_op, [u_1, u_2, x_1, x_2])

    def test_gradTimeTaps(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_t_2 = theano.tensor.dscalar('x_1_t_2')
        x_1_next = x_1_t_2 * x_1 * u_1
        my_op = Scan.symbolic(([u_1, x_1, x_1_t_2], [x_1_next]), 1, 1,
                              taps={0: [2]})
        u_1 = [1., 2., 3., 4.]
        x_1 = [2., 3.]
        verify_grad(my_op, [u_1, x_1])

    def test_gradManyInputsManyOutputsTimeTaps(self):
        u_1 = theano.tensor.dscalar('u_1')
        u_2 = theano.tensor.dscalar('u_2')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_2 = theano.tensor.dscalar('x_1_2')
        x_2 = theano.tensor.dscalar('x_2')
        x_2_2 = theano.tensor.dscalar('x_2_2')
        x_1_n = x_1 * x_2_2 + u_1 * x_1_2
        x_2_n = x_2 * x_1_2 + u_2 * x_2_2
        my_op = Scan.symbolic(([u_1, u_2, x_1, x_1_2, x_2, x_2_2],
                               [x_1_n, x_2_n]), 2, 2,
                              taps={0: [2], 1: [2]})
        u_1 = [1., 2., 3., 4.]
        u_2 = [3., 2., 4., 1.]
        x_1 = [0.1, 0.2]
        x_2 = [1.5, 3.5]
        verify_grad(my_op, [u_1, u_2, x_1, x_2])

Added (the new file content):

# Naming convention :
# u_1,u_2,.. -> sequences
# s_1,s_2,.. -> initial states
# w_1,w_2,.. -> non-sequences
###################################
class T_Scan(unittest.TestCase):
    def setUp(self):
        utt.seed_rng()

    def test_one(self):
        pass

if __name__ == '__main__':
    unittest.main()