Commit 839eda94 authored by Razvan Pascanu

changes into the tutorials

Parent 0fc7c8ca
......@@ -8,8 +8,8 @@ Baby steps - Adding two numbers together
Adding two scalars
==================
So, to get us started with Theano and get a feel of what we're working with,
let's make a simple function: add two numbers together. Here is how you do
it:
>>> x = T.dscalar('x')
......@@ -26,7 +26,7 @@ array(28.4)
Let's break this down into several steps. The first step is to define
two symbols representing the quantities that you want
to add. Note that from now on, we will use the term :term:`Variable`
to mean "symbol" (in other words, ``x``, ``y``, ``z`` are all Variable
objects). The output of the function ``f`` is a ``numpy.ndarray``
......@@ -36,7 +36,6 @@ If you are following along and typing into an interpreter, you may have
noticed that there was a slight delay in executing the ``function``
instruction. Behind the scenes, ``f`` was being compiled into C code.
.. TODO: help
-------------------------------------------
......@@ -64,8 +63,7 @@ TensorType(float64, scalar)
>>> x.type == T.dscalar
True
You can learn more about the structures in Theano in :ref:`graphstructures`.
By calling ``T.dscalar`` with a string argument, you create a
:term:`Variable` representing a floating-point scalar quantity with the
......
......@@ -137,6 +137,9 @@ with respect to the second. In this way, Theano can be used for
`automatic differentiation`_.
.. note::

    The output of ``T.grad`` has the same dimensions as its
    second argument. This is exactly like the first derivative if the
......
......@@ -10,7 +10,7 @@ Let's start an interactive session and import Theano.
>>> from theano import *
Many of the symbols you will need to use are in the ``tensor`` subpackage
of Theano. Let's import that subpackage under a handy name. I like
``T`` (and many tutorials use this convention).
>>> import theano.tensor as T
......
......@@ -8,10 +8,9 @@ NumPy refresher
Here are some quick guides to NumPy:
* `Numpy quick guide for Matlab users <http://www.scipy.org/NumPy_for_Matlab_Users>`__
* `More detailed table showing the NumPy equivalent of Matlab commands <http://www.scribd.com/doc/26685/Matlab-Python-and-R>`__
* `Numpy User Guide <http://docs.scipy.org/doc/numpy/user/index.html>`__
* `More detailed Numpy tutorial <http://www.scipy.org/Tentative_NumPy_Tutorial>`__
.. TODO [DefineBroadcasting Broadcasting]
.. Broadcastable - Implicitly assume that all previous entries are true.
.. [TODO: More doc, e.g. see _test_tensor.py]
......@@ -20,8 +19,10 @@ Matrix conventions for machine learning
Rows are horizontal and columns are vertical.
Every row is an example. Therefore, inputs[10,5] is a matrix of 10 examples
where each example has dimension 5. If this were the input of a
neural network, then the weights from the input to the first hidden
layer would form a matrix of size (5, #hid).
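As a concrete check of this convention (a sketch; the array values and variable names here are illustrative, not from the tutorial), multiplying a batch of examples by a weight matrix of shape (n_features, n_hid) yields one row of hidden activations per example:

```python
import numpy

n_examples, n_features, n_hid = 10, 5, 3

inputs = numpy.random.randn(n_examples, n_features)   # one example per row
weights = numpy.random.randn(n_features, n_hid)       # input-to-hidden weights

hidden = numpy.dot(inputs, weights)                   # one row of activations per example
print(hidden.shape)
```

The resulting shape is (10, 3): the example dimension is preserved and the feature dimension is mapped to the hidden dimension.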
If I have an array:
......@@ -43,3 +44,22 @@ To access the entry in the 3rd row (row #2) and the 1st column (column #0):
To remember this, keep in mind that we read left-to-right, top-to-bottom,
so each thing that is contiguous is a row. That is, there are 3 rows
and 2 columns.
Broadcasting
============
Numpy does :term:`broadcasting` of numpy arrays of different shapes during
arithmetic operations. What this means in general is that the smaller
array is *broadcast* across the larger array so that they have
compatible shapes. The example below shows an instance of
*broadcasting*:
>>> a = numpy.asarray([1.0, 2.0, 3.0])
>>> b = 2.0
>>> a * b
array([2., 4., 6.])
The smaller array ``b`` in this case is *broadcast* to the same size
as ``a`` during the multiplication. This trick is often useful for
simplifying how expressions are written. More details about *broadcasting*
can be found in the `numpy user guide <http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html>`__ .
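Broadcasting also applies between arrays of different rank, not just between an array and a scalar. A sketch (values chosen only for illustration):

```python
import numpy

a = numpy.array([[1.0], [2.0], [3.0]])   # shape (3, 1)
b = numpy.array([10.0, 20.0])            # shape (2,)

# b is broadcast across the rows of a, and a across the columns of b,
# so the result has shape (3, 2)
c = a + b
print(c)
```

Each size-1 dimension is stretched to match the other operand, which is why a (3, 1) array and a (2,) array combine into a (3, 2) result.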
"""Provide Scan and related functions
Scanning a function over sequential input(s) producing sequential output(s).
Scanning is a general form of recurrence, which can be used for looping.
The idea is that you 'scan' a function along some input sequence, producing
an output at each time-step that can be seen (but not modified) by the
function at the next time-step. (Technically, the function can see the
previous K time-steps.)
So for example, ``sum()`` could be computed by scanning the ``z+x_i``
function over a list, given an initial state of ``z=0``.
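In plain Python, the ``sum()``-by-scanning idea above can be sketched as follows (``scan_sum`` is a hypothetical helper for illustration, not part of this module):

```python
def scan_sum(xs, z=0):
    """Scan the function z + x_i over xs, recording every intermediate state."""
    outputs = []
    for x in xs:
        z = z + x           # the scanned function sees the previous output z
        outputs.append(z)
    return outputs          # the last element equals sum(xs)

print(scan_sum([1, 2, 3, 4]))
```

The scan produces the running sums; keeping only the last one recovers the ordinary `sum()`.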
Special cases:
- A ``reduce()`` operation can be performed by returning only the last
  output of a scan.
- A ``map()`` operation can be performed by applying a function that
  ignores each previous output.
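These two special cases can be sketched with one generic loop (``scan_list`` is an illustrative stand-in, not this module's Scan Op):

```python
def scan_list(fn, xs, z):
    """Apply fn along xs, feeding each output back in as the next state."""
    ys = []
    for x in xs:
        z = fn(z, x)
        ys.append(z)
    return ys

# reduce(): keep only the last output of the scan
total = scan_list(lambda z, x: z + x, [1, 2, 3], 0)[-1]

# map(): apply a function that ignores the previous output
squares = scan_list(lambda z, x: x * x, [1, 2, 3], None)
```

Both operations fall out of the same recurrence; only how the outputs are used differs.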
Often a for loop can be expressed as a scan() operation, and scan is the
closest that Theano comes to looping.
This module provides scanning functionality with the `Scan` Op.
"""
__docformat__ = 'restructuredtext en'
import traceback
import numpy
import theano
import theano.compile
from theano.tensor import opt
from theano import gof
from theano.compile import optdb
'''
TODO : move out of sandbox !
'''
class Scan(theano.Op):
"""Scan a function `fn` over several inputs producing several outputs
# Logging function for sending warning or info
import logging
_logger = logging.getLogger('theano.scan')
def warning(*msg):
_logger.warning('WARNING theano.scan: '+' '.join(msg))
def info(*msg):
_logger.info('INFO theano.scan: '+' '.join(msg))
# Hashing a list; the lists used by scan are lists of numbers, so a list
# can be hashed by XOR-ing all of its elements
def hash_list(values):
    hash_value = 0
    for v in values:
        hash_value ^= v
    return hash_value
# Hashing a dictionary; the dictionary used by scan has as keys numbers and
# as values either numbers or list of numbers
def hash_dict(dictionary):
hash_value = 0
for k, v in dictionary.iteritems():
# hash key
hash_value ^= k
if type(v) in (list,tuple):
hash_value ^= hash_list(v)
else:
hash_value ^= v
return hash_value
This Op implements a generalization of scan in which `fn` may consult several previous
outputs from the past, from positions (taps) relative to the current time. The number of
taps (T_j) to use for each output (y_j) must be provided when creating a Scan Op.
Apply Inputs:
def scan(fn, sequences, non_sequences, seed_values, inplace_map={},
sequences_taps={}, outputs_taps = {},
len = theano.tensor.zero(), force_gradient = False,
truncate_gradient = -1, go_backwards = False, mode = 'FAST_RUN'):
'''The function creates a more intuitive interface to the scan op.
X sequence inputs x_1, x_2, ... x_X
This function first creates a scan op object, and afterwards applies it
to the input data. The scan operation iterates over X sequences producing
Y outputs. The function that is applied recursively may consult several
previous outputs from the past as well as past values and future values
of the input. You can see it as having the inputs:
Y initial states (u_1, u_2, ... u_Y) for our outputs. Each must have appropriate length
(T_1, T_2, ..., T_Y).
X sequence inputs x_1, x_2, .. x_X
W other inputs w_1, w_2, ... w_W
Y seeds/initial values ( u_1, u_2, .. u_Y) for the outputs
Apply Outputs:
W non sequences inputs w_1, w_2, .. w_W
Y sequence outputs y_1, y_2, ... y_Y
Outputs :
Y sequence outputs y_1, y_2, .. y_Y
Each output y_j is computed one time-step at a time according to the
formula:
.. code-block:: python
(y_1[t], y_2[t], ... y_Y[t]) = fn(
    x_1[t-K_1], ... x_1[t], x_1[t+1], ... x_1[t+L_1],  # x_1 past and future values
    x_2[t-K_2], ... x_2[t], x_2[t+1], ... x_2[t+L_2],  # x_2 past and future values
    ...,                                               # ...
    y_1[t-1], y_1[t-2], ... y_1[t-T_1],                # T_1 past values of y_1
    y_2[t-1], y_2[t-2], ... y_2[t-T_2],                # T_2 past values of y_2
    ...,                                               # ...
    y_Y[t-1], y_Y[t-2], ... y_Y[t-T_Y],                # T_Y past values of y_Y
    w_1, w_2, ... w_W)                                 # W 'timeless' inputs
So `fn` must accept X + T_1 + T_2 + ... + T_Y + W arguments.
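As a concrete instance of the formula above with one output and T_1 = 2 past taps (a plain-Python sketch with a hypothetical helper, not the Op itself), consider the Fibonacci recurrence y[t] = y[t-1] + y[t-2]:

```python
def scan_with_taps(fn, seed, n_steps):
    """Scan fn for n_steps, where fn sees the previous two outputs."""
    y = list(seed)                  # the seed covers the largest past tap (here 2)
    for t in range(n_steps):
        y.append(fn(y[-1], y[-2]))  # fn(y[t-1], y[t-2])
    return y[len(seed):]            # drop the seed, keep the produced outputs

fib = scan_with_taps(lambda a, b: a + b, [0, 1], 6)
print(fib)
```

Here `fn` takes exactly T_1 = 2 arguments (the two past taps), matching the X + T_1 + ... + T_Y + W argument count described above with X = W = 0.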
There are two high-level methods (`symbolic`, `compiled`) for creating a Scan Op besides
the low-level `__init__` constructor. ***Why would you call them?***
:param fn: fn is a lambda expression or a function that given a list of
symbolic inputs returns the update list and symbolic outputs list of the
function that shall be applied recursively.
:param sequences: list of sequences over which the scan op should iterate;
    each sequence's length should also cover its past and future taps; for
    example, if a sequence uses the past tap -3 and the future tap +4, its
    total length should be n+7, where the first 3 values correspond to taps
    -3, -2, -1 and the last 4 values correspond to n+1, n+2, n+3 and n+4
:param non_sequences: list of inputs over which it shouldn't iterate
:param seed_values: seeds (initial values) of the outputs; if past taps
    are used, the seeds should contain enough values to cover those past
    values; note that index 0 of a seed corresponds to the largest past tap
:param inplace_map: a dictionary telling which output should be
computed in place of which input sequence ; input sequence has to be
of the same shape as the output
When applying a Scan Op to theano Variables, the order of arguments is very important! When
using the full flexibility of Scan there can be a lot of arguments, but it is essential to
put them in the following order:
:param sequences_taps: a dictionary telling for each sequence what past
    and future taps it should use; past taps should be negative, future
    taps positive; by default 0 (the current value) is added to this
    dictionary if nothing is provided
1. "Ignored inputs" (x_i with i < n_inplace_ignore) that will be overwritten by an inplace scan.
:param outputs_taps: a dictionary telling for each output what past
taps it should use (negative values); by default -1 is added to this
dictionary if nothing is provided
2. Inputs that will be overwritten by an inplace scan (x_i with i < n_inplace)
:param len: a value (or theano scalar) describing for how many steps
the scan should iterate; 0 means that it should iterate over the entire
length of the input sequence(s)
3. Remaining Inputs (x_i with i >= n_inplace)
:param force_gradient: a flag telling the scan op that the gradient can be
    computed even though inplace or updates are used - use this at your own
    risk
3. Output states (u_j) corresponding to the outputs that are computed inplace (j <
n_inplace)
:param truncate_gradient: tells for how many steps scan should go
    back in time on the backward pass of backpropagation through time
4. Remaining output states not given in 3 (u_j with j >= n_inplace)
:param go_backwards: a flag indicating if scan should iterate backwards
    from the end of the sequence to the beginning (if it is true) or from 0
    to the end
5. Other inputs (w_1, w_2, ... w_W)
:param mode: indicates the mode that should be used to compile the
function that will be applied recursively
'''
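The length bookkeeping for sequence taps described in the docstring can be checked with a small sketch (`padded_length` is a hypothetical helper, illustrative only):

```python
def padded_length(n_steps, taps):
    """Length a sequence must have to supply every past and future tap."""
    past = -min(min(taps), 0)      # e.g. tap -3 needs 3 leading values
    future = max(max(taps), 0)     # e.g. tap +4 needs 4 trailing values
    return n_steps + past + future

# a sequence used with taps -3 and +4 over n = 10 steps needs length n + 7
print(padded_length(10, [-3, 0, 4]))
```

This matches the n+7 example in the docstring: 3 extra leading values for the past taps and 4 extra trailing values for the future taps.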
Inplace Operation
=================
The Scan Op supports computing some (`n_inplace`) of the outputs y_j using the memory from
corresponding inputs x_j.
It is not possible to indicate precisely which outputs overwrite which inputs, but without
loss of generality we assume that each of the first `n_inplace` outputs (y_j) overwrites
the corresponding input (x_j).
# check if inputs are just single variables instead of lists
if not (type(sequences) in (list, tuple)):
    seqs = [sequences]
else:
    seqs = sequences
if not (type(seed_values) in (list, tuple)):
    seeds = [seed_values]
else:
    seeds = seed_values
if not (type(non_sequences) in (list, tuple)):
    non_seqs = [non_sequences]
else:
    non_seqs = non_sequences
# compute number of sequences and number of seeds
n_seqs = len(seqs)
# see if there are outputs that do not feed anything back to the function
# applied recursively
outs_tapkeys = outputs_taps.keys()
for k in sorted(outs_tapkeys):
    if outputs_taps[k] == []:
        # add empty lists where you have outputs that do not have past
        # values
        seeds = seeds[:k] + [[]] + seeds[k:]
n_seeds = len(seeds)
# update sequences_taps[idx] to contain 0 if it is not defined
for i in xrange(n_seqs):
if not sequences_taps.has_key(i):
sequences_taps.update({i:[0]})
# if input sequence is not actually used by the recursive function
elif sequences_taps[i] == []:
sequences_taps.__delitem__(i)
    elif not (type(sequences_taps[i]) in (list, tuple)):
        sequences_taps[i] = [sequences_taps[i]]
# update outputs_taps[idx] to contain -1 if it is not defined
for i in xrange(n_seeds):
if not outputs_taps.has_key(i):
outputs_taps.update({i:-1})
# if output sequence is not actually used as input to the recursive
# function
elif outputs_taps[i] == []:
outputs_taps.__delitem__(i)
    elif not (type(outputs_taps[i]) in (list, tuple)):
        outputs_taps[i] = [outputs_taps[i]]
# create theano inputs for the recursive function
args = []
for (i,seq) in enumerate(seqs):
if sequences_taps.has_key(i):
        for k in xrange(len(sequences_taps[i])):
args += [seq[0].type() ]
for (i,seed) in enumerate(seeds):
if outputs_taps.has_key(i):
        for k in xrange(len(outputs_taps[i])):
args += [seed[0].type() ]
args += non_seqs
next_outs, updates = fn(*args)
# Create the Scan op object
local_op = Scan( (args,next_outs, updates), n_seqs,n_seeds,inplace_map,
sequences_taps, outputs_taps, force_gradient, truncate_gradient,
go_backwards, mode)
# Call the object on the input sequences, seeds, and non sequences
return local_op( *( [theano.tensor.as_tensor(len)] \
+ seqs \
+ seeds \
+ non_seqs))
''' The class implementing the scan op
The actual class. I would not recommend using it directly unless you really
know what you are doing.
'''
class Scan(theano.Op):
def __init__(self,(inputs, outputs, updates),n_seqs, n_seeds,
inplace_map={}, seqs_taps={}, outs_taps={},
force_gradient = False, truncate_gradient = -1,
go_backwards = False, inplace=False):
'''
:param inputs: list of symbolic inputs of the function that will
be applied recursively
:param outputs: list of symbolic outputs for the function applied
recursively
Note that using inplace computations destroys information, and may make it
impossible to compute the gradient.
As long as the function 'fn' does not update any of the other
parameters (w_1,..) a gradient of this operation is supported.
***Who will care about this? Someone just using the Op? Someone writing an inplace
optimization?***
:param updates: list of updates for the function applied recursively
Ignored Inputs
==============
:param n_seqs: number of sequences in the input over which it needs
to iterate
**** Behaviour? Rationale? Use case?
:param n_seeds: number of outputs (same as the number of seeds)
"""
@classmethod
def symbolic(cls,(in_args,out_args), n_ins, n_outs,\
n_inplace=0, n_inplace_ignore=0, taps={},
mode = 'FAST_RUN'):
# if in_args is not a list assume it is just a variable and
# convert it to a list (if this is neither the case the code will
# raise an error somewhere else !)
if not( type(in_args) in (list,tuple)):
in_args = [in_args]
# if out_args is not a list assume it is just a variable and
# convert it to a list
if not (type(out_args) in (list,tuple)):
out_args = [out_args]
# Create fn
my_fn = theano.compile.sandbox.pfunc(in_args, out_args, mode = mode)
# Create gradient function
gy_next = [out_args[0].type()]
g_inputs = theano.tensor.grad(out_args[0],in_args,g_cost=gy_next[-1])
for y_next in out_args[1:] :
gy_next +=[y_next.type()]
g_ls = theano.tensor.grad(y_next,in_args,g_cost=gy_next[-1])
for i in xrange(len(in_args)):
g_inputs[i] += g_ls[i]
g_fn=theano.compile.sandbox.pfunc(gy_next+in_args,g_inputs,
mode=mode)
:param inplace_map: dictionary describing which output should be
    computed in place of which input
return cls(my_fn, g_fn, n_ins, n_outs,\
n_inplace,n_inplace_ignore, taps)
:param seqs_taps: dictionary describing which past and future taps
    of the input sequences are used by the recursive function
@classmethod
def compiled(cls,fn,n_ins, n_outs,\
n_inplace=0, n_inplace_ignore=0, taps={}):
"""Return a Scan instance that will scan the callable `fn` over `n_ins` inputs and
`n_outs` outputs.
:param outs_taps: dictionary describing which past taps of the
    outputs the recursive function is using
:param force_gradient: a flag indicating if the gradient is still
computable even though inplace operation or updates are used
"""
return cls(fn, None, n_ins, n_outs, \
n_inplace, n_inplace_ignore, taps= taps)
:param truncate_gradient: if different from -1 it tells after how
many steps in the backward pass of BPTT
'''
# check inplace map
for _out, _in in inplace_map.iteritems():
    if _out > n_seeds:
        raise ValueError(('Inplace map refers to a nonexistent '
            'output %d') % _out)
    if _in > n_seqs:
        raise ValueError(('Inplace map refers to a nonexistent '
            'input sequence %d') % _in)
    if (_in >= 0) and (min(seqs_taps[_in]) < 0):
        raise ValueError(('Input sequence %d uses past values that '
            'will be overwritten by the inplace operation') % _in)
def __init__(self,fn,grad_fn,n_ins,n_outs,
n_inplace=0, n_inplace_ignore=0,
taps={}, inplace=False):
"""Create an instance of the scan class
#check sequences past taps
for k, v in seqs_taps.iteritems():
    if k > n_seqs:
        raise ValueError(('Sequences past taps dictionary refers to '
            'a nonexistent sequence %d') % k)
To use Scan, first you need to create it specifying the number of inputs, outputs,
inplace outputs (see notes below), and inputs to be ignored, a dictionary describing
the time taps used, the function that will be applied recursively and optionally, the
gradient function (or a symbolic definition of the function and the op will compute the
gradient on its own). Secondly you just call the op with a list of parameters.
#check outputs past taps
for k, v in outs_taps.iteritems():
    if k > n_seeds:
        raise ValueError(('Outputs past taps dictionary refers to '
            'a nonexistent output %d') % k)
    if max(v) > -1:
        raise ValueError(('Can not require future value %d of output '
            '%d') % (max(v), k))
:param fn: compiled function that takes you from time step t-1 to t
:param grad_fn: gradient of the function applied recursively
:param n_ins: number of inputs; in the list of arguments
they start from 0 to 'n_ins'
:param n_outs: number of outputs; in the list of arguments you
need to give the initial state of each outputs, this will be from
'n_ins' to 'n_outs'; each initial state should be a matrix where
the first dimension is time and should be sufficiently large to
cover the time taps. The matrix for an initial state should be
ordered such that if you use k delays, index 0 of matrix stands for
the value at time -k, index 1 for value at time 1-k, index 2 for
value at time 2-k and index k-1 for value at time -1
:param n_inplace: indicates the number of outputs that should be
computed inplace; in the list of arguments there will be the first
'n_inplace' outputs in place of the first 'n_inplace' inputs
:param n_inplace_ignore: indicates the number of inputs that are
    given just to be replaced by the inplace computation and which
    should not be given as arguments to the function applied
    recursively
:param taps: a dictionary which for each output index gives
a list of what taps it uses; a tap is given as an int,
where x stands for output(t - x); note that a past trace of 1 makes
no sense, since you get that by default
:param inplace: is used by the optimizer that allows the inplace
computation
"""
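The seed ordering described in the `n_outs` parameter above (with k delays, index 0 holds the value at time -k and index k-1 the value at time -1) can be sketched as follows (`past_value` is an illustrative helper, not the Op's code):

```python
def past_value(seed, tap):
    """Read output(t - tap) from a seed matrix ordered oldest-first.

    With k delays, seed[0] is the value at time -k and seed[k-1] the
    value at time -1, so time -tap lives at index k - tap.
    """
    k = len(seed)
    return seed[k - tap]

seed = ['t-3', 't-2', 't-1']   # k = 3 delays, oldest value first
print(past_value(seed, 1))     # the most recent past value, time -1
```

Reading tap 1 returns the last entry of the seed and reading tap k returns the first, matching the ordering rule in the docstring.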
if n_ins < 1:
raise ValueError('Scan should iterate over at least one input')
if n_outs <1:
raise ValueError('Scan should have at least one output')
if (n_inplace > n_ins):
raise ValueError('Number of inplace outputs should be smaller than '
'the number of inputs.')
if (n_inplace < 0):
raise ValueError('Number of inplace outputs should be larger '
'or equal to 0')
if (n_inplace_ignore > n_inplace):
raise ValueError('Number of inputs to ignore should not be '\
'larger than number of inplace outputs')
if (n_inplace_ignore < 0):
raise ValueError('n_inplace_ignore should be non-negative')
self.destroy_map = {}
if inplace:
for i in xrange(n_inplace):
self.destroy_map.update( {i:[i]} )
for (k,v) in taps.iteritems():
if k < 0 or k > n_outs:
raise ValueError('Taps dictionary contains wrong key!')
for vi in v:
# why is it illegal to specify vi < 2?
# what is special about vi == 1?
#
# Would it be simpler to just leave v alone if it is non-empty (checking that
# all vi are >=1) and set v = [1] for all missing output keys?
if vi < 2:
raise ValueError('Taps dictionary contains wrong values!')
self.taps = taps
self.n_ins = n_ins
self.n_outs = n_outs
self.n_inplace = n_inplace
self.inplace = inplace
self.n_inplace_ignore = n_inplace_ignore
self.fn = fn
self.grad_fn = grad_fn
self.destroy_map = inplace_map
self.seqs_taps = seqs_taps
self.outs_taps = outs_taps
self.n_seqs = n_seqs
self.n_seeds = n_seeds
self.n_args = n_seqs+n_seeds+1
self.inplace_map = inplace_map
self.inplace = inplace
self.inputs = inputs
self.outputs = outputs
self.updates = updates
self.force_gradient = force_gradient
self.truncate_gradient = truncate_gradient
self.go_backwards = go_backwards
def make_node(self, *inputs):
"""Create a node for the Scan operation
self.fn = theano.function(inputs,outputs, \
updates = updates, mode = mode)
:param inputs: list of inputs for the operation; there should be
    at least 'self.n_ins'+'self.n_outs' arguments; the first 'self.n_inplace'
    are inputs that are replaced inplace, followed by other inputs up
    to 'self.n_ins'; the next 'self.n_outs' are outputs followed by other
    arguments that will be given to the function applied recursively
"""
g_y = [outputs[0].type()]
g_args = theano.tensor.grad(outputs[0],inputs, g_cost = g_y[-1])
# for all outputs compute gradients and then sum them up
for y in outputs[1:]:
g_y += [y.type()]
g_args_y = theano.tensor.grad(y,inputs, g_cost=g_y[-1])
for i in xrange(len(g_args)):
g_args[i] += g_args_y[i]
n_args = len(inputs)
min_n_args = self.n_ins+self.n_outs
if n_args < min_n_args:
err = 'There should be at least '+str(min_n_args)+' arguments'
raise ValueError(err)
# Create list of output datatypes
out_types = []
for i in xrange(self.n_ins,self.n_ins+self.n_outs):
out_types += [theano.tensor.Tensor(dtype=inputs[i].dtype,\
broadcastable=(False,)+inputs[i].broadcastable[1:])()]
return theano.Apply(self,inputs, out_types)
self.g_ins = g_y+inputs
self.g_outs = g_args
def make_node(self,*inputs):
n_args = len(inputs)
if n_args < self.n_args :
err = 'There should be at least '+str(self.n_args)+' arguments'
raise ValueError(err)
# Create list of output datatypes
out_types = []
for i in xrange(self.n_seqs+1, self.n_seqs+self.n_seeds+1):
out_types += [theano.tensor.Tensor(dtype=inputs[i].dtype,\
broadcastable=(False,)+inputs[i].broadcastable[1:])()]
return theano.Apply(self,inputs, out_types)
def __eq__(self,other):
rval = type(self) == type(other)
if rval:
rval = (self.fn is other.fn) and \
(self.grad_fn is other.grad_fn) and \
(self.n_ins == other.n_ins) and \
(self.n_outs == other.n_outs) and \
(self.n_inplace == other.n_inplace) and \
(self.n_inplace_ignore == other.n_inplace_ignore) and\
(self.inplace == other.inplace) and\
(self.taps == other.taps)
return rval
rval = type(self) == type(other)
if rval:
rval = (self.inputs == other.inputs) and \
(self.outputs == other.outputs) and \
(self.updates == other.updates) and \
(self.g_ins == other.g_ins) and \
(self.g_outs == other.g_outs) and \
(self.seqs_taps == other.seqs_taps) and \
(self.outs_taps == other.outs_taps) and \
(self.inplace_map == other.inplace_map) and \
(self.n_seqs == other.n_seqs) and\
(self.inplace == other.inplace) and\
(self.go_backwards == other.go_backwards) and\
(self.truncate_gradient == other.truncate_gradient) and\
(self.force_gradient == other.force_gradient) and\
(self.n_seeds == other.n_seeds) and\
(self.n_args == other.n_args)
return rval
def __hash__(self):
# hash the taps dictionary
taps_hash = 0
for k,v in self.taps.iteritems():
taps_hash ^= k
for vi in v :
taps_hash ^= vi
return hash(type(self)) ^ \
hash(self.fn) ^ \
hash(self.grad_fn) ^ \
hash(self.n_ins) ^ \
hash(self.n_outs) ^ \
hash(self.n_inplace) ^ \
hash(self.n_inplace_ignore) ^\
hash(self.inplace) ^\
taps_hash
return hash(type(self)) ^ \
hash(self.n_seqs) ^ \
hash(self.n_seeds) ^ \
hash(self.force_gradient) ^\
hash(self.inplace) ^\
hash(self.go_backwards) ^\
hash(self.truncate_gradient) ^\
hash(self.n_args) ^ \
hash_list(self.outputs) ^ \
hash_list(self.inputs) ^ \
hash_list(self.g_ins) ^ \
hash_list(self.g_outs) ^ \
hash_dict(self.seqs_taps) ^\
hash_dict(self.outs_taps) ^\
hash_dict(self.inplace_map) ^\
hash_dict(self.updates)
def grad(self, inputs, g_outs):
if self.grad_fn == None:
print 'Warning! no gradient for the recursive function was given'
return [None for i in inputs]
else:
y = self(*inputs)
if not( type(y) in (list,tuple)):
y = [y]
for i in xrange(len(y)):
if g_outs[i] == None:
g_outs[i] = theano.tensor.zeros_like(y[i])
# Construct my gradient class:
gradScan = ScanGrad(self.grad_fn,
self.n_ins- self.n_inplace_ignore, self.n_outs,
self.taps)
args = g_outs + y + \
inputs[self.n_inplace_ignore:]
grads = gradScan(*args)
rval = [None for i in inputs[:self.n_inplace_ignore]]+grads
return rval
def perform(self,node,args, outs):
# find the number of timesteps; a precondition is to have at least
# one input to iterate over
n_steps = 0
if (self.n_seqs == 0) and (args[0] == 0):
    raise ValueError('Scan does not know over how many steps it '
        'should iterate! No input sequence or number of steps to '
        'iterate given !')
# check if we deal with a inplace operation
n_inplace = self.n_inplace
n_inplace_ignore = self.n_inplace_ignore
if (args[0] != 0):
n_steps = args[0]
for i in xrange(self.n_seqs):
if self.seqs_taps.has_key(i):
# compute actual length of the sequence ( we need to see what
# past taps this sequence has, and leave room for them
seq_len = args[i+1].shape[0] + min(self.seqs_taps[i])
if max(self.seqs_taps[i]) > 0:
    # using future values, so need to end the sequence earlier
    seq_len -= max(self.seqs_taps[i])
if n_steps == 0 :
# length of the sequences, leaving room for the largest
n_steps = seq_len
if seq_len != n_steps :
warning(('Input sequence %d has a shorter length than the '
    'expected number of steps %d') % (i, n_steps))
n_steps = min(seq_len,n_steps)
# check if we deal with an inplace operation
inplace_map = self.inplace_map
if not self.inplace: #if it was not optimized to work inplace
n_inplace = 0
inplace_map = {}
# check lengths of inputs
for i in xrange(self.n_ins):
if args[i].shape[0] != n_steps:
raise ValueError('All inputs should have n_steps length!')
# check lengths of initial states
for i in xrange(self.n_ins, self.n_ins+self.n_outs):
req_size = 1
if self.taps.has_key(i- self.n_ins):
req_size = max(self.taps[i-self.n_ins])
if len(args[i].shape) == 0:
raise ValueError('Wrong initial state! ')
# check lengths of seeds
for i in xrange(self.n_seqs+1, \
self.n_seqs+self.n_seeds+1):
if self.outs_taps.has_key(i-self.n_seqs-1):
req_size = abs(min(self.outs_taps[i-self.n_seqs-1]))-1
if args[i].shape[0] < req_size:
raise ValueError('Wrong initial state! ')
# allocate space for the outputs
y = []
# inplace outputs
for i in xrange(n_inplace):
y += [args[i]]
# add outputs
for i in xrange(self.n_ins+n_inplace,self.n_ins+self.n_outs):
y_shape = (n_steps,)+args[i].shape[1:]
y += [numpy.empty(y_shape, dtype = args[i].dtype)]
# iterate
for i in xrange(n_steps):
fn_args = []
# get a time slice of inputs
for j in xrange(n_inplace_ignore, self.n_ins):
fn_args += [args[j][i]]
warning(('Initial state for output %d has fewer values than '
    'required by the maximal past value %d. Scan will use 0s '
    'for missing values') % (i-self.n_iterable-1, req_size))
# get past values of outputs (t-1 + taps)
for j in xrange(self.n_outs):
# get list of taps
ls_taps = [1]
if self.taps.has_key(j):
ls_taps += self.taps[j]
maxVal = max(ls_taps)
for tap_value in ls_taps:
if i - tap_value < 0:
fn_args += [args[j+self.n_ins][maxVal-tap_value+i]]
else:
fn_args += [y[j][i-tap_value]]
self.n_steps = n_steps
y = self.scan(self.fn, args[1:],self.n_seqs, self.n_seeds,
self.seqs_taps, self.outs_taps, n_steps, self.go_backwards,
inplace_map)
# get the none iterable parameters
fn_args += list(args[(self.n_ins+self.n_outs):])
# compute output
something = self.fn(*fn_args)
# update y and inplace outputs
for j in xrange(self.n_outs):
y[j][i] = something[j]
# write to storage
for i in xrange(self.n_seeds):
    outs[i][0] = y[i]
def scan(self, fn, args, n_seqs, n_seeds, seqs_taps, outs_taps, n_steps,
        go_backwards, inplace_map):
y = []
for i in xrange(self.n_seeds):
if inplace_map.has_key(i) and (inplace_map[i] >= 0):
y += [args[inplace_map[i]]]
else:
y_shape = (n_steps,)+args[i+self.n_seqs].shape[1:]
y += [numpy.empty(y_shape,
dtype=args[i+self.n_seqs].dtype)]
#iterate
if go_backwards:
the_range = xrange(n_steps-1,-1,-1)
else:
the_range = xrange(n_steps)
seqs_mins = {}
for j in xrange(self.n_seqs):
if seqs_taps.has_key(j):
seqs_mins.update({j: min(seqs_taps[j])})
outs_mins = {}
seed_size = {}
for j in xrange(self.n_seeds):
if outs_taps.has_key(j):
outs_mins.update({j: min(outs_taps[j])})
seed_size.update({j: args[n_seqs+j].shape[0]})
for i in the_range:
fn_args = []
# sequences over which scan iterates
for j in xrange(self.n_seqs):
if seqs_taps.has_key(j):
ls_taps = seqs_taps[j]
min_tap = seqs_mins[j]
for tap_value in ls_taps:
k = i - min_tap + tap_value
fn_args += [args[j][k]]
# seeds or past values of outputs
for j in xrange(self.n_seeds):
if outs_taps.has_key(j):
ls_taps = outs_taps[j]
min_tap = outs_mins[j]
seed_sz = seed_size[j]
for tap_value in ls_taps:
if i + tap_value < 0:
k = i + seed_sz + tap_value
if k < 0:
    # past value not provided; issue a warning and use 0s
    fn_args += [numpy.zeros(args[j][0].shape)]
    warning('Past value %d for output %d not given in seeds' %
        (tap_value, j))
else:
fn_args += [args[j][k]]
else:
fn_args += [y[j][i + tap_value]]
# get the non-iterable sequences
fn_args += list(args[(self.n_seqs+self.n_seeds):])
# compute output
something = fn(*fn_args)
#update outputs
for j in xrange(self.n_seeds):
y[j][i] = something[j]
return y
def grad(self, args, g_outs):
if (not self.force_gradient) and \
((self.updates.keys() != []) or (self.inplace_map.keys() != [])):
warning('Can not compute gradients if inplace or updates ' \
'are used. Use force_gradient if you know for sure '\
'that the gradient can be computed automatically.')
return [None for i in args]
else:
# forward pass
y = self(*args)
if not( type(y) in (list,tuple)):
y = [y]
# backwards pass
for i in xrange(len(y)):
if g_outs[i] == None:
g_outs[i] = theano.tensor.zeros_like(y[i])
g_args = [self.n_steps]+g_outs + y
# check if go_backwards is true
if self.go_backwards:
for seq in args[1:self.n_seqs]:
g_args += [seq[::-1]]
else:
g_args += args[1:self.n_seqs]
g_args += args[1+self.n_seqs: ]
g_scan = ScanGrad((self.g_ins,self.g_outs), self.n_seqs, \
self.n_seeds,self.seqs_taps, self.outs_taps,
self.truncate_gradient)
return g_scan(g_args)
@gof.local_optimizer([None])
def scan_make_inplace(node):
    op = node.op
    if isinstance(op, Scan) and (not op.inplace) \
       and (op.inplace_map.keys() != []):
        return Scan((op.inputs, op.outputs, op.updates), op.n_seqs,
                    op.n_seeds, op.inplace_map, op.seqs_taps, op.outs_taps,
                    op.force_gradient, op.truncate_gradient,
                    op.go_backwards, inplace=True
                    ).make_node(*node.inputs).outputs
    return False

optdb.register('scan_make_inplace', opt.in2out(scan_make_inplace,
               ignore_newtrees=True), 75, 'fast_run', 'inplace')
class ScanGrad(theano.Op):
    """Gradient Op for Scan"""
    def __init__(self, (g_ins, g_outs), n_seqs, n_outs,
                 seqs_taps={}, outs_taps={}, truncate_gradient=-1):
        self.grad_fn = theano.function(g_ins, g_outs)
        self.inputs = g_ins
        self.outputs = g_outs
        self.n_seqs = n_seqs
        self.truncate_gradient = truncate_gradient
        self.n_outs = n_outs
        self.seqs_taps = seqs_taps
        self.outs_taps = outs_taps
        self.destroy_map = {}

    def __eq__(self, other):
        rval = type(self) == type(other)
        if rval:
            rval = (self.inputs == other.inputs) and \
                   (self.outputs == other.outputs) and \
                   (self.n_seqs == other.n_seqs) and \
                   (self.n_outs == other.n_outs) and \
                   (self.truncate_gradient == other.truncate_gradient) and \
                   (self.seqs_taps == other.seqs_taps) and \
                   (self.outs_taps == other.outs_taps)
        return rval

    def __hash__(self):
        return hash(type(self)) ^ \
               hash(self.n_seqs) ^ \
               hash(self.n_outs) ^ \
               hash(self.truncate_gradient) ^ \
               hash_list(self.inputs) ^ \
               hash_list(self.outputs) ^ \
               hash_dict(self.seqs_taps) ^ \
               hash_dict(self.outs_taps)

    def make_node(self, *args):
        # input of the gradient op :
        # | n_steps | g_outs | y      | seqs   | seeds  | non_seqs |
        # | 1       | n_outs | n_outs | n_seqs | n_outs | unknown  |
        # return
        # | grad of seqs | grad of seeds | grad of non_seqs |
        # | n_seqs       | n_outs        | unknown          |
        return theano.Apply(self, list(args),
                            [i.type() for i in args[1 + 2 * self.n_outs:]])
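The slicing that `perform` applies to this argument layout is easy to get wrong, so here is a quick standalone check with made-up sizes (`n_outs = 2`, `n_seqs = 1`); the string labels stand in for arrays and are purely illustrative:

```python
# Check of the argument-layout slicing used by ScanGrad, on labels.
n_outs, n_seqs = 2, 1
args = (['n_steps'] +
        ['g_out_%d' % i for i in range(n_outs)] +
        ['y_%d' % i for i in range(n_outs)] +
        ['seq_%d' % i for i in range(n_seqs)] +
        ['seed_%d' % i for i in range(n_outs)] +
        ['w_0'])
inputs = args[1 + 2 * n_outs:]          # skip n_steps, g_outs and y
seqs = inputs[:n_seqs]
seeds = inputs[n_seqs:n_seqs + n_outs]
non_seqs = inputs[n_seqs + n_outs:]
print(seqs, seeds, non_seqs)
```

Each label ends up in the slice the `make_node` comment promises.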
    def perform(self, node, args, storage):
        # get scan inputs
        n_steps = args[0]
        inputs = args[2 * self.n_outs + 1:]
        seqs = inputs[:self.n_seqs]
        seeds = inputs[self.n_seqs:self.n_seqs + self.n_outs]
        non_seqs = inputs[self.n_outs + self.n_seqs:]
        # generate space for gradient
        g_seqs = [numpy.zeros_like(k) for k in seqs]
        g_seeds = [numpy.zeros_like(k) for k in seeds]
        g_non_seqs = [numpy.zeros_like(k) for k in non_seqs]
        # get gradient from above; it is accumulated into, so work on a copy
        g_outs = [g.copy() for g in args[1:self.n_outs + 1]]
        # get the output of the scan operation
        outs = args[self.n_outs + 1:2 * self.n_outs + 1]
        # go back through time to 0 or n_steps - truncate_gradient
        lower_limit = n_steps - self.truncate_gradient
        if lower_limit > n_steps - 1:
            the_range = xrange(n_steps - 1, -1, -1)
        elif lower_limit < -1:
            the_range = xrange(n_steps - 1, -1, -1)
        else:
            the_range = xrange(n_steps - 1, lower_limit, -1)
        seqs_mins = {}
        for j in xrange(self.n_seqs):
            if self.seqs_taps.has_key(j):
                seqs_mins.update({j: min(self.seqs_taps[j])})
        outs_mins = {}
        seed_size = {}
        for j in xrange(self.n_outs):
            if self.outs_taps.has_key(j):
                outs_mins.update({j: min(self.outs_taps[j])})
                seed_size.update({j: g_seeds[j].shape[0]})
        for i in the_range:
            # time slice of inputs
            _ins = []
            for j in xrange(self.n_seqs):
                if self.seqs_taps.has_key(j):
                    ls_taps = self.seqs_taps[j]
                    min_tap = seqs_mins[j]
                    for tap_value in ls_taps:
                        k = i - min_tap + tap_value
                        _ins += [seqs[j][k]]
            # time slice of outputs + taps
            _outs = []
            for j in xrange(self.n_outs):
                if self.outs_taps.has_key(j):
                    ls_taps = self.outs_taps[j]
                    min_tap = outs_mins[j]
                    seed_sz = seed_size[j]
                    for tap_value in ls_taps:
                        if i + tap_value < 0:
                            k = i + seed_sz + tap_value
                            if k < 0:
                                # past value not provided; issue a warning and use 0
                                _outs += [numpy.zeros(seeds[j][0].shape)]
                                warning('Past value %d for output %d not given'
                                        % (tap_value, j))
                            else:
                                _outs += [seeds[j][k]]
                        else:
                            _outs += [outs[j][i + tap_value]]
            g_out = [arg[i] for arg in g_outs]
            grad_args = g_out + _ins + _outs + non_seqs
            grads = self.grad_fn(*grad_args)
            # get gradient for sequences
            pos = 0
            for j in xrange(self.n_seqs):
                if self.seqs_taps.has_key(j):
                    ls_taps = self.seqs_taps[j]
                    min_tap = seqs_mins[j]
                    for tap_value in ls_taps:
                        k = i - min_tap + tap_value
                        g_seqs[j][k] += grads[pos]
                        pos += 1
            # get gradient for outputs; gradients of taps that reach before
            # t = 0 accumulate into the seeds, the rest flow back into g_outs
            for j in xrange(self.n_outs):
                if self.outs_taps.has_key(j):
                    ls_taps = self.outs_taps[j]
                    seed_sz = seed_size[j]
                    for tap_value in ls_taps:
                        if i + tap_value < 0:
                            k = i + seed_sz + tap_value
                            if k >= 0:
                                g_seeds[j][k] += grads[pos]
                        else:
                            g_outs[j][i + tap_value] += grads[pos]
                        pos += 1
            for j in xrange(len(g_non_seqs)):
                g_non_seqs[j] += grads[j + pos]
        # return the gradient
        for i, v in enumerate(g_seqs + g_seeds + g_non_seqs):
            storage[i][0] = v
import random
import numpy.random
from theano.tests import unittest_tools as utt
def verify_grad(op, pt, n_tests=2, rng=None, eps=None, tol=None,
                mode=None, cast_to_output_type=False):
    pt = [numpy.array(p) for p in pt]
class T_Scan(unittest.TestCase):
    def setUp(self):
        utt.seed_rng()
        x_1 = theano.tensor.dscalar('x_1')
        self.my_f = theano.function([x_1], [x_1])  # dummy function

    # Naming convention :
    #  u_1,u_2,..   -> inputs, arrays to iterate over
    #  x_1,x_2,..   -> outputs at t-1 that are required in the recurrent
    #                  computation
    #  iu_1,iu_2,.. -> inplace inputs, inputs that are being replaced by
    #                  outputs during computation
    #  du_1,du_2,.. -> dummy inputs used to do inplace computation, they
    #                  are not passed to my_f
    #  ix_1,ix_2,.. -> inplace outputs at t-1
    #  x_1_next,..  -> outputs at t
    #  ix_1_next,.. -> inplace outputs at time t
    #  w_1,w_2,..   -> weights, parameters over which scan does not iterate
    #  my_f         -> compiled function that will be applied recurrently
    #  my_op        -> operator class
    #  final_f      -> compiled function that applies the Scan operation
    #  out_1,..     -> outputs of the Scan operation

    ###################################################################
    def test_numberOfIterableInputs(self):
        def t1():
            my_op = Scan.compiled(self.my_f, -1, 1)
        def t2():
            my_op = Scan.compiled(self.my_f, 0, 1)
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(ValueError, t2)

    ###################################################################
    def test_numberOfOutputs(self):
        def t1():
            my_op = Scan.compiled(self.my_f, 1, -1)
        def t2():
            my_op = Scan.compiled(self.my_f, 1, 0)
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(ValueError, t2)

    #####################################################################
    def test_numberOfInplaceOutputs(self):
        def t1():
            my_op = Scan.compiled(self.my_f, 1, 1, n_inplace=-1)
        def t2():
            my_op = Scan.compiled(self.my_f, 1, 1, n_inplace=2)
        def t3():
            my_op = Scan.compiled(self.my_f, 2, 1, n_inplace=2)
        def t4():
            my_op = Scan.compiled(self.my_f, 1, 2, n_inplace=2)
        def t5():
            my_op = Scan.compiled(self.my_f, 1, 1, n_inplace=1, n_inplace_ignore=2)
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(ValueError, t2)
        self.failUnlessRaises(ValueError, t3)
        self.failUnlessRaises(ValueError, t4)
        self.failUnlessRaises(ValueError, t5)

    #####################################################################
    def test_taps(self):
        def t1():
            my_op = Scan.compiled(self.my_f, 1, 1, taps={2: [3]})
        def t2():
            my_op = Scan.compiled(self.my_f, 1, 2, taps={0: [0]})
        def t3():
            my_op = Scan.compiled(self.my_f, 1, 2, taps={0: [1]})
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(ValueError, t2)
        self.failUnlessRaises(ValueError, t3)

    #####################################################################
    def test_makeNode(self):
        def t1():
            # Test inputs of different lengths
            # define the function that is applied recurrently
            u_1 = theano.tensor.dscalar('u_1')
            u_2 = theano.tensor.dscalar('u_2')
            x_1 = theano.tensor.dscalar('x_1')
            x_1_next = u_1 + u_2 * x_1
            my_f = theano.function([u_1, u_2, x_1], [x_1_next])
            # define the function that applies the scan operation
            my_op = Scan.compiled(my_f, 2, 1)
            u_1 = theano.tensor.dvector('u_1')
            u_2 = theano.tensor.dvector('u_2')
            x_1 = theano.tensor.dvector('x_1')
            x_1_next = my_op(u_1, u_2, x_1)
            final_f = theano.function([u_1, u_2, x_1], [x_1_next])
            # test the function final_f
            u_1 = numpy.random.rand(3)
            u_2 = numpy.random.rand(2)
            x_1 = [numpy.random.rand()]
            out = final_f(u_1, u_2, x_1)
        def t2():
            # Test function does not return correct number of outputs
            # define the function that is applied recurrently
            u_1 = theano.tensor.dscalar('u_1')
            x_1 = theano.tensor.dscalar('x_1')
            x_1_next = u_1 * x_1
            my_f = theano.function([u_1, x_1], [x_1_next])
            # define the function that applies the scan operation
            my_op = Scan.compiled(my_f, 1, 2)
            u_1 = theano.tensor.dvector('u_1')
            x_1 = theano.tensor.dvector('x_1')
            x_2 = theano.tensor.dvector('x_2')
            x_1_next, x_2_next = my_op(u_1, x_1, x_2)
            final_f = theano.function([u_1, x_1, x_2], [x_1_next, x_2_next])
            # generate data
            u_1 = numpy.random.rand(3)
            x_1 = [numpy.random.rand()]
            x_2 = [numpy.random.rand()]
            out_1, out_2 = final_f(u_1, x_1, x_2)
        self.failUnlessRaises(ValueError, t1)
        self.failUnlessRaises(TypeError, t2)
    #####################################################################
    def test_generator(self):
        # compile my_f
        u_1 = theano.tensor.dscalar('u_1')  # dummy input,
                                            # required if no inplace is used!
        x_1 = theano.tensor.dscalar('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = x_1 * w_1
        my_f = theano.function([u_1, x_1, w_1], [x_1_next])
        # create operation
        my_op = Scan.compiled(my_f, 1, 1)
        u_1 = theano.tensor.dvector('u_1')  # dummy input, there is no
        # inplace, so output will not be put in place of this u_1!
        x_1 = theano.tensor.dvector('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = my_op(u_1, x_1, w_1)
        final_f = theano.function([u_1, x_1, w_1], [x_1_next])
        # generate data
        x_1 = numpy.ndarray(3)  # dummy input, just tells for how many time
                                # steps to run recursively
        out_1 = final_f(x_1, [2], 2)
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))
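The expected values in the assertion above follow from unrolling the recurrence by hand. A plain-Python reference (the name `generator_reference` is invented for the example):

```python
# Plain-Python reproduction of the recurrence in test_generator:
# x(t) = x(t-1) * w, starting from x(-1) = 2 with w = 2, for three steps.
def generator_reference(x0, w, n_steps):
    out, x = [], x0
    for _ in range(n_steps):
        x = x * w          # one step of the recurrence
        out.append(x)
    return out

print(generator_reference(2, 2, 3))
```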
    #####################################################################
    def test_generator_inplace_no_ignore(self):
        # compile my_f
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = x_1 * w_1
        my_f = theano.function([u_1, x_1, w_1], [x_1_next])
        # create operation
        my_op = Scan.compiled(my_f, 1, 1, n_inplace=1)
        iu_1 = theano.tensor.dvector('iu_1')
        ix_1 = theano.tensor.dvector('ix_1')
        w_1 = theano.tensor.dscalar('w_1')
        ix_1_next = my_op(iu_1, ix_1, w_1)
        final_f = theano.function([theano.In(iu_1, mutable=True), ix_1, w_1],
                                  [ix_1_next], mode='FAST_RUN')
        # generate data
        iu_1 = numpy.ndarray(3)
        out_1 = final_f(iu_1, [2], 2)
        # not concretely implemented yet ..
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))
        self.failUnless(numpy.all(out_1 == iu_1))

    #####################################################################
    def test_generator_inplace_no_ignore_2states(self):
        # compile my_f
        u_1 = theano.tensor.dscalar('u_1')
        u_2 = theano.tensor.dscalar('u_2')
        x_1 = theano.tensor.dscalar('x_1')
        x_2 = theano.tensor.dscalar('x_2')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = x_1 * w_1
        x_2_next = x_2 * w_1
        my_f = theano.function([u_1, u_2, x_1, x_2, w_1], [x_1_next, x_2_next])
        # create operation
        my_op = Scan.compiled(my_f, 2, 2, n_inplace=2)
        iu_1 = theano.tensor.dvector('iu_1')
        iu_2 = theano.tensor.dvector('iu_2')
        ix_1 = theano.tensor.dvector('ix_1')
        ix_2 = theano.tensor.dvector('ix_2')
        w_1 = theano.tensor.dscalar('w_1')
        ix_1_next, ix_2_next = my_op(iu_1, iu_2, ix_1, ix_2, w_1)
        final_f = theano.function([theano.In(iu_1, mutable=True),
                                   theano.In(iu_2, mutable=True), ix_1, ix_2,
                                   w_1], [ix_1_next, ix_2_next], mode='FAST_RUN')
        # generate data
        iu_1 = numpy.ndarray(3)
        iu_2 = numpy.ndarray(3)
        out_1, out_2 = final_f(iu_1, iu_2, [2], [1], 2)
        # not concretely implemented yet ..
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))
        self.failUnless(numpy.all(out_1 == iu_1))
        self.failUnless(numpy.all(out_2 == numpy.asarray([2, 4, 8])))
        self.failUnless(numpy.all(out_2 == iu_2))

    #######################################################################
    def test_generator_inplace(self):
        # compile my_f
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_2 = theano.tensor.dscalar('x_2')
        x_1_next = u_1 + x_1
        x_2_next = x_1 * x_2
        my_f = theano.function([u_1, x_1, x_2], [x_1_next, x_2_next])
        # create operation
        my_op = Scan.compiled(my_f, 2, 2, n_inplace=2, n_inplace_ignore=1)
        du_1 = theano.tensor.dvector('du_1')
        iu_1 = theano.tensor.dvector('iu_1')
        ix_1 = theano.tensor.dvector('ix_1')
        ix_2 = theano.tensor.dvector('ix_2')
        ix_1_next, ix_2_next = my_op(du_1, iu_1, ix_1, ix_2)
        final_f = theano.function([theano.In(du_1, mutable=True),
                                   theano.In(iu_1, mutable=True),
                                   ix_1, ix_2], [ix_1_next, ix_2_next],
                                  mode='FAST_RUN')
        # generate data
        du_1 = numpy.asarray([0., 0., 0.])
        iu_1 = numpy.asarray([1., 1., 1.])
        ix_1 = [1]
        ix_2 = [1]
        out_1, out_2 = final_f(du_1, iu_1, ix_1, ix_2)
        self.failUnless(numpy.all(out_1 == numpy.asarray([2, 3, 4])))
        self.failUnless(numpy.all(out_2 == numpy.asarray([1, 2, 6])))
        self.failUnless(numpy.all(out_1 == du_1))
        self.failUnless(numpy.all(out_2 == iu_1))

    #####################################################################
    def test_iterateOnlyOverX(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_next = u_1 * x_1
        my_f = theano.function([u_1, x_1], [x_1_next])
        my_op = Scan.compiled(my_f, 1, 1)
        u_1 = theano.tensor.dvector('u_1')
        x_1 = theano.tensor.dvector('x_1')
        x_1_next = my_op(u_1, x_1)
        final_f = theano.function([x_1, u_1], [x_1_next])
        u_1 = numpy.asarray([2, 2, 2])
        out_1 = final_f([2.], u_1)
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))
    #####################################################################
    def test_iterateOverSeveralInputs(self):
        u_1 = theano.tensor.dscalar('u_1')  # input 1
        u_2 = theano.tensor.dscalar('u_2')  # input 2
        x_1 = theano.tensor.dscalar('x_1')  # output
        x_1_next = (u_1 + u_2) * x_1
        my_f = theano.function([u_1, u_2, x_1], [x_1_next])
        my_op = Scan.compiled(my_f, 2, 1)
        u_1 = theano.tensor.dvector('u_1')
        u_2 = theano.tensor.dvector('u_2')
        x_1 = theano.tensor.dvector('x_1')
        x_1_next = my_op(u_1, u_2, x_1)
        final_f = theano.function([u_1, u_2, x_1], [x_1_next])
        u_1 = numpy.asarray([1, 1, 1])
        u_2 = numpy.asarray([1, 1, 1])
        x_1 = [2]
        out_1 = final_f(u_1, u_2, x_1)
        self.failUnless(numpy.all(out_1 == numpy.asarray([4, 8, 16])))

    #####################################################################
    def test_iterateOverSeveralInputsSeveralInplace(self):
        iu_1 = theano.tensor.dscalar('iu_1')
        u_1 = theano.tensor.dscalar('u_1')
        u_2 = theano.tensor.dscalar('u_2')
        u_3 = theano.tensor.dscalar('u_3')
        u_4 = theano.tensor.dscalar('u_4')
        ix_1 = theano.tensor.dscalar('ix_1')
        ix_2 = theano.tensor.dscalar('ix_2')
        x_1 = theano.tensor.dscalar('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        ix_1_next = u_3 + u_4
        ix_2_next = ix_1 + ix_2
        x_1_next = x_1 + u_3 + u_4 + ix_1 + ix_2
        my_f = theano.function([iu_1, u_1, u_2, u_3, u_4, ix_1, ix_2, x_1, w_1],
                               [ix_1_next, ix_2_next, x_1_next])
        my_op = Scan.compiled(my_f, 6, 3, n_inplace=2,
                              n_inplace_ignore=1)
        du_1 = theano.tensor.dvector('du_1')
        iu_1 = theano.tensor.dvector('iu_1')
        u_1 = theano.tensor.dvector('u_1')
        u_2 = theano.tensor.dvector('u_2')
        u_3 = theano.tensor.dvector('u_3')
        u_4 = theano.tensor.dvector('u_4')
        x_1 = theano.tensor.dvector('x_1')
        ix_1 = theano.tensor.dvector('ix_1')
        ix_2 = theano.tensor.dvector('ix_2')
        w_1 = theano.tensor.dscalar('w_1')
        [ix_1_next, ix_2_next, x_1_next] = \
            my_op(du_1, iu_1, u_1, u_2, u_3, u_4, x_1, ix_1, ix_2, w_1)
        final_f = theano.function([theano.In(du_1, mutable=True),
                                   theano.In(iu_1, mutable=True),
                                   u_1, u_2, u_3, u_4, ix_1, ix_2, x_1, w_1],
                                  [ix_1_next, ix_2_next,
                                   x_1_next], mode='FAST_RUN')
        # generate data
        du_1 = numpy.asarray([0., 0., 0.])
        iu_1 = numpy.asarray([0., 1., 2.])
        u_1 = numpy.asarray([1., 2., 3.])
        u_2 = numpy.asarray([1., 1., 1.])
        u_3 = numpy.asarray([2., 2., 2.])
        u_4 = numpy.asarray([3., 2., 1.])
        x_1 = [1.]
        ix_1 = [1.]
        ix_2 = [1.]
        w_1 = 2.
        out_1, out_2, out_3 = final_f(du_1, iu_1, u_1, u_2, u_3, u_4,
                                      ix_1, ix_2, x_1, w_1)
        self.failUnless(numpy.all(out_3 == numpy.asarray([8., 19., 33.])))
        self.failUnless(numpy.all(out_1 == numpy.asarray([5., 4., 3.])))
        self.failUnless(numpy.all(out_2 == numpy.asarray([2., 7., 11.])))
        self.failUnless(numpy.all(out_1 == du_1))
        self.failUnless(numpy.all(out_2 == iu_1))

    #####################################################################
    def test_computeInPlaceArguments(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = u_1 * w_1 + x_1
        my_f = theano.function([u_1, x_1, theano.In(w_1, update=w_1 * 2)],
                               [x_1_next])
        my_op = Scan.compiled(my_f, 1, 1)
        u_1 = theano.tensor.dvector('u_1')
        x_1 = theano.tensor.dvector('x_1')
        w_1 = theano.tensor.dscalar('w_1')
        x_1_next = my_op(u_1, x_1, w_1)
        final_f = theano.function([u_1, x_1, w_1], [x_1_next])
        u_1 = [1., 1., 1.]
        x_1 = [1.]
        w_1 = 1.
        out_1 = final_f(u_1, x_1, w_1)
        self.failUnless(numpy.all(out_1 == numpy.asarray([2, 4, 8])))

    #####################################################################
    def test_timeTaps(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_t2 = theano.tensor.dscalar('x_1_t2')
        x_1_t4 = theano.tensor.dscalar('x_1_t4')
        x_1_next = u_1 + x_1 + x_1_t2 + x_1_t4
        my_f = theano.function([u_1, x_1, x_1_t2, x_1_t4], [x_1_next])
        my_op = Scan.compiled(my_f, 1, 1, taps={0: [2, 4]})
        u_1 = theano.tensor.dvector('u_1')
        x_1 = theano.tensor.dvector('x_1')
        x_1_next = my_op(u_1, x_1)
        final_f = theano.function([u_1, x_1], [x_1_next])
        u_1 = [1., 1., 1., 1., 1.]
        x_1 = [1., 2., 3., 4.]
        out_1 = final_f(u_1, x_1)
        self.failUnless(numpy.all(out_1 == numpy.asarray([9., 16., 29., 50., 89.])))
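The expected sequence `[9, 16, 29, 50, 89]` can be reproduced by unrolling the tapped recurrence directly. A plain-Python reference (the helper name `run_taps_reference` is invented for the example):

```python
# Plain-Python reproduction of the tapped recurrence in test_timeTaps:
# y(t) = u(t) + y(t-1) + y(t-2) + y(t-4); the seed vector supplies
# the past values y(-4) .. y(-1).
def run_taps_reference(u, seeds):
    hist = list(seeds)          # hist[-1] is y(t-1), hist[-4] is y(t-4)
    out = []
    for t in range(len(u)):
        y = u[t] + hist[-1] + hist[-2] + hist[-4]
        hist.append(y)
        out.append(y)
    return out

print(run_taps_reference([1.] * 5, [1., 2., 3., 4.]))
```

Note that the implicit t-1 tap and the requested taps at t-2 and t-4 all read from the same growing history, which is why the seed vector must be as long as the largest tap.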
    #####################################################################
    def test_constructFunction(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_next = u_1 + x_1
        my_op = Scan.symbolic(([u_1, x_1], x_1_next), 1, 1)
        u_1 = theano.tensor.dvector('u_1')
        x_1 = theano.tensor.dvector('x_1')
        x_1_next = my_op(u_1, x_1)
        final_f = theano.function([u_1, x_1], [x_1_next])
        u_1 = [1., 1., 1.]
        x_1 = [1.]
        out_1 = final_f(u_1, x_1)
        self.failUnless(numpy.all(out_1 == numpy.asarray([2., 3., 4.])))

    ######################################################################
    def test_gradOneInputOneOutput(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_next = u_1 * x_1
        my_op = Scan.symbolic(([u_1, x_1], x_1_next), 1, 1)
        u_1 = [1., 2., 3.]
        x_1 = [1.]
        verify_grad(my_op, [u_1, x_1])

    #######################################################################
    def test_gradManyInputsManyOutputs(self):
        u_1 = theano.tensor.dscalar('u_1')
        u_2 = theano.tensor.dscalar('u_2')
        x_1 = theano.tensor.dscalar('x_1')
        x_2 = theano.tensor.dscalar('x_2')
        x_1_next = x_1 * u_1 + x_2
        x_2_next = x_2 * u_2 + x_1
        my_op = Scan.symbolic(([u_1, u_2, x_1, x_2],
                               [x_1_next, x_2_next]),
                              2, 2)
        u_1 = [1., .2, 3.]
        u_2 = [1.5, 1.25, .35]
        x_1 = [.5]
        x_2 = [.65]
        verify_grad(my_op, [u_1, u_2, x_1, x_2])

    ######################################################################
    def test_gradTimeTaps(self):
        u_1 = theano.tensor.dscalar('u_1')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_t_2 = theano.tensor.dscalar('x_1_t_2')
        x_1_next = x_1_t_2 * x_1 * u_1
        my_op = Scan.symbolic(([u_1, x_1, x_1_t_2],
                               [x_1_next]),
                              1, 1, taps={0: [2]})
        u_1 = [1., 2., 3., 4.]
        x_1 = [2., 3.]
        verify_grad(my_op, [u_1, x_1])

    #######################################################################
    def test_gradManyInputsManyOutputsTimeTaps(self):
        u_1 = theano.tensor.dscalar('u_1')
        u_2 = theano.tensor.dscalar('u_2')
        x_1 = theano.tensor.dscalar('x_1')
        x_1_2 = theano.tensor.dscalar('x_1_2')
        x_2 = theano.tensor.dscalar('x_2')
        x_2_2 = theano.tensor.dscalar('x_2_2')
        x_1_n = x_1 * x_2_2 + u_1 * x_1_2
        x_2_n = x_2 * x_1_2 + u_2 * x_2_2
        my_op = Scan.symbolic(([u_1, u_2, x_1, x_1_2, x_2, x_2_2],
                               [x_1_n, x_2_n]),
                              2, 2, taps={0: [2], 1: [2]})
        u_1 = [1., 2., 3., 4.]
        u_2 = [3., 2., 4., 1.]
        x_1 = [0.1, 0.2]
        x_2 = [1.5, 3.5]
        verify_grad(my_op, [u_1, u_2, x_1, x_2])
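The gradient tests above all reduce to comparing an analytic gradient against finite differences. A self-contained sketch of that check, done by hand for the recurrence of test_gradOneInputOneOutput (the helpers `forward` and `loss` are invented for the example; the analytic formula assumes the loss is the sum of the outputs):

```python
# Finite-difference check in the spirit of verify_grad, for the
# recurrence y(t) = u(t) * y(t-1) with loss L = sum_t y(t).
# Analytically, dL/du_k = sum_{t >= k} y(t) / u_k.
def forward(u, y0):
    ys, y = [], y0
    for ut in u:
        y = ut * y
        ys.append(y)
    return ys

def loss(u, y0):
    return sum(forward(u, y0))

u, y0, eps = [1.0, 2.0, 3.0], 1.0, 1e-6
ys = forward(u, y0)
analytic = [sum(ys[k:]) / u[k] for k in range(len(u))]
numeric = []
for k in range(len(u)):
    up = list(u); up[k] += eps      # perturb one input up ...
    um = list(u); um[k] -= eps      # ... and down
    numeric.append((loss(up, y0) - loss(um, y0)) / (2 * eps))
# the two estimates should agree to within finite-difference error
print(analytic)
```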
    def test_one(self):
        pass


if __name__ == '__main__':
    unittest.main()