Commit 7526e601 authored by Pascal Lamblin

Whitespace fixes.

Parent dfcc51d6
...@@ -318,10 +318,10 @@ is a :ref:`variable` we statically know the value of. ...@@ -318,10 +318,10 @@ is a :ref:`variable` we statically know the value of.
Now the code works the way we want it to. Now the code works the way we want it to.
.. note:: .. note::
Most Theano Ops follow this convention of up-casting literal Most Theano Ops follow this convention of up-casting literal
make_node arguments to Constants. make_node arguments to Constants.
This makes typing expressions more natural. If you do This makes typing expressions more natural. If you do
not want a constant somewhere in your graph, you have to pass a Variable not want a constant somewhere in your graph, you have to pass a Variable
(like ``double('x')`` here). (like ``double('x')`` here).
...@@ -343,7 +343,7 @@ arithmetic operators: ...@@ -343,7 +343,7 @@ arithmetic operators:
from theano import gof from theano import gof
class BinaryDoubleOp(gof.Op): class BinaryDoubleOp(gof.Op):
def __init__(self, name, fn): def __init__(self, name, fn):
self.name = name self.name = name
self.fn = fn self.fn = fn
...@@ -353,7 +353,7 @@ arithmetic operators: ...@@ -353,7 +353,7 @@ arithmetic operators:
def __hash__(self): def __hash__(self):
return hash(type(self)) ^ hash(self.name) ^ hash(self.fn) return hash(type(self)) ^ hash(self.name) ^ hash(self.fn)
def make_node(self, x, y): def make_node(self, x, y):
if isinstance(x, (int, float)): if isinstance(x, (int, float)):
x = gof.Constant(double, x) x = gof.Constant(double, x)
...@@ -362,22 +362,22 @@ arithmetic operators: ...@@ -362,22 +362,22 @@ arithmetic operators:
if x.type != double or y.type != double: if x.type != double or y.type != double:
raise TypeError('%s only works on doubles' % self.name) raise TypeError('%s only works on doubles' % self.name)
return gof.Apply(self, [x, y], [double()]) return gof.Apply(self, [x, y], [double()])
def perform(self, node, (x, y), (z, )): def perform(self, node, (x, y), (z, )):
z[0] = self.fn(x, y) z[0] = self.fn(x, y)
def __str__(self): def __str__(self):
return self.name return self.name
add = BinaryDoubleOp(name = 'add', add = BinaryDoubleOp(name = 'add',
fn = lambda x, y: x + y) fn = lambda x, y: x + y)
sub = BinaryDoubleOp(name = 'sub', sub = BinaryDoubleOp(name = 'sub',
fn = lambda x, y: x - y) fn = lambda x, y: x - y)
mul = BinaryDoubleOp(name = 'mul', mul = BinaryDoubleOp(name = 'mul',
fn = lambda x, y: x * y) fn = lambda x, y: x * y)
div = BinaryDoubleOp(name = 'div', div = BinaryDoubleOp(name = 'div',
fn = lambda x, y: x / y) fn = lambda x, y: x / y)
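The up-casting convention used in ``make_node`` can be tried out without Theano. The sketch below is a plain-Python stand-in, not Theano code: ``Variable``, ``Constant`` and ``as_double`` are hypothetical names introduced only for illustration, and the type check is simplified to an ``isinstance`` test.

```python
class Variable:
    """Hypothetical stand-in for a symbolic input with no known value."""
    def __init__(self, name):
        self.name = name


class Constant:
    """Hypothetical stand-in for a value that is known statically."""
    def __init__(self, value):
        self.value = float(value)


def as_double(x):
    # Wrap Python literals; leave anything else untouched.
    if isinstance(x, (int, float)):
        return Constant(x)
    return x


def make_node(x, y):
    # Up-cast literal arguments first, then check what we were given.
    x, y = as_double(x), as_double(y)
    for arg in (x, y):
        if not isinstance(arg, (Variable, Constant)):
            raise TypeError('only works on doubles')
    return (x, y)


x = Variable('x')
inputs = make_node(x, 5)  # the literal 5 is wrapped, x is passed through
```

Passing ``x`` keeps it a free input, while the literal ``5`` becomes a constant node. This is the same split the note describes.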
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
.. _lib_scan: .. _lib_scan:
================================ ================================
:mod:`scan` -- Looping in Theano :mod:`scan` -- Looping in Theano
================================ ================================
...@@ -10,23 +10,23 @@ Guide ...@@ -10,23 +10,23 @@ Guide
===== =====
The scan function provides the basic functionality needed to do loops The scan function provides the basic functionality needed to do loops
in Theano. Scan comes with many bells and whistles, which can be easily in Theano. Scan comes with many bells and whistles, which can be easily
introduced through a few examples : introduced through a few examples :
Basic functionality : Computing :math:`A^k` Basic functionality : Computing :math:`A^k`
-------------------------------------------- --------------------------------------------
Assume that, given *k* you want to get ``A**k`` using a loop. Assume that, given *k* you want to get ``A**k`` using a loop.
More precisely, if *A* is a tensor you want to compute More precisely, if *A* is a tensor you want to compute
``A**k`` elemwise. The python/numpy code would loop like ``A**k`` elemwise. The python/numpy code would loop like
.. code-block:: python .. code-block:: python
result = 1 result = 1
for i in xrange(k): for i in xrange(k):
result = result * A result = result * A
The equivalent Theano code would be The equivalent Theano code would be
.. code-block:: python .. code-block:: python
...@@ -39,29 +39,29 @@ The equivalent Theano code would be ...@@ -39,29 +39,29 @@ The equivalent Theano code would be
# compiled function that returns A**k # compiled function that returns A**k
f = theano.function([A,k], result[-1], updates = updates) f = theano.function([A,k], result[-1], updates = updates)
Let us go through the example line by line. What we did is first to Let us go through the example line by line. What we did is first to
construct a function (using a lambda expression) that given `x_tm1` and construct a function (using a lambda expression) that given `x_tm1` and
`A` returns `x_tm1*A`. Given the order of the parameters, `x_tm1` `A` returns `x_tm1*A`. Given the order of the parameters, `x_tm1`
is the value of our output at time step ``t-1``. Therefore is the value of our output at time step ``t-1``. Therefore
``x_t`` (value of output at time `t`) is `A` times value of output ``x_t`` (value of output at time `t`) is `A` times value of output
at `t-1`. at `t-1`.
Next we initialize the output as a tensor with same Next we initialize the output as a tensor with same
shape as A filled with ones. We give A to scan as a non sequence parameter and shape as A filled with ones. We give A to scan as a non sequence parameter and
specify the number of steps k to iterate over our lambda expression. specify the number of steps k to iterate over our lambda expression.
Scan will return a tuple, containing our result (``result``) and a Scan will return a tuple, containing our result (``result``) and a
dictionary of updates ( empty in this case). Note that the result dictionary of updates ( empty in this case). Note that the result
is not a matrix, but a 3D tensor containing the value of ``A**k`` for is not a matrix, but a 3D tensor containing the value of ``A**k`` for
each step. We want the last value ( after k steps ) so we compile each step. We want the last value ( after k steps ) so we compile
a function to return just that. Note that there is an optimization, that a function to return just that. Note that there is an optimization, that
at compile time will detect that you are using just the last value of the at compile time will detect that you are using just the last value of the
result and ensure that scan does not store all the intermediate values result and ensure that scan does not store all the intermediate values
that are used. So do not worry if A and k are large. that are used. So do not worry if A and k are large.
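To make the shape of the result concrete, here is a plain-Python model of the loop described above. It is only a sketch of the semantics, not Theano's implementation: ``scan_model`` is a hypothetical helper, and lists stand in for tensors.

```python
def scan_model(step, initial, non_sequences, n_steps):
    """Apply `step` repeatedly and keep every intermediate result."""
    results = []
    previous = initial
    for _ in range(n_steps):
        previous = step(previous, *non_sequences)
        results.append(previous)
    return results


def step(x_tm1, A):
    # One iteration: elementwise product of the previous output and A.
    return [x * a for x, a in zip(x_tm1, A)]


A = [2.0, 3.0]
ones = [1.0 for _ in A]               # same shape as A, filled with ones
result = scan_model(step, ones, [A], n_steps=3)
final = result[-1]                    # the value after the last step
```

``result`` holds one entry per step, which is why the text above indexes it with ``[-1]`` to obtain ``A**k``.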
Multiple outputs, several taps values - Recurrent Neural Network with Scan Multiple outputs, several taps values - Recurrent Neural Network with Scan
-------------------------------------------------------------------------- --------------------------------------------------------------------------
A more practical task would be to implement an RNN using scan. Assume A more practical task would be to implement an RNN using scan. Assume
that our RNN is defined as follows : that our RNN is defined as follows :
.. math:: .. math::
...@@ -72,11 +72,11 @@ that our RNN is defined as follows : ...@@ -72,11 +72,11 @@ that our RNN is defined as follows :
Note that this network is far from a classical recurrent neural Note that this network is far from a classical recurrent neural
network and might be useless. The reason we defined it as such network and might be useless. The reason we defined it as such
is to better illustrate the features of scan. is to better illustrate the features of scan.
In this case we have a sequence over which we need to iterate ``u``, In this case we have a sequence over which we need to iterate ``u``,
and two outputs ``x`` and ``y``. To implement this with scan we first and two outputs ``x`` and ``y``. To implement this with scan we first
construct a function that computes one iteration step : construct a function that computes one iteration step :
.. code-block:: python .. code-block:: python
...@@ -90,22 +90,22 @@ construct a function that computes one iteration step : ...@@ -90,22 +90,22 @@ construct a function that computes one iteration step :
return [x_t, y_t] return [x_t, y_t]
As a naming convention for the variables we used ``a_tmb`` to mean ``a`` at As a naming convention for the variables we used ``a_tmb`` to mean ``a`` at
``t-b`` and ``a_tpb`` to be ``a`` at ``t+b``. ``t-b`` and ``a_tpb`` to be ``a`` at ``t+b``.
Note the order in which the parameters are given, and in which the Note the order in which the parameters are given, and in which the
result is returned. Try to respect chronological order among result is returned. Try to respect chronological order among
the taps ( time slices of sequences or outputs) used. For scan it is crucial only the taps ( time slices of sequences or outputs) used. For scan it is crucial only
for the variables representing the different time taps to be in the same order for the variables representing the different time taps to be in the same order
as the one in which these taps are given. Also, not only taps should respect as the one in which these taps are given. Also, not only taps should respect
an order, but also variables, since this is how scan figures out what should an order, but also variables, since this is how scan figures out what should
be represented by what. Given that we have all be represented by what. Given that we have all
the Theano variables needed we construct our RNN as follows : the Theano variables needed we construct our RNN as follows :
.. code-block:: python .. code-block:: python
u = T.matrix() # it is a sequence of vectors u = T.matrix() # it is a sequence of vectors
x0 = T.matrix() # initial state of x has to be a matrix, since x0 = T.matrix() # initial state of x has to be a matrix, since
# it has to cover x[-3] # it has to cover x[-3]
y0 = T.vector() # y0 is just a vector since scan has only to provide y0 = T.vector() # y0 is just a vector since scan has only to provide
# y[-1] # y[-1]
...@@ -120,31 +120,31 @@ the Theano variables needed we construct our RNN as follows : ...@@ -120,31 +120,31 @@ the Theano variables needed we construct our RNN as follows :
Now ``x_vals`` and ``y_vals`` are symbolic variables pointing to the Now ``x_vals`` and ``y_vals`` are symbolic variables pointing to the
sequence of x and y values generated by iterating over u. The sequence of x and y values generated by iterating over u. The
``sequence_taps``, ``outputs_taps`` give to scan information about what ``sequence_taps``, ``outputs_taps`` give to scan information about what
slices are exactly needed. Note that if we want to use ``x[t-k]`` we do slices are exactly needed. Note that if we want to use ``x[t-k]`` we do
not need to also have ``x[t-(k-1)], x[t-(k-2)],..``, but when applying not need to also have ``x[t-(k-1)], x[t-(k-2)],..``, but when applying
the compiled function, the numpy array given to represent this sequence the compiled function, the numpy array given to represent this sequence
should be large enough to cover these values. Assume that we compile the should be large enough to cover these values. Assume that we compile the
above function, and we give as ``u`` the array ``uvals = [0,1,2,3,4,5,6,7,8]``. above function, and we give as ``u`` the array ``uvals = [0,1,2,3,4,5,6,7,8]``.
By abuse of notation, scan will consider ``uvals[0]`` as ``u[-4]``, and By abuse of notation, scan will consider ``uvals[0]`` as ``u[-4]``, and
will start scanning from ``uvals[4]`` towards the end. will start scanning from ``uvals[4]`` towards the end.
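The indexing in this example can be checked with a small plain-Python model. This is a sketch under assumptions, not scan's actual code: ``iterate_taps`` is a hypothetical helper, it only handles taps that are zero or negative, and the taps ``[-4, 0]`` are chosen here simply to match the ``u[-4]`` mentioned above.

```python
def iterate_taps(values, taps):
    """Yield, for each step, the elements of `values` selected by `taps`.

    Assumes every tap is <= 0, i.e. only current and past values are used.
    """
    offset = -min(taps)  # how far back the earliest tap reaches
    for t in range(offset, len(values)):
        yield tuple(values[t + tap] for tap in taps)


uvals = [0, 1, 2, 3, 4, 5, 6, 7, 8]
steps = list(iterate_taps(uvals, taps=[-4, 0]))
```

The first step pairs ``uvals[0]`` (playing the role of ``u[-4]``) with ``uvals[4]``, and nine input values yield only five steps. This is why the array passed in has to be long enough to cover the earliest tap.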
Using shared variables - Gibbs sampling Using shared variables - Gibbs sampling
--------------------------------------- ---------------------------------------
Another useful feature of scan, is that it can handle shared variables. Another useful feature of scan, is that it can handle shared variables.
For example, if we want to implement a Gibbs chain of length 10 we would do For example, if we want to implement a Gibbs chain of length 10 we would do
the following: the following:
.. code-block:: python .. code-block:: python
W = theano.shared ( W_values ) # we assume that ``W_values`` contains the W = theano.shared ( W_values ) # we assume that ``W_values`` contains the
# initial values of your weight matrix # initial values of your weight matrix
bvis = theano.shared( bvis_values) bvis = theano.shared( bvis_values)
bhid = theano.shared( bhid_values) bhid = theano.shared( bhid_values)
trng = T.shared_randomstreams.RandomStreams(1234) trng = T.shared_randomstreams.RandomStreams(1234)
def OneStep( vsample) : def OneStep( vsample) :
...@@ -160,18 +160,18 @@ the following: ...@@ -160,18 +160,18 @@ the following:
gibbs10 = theano.function([sample], values[-1], updates = updates) gibbs10 = theano.function([sample], values[-1], updates = updates)
Note that if we use shared variables ( ``W``, ``bvis``, ``bhid``) but Note that if we use shared variables ( ``W``, ``bvis``, ``bhid``) but
we do not iterate over them ( so scan doesn't really need to know we do not iterate over them ( so scan doesn't really need to know
anything in particular about them, just that they are used inside the anything in particular about them, just that they are used inside the
function applied at each step) you do not need to pass them as function applied at each step) you do not need to pass them as
arguments. Scan will find them on its own and add them to the graph. Of arguments. Scan will find them on its own and add them to the graph. Of
course, if you wish to (and it is good practice) you can add them, when course, if you wish to (and it is good practice) you can add them, when
you call scan (they would be in the list of non sequence inputs). you call scan (they would be in the list of non sequence inputs).
The second, and probably most crucial observation is that the updates The second, and probably most crucial observation is that the updates
dictionary becomes important in this case. It links a shared variable dictionary becomes important in this case. It links a shared variable
with its updated value after k steps. In this case it tells how the with its updated value after k steps. In this case it tells how the
random streams get updated after 10 iterations. If you do not pass this random streams get updated after 10 iterations. If you do not pass this
update dictionary to your function, you will always get the same 10 update dictionary to your function, you will always get the same 10
sets of random numbers. You can even use the ``updates`` dictionary sets of random numbers. You can even use the ``updates`` dictionary
afterwards. Look at this example : afterwards. Look at this example :
...@@ -181,27 +181,27 @@ afterwards. Look at this example : ...@@ -181,27 +181,27 @@ afterwards. Look at this example :
a = theano.shared(1) a = theano.shared(1)
values,updates = theano.scan( lambda : {a:a+1}, n_steps = 10 ) values,updates = theano.scan( lambda : {a:a+1}, n_steps = 10 )
In this case the lambda expression does not require any input parameters In this case the lambda expression does not require any input parameters
and returns an update dictionary which tells how ``a`` should be updated and returns an update dictionary which tells how ``a`` should be updated
after each step of scan. If we write : after each step of scan. If we write :
.. code-block:: python .. code-block:: python
b = a+1 b = a+1
c = updates[a] + 1 c = updates[a] + 1
f = theano.function([], [b,c], updates = updates) f = theano.function([], [b,c], updates = updates)
print b print b
print c print c
print a.value print a.value
We will see that because ``b`` does not use the updated version of We will see that because ``b`` does not use the updated version of
``a``, it will be 2, ``c`` will be 12, while ``a.value`` is ``11``. ``a``, it will be 2, ``c`` will be 12, while ``a.value`` is ``11``.
If we call the function again, ``b`` will become 12, ``c`` will be 22 If we call the function again, ``b`` will become 12, ``c`` will be 22
and ``a.value`` 21. and ``a.value`` 21.
If we do not pass the ``updates`` dictionary to the function, then If we do not pass the ``updates`` dictionary to the function, then
``a.value`` will always remain 1, ``b`` will always be 2 and ``c`` ``a.value`` will always remain 1, ``b`` will always be 2 and ``c``
will always be ``12``. will always be ``12``.
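The numbers quoted above follow from one rule: outputs are computed from the current value of the shared variable, and only then are the updates applied. The plain-Python model below illustrates that rule. ``Shared`` and ``make_function`` are hypothetical stand-ins, not Theano's classes, and ordinary functions play the role of symbolic expressions.

```python
class Shared:
    """Hypothetical stand-in for a shared variable."""
    def __init__(self, value):
        self.value = value


def make_function(outputs, updates):
    """Build a callable: evaluate outputs first, then apply the updates."""
    def call():
        results = [out() for out in outputs]
        new_values = {s: expr() for s, expr in updates.items()}
        for s, value in new_values.items():
            s.value = value
        return results
    return call


a = Shared(1)

def a_after_scan():
    # Ten steps of a -> a + 1, as in the scan call above.
    return a.value + 10

def b():
    return a.value + 1

def c():
    return a_after_scan() + 1


f = make_function([b, c], {a: a_after_scan})
first = f()    # evaluated with a.value == 1
second = f()   # evaluated with a.value == 11
```

Building the function with an empty ``updates`` dictionary instead would leave ``a.value`` at 1, so every call would return the same pair.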
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
"Thank YOU for correcting it so quickly. I wish all packages I worked "Thank YOU for correcting it so quickly. I wish all packages I worked
with would have such an active maintenance - this is as good as it with would have such an active maintenance - this is as good as it
gets :-)" gets :-)"
-- Jan Antolik, [theano-users] strange behaviour, Mon, Aug 2, 2010 at 1:36 PM -- Jan Antolik, [theano-users] strange behaviour, Mon, Aug 2, 2010 at 1:36 PM
------------------------- -------------------------
......
...@@ -4,7 +4,7 @@ Proposal for New Linking Strategy supporting Lazy Evaluation: Op.make_thunk ...@@ -4,7 +4,7 @@ Proposal for New Linking Strategy supporting Lazy Evaluation: Op.make_thunk
============================================================================= =============================================================================
.. note:: .. note::
Proposal made June 2010. Proposal made June 2010.
...@@ -103,17 +103,17 @@ in the PureOp class (a superclass of Op) that will return a Thunk. ...@@ -103,17 +103,17 @@ in the PureOp class (a superclass of Op) that will return a Thunk.
class PureOp(object): # recall: class PureOp(object): # recall:
# Op inherits from PureOp # Op inherits from PureOp
def make_node(self, *inputs): # leave alone def make_node(self, *inputs): # leave alone
... ...
def perform(self, node, def perform(self, node,
inputs, output_storage): # move to `Op` class inputs, output_storage): # move to `Op` class
... ...
def make_thunk(self, node, # new function def make_thunk(self, node, # new function
input_computed, output_computed, input_computed, output_computed,
input_registers, output_registers, input_registers, output_registers,
): ):
""" """
:type node: Apply instance :type node: Apply instance
:param node: previous rval from make_node(self, *inputs) :param node: previous rval from make_node(self, *inputs)
...@@ -137,7 +137,7 @@ in the PureOp class (a superclass of Op) that will return a Thunk. ...@@ -137,7 +137,7 @@ in the PureOp class (a superclass of Op) that will return a Thunk.
:param input_registers: the i'th input can be read from :param input_registers: the i'th input can be read from
input_registers[i][0] when input_computed[i][0] == 1. input_registers[i][0] when input_computed[i][0] == 1.
:param output_registers: the i'th output must be stored to :param output_registers: the i'th output must be stored to
output_registers[i][0], at which point the thunk must set output_computed[i][0] == 1. output_registers[i][0], at which point the thunk must set output_computed[i][0] == 1.
:returns: a Thunk (subclass) instance :returns: a Thunk (subclass) instance
......
...@@ -34,7 +34,7 @@ welcome on the mailing list. ...@@ -34,7 +34,7 @@ welcome on the mailing list.
1) Known errors should not be used to encode "feature wish lists", as 1) Known errors should not be used to encode "feature wish lists", as
is currently the case. is currently the case.
2) Incorrect results should raise errors and not known errors (this 2) Incorrect results should raise errors and not known errors (this
has always been the case) has always been the case)
3) All known errors should have a ticket and a reference to that 3) All known errors should have a ticket and a reference to that
ticket in the error message. ticket in the error message.
......
...@@ -21,7 +21,7 @@ version of your Theano file. ...@@ -21,7 +21,7 @@ version of your Theano file.
In the ModuleInstance, your symbolic variables have become containers (containing None), In the ModuleInstance, your symbolic variables have become containers (containing None),
and your Methods have become callable functions. and your Methods have become callable functions.
You should initialize the symbolic variables by calling You should initialize the symbolic variables by calling
``ModuleInstance.initialize()`` (although make() will call it for you, ``ModuleInstance.initialize()`` (although make() will call it for you,
on the top-level ModuleInstance.) on the top-level ModuleInstance.)
You can compile a Module several times, to create multiple ModuleInstances. You can compile a Module several times, to create multiple ModuleInstances.
...@@ -51,7 +51,7 @@ All of the elements of what is called the "module system" or "modules" are ...@@ -51,7 +51,7 @@ All of the elements of what is called the "module system" or "modules" are
components. components.
A component subclass represents a symbolic theano thing, and implements the A component subclass represents a symbolic theano thing, and implements the
``build`` function. ``build`` function.
The ``build`` function is responsible for converting the symbolic thing into a The ``build`` function is responsible for converting the symbolic thing into a
non-symbolic thing. non-symbolic thing.
...@@ -60,7 +60,7 @@ Compiling with make ...@@ -60,7 +60,7 @@ Compiling with make
------------------- -------------------
Conversion from a Component graph to a ComponentInstance graph is performed by `Component.make`. Conversion from a Component graph to a ComponentInstance graph is performed by `Component.make`.
This method traverses the Component graph in multiple passes. This method traverses the Component graph in multiple passes.
In the first pass (the allocate pass), it creates storage for all Variables that are contained in the graph (see In the first pass (the allocate pass), it creates storage for all Variables that are contained in the graph (see
`Component.allocate`). These are the module variables. `Component.allocate`). These are the module variables.
......
...@@ -73,7 +73,7 @@ Each key in the updates dictionary must be the name of an existing ``Member`` of ...@@ -73,7 +73,7 @@ Each key in the updates dictionary must be the name of an existing ``Member`` of
Inner Module Inner Module
------------ ------------
To share a ``Member`` between modules, the modules must be linked through the inner module mechanism. To share a ``Member`` between modules, the modules must be linked through the inner module mechanism.
Usage: Usage:
...@@ -140,7 +140,7 @@ Now, using ``Module``: ...@@ -140,7 +140,7 @@ Now, using ``Module``:
#m.dec = M.Method(n, [], updates = {c: m.c - n})#global c doesn't exist #m.dec = M.Method(n, [], updates = {c: m.c - n})#global c doesn't exist
#m.plus10 does not update the state #m.plus10 does not update the state
m.plus10 = M.Method([], m.c + 10) # m.c is always accessible since it is a member of this class m.plus10 = M.Method([], m.c + 10) # m.c is always accessible since it is a member of this class
inst = m.make(c = 0) # here, we make an "instance" of the module with c initialized to 0 inst = m.make(c = 0) # here, we make an "instance" of the module with c initialized to 0
assert inst.c == 0 assert inst.c == 0
inst.inc(2) inst.inc(2)
...@@ -148,7 +148,7 @@ Now, using ``Module``: ...@@ -148,7 +148,7 @@ Now, using ``Module``:
inst.dec(3) inst.dec(3)
assert inst.c == -1 assert inst.c == -1
assert inst.plus10() == 9 assert inst.plus10() == 9
Benefits of ``Module`` over ``function`` in this example: Benefits of ``Module`` over ``function`` in this example:
* There is no need to manipulate the containers directly * There is no need to manipulate the containers directly
* The fact inc and dec share a state is more obvious syntactically. * The fact inc and dec share a state is more obvious syntactically.
...@@ -170,8 +170,8 @@ Using function: ...@@ -170,8 +170,8 @@ Using function:
inc = theano.function([n, ((c, c + n), 0)], []) inc = theano.function([n, ((c, c + n), 0)], [])
dec = theano.function([n, ((c, c - n), inc.container[c])], [])#inc and dec share the same state. dec = theano.function([n, ((c, c - n), inc.container[c])], [])#inc and dec share the same state.
return inc,dec return inc,dec
inc1, dec1 = make_incdec_function() inc1, dec1 = make_incdec_function()
inc2, dec2 = make_incdec_function() inc2, dec2 = make_incdec_function()
a, b = T.scalars('ab') a, b = T.scalars('ab')
...@@ -193,7 +193,7 @@ Using Module: ...@@ -193,7 +193,7 @@ Using Module:
m.inc = M.Method(n, [], updates = {m.c: m.c + n}) # m.c <= m.c + n m.inc = M.Method(n, [], updates = {m.c: m.c + n}) # m.c <= m.c + n
m.dec = M.Method(n, [], updates = {m.c: m.c - n}) # m.c <= m.c - n m.dec = M.Method(n, [], updates = {m.c: m.c - n}) # m.c <= m.c - n
return m return m
m = M.Module() m = M.Module()
m.incdec1 = make_incdec_module() m.incdec1 = make_incdec_module()
m.incdec2 = make_incdec_module() m.incdec2 = make_incdec_module()
...@@ -227,12 +227,12 @@ Here is how we use the model: ...@@ -227,12 +227,12 @@ Here is how we use the model:
data_x = N.random.randn(4, 10) data_x = N.random.randn(4, 10)
data_y = [ [int(x)] for x in N.random.randn(4) > 0] data_y = [ [int(x)] for x in N.random.randn(4) > 0]
model = SoftmaxXERegression(regularize = False).make(input_size = 10, model = SoftmaxXERegression(regularize = False).make(input_size = 10,
target_size = 1, target_size = 1,
stepsize = 0.1) stepsize = 0.1)
for i in xrange(1000): for i in xrange(1000):
xe = model.update(data_x, data_y) xe = model.update(data_x, data_y)
if i % 100 == 0: if i % 100 == 0:
...@@ -240,7 +240,7 @@ Here is how we use the model: ...@@ -240,7 +240,7 @@ Here is how we use the model:
pass pass
#for inputs, targets in my_training_set(): #for inputs, targets in my_training_set():
#print "cost:", model.update(inputs, targets) #print "cost:", model.update(inputs, targets)
print "final weights:", model.w print "final weights:", model.w
print "final biases:", model.b print "final biases:", model.b
...@@ -255,12 +255,12 @@ Extending ``Methods`` ...@@ -255,12 +255,12 @@ Extending ``Methods``
model_module = SoftmaxXERegression(regularize = False) model_module = SoftmaxXERegression(regularize = False)
model_module.sum = T.scalar() # we add a module member to hold the sum model_module.sum = T.scalar() # we add a module member to hold the sum
model_module.update.updates.update(sum = model_module.sum + model_module.cost) # now update will also update sum! model_module.update.updates.update(sum = model_module.sum + model_module.cost) # now update will also update sum!
model = model_module.make(input_size = 4, model = model_module.make(input_size = 4,
target_size = 2, target_size = 2,
stepsize = 0.1, stepsize = 0.1,
sum = 0) # we mustn't forget to initialize the sum sum = 0) # we mustn't forget to initialize the sum
test = model.update([[0,0,1,0]], [[0,1]]) + model.update([[0,1,0,0]], [[1,0]]) test = model.update([[0,0,1,0]], [[0,1]]) + model.update([[0,1,0,0]], [[1,0]])
assert model.sum == test assert model.sum == test
......
...@@ -100,7 +100,7 @@ write an Op:** ...@@ -100,7 +100,7 @@ write an Op:**
raise NotImplementedError('only floatingpoint is implemented') raise NotImplementedError('only floatingpoint is implemented')
scalar_xlogx = XlogX(scalar.upgrade_to_float, name='scalar_xlogx') scalar_xlogx = XlogX(scalar.upgrade_to_float, name='scalar_xlogx')
xlogx = tensor.Elemwise(scalar_xlogx, name='xlogx') xlogx = tensor.Elemwise(scalar_xlogx, name='xlogx')
**It is also necessary to talk about UnaryScalarOp vs. BinaryOp.** **It is also necessary to talk about UnaryScalarOp vs. BinaryOp.**
UnaryScalarOp is the same as scalar.ScalarOp with member variable nin=1. UnaryScalarOp is the same as scalar.ScalarOp with member variable nin=1.
......
...@@ -77,7 +77,7 @@ squared difference between two matrices ``a`` and ``b`` at the same time: ...@@ -77,7 +77,7 @@ squared difference between two matrices ``a`` and ``b`` at the same time:
>>> diff_squared = diff**2 >>> diff_squared = diff**2
>>> f = function([a, b], [diff, abs_diff, diff_squared]) >>> f = function([a, b], [diff, abs_diff, diff_squared])
.. note:: .. note::
`dmatrices` produces as many outputs as names that you provide. It is a `dmatrices` produces as many outputs as names that you provide. It is a
shortcut for allocating symbolic variables that we will often use in the shortcut for allocating symbolic variables that we will often use in the
tutorials. tutorials.
...@@ -162,17 +162,17 @@ array([[ 0.25 , 0.19661193], ...@@ -162,17 +162,17 @@ array([[ 0.25 , 0.19661193],
The resulting function computes the gradient of its first argument The resulting function computes the gradient of its first argument
with respect to the second. In this way, Theano can be used for with respect to the second. In this way, Theano can be used for
`automatic differentiation <http://en.wikipedia.org/wiki/Automatic_differentiation>`_. `automatic differentiation <http://en.wikipedia.org/wiki/Automatic_differentiation>`_.
Contrary to what this page says, Theano does efficient symbolic differentiation Contrary to what this page says, Theano does efficient symbolic differentiation
even for functions with many inputs. even for functions with many inputs.
.. note:: .. note::
The second argument of ``T.grad`` can be a list, in which case the The second argument of ``T.grad`` can be a list, in which case the
output is also a list. The order in both lists is important: element output is also a list. The order in both lists is important: element
*i* of the output list is the gradient of the first argument of *i* of the output list is the gradient of the first argument of
``T.grad`` with respect to the *i*-th element of the list given as second argument. ``T.grad`` with respect to the *i*-th element of the list given as second argument.
The first argument of ``T.grad`` has to be a scalar (a tensor The first argument of ``T.grad`` has to be a scalar (a tensor
of size 1). For more information on the semantics of the arguments of of size 1). For more information on the semantics of the arguments of
``T.grad`` and details about the implementation, see :ref:`this <libdoc_gradient>`. ``T.grad`` and details about the implementation, see :ref:`this <libdoc_gradient>`.
...@@ -269,7 +269,7 @@ will replace the ``.value`` of each shared variable with the result of the ...@@ -269,7 +269,7 @@ will replace the ``.value`` of each shared variable with the result of the
corresponding expression". Above, our accumulator replaces the ``state``'s value with the sum corresponding expression". Above, our accumulator replaces the ``state``'s value with the sum
of the state and the increment amount. of the state and the increment amount.
Anyway, let's try it out! Anyway, let's try it out!
.. If you modify this code, also change : .. If you modify this code, also change :
.. theano/tests/test_tutorial.py:T_examples.test_examples_8 .. theano/tests/test_tutorial.py:T_examples.test_examples_8
...@@ -314,7 +314,7 @@ updates). Also, theano has more control over where and how shared variables are ...@@ -314,7 +314,7 @@ updates). Also, theano has more control over where and how shared variables are
allocated, which is one of the important elements of getting good performance allocated, which is one of the important elements of getting good performance
on the GPU. on the GPU.
It may happen that you expressed some formula using a shared variable, but It may happen that you expressed some formula using a shared variable, but
you do *not* want to use its value. In this case, you can use the you do *not* want to use its value. In this case, you can use the
``givens`` parameter of ``function`` which replaces a particular node in a graph ``givens`` parameter of ``function`` which replaces a particular node in a graph
for the purpose of one particular function. for the purpose of one particular function.
@@ -330,7 +330,7 @@ for the purpose of one particular function.
>>> skip_shared(1, 3) # we're using 3 for the state, not state.value
array(7)
>>> state.value # old state still there, but we didn't use it
array(0)
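
To see why the call above can return 7 while the stored state stays 0, here is a toy expression graph in plain Python. It is a sketch under assumptions, not Theano's implementation: the ``Var`` and ``BinOp`` classes and the ``evaluate`` function are hypothetical, and the formula ``state * 2 + inc`` is an assumption chosen to be consistent with the output shown, since the definition is not visible in this excerpt.

```python
import operator


class Var:
    """A named leaf in a toy expression graph."""

    def __init__(self, name):
        self.name = name


class BinOp:
    """An inner node applying a binary function to two sub-expressions."""

    def __init__(self, fn, left, right):
        self.fn, self.left, self.right = fn, left, right


def evaluate(node, env, givens=None):
    """Evaluate `node`, replacing any node found in `givens` beforehand."""
    givens = givens or {}
    node = givens.get(node, node)  # substitution happens for this call only
    if isinstance(node, BinOp):
        return node.fn(evaluate(node.left, env, givens),
                       evaluate(node.right, env, givens))
    if isinstance(node, Var):
        return env[node]
    return node  # a literal constant such as 2


state, inc, foo = Var('state'), Var('inc'), Var('foo')
# Assumed formula: state * 2 + inc
fn_of_state = BinOp(operator.add, BinOp(operator.mul, state, 2), inc)

stored = {state: 0, inc: 1}
with_state = evaluate(fn_of_state, stored)                  # 0 * 2 + 1
skip_state = evaluate(fn_of_state, {**stored, foo: 3},
                      givens={state: foo})                  # 3 * 2 + 1
```

The graph itself is never modified. The substitution is looked up while evaluating, so the value stored for ``state`` is still 0 afterwards.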

The ``givens`` parameter can be used to replace any symbolic variable, not just a
shared variable. You can replace constants and expressions in general. Be
@@ -340,7 +340,7 @@ the substitutions have to work in any order.
In practice, a good way of thinking about the ``givens`` is as a mechanism
that allows you to replace any part of your formula with a different
expression that evaluates to a tensor of the same shape and dtype. ``givens``

.. _using_random_numbers:
@@ -348,15 +348,15 @@ Using Random Numbers
====================

Because in Theano you first express everything symbolically and
afterwards compile this expression to get functions,
using pseudo-random numbers is not as straightforward as it is in
numpy, though also not too complicated.

The way to think about putting randomness into Theano's computations is
to put random variables in your graph. Theano will allocate a numpy
``RandomState`` object (a random number generator) for each such
variable, and draw from it as necessary. We will call this sort of
sequence of random numbers a *random stream*. *Random streams* are at
their core shared variables, so the observations on shared variables
hold here as well.
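
The idea of one generator per random variable can be illustrated with the standard library alone. This is a sketch, not Theano's implementation: the ``RandomVariable`` class is a hypothetical name, and Python's ``random.Random`` stands in for the numpy generator mentioned above.

```python
import random


class RandomVariable:
    """Toy random variable that owns its private generator (its stream)."""

    def __init__(self, seed):
        # The generator is mutable state that persists between draws,
        # much like the value held by a shared variable.
        self.rng = random.Random(seed)

    def draw(self):
        return self.rng.random()


rv_u = RandomVariable(seed=1)
rv_v = RandomVariable(seed=2)

u_first = rv_u.draw()
rv_v.draw()              # advancing one stream leaves the other untouched
u_second = rv_u.draw()

# Re-creating the stream with the same seed replays the same sequence.
replay = RandomVariable(seed=1)
replayed = [replay.draw(), replay.draw()]
```

Because each variable carries its own generator state, seeding or advancing one stream has no effect on the others.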
......
@@ -9,21 +9,21 @@ Mode
====

Every time :func:`theano.function <function.function>` is called,
the symbolic relationships between the input and output Theano *variables*
are optimized and compiled. The way this compilation occurs
is controlled by the value of the ``mode`` parameter.

Theano defines the following modes by name:

- ``'FAST_COMPILE'``: Apply just a few graph optimizations, but use C implementations where possible.
- ``'FAST_RUN'``: Apply all optimizations, and use C implementations where possible.
- ``'DEBUG_MODE'``: Verify the correctness of all optimizations, and compare C and Python
  implementations. This mode can take much longer than the other modes,
  but can identify many kinds of problems.
- ``'PROFILE_MODE'``: Same optimizations as ``FAST_RUN``, but print some profiling information.

The default mode is typically ``FAST_RUN``, but it can be controlled via
the configuration variable :attr:`config.mode`,
which can be overridden by passing the ``mode`` keyword argument to
:func:`theano.function <function.function>`.
@@ -43,14 +43,14 @@ PROFILE_MODE ``compile.profilemode.ProfileMode()``
Using DebugMode
===============

While normally you should use the ``FAST_RUN`` or ``FAST_COMPILE`` mode,
it is useful at first (especially when you are defining new kinds of
expressions or new optimizations) to run your code using the DebugMode
(available via ``mode='DEBUG_MODE'``). The DebugMode is designed to
do several self-checks and assertions that can help to diagnose
possible programming errors that can lead to incorrect output. Note that
``DEBUG_MODE`` is much slower than ``FAST_RUN`` or ``FAST_COMPILE``, so
use it only during development (not when you launch 1000 processes on a
cluster!).
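
One of the checks listed for ``DEBUG_MODE`` is comparing two implementations of the same computation. The plain-Python sketch below shows that single idea and why it costs extra time. It is not how DebugMode is implemented: ``debug_call``, ``ImplementationMismatch`` and the two ``times_ten_*`` functions are hypothetical names made up for this illustration.

```python
import math


class ImplementationMismatch(Exception):
    """Raised when two implementations disagree on the same input."""


def debug_call(reference_impl, fast_impl, *args, tol=1e-9):
    """Run both implementations and fail loudly if their results differ."""
    expected = reference_impl(*args)
    actual = fast_impl(*args)
    if not math.isclose(expected, actual, rel_tol=tol, abs_tol=tol):
        raise ImplementationMismatch(
            'inputs %r: reference gave %r, fast gave %r'
            % (args, expected, actual))
    return actual


def times_ten_reference(x):
    return 10 * x


def times_ten_buggy(x):
    # Deliberate bug: wrong answer for the input 0.
    return 10 * x if x != 0 else 1


ok = debug_call(times_ten_reference, times_ten_reference, 5)

try:
    debug_call(times_ten_reference, times_ten_buggy, 0)
    caught = False
except ImplementationMismatch:
    caught = True
```

Every call runs the computation twice and adds a comparison, which gives some intuition for why this kind of checking belongs in development rather than in large production runs.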
@@ -65,9 +65,9 @@ DebugMode is used as follows:
    f = theano.function([x], 10*x, mode='DEBUG_MODE')
    f([5])
    f([0])
    f([7])

If any problem is detected, DebugMode will raise an exception according to
@@ -92,8 +92,8 @@ is quite strict.
ProfileMode
===========

Besides checking for errors, another important task is to profile your
code. For this, Theano uses a special mode called ProfileMode, which has
to be passed as an argument to :func:`theano.function <function.function>`. Using the ProfileMode is a three-step process.

To make it the default, set the Theano flag ``mode`` to ``PROFILE_MODE``.
@@ -101,7 +101,7 @@ To change the default to it, put the theano flags mode to PROFILE_MODE.
Creating a ProfileMode Instance
-------------------------------

First create a ProfileMode instance.

>>> import theano
>>> from theano import ProfileMode
>>> profmode = ProfileMode(optimizer='fast_run', linker=theano.gof.OpWiseCLinker())
@@ -112,7 +112,7 @@ application. For example, a user wanting to profile the Python
implementation only should use the ``gof.PerformLinker`` (or "py" for
short). On the other hand, a user wanting to profile a graph using C
implementations wherever possible should use the ``gof.OpWiseCLinker``
(or "c|py"). For testing the speed of your code we would recommend
using the 'fast_run' optimizer and ``gof.OpWiseCLinker`` linker.
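
As a rough intuition for what profiling a computation step by step involves, the sketch below accumulates wall-clock time per named operation using only the standard library. It is not ProfileMode and does not reproduce its output: ``ToyProfiler`` and the operation names ``'dot'`` and ``'sigmoid'`` are assumptions made for this illustration.

```python
import math
import time
from collections import defaultdict


class ToyProfiler:
    """Accumulates call counts and elapsed seconds per operation name."""

    def __init__(self):
        self.seconds = defaultdict(float)
        self.calls = defaultdict(int)

    def run(self, name, fn, *args):
        start = time.perf_counter()
        try:
            return fn(*args)
        finally:
            # Record the timing even if the operation raises.
            self.seconds[name] += time.perf_counter() - start
            self.calls[name] += 1

    def summary(self):
        """Operations sorted from most to least total time."""
        return sorted(self.seconds.items(), key=lambda kv: kv[1], reverse=True)


profiler = ToyProfiler()
x = [0.5, 1.5, 2.5]
w = [0.1, -0.2, 0.3]

# Two steps of a logistic-regression prediction, timed separately.
activation = profiler.run(
    'dot', lambda a, b: sum(p * q for p, q in zip(a, b)), x, w)
probability = profiler.run(
    'sigmoid', lambda z: 1.0 / (1.0 + math.exp(-z)), activation)

report = profiler.summary()
```

A real profile is only meaningful when the timed implementations are the ones you intend to run, which is why the choice of optimizer and linker above matters.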

Compiling your Graph with ProfileMode
@@ -138,13 +138,13 @@ of its time.
This is best shown through an example.
Let's use the example of logistic
regression. (Code for this example is in the file
``benchmark/regression/regression.py``.)

Compiling the module with ProfileMode and calling ``profmode.print_summary()``
generates the following output:

.. code-block:: python

    """
    ProfileMode.print_summary()
    ---------------------------
......