Commit b968046f, authored by James Bergstra

updated basic tutorial to use shared variables

Parent d95468cb
@@ -59,13 +59,18 @@ Computing more than one thing at the same time
Theano supports functions with multiple outputs. For example, we can Theano supports functions with multiple outputs. For example, we can
compute the :term:`elementwise` difference, absolute difference, and compute the :term:`elementwise` difference, absolute difference, and
squared difference between two matrices ``x`` and ``y`` at the same time: squared difference between two matrices ``a`` and ``b`` at the same time:
>>> x, y = T.dmatrices('x', 'y') >>> a, b = T.dmatrices('a', 'b')
>>> diff = x - y >>> diff = a - b
>>> abs_diff = abs(diff) >>> abs_diff = abs(diff)
>>> diff_squared = diff**2 >>> diff_squared = diff**2
>>> f = function([x, y], [diff, abs_diff, diff_squared]) >>> f = function([a, b], [diff, abs_diff, diff_squared])
.. note::
``dmatrices`` produces as many outputs as names that you provide. It's a
shortcut for allocating symbolic variables that we will often use in the
tutorials.
When we use the function, it will return the three variables (the printing When we use the function, it will return the three variables (the printing
was reformatted for readability): was reformatted for readability):
@@ -78,9 +83,6 @@ was reformatted for readability):
array([[ 1., 0.], array([[ 1., 0.],
[ 1., 4.]])] [ 1., 4.]])]
Also note the call to ``dmatrices``. This is a shortcut, use it wisely
;)
Computing gradients Computing gradients
=================== ===================
@@ -153,40 +155,49 @@ one. You can do it like this:
>>> x, y = T.dscalars('x', 'y') >>> x, y = T.dscalars('x', 'y')
>>> z = x + y >>> z = x + y
>>> f = function([x, In(y, value = 1)], z) >>> f = function([x, Param(y, default=1)], z)
>>> f(33) >>> f(33)
array(34.0) array(34.0)
>>> f(33, 2) >>> f(33, 2)
array(35.0) array(35.0)
This makes use of the :ref:`In <function_inputs>` class which allows This makes use of the :ref:`Param <function_inputs>` class which allows
you to specify properties of your inputs with greater detail. Here we you to specify properties of your function's parameters with greater detail. Here we
give a default value of 1 for ``y`` by creating an In instance with give a default value of 1 for ``y`` by creating a ``Param`` instance with
its value field set to 1. its ``default`` field set to 1.
Inputs with default values should follow inputs without default Inputs with default values must follow inputs without default
values. There can be multiple inputs with default values. Defaults can values (as in Python functions). There can be multiple inputs with default values. These parameters can
be set positionally or by name, as in standard Python: be set positionally or by name, as in standard Python:
>>> x, y, w = T.dscalars('x', 'y', 'w') >>> x, y, w = T.dscalars('x', 'y', 'w')
>>> z = (x + y) * w >>> z = (x + y) * w
>>> f = function([x, In(y, value = 1), In(w, value = 2)], z) >>> f = function([x, Param(y, default=1), Param(w, default=2, name='w_by_name')], z)
>>> f(33) >>> f(33)
array(68.0) array(68.0)
>>> f(33, 2) >>> f(33, 2)
array(70.0) array(70.0)
>>> f(33, 0, 1) >>> f(33, 0, 1)
array(33.0) array(33.0)
>>> f(33, w=1) >>> f(33, w_by_name=1)
array(34.0) array(34.0)
>>> f(33, w=1, y=0) >>> f(33, w_by_name=1, y=0)
array(33.0) array(33.0)
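For comparison, here is the plain-Python analogue of the defaults shown above. It is only a sketch of the calling convention (positional or keyword defaults), using an ordinary Python function named ``f_plain`` that we introduce for illustration; it does not involve Theano at all.

```python
def f_plain(x, y=1, w=2):
    """Plain-Python counterpart of z = (x + y) * w with defaults for y and w."""
    return (x + y) * w

# Same calls as in the tutorial example, using ordinary Python semantics.
results = [
    f_plain(33),            # (33 + 1) * 2 = 68
    f_plain(33, 2),         # (33 + 2) * 2 = 70
    f_plain(33, 0, 1),      # (33 + 0) * 1 = 33
    f_plain(33, w=1),       # (33 + 1) * 1 = 34
    f_plain(33, w=1, y=0),  # (33 + 0) * 1 = 33
]
print(results)
```

The values match the Theano outputs above, apart from being Python integers rather than arrays.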
.. note::
``Param`` does not know the names of the local variables ``y`` and ``w``
that are passed as arguments. The symbolic variable objects have name
attributes (set by ``dscalars`` in the example above) and *these* are the
names of the keyword parameters in the functions that we build. This is
the mechanism at work in ``Param(y, default=1)``. In the case of ``Param(w,
default=2, name='w_by_name')``, we override the symbolic variable's name
attribute with a name to be used for this function.
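The naming rule described in the note can be sketched with a toy model. ``ToyVar`` and ``ToyParam`` below are hypothetical stand-ins that we invent for illustration; they are not Theano classes and only mimic the one behaviour discussed here: the keyword name falls back to the variable's ``name`` attribute unless an override is given.

```python
class ToyVar:
    """Hypothetical stand-in for a symbolic variable; it only carries a name."""
    def __init__(self, name):
        self.name = name


class ToyParam:
    """Toy parameter wrapper: optional default and optional keyword-name override."""
    _NO_DEFAULT = object()

    def __init__(self, variable, default=_NO_DEFAULT, name=None):
        self.variable = variable
        self.default = default
        # Fall back to the variable's own name attribute when no override is given.
        self.name = name if name is not None else variable.name


def keyword_names(inputs):
    """Return the keyword names a function built from `inputs` would accept."""
    params = [p if isinstance(p, ToyParam) else ToyParam(p) for p in inputs]
    return [p.name for p in params]


# The Python identifiers x, y, w are invisible to ToyParam; only .name matters.
x, y, w = ToyVar('x'), ToyVar('y'), ToyVar('w')
names = keyword_names([x, ToyParam(y, default=1),
                       ToyParam(w, default=2, name='w_by_name')])
print(names)  # ['x', 'y', 'w_by_name']
```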
.. _functionstateexample: .. _functionstateexample:
Making a function with state Including values in a symbolic graph
============================ ====================================
It is also possible to make a function with an internal state. For It is also possible to make a function with an internal state. For
example, let's say we want to make an accumulator: at the beginning, example, let's say we want to make an accumulator: at the beginning,
@@ -194,59 +205,87 @@ the state is initialized to zero. Then, on each function call, the state
is incremented by the function's argument. We'll also make it so that is incremented by the function's argument. We'll also make it so that
the increment has a default value of 1. the increment has a default value of 1.
First let's define the accumulator function: First let's define the ``accumulator`` function. It adds its argument to the
internal state, and returns the old state value.
>>> inc = T.scalar('inc')
>>> state = T.scalar('state_name') >>> state = shared(0)
>>> new_state = state + inc >>> inc = T.iscalar('inc')
>>> accumulator = function([In(inc, value = 1), In(state, value = 0, update = new_state)], new_state) >>> accumulator = function([inc], state, updates=[(state, state+inc)])
The first argument, as seen in the previous section, defines a default This code introduces a few new concepts. The ``shared`` function constructs
value of 1 for ``inc``. The second argument adds another argument to so-called *shared variables*. These are hybrid symbolic and non-symbolic
In, ``update``, which works as follows: every time ``accumulator`` is variables. Shared variables can be used in symbolic expressions just like
called, the value of the internal ``state`` will be replaced by the the objects returned by ``dmatrices(...)`` but they also have a ``.value``
value computed as ``new_state``. In this case, the state will be property that defines the value taken by this symbolic variable in *all* the
replaced by the result of incrementing it by ``inc``. functions that use it. It is called a *shared* variable because its value is
shared between many functions. We'll come back to this soon.
.. We recommend (insist?) that internal state arguments occur after any plain
arguments and arguments with default values. The other new thing in this code is the ``updates`` parameter of function.
``updates`` is a list of pairs of the form (shared-variable, new expression).
There is no limit to how many states you can have and you can name It can also be a dictionary whose keys are shared-variables and values are
them however you like as long as the name does not conflict with the the new expressions. Either way, it means "whenever this function runs, it
names of other inputs. will replace the ``.value`` of each shared variable with the result of the
corresponding expression". Above, our accumulator replaces the ``state``'s value with the sum
Anyway, let's try it out! The state can be accessed using the square of the state and the increment amount.
brackets notation ``[]``. You may access the state either by using
the :ref:`variable` representing it or the name of that Anyway, let's try it out!
:ref:`variable`. In our example we can access the state either with the
``state`` object or the string 'state_name'. >>> state.value
array(0)
>>> accumulator[state] >>> accumulator(1)
array(0.0) array(0)
>>> accumulator['state_name'] >>> state.value
array(0.0) array(1)
Here we use the accumulator and check that the state is correct each
time:
>>> accumulator()
array(1.0)
>>> accumulator['state_name']
array(1.0)
>>> accumulator(300) >>> accumulator(300)
array(301.0) array(1)
>>> accumulator['state_name'] >>> state.value
array(301.0) array(301)
It is possible to reset the state. This is done It is possible to reset the state. Just assign to the ``.value`` property:
by assigning to the state using the square brackets
notation: >>> state.value = -1
>>> accumulator(3)
>>> accumulator['state_name'] = 5 array(-1)
>>> accumulator(0.9) >>> state.value
array(5.9000000000000004) array(2)
>>> accumulator['state_name']
array(5.9000000000000004) As we mentioned above, you can define more than one function to use the same
shared variable. These functions can both update the value.
>>> decrementor = function([inc], state, updates=[(state, state-inc)])
>>> decrementor(2)
array(2)
>>> state.value
array(0)
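To make the behaviour of the session above concrete, here is a pure-Python toy model of a value shared by two functions. ``ToyShared`` and ``toy_function`` are hypothetical helpers written for this sketch, not Theano's implementation. In this toy, each call returns the output computed from the old state and only then stores the new values.

```python
class ToyShared:
    """Hypothetical stand-in for a shared variable: just a mutable box."""
    def __init__(self, value):
        self.value = value


def toy_function(output, updates):
    """Return a callable that evaluates `output`, then applies `updates`.

    `output` is a callable of the argument; `updates` is a list of
    (shared, new_value_fn) pairs. In this toy, all new values are computed
    from the old state before any of them is stored.
    """
    def call(arg):
        result = output(arg)
        new_values = [(shared, fn(arg)) for shared, fn in updates]
        for shared, value in new_values:
            shared.value = value
        return result
    return call


state = ToyShared(0)
accumulator = toy_function(lambda inc: state.value,
                           [(state, lambda inc: state.value + inc)])
decrementor = toy_function(lambda inc: state.value,
                           [(state, lambda inc: state.value - inc)])

history = [accumulator(1), accumulator(300)]  # returns old states 0 and 1
state.value = -1                              # reset the state by assignment
history.append(accumulator(3))                # returns -1, state becomes 2
history.append(decrementor(2))                # returns 2, state becomes 0
print(history, state.value)
```

Both callables read and write the same box, which is the sense in which the value is shared.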
You might be wondering why the updates mechanism exists. You can always
achieve a similar effect by returning the new expressions, and working with
them in NumPy as usual. The updates mechanism can be a syntactic convenience,
but it is mainly there for efficiency. Updates to shared variables can
sometimes be done more quickly using in-place algorithms (e.g. low-rank matrix
updates). Also, Theano has more control over where and how shared variables are
allocated, which is one of the important elements of getting good performance
on the GPU.
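The difference between returning a new value and updating in place can be illustrated with plain Python lists. This is only an analogy for the efficiency argument above; the helper names are made up for the sketch, and it says nothing about how Theano actually implements its updates.

```python
def add_out_of_place(state, delta):
    """Build and return a brand-new list; the caller must store it."""
    return [s + d for s, d in zip(state, delta)]


def add_in_place(state, delta):
    """Overwrite the existing list's elements; no new list is allocated."""
    for i, d in enumerate(delta):
        state[i] += d


state = [1.0, 2.0, 3.0]
delta = [0.5, 0.5, 0.5]

copied = add_out_of_place(state, delta)  # a second list now exists
same_object_before = id(state)
add_in_place(state, delta)               # the original list is reused
same_object_after = id(state)

print(copied, state, same_object_before == same_object_after)
```

The out-of-place version allocates a fresh container on every call, while the in-place version reuses the existing storage.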
It may happen that you have constructed a symbolic graph on top of a
shared variable, but you do *not* want to use its value. In this case, you can use the
``givens`` parameter of ``function`` which replaces a particular node in a graph
for the purpose of one particular function.
>>> fn_of_state = state * 2 + inc
>>> non_shared_state = state.type()
>>> skip_shared = function([inc, non_shared_state], fn_of_state,
...                        givens=[(state, non_shared_state)])
>>> skip_shared(1, 3) # we're using 3 for the state, not state.value
array(7)
>>> state.value # old state still there, but we didn't use it
array(0)
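The idea of replacing a node for one particular function can be sketched on a tiny expression tree. The tuple-based tree and the helpers ``substitute`` and ``evaluate`` are assumptions made for this illustration; Theano's graphs and its ``givens`` handling are more involved.

```python
def substitute(expr, givens):
    """Return a copy of `expr` with every node found in `givens` replaced."""
    if expr in givens:
        return givens[expr]
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, substitute(left, givens), substitute(right, givens))
    return expr


def evaluate(expr, env):
    """Evaluate a tree whose nodes are names, numbers, or (op, left, right)."""
    if isinstance(expr, str):
        return env[expr]
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    a, b = evaluate(left, env), evaluate(right, env)
    return a + b if op == '+' else a * b


# fn_of_state = state * 2 + inc, written as a toy tree.
fn_of_state = ('+', ('*', 'state', 2), 'inc')
shared_values = {'state': 0}

# Swap the 'state' node for an ordinary input, for this one evaluation only.
replaced = substitute(fn_of_state, {'state': 'non_shared_state'})
result = evaluate(replaced, {'inc': 1, 'non_shared_state': 3})
print(result, shared_values['state'])  # 7 0
```

The original tree and the stored shared value are left untouched; only the copy used for this evaluation sees the replacement.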
The ``givens`` parameter can be used to replace any symbolic variable, not just a
shared variable. You can replace constants, and expressions in general. Be
careful, though, not to let the expressions introduced by a ``givens``
substitution depend on one another: the order of substitution is not defined, so
the substitutions have to work in any order.
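A small toy shows why co-dependent substitutions are fragile. The sketch below applies replacements one after another on a tuple-based expression tree; this sequential strategy and the helper names are assumptions for illustration, not a description of what Theano does internally. When one replacement mentions a variable that another replacement targets, the two orders give different graphs.

```python
def substitute_one(expr, target, replacement):
    """Replace every occurrence of `target` in a toy expression tree."""
    if expr == target:
        return replacement
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op,
                substitute_one(left, target, replacement),
                substitute_one(right, target, replacement))
    return expr


def apply_in_order(expr, pairs):
    """Apply (target, replacement) pairs sequentially, in the order given."""
    for target, replacement in pairs:
        expr = substitute_one(expr, target, replacement)
    return expr


expr = ('+', 'a', 'b')

# Co-dependent: the replacement for 'a' mentions 'b', which is also replaced.
codependent = [('a', ('*', 'b', 2)), ('b', 'c')]
forward = apply_in_order(expr, codependent)
backward = apply_in_order(expr, list(reversed(codependent)))

# Independent: neither replacement mentions a replaced variable.
independent = [('a', 'p'), ('b', 'q')]
ind_forward = apply_in_order(expr, independent)
ind_backward = apply_in_order(expr, list(reversed(independent)))

print(forward != backward, ind_forward == ind_backward)  # True True
```

Keeping each replacement free of the variables being replaced makes the result the same in either order.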
Mode Mode
......