Commit e504b1c3 authored by James Bergstra

scan - reformatted docsting

Parent 11829243
@@ -68,124 +68,114 @@ def scan(fn, sequences, initial_states, non_sequences, inplace_map={}, \
         mode = None):
    '''Function that constructs and applies a Scan op
    :param fn:
        Function that describes the operations involved in one step of scan.
        Given variables representing all the slices of input and past values of
        outputs and other non-sequence parameters, ``fn`` should produce
        variables describing the output of one time step of scan. The order in
        which the arguments to this function are given is very important. You
        should have the following order:

        * all time slices of the first sequence (as given in the
          ``sequences`` list) ordered chronologically
        * all time slices of the second sequence (as given in the
          ``sequences`` list) ordered chronologically
        * ...
        * all time slices of the first output (as given in the
          ``initial_states`` list) ordered chronologically
        * all time slices of the second output (as given in the
          ``initial_states`` list) ordered chronologically
        * ...
        * all other parameters over which scan doesn't iterate, given
          in the same order as in ``non_sequences``

        If you are using shared variables over which you do not want to
        iterate, you do not need to provide them as arguments to ``fn``, though
        you can if you wish. The function should return the outputs after each
        step plus the updates for any of the shared variables. You can return
        only outputs or only updates. If you have both outputs and updates, the
        function should return them as a tuple: (outputs, updates) or
        (updates, outputs).

        Outputs can be just a Theano expression if you have only one output, or
        a list of Theano expressions. Updates can be given either as a list or
        as a dictionary. If you have a list of outputs, the order of these
        should match that of their ``initial_states``.

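The argument order described for ``fn`` can be sketched in plain Python. This is only an illustration of the ordering rule, not Theano's implementation; ``build_step_args`` and the string placeholders are hypothetical names introduced here.

```python
def build_step_args(sequence_slices, output_slices, non_sequences):
    """Flatten the per-step arguments in the order the docstring describes.

    sequence_slices: one list of time slices per sequence, each list
                     already ordered chronologically.
    output_slices:   one list of past time slices per output, each list
                     already ordered chronologically.
    non_sequences:   the parameters scan does not iterate over.
    """
    args = []
    for slices in sequence_slices:   # sequences first, in list order
        args.extend(slices)
    for slices in output_slices:     # then past outputs, in list order
        args.extend(slices)
    args.extend(non_sequences)       # non-sequences last
    return args


# Placeholders stand in for symbolic variables.
step_args = build_step_args(
    sequence_slices=[["x[t-1]", "x[t]"], ["u[t]"]],
    output_slices=[["y[t-2]", "y[t-1]"]],
    non_sequences=["W"],
)
```

Under these assumptions, a step function for this configuration would be declared as ``fn(x_tm1, x_t, u_t, y_tm2, y_tm1, W)``.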
    :param sequences:
        list of Theano variables over which scan needs to iterate.

    :param initial_states:
        list of Theano variables containing the initial state used for the
        output. Note that if the function applied recursively uses only the
        previous value of the output or none, this initial state should have
        the same shape as one time step of the output; otherwise, the initial
        state should have the same number of dimensions as the output. This is
        easily understood through an example. For computing ``y[t]`` let us
        assume that we need ``y[t-1]``, ``y[t-2]`` and ``y[t-4]``. Through an
        abuse of notation, when ``t = 0``, we would need values for ``y[-1]``,
        ``y[-2]`` and ``y[-4]``. These values are provided by the initial state
        of ``y``, which should have the same number of dimensions as ``y``,
        where the first dimension should be large enough to cover all past
        values, which in this case is 4. If ``init_y`` is the variable
        containing the initial state of ``y``, then ``init_y[0]`` corresponds
        to ``y[-4]``, ``init_y[1]`` corresponds to ``y[-3]``, ``init_y[2]``
        corresponds to ``y[-2]``, and ``init_y[3]`` corresponds to ``y[-1]``.
        By default, scan is set to use the last time step for each output.

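The mapping between ``init_y`` and the "negative" time steps can be checked with a small pure-Python sketch. It assumes a scalar recurrence ``y[t] = y[t-1] + y[t-2] + y[t-4]`` chosen only for illustration; ``run_recurrence`` is a hypothetical helper and does not use Theano.

```python
def run_recurrence(init_y, n_steps, taps=(-1, -2, -4)):
    """Iterate y[t] = sum(y[t + k] for k in taps), seeding y[-depth..-1]
    from init_y, so that init_y[0] is the oldest past value."""
    depth = len(init_y)
    # The initial state must reach back as far as the deepest tap.
    assert depth >= max(-k for k in taps), "init_y too short for taps"
    history = list(init_y)  # history[depth + t] holds y[t]
    for t in range(n_steps):
        past = [history[depth + t + k] for k in taps]
        history.append(sum(past))
    return history[depth:]  # drop the initial state, keep y[0..n_steps-1]


# init_y[0] -> y[-4], init_y[1] -> y[-3], init_y[2] -> y[-2], init_y[3] -> y[-1]
ys = run_recurrence([1, 2, 3, 4], n_steps=5)
```

With this seed, ``y[0]`` is ``y[-1] + y[-2] + y[-4] = 4 + 3 + 1``, which shows that ``init_y[3]`` plays the role of ``y[-1]`` and ``init_y[0]`` the role of ``y[-4]``.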
    :param non_sequences:
        Parameters over which scan should not iterate. These parameters are
        given at each time step to the function applied recursively.

    :param inplace_map:
        Dictionary describing outputs computed *inplace*. ``inplace_map`` is a
        dictionary where keys are output indexes, and values are sequence
        indexes. Assigning a value ``j`` to a key ``i`` means that output
        number ``i`` will be computed inplace (in the same memory buffer) as
        the input number ``j``.

    :param sequences_taps:
        Dictionary describing what slices of the input sequences scan should
        use. At each step of the iteration you can use different slices of your
        input sequences (called taps here), and this dictionary lets you define
        exactly that. The keys of the dictionary are sequence indexes, the
        values are lists of numbers. Having the entry ``i : [k_1,k_2,k_3]``
        means that at step ``t``, for sequence ``x``, which has the index ``i``
        in the list of sequences, you would use the values ``x[t+k_1]``,
        ``x[t+k_2]`` and ``x[t+k_3]``. The ``k_1``, ``k_2``, ``k_3`` values can
        be positive or negative, and the sequence for which you request these
        taps should be large enough to accommodate them. If, in chronological
        order, ``k`` is the first past value of sequence ``x``, then index 0 of
        ``x`` will correspond to step ``k`` (if ``k`` is -3, then, abusing
        notation, ``x[0]`` will be seen by scan as ``x[-3]``). If you do not
        want to use any taps for a given sequence, you need to set the
        corresponding entry in the dictionary to the empty list. By default,
        for each sequence that is not represented in the dictionary, scan will
        assume that at every step it needs to provide the current value of that
        sequence.

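The tap indexing rule can be illustrated on a plain list. This is a sketch under stated assumptions, not Theano's code: ``slices_for_steps`` is a hypothetical helper, and the way the number of steps is derived from the sequence length (dropping positions that the earliest and latest taps cannot reach) is an assumption made for the example.

```python
def slices_for_steps(x, taps):
    """Return, for each step t, the tuple (x[t+k_1], x[t+k_2], ...).

    If the earliest tap k is negative, buffer index 0 of x is treated
    as step k, so every lookup is shifted by -k.
    """
    if not taps:
        return []  # an empty tap list means the sequence is not used
    offset = max(0, -min(taps))    # shift so the earliest tap hits index 0
    lookahead = max(0, max(taps))  # room needed for positive taps
    n_steps = len(x) - offset - lookahead
    return [
        tuple(x[t + k + offset] for k in taps)
        for t in range(n_steps)
    ]


# Entry ``0 : [-3, -1, 0]`` for a five-element sequence.
x = [10, 20, 30, 40, 50]
per_step = slices_for_steps(x, taps=[-3, -1, 0])
```

Here step 0 reads buffer positions 0, 2 and 3, which is the docstring's "``x[0]`` will be seen by scan as ``x[-3]``" convention.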
    :param outputs_taps:
        Dictionary describing what slices of the outputs scan should use. The
        ``outputs_taps`` are defined in an analogous way to ``sequences_taps``,
        except that the taps are for the outputs generated by scan. As such
        they can only be negative, i.e. refer to past values of outputs. By
        default, scan will expect to use the last time step for any output, if
        nothing else is specified.

    :param n_steps:
        Number of steps to iterate. Sometimes you want to either enforce a fixed
        number of steps, or you might not even have any sequences you want to
        iterate over, but rather just to repeat some computation for a fixed
        number of steps. It can be a theano scalar or a number.

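The "no sequences, fixed number of steps" case amounts to repeating one computation and feeding each result back in. A minimal pure-Python sketch of that behaviour follows; ``repeat_steps`` is a hypothetical name and the code does not call Theano.

```python
def repeat_steps(step, initial_state, n_steps):
    """Apply ``step`` n_steps times, feeding each output back in as the
    previous state, and collect the output of every step."""
    outputs = []
    state = initial_state
    for _ in range(n_steps):
        state = step(state)
        outputs.append(state)
    return outputs


# Repeated doubling, starting from 1, for three steps.
doubled = repeat_steps(lambda y_prev: 2 * y_prev, initial_state=1, n_steps=3)
```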
    :param truncate_gradient:
        Number of steps to use in truncated BPTT. If you compute gradients
        through a scan op, they are computed using backpropagation through time.
        By providing a value other than -1, you choose to use truncated BPTT
        instead of classical BPTT, where you only do ``truncate_gradient``
        number of steps. (NOT YET IMPLEMENTED)

    :param go_backwards:
        Flag indicating if you should go backwards through the sequences

    :rtype: tuple
    :return: tuple of the form (outputs, updates); ``outputs`` is either a
        Theano variable or a list of Theano variables representing the
......