Commit a5641dce, authored by Razvan Pascanu

Fixed all warnings/errors with scan documentation

Parent cb7bbb7c
@@ -12,10 +12,9 @@ function over a list, given an initial state of ``z=0``.
Special cases:

* A ``reduce()`` operation can be performed by returning only the last
  output of a ``scan``.
* A ``map()`` operation can be performed by applying a function that
  ignores each previous output.

Often a for loop can be expressed as a ``scan()`` operation, and ``scan`` is
@@ -69,98 +68,130 @@ def scan(fn, sequences, initial_states, non_sequences, inplace_map={}, \
         mode = None):
    '''Function that constructs and applies a Scan op

    :param fn: Function that describes the operations involved in one step
        of scan. Given variables representing all the slices of input
        and past values of outputs and other non-sequence parameters,
        ``fn`` should produce variables describing the output of one
        time step of scan. The order in which the arguments to this
        function are given is very important. They should appear in the
        following order:

        * all time slices of the first sequence (as given in the
          ``sequences`` list) ordered chronologically
        * all time slices of the second sequence (as given in the
          ``sequences`` list) ordered chronologically
        * ...
        * all time slices of the first output (as given in the
          ``initial_states`` list) ordered chronologically
        * all time slices of the second output (as given in the
          ``initial_states`` list) ordered chronologically
        * ...
        * all other parameters over which scan doesn't iterate, given
          in the same order as in ``non_sequences``

        If you are using shared variables over which you do not want to
        iterate, you do not need to provide them as arguments to
        ``fn``, though you can if you wish. The function should
        return the outputs after each step plus the updates for any of
        the shared variables. You can return only outputs or
        only updates. If you have both outputs and updates, the
        function should return them as a tuple: (outputs, updates)
        or (updates, outputs).

        Outputs can be a single Theano expression if you have only one
        output, or a list of Theano expressions. Updates can be given
        either as a list or as a dictionary. If you have a list of
        outputs, their order should match that of their
        ``initial_states``.

    :param sequences: list of Theano variables over which scan needs to
        iterate.

    :param initial_states: list of Theano variables containing the initial
        state used for the output. Note that if the
        function applied recursively uses only the
        previous value of the output or none, this
        initial state should have the same shape
        as one time step of the output; otherwise, the
        initial state should have the same number of
        dimensions as the output. This is most easily
        understood through an example. For computing
        ``y[t]``, let us assume that we need ``y[t-1]``,
        ``y[t-2]`` and ``y[t-4]``. Through an abuse of
        notation, when ``t = 0``, we would need values for
        ``y[-1]``, ``y[-2]`` and ``y[-4]``. These values
        are provided by the initial state of ``y``, which
        should have the same number of dimensions as ``y``,
        where the first dimension should be large enough
        to cover all past values, which in this case is 4.
        If ``init_y`` is the variable containing the
        initial state of ``y``, then ``init_y[0]``
        corresponds to ``y[-4]``, ``init_y[1]``
        corresponds to ``y[-3]``, ``init_y[2]``
        corresponds to ``y[-2]``, and ``init_y[3]``
        corresponds to ``y[-1]``. By default, scan is set
        to use the last time step for each output.

    :param non_sequences: Parameters over which scan should not iterate.
        These parameters are given at each time step to
        the function applied recursively.

    :param inplace_map: Dictionary describing outputs computed *inplace*.
        ``inplace_map`` is a dictionary where keys are
        output indexes, and values are sequence indexes.
        Assigning a value ``j`` to a key ``i`` means that
        output number ``i`` will be computed inplace (in the
        same memory buffer) as the input number ``j``.

    :param sequences_taps: Dictionary describing what slices of the input
        sequences scan should use. At each step of the
        iteration you can use different slices of your
        input sequences (called taps here), and this
        dictionary lets you define exactly that. The
        keys of the dictionary are sequence indexes and
        the values are lists of numbers. An entry
        ``i : [k_1,k_2,k_3]`` means that
        at step ``t``, for the sequence ``x`` that has
        index ``i`` in the list of sequences, you would
        use the values ``x[t+k_1]``, ``x[t+k_2]`` and
        ``x[t+k_3]``. The ``k_1``, ``k_2``, ``k_3`` values
        can be positive or negative, and the sequence for
        which you request these taps should be large enough
        to accommodate them. If, in chronological order,
        ``k`` is the first past value of sequence ``x``,
        then index 0 of ``x`` will correspond to step ``k``
        (if ``k`` is -3, then, abusing notation, ``x[0]``
        will be seen by scan as ``x[-3]``). If you do not
        want to use any taps for a given sequence, you need
        to set the corresponding entry in the dictionary
        to the empty list. By default, for each sequence
        that is not represented in the dictionary, scan
        will assume that at every step it needs to
        provide the current value of that sequence.

    :param outputs_taps: Dictionary describing what slices of the
        outputs scan should use. The ``outputs_taps`` are
        defined in an analogous way to ``sequences_taps``,
        except that the taps are for the outputs generated by
        scan. As such they can only be negative, i.e. refer
        to past values of outputs. By default, scan will
        use the last time step of every output if
        nothing else is specified.

    :param n_steps: Number of steps to iterate. Sometimes you want to
        enforce a fixed number of steps, or you might not even
        have any sequences you want to iterate over, but rather
        just want to repeat some computation for a fixed number of
        steps. ``n_steps`` gives you this possibility. It can be
        a Theano scalar or a number.

    :param truncate_gradient: Number of steps to use in truncated BPTT.
        If you compute gradients through a scan op,
        they are computed using backpropagation through
        time. By providing a value other than -1,
        you choose to use truncated BPTT instead of
        classical BPTT, where you only do
        ``truncate_gradient`` number of steps.

    :param go_backwards: Flag indicating if you should go backwards through
        the sequences

    :rtype: tuple
    :return: tuple of the form (outputs, updates); ``outputs`` is either a
        Theano variable or a list of Theano variables representing the
        outputs of scan. ``updates`` is a dictionary specifying the
        update rules for all shared variables used in the scan
        operation; this dictionary should be passed to ``theano.function``
    '''
    # check if inputs are just single variables instead of lists
@@ -348,30 +379,30 @@ class Scan(theano.Op):
                 go_backwards = False, stored_steps_output = {},
                 mode = 'FAST_RUN', inplace=False):
        '''
        :param (inputs,outputs, givens): inputs and outputs Theano variables
            that describe the function that is
            applied recursively; the givens
            list is used to replace shared
            variables with non-shared ones
        :param n_seqs: number of sequences over which scan will have to
            iterate
        :param n_outs: number of outputs of the scan op
        :param inplace_map: see scan function above
        :param seqs_taps: see scan function above
        :param outs_taps: see scan function above
        :param truncate_gradient: number of steps after which scan should
            truncate; -1 implies no truncation
        :param go_backwards: see scan function above
        :param stored_steps_output: a list of booleans of same size as the
            number of outputs; the value at position
            ``i`` in the list corresponds to the
            ``i-th`` output, and it tells how many
            steps (from the end towards the beginning)
            of the outputs you really need and should
            return; given this information, scan can
            know (if possible) to allocate only
            the amount of memory needed to compute
            that many entries
        '''
......
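The tap and initial-state conventions described in the ``scan`` docstring can be modeled in a few lines of plain Python. This is only a sketch under stated assumptions: ``scan_reference`` is a hypothetical helper, not part of Theano; it handles a single sequence and a single output, and it builds ordinary lists instead of a symbolic graph. It passes the current sequence value first and then the past outputs in chronological order, following the argument order the docstring asks of ``fn``.

```python
def scan_reference(fn, sequence, init_y, taps):
    """Plain-Python model of scan for one sequence and one output.

    `taps` lists the negative offsets of past outputs that `fn` needs,
    e.g. [-4, -2, -1]. `init_y` must hold as many entries as the deepest
    tap: init_y[0] plays the role of y[-depth], init_y[-1] that of y[-1].
    (Hypothetical helper for illustration; not Theano's API.)
    """
    taps = sorted(taps)                 # oldest tap first (chronological)
    depth = -taps[0] if taps else 0
    if len(init_y) != depth:
        raise ValueError("init_y must cover all past values")
    history = list(init_y)              # history[depth + t] holds y[t]
    for t, x_t in enumerate(sequence):
        past = [history[depth + t + k] for k in taps]
        history.append(fn(x_t, *past))
    return history[depth:]              # drop the initial state


# Running sum: each step uses only y[t-1]; keeping the last entry is a reduce().
running_sum = scan_reference(lambda x, y1: x + y1, [1, 2, 3, 4], [0], [-1])

# map(): the step function ignores all previous outputs, so no taps are needed.
squares = scan_reference(lambda x: x * x, [1, 2, 3, 4], [], [])

# y[t] = x[t] + y[t-4] + y[t-2] + y[t-1], with init_y = [y[-4], y[-3], y[-2], y[-1]].
recurrence = scan_reference(
    lambda x, y4, y2, y1: x + y4 + y2 + y1, [0, 0, 0], [1, 2, 3, 4], [-4, -2, -1]
)
```

In the last call, ``init_y[1]`` (standing in for ``y[-3]``) is not used at ``t = 0`` but is needed at ``t = 1``, which is why the initial state must cover the full depth of 4 even though only three taps are requested.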