Commit 96676ed5
Authored Oct 14, 2010 by Razvan Pascanu

[scan][doc][coding-style] re-arranged the documentation of scan parameters

Parent: b15fadcc
Showing 1 changed file with 214 additions and 128 deletions.

theano/scan.py (+214 -128)
@@ -268,161 +268,247 @@ def foldr( fn
     # Yes, actually it will be exactly 2 ( if there are no other constraints)
-def scan(fn, sequences=[], outputs_info=[], non_sequences=[],
-         n_steps=None, truncate_gradient=-1, go_backwards=False,
-         mode=None, name=None):
-    """Function that constructs and applies a Scan op
+def scan( fn
+        , sequences = None
+        , outputs_info = None
+        , non_sequences = None
+        , n_steps = None
+        , truncate_gradient = -1
+        , go_backwards = False
+        , mode = None
+        , name = None ):
+    """
+    This function constructs and applies a Scan op to the provided
+    arguments.
     :param fn:
-        Function that describes the operations involved in one step of scan.
-        Given variables representing all the slices of input and past values of
-        outputs and other non sequences parameters, ``fn`` should produce
-        variables describing the output of one time step of scan. The order in
-        which the argument to this function are given is very important. You
-        should have the following order:
-
-        * all time slices of the first sequence (as given in the
-          ``sequences`` list) ordered in the same fashion as the time taps provided
-        * all time slices of the second sequence (as given in the
-          ``sequences`` list) ordered in the same fashion as the time taps provided
-        * ...
-        * all time slices of the first output (as given in the
-          ``initial_state`` list) ordered in the same fashion as the time taps provided
-        * all time slices of the second output (as given in the
-          ``initial_state`` list) ordered in the same fashion as the time taps provided
-        * ...
-        * all other parameters over which scan doesn't iterate ordered accordingly
-
-        If you are using shared variables over which you do not want to iterate,
-        you do not need to provide them as arguments to ``fn``, though you can if you
-        wish so. The function should return the outputs after each step plus the updates
-        for any of the shared variables. You can either return only outputs or only
-        updates. If you have both outputs and updates the function should return
-        them as a tuple : (outputs, updates) or (updates, outputs).
-
-        Outputs can be just a theano expression if you have only one output or
-        a list of theano expressions. Updates can be given either as a list of tuples or
-        as a dictionary. If you have a list of outputs, the order of these
-        should match that of their ``initial_states``.
+        ``fn`` is a function that describes the operations involved in one step
+        of ``scan``. ``fn`` should construct variables describing the output of
+        one iteration step. It should expect as input Theano variables
+        representing all the time slices of the input sequences and outputs,
+        and all other arguments given to scan as ``non_sequences``. The order
+        in which scan passes these variables to ``fn`` is the following:
+
+        * all time slices of the first sequence
+        * all time slices of the second sequence
+        * ...
+        * all time slices of the last sequence
+        * all time slices of the first output
+        * all time slices of the second output
+        * ...
+        * all time slices of the last output
+        * all other arguments (the list given as ``non_sequences`` to
+          scan)
+
+        The order of the sequences is the same as the one in the list
+        ``sequences`` given to scan. The order of the outputs is the same
+        as the order of ``outputs_info``. For any sequence or output the
+        order of the time slices is the same as the order of the time
+        taps provided. For example, if one writes the following:
+
+        .. code-block:: python
+
+            scan(fn, sequences = [ dict( Sequence1, taps = [-3,2,-1])
+                                 , Sequence2
+                                 , dict( Sequence3, taps = 3) ]
+                   , outputs_info = [ dict( Output1, taps = [-3,-5])
+                                    , dict( Output2, taps = None)
+                                    , Output3 ]
+                   , non_sequences = [ Argument1, Argument2])
+
+        ``fn`` should expect the following arguments in this given order:
+
+        #. ``Sequence1[t-3]``
+        #. ``Sequence1[t+2]``
+        #. ``Sequence1[t-1]``
+        #. ``Sequence2[t]``
+        #. ``Sequence3[t+3]``
+        #. ``Output1[t-3]``
+        #. ``Output1[t-5]``
+        #. ``Output3[t-1]``
+        #. ``Argument1``
+        #. ``Argument2``
+
+        The list of ``non_sequences`` can also contain shared variables
+        used in the function, though ``scan`` is able to figure those
+        out on its own so they can be skipped. For the clarity of the
+        code we recommend providing them to scan anyway.
+
+        The function is expected to return two things. One is a list of
+        outputs ordered in the same order as ``outputs_info``, with the
+        difference that there should be only one output variable per
+        output initial state (even if no tap value is used). Secondly,
+        ``fn`` should return an update dictionary (that tells how to
+        update any shared variable after each iteration step). The
+        dictionary can optionally be given as a list of tuples. There is
+        no constraint on the order of these two lists: ``fn`` can return
+        either ``(outputs_list, update_dictionary)`` or ``(update_dictionary,
+        outputs_list)`` or just one of the two (in case the other is
+        empty).

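The argument order that the new ``fn`` documentation describes can be sketched in plain Python. This is only an illustration and not part of Theano: `format_slice` and `expected_fn_arguments` are hypothetical helpers, sequences and outputs are represented by `(name, taps)` pairs, bare sequences are assumed to already carry the default taps `[0]`, bare outputs the default `[-1]`, and an output whose taps are `None` is assumed to contribute no argument (as in the docstring example).

```python
def format_slice(name, tap):
    """Render one time slice, e.g. ("x", -3) -> "x[t-3]" and ("x", 0) -> "x[t]"."""
    if tap == 0:
        return f"{name}[t]"
    return f"{name}[t{tap:+d}]"


def expected_fn_arguments(sequences, outputs_info, non_sequences):
    """List the arguments ``fn`` receives, in order (hypothetical helper)."""
    args = []
    # Sequences first, in list order; slices follow the order of the taps.
    for name, taps in sequences:
        args.extend(format_slice(name, tap) for tap in taps)
    # Then outputs, in list order; an output without taps adds nothing.
    for name, taps in outputs_info:
        args.extend(format_slice(name, tap) for tap in (taps or []))
    # Finally every non-sequence argument, unchanged.
    args.extend(non_sequences)
    return args


args = expected_fn_arguments(
    sequences=[("Sequence1", [-3, 2, -1]), ("Sequence2", [0]), ("Sequence3", [3])],
    outputs_info=[("Output1", [-3, -5]), ("Output2", None), ("Output3", [-1])],
    non_sequences=["Argument1", "Argument2"],
)
for position, arg in enumerate(args, start=1):
    print(position, arg)
```

Under these assumptions the printed list matches the ten numbered arguments given in the docstring example.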
     :param sequences:
-        list of Theano variables or dictionaries containing Theano variables over which
-        scan needs to iterate. The reason you might want to wrap a certain Theano
-        variable in a dictionary is to provide auxiliary information about how to iterate
-        over that variable. For example this is how you specify that you want to use
-        several time slices of this sequence at each iteration step. The dictionary
-        should have the following keys :
-
-        * ``input`` -- Theano variable representing the sequence
-        * ``taps`` -- temporal taps to use for this sequence. They are given as a list
-          of ints, where a value ``k`` means that at iteration step ``t`` scan needs to
-          provide also the slice ``t+k``. The order in which you provide these int values
-          here is the same order in which the slices will be provided to ``fn``.
-
-        If you do not wrap a variable around a dictionary, scan will do it for you, under
-        the assumption that you use only one slice, defined as a tap of offset 0. This
-        means that at step ``t`` scan will provide the slice at position ``t``.
+        ``sequences`` is the list of Theano variables or dictionaries
+        describing the sequences ``scan`` has to iterate over. If a
+        sequence is given wrapped in a dictionary, a set of optional
+        information can be provided about the sequence. The dictionary
+        should have the following keys:
+
+        * ``input`` (*mandatory*) -- Theano variable representing the
+          sequence.
+
+        * ``taps`` -- Temporal taps of the sequence required by ``fn``.
+          They are provided as a list of integers, where a value ``k`` implies
+          that at iteration step ``t`` scan will pass to ``fn`` the slice
+          ``t+k``. Default value is ``[0]``.
+
+        Any Theano variable in the list ``sequences`` is automatically
+        wrapped into a dictionary where ``taps`` is set to ``[0]``.

     :param outputs_info:
-        list of Theano variables or dictionaries containing Theano variables used
-        to initialize the outputs of scan. As before (for ``sequences``) the reason
-        you would wrap a Theano variable in a dictionary is to provide additional
-        information about how scan should deal with that specific output. The dictionary
-        should contain the following keys:
-
-        * ``initial`` -- Theano variable containing the initial state of the output
-        * ``taps`` -- temporal taps to use for this output. The taps are given as a
-          list of ints (only negative .. since you can not use future values of outputs),
-          with the same meaning as for ``sequences`` (see above).
-        * ``inplace`` -- theano variable pointing to one of the input sequences; this
-          flag tells scan that the output should be computed in the memory space occupied
-          by that input sequence. Note that scan will only do this if allowed by the
-          rest of your computational graph and if you are not using past taps of the
-          input.
-        * ``return_steps`` how many steps to return from your output. If not given, or
-          0 scan will return all steps, otherwise it will return the last ``return_steps``.
-          Note that if you set this to something else than 0, scan will try to be smart
-          about the amount of memory it allocates for a given input.
-
-        If the function applied recursively uses only the
-        previous value of the output, the initial state should have
-        same shape as one time step of the output; otherwise, the initial state
-        should have the same number of dimension as output. This is easily
-        understood through an example. For computing ``y[t]`` let us assume that we
-        need ``y[t-1]``, ``y[t-2]`` and ``y[t-4]``. Through an abuse of
-        notation, when ``t = 0``, we would need values for ``y[-1]``, ``y[-2]``
-        and ``y[-4]``. These values are provided by the initial state of ``y``,
-        which should have same number of dimension as ``y``, where the first
-        dimension should be large enough to cover all the required past values, which in
-        this case is 4. If ``init_y`` is the variable containing the initial state
-        of ``y``, then ``init_y[0]`` corresponds to ``y[-4]``, ``init_y[1]``
-        corresponds to ``y[-3]``, ``init_y[2]`` corresponds to ``y[-2]``,
-        ``init_y[3]`` corresponds to ``y[-1]``. The default behaviour of scan is
-        the following :
-
-        * if you do not wrap an output in a dictionary, scan will wrap it for you
-          assuming that you use only the last step of the output ( i.e. it makes your tap
-          value list equal to [-1]) and that it is not computed inplace
-        * if you wrap an output in a dictionary and you do not provide any taps but
-          you provide an initial state it will assume that you are using only a tap value
-          of -1
-        * if you wrap an output in a dictionary but you do not provide any initial state,
-          it assumes that you are not using any form of taps
-        * if you provide a ``None`` instead of a variable or a dictionary scan assumes
-          that you will not use any taps for this output (this would be the case for map)
-
-        If you did not provide any information for your outputs, scan will assume by
-        default that you are not using any taps for any of the outputs. If you provide
-        information for just a subset of outputs, scan will not know to which outputs
-        these correspond and will raise an error.
+        ``outputs_info`` is the list of Theano variables or dictionaries
+        describing the initial state of the outputs computed
+        recurrently. When these initial states are given as dictionaries,
+        optional information can be provided about the output corresponding
+        to these initial states. The dictionary should have the following
+        keys:
+
+        * ``initial`` -- Theano variable that represents the initial
+          state of a given output. In case the output is not computed
+          recursively (think of a map) and does not require an initial
+          state this field can be skipped. Given that only the previous
+          time step of the output is used by ``fn`` the initial state
+          should have the same shape as the output. If multiple time
+          taps are used, the initial state should have one extra
+          dimension that should cover all the possible taps. For example
+          if we use ``-5``, ``-2`` and ``-1`` as past taps, at step 0,
+          ``fn`` will require (by an abuse of notation) ``output[-5]``,
+          ``output[-2]`` and ``output[-1]``. This will be given by
+          the initial state, which in this case should have the shape
+          (5,)+output.shape. If this variable containing the initial
+          state is called ``init_y`` then ``init_y[0]`` *corresponds to*
+          ``output[-5]``, ``init_y[1]`` *corresponds to* ``output[-4]``,
+          ``init_y[2]`` corresponds to ``output[-3]``, ``init_y[3]``
+          corresponds to ``output[-2]``, ``init_y[4]`` corresponds to
+          ``output[-1]``. While this order might seem strange, it comes
+          naturally from splitting an array at a given point. Assume that
+          we have an array ``x``, and we choose ``k`` to be time step
+          ``0``. Then our initial state would be ``x[:k]``, while the
+          output will be ``x[k:]``. Looking at this split, elements in
+          ``x[:k]`` are ordered exactly like those in ``init_y``.
+        * ``taps`` -- Temporal taps of the output that will be passed to
+          ``fn``. They are provided as a list of *negative* integers,
+          where a value ``k`` implies that at iteration step ``t`` scan will
+          pass to ``fn`` the slice ``t+k``.
+        * ``inplace`` -- One of the Theano variables provided as
+          ``sequences``. ``scan`` will try to compute this output *in
+          place* of the provided input *iff* it respects the following
+          constraints:
+
+          * There is no other output that is denied to be computed in
+            place for whatever reason.
+
+          * ``fn`` is not using past taps of the input sequence that
+            will get overwritten by the output.
+
+        * ``return_steps`` -- Integer representing the number of steps
+          to return for the current output. For example, if ``k`` is
+          provided, ``scan`` will return ``output[-k:]``. This is meant as a
+          hint: based on ``k`` and the past taps of the outputs used, scan
+          can be smart about the amount of memory it requires to store
+          intermediate results. If not given, or ``0``, ``scan`` will return
+          all computed steps.
+        * ``store_steps`` -- Integer representing the number of
+          intermediate steps ``scan`` should use for a given output. Use
+          this key only if you really know what you are doing. In general
+          it is recommended to let scan decide for you the amount of memory
+          it should use.

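The correspondence between ``init_y`` and the negative output indices can be checked with ordinary list slicing. The sketch below assumes plain Python lists stand in for Theano variables; `split_initial_state` and `past_values_at_step` are hypothetical names used only for illustration and are not part of Theano.

```python
def split_initial_state(x, taps):
    """Split x into (initial state, outputs) so the state covers the oldest tap."""
    k = -min(taps)  # taps are negative, so k is the number of past steps needed
    return x[:k], x[k:]


def past_values_at_step(init_y, computed, taps, t):
    """Return the past output values that fn would receive at step t."""
    history = list(init_y) + list(computed)
    offset = len(init_y)  # history[offset + t] plays the role of output[t]
    return [history[offset + t + tap] for tap in taps]


taps = [-5, -2, -1]
x = [10, 11, 12, 13, 14, 15, 16, 17]
init_y, outputs = split_initial_state(x, taps)

# At step 0 the taps select init_y[0], init_y[3] and init_y[4], which the
# docstring calls output[-5], output[-2] and output[-1].
step0 = past_values_at_step(init_y, outputs, taps, t=0)
# At step 1 the window slides forward by one element.
step1 = past_values_at_step(init_y, outputs, taps, t=1)
print(init_y, step0, step1)
```

Because the initial state is simply the prefix ``x[:k]``, its elements keep their chronological order, which is why ``init_y[0]`` is the oldest value rather than the most recent one.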
+        ``scan`` will follow this logic if partial information is given:
+
+        * If an output is not wrapped in a dictionary, ``scan`` will wrap
+          it in one assuming that you use only the last step of the output
+          (i.e. it makes your tap value list equal to [-1]) and that it is
+          not computed inplace.
+        * If you wrap an output in a dictionary and you do not provide any
+          taps but you provide an initial state it will assume that you are
+          using only a tap value of -1.
+        * If you wrap an output in a dictionary but you do not provide any
+          initial state, it assumes that you are not using any form of
+          taps.
+        * If you provide a ``None`` instead of a variable or a dictionary
+          ``scan`` assumes that you will not use any taps for this output
+          (as, for example, in the case of a map).
+
+        If ``outputs_info`` is an empty list or None, ``scan`` assumes
+        that no tap is used for any of the outputs. If information is
+        provided just for a subset of the outputs an exception is
+        raised (because there is no convention on how scan should map
+        the provided information to the outputs of ``fn``).

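The four defaulting rules listed for partially specified outputs can be written as a small normalisation function. This is a sketch of the documented behaviour, not Theano's actual implementation: `normalize_output_info` is a hypothetical name, and strings stand in for Theano variables.

```python
def normalize_output_info(entry):
    """Fill in the defaults the docstring describes for one outputs_info entry."""
    if entry is None:
        # None instead of a variable or dictionary: no taps (e.g. a map).
        return {"initial": None, "taps": None}
    if not isinstance(entry, dict):
        # Bare variable: wrap it and use only the previous step of the output.
        return {"initial": entry, "taps": [-1]}
    info = dict(entry)  # copy so the caller's dictionary is left untouched
    if info.get("initial") is None:
        # Dictionary without an initial state: no taps of any form.
        info["initial"] = None
        info["taps"] = None
    elif info.get("taps") is None:
        # Initial state but no taps: assume a single tap of -1.
        info["taps"] = [-1]
    return info


normalized = [
    normalize_output_info(entry)
    for entry in ["y0", {"initial": "y1"}, {"taps": [-2]}, None]
]
for info in normalized:
    print(info)
```

Keeping the rules in one function makes the precedence explicit: a missing initial state overrides any taps that were supplied.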
     :param non_sequences:
-        Parameters over which scan should not iterate. These parameters are
-        given at each time step to the function applied recursively.
+        ``non_sequences`` is the list of arguments that are passed to
+        ``fn`` at each step. One can opt to exclude shared variables
+        used in ``fn`` from this list.

     :param n_steps:
-        Number of steps to iterate. If the input sequences are not long enough, scan
-        will produce a warning and run only for the maximal amount of steps allowed by
-        the input sequences. If the value is 0, the outputs will have 0 rows. If the
-        value is negative, scan will run backwards (or if the flag go_backwards is
-        already set to true it will run forward in time). If n_steps is not provided,
-        or evaluates to None, inf or nan, scan will figure out the maximal amount of
-        steps it can run given the input sequences and do that.
+        ``n_steps`` is the number of steps to iterate given as an int
+        or Theano scalar. If any of the input sequences do not have
+        enough elements, scan will produce a warning and run only for
+        the maximal amount of steps it can. If the *value is 0* the
+        outputs will have *0 rows*. If the value is negative, ``scan``
+        runs backwards in time. If the ``go_backwards`` flag is already
+        set and also ``n_steps`` is negative, ``scan`` will run forward
+        in time. If ``n_steps`` is not provided, or evaluates to ``None``,
+        ``inf`` or ``NaN``, ``scan`` will figure out the amount of
+        steps it should run given its input sequences.

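How ``n_steps`` and ``go_backwards`` combine can be illustrated with a toy index generator. This is a sketch under stated assumptions, not Theano code: `iteration_order` is a hypothetical helper, it models a single sequence read with tap 0, it assumes a backwards run starts from the last element, and it silently clips where ``scan`` would emit a warning.

```python
def iteration_order(length, n_steps=None, go_backwards=False):
    """Return the sequence indices visited, in order (toy model)."""
    # With no n_steps the whole sequence is used; otherwise clip to its length.
    steps = length if n_steps is None else min(abs(n_steps), length)
    # A negative n_steps flips the direction chosen by go_backwards.
    negative = n_steps is not None and n_steps < 0
    backwards = go_backwards != negative
    indices = list(range(length))
    if backwards:
        indices.reverse()
    return indices[:steps]


print(iteration_order(5))                                 # forward, all steps
print(iteration_order(5, n_steps=-3))                     # negative: backwards
print(iteration_order(5, n_steps=-3, go_backwards=True))  # both set: forward
print(iteration_order(5, n_steps=0))                      # zero steps: empty
```

The exclusive-or on the two direction switches mirrors the docstring's statement that a negative ``n_steps`` together with ``go_backwards`` runs forward in time.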
     :param truncate_gradient:
-        Number of steps to use in truncated BPTT. If you compute gradients
-        through a scan op, they are computed using backpropagation through time.
-        By providing a different value than -1, you choose to use truncated BPTT
-        instead of classical BPTT, where you only do ``truncate_gradient``
-        number of steps.
+        ``truncate_gradient`` is the number of steps to use in truncated
+        BPTT. If you compute gradients through a scan op, they are
+        computed using backpropagation through time. By providing a
+        different value than -1, you choose to use truncated BPTT instead
+        of classical BPTT, where you go for only ``truncate_gradient``
+        number of steps back in time.

     :param go_backwards:
-        Flag indicating if you should go backwards through the sequences ( if you
-        think as the sequences being indexed by time, this would mean go backwards
-        in time)
+        ``go_backwards`` is a flag indicating if ``scan`` should go
+        backwards through the sequences. If you think of each sequence
+        as indexed by time, making this flag True would mean that
+        ``scan`` goes back in time, namely that for any sequence it
+        starts from the end and goes towards 0.

     :param name:
-        The name of the theano function compiled by the Scan op. It will show in the
-        profiler output.
+        When profiling ``scan`` it is crucial to provide a name for any
+        instance of ``scan``. The profiler will produce an overall
+        profile of your code as well as profiles for doing one iteration
+        step for each instance of ``scan``. The ``name`` of the instance is
+        how you differentiate between all these profiles.

     :param mode:
-        The mode used when compiling the theano function in the Scan op.
-        If None, it will use the config mode. If None and the config mode is set to
-        profile mode, we will create a new instance of the ProfileMode in order
-        to compute the timing correctly.
-        If no new instance is created the time spent in Scan will show up twice in the
-        profiling, once as the time taken by scan, and the second time as the time
-        taken by the ops inside scan. This will be even worse for multiple cascading
-        scans.
-        The new profiler instance will be printed when python exits.
+        It is recommended to leave this argument set to None, especially
+        when profiling ``scan`` (otherwise the results are not going to
+        be accurate). If you prefer the computations of one step of
+        ``scan`` to be done differently than the entire function, set
+        this parameter (see ``theano.function`` for details about
+        possible values and their meaning).

     :rtype: tuple
     :return: tuple of the form (outputs, updates); ``outputs`` is either a
              Theano variable or a list of Theano variables representing the
-             outputs of scan. ``updates`` is a dictionary specifying the
+             outputs of ``scan`` (in the same order as in
+             ``outputs_info``). ``updates`` is a dictionary specifying the
              update rules for all shared variables used in the scan
-             operation; this dictionary should be passed to ``theano.function``
+             operation. This dictionary should be passed to ``theano.function``
              when you compile your function.
     """
     # General observation : this code is executed only once, at creation
     # of the computational graph, so we don't yet need to be smart about