Skip to content
项目
群组
代码片段
帮助
当前项目
正在载入...
登录 / 注册
切换导航面板
P
pytensor
项目
项目
详情
活动
周期分析
仓库
仓库
文件
提交
分支
标签
贡献者
图表
比较
统计图
议题
0
议题
0
列表
看板
标记
里程碑
合并请求
0
合并请求
0
CI / CD
CI / CD
流水线
作业
日程
统计图
Wiki
Wiki
代码片段
代码片段
成员
成员
折叠边栏
关闭边栏
活动
图像
聊天
创建新问题
作业
提交
问题看板
Open sidebar
testgroup
pytensor
Commits
3e0c5bb7
提交
3e0c5bb7
authored
4月 24, 2015
作者:
Cesar Laurent
浏览文件
操作
浏览文件
下载
电子邮件补丁
差异文件
Made the doc more clear.
上级
5ce67be5
隐藏空白字符变更
内嵌
并排
正在显示
1 个修改的文件
包含
141 行增加
和
124 行删除
+141
-124
scan.txt
doc/library/scan.txt
+141
-124
没有找到文件。
doc/library/scan.txt
浏览文件 @
3e0c5bb7
...
...
@@ -210,6 +210,138 @@ with all values set to zero except at the provided array indices.
This demonstrates that you can introduce new Theano variables into a scan function.
Using shared variables - Gibbs sampling
---------------------------------------
Another useful feature of scan, is that it can handle shared variables.
For example, if we want to implement a Gibbs chain of length 10 we would do
the following:
.. code-block:: python
W = theano.shared(W_values) # we assume that ``W_values`` contains the
# initial values of your weight matrix
bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)
trng = T.shared_randomstreams.RandomStreams(1234)
def OneStep(vsample) :
hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
return trng.binomial(size=vsample.shape, n=1, p=vmean,
dtype=theano.config.floatX)
sample = theano.tensor.vector()
values, updates = theano.scan(OneStep, outputs_info=sample, n_steps=10)
gibbs10 = theano.function([sample], values[-1], updates=updates)
The first, and probably most crucial observation is that the updates
dictionary becomes important in this case. It links a shared variable
with its updated value after k steps. In this case it tells how the
random streams get updated after 10 iterations. If you do not pass this
update dictionary to your function, you will always get the same 10
sets of random numbers. You can even use the ``updates`` dictionary
afterwards. Look at this example :
.. code-block:: python
a = theano.shared(1)
values, updates = theano.scan(lambda: {a: a+1}, n_steps=10)
In this case the lambda expression does not require any input parameters
and returns an update dictionary which tells how ``a`` should be updated
after each step of scan. If we write :
.. code-block:: python
b = a + 1
c = updates[a] + 1
f = theano.function([], [b, c], updates=updates)
print b
print c
print a.value
We will see that because ``b`` does not use the updated version of
``a``, it will be 2, ``c`` will be 12, while ``a.value`` is ``11``.
If we call the function again, ``b`` will become 12, ``c`` will be 22
and ``a.value`` 21. If we do not pass the ``updates`` dictionary to the
function, then ``a.value`` will always remain 1, ``b`` will always be 2 and
``c`` will always be ``12``.
The second observation is that if we use shared variables ( ``W``, ``bvis``,
``bhid``) but we do not iterate over them (ie scan doesn't really need to know
anything in particular about them, just that they are used inside the
function applied at each step) you do not need to pass them as arguments.
Scan will find them on its own and add them to the graph.
However, passing them to the scan function is a good practice, as it avoids
Scan Op calling any earlier (external) Op over and over. This results in a
simpler computational graph, which speeds up the optimization and the
execution. To pass the shared variables to Scan you need to put them in a list
and give it to the ``non_sequences`` argument. Here is the Gibbs sampling code
updated:
.. code-block:: python
W = theano.shared(W_values) # we assume that ``W_values`` contains the
# initial values of your weight matrix
bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)
trng = T.shared_randomstreams.RandomStreams(1234)
# OneStep, with explicit use of the shared variables (W, bvis, bhid)
def OneStep(vsample, W, bvis, bhid):
hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
return trng.binomial(size=vsample.shape, n=1, p=vmean,
dtype=theano.config.floatX)
sample = theano.tensor.vector()
# The new scan, with the shared variables passed as non_sequences
values, updates = theano.scan(fn=OneStep,
outputs_info=sample,
non_sequences=[W, bvis, bhid],
n_steps=10)
gibbs10 = theano.function([sample], values[-1], updates=updates)
Using shared variables - the strict flag
----------------------------------------
As we just saw, passing the shared variables to scan may result in a simpler
computational graph, which speeds up the optimization and the execution. A
good way to remember to pass every shared variable used during scan is to use
the ``strict`` flag. When set to true, scan assumes that all the necessary
shared variables in ``fn`` are passed as a part of ``non_sequences``. This has
to be ensured by the user. Otherwise, it will result in an error.
Using the previous Gibbs sampling example:
.. code-block:: python
# The new scan, using strict=True
values, updates = theano.scan(fn=OneStep,
outputs_info=sample,
non_sequences=[W, bvis, bhid],
n_steps=10,
strict=True)
If you omit to pass ``W``, ``bvis`` or ``bhid`` as a ``non_sequence``, it will
result in an error.
Multiple outputs, several taps values - Recurrent Neural Network with Scan
--------------------------------------------------------------------------
...
...
@@ -238,10 +370,10 @@ construct a function that computes one iteration step :
def oneStep(u_tm4, u_t, x_tm3, x_tm1, y_tm1, W, W_in_1, W_in_2, W_feedback, W_out):
x_t = T.tanh(
theano.dot(x_tm1, W) + \
theano.dot(u_t, W_in_1) + \
theano.dot(u_tm4, W_in_2) + \
theano.dot(y_tm1, W_feedback))
x_t = T.tanh(theano.dot(x_tm1, W) + \
theano.dot(u_t, W_in_1) + \
theano.dot(u_tm4, W_in_2) + \
theano.dot(y_tm1, W_feedback))
y_t = theano.dot(x_tm3, W_out)
return [x_t, y_t]
...
...
@@ -266,10 +398,11 @@ the Theano variables needed we construct our RNN as follows :
# y[-1]
([x_vals, y_vals],updates) = theano.scan(fn = oneStep, \
sequences = dict(input = u, taps= [-4,-0]), \
outputs_info = [dict(initial = x0, taps = [-3,-1]),y0], \
non_sequences = [W,W_in_1,W_in_2,W_feedback, W_out])
([x_vals, y_vals], updates) = theano.scan(fn=oneStep,
sequences=dict(input=u, taps=[-4,-0]),
outputs_info=[dict(initial=x0, taps=[-3,-1]), y0],
non_sequences=[W, W_in_1, W_in_2, W_feedback, W_out],
strict=True)
# for second input y, scan adds -1 in output_taps by default
...
...
@@ -286,122 +419,6 @@ By abusing notations, scan will consider ``uvals[0]`` as ``u[-4]``, and
will start scaning from ``uvals[4]`` towards the end.
Using shared variables - Gibbs sampling
---------------------------------------
Another useful feature of scan, is that it can handle shared variables.
For example, if we want to implement a Gibbs chain of length 10 we would do
the following:
.. code-block:: python
W = theano.shared(W_values) # we assume that ``W_values`` contains the
# initial values of your weight matrix
bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)
trng = T.shared_randomstreams.RandomStreams(1234)
def OneStep(vsample) :
hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
return trng.binomial(size=vsample.shape, n=1, p=vmean,
dtype=theano.config.floatX)
sample = theano.tensor.vector()
values, updates = theano.scan(OneStep, outputs_info=sample, n_steps=10)
gibbs10 = theano.function([sample], values[-1], updates=updates)
Note that if we use shared variables ( ``W``, ``bvis``, ``bhid``) but
we do not iterate over them (so scan doesn't really need to know
anything in particular about them, just that they are used inside the
function applied at each step) you do not need to pass them as
arguments. Scan will find them on its own and add them to the graph. Of
course, if you wish to (and it is good practice) you can add them, when
you call scan (they would be in the list of non-sequence inputs).
The second, and probably most crucial observation is that the updates
dictionary becomes important in this case. It links a shared variable
with its updated value after k steps. In this case it tells how the
random streams get updated after 10 iterations. If you do not pass this
update dictionary to your function, you will always get the same 10
sets of random numbers. You can even use the ``updates`` dictionary
afterwards. Look at this example :
.. code-block:: python
a = theano.shared(1)
values, updates = theano.scan(lambda: {a: a+1}, n_steps=10)
In this case the lambda expression does not require any input parameters
and returns an update dictionary which tells how ``a`` should be updated
after each step of scan. If we write :
.. code-block:: python
b = a + 1
c = updates[a] + 1
f = theano.function([], [b, c], updates=updates)
print b
print c
print a.value
We will see that because ``b`` does not use the updated version of
``a``, it will be 2, ``c`` will be 12, while ``a.value`` is ``11``.
If we call the function again, ``b`` will become 12, ``c`` will be 22
and ``a.value`` 21.
If we do not pass the ``updates`` dictionary to the function, then
``a.value`` will always remain 1, ``b`` will always be 2 and ``c``
will always be ``12``.
Using shared variables - the strict flag
----------------------------------------
You also have the possibility to use the ``strict`` flag. When set to true,
Scan assumes that all the necessary shared variables in ``fn`` are passed as a
part of ``non_sequences``. This has to be ensured by the user. Otherwise, it
will result in an error. It avoids Scan Op calling any earlier (external) Op
over and over. This results in a simpler computational graph, which speeds up
the optimization and the execution.
Here is a simple RRN example:
.. math::
x(n) = \tanh(u(n) + W x(n-1))
And the code using ``strict=True``:
.. code-block:: python
u = T.matrix() # The input sequence
x0 = T.vector() # The initial state
W = theano.shared(W_values) # we assume that ``W_values`` contains the
# initial values of your weight matrix.
def oneStep(u_t, x_tm1, W):
return T.tanh(u_t + T.dot(W, x_tm1))
# Using strict=True, and passing W as a non_sequence
x_vals, updates = theano.scan(fn=oneStep,
sequences=dict(input=u, taps=[0]),
outputs_info=[dict(initial=x0,
taps=[-1])],
non_sequences=[W], # Don't forget to pass W!
strict=True)
If you omit to pass ``W`` as a ``non_sequence``, it will result in an error.
Conditional ending of Scan
--------------------------
...
...
编写
预览
Markdown
格式
0%
重试
或
添加新文件
添加附件
取消
您添加了
0
人
到此讨论。请谨慎行事。
请先完成此评论的编辑!
取消
请
注册
或者
登录
后发表评论