Commit e13a072e authored by Sina Honari

getting rid of some bugs

Parent b1b0e8e1
@@ -2,7 +2,8 @@
===========================
Frequently Asked Questions
==================================
===========================
How to update a subset of weights?
==================================
If you want to update only a subset of a weight matrix (such as
@@ -18,36 +19,37 @@ the only rows that should get updated are those containing embeddings
used during the forward propagation. Here is how the theano function
should be written:
>>> # defining a shared variable for the lookup table
Defining a shared variable for the lookup table
>>> lookup_table = theano.shared(matrix_ndarray)
>>>
>>> # getting a subset of the table (some rows
>>> # or some columns) by passing an integer vector of
>>> # indices corresponding to those rows or columns.
>>> slice = lookup_table[vector_of_indices]
>>>
>>> # From now on, use only 'slice'.
>>> # Do not call lookup_table[vector_of_indices] again.
>>> # This causes problems with grad as this will create new variables.
>>>
>>> # defining cost which depends only on slice
>>> # and not the entire lookup_table
>>> cost = something that depends on slice
>>> g = theano.grad(cost, slice)
>>>
>>> # There are two ways for updating the parameters:
>>> # either use inc_subtensor or set_subtensor.
>>> # It is recommended to use inc_subtensor.
>>> # Some theano optimizations do the conversion between
>>> # the two functions, but not in all cases.
>>> updates = inc_subtensor(slice, g*lr)
>>> # OR
>>> updates = set_subtensor(slice, slice + g*lr)
>>>
>>> # Note that currently we just cover the case here,
>>> # not if you use inc_subtensor or set_subtensor with other types of indexing.
>>>
>>> # defining the theano function
Get a subset of the table (some rows or some columns) by indexing it
with an integer vector of the corresponding row or column indices.
>>> subset = lookup_table[vector_of_indices]
From now on, use only 'subset'. Do not write lookup_table[vector_of_indices]
again: doing so creates a new variable, which causes problems with grad.
Define a cost that depends only on subset and not on the entire lookup_table
>>> cost = something that depends on subset
>>> g = theano.grad(cost, subset)
There are two ways to update the parameters:
use either inc_subtensor or set_subtensor. inc_subtensor is the
recommended one. Some Theano optimizations convert between
the two functions, but not in all cases.
>>> updates = inc_subtensor(subset, g*lr)
OR
>>> updates = set_subtensor(subset, subset + g*lr)
Currently, only the case shown here (indexing with an integer vector) is covered,
not inc_subtensor or set_subtensor with other types of indexing.
Define the Theano function
>>> f = theano.function(..., updates=[(lookup_table, updates)])
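To make the intended effect of the update concrete, here is a minimal plain-Python analogy. It does not use Theano at all: ``update_rows`` is a hypothetical helper, the table is an ordinary list of lists, and the numeric values are made up for illustration. It only sketches the idea that the rows selected by ``vector_of_indices`` receive ``g*lr`` while every other row is left untouched, keeping the same sign convention as the snippet above.

```python
# Plain-Python analogy (no Theano): only the indexed rows of the table change.
# `update_rows` is a hypothetical helper written for this illustration.
def update_rows(table, indices, row_grads, lr):
    """Add lr * gradient to the selected rows of `table`, in place.

    `table` is a list of rows (lists of floats), `indices` holds the
    selected row numbers, and `row_grads` holds one gradient row per
    entry of `indices`.
    """
    for row_index, grad_row in zip(indices, row_grads):
        row = table[row_index]
        for col, grad_value in enumerate(grad_row):
            row[col] += lr * grad_value
    return table


# Made-up example data: a 4 x 2 table, of which rows 0 and 2 are used.
lookup_table = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
vector_of_indices = [0, 2]
g = [[1.0, 2.0], [3.0, 4.0]]  # one gradient row per selected index

update_rows(lookup_table, vector_of_indices, g, lr=0.5)
print(lookup_table)
```

Rows 1 and 3 keep their original values because they do not appear in ``vector_of_indices``; in the Theano version the same restriction comes from taking the gradient with respect to ``subset`` rather than the whole ``lookup_table``.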
Note that you can compute the gradient of the cost function w.r.t.
......
@@ -46,4 +46,4 @@ you out.
extending_theano_c
python-memory-management
multi_cores
faq
faq_tutorial