Commit 28a6161e authored by Razvan Pascanu

fix to tutorials

...@@ -7,7 +7,7 @@ evaluate mathematical expressions involving multi-dimensional
arrays efficiently. Theano features:
* **tight integration with numpy**
* **near-transparent use of a GPU** to accelerate for intense calculations [JAN 2010].
* **symbolic differentiation**
* **speed and stability optimizations**: write ``log(1+exp(x))`` and get the right answer.
* **dynamic C code generation** for faster expression evaluation
......
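The stability claim in the feature list above can be illustrated numerically. A minimal numpy sketch (the function names are illustrative, and numpy's ``logaddexp`` stands in for the graph rewrite Theano performs):

```python
import numpy as np

def naive_softplus(x):
    # Direct evaluation: exp(x) overflows for large x, so the result is inf.
    return np.log(1.0 + np.exp(x))

def stable_softplus(x):
    # log(1 + exp(x)) == logaddexp(0, x), computed without overflow.
    return np.logaddexp(0.0, x)

print(naive_softplus(1000.0))   # inf (overflow)
print(stable_softplus(1000.0))  # 1000.0
```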
.. _libdoc_gradient:
===========================================
:mod:`gradient` -- symbolic differentiation
===========================================
.. module:: gradient
:platform: Unix, Windows
:synopsis: low-level automatic differentiation
.. moduleauthor:: LISA
Symbolic gradients are usually computed via :func:`tensor.grad`, which offers a
more convenient syntax for the common case: the gradient of a scalar cost with
respect to some expressions. The :func:`grad_sources_inputs` function does the
underlying work and is more flexible, but it is also more awkward to use when
:func:`tensor.grad` can do the job.
.. function:: grad_sources_inputs(sources, graph_inputs, warn_type=True)
A gradient source is a pair (``r``, ``g_r``), in which ``r`` is a `Variable`, and ``g_r`` is a
`Variable` that is a gradient wrt ``r``.
This function traverses the graph backward from the ``r`` sources,
calling ``op.grad(...)`` for all ops with some non-None gradient on an output.
The ``op.grad(...)`` functions are called like this:
.. code-block:: python
op.grad(op.inputs[:], [total_gradient(v) for v in op.outputs])
This call to ``op.grad`` should return a list or tuple: one symbolic gradient per input.
If ``op`` has a single input, then ``op.grad`` should return a list or tuple of length 1.
For each input with respect to which ``op`` is not differentiable, it should return ``None`` instead
of a `Variable` instance.
If a source ``r`` receives a gradient from another source ``r2``, then the effective
gradient on ``r`` is the sum of both gradients.
:type sources: list of pairs of Variable: (v, gradient-on-v) to
initialize the total_gradient dictionary
:param sources: gradients to back-propagate using chain rule
:param warn_type: if True, trigger warnings via the logging module when
the gradient on an expression has a different type than the original
expression
:type warn_type: bool
:type graph_inputs: list of Variable
:param graph_inputs: variables considered to be constant
(do not backpropagate through them)
:rtype: dictionary whose keys and values are of type `Variable`
:returns: mapping from each Variable encountered in the backward traversal to its [total] gradient.
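The summation rule above (gradients arriving at the same variable from different paths are added into its total gradient) can be sketched with a toy reverse traversal. Everything here — the node tuples, the ``grad_fn`` closures, the dictionary name — is illustrative, not Theano's actual machinery:

```python
# Toy reverse-mode traversal mirroring the scheme described above.
# Graph: c = a*b; d = a + c, so dd/da = 1 + b and dd/db = a.
from collections import defaultdict

a_val, b_val = 2.0, 5.0
# Nodes in reverse topological order: (output, inputs, grad_fn), where
# grad_fn maps the gradient on the output to gradients on the inputs.
nodes = [
    ('d', ['a', 'c'], lambda g: [g, g]),                  # d = a + c
    ('c', ['a', 'b'], lambda g: [g * b_val, g * a_val]),  # c = a * b
]

total_gradient = defaultdict(float)
total_gradient['d'] = 1.0  # the gradient "source": dd/dd = 1
for out, inputs, grad_fn in nodes:
    g_out = total_gradient[out]
    for inp, g_in in zip(inputs, grad_fn(g_out)):
        total_gradient[inp] += g_in  # contributions from different uses sum

print(total_gradient['a'])  # 6.0  (= 1 + b: one path through d, one through c)
print(total_gradient['b'])  # 2.0  (= a)
```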
...@@ -149,3 +149,63 @@ Fourier Transforms
[James has some code for this, but hasn't gotten it into the source tree yet.]
=
=======
.. function:: dot(X, Y)
:param X: left term
:param Y: right term
:type X: symbolic matrix or vector
:type Y: symbolic matrix or vector
:rtype: symbolic matrix or vector
:return: the dot product of `X` and `Y`: the inner product for vectors, matrix multiplication for matrices.
.. function:: outer(X, Y)
:param X: left term
:param Y: right term
:type X: symbolic vector
:type Y: symbolic vector
:rtype: symbolic matrix
:return: vector-vector outer product
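``dot`` and ``outer`` mirror their numpy counterparts; a quick numpy illustration of the values the symbolic versions compute:

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

print(np.dot(x, y))    # 11.0: the inner product 1*3 + 2*4
print(np.outer(x, y))  # 2x2 matrix [[3, 4], [6, 8]]
```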
.. function:: tensordot(X, Y, axes=2)
This is a symbolic stand-in for ``numpy.tensordot``.
:param X: left term
:param Y: right term
:param axes: sum out these axes from X and Y.
:type X: symbolic tensor
:type Y: symbolic tensor
:rtype: symbolic tensor
:type axes: see numpy.tensordot
:return: tensor product
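The ``axes`` argument behaves as in numpy; a small numpy example of the default ``axes=2`` contraction:

```python
import numpy as np

X = np.arange(24.0).reshape(2, 3, 4)
Y = np.arange(12.0).reshape(3, 4)

# axes=2 sums over the last two axes of X and the first two of Y,
# leaving a result of shape (2,): Z[i] = sum_jk X[i, j, k] * Y[j, k].
Z = np.tensordot(X, Y, axes=2)
print(Z)  # [  506.  1298.]
```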
Gradient / Differentiation
==========================
.. function:: grad(cost, wrt, g_cost=None, consider_constant=[], warn_type=False)
Return symbolic gradients for one or more variables with respect to some
cost.
:type cost: 0-d tensor variable
:type wrt: tensor variable or list of tensor variables
:type g_cost: same as `cost`
:type consider_constant: list of variables
:type warn_type: bool
:param cost: the scalar expression being differentiated
:param wrt: the term[s] for which gradients are wanted
:param g_cost: the gradient on the cost
:param consider_constant: variables whose gradients will be held at 0.
:param warn_type: if True, trigger warnings via the logging module when
the gradient on an expression has a different type than the original
expression
:rtype: variable or list of variables (matching `wrt`)
:returns: gradients with respect to cost for each of the `wrt` terms
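Since ``grad`` returns a symbolic expression, its value can be sanity-checked against a finite difference. Below is a plain-numpy sketch (not Theano code; the function names are illustrative) for the cost ``log(1+exp(x))``, whose analytic gradient is the logistic sigmoid:

```python
import numpy as np

def cost(x):
    return np.logaddexp(0.0, x)      # log(1 + exp(x)), computed stably

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # the analytic derivative of cost

x, eps = 0.3, 1e-6
# Central finite difference: (f(x+eps) - f(x-eps)) / (2*eps)
numeric = (cost(x + eps) - cost(x - eps)) / (2 * eps)
print(abs(numeric - sigmoid(x)) < 1e-5)  # True: the two agree
```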
...@@ -17,4 +17,5 @@ sanity, they are grouped into the following sections:
basic
shared_randomstreams
signal
.. _libdoc_tensor_signal:
======================================================
:mod:`signal` -- Signal processing
======================================================
.. module:: signal
:platform: Unix, Windows
:synopsis: ops for signal processing
.. moduleauthor:: LISA
TODO: Give examples for how to use these things! They are pretty complicated.
.. function:: conv2D(*todo)
.. function:: downsample2D(*todo)
.. function:: fft(*todo)
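Pending the examples the TODO above asks for, here is a naive plain-numpy sketch of the semantics a 2-D convolution op computes (``conv2d_valid`` is an illustrative name, not the op's real signature, which is still ``*todo``):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid'-mode 2-D convolution: the kernel is flipped (true
    convolution, not correlation) and slid over every full-overlap window."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * flipped)
    return out

img = np.arange(9.0).reshape(3, 3)
k = np.array([[0.0, 1.0], [2.0, 3.0]])
print(conv2d_valid(img, k))  # 2x2 result: [[5, 11], [23, 29]]
```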
...@@ -23,6 +23,7 @@ installation (see :ref:`install`).
numpy
adding
examples
loading_and_saving
debugmode
profilemode
debug_faq
......
.. _tutorial_loadsave:
==================
Loading and Saving
==================
Many Theano objects can be serialized. However, you will want to consider different mechanisms
depending on the amount of time you anticipate between saving and reloading. For short-term
storage (such as temp files and network transfers), pickling works. For long-term storage (such as
saving models from an experiment), you should not rely on pickled Theano objects; instead, we
recommend loading and saving the underlying shared objects as you would in any other Python
program.
pickling -- Short-term serialization
=====================================
Pickling and unpickling of functions is supported, with caveats: do not rely on it for long-term storage.
***TODO***
not-pickling -- Long-term serialization
=======================================
***TODO***
Give a short example of how to add ``__getstate__`` and ``__setstate__`` methods to a class. Point out
that ``protocol=-1`` should be used for numpy ndarrays.
Point to the Python docs for further reading.
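A minimal sketch of the pattern the notes above describe; the ``Model`` class and its fields are hypothetical, chosen only to show the shape of the two methods:

```python
import pickle
import numpy as np

class Model(object):
    """Illustrative class: serialize only the data worth keeping long-term."""
    def __init__(self, weights):
        self.weights = np.asarray(weights)
        self._scratch = None   # transient state, not worth serializing

    def __getstate__(self):
        # Save only the numpy array; drop transient members.
        return {'weights': self.weights}

    def __setstate__(self, state):
        self.weights = state['weights']
        self._scratch = None   # rebuild transient state on load

m = Model([1.0, 2.0, 3.0])
# protocol=-1 selects the highest pickle protocol, which stores numpy
# ndarrays compactly in binary rather than as text.
data = pickle.dumps(m, protocol=-1)
m2 = pickle.loads(data)
print(m2.weights)
```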
...@@ -21,26 +21,6 @@ _msg_badlen = 'op.grad(...) returned wrong number of gradients'
def grad_sources_inputs(sources, graph_inputs, warn_type=True):
    """
A gradient source is a pair (``r``, ``g_r``), in which ``r`` is a `Variable`, and ``g_r`` is a
`Variable` that is a gradient wrt ``r``.
This function traverses the graph backward from the ``r`` sources,
calling ``op.grad(...)`` for all ops with some non-None gradient on an output.
The ``op.grad(...)`` functions are called like this:
.. code-block:: python
op.grad(op.inputs[:], [total_gradient(v for v in op.outputs)])
This call to ``op.grad`` should return a list or tuple: one symbolic gradient per input.
If ``op`` has a single input, then ``op.grad`` should return a list or tuple of length 1.
For each input wrt to which ``op`` is not differentiable, it should return ``None`` instead
of a `Variable` instance.
If a source ``r`` receives a gradient from another source ``r2``, then the effective
gradient on ``r`` is the sum of both gradients.
:type sources: list of pairs of Variable: (v, gradient-on-v)
:param sources: gradients to back-propagate using chain rule
:type graph_inputs: list of Variable
......
...@@ -27,6 +27,6 @@ def test_no_shared_var_graph():
f = theano.function([a,b],[a+b], mode=mode_with_gpu)
l = f.maker.env.toposort()
assert len(l)==4
assert numpy.any(isinstance(x.op,cuda.GpuElemwise) for x in l)
assert numpy.any(isinstance(x.op,cuda.GpuFromHost) for x in l)
assert numpy.any(isinstance(x.op,cuda.HostFromGpu) for x in l)
...@@ -3285,6 +3285,8 @@ class TensorDot(Op):
        return "tensordot"
tensordot = TensorDot
#TODO: tensordot should be function as described in rst docs.
class Outer(Op):
    """ Compute vector-vector outer product
    """
......