Commit 28a6161e authored by Razvan Pascanu

fix to tutorials

...@@ -7,7 +7,7 @@ evaluate mathematical expressions involving multi-dimensional
arrays efficiently. Theano features:
* **tight integration with numpy**
* **near-transparent use of a GPU** to accelerate for intense calculations [JAN 2010].
* **symbolic differentiation**
* **speed and stability optimizations**: write ``log(1+exp(x))`` and get the right answer.
* **dynamic C code generation** for faster expression evaluation
......
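The stability claim in the feature list above can be illustrated numerically. A minimal numpy sketch (the function names are illustrative, and numpy's ``logaddexp`` stands in for the graph rewrite Theano performs):

```python
import numpy as np

def naive_softplus(x):
    # Direct evaluation: exp(x) overflows for large x, so the result is inf.
    return np.log(1.0 + np.exp(x))

def stable_softplus(x):
    # log(1 + exp(x)) == logaddexp(0, x), computed without overflow.
    return np.logaddexp(0.0, x)

print(naive_softplus(1000.0))   # inf (overflow)
print(stable_softplus(1000.0))  # 1000.0
```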
.. _libdoc_gradient:
===========================================
:mod:`gradient` -- symbolic differentiation
===========================================
.. module:: gradient
:platform: Unix, Windows
:synopsis: low-level automatic differentiation
.. moduleauthor:: LISA
Symbolic gradients are usually computed via :func:`tensor.grad`, which offers a
more convenient syntax for the common case: the gradient of a scalar cost with
respect to some expressions. The :func:`grad_sources_inputs` function does the
underlying work and is more flexible, but it is also more awkward to use when
:func:`tensor.grad` can do the job.
.. function:: grad_sources_inputs(sources, graph_inputs, warn_type=True)
A gradient source is a pair (``r``, ``g_r``), in which ``r`` is a `Variable`, and ``g_r`` is a
`Variable` that is a gradient wrt ``r``.
This function traverses the graph backward from the ``r`` sources,
calling ``op.grad(...)`` for all ops with some non-None gradient on an output.
The ``op.grad(...)`` functions are called like this:
.. code-block:: python
op.grad(op.inputs[:], [total_gradient(v) for v in op.outputs])
This call to ``op.grad`` should return a list or tuple: one symbolic gradient per input.
If ``op`` has a single input, then ``op.grad`` should return a list or tuple of length 1.
For each input with respect to which ``op`` is not differentiable, it should return ``None`` instead
of a `Variable` instance.
If a source ``r`` receives a gradient from another source ``r2``, then the effective
gradient on ``r`` is the sum of both gradients.
:type sources: list of pairs of Variable: (v, gradient-on-v) to
initialize the total_gradient dictionary
:param sources: gradients to back-propagate using chain rule
:param warn_type: if True, trigger warnings via the logging module when
the gradient on an expression has a different type than the original
expression
:type warn_type: bool
:type graph_inputs: list of Variable
:param graph_inputs: variables considered to be constant
(do not backpropagate through them)
:rtype: dictionary whose keys and values are of type `Variable`
:returns: mapping from each Variable encountered in the backward traversal to its [total] gradient.
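The summation rule above (gradients arriving at the same variable from different paths are added into its total gradient) can be sketched with a toy reverse traversal. Everything here — the node tuples, the ``grad_fn`` closures, the dictionary name — is illustrative, not Theano's actual machinery:

```python
# Toy reverse-mode traversal mirroring the scheme described above.
# Graph: c = a*b; d = a + c, so dd/da = 1 + b and dd/db = a.
from collections import defaultdict

a_val, b_val = 2.0, 5.0
# Nodes in reverse topological order: (output, inputs, grad_fn), where
# grad_fn maps the gradient on the output to gradients on the inputs.
nodes = [
    ('d', ['a', 'c'], lambda g: [g, g]),                  # d = a + c
    ('c', ['a', 'b'], lambda g: [g * b_val, g * a_val]),  # c = a * b
]

total_gradient = defaultdict(float)
total_gradient['d'] = 1.0  # the gradient "source": dd/dd = 1
for out, inputs, grad_fn in nodes:
    g_out = total_gradient[out]
    for inp, g_in in zip(inputs, grad_fn(g_out)):
        total_gradient[inp] += g_in  # contributions from different uses sum

print(total_gradient['a'])  # 6.0  (= 1 + b: one path through d, one through c)
print(total_gradient['b'])  # 2.0  (= a)
```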
...@@ -149,3 +149,63 @@ Fourier Transforms
[James has some code for this, but hasn't gotten it into the source tree yet.]
=
=======
.. function:: dot(X, Y)
:param X: left term
:param Y: right term
:type X: symbolic matrix or vector
:type Y: symbolic matrix or vector
:rtype: symbolic matrix or vector
:return: the dot product of `X` and `Y`: the inner product for vectors, matrix multiplication for matrices.
.. function:: outer(X, Y)
:param X: left term
:param Y: right term
:type X: symbolic vector
:type Y: symbolic vector
:rtype: symbolic matrix
:return: vector-vector outer product
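``dot`` and ``outer`` mirror their numpy counterparts; a quick numpy illustration of the values the symbolic versions compute:

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

print(np.dot(x, y))    # 11.0: the inner product 1*3 + 2*4
print(np.outer(x, y))  # 2x2 matrix [[3, 4], [6, 8]]
```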
.. function:: tensordot(X, Y, axes=2)
This is a symbolic stand-in for ``numpy.tensordot``.
:param X: left term
:param Y: right term
:param axes: sum out these axes from X and Y.
:type X: symbolic tensor
:type Y: symbolic tensor
:rtype: symbolic tensor
:type axes: see numpy.tensordot
:return: tensor product
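The ``axes`` argument behaves as in numpy; a small numpy example of the default ``axes=2`` contraction:

```python
import numpy as np

X = np.arange(24.0).reshape(2, 3, 4)
Y = np.arange(12.0).reshape(3, 4)

# axes=2 sums over the last two axes of X and the first two of Y,
# leaving a result of shape (2,): Z[i] = sum_jk X[i, j, k] * Y[j, k].
Z = np.tensordot(X, Y, axes=2)
print(Z)  # [  506.  1298.]
```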
Gradient / Differentiation
==========================
.. function:: grad(cost, wrt, g_cost=None, consider_constant=[], warn_type=False)
Return symbolic gradients for one or more variables with respect to some
cost.
:type cost: 0-d tensor variable
:type wrt: tensor variable or list of tensor variables
:type g_cost: same as `cost`
:type consider_constant: list of variables
:type warn_type: bool
:param cost: the scalar expression being differentiated
:param wrt: the term[s] for which gradients are wanted
:param g_cost: the gradient on the cost
:param consider_constant: variables whose gradients will be held at 0.
:param warn_type: if True, trigger warnings via the logging module when
the gradient on an expression has a different type than the original
expression
:rtype: variable or list of variables (matching `wrt`)
:returns: gradients with respect to cost for each of the `wrt` terms
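Since ``grad`` returns a symbolic expression, its value can be sanity-checked against a finite difference. Below is a plain-numpy sketch (not Theano code; the function names are illustrative) for the cost ``log(1+exp(x))``, whose analytic gradient is the logistic sigmoid:

```python
import numpy as np

def cost(x):
    return np.logaddexp(0.0, x)      # log(1 + exp(x)), computed stably

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # the analytic derivative of cost

x, eps = 0.3, 1e-6
# Central finite difference: (f(x+eps) - f(x-eps)) / (2*eps)
numeric = (cost(x + eps) - cost(x - eps)) / (2 * eps)
print(abs(numeric - sigmoid(x)) < 1e-5)  # True: the two agree
```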
...@@ -17,4 +17,5 @@ sanity, they are grouped into the following sections:
basic
shared_randomstreams
signal
.. _libdoc_tensor_signal:
======================================================
:mod:`signal` -- Signal processing
======================================================
.. module:: signal
:platform: Unix, Windows
:synopsis: ops for signal processing
.. moduleauthor:: LISA
TODO: Give examples for how to use these things! They are pretty complicated.
.. function:: conv2D(*todo)
.. function:: downsample2D(*todo)
.. function:: fft(*todo)
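Pending the examples the TODO above asks for, here is a naive plain-numpy sketch of the semantics a 2-D convolution op computes (``conv2d_valid`` is an illustrative name, not the op's real signature, which is still ``*todo``):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid'-mode 2-D convolution: the kernel is flipped (true
    convolution, not correlation) and slid over every full-overlap window."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * flipped)
    return out

img = np.arange(9.0).reshape(3, 3)
k = np.array([[0.0, 1.0], [2.0, 3.0]])
print(conv2d_valid(img, k))  # 2x2 result: [[5, 11], [23, 29]]
```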
...@@ -23,6 +23,7 @@ installation (see :ref:`install`).
numpy
adding
examples
loading_and_saving
debugmode
profilemode
debug_faq
......
.. _tutorial_loadsave:
==================
Loading and Saving
==================
Many Theano objects can be serialized. However, you will want to consider different mechanisms
depending on the amount of time you anticipate between saving and reloading. For short-term
storage (such as temp files and network transfers), pickling works. For long-term storage (such as
saving models from an experiment), you should not rely on pickled Theano objects; instead, we
recommend loading and saving the underlying shared objects as you would in any other Python
program.
pickling -- Short-term serialization
=====================================
Pickling and unpickling of functions is supported, with caveats: do not rely on it for long-term storage.
***TODO***
not-pickling -- Long-term serialization
=======================================
***TODO***
Give a short example of how to add ``__getstate__`` and ``__setstate__`` methods to a class. Point out
that ``protocol=-1`` should be used for numpy ndarrays.
Point to the Python docs for further reading.
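A minimal sketch of the pattern the notes above describe; the ``Model`` class and its fields are hypothetical, chosen only to show the shape of the two methods:

```python
import pickle
import numpy as np

class Model(object):
    """Illustrative class: serialize only the data worth keeping long-term."""
    def __init__(self, weights):
        self.weights = np.asarray(weights)
        self._scratch = None   # transient state, not worth serializing

    def __getstate__(self):
        # Save only the numpy array; drop transient members.
        return {'weights': self.weights}

    def __setstate__(self, state):
        self.weights = state['weights']
        self._scratch = None   # rebuild transient state on load

m = Model([1.0, 2.0, 3.0])
# protocol=-1 selects the highest pickle protocol, which stores numpy
# ndarrays compactly in binary rather than as text.
data = pickle.dumps(m, protocol=-1)
m2 = pickle.loads(data)
print(m2.weights)
```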
...@@ -21,26 +21,6 @@ _msg_badlen = 'op.grad(...) returned wrong number of gradients'
def grad_sources_inputs(sources, graph_inputs, warn_type=True):
    """
A gradient source is a pair (``r``, ``g_r``), in which ``r`` is a `Variable`, and ``g_r`` is a
`Variable` that is a gradient wrt ``r``.
This function traverses the graph backward from the ``r`` sources,
calling ``op.grad(...)`` for all ops with some non-None gradient on an output.
The ``op.grad(...)`` functions are called like this:
.. code-block:: python
op.grad(op.inputs[:], [total_gradient(v for v in op.outputs)])
This call to ``op.grad`` should return a list or tuple: one symbolic gradient per input.
If ``op`` has a single input, then ``op.grad`` should return a list or tuple of length 1.
For each input wrt to which ``op`` is not differentiable, it should return ``None`` instead
of a `Variable` instance.
If a source ``r`` receives a gradient from another source ``r2``, then the effective
gradient on ``r`` is the sum of both gradients.
:type sources: list of pairs of Variable: (v, gradient-on-v)
:param sources: gradients to back-propagate using chain rule
:type graph_inputs: list of Variable
......
...@@ -27,6 +27,6 @@ def test_no_shared_var_graph():
f = theano.function([a,b],[a+b], mode=mode_with_gpu)
l = f.maker.env.toposort()
assert len(l)==4
assert numpy.any(isinstance(x.op,cuda.GpuElemwise) for x in l)
assert numpy.any(isinstance(x.op,cuda.GpuFromHost) for x in l)
assert numpy.any(isinstance(x.op,cuda.HostFromGpu) for x in l)
...@@ -3285,6 +3285,8 @@ class TensorDot(Op):
        return "tensordot"
tensordot = TensorDot
#TODO: tensordot should be function as described in rst docs.
class Outer(Op):
    """ Compute vector-vector outer product
    """
......