Commit d6d69aa8 authored by Frederic

Merge remote-tracking branch 'central/master' into op_doc_merge

Conflicts: doc/cifarSC2011/extending_theano.txt doc/extending/op.txt
@@ -67,31 +67,33 @@ Op contract
.. ../extending/op.txt
There are 2 mandatory methods. The first is :func:`make_node`. The
second is the one that expresses what computation should be done at run
time. Currently you have 4 possibilities: implement the :func:`perform`
and/or :func:`c_code <Op.c_code>` (and other related :ref:`C functions
<cop>`), or the :func:`make_thunk` method. The ``perform`` method allows you
to easily wrap an existing Python function in Theano. The ``c_code``
and related methods allow you to have your op generate C code and
have Theano compile and link to it. The ``make_thunk`` method will
be called during compilation and should generate a ``thunk``: a
method that when called will do the desired computation. This is
useful if you want to generate code and compile it yourself. For
example, this allows you to use PyCUDA to compile GPU code.
There are 2 mandatory/highly recommended methods. They are needed for a basic
optimization that merges duplicate computations in a Theano function. Thus,
if you don't want Theano to perform your computations multiple times for no
good reason, implement these! Those methods are :func:`__eq__` and
There are 2 mandatory methods that one needs to implement.
The first one is :func:`make_node`. The second one
describes the computations that are required to be done
at run time. Currently there are 2 different possibilities:
implement the :func:`perform`
and/or :func:`c_code <Op.c_code>` (and other related :ref:`C methods
<cop>`), or the :func:`make_thunk` method. The ``perform`` method allows you
to easily wrap an existing Python function into Theano. The ``c_code``
and related methods allow the op to generate C code that will be
compiled and linked by Theano. On the other hand, the ``make_thunk``
method will be called only once, during compilation, and should generate
a ``thunk``: a standalone function that, when called, performs the wanted
computations. This is useful if you want to generate code and compile it
yourself. For example, this allows you to use PyCUDA to compile GPU code.
There are also 2 methods that are highly recommended to implement. They are
needed in order to merge duplicate computations involving your op. So if you
do not want Theano to execute your op multiple times with the same inputs,
do implement them. Those methods are :func:`__eq__` and
:func:`__hash__`.
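For illustration only, here is a minimal sketch of such an op: it doubles its
input, wraps the computation with ``perform``, and implements ``__eq__`` and
``__hash__`` (the class name and details are illustrative, not prescribed)::

    import theano

    class DoubleOp(theano.Op):
        """Hypothetical op computing ``x * 2``, used as a running example."""

        def __eq__(self, other):
            # No parameters to compare: any two instances are interchangeable.
            return type(self) == type(other)

        def __hash__(self):
            # Must be consistent with __eq__.
            return hash(type(self))

        def make_node(self, x):
            x = theano.tensor.as_tensor_variable(x)
            return theano.Apply(self, [x], [x.type()])

        def perform(self, node, inputs, output_storage):
            x = inputs[0]
            # Each output has a one-element storage cell to fill in.
            output_storage[0][0] = x * 2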
The :func:`infer_shape` method allows for some very interesting
optimizations, such as not performing your op's computations simply to
determine the shape of your Op's output.
The :func:`infer_shape` method allows Theano to infer the shape of a variable
in the middle of the computational graph without actually computing the outputs
(when possible). This is helpful if one only needs the shape of an output
instead of the actual outputs.
The :func:`grad` method is needed if you want symbolic differentiation to
work with your Op.
The :func:`grad` method is required if you want to differentiate some cost whose expression
includes your op.
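Continuing the hypothetical ``DoubleOp`` sketch above (which computes
``x * 2``), its Jacobian is constant, so ``grad`` reduces to scaling the
incoming gradient::

        def grad(self, inputs, output_grads):
            # d(2 * x) / dx == 2; apply the chain rule to the output gradient.
            return [output_grads[0] * 2]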
The :func:`__str__` method is useful for providing a more meaningful string
representation of your Op.
@@ -141,14 +141,15 @@ following methods:
Optional.
This method is needed for shape optimization. ``shapes`` is a
list with one tuple for each input to the Apply node linked to this Op.
Each tuple contains 1 element for each dimension of the
corresponding input. The value corresponds to the input's size
along the given dimension.
This function is needed for shape optimization. ``shapes`` is a
list with one tuple for each input of the Apply node (which corresponds
to the inputs of the op). Each tuple contains 1 element for
each dimension of the corresponding input. The value is the
shape (number of elements) along the corresponding dimension of that
specific input.
This sounds complicated, but this is just the corresponding input's
shape in a symbolic variable.
While this might sound complicated, it is nothing more than the shape
of each input, given as symbolic variables (one per dimension).
The function should return a list with one tuple for each output.
Each tuple should contain the corresponding output's computed shape.
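For an elementwise op such as the hypothetical ``DoubleOp`` sketched earlier,
the single output simply has the shape of the single input::

        def infer_shape(self, node, shapes):
            # ``shapes`` holds one tuple of symbolic dimensions per input;
            # return one such tuple per output.
            return [shapes[0]]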
@@ -165,9 +166,30 @@ following methods:
Optional.
This function is needed for theano.tensor.Rop to work with this op.
TODO: add more detail.
This function implements the application of the R-operator to the
function represented by your op. Let us assume that the function is
:math:`f`, with input :math:`x`. Applying the R-operator means computing
the Jacobian of :math:`f` and right-multiplying it by :math:`v`, the
evaluation point, namely: :math:`\frac{\partial f}{\partial x} v`.

``inputs`` are the symbolic variables corresponding to the value of
the input where you want to evaluate the Jacobian, and ``eval_points``
are the symbolic variables corresponding to the value you want to
right-multiply the Jacobian with.

The same conventions as for the :func:`grad` method hold. If your op
is not differentiable, you can return None. Note that in contrast to
:func:`grad`, :func:`R_op` needs to return the same number of outputs
as there are outputs of the op. You can think of it in the following
terms. You have all your inputs concatenated into a single vector
:math:`x`. You do the same with the evaluation points (which are as
many as the inputs and of the same shape) and obtain another vector
:math:`v`. For each output, you reshape it into a vector, compute the
Jacobian of that vector with respect to :math:`x`, and multiply it by
:math:`v`. As a last step, you reshape each of these vectors (which
have the same shape as the outputs) back to its corresponding output
shape and return them as the result of the :func:`R_op` method.
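As a sketch, for the hypothetical ``DoubleOp`` above the Jacobian of
:math:`x \mapsto 2x` is :math:`2I`, so right-multiplying it by the
evaluation point just doubles it::

        def R_op(self, inputs, eval_points):
            # J v == 2 * v for this linear op; keep the None convention.
            if eval_points[0] is None:
                return [None]
            return [eval_points[0] * 2]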
.. attribute:: default_output
@@ -184,7 +206,7 @@ following methods:
Syntactic shortcut to make_node which returns the output
Variables of the Op.
*Default:* this is done for you by Op.
*Default:* this is implemented in the parent class and you do not need to change it.
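For instance, with the hypothetical ``DoubleOp`` from earlier, this shortcut
lets you apply the op directly to build the graph::

    import theano
    import theano.tensor as T

    x = T.matrix('x')
    y = DoubleOp()(x)            # __call__ invokes make_node under the hood
    f = theano.function([x], y)  # f(some_array) returns some_array * 2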
.. function:: __str__()
@@ -7,6 +7,9 @@ import logging
import traceback
import warnings
# Third-party imports
import numpy
# Theano imports
from theano import config
from theano.configparser import (TheanoConfigParser, AddConfigVar, EnumStr,
@@ -168,6 +171,28 @@ class SharedVariable(Variable):
update = shared(update)
return update
    def __getitem__(self, *args):
        # __getitem__ is not available for generic SharedVariable objects.
        # We raise a TypeError like Python would do if __getitem__ was not
        # implemented at all, but with a more explicit error message to help
        # Theano users figure out the root of the problem more easily.
        value = self.get_value(borrow=True)
        if isinstance(value, numpy.ndarray):
            # Array probably had an unknown dtype.
            msg = ("a Numpy array with dtype: '%s'. This data type is not "
                   "currently recognized by Theano tensors: please cast "
                   "your data into a supported numeric type if you need "
                   "Theano tensor functionalities." % value.dtype)
        else:
            msg = ('an object of type: %s. Did you forget to cast it into '
                   'a Numpy array before calling theano.shared()?' %
                   type(value))
        raise TypeError(
            "The generic 'SharedVariable' object is not subscriptable. "
            "This shared variable contains %s" % msg)
def shared_constructor(ctor):
    shared.constructors.append(ctor)
    return ctor
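For illustration, here is a hypothetical session (not part of this commit)
that would hit the new error path, assuming an object-dtype array falls
through to the generic constructor::

    import numpy
    import theano

    # dtype=object is not a recognized tensor dtype, so theano.shared()
    # falls back to the generic SharedVariable.
    s = theano.shared(numpy.asarray([1, 2, 3], dtype=object))
    s[0]  # raises the explicit TypeError defined above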
@@ -140,6 +140,12 @@ class CudaNdarraySharedVariable(SharedVariable, _operators):
other.type.broadcastable)))
return GpuFromHost()(other)
    def __getitem__(self, *args):
        # Defined to explicitly use the implementation from `_operators`, since
        # the definition in `SharedVariable` is only meant to raise an error.
        return _operators.__getitem__(self, *args)
CudaNdarrayType.SharedVariable = CudaNdarraySharedVariable
def cuda_shared_constructor(value, name=None, strict=False,
@@ -53,7 +53,7 @@ from theano.tensor import opt
from theano import tensor
from theano import config
from theano.updates import Updates
from theano.sandbox import cuda
import scan_op
import scan_utils
@@ -914,6 +914,11 @@ def scan( fn
shared_inner_outputs )
    if condition is not None:
        inner_outs.append(condition)
    # Cuda is imported here, instead of at the top of the file, because
    # importing it at the top would force dependencies on the user that we
    # might not want. Currently we are working on removing the dependencies
    # on sandbox code completely.
    from theano.sandbox import cuda
    if cuda.cuda_available:
        # very often we end up in this situation when we want to
        # replace w with w_copy, where w is a CudaNdarray