testgroup / pytensor

Commit 0fe8bcb1
Authored Oct 21, 2011 by Razvan Pascanu
Merge pull request #133 from nouiz/op_doc
Op doc Due to lack of experience with git, I'll merge this pull request and send another one that fixes all the bugs I've seen.
Parents: 11ee6cee 9501ca28
Showing 5 changed files with 103 additions and 30 deletions
doc/cifarSC2011/extending_theano.txt   +42  -3
doc/extending/cop.txt                   +1  -0
doc/extending/op.txt                   +58  -26
doc/library/config.txt                  +1  -0
theano/misc/check_blas.py               +1  -1
doc/cifarSC2011/extending_theano.txt
...
...
@@ -11,6 +11,7 @@ Theano graphs
- Theano works with symbolic graphs
- Those graphs are bipartite graphs (graphs with 2 types of nodes)
- Those 2 node types are Apply and Variable nodes
- An Apply node has a link to the Op that it executes
- Inputs and Outputs are lists of Theano variables
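As a quick illustration (an editorial sketch, not part of this commit; the variable names are arbitrary), both node types can be seen by inspecting a tiny graph::

    import theano.tensor as T

    x = T.dmatrix('x')          # x is a Variable node
    y = x * 2                   # building an expression creates an Apply node
    apply_node = y.owner        # the Apply node that produced y
    print apply_node.op         # the Op this Apply node executes
    print apply_node.inputs     # list of input Variable nodes
    print apply_node.outputs    # list of output Variable nodes (contains y)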
...
...
@@ -25,21 +26,59 @@ Op contract
import theano

class MyOp(theano.Op):
    def __eq__(self, other):
    def __hash__(self):
    def __str__(self):
    def make_node(self, x):
    # Python implementation:
    def perform(self, node, inputs_storage, output_storage):
    # C implementation: [see theano web site for other functions]
    def c_code(...):
    # ...
    # other implementations (pycuda, ...):
    def make_thunk(self, node, storage_map, _, _2):
    # optional:
    def __init__(self, ...):
    def grad(self, inputs, g):
    def R_op(self, inputs, eval_points):
    def infer_shape(node, (i0_shapes, ...))
.. ../extending/op.txt
There are two mandatory functions. The first is :func:`make_node`. The
second is the one that performs, or describes, the computation to do at run
time. Currently you have 4 possibilities: implement :func:`perform`,
:func:`c_code <Op.c_code>` (and the other related :ref:`c functions
<cop>`), both of them, or the :func:`make_thunk` function. ``perform``
allows you to easily wrap an existing Python function in Theano. ``c_code``
and the related functions allow your op to generate C code that Theano
will compile and link against. The ``make_thunk`` function is called
during compilation and should return a ``thunk``: a function that, when
called, does the wanted computation. This is useful if you want to
generate code and compile it yourself; for example, it allows you to use
PyCUDA to compile GPU code.
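For example, here is a minimal sketch of wrapping a Python function with ``perform`` (the ``DoubleOp`` name and the doubling behaviour are illustrative, not part of this commit)::

    import numpy
    import theano

    class DoubleOp(theano.Op):
        def make_node(self, x):
            x = theano.tensor.as_tensor_variable(x)
            # one input, one output of the same type
            return theano.Apply(self, [x], [x.type()])

        def perform(self, node, inputs_storage, output_storage):
            x, = inputs_storage
            # write the result into the pre-allocated one-element list
            output_storage[0][0] = 2 * numpy.asarray(x)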
There are two more mandatory (or at least highly suggested) functions:
:func:`__eq__` and :func:`__hash__`. They are needed for a basic
optimization that merges duplicate computations in a Theano function. So
if you don't want Theano to do your computation multiple times for no
good reason, implement them!
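For an op that carries a parameter, a sketch of these two methods might look like this (``ScalarShift`` is a hypothetical name used only for illustration)::

    class ScalarShift(theano.Op):
        def __init__(self, shift):
            self.shift = shift

        # Two instances with the same parameter compare equal, so the
        # merge optimization can unify duplicate computations.
        def __eq__(self, other):
            return type(self) == type(other) and self.shift == other.shift

        def __hash__(self):
            return hash((type(self), self.shift))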
The :func:`infer_shape` method enables some very interesting
optimizations, such as not performing your op's computation at all when
only the shape of its output is needed.
The :func:`grad` method is needed if you want differentiation to
work with your op.
:func:`__str__` is useful to get a nicer printing of your op.
:func:`R_op` is needed if you want ``theano.tensor.Rop`` to work with your op.
Op example
----------
...
...
doc/extending/cop.txt
.. _cop:
====================================
Implementing the arithmetic Ops in C
...
...
doc/extending/op.txt
...
...
@@ -31,23 +31,6 @@ following methods:
ordered correctly: a subsequent ``self.make_node(*apply.inputs)``
must produce something equivalent to the first ``apply``.
.. function:: perform(node, inputs, output_storage)
This method computes the function associated to this Op. The
...
...
@@ -57,7 +40,7 @@ following methods:
variables of the computation must be put. More specifically:
- ``node``: This is a reference to an Apply node which was previously
  obtained via the ``Op``'s ``make_node`` method. It is typically not
  used in simple Ops, but it contains symbolic information that
  could be required for complex Ops.
...
...
@@ -111,18 +94,14 @@ following methods:
lifetime of self. Op instances should be immutable in this
sense.
.. function:: grad(inputs, output_gradients)

   Optional (but needed if you want it to work with {tensor,sparse}.grad()).

   If the Op you are defining is differentiable, you can define its
   gradient symbolically in this method. Both the ``inputs`` and
   ``output_gradients`` will be lists of Theano Variables. This method
   must return a list containing one Variable (or ``None``) for each
   input. Each returned Variable represents the gradient with respect
   to that input given the symbolic gradients
...
...
@@ -158,11 +137,64 @@ following methods:
Both the partial derivation and that multiplication have to be done by
:func:`grad`.
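Continuing the hypothetical ``DoubleOp`` sketched earlier in this diff, its output is linear in its input, so its gradient is simply::

    def grad(self, inputs, output_gradients):
        # d(2*x)/dx == 2, so the gradient wrt the single input is 2 * g
        return [2 * output_gradients[0]]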
.. function:: infer_shape(node, shapes)

   Optional.

   This function is needed for the shape optimization. ``shapes`` is a
   list with one tuple for each input of the Apply node linked to this
   op. Each tuple contains one element per dimension of the
   corresponding input, giving that dimension's shape. This sounds
   complicated, but it is just the inputs' shapes as symbolic variables.

   The function should return a list with one tuple for each output.
   Each tuple should contain the corresponding output's shape.
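For the hypothetical elementwise ``DoubleOp`` above, a sketch would be::

    def infer_shape(self, node, shapes):
        # shapes is [(dim0, dim1, ...)], one tuple per input; the output
        # of an elementwise op has the same shape as its input.
        return [shapes[0]]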
.. function:: make_thunk(node, storage_map, compute_map, no_recycling)
TODO
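A sketch of what such a thunk could look like for the hypothetical ``DoubleOp``, assuming ``storage_map`` maps each variable to a one-element list holding its current value and ``compute_map`` to a one-element list of booleans::

    def make_thunk(self, node, storage_map, compute_map, no_recycling):
        x_storage = storage_map[node.inputs[0]]
        y_storage = storage_map[node.outputs[0]]

        def thunk():
            y_storage[0] = 2 * x_storage[0]          # the op's computation
            compute_map[node.outputs[0]][0] = True   # mark the output computed

        return thunk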
.. function:: R_op(inputs, eval_points)
Optional.
This function is needed for theano.tensor.Rop to work with this op.
TODO: add more detail.
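For the linear ``DoubleOp`` sketched earlier, the R-operator (the Jacobian applied to the evaluation points) is simply the op applied to the evaluation points::

    def R_op(self, inputs, eval_points):
        if eval_points[0] is None:
            return [None]
        return [2 * eval_points[0]]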
.. attribute:: default_output

   *Default:* None

   If this member variable is an integer, then the default
   implementation of ``__call__`` will return
   ``node.outputs[self.default_output]``, where ``node`` was returned
   by ``make_node``. Otherwise, the entire list of outputs will be
   returned.

.. function:: __call__(*inputs)

   Syntactic shortcut to ``make_node`` which returns the output
   Variables of the Op.

   *Default:* this is done for you by Op.
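So, with the hypothetical ``DoubleOp`` above, building a node reduces to a call::

    x = theano.tensor.dmatrix('x')
    y = DoubleOp()(x)   # builds an Apply node via make_node, returns its output(s)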
.. function:: __str__()

   *Default:* Python default: module_path_to_your_class.CLASSNAME

   This allows you to get a nicer printing of your Op. If an Op has
   parameters, it is highly recommended that ``__str__`` print the
   name of the op and the values of the Op's parameters.
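For the hypothetical parameterized ``ScalarShift`` above, that could be::

    def __str__(self):
        # class name plus the parameter values
        return "%s{shift=%s}" % (self.__class__.__name__, self.shift)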
At a bare minimum, a new Op must define ``make_node`` and ``perform``, which have no defaults.
Also, you can provide a :ref:`C implementation <cop>` of
``perform()``. For other details, refer to the documentation for
:ref:`op`.
Defining an Op: ``mul``
...
...
doc/library/config.txt
...
...
@@ -474,6 +474,7 @@ import theano and print the config variable, as in:
When not ``'off'``, the value of this option dictates what happens when
an Op's inputs do not provide appropriate test values:
- ``'ignore'`` will silently skip the debug mechanism for this Op
- ``'warn'`` will raise a UserWarning and skip the debug mechanism for
this Op
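Assuming this hunk documents the ``compute_test_value`` flag (the surrounding context is elided here), typical usage looks like::

    import numpy
    import theano
    import theano.tensor as T

    theano.config.compute_test_value = 'warn'
    x = T.dmatrix('x')
    x.tag.test_value = numpy.zeros((2, 2))   # supply a test value
    y = x * 2   # evaluated on the test value as the graph is built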
...
...
theano/misc/check_blas.py
...
...
@@ -92,7 +92,7 @@ if __name__ == "__main__":
    if verbose:
        print """
        Some results you can compare against. They were 10 executions of gemm in float64 with matrices of shape 2000x2000.
        Cpu tested: Xeon E5345(2.33Ghz, 8M L2 cache, 1333Mhz FSB), Xeon E5430(2.66Ghz, 12M L2 cache, 1333Mhz FSB),
        Xeon E5450(3Ghz, 12M L2 cache, 1333Mhz FSB), Xeon X5560(2.8Ghz, 12M L2 cache, 6.4GT/s QPI, hyper-threads enabled?)
...
...