Initial run of documentation for GPU ops.

ff13f2ec · Arnaud Bergeron · 0d08205b · ff13f2ec · ff13f2ec
--- a/doc/extending/extending_theano_gpu.txt
+++ b/doc/extending/extending_theano_gpu.txt
+.. _extending_theano_gpu:
+==============================
+Extending Theano with a GPU Op
+==============================
+This tutorial covers how to extend Theano with an op that offers a GPU
+implementation.  It assumes you are familiar with how to write new
+Theano ops.  If that is not the case you should probably follow the
+:ref:`extending_theano` and :ref:`extending_theano_c` sections before
+continuing on.
+Writing a new GPU op can be done in Python for some simple tasks, but
+will usually be done in C to access the complete API and avoid paying
+the overhead of a Python function call.
+Dealing with the context
+========================
+One of the major differences with GPU ops is that they require a
+context (a.k.a. device) to execute.  Most of the time you can infer
+the context to run on from your inputs.  There is a way for the user
+to transfer things between contexts and to tag certain variables for
+transfer.  It might also be the case that your inputs are not all from
+the same context and you would have to choose which one to run on.
+In order to support all of those options and have a consistent
+interface, :func:`theano.gpuarray.basic_ops.infer_context_name` was
+written.  An example usage is below::
+    def make_node(self, a, b, c):
+        ctx = infer_context_name(a, b, c)
+        a = as_gpuarray_variable(a, ctx)
+        b = as_gpuarray_variable(b, ctx)
+        c = as_gpuarray_variable(c, ctx)
+        return Apply(self, [a, b, c], [a.type()])
+In this example the Op takes three inputs, all on the GPU.  In case
+one or more of your inputs is not supposed to be on the GPU, you
+should not pass it to `infer_context_name`.
+Also note that :func:`theano.gpuarray.basic_ops.as_gpuarray_variable`
+takes `context_name` as a mandatory parameter.  This is because it's
+not enough to know you want the value to be on the GPU, you also want
+to know which GPU to put it on.  In almost all cases, you can pass in
+the return value of `infer_context_name` there.
+If you also need the context during runtime (for example to allocate
+the output), you can use the context of one of your inputs to know
+which one to use.  Here is another example::
+    def perform(self, node, inputs, output_storage):
+        A, B = inputs
+        C, = output_storage
+        C[0] = pygpu.empty([A.shape[0], B.shape[1]], dtype=A.dtype, context=A.context)
+        pygpu.blas.gemm(1, A, B, 0, C[0], overwrite_c=True)
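+The output shape and the `0` passed as beta in the example above can
+be checked with a small host-side sketch.  It is plain Python with
+nested lists standing in for GPU arrays, and `toy_gemm` is a
+hypothetical helper, not pygpu's implementation.  It assumes the usual
+BLAS meaning of gemm, that is C = alpha * A.B + beta * C.

```python
def toy_gemm(alpha, A, B, beta, C):
    """Return alpha * (A . B) + beta * C for nested-list matrices.

    Hypothetical stand-in for illustration; pygpu.blas.gemm runs on the GPU.
    """
    n_rows, n_inner, n_cols = len(A), len(B), len(B[0])
    assert all(len(row) == n_inner for row in A), "inner dimensions differ"
    out = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            acc = sum(A[i][k] * B[k][j] for k in range(n_inner))
            row.append(alpha * acc + beta * C[i][j])
        out.append(row)
    return out


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]            # shape (2, 3)
B = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]                 # shape (3, 2)

# The output has A's row count and B's column count, matching the
# [A.shape[0], B.shape[1]] allocation.  Its initial content is arbitrary
# here to mimic an uninitialised buffer.
C = [[123.0, -7.0],
     [42.0, 99.0]]

result = toy_gemm(1, A, B, 0, C)
```

+Because beta is 0, the previous content of C does not contribute to
+the result, which is why an uninitialised buffer is enough for the
+output.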
+Finally, if you require the context before perform, such as during
+make_thunk() to initialize kernels and such, you can access the
+context of your inputs through the type of the variables::
+    def make_thunk(self, node, storage_map, compute_map, no_recycling):
+        ctx = node.inputs[0].type.context
+Note that GpuArrayType objects also have a `context_name` attribute
+which is the symbolic equivalent of `context`.  It can't be used for
+calls to pygpu or libgpuarray, but it should be used for Theano
+operations and variables.
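+The split between the two attributes can be pictured with a toy
+model.  The sketch below is plain Python and every name in it
+(`ToyContext`, `ToyGpuType`, `register_context`) is hypothetical.  It
+assumes, as a simplification, that a context name is a key that gets
+resolved to a runtime context object; it is not Theano's
+implementation.

```python
class ToyContext:
    """Stand-in for a runtime device context (hypothetical)."""

    def __init__(self, device):
        self.device = device


# Assumption for this sketch: names map to live context objects.
_contexts = {}


def register_context(name, ctx):
    _contexts[name] = ctx


class ToyGpuType:
    """Stand-in for a GPU variable type (hypothetical)."""

    def __init__(self, context_name):
        # The name is cheap, hashable data that can live in a graph.
        self.context_name = context_name

    @property
    def context(self):
        # The runtime object is looked up only when it is needed.
        return _contexts[self.context_name]


register_context("dev0", ToyContext("cuda0"))
t = ToyGpuType("dev0")

symbolic = t.context_name   # what graph-level code would pass around
runtime = t.context         # what a GPU library call would receive
```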
+The last place where you might need the context is in the C
+initialization code.  For that you will have to use the :ref:`params
+<extending_op_params>`.  The params type should be
+:class:`theano.gpuarray.type.gpu_context_type` and the params object
+should be a context object from one of your input variables.
+If you don't have any input variables on the GPU you can follow the
+example of :class:`theano.gpuarray.basic_ops.GpuFromHost` or
+:class:`theano.gpuarray.basic_ops.GpuEye`.  This is not a case that
+you should encounter often, so it will not be covered further.
--- a/doc/extending/index.txt
+++ b/doc/extending/index.txt
@@ -45,6 +45,7 @@ with Theano itself.
    ctype
    cop
    using_params
+    extending_theano_gpu
    optimization
    tips
    unittest