Commit ff13f2ec, authored by Arnaud Bergeron

Initial run of documentation for GPU ops.

Parent 0d08205b

.. _extending_theano_gpu:

==============================
Extending Theano with a GPU Op
==============================

This tutorial covers how to extend Theano with an op that offers a GPU
implementation. It assumes you are familiar with how to write new
Theano ops. If that is not the case, you should follow the
:ref:`extending_theano` and :ref:`extending_theano_c` sections before
continuing.

Writing a new GPU op can be done in Python for some simple tasks, but
will usually be done in C to access the complete API and avoid paying
the overhead of a Python function call.

Dealing with the context
========================

One of the major differences with GPU ops is that they require a
context (a.k.a. device) to execute. Most of the time you can infer
the context to run on from your inputs. There is a way for the user
to transfer things between contexts and to tag certain variables for
transfer. It might also be the case that your inputs are not all from
the same context and you would have to choose which one to run on.

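In order to support all of those options and have a consistent
interface, :func:`theano.gpuarray.basic_ops.infer_context_name` was
written. An example usage is below::

    from theano import Apply
    from theano.gpuarray.basic_ops import as_gpuarray_variable, infer_context_name

    def make_node(self, a, b, c):
        # Choose one context name based on the GPU inputs.
        ctx = infer_context_name(a, b, c)
        # Make sure every input is a GPU variable on that context.
        a = as_gpuarray_variable(a, ctx)
        b = as_gpuarray_variable(b, ctx)
        c = as_gpuarray_variable(c, ctx)
        return Apply(self, [a, b, c], [a.type()])
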
In this example the Op takes three inputs, all on the GPU. If one or
more of your inputs is not supposed to be on the GPU, you should not
pass it to `infer_context_name`.

Also note that :func:`theano.gpuarray.basic_ops.as_gpuarray_variable`
takes `context_name` as a mandatory parameter. This is because it is
not enough to know that you want the value to be on the GPU; you also
need to know which GPU to put it on. In almost all cases, you can pass
in the return value of `infer_context_name` there.

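The inference step can be pictured as a scan over the inputs that returns the context name of the first input that already has one. The sketch below is a simplified, pure-Python illustration of such a first-match policy, not Theano's implementation. `FakeHostType`, `FakeGpuType`, `FakeVariable`, and `toy_infer_context_name` are hypothetical stand-ins, and the real function may apply additional rules.

```python
class FakeHostType:
    """Hypothetical stand-in for a host (CPU) tensor type."""


class FakeGpuType:
    """Hypothetical stand-in for a GPU array type bound to a named context."""

    def __init__(self, context_name):
        self.context_name = context_name


class FakeVariable:
    """Hypothetical stand-in for a symbolic variable."""

    def __init__(self, type_, tagged_context=None):
        self.type = type_
        # Optional user-supplied hint naming the context to transfer to.
        self.tagged_context = tagged_context


def toy_infer_context_name(*variables):
    """Return the context name of the first variable that has one.

    Assumed policy, for illustration only: the first match wins.
    """
    for var in variables:
        if isinstance(var.type, FakeGpuType):
            return var.type.context_name
        if var.tagged_context is not None:
            return var.tagged_context
    raise ValueError("Could not infer context from inputs")


a = FakeVariable(FakeHostType())      # host input, carries no context
b = FakeVariable(FakeGpuType("dev1"))
c = FakeVariable(FakeGpuType("dev0"))
chosen = toy_infer_context_name(a, b, c)
```

Under this assumed policy the host input `a` is skipped and `b` decides the result, which is why inputs that must stay on the host have no business being passed in.
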
If you also need the context at runtime (for example to allocate the
output), you can use the context of one of your inputs. Here is
another example::

    import pygpu.blas

    def perform(self, node, inputs, output_storage):
        A, B = inputs
        C, = output_storage
        # Allocate the output on the same context as the first input.
        C[0] = pygpu.empty([A.shape[0], B.shape[1]], dtype=A.dtype,
                           context=A.context)
        # Compute C[0] = 1 * A.dot(B) + 0 * C[0], writing in place.
        pygpu.blas.gemm(1, A, B, 0, C[0], overwrite_c=True)

Finally, if you require the context before `perform`, such as during
`make_thunk()` to initialize kernels and such, you can access the
context of your inputs through the type of the variables::

    def make_thunk(self, node, storage_map, compute_map, no_recycling):
        ctx = node.inputs[0].type.context

Note that GpuArrayType objects also have a `context_name` attribute,
which is the symbolic equivalent of `context`. It can't be used for
calls to pygpu or libgpuarray, but it should be used for Theano
operations and variables.

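The split between `context_name` and `context` can be pictured as a registry: symbolic code carries a lightweight name, and the name is resolved to a runtime object only when a call into pygpu or libgpuarray is needed. The following pure-Python sketch illustrates that idea. `ToyContext`, `toy_register`, and `toy_resolve` are hypothetical names, and the registry is an assumption about how such a mapping could work rather than a description of Theano's internals.

```python
# Maps symbolic context names to runtime context objects.
_registry = {}


class ToyContext:
    """Hypothetical stand-in for a runtime GPU context object."""

    def __init__(self, device):
        self.device = device


def toy_register(name, context):
    """Associate a symbolic context name with a runtime context object."""
    if name in _registry:
        raise ValueError("context name %r is already registered" % (name,))
    _registry[name] = context


def toy_resolve(name):
    """Look up the runtime context object for a symbolic name."""
    try:
        return _registry[name]
    except KeyError:
        raise ValueError("no context registered under %r" % (name,))


# "dev0" and "cuda0" are illustrative values only.
toy_register("dev0", ToyContext("cuda0"))
ctx = toy_resolve("dev0")  # the object that runtime calls would receive
```

Keeping only the name on the symbolic side means a graph can be built and inspected without touching the device; the object is needed only at the point where memory is allocated or a kernel is launched.
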
The last place where you might need the context is in the C
initialization code. For that you will have to use the :ref:`params
<extending_op_params>`. The params type should be
:class:`theano.gpuarray.type.gpu_context_type` and the params object
should be a context object from one of your input variables.

If you don't have any input variables on the GPU, you can follow the
example of :class:`theano.gpuarray.basic_ops.GpuFromHost` or
:class:`theano.gpuarray.basic_ops.GpuEye`. This is not a case that
you should encounter often, so it will not be covered further.

@@ -45,6 +45,7 @@ with Theano itself.
    ctype
    cop
    using_params
+   extending_theano_gpu
    optimization
    tips
    unittest