Commit 98dea78f authored by Pierre Luc Carrier

Added more complex example on a C op

Parent b1fc6111
......@@ -310,9 +310,10 @@ class Op that are related to the C implementation. Of particular interest are:
:meth:`Op.c_no_compile_args` to specify requirements regarding how
the op's C code should be compiled.
This section describes the methods :meth:`Op.c_code`, :meth:`Op.c_support_code` and
:meth:`Op.c_code_cache_version` because they are the ones that are most commonly
used.
This section describes the methods :meth:`Op.c_code`,
:meth:`Op.c_support_code`, :meth:`Op.c_support_code_apply` and
:meth:`Op.c_code_cache_version` because they are the ones that are most
commonly used.
.. method:: c_code(node, name, input_names, output_names, sub)
......@@ -333,8 +334,17 @@ used.
Finally, ``sub`` is a dictionary of extra parameters to the c_code
method. Among other things, it contains ``sub['fail']`` which is a string
of C code that you should execute (after ensuring that a Python exception
is set) if your C code needs to raise an exception.
of C code that you should include in your C code (after ensuring that a
Python exception is set) if it needs to raise an exception. For example:
.. code-block:: python

    c_code = """
    PyErr_Format(PyExc_ValueError, "X does not have the right value");
    %(fail)s;
    """ % {'fail' : sub['fail']}

to raise a ValueError Python exception with the specified message.
:note:
Your C code should not return the output of the computation but
......@@ -343,9 +353,19 @@ used.
.. method:: c_support_code()
Returns a string containing the support C code for this op. This code
Returns a string containing some support C code for this op. This code
will be included at the global scope level and can be used to define
functions and structs that will be used by every apply of this op.
.. method:: c_support_code_apply()
Returns a string containing some support C code for this op. This code
will be included at the global scope level and can be used to define
functions and structs that will be used by the op's main C code.
functions and structs that will be used by this op. The difference between
this method and ``c_support_code()`` is that the C code specified in
``c_support_code_apply()`` should be specific to each apply of the Op,
while ``c_support_code()`` is for support code that is not specific to
each apply.
.. method:: c_code_cache_version()
......@@ -367,11 +387,13 @@ used.
Simple C Op example
=====================
===================
In this section, we put together every concept that was covered in this
In this section, we put together the concepts that were covered in this
tutorial to generate an op which multiplies every element in a vector
by a scalar and returns the resulting vector.
by a scalar and returns the resulting vector. This is intended to be a simple
example so the methods ``c_support_code()`` and ``c_support_code_apply()`` are
not used because they are not required.
In the C code below notice how the reference count on the output variable is
managed. Also take note of how the new variables required for the op's
......@@ -393,10 +415,6 @@ need to validate that the output storage has been allocated and has the same
shape as our vector input. If it is not the case, we allocate a new output
storage with the right shape and number of dimensions.
:note:
Given the simple nature of this op, there was no need to use the
``c_support_code()`` function.
.. code-block:: python
import numpy
......@@ -429,6 +447,9 @@ storage with the right shape and number of dimensions.
x, y = inp
z, = out
# Extract the dtypes of the inputs and outputs storage to
# be able to declare pointers for those dtypes in the C
# code.
dtype_x = node.inputs[0].dtype
dtype_y = node.inputs[1].dtype
dtype_z = node.outputs[0].dtype
......@@ -481,3 +502,156 @@ storage with the right shape and number of dimensions.
"""
return c_code % locals()
More complex C Op example
=========================
This section introduces a new example, slightly more complex than the previous
one, with an op that performs an element-wise multiplication of two
vectors. This new example differs from the previous one in its use
of the methods ``c_support_code()`` and ``c_support_code_apply()`` (it does
not *need* to use them but does so to illustrate their use) and in its capacity
to support inputs of different dtypes.
Recall that the method ``c_support_code()`` is meant to produce code that will
be used for every apply of the op. This means that the C code in this
method must be valid in every setting your op supports. If the op is meant
to support inputs of various dtypes, the C code in this method should be
generic enough to work with every supported dtype. If the op operates on
inputs that can be vectors or matrices, the C code in this method should
be able to accommodate both kinds of inputs.
In our example, the method ``c_support_code()`` is used to declare a C
function to validate that two vectors have the same shape. Because our
op only supports vectors as inputs, this function is allowed to rely
on its inputs being vectors. However, our op should support multiple
dtypes so this function cannot rely on a specific dtype in its inputs.
The method ``c_support_code_apply()``, on the other hand, is allowed
to depend on the inputs to the op because it is apply-specific. Therefore, we
use it to define a function to perform the multiplication between two vectors.
Variables or functions defined in the method ``c_support_code_apply()`` will
be included at the global scope for every apply of the Op. Because of this,
the names of those variables and functions should include the ``name``
argument passed to the method, as in the example. Otherwise, using the op
twice in the same graph will give rise to conflicts as some elements will be
declared more than once.
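The naming rule above can be illustrated without Theano. The following is only a sketch: the template is a shortened stand-in for the one in the example, and the names ``node_0`` and ``node_1`` are hypothetical placeholders for whatever names are actually passed in as ``name``.

```python
# Shortened, hypothetical version of an apply-specific support-code template.
SUPPORT_TEMPLATE = """
void vector_elemwise_mult_%(name)s(npy_%(dtype_x)s* x_ptr, int nbElements)
{
    /* body omitted in this sketch */
}
"""


def render_support_code(name, dtype_x):
    """Fill the template the same way the example does with ``% locals()``."""
    return SUPPORT_TEMPLATE % {'name': name, 'dtype_x': dtype_x}


# Two applies of the same op, with different (hypothetical) names and dtypes.
first = render_support_code('node_0', 'float32')
second = render_support_code('node_1', 'float64')

# Each apply gets its own C function, so both definitions can sit at the
# global scope of the same compiled module without clashing.
print(first)
print(second)
```

If the ``%(name)s`` suffix were dropped from the template, both applies would define a function with the same name at the global scope, which is the conflict described above.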
The last interesting difference occurs in the ``c_code()`` method. Because the
dtype of the output is variable and not guaranteed to be the same as any of
the inputs (because of the upcast in the method ``make_node()``), the typenum
of the output has to be obtained in the Python code and then included in the
C code.
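As a rough illustration of looking up the typenum on the Python side, the sketch below uses only NumPy. It assumes ``numpy.result_type`` as a stand-in for ``theano.scalar.upcast``, which may not agree with Theano's upcast rules in every case; the input dtypes are arbitrary.

```python
import numpy

# Arbitrary input dtypes chosen for this sketch.
dtype_x = numpy.dtype('float32')
dtype_y = numpy.dtype('int64')

# Assumption: NumPy's promotion is used here in place of
# theano.scalar.upcast, only to show that the output dtype can differ
# from both input dtypes.
dtype_z = numpy.result_type(dtype_x, dtype_y)

# The integer typenum is what gets substituted into the C code and
# handed to PyArray_EMPTY.
typenum_z = dtype_z.num

c_snippet = "PyArray_EMPTY(1, dims, %(typenum_z)s, 0);" % {
    'typenum_z': typenum_z}
print(dtype_z, typenum_z)
print(c_snippet)
```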
.. code-block:: python

    # Imports repeated here so that the example stands on its own
    import numpy
    import theano
    from theano import gof


    class VectorTimesVector(gof.Op):
        __props__ = ()

        def __init__(self, **kwargs):
            gof.Op.__init__(self, **kwargs)

        def make_node(self, x, y):
            # Validate the inputs' type
            if x.type.ndim != 1:
                raise TypeError('x must be a 1-d vector')
            if y.type.ndim != 1:
                raise TypeError('y must be a 1-d vector')

            # Create an output variable whose dtype is the upcast of the
            # dtypes of x and y
            output_var = theano.tensor.TensorType(
                dtype=theano.scalar.upcast(x.dtype, y.dtype),
                broadcastable=[False])()

            return gof.Apply(self, [x, y], [output_var])

        def c_code_cache_version(self):
            return (1, 0, 1)

        def c_support_code(self):
            # Shared by every apply of the op, so it must not depend on
            # the dtypes of the inputs.
            c_support_code = """
            bool vector_same_shape(PyArrayObject* arr1,
                                   PyArrayObject* arr2)
            {
                return (PyArray_DIMS(arr1)[0] == PyArray_DIMS(arr2)[0]);
            }
            """

            return c_support_code

        def c_support_code_apply(self, node, name):
            # Specific to one apply: the dtypes and the name of the
            # function both depend on the node.
            dtype_x = node.inputs[0].dtype
            dtype_y = node.inputs[1].dtype
            dtype_z = node.outputs[0].dtype

            c_support_code = """
            void vector_elemwise_mult_%(name)s(npy_%(dtype_x)s* x_ptr,
                int x_str, npy_%(dtype_y)s* y_ptr, int y_str,
                npy_%(dtype_z)s* z_ptr, int z_str, int nbElements)
            {
                for (int i=0; i < nbElements; i++){
                    z_ptr[i * z_str] = x_ptr[i * x_str] * y_ptr[i * y_str];
                }
            }
            """

            return c_support_code % locals()

        def c_code(self, node, name, inp, out, sub):
            x, y = inp
            z, = out

            dtype_x = node.inputs[0].dtype
            dtype_y = node.inputs[1].dtype
            dtype_z = node.outputs[0].dtype

            # Item sizes are used to convert byte strides into element
            # strides; the typenum is used to allocate the output.
            itemsize_x = numpy.dtype(dtype_x).itemsize
            itemsize_y = numpy.dtype(dtype_y).itemsize
            itemsize_z = numpy.dtype(dtype_z).itemsize

            typenum_z = numpy.dtype(dtype_z).num

            fail = sub['fail']

            c_code = """
            // Validate that the inputs have the same shape
            if ( !vector_same_shape(%(x)s, %(y)s))
            {
                PyErr_Format(PyExc_ValueError, "x.shape[0] != y.shape[0]");
                %(fail)s;
            }

            // Validate that the output storage exists and has the same
            // dimension as x.
            if (NULL == %(z)s || !(vector_same_shape(%(x)s, %(z)s)))
            {
                /* Reference received to invalid output variable.
                Decrease received reference's ref count and allocate new
                output variable */
                Py_XDECREF(%(z)s);
                %(z)s = (PyArrayObject*)PyArray_EMPTY(1,
                                                    PyArray_DIMS(%(x)s),
                                                    %(typenum_z)s,
                                                    0);

                if (!%(z)s) {
                    %(fail)s;
                }
            }

            // Perform the vector elemwise multiplication
            vector_elemwise_mult_%(name)s(
                                    (npy_%(dtype_x)s*)PyArray_DATA(%(x)s),
                                    PyArray_STRIDES(%(x)s)[0] / %(itemsize_x)s,
                                    (npy_%(dtype_y)s*)PyArray_DATA(%(y)s),
                                    PyArray_STRIDES(%(y)s)[0] / %(itemsize_y)s,
                                    (npy_%(dtype_z)s*)PyArray_DATA(%(z)s),
                                    PyArray_STRIDES(%(z)s)[0] / %(itemsize_z)s,
                                    PyArray_DIMS(%(x)s)[0]);
            """

            return c_code % locals()

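The ``c_code()`` method above divides each value from ``PyArray_STRIDES`` by the corresponding item size before passing it to the multiplication function. The sketch below shows why, using only the Python standard library: a ``memoryview`` reports strides in bytes, as NumPy arrays do, whereas indexing through a typed pointer counts in elements. The data and the slicing step are arbitrary choices for illustration.

```python
import array

# A contiguous buffer of C doubles and a view over every second element.
data = array.array('d', [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
base = memoryview(data)
view = base[::2]

# Strides are reported in bytes, as with PyArray_STRIDES.
byte_stride = view.strides[0]

# Dividing by the item size gives the stride in elements, which is what
# an expression such as x_ptr[i * x_str] needs in the C code.
elem_stride = byte_stride // view.itemsize

# Emulate the pointer arithmetic on the contiguous buffer.
strided = [base[i * elem_stride] for i in range(len(view))]
print(byte_stride, elem_stride, strided)
```

Passing the byte stride unchanged to a typed pointer would make the loop step too far and read past the intended elements.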