Commit 2f8a4deb authored by Frédéric Bastien

Merge pull request #1715 from delallea/minor

Minor fixes
.. _other_ops:

==============================
Implementing some specific Ops
==============================

This page is a guide on the implementation of some specific types of Ops,
and points to some examples of such implementations.

For the random number generating Ops, it explains different possible
implementation strategies.
...@@ -18,10 +18,10 @@ Scalar/Elemwise/Reduction Ops

Implementing a Theano scalar Op allows that scalar operation to be reused
by our elemwise operations on tensors. If the scalar operation has C code, the
elemwise implementation will automatically have C code too. This
will enable the fusion of elemwise operations using your new scalar
operation. It can also reuse the GPU elemwise code. It is similar for
reduction operations.

For examples of how to add new scalar operations, you can have a look at
these two pull requests, which add `GammaLn and Psi
...@@ -84,11 +84,11 @@ instead of ``as_tensor_variable(x)``.

Another difference is that you need to use ``SparseVariable`` and
``SparseType`` instead of ``TensorVariable`` and ``TensorType``.

Do not forget that we support only sparse matrices (so only 2 dimensions)
and (like in SciPy) they do not support broadcasting operations by default
(although a few Ops do it when called manually). Also, we support only two
formats for sparse type: ``csr`` and ``csc``. So in ``make_node()``,
you can create output variables like this:

.. code-block:: python
...@@ -97,11 +97,11 @@ you can create outputs variables like this:

See the sparse :class:`theano.sparse.basic.Cast` op `code
<https://github.com/Theano/Theano/blob/master/theano/sparse/basic.py#L753>`_
for a good example of a sparse op with Python code.
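As background on the ``csr`` layout used above, here is a minimal plain-Python sketch (no Theano or SciPy dependency) of how the format stores a matrix: row ``i``'s nonzero values live in ``data[indptr[i]:indptr[i+1]]``, with their column positions in the matching slice of ``indices``. The ``csr_to_dense`` helper is purely illustrative, not part of Theano's API:

```python
# The 3x4 matrix:
# [[1, 0, 2, 0],
#  [0, 0, 3, 0],
#  [4, 5, 0, 0]]
data = [1, 2, 3, 4, 5]      # nonzero values, row by row
indices = [0, 2, 2, 0, 1]   # column index of each value
indptr = [0, 2, 3, 5]       # row i spans data[indptr[i]:indptr[i+1]]
shape = (3, 4)

def csr_to_dense(data, indices, indptr, shape):
    """Rebuild the dense matrix from the four CSR fields."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            dense[i][indices[k]] = data[k]
    return dense

assert csr_to_dense(data, indices, indptr, shape) == [
    [1, 0, 2, 0],
    [0, 0, 3, 0],
    [4, 5, 0, 0],
]
```

The ``csc`` format is the same idea transposed: ``indptr`` slices columns and ``indices`` holds row positions.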
.. note::

    From the definition of the CSR and CSC formats, CSR column indices are
    not necessarily sorted. Likewise for CSC row indices. Use
    :class:`EnsureSortedIndices
    <theano.sparse.basic.EnsureSortedIndices>` if your code does not
...@@ -129,7 +129,7 @@ Sparse C code
-------------

Theano does not have a native C code interface for sparse matrices. The
reason is simple: we use the SciPy sparse matrix objects and they do not
have a C object. So we use a simple trick: a sparse matrix is made of
4 fields that are NumPy vector arrays: ``data``, ``indices``, ``indptr``
and ``shape``. So to make
...@@ -183,17 +183,17 @@ distributions here::

2) Extend the MRG implementation by reusing existing Theano Ops. Look into
   the ``theano/sandbox/rng_mrg.py`` file and grep for all code about
   binomial(). This distribution uses the output of the uniform
   distribution and converts it to a binomial distribution with
   existing Theano operations. The tests go in
   ``theano/sandbox/test_rng_mrg.py``.

3) Extend the MRG implementation with a new Op that takes a uniform sample as
   input. Look in the ``theano/sandbox/{rng_mrg,multinomial}.py`` files
   and the test in ``theano/sandbox/test_multinomal.py``. This is
   recommended when current Theano ops are not well suited to modify
   the uniform to the target distribution. This can happen in
   particular if there is a loop or complicated condition.
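The idea behind strategy 2 above (deriving a target distribution from uniform samples using only existing operations) can be sketched in plain Python; ``binomial_from_uniform`` is a hypothetical illustrative helper, not a Theano function:

```python
import random

def binomial_from_uniform(uniforms, p):
    """One binomial(n, p) draw built from n uniform [0, 1) samples:
    each uniform below p counts as one success."""
    return sum(1 for u in uniforms if u < p)

random.seed(0)
# One draw from binomial(n=10, p=0.3), using nothing but uniform samples.
draw = binomial_from_uniform([random.random() for _ in range(10)], 0.3)
assert 0 <= draw <= 10
```

In Theano, the comparison and the sum would be ordinary symbolic elemwise and reduction operations applied to the output of the MRG uniform sampler.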
.. note::

...@@ -214,16 +214,16 @@ the ``__init__()`` method, it must have an ``openmp=None`` parameter

and must call ``super(MyOpClass, self).__init__(openmp=openmp)``.

The ``OpenMPOp`` class also implements ``c_compile_args`` and
``make_thunk``. This makes it add the correct g++ flags to compile with
OpenMP. It also disables OpenMP and prints a warning if the version of
g++ does not support it.

The Theano flag ``openmp`` is currently False by default as we do not
have code that gets sped up with it. The only current implementation
is ConvOp. It speeds up some cases, but slows down others. That is why
we disable it by default. But we have all the code to have it enabled
by default if there is more than 1 core and the environment
variable OMP_NUM_THREADS is not 1. This allows Theano to respect the
current convention.
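The convention described above (enable OpenMP only when there is more than one core and ``OMP_NUM_THREADS`` is not 1) could be sketched like this; ``default_openmp`` is a hypothetical helper for illustration, not Theano's actual code:

```python
import multiprocessing
import os

def default_openmp():
    """Hypothetical sketch: enable OpenMP only with more than one
    core and when OMP_NUM_THREADS is not set to 1."""
    if multiprocessing.cpu_count() <= 1:
        return False  # a single core gains nothing from OpenMP
    # Respect the user's explicit request for one thread.
    return os.environ.get("OMP_NUM_THREADS") != "1"
```

Checking ``OMP_NUM_THREADS`` lets an environment that was tuned for single-threaded BLAS or batch schedulers keep its meaning.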
.. note::
......
...@@ -4,23 +4,23 @@

OpFromGraph
===========

This page describes :class:`theano.OpFromGraph
<theano.compile.builders.OpFromGraph>`, an Op that allows one to
encapsulate a Theano graph in an op.

This can be used to encapsulate some functionality in one block. It is
useful to scale Theano compilation for bigger graphs when we
reuse that encapsulated functionality with different inputs many
times. Due to this encapsulation, it can make the Theano compilation phase
faster for graphs with many nodes.

Using this for small graphs is not recommended as it disables
optimizations between what is inside the encapsulation and outside of it.

.. note::

    This was not used widely up to now. If you have any
    questions/comments, do not hesitate to contact us on the mailing list.
......
...@@ -846,18 +846,18 @@ Reductions

:Parameter: *no_zeros_in_input* - The grad of prod is complicated
            as we need to handle 3 different cases: without zeros in the
            input reduced group, with 1 zero, or with more zeros.

            This could slow you down, but more importantly, we currently
            do not support the second derivative of the 3 cases. So you
            cannot take the second derivative of the default prod().

            To remove the handling of the special cases of 0, and so get
            a small speed up and allow the second derivative, set
            ``no_zeros_in_inputs`` to ``True``. It defaults to ``False``.

            **It is the user's responsibility to make sure there are no
            zeros in the inputs. If there are, the grad will be wrong.**

:Returns: product of every term in *x* along *axis*
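The three gradient cases described above can be made concrete in plain Python; ``prod_grad`` is an illustrative sketch, not Theano's implementation. With no zeros, d(prod)/dx_i = prod(x)/x_i; with exactly one zero, only the gradient at the zero entry is nonzero (it equals the product of the other entries); with two or more zeros, every partial derivative is zero:

```python
from functools import reduce
from operator import mul

def prod_grad(xs):
    """Gradient of prod(xs) w.r.t. each element, covering the three
    cases: no zeros, exactly one zero, two or more zeros."""
    zeros = [i for i, x in enumerate(xs) if x == 0]
    if not zeros:
        total = reduce(mul, xs, 1)
        return [total / x for x in xs]      # d(prod)/dx_i = prod / x_i
    if len(zeros) == 1:
        others = reduce(mul, (x for x in xs if x != 0), 1)
        return [others if i == zeros[0] else 0 for i in range(len(xs))]
    return [0] * len(xs)                    # >= 2 zeros: all partials vanish

assert prod_grad([2.0, 3.0, 4.0]) == [12.0, 8.0, 6.0]
assert prod_grad([2.0, 0.0, 4.0]) == [0, 8.0, 0]
assert prod_grad([0.0, 0.0, 4.0]) == [0, 0, 0]
```

Setting ``no_zeros_in_inputs=True`` amounts to keeping only the first branch, which is why zeros in the input then give a wrong gradient.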
......
...@@ -467,10 +467,10 @@ Final Note
==========

A more extensive discussion of this section's content may be found in
the advanced tutorial :ref:`Extending Theano<extending>`.

The section :ref:`Other ops <other_ops>` includes more instructions for
the following specific cases:

- :ref:`scalar_ops`
- :ref:`scipy_ops`
......
...@@ -6,10 +6,10 @@ from theano.gof import ops_with_inner_function

class OpFromGraph(gof.Op):
    """This creates an `Op` from inputs and outputs lists of variables.

    The signature is similar to theano.function() and the resulting
    `Op`'s perform will do the same operation as::

        orig_function(inputs, outputs, **kwargs)

...@@ -19,13 +19,13 @@ class OpFromGraph(gof.Op):

    - c_code() to remove the double overhead?
    - opt to unfold it, work inplace on inputs
    - grad() make it support DisconnectedType and the new interface
    - check how it works with updates.
    - add test with constant as input or inside the inner graph.
    - Add support for the GPU? Probably just need an opt to remove transfer
    - Add support to pickle this Op.
    - Add support/test with random generator

    :note:
        - We support shared variables in the inner graph. This is automatic
          and invisible to the user. They can be inputs to the node or in
          the inner graph.
        - We support unused inputs. This is needed for the grad.
......
// REMEMBER TO INCREASE c_code_cache_version when changing this file
//
//TODO detect SHARED_SIZE dynamically
#define SHARED_SIZE (16*1024)
......
...@@ -31,21 +31,21 @@ class GpuConv(gof.Op):

                 imshp=None,
                 max_threads_dim0=None):
        """
        :param version: each version of c_code implements many kernels for
                        the convolution. By default we try to guess the best
                        one. You can force one version with this parameter.
                        This parameter is used by the tests.
        :param verbose: for values of 1, 2 and 3, print more information
                        during the execution of the convolution. Mostly used
                        for optimization or debugging.
        :param kshp: The size of the kernel. If provided, can generate
                     faster code. If the GpuConv op is automatically
                     inserted, we take its value automatically from the
                     Conv op.
        :param imshp: The size of the image. Not used for code generation
                      but allows selecting an experimental new version in
                      another repo.
        :param max_threads_dim0: The maximum number of threads for block
                                 size dimension 0 (blockDim.x) used by the
                                 GPU function.
......
// REMEMBER TO INCREASE c_code_cache_version when changing this file
//
//implement the valid convolution only
......
...@@ -164,7 +164,7 @@ def conv3d(signals, filters,

                border_mode='valid'):
    """Convolve spatio-temporal filters with a movie.

    It flips the filters.

    :param signals: timeseries of images whose pixels have color channels.
                    shape: [Ns, Ts, C, Hs, Ws]
......
...@@ -3950,7 +3950,7 @@ def local_greedy_distributor(node):

    The following expressions are simplified:

    1. ((a/x + b/y) * x * y) --> a*y + b*x
    2. ((a/x + b) * x) --> a + b*x
    3. There are other forms too where node is a true_div.

    The following expressions are not simplified:

    4. ((a + b) * x) -/-> a*x + b*x
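A quick numeric sanity check of identities 1 and 2 above, in plain Python (the sample values are arbitrary nonzero floats):

```python
import math

a, b, x, y = 2.0, 3.0, 5.0, 7.0

# 1. ((a/x + b/y) * x * y) --> a*y + b*x
assert math.isclose((a / x + b / y) * x * y, a * y + b * x)

# 2. ((a/x + b) * x) --> a + b*x
assert math.isclose((a / x + b) * x, a + b * x)
```

Identity 4 is deliberately left alone by the optimization: distributing over a plain sum would duplicate work rather than cancel divisions.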
......