Commit ae146361, authored by Olivier Breuleux

merge

......@@ -89,9 +89,9 @@ Configuring the environment
---------------------------
Two environment variables are used to control automatic code generation.
(It is possible to use theano in a way that avoids all automatic code generation, but the functions you make using {{{theano.function}}} will execute more slowly.)
(It is possible to use theano in a way that avoids all automatic code generation, but the functions you make using ``theano.function`` will execute more slowly.)
- `THEANO_BLAS_LDFLAGS`:
- `THEANO_BLAS_LDFLAGS`:
a space-separated list of library names to link against for BLAS functions. Default: `-lblas`
- `THEANO_COMPILEDIR`:
......
Basically, this file contains stuff that should be documented, but is not.
Feel free to contribute things that you want documented, as well as to add
or correct documentation.
======================================
What happens if grad is not defined?
======================================
If an Op does not define ``grad``, but this Op does not appear in the path when
you compute the gradient, then there is no problem.
If an Op does not define ``grad``, and this Op *does* appear in the path when
you compute the gradient, **WRITEME**.
**This documentation is useful when we show users how to write Ops.**
======================================
What is staticmethod, st_impl?
======================================
``st_impl`` is an optional method in an Op.
``@staticmethod`` is a Python decorator for a class method that does not
implicitly take the class instance as a first argument. Hence, st_impl
can be used for Op implementations when no information from the Op
instance is needed. This can be useful for testing an implementation.
See :api:`XlogX` for an example.
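As a rough, Theano-free illustration of the pattern (the class name ``XlogXDemo`` is hypothetical and it is not a real Op), a static method can be called without creating an instance, which is what makes it convenient for testing:

```python
import math


class XlogXDemo(object):
    """Hypothetical stand-in for an Op; not part of Theano."""

    @staticmethod
    def st_impl(x):
        # No `self` argument: the computation needs nothing from the instance.
        if x == 0.0:
            return 0.0
        return x * math.log(x)

    def impl(self, x):
        # The instance method simply delegates to the static one.
        return XlogXDemo.st_impl(x)


# The static method can be exercised directly, without an instance.
direct = XlogXDemo.st_impl(0.0)
# Going through an instance gives the same computation.
via_instance = XlogXDemo().impl(1.0)
```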
**This documentation is useful when we show users how to write Ops.
Olivier says this behavior should be discouraged but I feel that st_impl
should be encouraged where possible.**
============================================================
How do we write scalar ops and upgrade them to tensor ops?
============================================================
**Olivier says that :api:`XlogX` gives a good example. In fact, I would
like to beef xlogx up into our running example for demonstrating how to
write an Op:**
.. code-block:: python
    class XlogX(scalar.UnaryScalarOp):
        """
        Compute X * log(X), with special case 0 log(0) = 0.
        """
        @staticmethod
        def st_impl(x):
            if x == 0.0:
                return 0.0
            return x * numpy.log(x)
        def impl(self, x):
            return XlogX.st_impl(x)
        def grad(self, (x,), (gz,)):
            return [gz * (1 + scalar.log(x))]
        def c_code(self, node, name, (x,), (z,), sub):
            if node.inputs[0].type in [scalar.float32, scalar.float64]:
                return """%(z)s =
                    %(x)s == 0.0
                    ? 0.0
                    : %(x)s * log(%(x)s);""" % locals()
            raise NotImplementedError('only floatingpoint is implemented')
    scalar_xlogx = XlogX(scalar.upgrade_to_float, name='scalar_xlogx')
    xlogx = tensor.Elemwise(scalar_xlogx, name='xlogx')
**It is also necessary to talk about UnaryScalarOp vs. BinaryOp.**
UnaryScalarOp is the same as scalar.ScalarOp with member variable nin=1.
**give an example of this**
=======================================================
Documentation on how to write tests
=======================================================
Guillaume can you make sure to hit these points:
* What are canonical examples of tests?
* What are the different test patterns?
* nnet.py:
* What is going on with test1, test2, test3, test4?
* What is the right eq function to use?
* There are a lot of tests that define their own epsilon, but this should be standardized. e.g. in test_elemwise.py ``self.failUnless((numpy.abs(f(xv) - zv) < 1e-10).all())``
* If the expected result of a test is that an Exception is thrown, how do we correctly detect and handle that?
nosetests has ``failUnlessRaises``
* Convention is that all test files must start with test_, not _test_, so rename all that use the old convention?
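A minimal sketch of two of the points above, using only ``unittest`` and ``numpy``. The function ``safe_divide``, the test class, and the tolerance value are all hypothetical, chosen for illustration. ``assertRaises`` is the current name of ``failUnlessRaises``, and a single shared tolerance stands in for the per-test epsilons:

```python
import unittest

import numpy

# Assumed project-wide tolerance; the actual value is still to be decided.
TOLERANCE = 1e-10


def safe_divide(a, b):
    """Hypothetical function under test."""
    if b == 0:
        raise ZeroDivisionError("b must be nonzero")
    return a / b


class TestSafeDivide(unittest.TestCase):
    def test_value(self):
        # Compare against the expected values using the shared tolerance.
        expected = numpy.array([0.5, 1.0])
        actual = numpy.array([safe_divide(1.0, 2.0), safe_divide(3.0, 3.0)])
        self.assertTrue(numpy.allclose(actual, expected, rtol=0, atol=TOLERANCE))

    def test_raises(self):
        # The test passes only if the call raises the expected exception.
        self.assertRaises(ZeroDivisionError, safe_divide, 1.0, 0.0)


# Run the two tests programmatically and keep the result object.
suite = unittest.TestLoader().loadTestsFromTestCase(TestSafeDivide)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```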
=======================================================
How to use the PrintOp
=======================================================
**This is also useful in the How to write an Op tutorial.**
=======================================================
Modules
=======================================================
* What is the correct way to tie weights?
=======================================================
Mammouth
=======================================================
**This is internal documentation. Guillaume can you make sure to hit these points:**
export THEANO_BLAS_LDFLAGS='-lmkl -liomp5 -fopenmp'
**Do we want the following:**
export OMP_NUM_THREADS=2
=======================================================
Cache
=======================================================
The compile cache is written to ``THEANO_COMPILEDIR``. If this environment
variable is not present, the compile cache defaults to ``$HOME/.theano``.
The compile cache is keyed on the C++ code generated for the graph to be
compiled, not on the compilation settings. So, if you change compilation
environment variables, such as ``THEANO_BLAS_LDFLAGS``, you will need to
manually remove your compile cache.
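The directory lookup described above can be sketched as follows. This is an illustration only, not Theano's own code; the helper names ``compile_dir`` and ``clear_compile_cache`` are hypothetical:

```python
import os
import shutil


def compile_dir(environ=os.environ):
    """Return THEANO_COMPILEDIR if it is set, else the $HOME/.theano default."""
    default = os.path.join(os.path.expanduser("~"), ".theano")
    return environ.get("THEANO_COMPILEDIR", default)


def clear_compile_cache(environ=os.environ):
    """Delete the cache directory if it exists; return the path that was targeted."""
    path = compile_dir(environ)
    if os.path.isdir(path):
        shutil.rmtree(path)
    return path
```

Calling ``clear_compile_cache()`` after changing ``THEANO_BLAS_LDFLAGS`` would have the same effect as removing the directory by hand.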
=======================================================
Type checking
=======================================================
* Are there functions for doing type checking?
For example, checking that the dtype of a matrix is an integer type
(any integer type, not just int32 or int64).
``if isinstance(item, int):`` is now the preferred way to do this in
Python, so we should mimic it.
If the type is wrong, what exception should be raised?
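One possible shape for such a check, sketched with plain numpy rather than any existing Theano function. ``require_int_dtype`` is a hypothetical name, and raising ``TypeError`` is an assumption, since the question of which exception to raise is still open:

```python
import numpy


def require_int_dtype(array):
    """Raise TypeError unless `array` has an integer dtype of any width."""
    array = numpy.asarray(array)
    # numpy.integer is the abstract parent of the signed and unsigned
    # integer dtypes, so int8..int64 and uint8..uint64 all pass.
    if not numpy.issubdtype(array.dtype, numpy.integer):
        raise TypeError("expected an integer dtype, got %s" % array.dtype)
    return array
```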
......@@ -23,13 +23,21 @@ Glossary of terminology
Broadcasting a row matrix. T and F respectively stand for
True and False and indicate along which dimensions we allow
broadcasting.
If the second argument were a vector, its shape would be
``(2,)`` and its broadcastable pattern ``(F,)``. They would
be automatically expanded to the **left** to match the
dimensions of the matrix (adding ``1`` to the shape and ``T``
to the pattern), resulting in ``(1, 2)`` and ``(T, F)``.
It would then behave just like the example above.
Unlike numpy, which broadcasts dynamically, Theano needs to know in
advance, for any operation that supports broadcasting, which
dimensions will need to be broadcast. When applicable, this
information is given in the :term:`Type` of a :term:`Result`.
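The left expansion described above can be observed in numpy itself, which applies the same rule at run time. A small sketch (the array values are arbitrary):

```python
import numpy

matrix = numpy.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # shape (3, 2)
vector = numpy.array([10.0, 20.0])                          # shape (2,)

# Expanding the vector's shape to the left gives (1, 2), i.e. a row.
row = vector.reshape(1, 2)

# Adding the vector or the explicit row produces the same (3, 2) array.
sum_with_vector = matrix + vector
sum_with_row = matrix + row
```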
See also:
* :ref:`How broadcasting is used in Theano's tensor types <tensortypes>`
* `SciPy documentation about numpy's broadcasting <http://www.scipy.org/EricsBroadcastingDoc>`_
* `OnLamp article about numpy's broadcasting <http://www.onlamp.com/pub/a/python/2000/09/27/numerically.html>`_
......@@ -40,7 +48,7 @@ Glossary of terminology
elementwise
An elementwise operation ``f`` on two matrices ``M`` and ``N``
is one such that:
``f(M, N)[i, j] = f(M[i, j], N[i, j])``
In other words, each element of an input matrix is combined
......
......@@ -9,6 +9,7 @@ Mode
WRITEME
.. _tensortypes:
Types
=====
......@@ -46,7 +47,7 @@ Dimensionality is one of:
code shape Rows :term:`broadcastable <broadcasting>`? Columns :term:`broadcastable <broadcasting>`?
====== ====== ========================================== =============================================
scalar [] Yes Yes
vector [n] Yes N/A
vector [n] Yes N/A (vectors are used like row vectors)
row [1, n] Yes No
col [m, 1] No Yes
matrix [m, n] No No
......@@ -56,13 +57,14 @@ So for example if you want a row of 32-bit floats, it is available
under ``theano.tensor.frow`` and if you want a matrix of signed
32-bit integers it is available under ``theano.tensor.imatrix``.
Each of the methods described above have a singular version and a
plural version. When called, the singular version takes a single
argument which is the name of the :term:`Result` we want to make and
it makes a single Result of that type. The plural version can either
take an integer or a string. If an integer is provided, it will return
that many Results and if a string is provided, it will create one
Result for each letter of the string, using the letter as the Result's
Each of the types described above can be constructed by two methods:
a singular version (e.g., ``dmatrix``) and a plural version
(``dmatrices``). When called, the singular version takes a single
argument which is the name of the :term:`Result` we want to make and it
makes a single Result of that type. The plural version can either take
an integer or several strings. If an integer is provided, the method
will return that many Results and if strings are provided, it will
create one Result for each string, using the string as the Result's
name. For example:
.. code-block:: python
......@@ -74,14 +76,14 @@ name. For example:
xyz = dmatrix('xyz') # creates one Result with name 'xyz'
x, y, z = dmatrices(3) # creates three Results with no names
x, y, z = dmatrices('xyz') # creates three Results named 'x', 'y' and 'z'
x, y, z = dmatrices('x', 'y', 'z') # creates three Results named 'x', 'y' and 'z'
Custom tensor types
-------------------
If you wish to use a type which is not available here (for example, a
3D tensor) you can build an appropriate type using
If you wish to use a type of tensor which is not already available here
(for example, a 3D tensor) you can build an appropriate type using
``theano.tensor.Tensor``. The first argument you pass is the ``dtype``
and the second is the ``broadcastable pattern``.
......@@ -106,10 +108,11 @@ complex128 complex 128 (two float64)
.. note::
There are no premade complex types, so you need to make them
explicitly with Tensor. Furthermore, few operations are fully
supported for complex types: as of version 0.1, only elementary
operations (``+-*/``) have C implementations.
Even though ``theano.tensor`` does not define any type using
``complex`` dtypes (``complex64`` or ``complex128``), you can define
them explicitly with ``Tensor`` (see example below). However, few
operations are fully supported for complex types: as of version 0.1,
only elementary operations (``+-*/``) have C implementations.
The broadcastable pattern, on the other hand, indicates both the
......@@ -133,13 +136,24 @@ pattern interpretation
[False, False, False] A MxNxP tensor (pattern of a + b)
===================== =================================
When two tensors have a different number of dimensions, the broadcastable
pattern is *expanded to the left*, by padding with ``True``. So, for example,
a vector's pattern, ``[False]``, could be expanded to ``[True, False]``, and
would behave like a row (1xN matrix). In the same way, a matrix (``[False,
False]``) would behave like a 1xNxP tensor (``[True, False, False]``).
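The padding rule can be written down in a few lines of plain Python. ``expand_pattern`` is a hypothetical helper for illustration, not a Theano function:

```python
def expand_pattern(pattern, ndim):
    """Pad a broadcastable pattern on the left with True up to `ndim` entries."""
    pattern = list(pattern)
    if len(pattern) > ndim:
        raise ValueError("pattern already has more than %d dimensions" % ndim)
    return [True] * (ndim - len(pattern)) + pattern


# A vector used alongside a matrix behaves like a row.
vector_as_row = expand_pattern([False], 2)
# A matrix used alongside a 3D tensor gains a leading broadcastable dimension.
matrix_as_3d = expand_pattern([False, False], 3)
```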
So if we wanted to create a type representing a 3D array of unsigned
bytes, we would simply do:
.. code-block:: python
# 3D tensor of unsigned bytes
mytype = theano.tensor.Tensor('uint8', [False]*3)
# complex types (based on complex64)
my_cscalar = theano.tensor.Tensor('complex64', [])
my_cmatrix = theano.tensor.Tensor('complex64', [False, False])
Ops
===
......