Commit d3ca3329 authored by abergeron

Merge pull request #1747 from nouiz/doc

[MRG]Doc
......@@ -16,6 +16,4 @@
cuda/index
linalg
neighbours
rng_mrg
.. _libdoc_rng_mrg:
===================================================================
:mod:`sandbox.rng_mrg` -- MRG random number generator
===================================================================
.. module:: sandbox.rng_mrg
:platform: Unix, Windows
:synopsis: MRG random number generator
.. moduleauthor:: LISA
API
===
.. automodule:: theano.sandbox.rng_mrg
:members:
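As a quick orientation (not part of this commit), a minimal usage sketch of the module, assuming a standard Theano install:

.. code-block:: python

    import theano
    from theano.sandbox.rng_mrg import MRG_RandomStreams

    srng = MRG_RandomStreams(seed=234)   # seeded stream of MRG31k3p numbers
    u = srng.uniform(size=(2, 2))        # symbolic U[0, 1) samples
    n = srng.normal(size=(2, 2))         # symbolic N(0, 1) samples

    f = theano.function([], [u, n])
    u_val, n_val = f()                   # fresh samples on each call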
......@@ -11,20 +11,20 @@ In the tutorial section, you can find a :ref:`sparse tutorial
The sparse submodule is not loaded when we import Theano. You must
import ``theano.sparse`` to enable it.
The sparse module provide the same functionalities as the tensor
module. The difference lies under the cover because sparse matrices
does not store data in a contiguous array. Note that there are no GPU
implementations for sparse matrices implemented in Theano. The sparse
module has been used in:
The sparse module provides the same functionality as the tensor
module. The difference lies under the covers because sparse matrices
do not store data in a contiguous array. Note that there are no GPU
implementations for sparse matrices in Theano. The sparse module has
been used in:
- NLP: Dense linear transformations of sparse vectors.
- Audio: Filterbank in Fourier domain.
- Audio: Filterbank in the Fourier domain.
Compressed Sparse Format
========================
This section tries to explain how information is store for the two
sparse formats of SciPy supported by Theano. There is more formats
This section tries to explain how information is stored for the two
sparse formats of SciPy supported by Theano. There are more formats
that can be used with SciPy and some documentation about them may be
found `here
<http://deeplearning.net/software/theano/sandbox/sparse.html>`_.
......@@ -50,14 +50,14 @@ attributes: ``data``, ``indices``, ``indptr`` and ``shape``.
CSC Matrix
----------
In the *Compressed Sparse Column* format, ``indices`` stands for index
inside the column vectors of the matrix and ``indptr`` tells where the
column starts in the ``data`` and in the ``indices``
attributes. ``indptr`` can be tought as giving the slice which must be
applied to the other attribute in order to get each column of the
matrix. In other words, ``slice(indptr[i], indptr[i+1])`` correspond
to the slice needed to find the i-th column of the matrix in the
``data`` and in the ``indices`` fields.
In the *Compressed Sparse Column* format, ``indices`` stands for
indexes inside the column vectors of the matrix and ``indptr`` tells
where the column starts in the ``data`` and in the ``indices``
attributes. ``indptr`` can be thought of as giving the slice which
must be applied to the other attribute in order to get each column of
the matrix. In other words, ``slice(indptr[i], indptr[i+1])``
corresponds to the slice needed to find the i-th column of the matrix
in the ``data`` and ``indices`` fields.
The following example builds a matrix and returns its columns. It
prints the i-th column, i.e. a list of indices in the column and their
......@@ -84,18 +84,18 @@ corresponding value in the second list.
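As an illustrative sketch of how these four attributes fit together in SciPy (the values below are assumptions chosen for demonstration, not the doc's own example):

>>> import numpy as np
>>> import scipy.sparse as sp
>>> data = np.asarray([7, 8, 9])
>>> indices = np.asarray([0, 1, 2])
>>> indptr = np.asarray([0, 2, 3, 3])
>>> m = sp.csc_matrix((data, indices, indptr), shape=(3, 3))
>>> m.toarray()
array([[7, 0, 0],
       [8, 0, 0],
       [0, 9, 0]])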
CSR Matrix
----------
In the *Compressed Sparse Row* format, ``indices`` stands for index
In the *Compressed Sparse Row* format, ``indices`` stands for indexes
inside the row vectors of the matrix and ``indptr`` tells where the
row starts in the ``data`` and in the ``indices``
attributes. ``indptr`` can be tought as giving the slice which must be
applied to the other attribute in order to get each row of the
matrix. In other words, ``slice(indptr[i], indptr[i+1])`` correspond
attributes. ``indptr`` can be thought of as giving the slice which
must be applied to the other attribute in order to get each row of the
matrix. In other words, ``slice(indptr[i], indptr[i+1])`` corresponds
to the slice needed to find the i-th row of the matrix in the ``data``
and in the ``indices`` fields.
and ``indices`` fields.
The following example builds a matrix and returns its rows. It prints
the i-th row, i.e. a list of indices in the row and their corresponding value
in the second list.
the i-th row, i.e. a list of indices in the row and their
corresponding value in the second list.
>>> data = np.asarray([7, 8, 9])
>>> indices = np.asarray([0, 1, 2])
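>>> # Hedged completion: the rest of this example is not shown here, so
>>> # the indptr and shape values below are illustrative assumptions.
>>> import scipy.sparse as sp
>>> indptr = np.asarray([0, 1, 2, 3])
>>> m = sp.csr_matrix((data, indices, indptr), shape=(3, 3))
>>> m.toarray()
array([[7, 0, 0],
       [0, 8, 0],
       [0, 0, 9]])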
......@@ -120,7 +120,7 @@ List of Implemented Operations
- Moving from and to sparse
- :class:`DenseFromSparse <theano.sparse.basic.DenseFromSparse>` and ``dense_from_sparse``.
Both grad are implemented. Structured by default.
Both grads are implemented. Structured by default.
- :class:`SparseFromDense <theano.sparse.basic.SparseFromDense>` and ``csr_from_dense``, ``csc_from_dense``.
The grad implemented is structured.
- Theano SparseVariable object have a method ``toarray()`` that is the same as ``dense_from_sparse``.
......@@ -201,51 +201,55 @@ List of Implemented Operations
- One of the inputs must be sparse, the other sparse or dense.
- The grad implemented is regular.
- No C code for perform and no C code for grad.
- Return a dense for perform and a dense for grad.
- Returns a dense for perform and a dense for grad.
- :class:`StructuredDot <theano.sparse.basic.StructuredDot>`
and :func:`structured_dot <theano.sparse.basic.structured_dot>`.
- The first input is sparse, the second can be sparse or dense.
- The grad implemented is structured.
- C code for perform and grad.
- Return a dense for perforn and a sparse for grad.
- It returns a sparse output if both inputs are sparse, and a
  dense one if one of the inputs is dense.
- Returns a sparse grad for sparse inputs and a dense grad for
  dense inputs (see the usage sketch after this list).
- :class:`TrueDot <theano.sparse.basic.TrueDot>` and
:func:`true_dot <theano.sparse.basic.true_dot>`.
- The first input is sparse, the second can be sparse or dense.
- The grad implemented is regular.
- No C code for perform and no C code for grad.
- Return a Sparse for perform and a Sparse for grad.
- Flags trough constructor can change the output of
grad to be dense if the second input of the op is dense.
- Returns a Sparse.
- The gradient returns a Sparse for sparse inputs and by
default a dense for dense inputs. The parameter
``grad_preserves_dense`` can be set to False to return a
sparse grad for dense inputs.
- :class:`SamplingDot <theano.sparse.basic.SamplingDot>` and
``sampling_dot``.
- Both input must be dense.
- Both inputs must be dense.
- The grad implemented is structured for `p`.
- Sample of the dot and sample of the gradient.
- C code for perform but not for grad.
- Return sparse for perform and grad.
- Returns sparse for perform and grad.
- :class:`Usmm <theano.sparse.basic.Usmm>` and ``usmm``.
- You *shouldn't* insert this op yourself!
- There is optimization that transform a
- There is an optimization that transforms a
:class:`Dot <theano.sparse.basic.Dot>` to ``Usmm`` when possible.
- This op is the equivalent of gemm for sparse dot.
- There is no grad implemented for this op and this is not needed as
you don't insert it yourself.
- There is no grad implemented for this op.
- One of the inputs must be sparse, the other sparse or dense.
- Return a dense for perform
- Returns a dense for perform.
- Slice Operations
- sparse_variable[N, N], return a tensor scalar.
- sparse_variable[N, N], returns a tensor scalar.
There is no grad implemented for this operation.
- sparse_variable[M:N, O:P], return a sparse matrix
- sparse_variable[M:N, O:P], returns a sparse matrix
There is no grad implemented for this operation.
- Sparse variable don't support [M, N:O] and [M:N, O] as we don't support sparse vector
and returning a sparse matrix would break the numpy interface.
Use [M:M+1, N:O] and [M:N, O:O+1] instead.
- Sparse variables don't support [M, N:O] and [M:N, O] as we don't
support sparse vectors and returning a sparse matrix would break
the numpy interface. Use [M:M+1, N:O] and [M:N, O:O+1] instead.
- :class:`Diag <theano.sparse.basic.Diag>` and ``diag``.
The grad implemented is regular.
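Below is the usage sketch referenced in the list above: a minimal, hedged example of ``structured_dot`` with one sparse and one dense input (the names, shapes and values are illustrative, not from this commit).

.. code-block:: python

    import numpy as np
    import scipy.sparse as sp
    import theano
    import theano.sparse  # the sparse module must be imported explicitly
    import theano.tensor as T

    x = theano.sparse.csr_matrix(name='x', dtype='float64')  # sparse input
    y = T.dmatrix('y')                                       # dense input

    z = theano.sparse.structured_dot(x, y)  # dense result for perform
    f = theano.function([x, y], z)

    x_val = sp.csr_matrix(np.eye(3))        # illustrative values
    y_val = np.arange(9.).reshape(3, 3)
    print(f(x_val, y_val))                  # same as x_val.dot(y_val)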
......
......@@ -5,13 +5,13 @@
More Examples
=============
At this point it would be wise to begin familiarizing yourself
more systematically with Theano's fundamental objects and operations by browsing
this section of the library: :ref:`libdoc_basic_tensor`.
At this point it would be wise to begin familiarizing yourself more
systematically with Theano's fundamental objects and operations by
browsing this section of the library: :ref:`libdoc_basic_tensor`.
As the tutorial unfolds, you should also gradually acquaint yourself with the other
relevant areas of the library and with the relevant subjects of the documentation
entrance page.
As the tutorial unfolds, you should also gradually acquaint yourself
with the other relevant areas of the library and with the relevant
subjects of the documentation entrance page.
Logistic Function
......@@ -30,13 +30,13 @@ the logistic curve, which is given by:
A plot of the logistic function, with x on the x-axis and s(x) on the
y-axis.
You want to compute the function :ref:`elementwise <libdoc_tensor_elementwise>` on matrices of
doubles, which means that you want to apply this function to each
individual element of the matrix.
You want to compute the function :ref:`elementwise
<libdoc_tensor_elementwise>` on matrices of doubles, which means that
you want to apply this function to each individual element of the
matrix.
Well, what you do is this:
.. If you modify this code, also change :
.. theano/tests/test_tutorial.py:T_examples.test_examples_1
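A hedged reconstruction of that code block (assuming the standard Theano tensor API; the exact listing is not shown above):

>>> import theano
>>> import theano.tensor as T
>>> x = T.dmatrix('x')
>>> s = 1 / (1 + T.exp(-x))          # elementwise logistic
>>> logistic = theano.function([x], s)
>>> logistic([[0, 1], [-1, -2]])
array([[ 0.5       ,  0.73105858],
       [ 0.26894142,  0.11920292]])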
......@@ -450,6 +450,10 @@ Other Random Distributions
There are :ref:`other distributions implemented <libdoc_tensor_raw_random>`.
Other Implementations
---------------------
There are two other implementations, based on :class:`CURAND <theano.sandbox.cuda.rng_curand>` and :ref:`MRG31k3p <libdoc_rng_mrg>`.
.. _logistic_regression:
......@@ -457,7 +461,8 @@ There are :ref:`other distributions implemented <libdoc_tensor_raw_random>`.
A Real Example: Logistic Regression
===================================
The preceding elements are featured in this more realistic example. It will be used repeatedly.
The preceding elements are featured in this more realistic example.
It will be used repeatedly.
.. code-block:: python
......
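A condensed, hedged sketch of the kind of model this example builds (the variable names, dataset shape and hyperparameters are illustrative assumptions):

.. code-block:: python

    import numpy as np
    import theano
    import theano.tensor as T

    rng = np.random
    N, feats = 400, 784                        # illustrative dataset size
    D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))

    x = T.dmatrix('x')
    y = T.dvector('y')
    w = theano.shared(rng.randn(feats), name='w')
    b = theano.shared(0., name='b')

    p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))    # probability that target = 1
    xent = -y * T.log(p_1) - (1 - y) * T.log(1 - p_1)
    cost = xent.mean() + 0.01 * (w ** 2).sum() # cross-entropy plus L2 penalty
    gw, gb = T.grad(cost, [w, b])

    train = theano.function(
        [x, y], cost,
        updates=[(w, w - 0.1 * gw), (b, b - 0.1 * gb)])
    for _ in range(100):
        train(D[0], D[1])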
......@@ -2,30 +2,43 @@
Multi cores support in Theano
=============================
Parallel element wise op with openmp
====================================
BLAS operation
==============
Beacuse element wise ops work on every tensor entry indipedently they can be
easly parallelized using openmp.
BLAS is an interface for some mathematical operations between two
vectors, a vector and a matrix, or two matrices (e.g. the dot product
between vector/matrix and matrix/matrix). Many different
implementations of that interface exist and some of them are
parallelized.
To use openmp you must set the openmp flag in Theano configuration.
Theano tries to use that interface as frequently as possible for
performance reasons. So if Theano links to a parallel implementation,
those operations will run in parallel in Theano.
Yuo can use the flag openmp_elemwise_minsize to set the minimum tensor size
for which the operation is parallelized because for short tensor using opemp
can slow down the operation.
The most frequent way to control the number of threads used is via the
``OMP_NUM_THREADS`` environment variable. Set it to the number of threads
you want to use before starting the python process.
If it is no specified the default value (200000) is used.
For simple(fast) operation you can obtain a speed up for very long tensor
while for more complex operation you ca obtain a good speed up also for not
too long tensor.
There is a script (elemwise_openmp_speedup.py in theano/misc/) which you can
use to choose that value for your machine.
The script run two elemwise operation (a fast and a slow one) for a vector of
size openmp_elemwise_minsize with and without openmp and show the time
difference between the two cases.
Parallel element wise ops with OpenMP
=====================================
Because element wise ops work on every tensor entry independently,
they can be easily parallelized using OpenMP.
To use OpenMP you must set the OpenMP flag in Theano configuration.
You can use the flag ``openmp_elemwise_minsize`` to set the minimum
tensor size for which the operation is parallelized because for short
tensors using OpenMP can slow down the operation. The default value is
``200000``.
For simple (fast) operations you can obtain a speedup with very large
tensors, while for more complex operations you can obtain a good
speedup even with smaller tensors.
There is a script ``elemwise_openmp_speedup.py`` in ``theano/misc/``
which you can use to tune the value of ``openmp_elemwise_minsize`` for
your machine. The script runs two elemwise operations (a fast one and
a slow one) for a vector of size ``openmp_elemwise_minsize`` with and
without OpenMP and shows the time difference between the cases.
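As a minimal sketch of wiring this up (the flag values are illustrative; tune them for your machine):

.. code-block:: python

    # Both variables must be set before the first `import theano`,
    # since the configuration is read at import time.
    import os
    os.environ["OMP_NUM_THREADS"] = "4"  # threads used by OpenMP/BLAS
    os.environ["THEANO_FLAGS"] = "openmp=True,openmp_elemwise_minsize=200000"

    import theano
    print(theano.config.openmp)                   # True
    print(theano.config.openmp_elemwise_minsize)  # 200000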
......@@ -205,6 +205,7 @@ if __name__ == "__main__":
gpu
K20m/ECC 0.07s
K20/NOECC 0.07s
M2090 0.19s
C2075 0.25s
M2075 0.25s
M2070 0.25s 0.27s 0.32s
......
......@@ -12,10 +12,13 @@ if cuda_available:
# >>> with open('CudaNdarray.pkl', 'wb') as fp:
# >>> cPickle.dump(theano.sandbox.cuda.CudaNdarray(np.array([-42.0], dtype=np.float32)), fp)
def test_unpickle_flag_is_false_by_default():
assert not config.experimental.unpickle_gpu_on_cpu, "Config flag experimental.unpickle_gpu_on_cpu is " \
+ "set to true. Make sure the default value stays false " \
+ "and that you have not set the flag manually."
assert not config.experimental.unpickle_gpu_on_cpu, (
"Config flag experimental.unpickle_gpu_on_cpu is "
"set to true. Make sure the default value stays false "
"and that you have not set the flag manually.")
def test_unpickle_cudandarray_as_numpy_ndarray_flag0():
oldflag = config.experimental.unpickle_gpu_on_cpu
......
......@@ -734,9 +734,11 @@ class MRG_RandomStreams(object):
:param low: Lower bound of the interval on which values are sampled.
If the ``dtype`` arg is provided, ``low`` will be cast into dtype.
This bound is excluded.
:param high: Higher bound of the interval on which values are sampled.
If the ``dtype`` arg is provided, ``high`` will be cast into dtype.
This bound is excluded.
:param size: Can be a list of integer or Theano variable
(ex: the shape of other Theano Variable)
......
......@@ -869,7 +869,8 @@ class ScalarOp(Op):
return self.name
else:
param = [(k, v) for k, v in self.__dict__.items()
if k not in ["name", "_op_use_c_code"]]
if k not in ["name", "_op_use_c_code",
"output_types_preference"]]
if param:
return "%s{%s}" % (self.__class__.__name__,
", ".join("%s=%s" % (k, v)
......
......@@ -2623,11 +2623,14 @@ class TrueDot(gof.op.Op):
self.grad_preserves_dense = grad_preserves_dense
def __eq__(self, other):
return (type(self) == type(other) and
self.grad_preserves_dense == other.grad_preserves_dense)
# The grad_preserves_dense attribute doesn't change the
# execution behavior. To let the optimizer merge nodes with
# different values of this attribute we shouldn't compare it
# here.
return type(self) == type(other)
def __hash__(self):
return hash(type(self)) ^ hash(self.grad_preserves_dense)
return hash(type(self))
def __ne__(self, other):
return not (self == other)
......@@ -2712,15 +2715,17 @@ class TrueDot(gof.op.Op):
def true_dot(x, y, grad_preserves_dense=True):
"""
Operation for efficiently calculating the dot product when
one or all operands is sparse. Supported format are CSC and CSR.
one or all operands are sparse. Supported formats are CSC and CSR.
The output of the operation is sparse.
:param x: Matrix variable.
:param y: Matrix variable.
:param grad_preserves_dense: if True and one on the input is dense,
make the output dense.
:param x: Sparse matrix or 2d tensor variable.
:param y: Sparse matrix or 2d tensor variable.
:param grad_preserves_dense: if True (default), makes the grad of
dense inputs dense. Otherwise the grad is always sparse.
:return: The dot product `x`.`y` in a sparse format.
:note: one of ``x`` or ``y`` must be sparse.
"""
# TODO
# Maybe the triple-transposition formulation
......
......@@ -562,9 +562,13 @@ conv3D = Conv3D()
:note: The order of dimensions does not correspond to the one in `conv2d`.
This is for optimization.
:note: The GPU implementation is very slow. You are better to use
:func:`conv3d2d <theano.tensor.nnet.conv3d2d.conv3d>` that is faster
on GPU.
:note: The GPU implementation is very slow. You should use
:func:`conv3d2d <theano.tensor.nnet.conv3d2d.conv3d>` for a GPU
graph instead.
:see: Someone made a script that shows how to swap the axes between
both 3d convolution implementations in Theano. See the last
`attachment <https://groups.google.com/d/msg/theano-users/1S9_bZgHxVw/0cQR9a4riFUJ>`_.
"""
......
......@@ -178,6 +178,10 @@ def conv3d(signals, filters,
Another way to define signals: (batch, time, in channel, row, column)
Another way to define filters: (out channel,time,in channel, row, column)
:see: Someone made a script that shows how to swap the axes between
both 3d convolution implementations in Theano. See the last
`attachment <https://groups.google.com/d/msg/theano-users/1S9_bZgHxVw/0cQR9a4riFUJ>`_.
"""
if isinstance(border_mode, str):
......
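As a hedged illustration of the axis swap those notes refer to (assuming, per the docstrings, that ``conv3D`` expects signals as (batch, row, column, time, in channel) while ``conv3d2d.conv3d`` expects (batch, time, in channel, row, column)):

.. code-block:: python

    import theano.tensor as T

    # 5d signals in conv3D's layout: (batch, row, column, time, in channel)
    signals_brctc = T.TensorType('float32', (False,) * 5)('signals')
    # Reorder to conv3d2d's layout: (batch, time, in channel, row, column)
    signals_btcrc = signals_brctc.dimshuffle(0, 3, 4, 1, 2)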
......@@ -576,11 +576,11 @@ def random_integers(random_state, size=None, low=0, high=1, ndim=None,
def choice_helper(random_state, a, replace, p, size):
"""
Helper function to draw random numbers using numpy's choice function.
"""Helper function to draw random numbers using numpy's choice function.
This is a generalization of numpy.random.choice to the case where `a`,
`replace` and `p` are tensors.
This is a generalization of numpy.random.choice that coerces
`replace` to a bool and replaces `p` with None when p is a vector
of 0 elements.
"""
if a.ndim > 1:
raise ValueError('a.ndim (%i) must be 0 or 1' % a.ndim)
......@@ -622,16 +622,6 @@ def choice(random_state, size=None, a=2, replace=True, p=None, ndim=None,
broadcastable=bcast))
return op(random_state, size, a, replace, p)
def poisson_helper(random_state, lam, size):
"""
Helper function to draw random numbers using numpy's poisson function.
This is a generalization of numpy.random.poisson to the case where
`lam` is a tensor.
"""
return random_state.poisson(lam, size)
def poisson(random_state, size=None, lam=1.0, ndim=None, dtype='int64'):
"""
Draw samples from a Poisson distribution.
......@@ -652,7 +642,7 @@ def poisson(random_state, size=None, lam=1.0, ndim=None, dtype='int64'):
ndim, size, bcast = _infer_ndim_bcast(ndim, size)
op = RandomFunction(poisson_helper, tensor.TensorType(dtype=dtype,
op = RandomFunction("poisson", tensor.TensorType(dtype=dtype,
broadcastable=bcast))
return op(random_state, size, lam)
......@@ -668,6 +658,9 @@ def permutation_helper(random_state, n, shape):
If you wish to perform a permutation of the elements of an existing vector,
see shuffle_row_elements.
This is a generalization of numpy.random.permutation to tensors.
Otherwise it behaves the same.
"""
# n should be a 0-dimension array
assert n.shape == ()
......@@ -680,7 +673,7 @@ def permutation_helper(random_state, n, shape):
shape = ()
out_shape = list(shape)
out_shape.append(n)
out = numpy.zeros(out_shape, int)
out = numpy.empty(out_shape, int)
for i in numpy.ndindex(*shape):
out[i] = random_state.permutation(n)
......@@ -869,7 +862,7 @@ class RandomStreamsBase(object):
def binomial(self, size=None, n=1, p=0.5, ndim=None, dtype='int64',
prob=None):
"""
Sample n times with probability of success prob for each trial,
Sample n times with probability of success p for each trial and
return the number of successes.
If the size argument is ambiguous on the number of dimensions,
......