testgroup / pytensor · Commits

Commit fe3ff975
Authored Aug 09, 2012 by Nicolas Bouchard

Move format explanation to doc, added new doc and updated list of ops.

Parent: 2d18b424
Showing 3 changed files with 221 additions and 118 deletions.
doc/library/sparse/index.txt  +187  -55
doc/tutorial/sparse.txt        +31  -63
theano/sparse/basic.py          +3   -0
doc/library/sparse/index.txt
.. _libdoc_sparse:

=========================================
:mod:`sparse` -- Symbolic Sparse Matrices
=========================================
In the tutorial section, you can find a :ref:`sparse tutorial
<tutsparse>`.

The sparse submodule is not loaded when we import Theano. You must
import ``theano.sparse`` to enable it.

The sparse module provides the same functionality as the tensor
module. The difference lies under the cover, because sparse matrices
do not store their data in a contiguous array. Note that there are no
GPU implementations for sparse matrices in Theano. The sparse module
has been used in:

- NLP: Dense linear transformations of sparse vectors.
- Audio: Filterbank in the Fourier domain.

Compressed Sparse Format
========================

This section tries to explain how information is stored for the two
sparse formats of SciPy supported by Theano. There are more formats
that can be used with SciPy, and some documentation about them may be
found `here
<http://deeplearning.net/software/theano/sandbox/sparse.html>`_.
.. Changes to this section should also result in changes to tutorial/sparse.txt.
Theano supports two *compressed sparse formats*, ``csc`` and ``csr``,
respectively based on columns and rows. They both have the same
attributes: ``data``, ``indices``, ``indptr`` and ``shape``.

* The ``data`` attribute is a one-dimensional ``ndarray`` which
  contains all the non-zero elements of the sparse matrix.

* The ``indices`` and ``indptr`` attributes are used to store the
  position of the data in the sparse matrix.

* The ``shape`` attribute is exactly the same as the ``shape``
  attribute of a dense (i.e. generic) matrix. It can be explicitly
  specified at the creation of a sparse matrix if it cannot be
  inferred from the first three attributes.
CSC Matrix
----------
In the *Compressed Sparse Column* format, ``indices`` stands for
indices inside the column vectors of the matrix and ``indptr`` tells
where each column starts in the ``data`` and in the ``indices``
attributes. ``indptr`` can be thought of as giving the slice which
must be applied to the other attributes in order to get each column of
the matrix. In other words, ``slice(indptr[i], indptr[i+1])``
corresponds to the slice needed to find the i-th column of the matrix
in the ``data`` and in the ``indices`` fields.

The following example builds a matrix and returns its columns. It
prints the i-th column, i.e. a list of indices in the column and their
corresponding values in the second list.

>>> data = np.asarray([7, 8, 9])
>>> indices = np.asarray([0, 1, 2])
>>> indptr = np.asarray([0, 2, 3, 3])
>>> m = sp.csc_matrix((data, indices, indptr), shape=(3, 3))
>>> print(m.toarray())
[[7 0 0]
 [8 0 0]
 [0 9 0]]
>>> i = 0
>>> print(m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]])
[0 1] [7 8]
>>> i = 1
>>> print(m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]])
[2] [9]
>>> i = 2
>>> print(m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]])
[] []
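The ``indptr`` slicing shown above is enough to rebuild the dense matrix by hand. A minimal sketch using only NumPy and SciPy (the variable names mirror the doctest above):

```python
import numpy as np
import scipy.sparse as sp

data = np.asarray([7, 8, 9])
indices = np.asarray([0, 1, 2])
indptr = np.asarray([0, 2, 3, 3])
m = sp.csc_matrix((data, indices, indptr), shape=(3, 3))

# Rebuild the dense matrix column by column: slice(indptr[j], indptr[j+1])
# selects the row indices (indices) and values (data) of the j-th column.
dense = np.zeros(m.shape, dtype=data.dtype)
for j in range(m.shape[1]):
    sl = slice(m.indptr[j], m.indptr[j + 1])
    dense[m.indices[sl], j] = m.data[sl]
```

Column 2 contributes nothing because its slice is empty (``indptr[2] == indptr[3]``).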
CSR Matrix
----------
In the *Compressed Sparse Row* format, ``indices`` stands for
indices inside the row vectors of the matrix and ``indptr`` tells
where each row starts in the ``data`` and in the ``indices``
attributes. ``indptr`` can be thought of as giving the slice which
must be applied to the other attributes in order to get each row of
the matrix. In other words, ``slice(indptr[i], indptr[i+1])``
corresponds to the slice needed to find the i-th row of the matrix in
the ``data`` and in the ``indices`` fields.

The following example builds a matrix and returns its rows. It prints
the i-th row, i.e. a list of indices in the row and their
corresponding values in the second list.

>>> data = np.asarray([7, 8, 9])
>>> indices = np.asarray([0, 1, 2])
>>> indptr = np.asarray([0, 2, 3, 3])
>>> m = sp.csr_matrix((data, indices, indptr), shape=(3, 3))
>>> print(m.toarray())
[[7 8 0]
 [0 0 9]
 [0 0 0]]
>>> i = 0
>>> print(m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]])
[0 1] [7 8]
>>> i = 1
>>> print(m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]])
[2] [9]
>>> i = 2
>>> print(m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]])
[] []
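Note that the same three attributes describe different matrices under the two formats: the CSR example above is exactly the transpose of the CSC one. A quick SciPy check of this relationship:

```python
import numpy as np
import scipy.sparse as sp

data = np.asarray([7, 8, 9])
indices = np.asarray([0, 1, 2])
indptr = np.asarray([0, 2, 3, 3])

# Same attributes, interpreted along columns (csc) or along rows (csr).
csc = sp.csc_matrix((data, indices, indptr), shape=(3, 3))
csr = sp.csr_matrix((data, indices, indptr), shape=(3, 3))

# Interpreting identical compressed data along the other axis transposes it.
same = (csr.toarray() == csc.toarray().T).all()
```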
List of Implemented Operations
==============================
- Moving from and to sparse
- :class:`DenseFromSparse` and ``dense_from_sparse``
- :class:`SparseFromDense` and ``csr_from_dense``, ``csc_from_dense``
- Construction of Sparses and their Properties
- :class:`CSC` to construct a ``csc`` matrix
- :class:`CSR` to construct a ``csr`` matrix
- :class:`CSMProperties` to get the properties of a sparse matrix
- ``sp_ones_like``
- ``sp_zeros_like``
- :class:`SquareDiagonal` and ``square_diagonal``
- Cast
- :class:`Cast` with ``bcast``, ``wcast``, ``icast``, ``lcast``,
``fcast``, ``dcast``, ``ccast``, and ``zcast``
- Transpose
- :class:`Transpose` and ``transpose``
- Basic Arithmetic
- :class:`Neg` for negation
- ``add`` for addition
- ``sub`` for subtraction
- ``mul`` for multiplication
- ``col_scale`` to multiply by a vector along the columns
- ``row_scale`` to multiply by a vector along the rows
- Monoid (applied to only one sparse input)
- ``structured_sigmoid``
- ``structured_exp``
- ``structured_log``
- ``structured_pow``
- ``structured_minimum``
- ``structured_maximum``
- ``structured_add``
- ``sin``
- ``arcsin``
- ``tan``
- ``arctan``
- ``sinh``
- ``arcsinh``
- ``tanh``
- ``arctanh``
- ``rint``
- ``ceil``
- ``floor``
- ``sgn``
- ``log1p``
- ``sqr``
- ``sqrt``
- Dot Product
- :class:`Dot` and ``dot``
- :class:`StructuredDot` and ``structured_dot``
- :class:`SamplingDot` and ``sampling_dot``
- :class:`Usmm` and ``usmm``
- Slice Operations
- sparse_variable[N, N], returns a tensor scalar
- sparse_variable[M:N, O:P], returns a sparse matrix
- Sparse variables don't support [M, N:O] and [M:N, O], as we don't support
sparse vectors and returning a sparse matrix would break the numpy interface.
Use [M:M+1, N:O] and [M:N, O:O+1] instead.
- :class:`Diag` and ``diag``
- Concatenation
- :class:`HStack` and ``hstack``
- :class:`VStack` and ``vstack``
- Probability
- :class:`Poisson` and ``poisson``
- :class:`Binomial` and ``csc_fbinomial``, ``csc_dbinomial``,
``csr_fbinomial``, ``csr_dbinomial``
- :class:`Multinomial` and ``multinomial``
- Internal Representation
- :class:`EnsureSortedIndices` and ``ensure_sorted_indices``
- :class:`Remove0` and ``remove0``
- ``clean`` to resort indices and remove zeros
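The slice rules listed above can be illustrated with SciPy matrices (a sketch: SciPy itself accepts the mixed forms, but the ``[M:M+1, N:O]`` workaround is what keeps the result a two-dimensional sparse matrix, matching what a sparse variable returns):

```python
import numpy as np
import scipy.sparse as sp

m = sp.csr_matrix(np.array([[7, 8, 0],
                            [0, 0, 9],
                            [0, 0, 0]]))

single = m[0, 1]       # sparse_variable[N, N]: a scalar entry
block = m[0:2, 0:2]    # sparse_variable[M:N, O:P]: still a sparse matrix
row = m[0:1, 1:3]      # workaround for the unsupported m[0, 1:3]
```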
===================================================================
:mod:`sparse` -- Sparse Op
...
...
doc/tutorial/sparse.txt

...

@@ -46,86 +46,54 @@ perhaps a tensor variable could be a better choice.
More documentation may be found in the :ref:`Sparse Library Reference <libdoc_sparse>`.

Before going further, here are the ``import`` statements that are assumed for the rest of the
tutorial:

>>> import theano
>>> import numpy as np
>>> import scipy.sparse as sp
>>> from theano import sparse

Compressed Sparse Format
========================

.. Changes to this section should also result in changes to library/sparse/index.txt.

Theano supports two *compressed sparse formats*, ``csc`` and ``csr``, respectively based on columns
and rows. They both have the same attributes: ``data``, ``indices``, ``indptr`` and ``shape``.

* The ``data`` attribute is a one-dimensional ``ndarray`` which contains all the non-zero
  elements of the sparse matrix.
* The ``indices`` and ``indptr`` attributes are used to store the position of the data in the
  sparse matrix.
* The ``shape`` attribute is exactly the same as the ``shape`` attribute of a dense (i.e. generic)
  matrix. It can be explicitly specified at the creation of a sparse matrix if it cannot be
  inferred from the first three attributes.

Which format should I use?
--------------------------

In the end, the format does not affect the length of the ``data`` and ``indices`` attributes. They
are both completely fixed by the number of elements you want to store. The only thing that changes
with the format is ``indptr``. In ``csc`` format, the matrix is compressed along columns, so a lower
number of columns will result in less memory use. On the other hand, with the ``csr`` format, the
matrix is compressed along the rows and, for a matrix with a lower number of rows, ``csr`` format is
a better choice. So here is the rule:

.. note::
    If shape[0] > shape[1], use ``csc`` format. Otherwise, use ``csr``.

Sometimes, since the sparse module is young, ops do not exist for both formats. So here is
what may be the most relevant rule:

.. note::
    Use the format compatible with the ops in your computation graph.

The documentation about the ops and their supported formats may be found in
the :ref:`Sparse Library Reference <libdoc_sparse>`.
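The memory argument behind the first rule can be checked directly with SciPy: ``data`` and ``indices`` have the same length in both formats, while ``indptr`` scales with the compressed dimension. A sketch with a tall matrix, i.e. shape[0] > shape[1]:

```python
import numpy as np
import scipy.sparse as sp

# A tall matrix: many rows, few columns (shape[0] > shape[1]).
tall = np.zeros((1000, 10))
tall[0, 0] = 1.0

csc = sp.csc_matrix(tall)
csr = sp.csr_matrix(tall)

# data/indices lengths are fixed by the number of stored elements...
same_data = len(csc.data) == len(csr.data) == 1
# ...but indptr has one entry per column (csc) or per row (csr), plus one.
csc_indptr_len = len(csc.indptr)   # 10 columns + 1
csr_indptr_len = len(csr.indptr)   # 1000 rows + 1
```

With only 10 columns, the ``csc`` form of this matrix stores a far shorter ``indptr``, which is exactly why the rule prefers ``csc`` when shape[0] > shape[1].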
Handling Sparse in Theano
=========================
...
...
theano/sparse/basic.py

...

@@ -4270,6 +4270,9 @@ class Dot(gof.op.Op):

    :note: The grad implemented is regular, i.e. not structured.
    :note: At least one of `x` or `y` must be a sparse matrix.
    :note: When the operation has the form dot(csr_matrix, dense)
           the gradient of this operation can be performed inplace
           by UsmmCscDense. This leads to significant speed-ups.
    """

    def __eq__(self, other):

...