Commit 9350272e authored by Nicolas Bouchard

Major correction.

Parent 2ad3e6a0
Sparse
======

Sparse Matrices
===============

In general, *sparse* matrices provide the same functionality as regular
matrices. The difference lies in the way the elements of *sparse* matrices
are represented and stored in memory: only the non-zero elements are stored.
This has some potential advantages: first, it may obviously lead to reduced
memory usage and, second, clever storage methods may lead to reduced
computation time through the use of sparse-specific algorithms. We usually
refer to generically stored matrices as *dense* matrices.
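
The storage idea can be sketched in plain Python. This toy "dictionary of
keys" store is for illustration only; it is not the format Theano or SciPy
actually use:

```python
# Toy illustration of sparse storage: keep only the non-zero elements,
# keyed by their (row, column) position. Not Theano's actual format.
dense = [[0, 0, 3],
         [0, 0, 0],
         [4, 0, 0]]

sparse_store = {(i, j): v
                for i, row in enumerate(dense)
                for j, v in enumerate(row)
                if v != 0}

print(sparse_store)       # {(0, 2): 3, (2, 0): 4}
print(len(sparse_store))  # 2 entries stored instead of 9
```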

Theano's sparse package provides efficient algorithms, but its use is not
recommended in all cases or for all matrices. As an obvious example, consider
the case where the *sparsity proportion* is very low. The *sparsity
proportion* refers to the ratio of the number of zero elements to the total
number of elements in a matrix. A low sparsity proportion may result in the
use of more space in memory, since not only the actual data is stored, but
also the position of nearly every element of the matrix. This would also
require more computation time, whereas a dense matrix representation along
with regular optimized algorithms might do a better job. Other examples
depend on the specific purpose and structure of the matrices.
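
The *sparsity proportion* itself is easy to compute. Here is a minimal
sketch; the helper name ``sparsity_proportion`` is hypothetical, not part of
Theano:

```python
def sparsity_proportion(matrix):
    # Hypothetical helper: ratio of zero elements to all elements.
    elements = [v for row in matrix for v in row]
    return elements.count(0) / float(len(elements))

mostly_zero = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]
mostly_full = [[1, 2, 3], [4, 0, 6], [7, 8, 9]]

print(sparsity_proportion(mostly_zero))  # high (8/9): sparse storage pays off
print(sparsity_proportion(mostly_full))  # low (1/9): dense is likely better
```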

Since sparse matrices are not stored in contiguous arrays, there are several
ways to represent them in memory. This is usually designated by the so-called
``format`` of the matrix. Since Theano's sparse matrix package is based on
the SciPy sparse package, complete information about sparse matrices can be
found in the SciPy documentation. Like SciPy, Theano does not implement
sparse formats for arrays with a number of dimensions different from two.

So far, Theano implements two ``formats`` of sparse matrix: ``csc`` and
``csr``. Those are almost identical, except that ``csc`` is based on the
*columns* of the matrix and ``csr`` is based on its *rows*. They both have
the same purpose: to provide for the use of efficient algorithms performing
linear algebra operations. A disadvantage is that they fail to ensure easy
access to the elements of the underlying matrix. This means that if you are
planning to access the elements of a sparse matrix a lot in your
computational graph, a tensor variable could be a better choice.

More documentation may be found in the :ref:`Sparse Library Reference <libdoc_sparse>`.

Compressed Sparse Format
========================

Theano supports two *compressed sparse formats*: ``csc`` and ``csr``,
respectively based on columns and rows. They both have the same attributes:
``data``, ``indices``, ``indptr`` and ``shape``.

* The ``shape`` attribute is exactly the same as the ``shape`` attribute of a
  dense (i.e. generic) matrix. It can be explicitly specified at the creation
  of a sparse matrix if it cannot be inferred from the first three
  attributes.
* The ``data`` attribute is a one-dimensional ``ndarray`` which contains all
  the non-zero elements of the sparse matrix.
* The ``indices`` and ``indptr`` attributes are used to store the position of
  the data in the sparse matrix.
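
To make the meaning of the three array attributes concrete, here is a
pure-Python sketch (independent of Theano and SciPy) that expands a CSC
triplet back into a dense matrix; ``csc_to_dense`` is a hypothetical name
introduced here for illustration:

```python
def csc_to_dense(data, indices, indptr, shape):
    # Hypothetical helper: expand CSC triplet arrays into a dense
    # list-of-lists. indptr[c]:indptr[c + 1] delimits the stored
    # entries of column c; indices holds their row numbers.
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for col in range(n_cols):
        for k in range(indptr[col], indptr[col + 1]):
            dense[indices[k]][col] = data[k]
    return dense

print(csc_to_dense([7, 8, 9], [0, 1, 2], [0, 2, 3, 3], (3, 3)))
# [[7, 0, 0], [8, 0, 0], [0, 9, 0]]
```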

Before going further, here are the ``import`` statements that are assumed for
the rest of the tutorial:

>>> import theano
>>> import numpy as np
>>> import scipy.sparse as sp
>>> from theano import tensor as T
>>> from theano import sparse as S

CSC Matrix
----------

In the *Compressed Sparse Column* format, ``indices`` stands for the row
indices of the stored data and ``indptr`` points, for each column, to the
slice of ``data`` and ``indices`` that belongs to it. The following example
builds a matrix and returns its columns.

>>> data = np.asarray([7, 8, 9])
>>> indices = np.asarray([0, 1, 2])
>>> indptr = np.asarray([0, 2, 3, 3])
>>> m = sp.csc_matrix((data, indices, indptr), shape=(3, 3))
>>> i = 0
>>> print m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]]
[0, 1] [7, 8]
>>> i = 1
>>> print m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]]
[2] [9]
>>> i = 2
>>> print m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]]
[] []

CSR Matrix
----------

In the *Compressed Sparse Row* format, ``indices`` stands for the column
indices of the stored data and ``indptr`` points, for each row, to the slice
of ``data`` and ``indices`` that belongs to it. The following example builds
a matrix and returns its rows.

>>> data = np.asarray([7, 8, 9])
>>> indices = np.asarray([0, 1, 2])
>>> indptr = np.asarray([0, 2, 3, 3])
>>> m = sp.csr_matrix((data, indices, indptr), shape=(3, 3))
>>> i = 0
>>> print m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]]
[0, 1] [7, 8]
>>> i = 1
>>> print m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]]
[2] [9]
>>> i = 2
>>> print m.indices[m.indptr[i]:m.indptr[i+1]], m.data[m.indptr[i]:m.indptr[i+1]]
[] []
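
Going the other way, the CSR triplet used in the example above can be rebuilt
from a dense list-of-lists with a short pure-Python sketch; ``dense_to_csr``
is a hypothetical name, not a Theano function:

```python
def dense_to_csr(dense):
    # Hypothetical helper: build the (data, indices, indptr) CSR
    # triplet from a dense list-of-lists.
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, v in enumerate(row):
            if v != 0:
                data.append(v)        # stored non-zero value
                indices.append(col)   # its column index
        indptr.append(len(data))      # running total, one entry per row
    return data, indices, indptr

print(dense_to_csr([[7, 8, 0], [0, 0, 9], [0, 0, 0]]))
# ([7, 8, 9], [0, 1, 2], [0, 2, 3, 3])
```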

Handling Sparse in Theano
=========================

Most of the ops in Theano depend on the ``format`` of the sparse matrix.
That is why there are two kinds of constructors of sparse variables:
``csc_matrix`` and ``csr_matrix``. These can be called with the usual
``name`` and ``dtype`` parameters, but no ``broadcastable`` flags are
allowed. This is forbidden since the sparse package does not provide any way
to handle a number of dimensions different from two. The set of all accepted
``dtype`` values for sparse matrices can be found in ``sparse.all_dtypes``.

>>> S.all_dtypes
set(['int32', 'int16', 'float64', 'complex128', 'complex64', 'int64', 'int8', 'float32'])

Properties and Construction
---------------------------

Although sparse variables do not allow direct access to their properties,
this can be accomplished using the ``csm_properties`` function. This will
return a tuple of one-dimensional ``tensor`` variables that represent the
internal characteristics of the sparse matrix.

In order to reconstruct a sparse matrix from some properties, the functions
``CSC`` and ``CSR`` can be used. This will create the sparse matrix in the
desired format. As an example, the following code reconstructs a ``csc``
matrix into a ``csr`` one.

>>> x = S.csc_matrix(name='x', dtype='int64')
 [1 0 0]
 [1 0 0]]

The last example shows that one format can be obtained by transposing the
other. Indeed, when calling the ``transpose`` function, the format of the
resulting matrix will not be the same as that of its input.
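
This format/transpose relationship can be checked concretely: reading the
very same ``(data, indices, indptr)`` arrays as ``csc`` or as ``csr`` yields
two matrices that are transposes of each other. The sketch below is pure
Python with a hypothetical ``decode`` helper, not Theano code:

```python
def decode(data, indices, indptr, shape, fmt):
    # Hypothetical helper: expand a compressed triplet into a dense
    # list-of-lists, interpreting it either as CSC or as CSR.
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    outer = n_cols if fmt == 'csc' else n_rows
    for o in range(outer):
        for k in range(indptr[o], indptr[o + 1]):
            if fmt == 'csc':
                dense[indices[k]][o] = data[k]  # indices hold row numbers
            else:
                dense[o][indices[k]] = data[k]  # indices hold column numbers
    return dense

triplet = ([7, 8, 9], [0, 1, 2], [0, 2, 3, 3])
as_csc = decode(*triplet, shape=(3, 3), fmt='csc')
as_csr = decode(*triplet, shape=(3, 3), fmt='csr')

# Reading the same arrays in the other format yields the transpose.
assert as_csr == [list(col) for col in zip(*as_csc)]
```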

To and Fro
----------

To move back and forth between dense and sparse matrix representations,
Theano provides the ``dense_from_sparse``, ``csr_from_dense`` and
``csc_from_dense`` functions. No additional detail needs to be provided.
Here is an example that performs a full cycle from sparse to sparse:

>>> x = S.csc_matrix(name='x', dtype='float32')
>>> y = S.dense_from_sparse(x)
>>> z = S.csc_from_dense(y)

Structured Operation
--------------------

Several ops are designed to make use of the very particular structure of
sparse matrices. These ops are said to be *structured* and simply do not
perform any computation on the zero elements of the sparse matrix. They can
be thought of as being applied only to the ``data`` attribute of the latter.
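
The idea can be sketched in pure Python: a structured operation only ever
touches the ``data`` array of stored elements, so the implicit zeros are
left untouched. ``structured_add_data`` is a hypothetical illustration, not
Theano's implementation:

```python
def structured_add_data(data, scalar):
    # Hypothetical sketch: a structured add touches only the stored
    # (non-zero) elements; the implicit zeros stay zero.
    return [v + scalar for v in data]

print(structured_add_data([7, 8, 9], 2))  # [9, 10, 11]
```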

>>> x = S.csc_matrix(name='x', dtype='float32')
>>> y = S.structured_add(x, 2)

Gradient
--------

The gradients of the ops in the sparse module can also be structured. Some
ops provide a flag to indicate whether the gradient is to be structured or
not. The documentation can be used to determine whether the gradient of an
op is regular or structured, or whether its implementation can be modified.
Similarly to structured ops, when a structured gradient is calculated, the
computation is done only for the non-zero elements of the sparse matrix.

More documentation regarding the gradients of specific ops can be found in
the :ref:`Sparse Library Reference <libdoc_sparse>`.