提交 d5520e81 authored 作者: Pascal Lamblin's avatar Pascal Lamblin 提交者: GitHub

Merge pull request #5343 from ChihebTrabelsi/broadcasting_tutorial

Broadcasting tutorial
......@@ -26,9 +26,6 @@ some of them might be outdated though:
* :ref:`unittest` -- Tutorial on how to use unittest in testing Theano.
* :ref:`sandbox_broadcasting` -- Short description of what a broadcastable
pattern is.
* :ref:`sandbox_debugging_step_mode` -- How to step through the execution of
a Theano function and print the inputs and outputs of each op.
......
......@@ -23,7 +23,7 @@ Glossary
(virtually) replicating the smaller tensor along
the dimensions that it is lacking.
For more detail, see :ref:`libdoc_tensor_broadcastable`, and also
For more detail, see :ref:`tutbroadcasting`, and also
* `SciPy documentation about numpy's broadcasting <http://www.scipy.org/EricsBroadcastingDoc>`_
* `OnLamp article about numpy's broadcasting <http://www.onlamp.com/pub/a/python/2000/09/27/numerically.html>`_
......
......@@ -332,7 +332,7 @@ TensorType and TensorVariable
A tuple of True/False values, one for each dimension. True in
position 'i' indicates that at evaluation-time, the ndarray will have
size 1 in that 'i'-th dimension. Such a dimension is called a
*broadcastable dimension* (see :ref:`libdoc_tensor_broadcastable`).
*broadcastable dimension* (see :ref:`tutbroadcasting`).
The broadcastable pattern indicates both the number of dimensions and
whether a particular dimension must have length 1.
......@@ -1523,46 +1523,7 @@ Mathematical
.. _libdoc_tensor_broadcastable:
Broadcasting in Theano vs. Numpy
--------------------------------
Broadcasting is a mechanism which allows tensors with
different numbers of dimensions to be added or multiplied
together by (virtually) replicating the smaller tensor along
the dimensions that it is lacking.
Broadcasting is the mechanism by which a scalar
may be added to a matrix, a vector to a matrix or a scalar to
a vector.
.. figure:: bcast.png
Broadcasting a row matrix. T and F respectively stand for
True and False and indicate along which dimensions we allow
broadcasting.
If the second argument were a vector, its shape would be
``(2,)`` and its broadcastable pattern ``(F,)``. They would
be automatically expanded to the **left** to match the
dimensions of the matrix (adding ``1`` to the shape and ``T``
to the pattern), resulting in ``(1, 2)`` and ``(T, F)``.
It would then behave just like the example above.
Unlike numpy which does broadcasting dynamically, Theano needs
to know, for any operation which supports broadcasting, which
dimensions will need to be broadcasted. When applicable, this
information is given in the :ref:`type` of a *Variable*.
See also:
* `SciPy documentation about numpy's broadcasting <http://www.scipy.org/EricsBroadcastingDoc>`_
* `OnLamp article about numpy's broadcasting <http://www.onlamp.com/pub/a/python/2000/09/27/numerically.html>`_
You can find more information about Broadcasting in the :ref:`tutbroadcasting` tutorial.
Linear Algebra
==============
......
.. _sandbox_broadcasting:
============
Broadcasting
============
The following may go either in:
a) numpy refresher.
b) more details of broadcasting in the types section.
=== broadcastable ===
The {{{broadcastable}}} field of a {{{Tensor}}} must be a tuple of boolean values. Each value corresponds to a dimension of the {{{Tensor}}} and specifies whether the {{{Tensor}}} can be "broadcasted" along that dimension.
A value of {{{True}}} means two things:
* The size of the corresponding dimension will necessarily be 1.
* If needed, the {{{Tensor}}} can be ''broadcasted'' or ''replicated'' along the corresponding dimension to emulate a larger {{{Tensor}}}.
A value of {{{False}}} means that the corresponding dimension can take any nonnegative value and that the {{{Tensor}}} cannot be replicated along it (regardless of whether it is 1 or not).
Example: to define a ''row'' type, set broadcastable to {{{(True, False)}}}: this means the shape must be like {{{(1, n)}}}. If you add a row of shape {{{(1, n)}}} to a matrix of shape {{{(m, n)}}}, the row will be "broadcasted" or "replicated" {{{m}}} times along the first dimension, producing a virtual matrix of the correct size {{{(m, n)}}}. Therefore, adding a row to a matrix will add the row to each row of the matrix. If the value of {{{broadcastable}}} for the first dimension of the row was {{{False}}}, the operation would instead raise an exception complaining that the dimensions are not the same.
Similarly, the broadcastable pattern for a column is {{{(False, True)}}}: this means the shape must be like {{{(m, 1)}}}, therefore adding a column to a matrix will add that column to each column of the matrix. Several Ops, such as {{{DimShuffle}}}, can add or remove broadcastable dimensions.
The length of {{{broadcastable}}} is the number of dimensions of the {{{Tensor}}}.
.. testsetup::
import numpy as np
import theano
import theano.tensor as T
.. _tutbroadcasting:
============
Broadcasting
============
Broadcasting is a mechanism which allows tensors with
different numbers of dimensions to be added or multiplied
together by (virtually) replicating the smaller tensor along
the dimensions that it is lacking.
Broadcasting is the mechanism by which a scalar
may be added to a matrix, a vector to a matrix or a scalar to
a vector.
.. figure:: bcast.png
Broadcasting a row matrix. T and F respectively stand for
True and False and indicate along which dimensions we allow
broadcasting.
If the second argument were a vector, its shape would be
``(2,)`` and its broadcastable pattern ``(False,)``. They would
be automatically expanded to the **left** to match the
dimensions of the matrix (adding ``1`` to the shape and ``True``
to the pattern), resulting in ``(1, 2)`` and ``(True, False)``.
It would then behave just like the example above.
Unlike numpy which does broadcasting dynamically, Theano needs
to know, for any operation which supports broadcasting, which
dimensions will need to be broadcasted. When applicable, this
information is given in the :ref:`type` of a *Variable*.
The following code illustrates how rows and columns are broadcasted in order to perform an addition operation with a matrix:
>>> r = T.row()
>>> r.broadcastable
(True, False)
>>> mtr = T.matrix()
>>> mtr.broadcastable
(False, False)
>>> f_row = theano.function([r, mtr], [r + mtr])
>>> R = np.arange(3).reshape(1, 3)
>>> R
array([[0, 1, 2]])
>>> M = np.arange(9).reshape(3, 3)
>>> M
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> f_row(R, M)
[array([[ 0., 2., 4.],
[ 3., 5., 7.],
[ 6., 8., 10.]])]
>>> c = T.col()
>>> c.broadcastable
(False, True)
>>> f_col = theano.function([c, mtr], [c + mtr])
>>> C = np.arange(3).reshape(3, 1)
>>> C
array([[0],
[1],
[2]])
>>> M = np.arange(9).reshape(3, 3)
>>> f_col(C, M)
[array([[ 0., 1., 2.],
[ 4., 5., 6.],
[ 8., 9., 10.]])]
In these examples, we can see that both the row vector and the column vector are broadcasted in order to be be added to the matrix.
See also:
* `SciPy documentation about numpy's broadcasting <http://www.scipy.org/EricsBroadcastingDoc>`_
* `OnLamp article about numpy's broadcasting <http://www.onlamp.com/pub/a/python/2000/09/27/numerically.html>`_
......@@ -40,6 +40,7 @@ Basics
conditions
loop
shape_info
broadcasting
Advanced
--------
......
Markdown 格式
0%
您添加了 0 到此讨论。请谨慎行事。
请先完成此评论的编辑!
注册 或者 后发表评论