Commit 2bdef345 authored by Olivier Delalleau

Minor doc fixes

Parent dc636a5a
@@ -75,7 +75,7 @@ field, as you can see here:
 TensorType(float64, scalar)
 >>> T.dscalar
 TensorType(float64, scalar)
->>> x.type == T.dscalar
+>>> x.type is T.dscalar
 True
 You can learn more about the structures in Theano in :ref:`graphstructures`.
......
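For reviewers of the `==` → `is` change above: in Python, `is` tests object identity while `==` tests equality, and the hunk relies on `x.type` referring to the very same module-level type object as `T.dscalar`. A minimal pure-Python sketch of that instance-caching pattern (the `TensorType` below is a hypothetical stand-in, not Theano's class):

```python
class TensorType:
    """Hypothetical stand-in for a symbolic type object."""
    _cache = {}  # one canonical instance per (dtype, shape) pair

    def __new__(cls, dtype, shape):
        key = (dtype, shape)
        if key not in cls._cache:
            cls._cache[key] = super().__new__(cls)
        return cls._cache[key]

dscalar = TensorType("float64", "scalar")
x_type = TensorType("float64", "scalar")

# Because instances are cached, `is` (identity) holds, not just `==`:
print(x_type is dscalar)  # True
```

With this pattern, identity comparison is both correct and cheaper than structural equality, which is why the doc change is safe.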
@@ -68,7 +68,7 @@ squared difference between two matrices ``a`` and ``b`` at the same time:
 >>> f = function([a, b], [diff, abs_diff, diff_squared])
 .. note::
-   `dmatrices` produces as many outputs as names that you provide. It's a
+   `dmatrices` produces as many outputs as names that you provide. It is a
    shortcut for allocating symbolic variables that we will often use in the
    tutorials.
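As a reminder of what the multi-output function in this hunk computes, here is a plain-Python sketch (lists of lists instead of Theano matrices; this `f` is ordinary Python, not a compiled Theano function):

```python
def f(a, b):
    """Plain-Python analogue of function([a, b], [diff, abs_diff, diff_squared]):
    returns the elementwise difference, absolute difference, and squared
    difference of two same-shaped 'matrices' at the same time."""
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    abs_diff = [[abs(v) for v in row] for row in diff]
    diff_squared = [[v ** 2 for v in row] for row in diff]
    return diff, abs_diff, diff_squared

d, ad, ds = f([[1, 1], [1, 1]], [[0, 1], [2, 3]])
print(d)   # [[1, 0], [-1, -2]]
print(ad)  # [[1, 0], [1, 2]]
print(ds)  # [[1, 0], [1, 4]]
```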
@@ -151,7 +151,7 @@ with respect to the second. In this way, Theano can be used for
 output is also a list. The order in both list is important, element
 *i* of the output list is the gradient of the first argument of
 ``T.grad`` with respect to the *i*-th element of the list given as second argument.
-The first arguement of ``T.grad`` has to be a scalar (a tensor
+The first argument of ``T.grad`` has to be a scalar (a tensor
 of size 1). For more information on the semantics of the arguments of
 ``T.grad`` and details about the implementation, see :ref:`this <libdoc_gradient>`.
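The ordering contract described above (output *i* pairs with input *i*, and the cost must be a scalar) can be illustrated with a numerical stand-in for ``T.grad`` built on finite differences (this `grad` helper is hypothetical, not Theano's symbolic one):

```python
def grad(f, wrt, eps=1e-6):
    """Numerical stand-in for T.grad(cost, wrt): f maps a list of scalars
    to one scalar cost; returns one derivative per element of `wrt`,
    in the same order (output i <-> input i)."""
    grads = []
    for i in range(len(wrt)):
        bumped = list(wrt)
        bumped[i] += eps
        grads.append((f(bumped) - f(wrt)) / eps)
    return grads

cost = lambda v: v[0] * v[1] + v[1] ** 2   # scalar-valued, as T.grad requires
g = grad(cost, [2.0, 3.0])                 # d/dx = y = 3, d/dy = x + 2y = 8
print([round(x, 3) for x in g])            # [3.0, 8.0]
```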
@@ -227,7 +227,7 @@ variables. Shared variables can be used in symbolic expressions just like
 the objects returned by ``dmatrices(...)`` but they also have a ``.value``
 property that defines the value taken by this symbolic variable in *all* the
 functions that use it. It is called a *shared* variable because its value is
-shared between many functions. We'll come back to this soon.
+shared between many functions. We will come back to this soon.
 The other new thing in this code is the ``updates`` parameter of function.
 The updates is a list of pairs of the form (shared-variable, new expression).
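The updates semantics described in this hunk, a list of (shared-variable, new expression) pairs applied after each call, can be sketched in plain Python (the `Shared` and `function` names below are hypothetical stand-ins for Theano's):

```python
class Shared:
    """Hypothetical stand-in for a Theano shared variable."""
    def __init__(self, value):
        self.value = value

def function(compute_output, updates):
    """Sketch of function(..., updates=[...]): after each call, every
    (shared, new_value_fn) pair replaces shared.value."""
    def call(*args):
        out = compute_output(*args)
        for shared, new_value in updates:
            shared.value = new_value(*args)
        return out
    return call

state = Shared(0)
# accumulator: returns the old state, then adds its argument to the state
accumulator = function(lambda inc: state.value,
                       updates=[(state, lambda inc: state.value + inc)])
print(accumulator(1))   # 0
print(accumulator(10))  # 1
print(state.value)      # 11
```

Because `state` lives outside the function, a second function built over the same `Shared` object would observe the updated value, which is the "shared between many functions" point made above.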
@@ -314,7 +314,7 @@ numpy, though also not too complicated.
 The way to think about putting randomness into Theano's computations is
 to put random variables in your graph. Theano will allocate a numpy
 RandomStream object (a random number generator) for each such
-variable, and draw from it as necessary. I'll call this sort of
+variable, and draw from it as necessary. We will call this sort of
 sequence of random numbers a *random stream*. *Random streams* are at
 their core shared variables, so the observations on shared variables
 hold here as well.
......
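The hunk above describes random streams as per-variable generators whose state persists across calls, like shared variables. As a loose analogy using the standard library's `random.Random` (this `RandomStream` class is a sketch, not Theano's `RandomStreams` API):

```python
import random

class RandomStream:
    """Sketch: each random variable owns a seeded generator, so the
    stream's state is carried across draws like a shared variable."""
    def __init__(self, seed):
        self._rng = random.Random(seed)

    def draw(self):
        return self._rng.random()

rv_u = RandomStream(seed=42)
a = rv_u.draw()
b = rv_u.draw()
print(a != b)  # True: successive draws advance the stream's state
```

Two streams built with the same seed reproduce the same sequence, which is the property that makes seeded random graphs repeatable.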
@@ -5,17 +5,17 @@
 Tutorial
 ========
-Let's start an interactive session (e.g. ``python`` or ``ipython``) and import Theano.
+Let us start an interactive session (e.g. ``python`` or ``ipython``) and import Theano.
 >>> from theano import *
 Many of symbols you will need to use are in the ``tensor`` subpackage
-of Theano. Let's import that subpackage under a handy name. I like
-``T`` (and many tutorials use this convention).
+of Theano. Let's import that subpackage under a handy name like
+``T`` (many tutorials use this convention).
 >>> import theano.tensor as T
-If that worked you're ready for the tutorial, otherwise check your
+If that worked you are ready for the tutorial, otherwise check your
 installation (see :ref:`install`).
 Throughout the tutorial, bear in mind that there is a :ref:`glossary` to help
......
@@ -74,7 +74,7 @@ output. You can now print the name of the op that is applied to get
 'Elemwise{mul,no_inplace}'
 So a elementwise multiplication is used to compute ``y``. This
-muliplication is done between the inputs
+muliplication is done between the inputs:
 >>> len(y.owner.inputs)
 2
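The `owner`/`inputs` structure this doctest inspects can be mimicked with two tiny classes (hypothetical `Apply` and `Variable`, not Theano's real ones): a variable knows which apply node produced it, and that node lists the op and its inputs.

```python
class Apply:
    """Sketch of a graph node: an op applied to a list of inputs."""
    def __init__(self, op, inputs):
        self.op, self.inputs = op, inputs

class Variable:
    """Sketch of a symbolic variable; owner is None for graph inputs."""
    def __init__(self, owner=None):
        self.owner = owner

x = Variable()
y = Variable(owner=Apply('Elemwise{mul,no_inplace}', [x, x]))
print(y.owner.op)           # which op produced y
print(len(y.owner.inputs))  # 2, as in the doctest above
```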
@@ -97,7 +97,7 @@ same shape as x. This is done by using the op ``DimShuffle`` :
 [2.0]
-Starting from this graph structure is easy to understand how
+Starting from this graph structure it is easy to understand how
 *automatic differentiation* is done, or how the symbolic relations
 can be optimized for performance or stability.
@@ -108,11 +108,11 @@ Automatic Differentiation
 Having the graph structure, computing automatic differentiation is
 simple. The only thing :func:`tensor.grad` has to do is to traverse the
 graph from the outputs back towards the inputs through all :ref:`apply`
-nodes ( :ref:`apply` nodes are those who define what computations the
+nodes (:ref:`apply` nodes are those that define which computations the
 graph does). For each such :ref:`apply` node, its :ref:`op` defines
 how to compute the gradient of the node's outputs with respect to its
 inputs. Note that if an :ref:`op` does not provide this information,
-it is assumed that the gradient does not defined.
+it is assumed that the gradient is not defined.
 Using the
 `chain rule <http://en.wikipedia.org/wiki/Chain_rule>`_
 these gradients can be composed in order to obtain the expression of the
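The traversal-plus-chain-rule idea in this hunk can be sketched as a toy reverse-mode differentiator (not Theano's implementation; a production version visits nodes in reverse topological order, which this small example happens to satisfy):

```python
class Var:
    """Toy reverse-mode autodiff: each node records, per input,
    the local gradient of its output with respect to that input."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (input Var, local gradient)
        self.grad = 0.0

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

def backward(output):
    """Traverse from the output back to the inputs, composing local
    gradients via the chain rule and accumulating them on each Var."""
    output.grad = 1.0
    stack = [output]
    while stack:
        node = stack.pop()
        for parent, local in node.parents:
            parent.grad += node.grad * local
            stack.append(parent)

x, y = Var(2.0), Var(3.0)
z = x * y + x   # dz/dx = y + 1 = 4, dz/dy = x = 2
backward(z)
print(x.grad, y.grad)  # 4.0 2.0
```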
@@ -127,8 +127,8 @@ When compiling a Theano function, what you give to the
 (starting from the outputs variables you can traverse the graph up to
 the input variables). While this graph structure shows how to compute
 the output from the input, it also offers the posibility to improve the
-the way this computation is carried out. The way optimizations work in
-Theano is by indentifying and replacing certain patterns in the graph
+way this computation is carried out. The way optimizations work in
+Theano is by identifying and replacing certain patterns in the graph
 with other specialized patterns that produce the same results but are either
 faster or more stable. Optimizations can also detect
 identical subgraphs and ensure that the same values are not computed
......
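The last point in the hunk above, merging identical subgraphs so a value is computed only once, is common subexpression elimination. A sketch over toy tuple-graphs (not Theano's optimizer):

```python
def cse(node, seen=None):
    """Common subexpression elimination on toy graphs: a graph is either
    a leaf (string) or an ('op', child, ...) tuple. Structurally identical
    subgraphs are merged into one shared object."""
    if seen is None:
        seen = {}
    if not isinstance(node, tuple):
        return node
    merged = tuple([node[0]] + [cse(child, seen) for child in node[1:]])
    if merged not in seen:
        seen[merged] = merged   # first occurrence becomes the canonical one
    return seen[merged]

# (x + y) * (x + y): both operands collapse to one shared subgraph
g = cse(('mul', ('add', 'x', 'y'), ('add', 'x', 'y')))
print(g[1] is g[2])  # True: the common subexpression is shared
```

After merging, an evaluator that caches results per node object computes the shared `add` once, which is exactly the saving the text describes.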