Commit c8b10686 authored by Joseph Turian

Merged more changes into basic tutorial

Parent 61e31ecc
......@@ -6,3 +6,7 @@ Computation of the Gradient
===========================
WRITEME
Describe what is happening in general when you compute the gradient
Give examples with varying shapes
......@@ -4,7 +4,7 @@ Making arithmetic Ops on double
===============================
Now that we have a ``double`` type, we have yet to use it to perform
computations. We'll start with defining multiplication.
computations. We'll start by defining multiplication.
What is an Op?
......@@ -16,12 +16,12 @@ function definition in most programming languages. From a list of
input :ref:`Results <result>` and an Op, you can build an :ref:`apply`
node representing the application of the Op to the inputs.
It is important to understand the distinction between the definition
of a function (an Op) and the application of a function (an Apply
node). If you were to interpret the Python language using Theano's
It is important to understand the distinction between an Op (the
definition of a function) and an Apply node (the application of a
function). If you were to interpret the Python language using Theano's
structures, code going like ``def f(x): ...`` would produce an Op for
``f`` whereas code like ``a = f(x)`` or ``g(f(4), 5)`` would produce
an Apply node involving the ``f`` Op.
``f`` whereas code like ``a = f(x)`` or ``g(f(4), 5)`` would produce an
Apply node involving the ``f`` Op.
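The Op versus Apply distinction can be sketched in plain Python. This is only an illustration under stated assumptions: the ``Op`` and ``Apply`` classes below are hypothetical stand-ins written for this example, not Theano's own classes, and ``evaluate`` is an invented helper.

```python
# Hypothetical stand-ins for illustration only; not Theano's classes.
class Op:
    """The definition of a function (comparable to ``def f(x): ...``)."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def __call__(self, *inputs):
        # Calling an Op computes nothing; it only records an application.
        return Apply(self, inputs)


class Apply:
    """One application of an Op to specific inputs (comparable to ``f(4)``)."""

    def __init__(self, op, inputs):
        self.op = op
        self.inputs = inputs

    def evaluate(self):
        # Evaluate nested Apply nodes first, then apply this node's Op.
        args = [i.evaluate() if isinstance(i, Apply) else i for i in self.inputs]
        return self.op.fn(*args)


mul = Op("mul", lambda a, b: a * b)  # one definition
node_a = mul(2.0, 3.0)               # first application of mul
node_b = mul(node_a, 4.0)            # second application, same Op
result = node_b.evaluate()
```

Both nodes refer to the same ``mul`` object, which mirrors the point in the text: one definition, many applications.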
......
......@@ -25,10 +25,11 @@ array(28.4)
Let's break this down into several steps. The first step is to define
two symbols, or Results, representing the quantities that you want to
add. Note that from now on, we will use the term :term:`Result` to
mean "symbol" (in other words, ``x``, ``y``, ``z`` are all Result
objects).
two symbols, or Results, representing the quantities that you want
to add. Note that from now on, we will use the term :term:`Result`
to mean "symbol" (in other words, ``x``, ``y``, ``z`` are all Result
objects). The output of the function ``f`` is a :api:`numpy.ndarray`
with zero dimensions.
If you are following along and typing into an interpreter, you may have
noticed that there was a slight delay in executing the ``function``
......@@ -80,7 +81,7 @@ The second step is to combine ``x`` and ``y`` into their sum ``z``:
``z`` is yet another :term:`Result` which represents the addition of
``x`` and ``y``. You can use the :api:`pp <theano.printing.pp>`
function to print out the computation associated to ``z``.
function to pretty-print out the computation associated to ``z``.
>>> print pp(z)
x + y
......@@ -146,9 +147,9 @@ with numpy arrays may be found :ref:`here <predefinedtypes>`.
.. note::
Watch out for the distinction between 32 and 64 bit integers (i
prefix vs the l prefix) and between 32 and 64 bit floats (f prefix
vs the d prefix).
You the user---not the system architecture---choose whether your
program will use 32- or 64-bit integers (i prefix vs the l prefix)
and floats (f prefix vs the d prefix).
**Next:** `More examples`_
......
......@@ -61,7 +61,7 @@ Theano supports functions with multiple outputs. For example, we can
compute the :term:`elementwise` difference, absolute difference, and
squared difference between two matrices ``x`` and ``y`` at the same time:
>>> x, y = T.dmatrices('xy')
>>> x, y = T.dmatrices('x', 'y')
>>> diff = x - y
>>> abs_diff = abs(diff)
>>> diff_squared = diff**2
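For reference, the three elementwise quantities can be computed numerically in plain Python. This is a rough sketch on nested lists, not Theano code; the helper name ``elementwise_outputs`` and the sample matrices are assumptions made for this example.

```python
def elementwise_outputs(x, y):
    """Return (difference, absolute difference, squared difference).

    ``x`` and ``y`` are nested lists with the same shape; every operation
    is applied entry by entry.
    """
    diff = [[a - b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]
    abs_diff = [[abs(d) for d in row] for row in diff]
    diff_squared = [[d ** 2 for d in row] for row in diff]
    return diff, abs_diff, diff_squared


x_val = [[1, 1], [1, 1]]
y_val = [[0, 1], [2, 3]]
diff_val, abs_diff_val, diff_squared_val = elementwise_outputs(x_val, y_val)
```

All three results come from a single call, which is the plain-Python counterpart of one function with multiple outputs.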
......@@ -96,12 +96,22 @@ Here is code to compute this gradient:
>>> x = T.dscalar('x')
>>> y = x**2
>>> gy = T.grad(y, x)
>>> pp(gy)
'fill(x ** 2, 1.0) * 2 * x ** (2 - 1)'
>>> f = function([x], gy)
>>> f(4)
array(8.0)
>>> f(94.2)
array(188.40000000000001)
In the example above, we can see from ``pp(gy)`` that we are computing
the correct symbolic gradient.
``fill(x ** 2, 1.0)`` means to make a matrix of the same shape as
``x ** 2`` and fill it with 1.0.
.. note::
The optimizer will simplify the symbolic gradient expression.
We can also compute the gradient of complex expressions such as the
logistic function defined above. It turns out that the derivative of the
logistic is: :math:`ds(x)/dx = s(x) \cdot (1 - s(x))`.
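Both derivatives mentioned here can be sanity-checked numerically without Theano. The sketch below uses a central finite difference; ``numeric_grad``, the step size, and the evaluation points are assumptions chosen for illustration.

```python
import math


def numeric_grad(f, x, eps=1e-6):
    """Approximate df/dx at x with a central finite difference."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def square(x):
    return x ** 2


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


# d(x**2)/dx = 2 * x, so the value at x = 4 should be close to 8.
g_square = numeric_grad(square, 4.0)

# ds(x)/dx = s(x) * (1 - s(x)), checked at an arbitrary point.
g_logistic = numeric_grad(logistic, 0.5)
expected_logistic = logistic(0.5) * (1.0 - logistic(0.5))
```

A finite difference only approximates the derivative, so the comparison should use a tolerance rather than exact equality.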
......@@ -141,7 +151,7 @@ Let's say you want to define a function that adds two numbers, except
that if you only provide one number, the other input is assumed to be
one. You can do it like this:
>>> x, y = T.dscalars('xy')
>>> x, y = T.dscalars('x', 'y')
>>> z = x + y
>>> f = function([x, (y, 1)], z)
>>> f(33)
......@@ -153,6 +163,26 @@ The syntax is that if one of the elements in the list of inputs is a
pair, the input is the first element of the pair and the second
element is its default value. Here ``y``'s default value is set to 1.
Inputs with default values should (must?) follow inputs without default
values. There can be multiple inputs with default values. Defaults can
be set positionally or by name, as in standard Python:
>>> x, y, w = T.dscalars('x', 'y', 'w')
>>> z = (x + y) * w
>>> f = function([x, (y, 1), (w, 2)], z)
>>> f(33)
array(68.0)
>>> f(33, 2)
array(70.0)
>>> f(33, 0, 1)
array(33.0)
>>> f(33, w=1)
array(34.0)
>>> f(33, w=1, y=0)
array(33.0)
>>> f(33, w=1, 2)
<type 'exceptions.SyntaxError'>: non-keyword arg after keyword arg (<ipython console>, line 1)
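Because the calling convention follows standard Python, the same behavior can be reproduced with an ordinary function that has default arguments. This is a plain-Python analogy, not Theano code; the name ``f_plain`` is an assumption for this example.

```python
def f_plain(x, y=1, w=2):
    """Plain-Python counterpart of z = (x + y) * w with defaults y=1, w=2."""
    return (x + y) * w


only_x = f_plain(33)               # y and w take their defaults
positional_y = f_plain(33, 2)      # y set positionally, w defaults
all_positional = f_plain(33, 0, 1)
keyword_w = f_plain(33, w=1)       # w set by name, y defaults
keywords_any_order = f_plain(33, w=1, y=0)
```

As in the session above, a positional argument placed after a keyword argument is rejected by Python itself before the function is ever called.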
.. _functionstateexample:
......@@ -173,13 +203,17 @@ First let's define the accumulator function:
>>> accumulator = function([(inc, 1), ((state, new_state), 0)], new_state)
The first argument is a pair. As we saw in the previous section, this
means that ``inc`` is an input with a default value of 1. The
second argument has syntax that creates an internal state or
closure. The syntax is ``((state_result, new_state_result),
initial_value)``. What this means is that every time ``accumulator``
is called, the value of the internal ``state`` will be replaced
by the value computed as ``new_state``. In this case, the state will
be replaced by the result of incrementing it by ``inc``.
means that ``inc`` is an input with a default value of 1. The second
argument has syntax that creates an internal state. The syntax is
``((state_result, new_state_result), initial_value)``.
The internal storage associated with ``state_result`` is initialized to
``initial_value``. Every time ``accumulator`` is called, the value
of the internal ``state`` will be replaced by the value computed as
``new_state``. In this case, the state will be replaced by the result
of incrementing it by ``inc``.
We recommend (insist?) that internal state arguments occur after any
plain arguments and arguments with default values.
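The update rule described above (initialize the state, then replace it with ``new_state`` on every call) can be mimicked with a Python closure. This is only an analogy under stated assumptions: ``make_accumulator`` is a hypothetical helper, and it does not reflect how Theano stores state internally.

```python
def make_accumulator(initial_value=0):
    """Build a function whose internal state persists between calls."""
    state = initial_value  # internal storage, set to the initial value

    def accumulator(inc=1):
        nonlocal state
        new_state = state + inc  # value computed as new_state
        state = new_state        # the state is replaced on every call
        return new_state

    return accumulator


acc = make_accumulator(0)
first = acc()      # uses the default increment of 1
second = acc()     # the state carried over from the previous call
third = acc(300)   # explicit increment
```

Each call to ``make_accumulator`` yields an independent state, so two accumulators built this way do not interfere with each other.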
There is no limit to how many states you can have. You can add an
arbitrary number of elements to the input list which correspond to the
......