testgroup / pytensor · Commits

Commit 1aa76646
Authored Feb 01, 2016 by Francesco Visin

Merge graphstructures from tutorial into extending

Parent: eba65e5e

Showing 6 changed files with 146 additions and 304 deletions
doc/extending/graphstructures.txt             +142  -116
doc/extending/pics/symbolic_graph_opt.png       +0    -0
doc/extending/pics/symbolic_graph_unopt.png     +0    -0
doc/glossary.txt                                +4    -4
doc/tutorial/index.txt                          +0    -1
doc/tutorial/symbolic_graphs.txt                +0  -183
doc/extending/graphstructures.txt
...
...
@@ -5,16 +5,28 @@
Graph Structures
================
Theano represents symbolic mathematical computations as graphs. These
graphs are composed of interconnected :ref:`apply` and :ref:`variable`
nodes. They are associated to *function application* and *data*,
respectively. Operations are represented by :ref:`op` instances and data
types are represented by :ref:`type` instances. Here is a piece of code
and a diagram showing the structure built by that piece of code. This
should help you understand how these pieces fit together:
Debugging or profiling code written in Theano is not that simple if you
do not know what goes on under the hood. This chapter is meant to
introduce you to a required minimum of the inner workings of Theano.
The first step in writing Theano code is to write down all mathematical
relations using symbolic placeholders (**variables**). When writing down
these expressions you use operations like ``+``, ``-``, ``**``,
``sum()``, ``tanh()``. All these are represented internally as **ops**.
An *op* represents a certain computation on some type of inputs
producing some type of output. You can see it as a *function definition*
in most programming languages.
Theano represents symbolic mathematical computations as graphs. These
graphs are composed of interconnected :ref:`apply`, :ref:`variable` and
:ref:`op` nodes. An *apply* node represents the application of an *op* to some
*variables*. It is important to draw the difference between the
definition of a computation represented by an *op* and its application
to some actual data which is represented by the *apply* node.
Furthermore, data types are represented by :ref:`type` instances. Here is a
piece of code and a diagram showing the structure built by that piece of code.
This should help you understand how these pieces fit together:
-----------------------
**Code**
...
...
@@ -28,12 +40,14 @@ should help you understand how these pieces fit together:
**Diagram**
.. _tutorial-graphfigure:
.. image:: apply.png
   :align: center
-----------------------
Arrows represent references to the Python objects pointed at. The blue
box is an :ref:`Apply` node. Red boxes are :ref:`Variable` nodes. Green
circles are :ref:`Ops <op>`. Purple boxes are :ref:`Types <type>`.
.. TODO
...
...
@@ -58,110 +72,52 @@ Note that the ``Apply`` instance's outputs points to
``z``, and ``z.owner`` points back to the ``Apply`` instance.
An explicit example
===================
In this example we will compare two ways of defining the same graph.
First, a short bit of code will build an expression (graph) the *normal* way, with most of the
graph construction being done automatically.
Second, we will walk through a longer re-coding of the same thing
without any shortcuts, that will make the graph construction very explicit.
**Short example**
This is what you would normally type:
.. testcode::
   import theano.tensor as T

   # create 3 Variables with owner = None
   x = T.matrix('x')
   y = T.matrix('y')
   z = T.matrix('z')

   # create 2 Variables (one for 'e', one intermediate for y*z)
   # create 2 Apply instances (one for '+', one for '*')
   e = x + y * z
**Long example**
This is what you would type to build the graph explicitly:
.. testcode::
   from theano.tensor import add, mul, Apply, Variable, Constant, TensorType

   # Instantiate a type that represents a matrix of doubles
   float64_matrix = TensorType(dtype='float64',               # double
                               broadcastable=(False, False))  # matrix

   # We make the Variable instances we need.
   x = Variable(type=float64_matrix, name='x')
   y = Variable(type=float64_matrix, name='y')
   z = Variable(type=float64_matrix, name='z')

   # This is the Variable that we want to symbolically represent y*z
   mul_variable = Variable(type=float64_matrix)
   assert mul_variable.owner is None

   # Instantiate a symbolic multiplication
   node_mul = Apply(op=mul,
                    inputs=[y, z],
                    outputs=[mul_variable])
   # Fields 'owner' and 'index' are set by Apply
   assert mul_variable.owner is node_mul
   # 'index' is the position of mul_variable in node_mul's outputs
   assert mul_variable.index == 0

   # This is the Variable that we want to symbolically represent x+(y*z)
   add_variable = Variable(type=float64_matrix)
   assert add_variable.owner is None

   # Instantiate a symbolic addition
   node_add = Apply(op=add,
                    inputs=[x, mul_variable],
                    outputs=[add_variable])
   # Fields 'owner' and 'index' are set by Apply
   assert add_variable.owner is node_add
   assert add_variable.index == 0

   e = add_variable

   # We have access to x, y and z through pointers
   assert e.owner.inputs[0] is x
   assert e.owner.inputs[1] is mul_variable
   assert e.owner.inputs[1].owner.inputs[0] is y
   assert e.owner.inputs[1].owner.inputs[1] is z
Note how the call to ``Apply`` modifies the ``owner`` and ``index``
fields of the :ref:`Variables <variable>` passed as outputs to point to
itself and the rank they occupy in the output list. This whole
machinery builds a DAG (Directed Acyclic Graph) representing the
computation, a graph that Theano can compile and optimize.
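The ``owner``/``index`` bookkeeping can be mimicked in a few lines of plain
Python. This is a toy sketch for illustration only: ``ToyVariable`` and
``ToyApply`` are invented names, not Theano classes.

```python
class ToyVariable:
    """A symbolic placeholder; `owner` and `index` are filled in by the
    Apply node that produces it, mirroring the mechanism described above."""
    def __init__(self, name=None):
        self.name = name
        self.owner = None   # the ToyApply producing this variable, if any
        self.index = None   # position of this variable in owner.outputs

class ToyApply:
    """Application of an op (here just a string) to input variables."""
    def __init__(self, op, inputs, outputs):
        self.op, self.inputs, self.outputs = op, inputs, outputs
        for i, out in enumerate(outputs):   # the bookkeeping step
            out.owner, out.index = self, i

# e = x + (y * z), built explicitly as in the long example
x, y, z = ToyVariable('x'), ToyVariable('y'), ToyVariable('z')
mul_out = ToyVariable()
node_mul = ToyApply('mul', [y, z], [mul_out])
add_out = ToyVariable()
node_add = ToyApply('add', [x, mul_out], [add_out])

assert add_out.owner is node_add and add_out.index == 0
assert add_out.owner.inputs[1].owner.inputs == [y, z]
```

The asserts at the end retrace the same pointer chain as the long example:
from the output variable, ``owner`` leads to the apply node, and its
``inputs`` lead back toward ``y`` and ``z``.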
Automatic wrapping
------------------

All nodes in the graph must be instances of ``Apply`` or ``Variable``, but
``<Op subclass>.make_node()`` typically wraps constants to satisfy those
constraints. For example, the :func:`tensor.add`
Op instance is written so that:

.. testcode::

   e = T.dscalar('x') + 1

builds the following graph:

.. testcode::

   node = Apply(op=add,
                inputs=[Variable(type=T.dscalar, name='x'),
                        Constant(type=T.lscalar, data=1)],
                outputs=[Variable(type=T.dscalar)])
   e = node.outputs[0]

Traversing the graph
====================

The graph can be traversed starting from outputs (the result of some
computation) down to its inputs using the ``owner`` field.
Take for example the following code:

>>> import theano
>>> x = theano.tensor.dmatrix('x')
>>> y = x * 2.

If you enter ``type(y.owner)`` you get ``<class 'theano.gof.graph.Apply'>``,
which is the apply node that connects the op and the inputs to get this
output. You can now print the name of the op that is applied to get
*y*:

>>> y.owner.op.name
'Elemwise{mul,no_inplace}'

Hence, an elementwise multiplication is used to compute *y*. This
multiplication is done between the inputs:

>>> len(y.owner.inputs)
2
>>> y.owner.inputs[0]
x
>>> y.owner.inputs[1]
DimShuffle{x,x}.0

Note that the second input is not 2 as we would have expected. This is
because 2 was first :term:`broadcasted <broadcasting>` to a matrix of the
same shape as *x*. This is done using the op ``DimShuffle``:

>>> type(y.owner.inputs[1])
<class 'theano.tensor.var.TensorVariable'>
>>> type(y.owner.inputs[1].owner)
<class 'theano.gof.graph.Apply'>
>>> y.owner.inputs[1].owner.op # doctest: +SKIP
<theano.tensor.elemwise.DimShuffle object at 0x106fcaf10>
>>> y.owner.inputs[1].owner.inputs
[TensorConstant{2.0}]

Starting from this graph structure it is easier to understand how
*automatic differentiation* proceeds and how the symbolic relations
can be *optimized* for performance or stability.
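This kind of ``owner``-based walk generalizes to a recursive traversal that
collects every leaf contributing to a result. A minimal sketch, using a
hypothetical nested-tuple encoding ``('op', *inputs)`` rather than Theano's
actual node classes:

```python
def graph_inputs(expr):
    """Collect the leaf inputs of a nested ('op', *args) expression,
    analogous to walking `owner.inputs` back from an output variable."""
    if not isinstance(expr, tuple):      # a leaf: variable name or constant
        return [expr]
    leaves = []
    for arg in expr[1:]:                 # expr[0] is the op name
        leaves.extend(graph_inputs(arg))
    return leaves

# y = x * 2, where the constant 2 is first lifted to x's shape
y = ('mul', 'x', ('dimshuffle', 2.0))
print(graph_inputs(y))  # ['x', 2.0]
```

Notice how the broadcasted constant shows up as a leaf below an inner node,
just as ``TensorConstant{2.0}`` sits below the ``DimShuffle`` apply node in
the doctest above.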
Graph Structures
...
...
@@ -224,7 +180,7 @@ An Apply instance can be created by calling ``gof.Apply(op, inputs, outputs)``.
.. _op:
--
Op
--
...
...
@@ -242,16 +198,13 @@ structures, code going like ``def f(x): ...`` would produce an Op for
Apply node involving the ``f`` Op.
.. index::
single: Type
single: graph construct; Type
.. _type:
----
Type
----
...
...
@@ -297,7 +250,6 @@ Theano Type.
--------
Variable
--------
...
...
@@ -426,3 +378,77 @@ Sum{acc_dtype=float64} [id A] '' 1
>>> client
('output', 0)
>>> assert f.maker.fgraph.outputs[client[1]] is var
Automatic Differentiation
=========================
Having the graph structure, computing automatic differentiation is
simple. The only thing :func:`tensor.grad` has to do is to traverse the
graph from the outputs back towards the inputs through all *apply*
nodes (*apply* nodes are those that define which computations the
graph does). For each such *apply* node, its *op* defines
how to compute the *gradient* of the node's outputs with respect to its
inputs. Note that if an *op* does not provide this information,
it is assumed that the *gradient* is not defined.
Using the
`chain rule <http://en.wikipedia.org/wiki/Chain_rule>`_,
these gradients can be composed in order to obtain the expression of the
*gradient* of the graph's output with respect to the graph's inputs.
A later section of this tutorial examines the topic of
:ref:`differentiation <tutcomputinggrads>` in greater detail.
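The chain-rule composition can be illustrated with a tiny recursive
differentiator over the same kind of tuple expression. This is a toy sketch
with hand-written per-op gradient rules; Theano's :func:`tensor.grad` instead
asks each *op* in the graph for its rule and composes them.

```python
import math

def evaluate(expr, env):
    """Numerically evaluate a nested ('op', *args) expression."""
    if not isinstance(expr, tuple):
        return env.get(expr, expr)       # variable lookup, or a literal
    op, *args = expr
    vals = [evaluate(a, env) for a in args]
    return {'add': lambda a, b: a + b,
            'mul': lambda a, b: a * b,
            'tanh': lambda a: math.tanh(a)}[op](*vals)

def grad(expr, wrt, env):
    """d(expr)/d(wrt) at the point given by env, via the chain rule."""
    if expr == wrt:
        return 1.0
    if not isinstance(expr, tuple):      # other leaf: constant or other var
        return 0.0
    op, *args = expr
    if op == 'add':                      # d(a+b) = da + db
        return grad(args[0], wrt, env) + grad(args[1], wrt, env)
    if op == 'mul':                      # product rule
        a, b = args
        return (evaluate(a, env) * grad(b, wrt, env)
                + evaluate(b, env) * grad(a, wrt, env))
    if op == 'tanh':                     # d tanh(a) = (1 - tanh(a)^2) * da
        (a,) = args
        return (1.0 - math.tanh(evaluate(a, env)) ** 2) * grad(a, wrt, env)
    raise ValueError('no gradient rule for op %r' % op)

# e = x * y + x  =>  de/dx = y + 1
e = ('add', ('mul', 'x', 'y'), 'x')
print(grad(e, 'x', {'x': 2.0, 'y': 3.0}))  # 4.0
```

An op without an entry in the rule table raises, matching the note above
that a missing gradient rule means the gradient is taken to be undefined.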
Optimizations
=============
When compiling a Theano function, what you give to the
:func:`theano.function <function.function>` is actually a graph
(starting from the output variables you can traverse the graph up to
the input variables). While this graph structure shows how to compute
the output from the input, it also offers the possibility to improve the
way this computation is carried out. The way optimizations work in
Theano is by identifying and replacing certain patterns in the graph
with other specialized patterns that produce the same results but are either
faster or more stable. Optimizations can also detect
identical subgraphs and ensure that the same values are not computed
twice or reformulate parts of the graph to a GPU specific version.
For example, one (simple) optimization that Theano uses is to replace
the pattern :math:`\frac{xy}{y}` by :math:`x`.
Further information regarding the optimization :ref:`process <optimization>`
and the specific :ref:`optimizations <optimizations>` that are applicable
is available in the library documentation and on the documentation entry
page, respectively.
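The :math:`\frac{xy}{y} \to x` rewrite can be sketched as a pattern match
over the same kind of tuple expression used earlier. This is a toy sketch;
Theano's optimizer works on ``FunctionGraph`` objects, not tuples.

```python
def simplify(expr):
    """Recursively replace the pattern ('div', ('mul', a, b), b) by a."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [simplify(a) for a in args]   # simplify children first
    if (op == 'div' and isinstance(args[0], tuple)
            and args[0][0] == 'mul' and args[0][2] == args[1]):
        return args[0][1]                # (a*b)/b -> a
    return (op,) + tuple(args)

e = ('div', ('mul', 'x', 'y'), 'y')
print(simplify(e))  # prints: x
```

Simplifying children before matching the parent lets rewrites cascade, which
is also why graph optimizers apply their replacements repeatedly until no
pattern fires.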
**Example**
Symbolic programming involves a change of paradigm: it will become clearer
as we apply it. Consider the following example of optimization:
>>> import theano
>>> a = theano.tensor.vector("a") # declare symbolic variable
>>> b = a + a ** 10 # build symbolic expression
>>> f = theano.function([a], b) # compile function
>>> print(f([0, 1, 2])) # prints `array([0,2,1026])`
[ 0. 2. 1026.]
>>> theano.printing.pydotprint(b, outfile="./pics/symbolic_graph_unopt.png", var_with_name_simple=True) # doctest: +SKIP
The output file is available at ./pics/symbolic_graph_unopt.png
>>> theano.printing.pydotprint(f, outfile="./pics/symbolic_graph_opt.png", var_with_name_simple=True) # doctest: +SKIP
The output file is available at ./pics/symbolic_graph_opt.png
We used :func:`theano.printing.pydotprint` to visualize the optimized graph
(right), which is much more compact than the unoptimized graph (left).
.. |g1| image:: ./pics/symbolic_graph_unopt.png
   :width: 500 px
.. |g2| image:: ./pics/symbolic_graph_opt.png
   :width: 500 px
================================ ================================
Unoptimized graph                Optimized graph
================================ ================================
|g1|                             |g2|
================================ ================================
doc/tutorial/pics/symbolic_graph_opt.png → doc/extending/pics/symbolic_graph_opt.png

File moved
doc/tutorial/pics/symbolic_graph_unopt.png → doc/extending/pics/symbolic_graph_unopt.png

File moved
doc/glossary.txt
...
...
@@ -63,7 +63,7 @@ Glossary
then compiling them with :term:`theano.function`.
See also :term:`Variable`, :term:`Op`, :term:`Apply`, and
:term:`Type`, or read more about :ref:`graphstructures`.
Destructive
An :term:`Op` is destructive (of particular input[s]) if its
...
...
@@ -108,7 +108,7 @@ Glossary
are provided with Theano, but you can add more.
See also :term:`Variable`, :term:`Type`, and :term:`Apply`,
or read more about :ref:`graphstructures`.
Optimizer
An instance of :class:`Optimizer`, which has the capacity to provide
...
...
@@ -141,7 +141,7 @@ Glossary
``.type`` attribute of a :term:`Variable`.
See also :term:`Variable`, :term:`Op`, and :term:`Apply`,
or read more about :ref:`graphstructures`.
Variable
The main data structure you work with when using Theano.
...
...
@@ -153,7 +153,7 @@ Glossary
``x`` and ``y`` are both `Variables`, i.e. instances of the :class:`Variable` class.
See also :term:`Type`, :term:`Op`, and :term:`Apply`,
or read more about :ref:`graphstructures`.
View
Some Tensor Ops (such as Subtensor and Transpose) can be computed in
...
...
doc/tutorial/index.txt
...
...
@@ -56,7 +56,6 @@ Advanced configuration and debugging
.. toctree::
modes
symbolic_graphs
printing_drawing
debug_faq
nan_tutorial
...
...
doc/tutorial/symbolic_graphs.txt deleted 100644 → 0
.. _tutorial_graphstructures:
================
Graph Structures
================
Theano Graphs
=============
Debugging or profiling code written in Theano is not that simple if you
do not know what goes on under the hood. This chapter is meant to
introduce you to a required minimum of the inner workings of Theano.
For more detail see :ref:`extending`.
The first step in writing Theano code is to write down all mathematical
relations using symbolic placeholders (**variables**). When writing down
these expressions you use operations like ``+``, ``-``, ``**``,
``sum()``, ``tanh()``. All these are represented internally as **ops**.
An *op* represents a certain computation on some type of inputs
producing some type of output. You can see it as a *function definition*
in most programming languages.
Theano builds internally a graph structure composed of interconnected
**variable** nodes, **op** nodes and **apply** nodes. An
*apply* node represents the application of an *op* to some
*variables*. It is important to draw the difference between the
definition of a computation represented by an *op* and its application
to some actual data which is represented by the *apply* node. For more
detail about these building blocks refer to :ref:`variable`, :ref:`op`,
:ref:`apply`. Here is an example of a graph:
**Code**
.. testcode::
   import theano.tensor as T

   x = T.dmatrix('x')
   y = T.dmatrix('y')
   z = x + y
**Diagram**
.. _tutorial-graphfigure:
.. figure:: apply.png
   :align: center

   Interaction between instances of Apply (blue), Variable (red), Op (green),
   and Type (purple).
.. # COMMENT
   WARNING: hyper-links and ref's seem to break the PDF build when placed
   into this figure caption.
Arrows in this figure represent references to the
Python objects pointed at. The blue
box is an :ref:`Apply` node. Red boxes are :ref:`Variable` nodes. Green
circles are :ref:`Ops <op>`. Purple boxes are :ref:`Types <type>`.
The graph can be traversed starting from outputs (the result of some
computation) down to its inputs using the ``owner`` field.
Take for example the following code:
>>> import theano
>>> x = theano.tensor.dmatrix('x')
>>> y = x * 2.
If you enter ``type(y.owner)`` you get ``<class 'theano.gof.graph.Apply'>``,
which is the apply node that connects the op and the inputs to get this
output. You can now print the name of the op that is applied to get
*y*:
>>> y.owner.op.name
'Elemwise{mul,no_inplace}'
Hence, an elementwise multiplication is used to compute *y*. This
multiplication is done between the inputs:
>>> len(y.owner.inputs)
2
>>> y.owner.inputs[0]
x
>>> y.owner.inputs[1]
DimShuffle{x,x}.0
Note that the second input is not 2 as we would have expected. This is
because 2 was first :term:`broadcasted <broadcasting>` to a matrix of the
same shape as *x*. This is done using the op ``DimShuffle``:
>>> type(y.owner.inputs[1])
<class 'theano.tensor.var.TensorVariable'>
>>> type(y.owner.inputs[1].owner)
<class 'theano.gof.graph.Apply'>
>>> y.owner.inputs[1].owner.op # doctest: +SKIP
<theano.tensor.elemwise.DimShuffle object at 0x106fcaf10>
>>> y.owner.inputs[1].owner.inputs
[TensorConstant{2.0}]
Starting from this graph structure it is easier to understand how
*automatic differentiation* proceeds and how the symbolic relations
can be *optimized* for performance or stability.
Automatic Differentiation
=========================
Having the graph structure, computing automatic differentiation is
simple. The only thing :func:`tensor.grad` has to do is to traverse the
graph from the outputs back towards the inputs through all *apply*
nodes (*apply* nodes are those that define which computations the
graph does). For each such *apply* node, its *op* defines
how to compute the *gradient* of the node's outputs with respect to its
inputs. Note that if an *op* does not provide this information,
it is assumed that the *gradient* is not defined.
Using the
`chain rule <http://en.wikipedia.org/wiki/Chain_rule>`_,
these gradients can be composed in order to obtain the expression of the
*gradient* of the graph's output with respect to the graph's inputs.
A later section of this tutorial examines the topic of
:ref:`differentiation <tutcomputinggrads>` in greater detail.
Optimizations
=============
When compiling a Theano function, what you give to the
:func:`theano.function <function.function>` is actually a graph
(starting from the output variables you can traverse the graph up to
the input variables). While this graph structure shows how to compute
the output from the input, it also offers the possibility to improve the
way this computation is carried out. The way optimizations work in
Theano is by identifying and replacing certain patterns in the graph
with other specialized patterns that produce the same results but are either
faster or more stable. Optimizations can also detect
identical subgraphs and ensure that the same values are not computed
twice or reformulate parts of the graph to a GPU specific version.
For example, one (simple) optimization that Theano uses is to replace
the pattern :math:`\frac{xy}{y}` by :math:`x`.
Further information regarding the optimization :ref:`process <optimization>`
and the specific :ref:`optimizations <optimizations>` that are applicable
is available in the library documentation and on the documentation entry
page, respectively.
**Example**
Symbolic programming involves a change of paradigm: it will become clearer
as we apply it. Consider the following example of optimization:
>>> import theano
>>> a = theano.tensor.vector("a") # declare symbolic variable
>>> b = a + a ** 10 # build symbolic expression
>>> f = theano.function([a], b) # compile function
>>> print(f([0, 1, 2])) # prints `array([0,2,1026])`
[ 0. 2. 1026.]
>>> theano.printing.pydotprint(b, outfile="./pics/symbolic_graph_unopt.png", var_with_name_simple=True) # doctest: +SKIP
The output file is available at ./pics/symbolic_graph_unopt.png
>>> theano.printing.pydotprint(f, outfile="./pics/symbolic_graph_opt.png", var_with_name_simple=True) # doctest: +SKIP
The output file is available at ./pics/symbolic_graph_opt.png
.. |g1| image:: ./pics/symbolic_graph_unopt.png
   :width: 500 px
.. |g2| image:: ./pics/symbolic_graph_opt.png
   :width: 500 px
We used :func:`theano.printing.pydotprint` to visualize the optimized graph
(right), which is much more compact than the unoptimized graph (left).
====================================================== =====================================================
Unoptimized graph Optimized graph
====================================================== =====================================================
|g1| |g2|
====================================================== =====================================================