Commit 2dc97b65, authored January 15, 2010 by Razvan Pascanu
forgot modes.txt
Parent: 6586354a
1 changed file: doc/tutorial/modes.txt (new file, 0 → 100644), 194 additions and 0 deletions
.. _using_modes:
===============================
Using different compiling modes
===============================
Mode
====
Every time :ref:`theano.function <libdoc_compile_function>` is called,
the symbolic relationships between the input and output Theano *variables*
are optimized and compiled. The ``mode`` parameter controls how this
compilation is carried out.
Theano defines the following modes by name:
- ``FAST_COMPILE``: Apply just a few optimizations, but use C op implementations where possible.
- ``FAST_RUN``: Apply all optimizations, and use C op implementations where possible.
- ``DEBUG_MODE``: Verify the correctness of all optimizations, and compare C and Python
  implementations. This mode can take much longer than the other modes,
  but can identify many kinds of problems.
The default mode is typically ``FAST_RUN``. It can be changed through
the environment variable ``THEANO_DEFAULT_MODE``. That value is overridden
by setting ``theano.compile.mode.default_mode`` directly, which is in turn
overridden by passing the ``mode`` keyword argument to
:ref:`theano.function <libdoc_compile_function>`.
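The precedence chain described above can be sketched in plain Python. This is only an illustration of the lookup order: ``resolve_mode`` and its parameters are hypothetical names, not part of Theano, and the sketch assumes ``FAST_RUN`` as the final fallback.

```python
import os

VALID_MODES = ("FAST_COMPILE", "FAST_RUN", "DEBUG_MODE")


def resolve_mode(mode=None, default_mode=None, environ=None):
    """Pick a mode name, most specific setting first (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # Order of precedence:
    # 1. the keyword argument, 2. the module-level default,
    # 3. the THEANO_DEFAULT_MODE environment variable, 4. the fallback.
    candidates = (
        mode,
        default_mode,
        environ.get("THEANO_DEFAULT_MODE"),
        "FAST_RUN",
    )
    chosen = next(c for c in candidates if c is not None)
    if chosen not in VALID_MODES:
        raise ValueError("unknown mode: %r" % (chosen,))
    return chosen


print(resolve_mode(environ={}))
print(resolve_mode(environ={"THEANO_DEFAULT_MODE": "FAST_COMPILE"}))
print(resolve_mode(mode="DEBUG_MODE", default_mode="FAST_RUN", environ={}))
```

Passing ``environ`` explicitly keeps the sketch independent of the real process environment.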
.. _using_debugmode:
Using DebugMode
===============
While you should normally use the ``FAST_RUN`` or ``FAST_COMPILE`` mode,
it is useful at first to run your code in DebugMode
(available via ``mode='DEBUG_MODE'``). DebugMode performs several
self-checks and assertions that help diagnose programming errors
that can lead to incorrect output. Note that ``DEBUG_MODE`` is much
slower than ``FAST_RUN`` or ``FAST_COMPILE``, so use it only during
development, not when you launch 1000 processes on a cluster.
DebugMode is used as follows:

.. code-block:: python

    import theano
    import theano.tensor as T

    x = T.dvector('x')
    # inputs are passed as a list of variables
    f = theano.function([x], 10*x, mode='DEBUG_MODE')

    # x is a vector, so each call takes a sequence of values
    f([5])
    f([0])
    f([7])

If any problem is detected, DebugMode will raise an exception according to
what went wrong, either at call time (e.g. ``f([5])``) or at compile time (e.g.
``f = theano.function([x], 10*x, mode='DEBUG_MODE')``). These exceptions
should *not* be ignored; talk to your local Theano guru or email the
users list if you cannot make the exception go away.
Some kinds of errors can only be detected for certain combinations of input values.
In the example above, there is no way to guarantee that a future call to, say,
``f([-1])`` won't cause a problem. DebugMode is not a silver bullet.
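The point about input-dependent errors can be illustrated without Theano. The sketch below is not DebugMode itself: it only mimics the idea of comparing a trusted implementation against a second one, and all function names are hypothetical. The second implementation carries a deliberate bug that shows up only for negative inputs.

```python
def reference_abs(v):
    """Slow but trusted implementation (stands in for the Python version of an op)."""
    return [x if x >= 0 else -x for x in v]


def buggy_abs(v):
    """Second implementation with a deliberate bug: negatives pass through unchanged."""
    return list(v)


def checked_call(reference, candidate, v):
    """Run both implementations and raise if their results disagree."""
    expected, got = reference(v), candidate(v)
    if expected != got:
        raise AssertionError("mismatch for input %r: %r != %r" % (v, expected, got))
    return got


# These calls all pass, because the bug is never exercised.
for v in ([5.0], [0.0], [7.0]):
    checked_call(reference_abs, buggy_abs, v)

# Only a negative input reveals the disagreement.
try:
    checked_call(reference_abs, buggy_abs, [-1.0])
    caught = False
except AssertionError:
    caught = True
print(caught)
```

A checker of this kind can only report problems for the inputs it actually sees, which is why passing runs do not prove correctness.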
If you instantiate DebugMode using the constructor ``compile.DebugMode``
rather than the keyword ``DEBUG_MODE``, you can configure its behaviour via
constructor arguments. See :ref:`DebugMode <compile_debugMode>` for details.
The keyword version of DebugMode (which you get by using ``mode='DEBUG_MODE'``)
is quite strict, and can raise several different exception types. For a
list of the possible exceptions, see the DebugMode documentation.
.. _using_profilemode:
ProfileMode
===========
Besides checking for errors, another important task is to profile your
code. For this, Theano uses a special mode called ProfileMode, which has
to be passed as an argument to :ref:`theano.function <libdoc_compile_function>`.
Using ProfileMode is a three-step process.
Creating a ProfileMode Instance
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
First create a ProfileMode instance.

>>> import theano
>>> from theano import ProfileMode
>>> profmode = ProfileMode(optimizer='fast_run', linker=theano.gof.OpWiseCLinker())

The ProfileMode constructor takes as input an optimizer and a
linker. Which optimizer and linker to use depends on the
application. For example, a user wanting to profile the Python
implementation only should use ``gof.PerformLinker`` (or "py" for
short). On the other hand, a user wanting to profile their graph using C
implementations wherever possible should use ``gof.OpWiseCLinker``
(or "c|py"). For testing the speed of your code, we recommend
using the 'fast_run' optimizer and the ``gof.OpWiseCLinker`` linker.
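The difference between "py" and "c|py" can be pictured as a preference order over the available implementations of each op. The following plain-Python sketch is an assumption made for illustration and does not reflect how Theano's linkers are written; ``OPS`` and ``pick_impl`` are hypothetical names, and the "C" function is an ordinary Python stand-in.

```python
def py_add(a, b):
    return a + b


def c_add(a, b):
    # Stand-in for a compiled C implementation.
    return a + b


def py_neg(a):
    return -a


# Each hypothetical op lists the implementations it provides.
OPS = {
    "add": {"c": c_add, "py": py_add},
    "neg": {"py": py_neg},  # no C implementation available
}


def pick_impl(op_name, linker):
    """Return (kind, function) for the first kind named in the linker string."""
    impls = OPS[op_name]
    for kind in linker.split("|"):
        if kind in impls:
            return kind, impls[kind]
    raise LookupError("%s has no implementation for linker %r" % (op_name, linker))


print(pick_impl("add", "c|py")[0])
print(pick_impl("neg", "c|py")[0])
print(pick_impl("add", "py")[0])
```

With "c|py" an op falls back to Python only when it has no C version, while "py" forces the Python version everywhere.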
Compiling your Graph with ProfileMode
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Once the ProfileMode instance is created, simply compile your graph as you
would normally, by specifying the mode parameter.
>>> # with functions (input1, input2 and output1 stand for your own variables)
>>> f = theano.function([input1, input2], [output1], mode=profmode)
>>> # with modules
>>> m = theano.Module()
>>> minst = m.make(mode=profmode)
Retrieving Timing Information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Once your graph is compiled, simply run the program or operation you wish to
profile, then call ``profmode.print_summary()``. This will provide you with
the desired timing information, indicating where your graph is spending most
of its time.
This is best shown through an example. Let's use logistic
regression. (Code for this example is in the file
``benchmark/regression/regression.py``.)
Compiling the module with ProfileMode and calling ``profmode.print_summary()``
generates the following output:
.. code-block:: python

    """
    ProfileMode.print_summary()
    ---------------------------

    local_time 0.0749197006226 (Time spent running thunks)

    Apply-wise summary: <fraction of local_time spent at this position> (<Apply position>, <Apply Op name>)
        0.069 15 _dot22
        0.064 1 _dot22
        0.053 0 InplaceDimShuffle{x,0}
        0.049 2 InplaceDimShuffle{1,0}
        0.049 10 mul
        0.049 6 Elemwise{ScalarSigmoid{output_types_preference=<theano.scalar.basic.transfer_type object at 0x171e650>}}[(0, 0)]
        0.049 3 InplaceDimShuffle{x}
        0.049 4 InplaceDimShuffle{x,x}
        0.048 14 Sum{0}
        0.047 7 sub
        0.046 17 mul
        0.045 9 sqr
        0.045 8 Elemwise{sub}
        0.045 16 Sum
        0.044 18 mul
        ... (remaining 6 Apply instances account for 0.25 of the runtime)

    Op-wise summary: <fraction of local_time spent on this kind of Op> <Op name>
        0.139 * mul
        0.134 * _dot22
        0.092 * sub
        0.085 * Elemwise{Sub{output_types_preference=<theano.scalar.basic.transfer_type object at 0x1779f10>}}[(0, 0)]
        0.053 * InplaceDimShuffle{x,0}
        0.049 * InplaceDimShuffle{1,0}
        0.049 * Elemwise{ScalarSigmoid{output_types_preference=<theano.scalar.basic.transfer_type object at 0x171e650>}}[(0, 0)]
        0.049 * InplaceDimShuffle{x}
        0.049 * InplaceDimShuffle{x,x}
        0.048 * Sum{0}
        0.045 * sqr
        0.045 * Sum
        0.043 * Sum{1}
        0.042 * Elemwise{Mul{output_types_preference=<theano.scalar.basic.transfer_type object at 0x17a0f50>}}[(0, 1)]
        0.041 * Elemwise{Add{output_types_preference=<theano.scalar.basic.transfer_type object at 0x1736a50>}}[(0, 0)]
        0.039 * Elemwise{Second{output_types_preference=<theano.scalar.basic.transfer_type object at 0x1736d90>}}[(0, 1)]
        ... (remaining 0 Ops account for 0.00 of the runtime)
    (*) Op is running a c implementation
    """

The summary has two components. The first section, the
Apply-wise summary, gives timing information for the worst
offending Apply nodes. These are the individual Op applications
within your graph that take the longest to execute (so if you use
``dot`` twice, you will see two entries there). The second section,
the Op-wise summary, groups together all Apply nodes executing
the same Op and shows the total execution time per Op
(so if you use ``dot`` twice, you will see only one entry
there, corresponding to the sum of the time spent in each of them).
Note that ProfileMode also shows which Ops were running a C
implementation.
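The relationship between the two sections can be sketched in plain Python. The timings below are invented for illustration and the helper names are hypothetical; the sketch only shows how per-Apply times roll up into per-Op totals, not how ProfileMode collects them.

```python
from collections import defaultdict

# (apply position, op name, seconds); invented numbers for illustration only.
apply_times = [
    (0, "dot", 0.030),
    (1, "sigmoid", 0.010),
    (2, "dot", 0.050),
    (3, "mul", 0.010),
]

# Total time spent across all Apply nodes.
local_time = sum(t for _, _, t in apply_times)


def apply_wise(records):
    """One row per Apply node, slowest first: (fraction, position, op name)."""
    rows = [(t / local_time, pos, op) for pos, op, t in records]
    return sorted(rows, reverse=True)


def op_wise(records):
    """One row per Op, summing the time of every Apply node that uses it."""
    totals = defaultdict(float)
    for _, op, t in records:
        totals[op] += t
    rows = [(t / local_time, op) for op, t in totals.items()]
    return sorted(rows, reverse=True)


print("Apply-wise summary:")
for frac, pos, op in apply_wise(apply_times):
    print("  %.3f  %2d  %s" % (frac, pos, op))

print("Op-wise summary:")
for frac, op in op_wise(apply_times):
    print("  %.3f  %s" % (frac, op))
```

Here ``dot`` appears twice in the Apply-wise rows and once in the Op-wise rows, where its two timings are added together.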