Commit dba9a7b7
Authored Mar 30, 2009 by Olivier Breuleux
some little additions
Parent: 112b3b0c

Showing 4 changed files with 46 additions and 29 deletions
doc/sandbox/module_vs_op.txt (+0 -0)
doc/topics/debug_faq.txt (+9 -6)
doc/topics/index.txt (+0 -1)
doc/topics/profilemode.txt (+37 -22)

doc/topics/module_vs_op.txt → doc/sandbox/module_vs_op.txt
File moved

doc/topics/debug_faq.txt
@@ -56,8 +56,9 @@ something that you're not seeing.
 I wrote a new optimization, but it's not getting used...
 ---------------------------------------------------------
-Remember that you have to register optimizations with the OptDb, for them to get
-used by the normal modes like FAST_COMPILE, FAST_RUN, and DEBUG_MODE.
+Remember that you have to register optimizations with the :ref:`optdb`
+for them to get used by the normal modes like FAST_COMPILE, FAST_RUN,
+and DEBUG_MODE.
 I wrote a new optimization, and it changed my results even though I'm pretty sure it is correct.
@@ -71,11 +72,13 @@ something that you're not seeing.
 The function I compiled is too slow, what's up?
 -----------------------------------------------
-First, make sure you're running in FAST_RUN mode, by passing ``mode='FAST_RUN'``
-to ``theano.function`` or ``theano.make``.
+First, make sure you're running in FAST_RUN mode, by passing
+``mode='FAST_RUN'`` to ``theano.function`` or ``theano.make``. Some
+operations have excruciatingly slow Python implementations and that
+can negatively effect the performance of FAST_COMPILE.
-Second, try the theano :ref:`profilemode`. This will tell you which Apply nodes,
-and which Ops are eating up your CPU cycles.
+Second, try the theano :ref:`profilemode`. This will tell you which
+Apply nodes, and which Ops are eating up your CPU cycles.
 .. _faq_wraplinker:

doc/topics/index.txt
@@ -14,6 +14,5 @@ Topics
 profilemode
 debugmode
 debug_faq
-module_vs_op
 randomstreams

doc/topics/profilemode.txt
@@ -17,20 +17,31 @@ First create a ProfileMode instance.
 >>> from theano import ProfileMode
 >>> profmode = theano.ProfileMode(optimizer='fast_run', linker=theano.gof.OpWiseCLinker())
-The ProfileMode constructor takes as input an optimizer and a linker. Which optimizer
-and linker to use will depend on the application. For example, a user wanting
-to profile the Python implementation only, should use the gof.PerformLinker (or
-"py" for short). On the other hand, a user wanting to profile his graph using
-c-implementations wherever possible should use the ``gof.OpWiseCLinker`` (or "c|py").
+The ProfileMode constructor takes as input an optimizer and a
+linker. Which optimizer and linker to use will depend on the
+application. For example, a user wanting to profile the Python
+implementation only, should use the gof.PerformLinker (or "py" for
+short). On the other hand, a user wanting to profile his graph using C
+implementations wherever possible should use the ``gof.OpWiseCLinker``
+(or "c|py").
 In the same manner, modifying which optimizer is passed to ProfileMode
 will decide which optimizations are applied to the graph, prior to
-profiling. Changing the optimizer should be especially useful when developing
-new graph optimizations, in order to evaluate their impact on performance.
+profiling. Changing the optimizer should be especially useful when
+developing new graph optimizations, in order to evaluate their impact
+on performance. Also keep in mind that optimizations might change the
+computation graph a lot, meaning that you might not recognize some of
+the operations that are profiled (you did not use them explicitly but
+an optimizer decided to use it to improve performance or numerical
+stability). If you cannot easily relate the output of ProfileMode with
+the computations you defined, you might want to try setting optimizer
+to None (but keep in mind the computations will be slower than if they
+were optimized).
-Note that most users will want to use ProfileMode to optimize their graph and
-find where most of the computation time is being spent. In this context,
-'fast_run' optimizer and ``gof.OpWiseCLinker`` are the most appropriate choices.
+Note that most users will want to use ProfileMode to optimize their
+graph and find where most of the computation time is being spent. In
+this context, 'fast_run' optimizer and ``gof.OpWiseCLinker`` are the
+most appropriate choices.
 Compiling your Graph with ProfileMode
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -107,16 +118,20 @@ generates the following output:
 """
-The summary has two components to it. In the first section called the Apply-wise
-summary, timing information is provided for the worst offending Apply nodes. This
-corresponds to individual nodes within your graph which take the longest to
-execute. In the second portion, the Op-wise summary, the execution time of
-all Apply nodes executing the same Op are grouped together and the total
-execution time per Op is shown.
+The summary has two components to it. In the first section called the
+Apply-wise summary, timing information is provided for the worst
+offending Apply nodes. This corresponds to individual Op applications
+within your graph which take the longest to execute (so if you use
+``dot`` twice, you will see two entries there). In the second portion,
+the Op-wise summary, the execution time of all Apply nodes executing
+the same Op are grouped together and the total execution time per Op
+is shown (so if you use ``dot`` twice, you will see only one entry
+there corresponding to the sum of the time spent in each of them).
-Note that the ProfileMode also shows which Ops were running a c implementation.
+Note that the ProfileMode also shows which Ops were running a c
+implementation.
-Developers wishing to optimize the performance of their graph, should focus on the
-worst offending Ops. If no c-implementation exists for this op, consider writing
-a c-implementation yourself or use the mailing list, to suggest that a c-implementation
-be provided.
+Developers wishing to optimize the performance of their graph, should
+focus on the worst offending Ops. If no C implementation exists for
+this op, consider writing a C implementation yourself or use the
+mailing list, to suggest that a C implementation be provided.
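
Note on the profilemode.txt change: the reworded paragraph distinguishes the Apply-wise summary (one entry per Op application) from the Op-wise summary (one entry per Op, with times summed). The sketch below illustrates only that grouping in plain Python. It does not call Theano, and the record layout, the node labels, and the timing values are all hypothetical, not actual ProfileMode output.

```python
from collections import defaultdict

# Hypothetical timing records: (apply_node_label, op_name, seconds).
# The labels and numbers are made up for illustration only.
apply_timings = [
    ("dot_1", "dot", 0.40),
    ("dot_2", "dot", 0.25),
    ("add_1", "add", 0.05),
]


def apply_wise_summary(timings):
    """Return one (label, seconds) entry per Apply node, slowest first."""
    rows = [(label, secs) for label, _op, secs in timings]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def op_wise_summary(timings):
    """Return one (op, total_seconds) entry per Op, slowest first.

    Time from every Apply node that runs the same Op is summed.
    """
    totals = defaultdict(float)
    for _label, op, secs in timings:
        totals[op] += secs
    return sorted(totals.items(), key=lambda row: row[1], reverse=True)


apply_rows = apply_wise_summary(apply_timings)
op_rows = op_wise_summary(apply_timings)

for label, secs in apply_rows:
    print(f"apply {label}: {secs:.2f}s")
for op, secs in op_rows:
    print(f"op {op}: {secs:.2f}s")
```

With ``dot`` used twice, the Apply-wise list has two ``dot`` entries while the Op-wise list has a single ``dot`` entry holding their combined time, which is the behaviour the new wording describes.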