Commit dba9a7b7, authored by Olivier Breuleux

some little additions

Parent 112b3b0c
@@ -56,8 +56,9 @@ something that you're not seeing.
 I wrote a new optimization, but it's not getting used...
 ---------------------------------------------------------
-Remember that you have to register optimizations with the OptDb, for them to get
-used by the normal modes like FAST_COMPILE, FAST_RUN, and DEBUG_MODE.
+Remember that you have to register optimizations with the :ref:`optdb`
+for them to get used by the normal modes like FAST_COMPILE, FAST_RUN,
+and DEBUG_MODE.
 I wrote a new optimization, and it changed my results even though I'm pretty sure it is correct.
@@ -71,11 +72,13 @@ something that you're not seeing.
 The function I compiled is too slow, what's up?
 -----------------------------------------------
-First, make sure you're running in FAST_RUN mode, by passing ``mode='FAST_RUN'``
-to ``theano.function`` or ``theano.make``.
+First, make sure you're running in FAST_RUN mode, by passing
+``mode='FAST_RUN'`` to ``theano.function`` or ``theano.make``. Some
+operations have excruciatingly slow Python implementations and that
+can negatively affect the performance of FAST_COMPILE.
-Second, try the theano :ref:`profilemode`. This will tell you which Apply nodes,
-and which Ops are eating up your CPU cycles.
+Second, try the theano :ref:`profilemode`. This will tell you which
+Apply nodes, and which Ops are eating up your CPU cycles.
 .. _faq_wraplinker:
......
@@ -14,6 +14,5 @@ Topics
 profilemode
 debugmode
 debug_faq
-module_vs_op
 randomstreams
@@ -17,20 +17,31 @@ First create a ProfileMode instance.
 >>> from theano import ProfileMode
 >>> profmode = theano.ProfileMode(optimizer='fast_run', linker=theano.gof.OpWiseCLinker())
-The ProfileMode constructor takes as input an optimizer and a linker. Which optimizer
-and linker to use will depend on the application. For example, a user wanting
-to profile the Python implementation only, should use the gof.PerformLinker (or
-"py" for short). On the other hand, a user wanting to profile his graph using
-c-implementations wherever possible should use the ``gof.OpWiseCLinker`` (or "c|py").
+The ProfileMode constructor takes as input an optimizer and a
+linker. Which optimizer and linker to use will depend on the
+application. For example, a user wanting to profile the Python
+implementation only, should use the gof.PerformLinker (or "py" for
+short). On the other hand, a user wanting to profile his graph using C
+implementations wherever possible should use the ``gof.OpWiseCLinker``
+(or "c|py").
 In the same manner, modifying which optimizer is passed to ProfileMode
 will decide which optimizations are applied to the graph, prior to
-profiling. Changing the optimizer should be especially useful when developing
-new graph optimizations, in order to evaluate their impact on performance.
-Note that most users will want to use ProfileMode to optimize their graph and
-find where most of the computation time is being spent. In this context,
-'fast_run' optimizer and ``gof.OpWiseCLinker`` are the most appropriate choices.
+profiling. Changing the optimizer should be especially useful when
+developing new graph optimizations, in order to evaluate their impact
+on performance. Also keep in mind that optimizations might change the
+computation graph a lot, meaning that you might not recognize some of
+the operations that are profiled (you did not use them explicitly but
+an optimizer decided to use it to improve performance or numerical
+stability). If you cannot easily relate the output of ProfileMode with
+the computations you defined, you might want to try setting optimizer
+to None (but keep in mind the computations will be slower than if they
+were optimized).
+Note that most users will want to use ProfileMode to optimize their
+graph and find where most of the computation time is being spent. In
+this context, 'fast_run' optimizer and ``gof.OpWiseCLinker`` are the
+most appropriate choices.
 Compiling your Graph with ProfileMode
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -107,16 +118,20 @@ generates the following output:
 """
-The summary has two components to it. In the first section called the Apply-wise
-summary, timing information is provided for the worst offending Apply nodes. This
-corresponds to individual nodes within your graph which take the longest to
-execute. In the second portion, the Op-wise summary, the execution time of
-all Apply nodes executing the same Op are grouped together and the total
-execution time per Op is shown.
+The summary has two components to it. In the first section called the
+Apply-wise summary, timing information is provided for the worst
+offending Apply nodes. This corresponds to individual Op applications
+within your graph which take the longest to execute (so if you use
+``dot`` twice, you will see two entries there). In the second portion,
+the Op-wise summary, the execution time of all Apply nodes executing
+the same Op are grouped together and the total execution time per Op
+is shown (so if you use ``dot`` twice, you will see only one entry
+there corresponding to the sum of the time spent in each of them).
-Note that the ProfileMode also shows which Ops were running a c implementation.
+Note that the ProfileMode also shows which Ops were running a c
+implementation.
-Developers wishing to optimize the performance of their graph, should focus on the
-worst offending Ops. If no c-implementation exists for this op, consider writing
-a c-implementation yourself or use the mailing list, to suggest that a c-implementation
-be provided.
+Developers wishing to optimize the performance of their graph, should
+focus on the worst offending Ops. If no C implementation exists for
+this op, consider writing a C implementation yourself or use the
+mailing list, to suggest that a C implementation be provided.
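
Reviewer note: the difference between the Apply-wise and Op-wise summaries described in the last hunk can be illustrated with a small, self-contained sketch. This is plain Python, not Theano code and not ProfileMode's actual implementation; the `records` list, its labels, and its timings are made-up values, and `apply_wise` and `op_wise` are hypothetical helper names chosen for illustration.

```python
from collections import defaultdict

# Hypothetical timing records: (apply_node_label, op_name, seconds).
# The values are invented for illustration; they are not ProfileMode output.
records = [
    ("dot.0", "dot", 0.40),
    ("dot.1", "dot", 0.25),
    ("add.0", "add", 0.05),
]


def apply_wise(records):
    """Return one (label, seconds) entry per Apply node, slowest first."""
    entries = [(label, seconds) for label, _op, seconds in records]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def op_wise(records):
    """Return one (op, total_seconds) entry per Op, slowest first.

    The time of every Apply node running the same Op is summed.
    """
    totals = defaultdict(float)
    for _label, op, seconds in records:
        totals[op] += seconds
    return sorted(totals.items(), key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    print("Apply-wise:")
    for label, seconds in apply_wise(records):
        print(f"  {label}: {seconds:.2f}s")
    print("Op-wise:")
    for op, seconds in op_wise(records):
        print(f"  {op}: {seconds:.2f}s")
```

With these sample records, ``dot`` appears twice in the Apply-wise listing and once in the Op-wise listing, where its entry carries the sum of both applications.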