Commit 171397d4 authored by abergeron

Merge pull request #1632 from nouiz/canopy

Make Theano use the proper path on Windows with Canopy installed for all users. Mark ProfileMode as deprecated in the docs and document config.profile.
......@@ -21,7 +21,7 @@ Theano defines the following modes by name:
- ``'FAST_COMPILE'``: Apply just a few graph optimizations and only use Python implementations.
- ``'FAST_RUN'``: Apply all optimizations, and use C implementations where possible.
- ``'DebugMode'``: A mode for debugging. See :ref:`DebugMode <debugmode>` for details.
- ``'ProfileMode'``: A mode for profiling. See :ref:`ProfileMode <profilemode>` for details.
- ``'ProfileMode'``: Deprecated, use the Theano flag :attr:`config.profile`.
- ``'DEBUG_MODE'``: Deprecated. Use the string DebugMode.
- ``'PROFILE_MODE'``: Deprecated. Use the string ProfileMode.
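Since ``'ProfileMode'`` is deprecated in favor of the :attr:`config.profile` flag, profiling is now enabled through configuration rather than a compilation mode. A minimal sketch, assuming the usual ``THEANO_FLAGS`` environment-variable mechanism (``my_script.py`` is a placeholder for any script that compiles and calls a Theano function):

```shell
# Enable the profiler instead of passing mode='ProfileMode';
# the profile is printed on stdout when the process exits.
THEANO_FLAGS=profile=True python my_script.py
```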
......@@ -42,10 +42,6 @@ Reference
.. attribute:: FAST_RUN
.. attribute:: DEBUG_MODE
.. attribute:: PROFILE_MODE
.. class:: Mode(object)
Compilation is controlled by two attributes: the `optimizer` controls how
......
......@@ -14,6 +14,10 @@
Guide
=====
.. note::
ProfileMode is deprecated. Use :attr:`config.profile` instead.
To profile a Theano graph, a special mode called ProfileMode must be passed as
an argument when compiling your graph. Using ProfileMode is a three-step
process.
......
......@@ -258,7 +258,8 @@ import theano and print the config variable, as in:
.. attribute:: mode
String value: 'Mode', 'ProfileMode', 'DebugMode', 'FAST_RUN', 'FAST_COMPILE'
String value: 'Mode', 'ProfileMode' (deprecated), 'DebugMode', 'FAST_RUN',
'FAST_COMPILE'
Default 'Mode'
......@@ -273,6 +274,8 @@ import theano and print the config variable, as in:
Do the vm/cvm linkers profile the execution time of Theano functions?
See :ref:`tut_profiling` for examples.
.. attribute:: profile_memory
Bool value: either True or False
......
......@@ -41,6 +41,7 @@ you out.
aliasing
shape_info
debug_faq
profiling
extending_theano
faq
python-memory-management
......@@ -137,7 +137,7 @@ Theano defines the following modes by name:
- ``'DebugMode'``: Verify the correctness of all optimizations, and compare C and Python
implementations. This mode can take much longer than the other modes, but can identify
several kinds of problems.
- ``'ProfileMode'``: Same optimization as FAST_RUN, but print some profiling information
- ``'ProfileMode'`` (deprecated): Same optimizations as FAST_RUN, but prints some profiling information.
The default mode is typically ``FAST_RUN``, but it can be controlled via
the configuration variable :attr:`config.mode`,
......@@ -150,7 +150,7 @@ short name Full constructor
``FAST_COMPILE`` ``compile.mode.Mode(linker='py', optimizer='fast_compile')`` Python implementations only, quick and cheap graph transformations
``FAST_RUN`` ``compile.mode.Mode(linker='cvm', optimizer='fast_run')`` C implementations where available, all available graph transformations.
``DebugMode`` ``compile.debugmode.DebugMode()`` Both implementations where available, all available graph transformations.
``ProfileMode`` ``compile.profilemode.ProfileMode()`` C implementations where available, all available graph transformations, print profile information.
``ProfileMode`` ``compile.profilemode.ProfileMode()`` Deprecated. C implementations where available, all available graph transformations, print profile information.
================= =============================================================== ===============================================================================
.. Note::
......@@ -180,7 +180,7 @@ c|py_nogc no yes "++" As c|py, but without gc
c no yes "+" Use only C code (if none available for an op, raise an error)
py yes yes "+++" Use only Python code
c&py [#cpy2]_ no yes "+++++" Use C and Python code
ProfileMode no no "++++" Compute some extra profiling info
ProfileMode no no "++++" (Deprecated) Compute some extra profiling info
DebugMode no yes VERY HIGH Make many checks on what Theano computes
============= ========= ================= ========= ===
......@@ -253,6 +253,11 @@ For more detail, see :ref:`DebugMode<debugmode>` in the library.
ProfileMode
===========
.. note::
ProfileMode is deprecated. Use :attr:`config.profile` instead.
Besides checking for errors, another important task is to profile your
code. For this Theano uses a special mode called ProfileMode which has
to be passed as an argument to :func:`theano.function <function.function>`.
......@@ -371,5 +376,5 @@ Finally, notice that the ``ProfileMode`` also shows which ops were running a C
implementation.
For more detail, see :ref:`ProfileMode<libdoc_compile_mode>` in the library.
For more detail, see :ref:`ProfileMode<profilemode>` in the library.
.. _tut_profiling:
=========================
Profiling Theano function
=========================
.. note::
This method replaces the old ProfileMode. Do not use ProfileMode
anymore.
Besides checking for errors, another important task is to profile your
code. For this, you can use Theano flags and/or parameters which are
to be passed as an argument to :func:`theano.function <function.function>`.
The simplest way to profile Theano functions is to use the Theano
flags described below. When the process exits, they will cause the
information to be printed on stdout.
Using the ProfileMode is a three-step process.
Enabling the profiler is pretty easy. Just use the Theano flag
:attr:`config.profile`.
To enable the memory profiler use the Theano flag:
:attr:`config.profile_memory` in addition to :attr:`config.profile`.
To enable the profiling of Theano optimization phase, use the Theano
flag: :attr:`config.profile_optimizer` in addition to
:attr:`config.profile`.
You can use the Theano flags :attr:`profiling.n_apply`,
:attr:`profiling.n_ops` and :attr:`profiling.min_memory_size` to
modify the quantity of information printed.
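The flags above can be combined in a single run. A hedged sketch of a full profiling invocation (``train.py`` is a hypothetical script name; the flag names are the ones documented above):

```shell
# Time profile, memory profile and optimizer profile in one run,
# limiting the printed output to the 10 most expensive Apply nodes and Ops.
THEANO_FLAGS=profile=True,profile_memory=True,profile_optimizer=True,profiling.n_apply=10,profiling.n_ops=10 python train.py
```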
The profiler will output one profile per Theano function and one
profile that is the sum of the printed profiles. Each profile contains 4
sections: global info, class info, Ops info and Apply node info.
In the global section, the "Message" is the name of the Theano
function. theano.function() has an optional parameter ``name`` that
defaults to None. Change it to something else to help you profile many
Theano functions. In that section, we also see the number of times the
function was called (1) and the total time spent in all those
calls. The time spent in Function.fn.__call__ and in thunks is useful
to help understand Theano overhead.
Also, we see the time spent in the two parts of the compilation
process: optimization (modifying the graph to make it more stable/faster)
and linking (compiling C code and building the Python callable returned
by function).
The class, Ops and Apply node sections contain the same kind of
information: information about the Apply nodes that ran. The Ops section
takes the information from the Apply section and merges the Apply nodes
that have exactly the same op. If two Apply nodes in the graph have two
Ops that compare equal, they will be merged. Some Ops, like Elemwise,
will not compare equal if their parameters differ (the scalar being
executed). So the class section will merge more Apply nodes than the
Ops section.
Here is an example output when we disable some Theano optimizations to
give you a better idea of the difference between sections. With all
optimizations enabled, there would be only one op left in the graph.
To run the example:
THEANO_FLAGS=optimizer_excluding=fusion:inplace,profile=True python doc/tutorial/profiling_example.py
The output:
.. literalinclude:: profiling_example_out.txt
import numpy
import theano
x, y, z = theano.tensor.vectors('xyz')
f = theano.function([x, y, z], [(x + y + z) * 2])
xv = numpy.random.rand(10).astype(theano.config.floatX)
yv = numpy.random.rand(10).astype(theano.config.floatX)
zv = numpy.random.rand(10).astype(theano.config.floatX)
f(xv, yv, zv)
Function profiling
==================
Message: None
Time in 1 calls to Function.__call__: 5.698204e-05s
Time in Function.fn.__call__: 1.192093e-05s (20.921%)
Time in thunks: 6.198883e-06s (10.879%)
Total compile time: 3.642474e+00s
Theano Optimizer time: 7.326508e-02s
Theano validate time: 3.712177e-04s
Theano Linker time (includes C, CUDA code generation/compiling): 9.584920e-01s
Class
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
100.0% 100.0% 0.000s 2.07e-06s C 3 3 <class 'theano.tensor.elemwise.Elemwise'>
... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
Ops
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
65.4% 65.4% 0.000s 2.03e-06s C 2 2 Elemwise{add,no_inplace}
34.6% 100.0% 0.000s 2.15e-06s C 1 1 Elemwise{mul,no_inplace}
... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
Apply
------
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
50.0% 50.0% 0.000s 3.10e-06s 1 0 Elemwise{add,no_inplace}(x, y)
34.6% 84.6% 0.000s 2.15e-06s 1 2 Elemwise{mul,no_inplace}(TensorConstant{(1,) of 2.0}, Elemwise{add,no_inplace}.0)
15.4% 100.0% 0.000s 9.54e-07s 1 1 Elemwise{add,no_inplace}(Elemwise{add,no_inplace}.0, z)
... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
......@@ -2113,14 +2113,17 @@ class _Maker(FunctionMaker): # inheritance buys a few helper functions
# optimize the fgraph
compute_test_value_orig = theano.config.compute_test_value
add_stack_trace_on_call = gof.Op.add_stack_trace_on_call
try:
theano.config.compute_test_value = theano.config.compute_test_value_opt
gof.Op.add_stack_trace_on_call = False # Should it be 0 == i?
optimizer(fgraph)
theano.compile.function_module.insert_deepcopy(fgraph, inputs,
outputs + additional_outputs)
finally:
theano.config.compute_test_value = compute_test_value_orig
gof.Op.add_stack_trace_on_call = add_stack_trace_on_call
if i:
li = fgraph.equivalence_tracker.event_list
......
......@@ -1265,9 +1265,9 @@ def orig_function(inputs, outputs, mode=None, accept_inplace=False,
- FAST_COMPILE (minimal optimization)
- PROFILE_MODE: allow to print a profile mode with mode.print_summary
- ProfileMode (deprecated): allows printing a profile summary with mode.print_summary
- DEBUG_MODE: verify many internal conditions that are normally assumed
- DebugMode: verify many internal conditions that are normally assumed
(slow)
:param accept_inplace: True iff the graph can contain inplace operations
......
......@@ -626,17 +626,17 @@ prof_mode_instance_to_print = [predefined_modes["PROFILE_MODE"]]
def atexit_print_default_profile_mode():
"""Print the summary of the predefined mode PROFILE_MODE if used.
"""Print the summary of the predefined mode ProfileMode if used.
This is all to have the summary printed at exit when
config.mode=PROFILE_MODE
config.mode=ProfileMode
"""
for prof_mode in prof_mode_instance_to_print:
if prof_mode.local_time > 0:
prof_mode.print_summary()
#Register atexit_print_default_profile_mode to have the summary of the
#predefined mode PROFILE_MODE if it is used printed when the program terminate.
#predefined mode ProfileMode, if it is used, printed when the program terminates.
atexit.register(atexit_print_default_profile_mode)
......
......@@ -17,7 +17,7 @@ class T_bunch_of_modes(unittest.TestCase):
linker_classes_involved = []
predef_modes = ['FAST_COMPILE', 'FAST_RUN', 'DEBUG_MODE']
# Use a new instance of ProfileMode instead of 'PROFILE_MODE' to
# Use a new instance of ProfileMode instead of 'ProfileMode' to
# avoid printing a profile mode summary in nose output
predef_modes.append(ProfileMode())
......@@ -42,7 +42,7 @@ class T_bunch_of_modes(unittest.TestCase):
# there should be
# - VM_Linker
# - OpWiseCLinker (FAST_RUN)
# - WrapLinker (PROFILE_MODE)
# - WrapLinker ("ProfileMode")
# - PerformLinker (FAST_COMPILE)
# - DebugMode's Linker (DEBUG_MODE)
assert 5 == len(set(linker_classes_involved))
......
......@@ -134,7 +134,7 @@ if rc == 0:
# Keep the default linker the same as the one for the mode FAST_RUN
AddConfigVar('linker',
("Default linker used if the theano flags mode is Mode "
"or ProfileMode"),
"or ProfileMode(deprecated)"),
EnumStr('cvm', 'c|py', 'py', 'c', 'c|py_nogc', 'c&py',
'vm', 'vm_nogc', 'cvm_nogc'),
in_c_key=False)
......@@ -142,7 +142,7 @@ else:
# g++ is not present, linker should default to python only
AddConfigVar('linker',
("Default linker used if the theano flags mode is Mode "
"or ProfileMode"),
"or ProfileMode(deprecated)"),
EnumStr('py', 'vm', 'vm_nogc'),
in_c_key=False)
_logger.warning('g++ not detected ! Theano will be unable to execute '
......@@ -174,7 +174,7 @@ AddConfigVar('allow_gc',
#Keep the default optimizer the same as the one for the mode FAST_RUN
AddConfigVar('optimizer',
("Default optimizer. If not None, will use this linker with the Mode "
"object (not ProfileMode or DebugMode)"),
"object (not ProfileMode(deprecated) or DebugMode)"),
EnumStr('fast_run', 'merge', 'fast_compile', 'None'),
in_c_key=False)
......
......@@ -1456,8 +1456,17 @@ def std_lib_dirs_and_libs():
# modules, even when libpython27.lib and python27.dll are
# available, and the *.a files have to be found earlier than
# the other ones.
libdir = os.path.join(sys.base_prefix, '..', '..', '..',
'User', 'libs')
#When Canopy is installed for the user:
#sys.prefix:C:\Users\username\AppData\Local\Enthought\Canopy\User
#sys.base_prefix:C:\Users\username\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64
#When Canopy is installed for all users:
#sys.base_prefix: C:\Program Files\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64
#sys.prefix: C:\Users\username\AppData\Local\Enthought\Canopy\User
#So we need to use sys.prefix, as it supports both cases;
#sys.base_prefix supports only one case.
libdir = os.path.join(sys.prefix, 'libs')
for f, lib in [('libpython27.a', 'libpython 1.2'),
('libmsvcr90.a', 'mingw 4.5.2')]:
if not os.path.exists(os.path.join(libdir, f)):
......
......@@ -333,7 +333,7 @@ class PrintListener(Feature):
class PreserveNames(Feature):
def on_change_input(self, fgraph, mode, i, r, new_r, reason=None):
def on_change_input(self, fgraph, node, i, r, new_r, reason=None):
if r.name is not None and new_r.name is None:
new_r.name = r.name
......
......@@ -1027,8 +1027,8 @@ class ShapeFeature(object):
# but this works with `local_useless_subtensor`, so for now we
# keep it this way. See #266 for a better long-term fix.
if getattr(d, 'dtype', 'int64') != 'int64':
assert d.dtype in theano.tensor.discrete_dtypes, d.dtype
assert str(d.dtype) != 'uint64'
assert d.dtype in theano.tensor.discrete_dtypes, (node, d.dtype)
assert str(d.dtype) != 'uint64', node
new_shape += sh[len(new_shape):i + 1]
new_shape[i] = theano.tensor.cast(d, 'int64')
if new_shape:
......