Commit bd1bdc44 authored by Olivier Delalleau

Typo fixes related to new memory profiling

Parent 91a31814
@@ -108,36 +108,36 @@ default values.
    .. method:: get_shape_info(obj)

-      Optional. Only needed to profile the memory of this Type of object
+      Optional. Only needed to profile the memory of this Type of object.
-      Return the information needed to compute the memory size of obj.
+      Return the information needed to compute the memory size of ``obj``.
-      The memory size is only the data, so this exclude the container.
+      The memory size is only the data, so this excludes the container.
       For an ndarray, this is the data, but not the ndarray object and
-      others data structures as shape and strides.
+      other data structures such as shape and strides.

-      get_shape_info() and get_size() work in tendem for the memory profiler.
+      ``get_shape_info()`` and ``get_size()`` work in tandem for the memory profiler.
-      get_shape_info() is called during the execution of the function.
+      ``get_shape_info()`` is called during the execution of the function.
       So it is better that it is not too slow.
-      get_size() will be called with the output of this function
+      ``get_size()`` will be called on the output of this function
       when printing the memory profile.

-      :param obj: The object that this Type represent during execution
+      :param obj: The object that this Type represents during execution
-      :return: Python object that self.get_size() understand
+      :return: Python object that ``self.get_size()`` understands

    .. method:: get_size(shape_info)

-      Number of bytes taken by the object represented by shape_info
+      Number of bytes taken by the object represented by shape_info.
-      Optional. Only needed to profile the memory of this Type of object
+      Optional. Only needed to profile the memory of this Type of object.

       :param shape_info: the output of the call to get_shape_info()
-      :return: the number of bytes taken by the object described in
-               shape_info.
+      :return: the number of bytes taken by the object described by
+               ``shape_info``.
    """

    For each method, the *default* is what ``Type`` defines
    for you. So, if you create an instance of ``Type`` or an
    instance of a subclass of ``Type``, you
...
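The two-method contract documented in this hunk can be sketched as a small stand-alone class; the class name, the fixed ``itemsize``, and the sample data are hypothetical, purely to illustrate the split between the cheap execution-time call and the later byte-count call:

```python
class VectorType(object):
    """Hypothetical Type-like class for fixed-itemsize 1-D data,
    illustrating the get_shape_info()/get_size() contract."""

    itemsize = 8  # assume 8-byte (float64) elements

    def get_shape_info(self, obj):
        # Called during execution of the function: keep it cheap.
        # Record only the element count; get_size() turns it into
        # bytes when the memory profile is printed.
        return len(obj)

    def get_size(self, shape_info):
        # Called on the output of get_shape_info() when printing
        # the memory profile.
        return shape_info * self.itemsize

t = VectorType()
info = t.get_shape_info([1.0, 2.0, 3.0])
print(t.get_size(info))  # 24: 3 elements * 8 bytes
```

Only the data size is counted, matching the docstring: the container object itself is excluded.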
@@ -271,7 +271,7 @@ import theano and print the config variable, as in:

    Default False

-   Do the vm/cvm linkers profile the execution of Theano functions?
+   Do the vm/cvm linkers profile the execution time of Theano functions?

.. attribute:: profile_memory

@@ -279,8 +279,8 @@ import theano and print the config variable, as in:

    Default False

-   Do the vm/cvm linkers profile the memory of Theano functions get printed?
+   Do the vm/cvm linkers profile the memory usage of Theano functions?
-   It only work when profile=True.
+   It only works when profile=True.

.. attribute:: profile_optimizer

@@ -289,26 +289,26 @@ import theano and print the config variable, as in:

    Default False

    Do the vm/cvm linkers profile the optimization phase when compiling a Theano function?
-   It only work when profile=True.
+   It only works when profile=True.

.. attribute:: profiling.n_apply

    Positive int value, default: 20.

-   The number of apply node to print in the profiler output
+   The number of Apply nodes to print in the profiler output

.. attribute:: profiling.n_ops

    Positive int value, default: 20.

-   The number of ops to print in the profiler output
+   The number of Ops to print in the profiler output

.. attribute:: profiling.min_memory_size

    Positive int value, default: 1024.

-   For the memory profile, do not print apply nodes if the size
-   of their outputs (in bytes) is lower then this.
+   For the memory profile, do not print Apply nodes if the size
+   of their outputs (in bytes) is lower than this.

.. attribute:: config.lib.amdlibm
...
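As a usage sketch, the profiling options documented above are typically set through the ``THEANO_FLAGS`` environment variable; the flag names come from the docs in this hunk, while the script name is hypothetical:

```shell
# Profile run time and memory usage, and report the 30 costliest Apply nodes.
THEANO_FLAGS='profile=True,profile_memory=True,profiling.n_apply=30' \
    python my_script.py
```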
@@ -49,12 +49,12 @@ class Profile_Maker(FunctionMaker):
                theano.sandbox.cuda.cuda_enabled):
            if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
                raise Exception(
-                   "You are running Theano profiler with CUDA enabled."
+                   "You are running the Theano profiler with CUDA enabled."
-                   " Theano GPU ops execution are asynchron by default."
+                   " Theano GPU ops execution is asynchronous by default."
                    " So by default, the profile is useless."
-                   " You must use set the environment variable"
+                   " You must set the environment variable"
-                   " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
+                   " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
-                   " synchonize the execution to get meaning full profile.")
+                   " synchronize the execution to get a meaningful profile.")
        # create a function-specific storage container for profiling info
        profile = ProfileStats(atexit_print=False)
...
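As the error message above explains, per-op timings are only meaningful when GPU kernel launches are synchronous; a typical profiled invocation (script name hypothetical) would be:

```shell
# Make CUDA kernel launches synchronous so per-op timings are meaningful,
# then run with the Theano profiler enabled.
CUDA_LAUNCH_BLOCKING=1 THEANO_FLAGS='profile=True' python my_script.py
```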
@@ -37,18 +37,18 @@ AddConfigVar('profiling.time_thunks',
             BoolParam(True))

AddConfigVar('profiling.n_apply',
-            "Number of apply instances to print by default",
+            "Number of Apply instances to print by default",
             IntParam(20, lambda i: i > 0),
             in_c_key=False)

AddConfigVar('profiling.n_ops',
-            "Number of ops to print by default",
+            "Number of Ops to print by default",
             IntParam(20, lambda i: i > 0),
             in_c_key=False)

AddConfigVar('profiling.min_memory_size',
-            """For the memory profile, do not print apply nodes if the size
-            of their outputs (in bytes) is lower then this threshold""",
+            """For the memory profile, do not print Apply nodes if the size
+            of their outputs (in bytes) is lower than this threshold""",
             IntParam(1024, lambda i: i >= 0),
             in_c_key=False)
@@ -185,12 +185,12 @@ class ProfileStats(object):
                theano.sandbox.cuda.cuda_enabled):
            if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
                raise Exception(
-                   "You are running Theano profiler with CUDA enabled."
+                   "You are running the Theano profiler with CUDA enabled."
-                   " Theano GPU ops execution are asynchron by default."
+                   " Theano GPU ops execution is asynchronous by default."
                    " So by default, the profile is useless."
-                   " You must use set the environment variable"
+                   " You must set the environment variable"
-                   " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
+                   " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
-                   " synchonize the execution to get meaning full profile.")
+                   " synchronize the execution to get a meaningful profile.")
        self.apply_callcount = {}
        self.output_size = {}
@@ -708,7 +708,7 @@ class ProfileStats(object):
        if len(fct_memory) > 1:
            print >> file, ("Memory Profile "
-                           "(the max between all function in that profile)")
+                           "(the max between all functions in that profile)")
        else:
            print >> file, "Memory Profile"
@@ -717,15 +717,15 @@ class ProfileStats(object):
        print >> file, "---"
        # print >> file, " Max if no gc, inplace and view: %dKB" % int(
        #     round(max_sum_size / 1024))
-       print >> file, " Max if linker=cvm (default): unknow"
+       print >> file, " Max if linker=cvm (default): unknown"
        print >> file, " Max if no gc (allow_gc=False): %dKB" % int(round(
            max_node_memory_size / 1024.))
        print >> file, " Max if linker=c|py: %dKB" % int(round(
            max_running_max_memory_size / 1024.))
-       # print >> file, " Memory saved if view are used: %dKB" % int(round(
-       #     max_node_memory_saved_by_view / 1024.))
+       # print >> file, " Memory saved if views are used: %dKB" % int(
+       #     round(max_node_memory_saved_by_view / 1024.))
-       # print >> file, " Memory saved if inplace op are used: %dKB" % int(
-       #     round(max_node_memory_saved_by_inplace / 1024.))
+       # print >> file, " Memory saved if inplace ops are used: %dKB" % \
+       #     int(round(max_node_memory_saved_by_inplace / 1024.))
        print >> file, " Memory saved if gc is enabled (linker=c|py): %dKB" % int(
            round(max_node_memory_size - max_running_max_memory_size) / 1024.)
        if (hasattr(theano, 'sandbox') and
@@ -734,7 +734,7 @@ class ProfileStats(object):
            hasattr(theano.sandbox.cuda.cuda_ndarray.cuda_ndarray,
                    'theano_allocated')):
            _, gpu_max = theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.theano_allocated()
-           print >> file, (" Max Memory allocated on the GPU"
+           print >> file, (" Max Memory allocated on the GPU "
                            "(for all functions): %dKB" %
                            int(round(gpu_max / 1024.)))
@@ -785,11 +785,11 @@ class ProfileStats(object):
            )
        print >> file, ''
        if N == 0:
-           print >> file, (' All Apply node have outputs size that take'
+           print >> file, (' All Apply nodes have output sizes that take'
-                           ' less then %dB.' %
+                           ' less than %dB.' %
                            config.profiling.min_memory_size)
        print >> file, (
-           " <created/inplace/view> is taked from the op declaration.")
+           " <created/inplace/view> is taken from the Op's declaration.")
        print >> file, (" Apply nodes marked 'inplace' or 'view' may"
                        " actually allocate memory, this is not reported"
                        " here. If you use DebugMode, warnings will be"
...
@@ -582,12 +582,12 @@ class VM_Linker(link.LocalLinker):
                theano.sandbox.cuda.cuda_enabled):
            if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
                raise Exception(
-                   "You are running Theano profiler with CUDA enabled."
+                   "You are running the Theano profiler with CUDA enabled."
-                   " Theano GPU ops execution are asynchron by default."
+                   " Theano GPU ops execution is asynchronous by default."
                    " So by default, the profile is useless."
-                   " You must use set the environment variable"
+                   " You must set the environment variable"
-                   " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
+                   " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
-                   " synchonize the execution to get meaning full profile.")
+                   " synchronize the execution to get a meaningful profile.")
        if no_recycling is None:
            no_recycling = []
...
@@ -1199,32 +1199,33 @@ class TensorType(Type):
        return numpy.zeros(shape, dtype=self.dtype)

    def get_shape_info(self, obj):
-       """Return the information needed to compute the memory size of obj.
+       """
+       Return the information needed to compute the memory size of ``obj``.

-       The memory size is only the data, so this exclude the container.
+       The memory size is only the data, so this excludes the container.
        For an ndarray, this is the data, but not the ndarray object and
-       others data structures as shape and strides.
+       other data structures such as shape and strides.

-       get_shape_info() and get_size() work in tendem for the memory profiler.
+       ``get_shape_info()`` and ``get_size()`` work in tandem for the memory
+       profiler.

-       get_shape_info() is called during the execution of the function.
+       ``get_shape_info()`` is called during the execution of the function.
        So it is better that it is not too slow.

-       get_size() will be called with the output of this function
+       ``get_size()`` will be called on the output of this function
        when printing the memory profile.

-       :param obj: The object that this Type represent during execution
+       :param obj: The object that this Type represents during execution
-       :return: Python object that self.get_size() understand
+       :return: Python object that ``self.get_size()`` understands
        """
        return obj.shape

    def get_size(self, shape_info):
-       """ Number of bytes taken by the object represented by shape_info
+       """ Number of bytes taken by the object represented by shape_info.

        :param shape_info: the output of the call to get_shape_info()
-       :return: the number of bytes taken by the object described in
-                shape_info.
+       :return: the number of bytes taken by the object described by
+                ``shape_info``.
        """
        if shape_info:
            return numpy.prod(shape_info) * numpy.dtype(self.dtype).itemsize
...
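The ``get_shape_info()``/``get_size()`` pair above reduces to "record the shape, then multiply element count by element size"; the same arithmetic can be checked stand-alone with NumPy (this mirrors, but does not call, the TensorType code):

```python
import numpy

x = numpy.zeros((100, 200), dtype='float32')

shape_info = x.shape  # what get_shape_info(x) would record (cheap to fetch)
size = numpy.prod(shape_info) * numpy.dtype('float32').itemsize

print(int(size))   # 80000: 100 * 200 elements * 4 bytes each
print(x.nbytes)    # 80000: NumPy's own count of the data bytes
```

Note that, as documented, only the data buffer is counted: the ndarray object, its shape tuple, and its strides are excluded.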