Commit bd1bdc44 authored by Olivier Delalleau

Typo fixes related to new memory profiling

Parent 91a31814
......@@ -108,36 +108,36 @@ default values.
.. method:: get_shape_info(obj)
Optional. Only needed to profile the memory of this Type of object
Optional. Only needed to profile the memory of this Type of object.
Return the information needed to compute the memory size of obj.
Return the information needed to compute the memory size of ``obj``.
The memory size is only the data, so this exclude the container.
The memory size is only the data, so this excludes the container.
For an ndarray, this is the data, but not the ndarray object and
others data structures as shape and strides.
other data structures such as shape and strides.
get_shape_info() and get_size() work in tendem for the memory profiler.
``get_shape_info()`` and ``get_size()`` work in tandem for the memory profiler.
get_shape_info() is called during the execution of the function.
``get_shape_info()`` is called during the execution of the function.
So it should not be too slow.
get_size() will be called with the output of this function
``get_size()`` will be called on the output of this function
when printing the memory profile.
:param obj: The object that this Type represent during execution
:return: Python object that self.get_size() understand
:param obj: The object that this Type represents during execution
:return: Python object that ``self.get_size()`` understands
.. method:: get_size(shape_info)
Number of bytes taken by the object represented by shape_info
Number of bytes taken by the object represented by shape_info.
Optional. Only needed to profile the memory of this Type of object
Optional. Only needed to profile the memory of this Type of object.
:param shape_info: the output of the call to get_shape_info()
:return: the number of bytes taken by the object described by
``shape_info``.
:param shape_info: the output of the call to get_shape_info()
:return: the number of bytes taken by the object described in
shape_info.
"""
For each method, the *default* is what ``Type`` defines
for you. So, if you create an instance of ``Type`` or an
instance of a subclass of ``Type``, you
......
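A hypothetical minimal Type implementing the two-method protocol documented above. The class names and the itemsize table are assumptions for illustration only; just the `get_shape_info()`/`get_size()` pair follows the interface described in the docs:

```python
import math

class DummyVectorType:
    # Hypothetical itemsize table; a real Type would use its dtype machinery.
    ITEMSIZE = {'float32': 4, 'float64': 8, 'int64': 8}

    def __init__(self, dtype):
        self.dtype = dtype

    def get_shape_info(self, obj):
        # Runs during function execution, so stay cheap: record only the shape.
        return obj.shape

    def get_size(self, shape_info):
        # Runs when the memory profile is printed: bytes of the data only,
        # excluding the container object itself.
        return math.prod(shape_info) * self.ITEMSIZE[self.dtype]

class FakeArray:
    # Stand-in for an ndarray-like object carrying a .shape attribute.
    def __init__(self, shape):
        self.shape = shape

t = DummyVectorType('float64')
info = t.get_shape_info(FakeArray((3, 5)))
print(t.get_size(info))  # 3 * 5 * 8 = 120 bytes
```

The split matters for overhead: the cheap call happens on every execution, while the arithmetic is deferred until the profile is printed.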
......@@ -271,7 +271,7 @@ import theano and print the config variable, as in:
Default False
Do the vm/cvm linkers profile the execution of Theano functions?
Do the vm/cvm linkers profile the execution time of Theano functions?
.. attribute:: profile_memory
......@@ -279,8 +279,8 @@ import theano and print the config variable, as in:
Default False
Do the vm/cvm linkers profile the memory of Theano functions get printed?
It only work when profile=True.
Do the vm/cvm linkers profile the memory usage of Theano functions?
It only works when profile=True.
.. attribute:: profile_optimizer
......@@ -289,26 +289,26 @@ import theano and print the config variable, as in:
Default False
Do the vm/cvm linkers profile the optimization phase when compiling a Theano function?
It only work when profile=True.
It only works when profile=True.
.. attribute:: profiling.n_apply
Positive int value, default: 20.
The number of apply node to print in the profiler output
The number of Apply nodes to print in the profiler output
.. attribute:: profiling.n_ops
Positive int value, default: 20.
The number of ops to print in the profiler output
The number of Ops to print in the profiler output
.. attribute:: profiling.min_memory_size
Positive int value, default: 1024.
For the memory profile, do not print apply nodes if the size
of their outputs (in bytes) is lower then this.
For the memory profile, do not print Apply nodes if the size
of their outputs (in bytes) is lower than this.
.. attribute:: config.lib.amdlibm
......
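The profiling attributes documented above can be enabled through the `THEANO_FLAGS` environment variable before Theano is imported; a hedged sketch (flag values are arbitrary examples):

```python
# Sketch: enable the profiling options documented above via THEANO_FLAGS.
# This must be set in the environment before `import theano`.
import os

os.environ['THEANO_FLAGS'] = ','.join([
    'profile=True',                    # time profiling (vm/cvm linkers)
    'profile_memory=True',             # only works when profile=True
    'profiling.n_apply=30',            # print the 30 costliest Apply nodes
    'profiling.min_memory_size=2048',  # skip nodes with outputs under 2KB
])
```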
......@@ -49,12 +49,12 @@ class Profile_Maker(FunctionMaker):
theano.sandbox.cuda.cuda_enabled):
if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
raise Exception(
"You are running Theano profiler with CUDA enabled."
" Theano GPU ops execution are asynchron by default."
"You are running the Theano profiler with CUDA enabled."
" Theano GPU ops execution is asynchronous by default."
" So by default, the profile is useless."
" You must use set the environment variable"
" CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
" synchonize the execution to get meaning full profile.")
" You must set the environment variable"
" CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
" synchronize the execution to get a meaningful profile.")
# create a function-specific storage container for profiling info
profile = ProfileStats(atexit_print=False)
......
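Following the corrected error message above: CUDA kernel launches must be made synchronous for per-op timings to be meaningful. The variable has to be set before the CUDA driver is initialized, i.e. before importing theano:

```python
# Make GPU op execution synchronous so the profiler's timings are
# attributed to the right ops; set this before `import theano`.
import os

os.environ['CUDA_LAUNCH_BLOCKING'] = '1'
```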
......@@ -37,18 +37,18 @@ AddConfigVar('profiling.time_thunks',
BoolParam(True))
AddConfigVar('profiling.n_apply',
"Number of apply instances to print by default",
"Number of Apply instances to print by default",
IntParam(20, lambda i: i > 0),
in_c_key=False)
AddConfigVar('profiling.n_ops',
"Number of ops to print by default",
"Number of Ops to print by default",
IntParam(20, lambda i: i > 0),
in_c_key=False)
AddConfigVar('profiling.min_memory_size',
"""For the memory profile, do not print apply nodes if the size
of their outputs (in bytes) is lower then this threshold""",
"""For the memory profile, do not print Apply nodes if the size
of their outputs (in bytes) is lower than this threshold""",
IntParam(1024, lambda i: i >= 0),
in_c_key=False)
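The second argument to `IntParam` in the hunk above is a validation predicate that accepted values must satisfy. A minimal stand-in illustrating the pattern (the helper name is hypothetical, not Theano's API):

```python
# Sketch of the IntParam validation pattern: a default value plus a
# predicate; the default itself must pass the predicate.
def make_checker(default, is_valid):
    def check(value):
        if not is_valid(value):
            raise ValueError('invalid value: %r' % (value,))
        return value
    check(default)
    return check

check_n_apply = make_checker(20, lambda i: i > 0)     # strictly positive
check_min_mem = make_checker(1024, lambda i: i >= 0)  # non-negative
```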
......@@ -185,12 +185,12 @@ class ProfileStats(object):
theano.sandbox.cuda.cuda_enabled):
if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
raise Exception(
"You are running Theano profiler with CUDA enabled."
" Theano GPU ops execution are asynchron by default."
"You are running the Theano profiler with CUDA enabled."
" Theano GPU ops execution is asynchronous by default."
" So by default, the profile is useless."
" You must use set the environment variable"
" CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
" synchonize the execution to get meaning full profile.")
" You must set the environment variable"
" CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
" synchronize the execution to get a meaningful profile.")
self.apply_callcount = {}
self.output_size = {}
......@@ -708,7 +708,7 @@ class ProfileStats(object):
if len(fct_memory) > 1:
print >> file, ("Memory Profile "
"(the max between all function in that profile)")
"(the max between all functions in that profile)")
else:
print >> file, "Memory Profile"
......@@ -717,15 +717,15 @@ class ProfileStats(object):
print >> file, "---"
# print >> file, " Max if no gc, inplace and view: %dKB" % int(
# round(max_sum_size / 1024))
print >> file, " Max if linker=cvm (default): unknow"
print >> file, " Max if linker=cvm (default): unknown"
print >> file, " Max if no gc (allow_gc=False): %dKB" % int(round(
max_node_memory_size / 1024.))
print >> file, " Max if linker=c|py: %dKB" % int(round(
max_running_max_memory_size / 1024.))
# print >> file, " Memory saved if view are used: %dKB" % int(round(
# max_node_memory_saved_by_view / 1024.))
# print >> file, " Memory saved if inplace op are used: %dKB" % int(
# round(max_node_memory_saved_by_inplace / 1024.))
# print >> file, " Memory saved if views are used: %dKB" % int(
# round(max_node_memory_saved_by_view / 1024.))
# print >> file, " Memory saved if inplace ops are used: %dKB" % \
# int(round(max_node_memory_saved_by_inplace / 1024.))
print >> file, " Memory saved if gc is enabled (linker=c|py): %dKB" % int(
round(max_node_memory_size - max_running_max_memory_size) / 1024.)
if (hasattr(theano, 'sandbox') and
......@@ -734,7 +734,7 @@ class ProfileStats(object):
hasattr(theano.sandbox.cuda.cuda_ndarray.cuda_ndarray,
'theano_allocated')):
_, gpu_max = theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.theano_allocated()
print >> file, (" Max Memory allocated on the GPU"
print >> file, (" Max Memory allocated on the GPU "
"(for all functions): %dKB" %
int(round(gpu_max / 1024.)))
......@@ -785,11 +785,11 @@ class ProfileStats(object):
)
print >> file, ''
if N == 0:
print >> file, (' All Apply node have outputs size that take'
' less then %dB.' %
print >> file, (' All Apply nodes have output sizes that take'
' less than %dB.' %
config.profiling.min_memory_size)
print >> file, (
" <created/inplace/view> is taked from the op declaration.")
" <created/inplace/view> is taken from the Op's declaration.")
print >> file, (" Apply nodes marked 'inplace' or 'view' may"
" actually allocate memory; this is not reported"
" here. If you use DebugMode, warnings will be"
......
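The `min_memory_size` filtering behind the "All Apply nodes have output sizes that take less than %dB" message can be sketched as follows (the node names and sizes are illustrative):

```python
# Nodes whose total output size is below the threshold are omitted from
# the memory-profile printout; if none survive, the message above is shown.
min_memory_size = 1024  # bytes, the profiling.min_memory_size default

node_output_sizes = {'node_a': 4096, 'node_b': 512, 'node_c': 2048}
printed = {name: size for name, size in node_output_sizes.items()
           if size >= min_memory_size}
```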
......@@ -582,12 +582,12 @@ class VM_Linker(link.LocalLinker):
theano.sandbox.cuda.cuda_enabled):
if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
raise Exception(
"You are running Theano profiler with CUDA enabled."
" Theano GPU ops execution are asynchron by default."
"You are running the Theano profiler with CUDA enabled."
" Theano GPU ops execution is asynchronous by default."
" So by default, the profile is useless."
" You must use set the environment variable"
" CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
" synchonize the execution to get meaning full profile.")
" You must set the environment variable"
" CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
" synchronize the execution to get a meaningful profile.")
if no_recycling is None:
no_recycling = []
......
......@@ -1199,32 +1199,33 @@ class TensorType(Type):
return numpy.zeros(shape, dtype=self.dtype)
def get_shape_info(self, obj):
"""Return the information needed to compute the memory size of obj.
"""
Return the information needed to compute the memory size of ``obj``.
The memory size is only the data, so this exclude the container.
The memory size is only the data, so this excludes the container.
For an ndarray, this is the data, but not the ndarray object and
others data structures as shape and strides.
other data structures such as shape and strides.
get_shape_info() and get_size() work in tendem for the memory profiler.
``get_shape_info()`` and ``get_size()`` work in tandem for the memory
profiler.
get_shape_info() is called during the execution of the function.
``get_shape_info()`` is called during the execution of the function.
So it should not be too slow.
get_size() will be called with the output of this function
``get_size()`` will be called on the output of this function
when printing the memory profile.
:param obj: The object that this Type represent during execution
:return: Python object that self.get_size() understand
:param obj: The object that this Type represents during execution
:return: Python object that ``self.get_size()`` understands
"""
return obj.shape
def get_size(self, shape_info):
""" Number of bytes taken by the object represented by shape_info
""" Number of bytes taken by the object represented by shape_info.
:param shape_info: the output of the call to get_shape_info()
:return: the number of bytes taken by the object described in
shape_info.
:return: the number of bytes taken by the object described by
``shape_info``.
"""
if shape_info:
return numpy.prod(shape_info) * numpy.dtype(self.dtype).itemsize
......
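A dtype-free stand-in for the `get_size()` logic added above. Note one edge case visible in the diff: a 0-d tensor's `shape_info` is the empty tuple, which is falsy, so the committed code falls through the `if` and implicitly returns None:

```python
import math

def get_size(shape_info, itemsize=4):
    # Mirrors the `if shape_info:` branch above; itemsize=4 stands in
    # for numpy.dtype('float32').itemsize.
    if shape_info:
        return math.prod(shape_info) * itemsize
    # shape_info == () for a 0-d tensor: falls through, returns None.

print(get_size((3, 4)))  # 48
print(get_size(()))      # None
```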