testgroup / pytensor · Commits · bd1bdc44

Commit bd1bdc44, authored May 05, 2013 by Olivier Delalleau

    Typo fixes related to new memory profiling

Parent: 91a31814

Showing 6 changed files with 65 additions and 64 deletions (+65 -64)
doc/extending/type.txt           +15  -15
doc/library/config.txt            +8   -8
theano/compile/profilemode.py     +5   -5
theano/compile/profiling.py      +19  -19
theano/gof/vm.py                  +5   -5
theano/tensor/basic.py           +13  -12
doc/extending/type.txt

@@ -108,36 +108,36 @@ default values.

 .. method:: get_shape_info(obj)

     Optional. Only needed to profile the memory of this Type of object.

-    Return the information needed to compute the memory size of obj.
+    Return the information needed to compute the memory size of ``obj``.

-    The memory size is only the data, so this exclude the container.
+    The memory size is only the data, so this excludes the container.
     For an ndarray, this is the data, but not the ndarray object and
-    others data structures as shape and strides.
+    other data structures such as shape and strides.

-    get_shape_info() and get_size() work in tendem for the memory profiler.
+    ``get_shape_info()`` and ``get_size()`` work in tandem for the memory profiler.

-    get_shape_info() is called during the execution of the function.
+    ``get_shape_info()`` is called during the execution of the function.
     So it is better that it is not too slow.

-    get_size() will be called with the output of this function
+    ``get_size()`` will be called on the output of this function
     when printing the memory profile.

-    :param obj: The object that this Type represent during execution
+    :param obj: The object that this Type represents during execution

-    :return: Python object that self.get_size() understand
+    :return: Python object that ``self.get_size()`` understands

 .. method:: get_size(shape_info)

-    Number of bytes taken by the object represented by shape_info
+    Number of bytes taken by the object represented by shape_info.

-    Optional. Only needed to profile the memory of this Type of object
+    Optional. Only needed to profile the memory of this Type of object.

     :param shape_info: the output of the call to get_shape_info()

-    :return: the number of bytes taken by the object described in
-    shape_info.
+    :return: the number of bytes taken by the object described by
+    ``shape_info``.

     """

 For each method, the *default* is what ``Type`` defines
 for you. So, if you create an instance of ``Type`` or an
 instance of a subclass of ``Type``, you
...
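The `get_shape_info()` / `get_size()` contract documented above can be sketched for a hypothetical Type-like class (a simplified stand-in for illustration, not Theano's actual base class):

```python
import numpy

class DenseMatrixType(object):
    """Hypothetical Type-like class illustrating the memory-profiling hooks."""

    def __init__(self, dtype):
        self.dtype = dtype

    def get_shape_info(self, obj):
        # Called during function execution, so keep it cheap:
        # record only the shape, not the data itself.
        return obj.shape

    def get_size(self, shape_info):
        # Called later, when the memory profile is printed.
        # Counts data bytes only; the container (shape, strides, ...)
        # is excluded.
        return numpy.prod(shape_info) * numpy.dtype(self.dtype).itemsize

t = DenseMatrixType('float64')
x = numpy.ones((3, 4))
info = t.get_shape_info(x)   # (3, 4)
print(t.get_size(info))      # 96 bytes = 3 * 4 * 8
```

The split into two methods follows the rationale in the docstring: the cheap call happens on the hot path, the expensive arithmetic is deferred to report time.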
doc/library/config.txt

@@ -271,7 +271,7 @@ import theano and print the config variable, as in:

     Default False

-    Do the vm/cvm linkers profile the execution of Theano functions?
+    Do the vm/cvm linkers profile the execution time of Theano functions?

 .. attribute:: profile_memory

@@ -279,8 +279,8 @@ import theano and print the config variable, as in:

     Default False

-    Do the vm/cvm linkers profile the memory of Theano functions get printed?
-    It only work when profile=True.
+    Do the vm/cvm linkers profile the memory usage of Theano functions?
+    It only works when profile=True.

 .. attribute:: profile_optimizer

@@ -289,26 +289,26 @@ import theano and print the config variable, as in:

     Default False

     Do the vm/cvm linkers profile the optimization phase when compiling a Theano function?
-    It only work when profile=True.
+    It only works when profile=True.

 .. attribute:: profiling.n_apply

     Positive int value, default: 20.

-    The number of apply node to print in the profiler output
+    The number of Apply nodes to print in the profiler output

 .. attribute:: profiling.n_ops

     Positive int value, default: 20.

-    The number of ops to print in the profiler output
+    The number of Ops to print in the profiler output

 .. attribute:: profiling.min_memory_size

     Positive int value, default: 1024.

-    For the memory profile, do not print apply nodes if the size
-    of their outputs (in bytes) is lower then this.
+    For the memory profile, do not print Apply nodes if the size
+    of their outputs (in bytes) is lower than this.

 .. attribute:: config.lib.amdlibm
...
theano/compile/profilemode.py

@@ -49,12 +49,12 @@ class Profile_Maker(FunctionMaker):

                 theano.sandbox.cuda.cuda_enabled):
             if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
                 raise Exception(
-                    "You are running Theano profiler with CUDA enabled."
-                    " Theano GPU ops execution are asynchron by default."
+                    "You are running the Theano profiler with CUDA enabled."
+                    " Theano GPU ops execution is asynchronous by default."
                     " So by default, the profile is useless."
-                    " You must use set the environment variable"
-                    " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
-                    " synchonize the execution to get meaning full profile.")
+                    " You must set the environment variable"
+                    " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
+                    " synchronize the execution to get a meaningful profile.")

         # create a function-specific storage container for profiling info
         profile = ProfileStats(atexit_print=False)
...
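The same guard appears in all three linker/profiler files touched by this commit. A standalone sketch of its logic (the function name here is hypothetical, for illustration only):

```python
import os

def check_cuda_profiling_allowed(cuda_enabled):
    """Refuse to profile when GPU kernels would launch asynchronously,
    since wall-clock timings taken around async launches are meaningless."""
    if cuda_enabled and os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
        raise Exception(
            "You are running the Theano profiler with CUDA enabled."
            " Theano GPU ops execution is asynchronous by default."
            " So by default, the profile is useless."
            " You must set the environment variable"
            " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
            " synchronize the execution to get a meaningful profile.")

# With CUDA_LAUNCH_BLOCKING=1 the driver synchronizes and profiling is allowed.
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'
check_cuda_profiling_allowed(cuda_enabled=True)  # no exception
```

CUDA_LAUNCH_BLOCKING is a real CUDA driver environment variable: setting it to 1 makes every kernel launch block until completion, so per-op timings become attributable.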
theano/compile/profiling.py

@@ -37,18 +37,18 @@ AddConfigVar('profiling.time_thunks',

              BoolParam(True))

 AddConfigVar('profiling.n_apply',
-             "Number of apply instances to print by default",
+             "Number of Apply instances to print by default",
              IntParam(20, lambda i: i > 0),
              in_c_key=False)

 AddConfigVar('profiling.n_ops',
-             "Number of ops to print by default",
+             "Number of Ops to print by default",
              IntParam(20, lambda i: i > 0),
              in_c_key=False)

 AddConfigVar('profiling.min_memory_size',
-             """For the memory profile, do not print apply nodes if the size
-             of their outputs (in bytes) is lower then this threshold""",
+             """For the memory profile, do not print Apply nodes if the size
+             of their outputs (in bytes) is lower than this threshold""",
              IntParam(1024, lambda i: i >= 0),
              in_c_key=False)

@@ -185,12 +185,12 @@ class ProfileStats(object):

                 theano.sandbox.cuda.cuda_enabled):
             if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
                 raise Exception(
-                    "You are running Theano profiler with CUDA enabled."
-                    " Theano GPU ops execution are asynchron by default."
+                    "You are running the Theano profiler with CUDA enabled."
+                    " Theano GPU ops execution is asynchronous by default."
                     " So by default, the profile is useless."
-                    " You must use set the environment variable"
-                    " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
-                    " synchonize the execution to get meaning full profile.")
+                    " You must set the environment variable"
+                    " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
+                    " synchronize the execution to get a meaningful profile.")

         self.apply_callcount = {}
         self.output_size = {}

@@ -708,7 +708,7 @@ class ProfileStats(object):

         if len(fct_memory) > 1:
             print >> file, ("Memory Profile "
-                            "(the max between all function in that profile)")
+                            "(the max between all functions in that profile)")
         else:
             print >> file, "Memory Profile"

@@ -717,15 +717,15 @@ class ProfileStats(object):

         print >> file, "---"
         # print >> file, " Max if no gc, inplace and view: %dKB" % int(
         #     round(max_sum_size / 1024))
-        print >> file, " Max if linker=cvm (default): unknow"
+        print >> file, " Max if linker=cvm (default): unknown"
         print >> file, " Max if no gc (allow_gc=False): %dKB" % int(round(
             max_node_memory_size / 1024.))
         print >> file, " Max if linker=c|py: %dKB" % int(round(
             max_running_max_memory_size / 1024.))
-        # print >> file, " Memory saved if view are used: %dKB" % int(round(
-        #     max_node_memory_saved_by_view / 1024.))
-        # print >> file, " Memory saved if inplace op are used: %dKB" % int(
-        #     round(max_node_memory_saved_by_inplace / 1024.))
+        # print >> file, " Memory saved if views are used: %dKB" % int(
+        #     round(max_node_memory_saved_by_view / 1024.))
+        # print >> file, " Memory saved if inplace ops are used: %dKB" % \
+        #     int(round(max_node_memory_saved_by_inplace / 1024.))
         print >> file, " Memory saved if gc is enabled (linker=c|py): %dKB" % int(
             round(max_node_memory_size - max_running_max_memory_size) / 1024.)

         if (hasattr(theano, 'sandbox') and

@@ -734,7 +734,7 @@ class ProfileStats(object):

                 hasattr(theano.sandbox.cuda.cuda_ndarray.cuda_ndarray,
                         'theano_allocated')):
             _, gpu_max = theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.theano_allocated()
-            print >> file, (" Max Memory allocated on the GPU"
+            print >> file, (" Max Memory allocated on the GPU "
                             "(for all functions): %dKB" %
                             int(round(gpu_max / 1024.)))

@@ -785,11 +785,11 @@ class ProfileStats(object):

             ))
         print >> file, ''
         if N == 0:
-            print >> file, (' All Apply node have outputs size that take'
-                            ' less then %dB.' % config.profiling.min_memory_size)
+            print >> file, (' All Apply nodes have output sizes that take'
+                            ' less than %dB.' % config.profiling.min_memory_size)
         print >> file, (
-            " <created/inplace/view> is taked from the op declaration.")
+            " <created/inplace/view> is taken from the Op's declaration.")
         print >> file, (" Apply nodes marked 'inplace' or 'view' may"
                         " actually allocate memory, this is not reported"
                         " here. If you use DebugMode, warnings will be"
...
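All the memory figures in the report above are formatted the same way, as whole kilobytes. A minimal sketch of that formatting (the helper name is hypothetical; the profiler inlines this expression):

```python
def format_kb(label, n_bytes):
    # Same rounding as the profiler output lines: bytes -> whole KB,
    # via float division, round(), then int() for the %d conversion.
    return " %s: %dKB" % (label, int(round(n_bytes / 1024.)))

print(format_kb("Max if no gc (allow_gc=False)", 3 * 1024 * 1024))
# " Max if no gc (allow_gc=False): 3072KB"
```

Note the trailing `.` in `1024.`: in the Python 2 code of this era it forces float division, so sizes below 512 bytes round to 0KB instead of truncating unpredictably.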
theano/gof/vm.py

@@ -582,12 +582,12 @@ class VM_Linker(link.LocalLinker):

                 theano.sandbox.cuda.cuda_enabled):
             if os.environ.get('CUDA_LAUNCH_BLOCKING', '0') != '1':
                 raise Exception(
-                    "You are running Theano profiler with CUDA enabled."
-                    " Theano GPU ops execution are asynchron by default."
+                    "You are running the Theano profiler with CUDA enabled."
+                    " Theano GPU ops execution is asynchronous by default."
                     " So by default, the profile is useless."
-                    " You must use set the environment variable"
-                    " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA drvier to"
-                    " synchonize the execution to get meaning full profile.")
+                    " You must set the environment variable"
+                    " CUDA_LAUNCH_BLOCKING to 1 to tell the CUDA driver to"
+                    " synchronize the execution to get a meaningful profile.")

         if no_recycling is None:
             no_recycling = []
...
theano/tensor/basic.py

@@ -1199,32 +1199,33 @@ class TensorType(Type):

         return numpy.zeros(shape, dtype=self.dtype)

     def get_shape_info(self, obj):
         """
-        Return the information needed to compute the memory size of obj.
+        Return the information needed to compute the memory size of ``obj``.

-        The memory size is only the data, so this exclude the container.
+        The memory size is only the data, so this excludes the container.
         For an ndarray, this is the data, but not the ndarray object and
-        others data structures as shape and strides.
+        other data structures such as shape and strides.

-        get_shape_info() and get_size() work in tendem for the memory profiler.
+        ``get_shape_info()`` and ``get_size()`` work in tandem for the memory
+        profiler.

-        get_shape_info() is called during the execution of the function.
+        ``get_shape_info()`` is called during the execution of the function.
         So it is better that it is not too slow.

-        get_size() will be called with the output of this function
+        ``get_size()`` will be called on the output of this function
         when printing the memory profile.

-        :param obj: The object that this Type represent during execution
+        :param obj: The object that this Type represents during execution

-        :return: Python object that self.get_size() understand
+        :return: Python object that ``self.get_size()`` understands
         """
         return obj.shape

     def get_size(self, shape_info):
-        """ Number of bytes taken by the object represented by shape_info
+        """ Number of bytes taken by the object represented by shape_info.

         :param shape_info: the output of the call to get_shape_info()

-        :return: the number of bytes taken by the object described in
-        shape_info.
+        :return: the number of bytes taken by the object described by
+        ``shape_info``.
         """
         if shape_info:
             return numpy.prod(shape_info) * numpy.dtype(self.dtype).itemsize
...
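The ``get_size()`` formula shown in ``TensorType`` (element count times element width) can be checked directly against NumPy's own byte count. A standalone sketch; the 0-d fallback branch is an assumption here, since the diff truncates before it:

```python
import numpy

def tensor_get_size(dtype, shape_info):
    # Mirrors TensorType.get_size(): product of the shape times the
    # per-element width of the dtype.
    if shape_info:
        return numpy.prod(shape_info) * numpy.dtype(dtype).itemsize
    else:
        # Assumed 0-d case: an empty shape tuple means one scalar element.
        return numpy.dtype(dtype).itemsize

a = numpy.zeros((5, 7), dtype='float32')
print(tensor_get_size('float32', a.shape))  # 140
print(a.nbytes)                             # 140, NumPy's own data-byte count
```

This also makes the docstring's point concrete: the result equals ``ndarray.nbytes``, the data buffer alone, not ``sys.getsizeof`` of the array object with its shape and stride bookkeeping.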