testgroup / pytensor · Commit 690d3628, authored Aug 19, 2015 by abergeron
Merge pull request #3301 from harlouci/numpydoc_compile
Numpydoc compile
Parents: e8ecd0fc, bed6f019
Showing 13 changed files with 380 additions and 256 deletions (+380 -256)
theano/compile/builders.py         +14  -10
theano/compile/debugmode.py        +0   -0
theano/compile/function.py         +68  -75
theano/compile/function_module.py  +0   -0
theano/compile/io.py               +0   -0
theano/compile/mode.py             +40  -22
theano/compile/monitormode.py      +22  -22
theano/compile/nanguardmode.py     +22  -11
theano/compile/ops.py              +0   -0
theano/compile/pfunc.py            +0   -0
theano/compile/profilemode.py      +46  -24
theano/compile/profiling.py        +93  -40
theano/compile/sharedvalue.py      +75  -52
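Throughout this commit, docstrings are converted from Sphinx field lists (`:param x:` / `:type x:` / `:note:`) to numpydoc sections (`Parameters`, `Returns`, `Notes`). A minimal sketch of the target convention, on a hypothetical function (not from the diff):

```python
def scale(x, factor=2.0):
    """
    Multiply `x` by `factor`.

    Parameters
    ----------
    x : float
        The value to scale.
    factor : float
        The scale factor. Default 2.0.

    Returns
    -------
    float
        The scaled value.
    """
    return x * factor


# numpydoc sections are plain text, so their presence is easy to check.
assert "Parameters" in scale.__doc__ and "----------" in scale.__doc__
```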
theano/compile/builders.py
...
...
@@ -10,10 +10,11 @@ from functools import reduce
class OpFromGraph(gof.Op):
"""This creates an `Op` from inputs and outputs lists of variables.
"""
This creates an `Op` from inputs and outputs lists of variables.
The signature is similar to theano.function() and the resulting
`Op`'s perform will do the same operation as:
:
`Op`'s perform will do the same operation as:
orig_function(inputs, outputs, **kwargs)
...
...
@@ -31,11 +32,15 @@ class OpFromGraph(gof.Op):
     - Add support to pickle this Op.
     - Add support/test with random generator
-    :note:
-        - We support shared variables in the inner graph. This is automatic and
-          invisible to the user. They can be as input to the node or in the
-          inner graph.
-        - We support unused inputs. This is needed for the grad.
+    Notes
+    -----
+    - We support shared variables in the inner graph. This is automatic and
+      invisible to the user. They can be as input to the node or in the
+      inner graph.
+    - We support unused inputs. This is needed for the grad.
+    Examples
+    --------
     Example 1:
...
...
@@ -49,8 +54,6 @@ class OpFromGraph(gof.Op):
e2 = op(x, y, z) + op(z, y, x)
fn = function([x, y, z], [e2])
Example 2 with shared variable:
-    .. code-block:: python
...
...
@@ -139,7 +142,8 @@ class OpFromGraph(gof.Op):
     def connection_pattern(self, node):
         """
-        Return connection pattern of subfgraph defined by inputs and outputs
+        Return connection pattern of subfgraph defined by inputs and outputs.
         """
         return io_connection_pattern(self.new_inputs, self.new_outputs)
...
...
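`connection_pattern` reports, for every (input, output) pair of the wrapped graph, whether that output depends on that input. A hedged, stdlib-only sketch of the idea on a toy dependency graph (the function and its `parents` mapping are illustrative, not Theano's API):

```python
def connection_pattern(inputs, outputs, parents):
    """For each input i and output o, True iff o is reachable from i.

    `parents` maps a node name to the node names it is computed from.
    """
    def reaches(src, node, seen=None):
        # Depth-first reachability over the (acyclic) expression graph.
        if seen is None:
            seen = set()
        if node == src:
            return True
        if node in seen:
            return False
        seen.add(node)
        return any(reaches(src, p, seen) for p in parents.get(node, ()))

    return [[reaches(i, o) for o in outputs] for i in inputs]


# Toy graph: z = f(x, y), w = g(x).  w depends on x but not on y.
parents = {"z": ("x", "y"), "w": ("x",)}
pattern = connection_pattern(["x", "y"], ["z", "w"], parents)
```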
theano/compile/debugmode.py
(diff collapsed)
theano/compile/function.py
"""Define the `function` function
"""
Define the `function` function.
"""
import
six.moves.cPickle
as
pickle
import
logging
...
...
@@ -23,8 +25,9 @@ def function_dump(filename, inputs, outputs=None, mode=None, updates=None,
                  no_default_updates=False, accept_inplace=False, name=None,
                  rebuild_strict=True, allow_input_downcast=None, profile=None,
                  on_unused_input=None):
"""This is helpful to make a reproducable case for problem during
Theano compilation.
"""
This is helpful to make a reproducable case for problem during Theano
compilation.
Ex:
...
...
@@ -65,78 +68,67 @@ def function(inputs, outputs=None, mode=None, updates=None, givens=None,
"""
Return a callable object that will calculate `outputs` from `inputs`.
-    :type inputs: list of either Variable or Param instances.
-    :param inputs: function parameters, these are not allowed to be shared
-        variables
-    :type outputs: list or dict of Variables or Out instances. If it is a
-        dict, the keys must be strings
-    :param outputs: expressions to compute
-    :type mode: string or `Mode` instance.
-    :param mode: compilation mode
-    :type updates: iterable over pairs (shared_variable, new_expression).
-        List, tuple or OrderedDict.
-    :param updates: update the values for SharedVariable inputs
-        according to these expressions
-    :type givens: iterable over pairs (Var1, Var2) of Variables. List,
-        tuple or dict. The Var1 and Var2 in each pair must
-        have the same Type.
-    :param givens: specific substitutions to make in the computation
-        graph (Var2 replaces Var1).
-    :type no_default_updates: either bool or list of Variables
-    :param no_default_updates: if True, do not perform any automatic
-        update on Variables. If False (default), perform them
-        all. Else, perform automatic updates on all Variables that are
-        neither in "updates" nor in "no_default_updates".
-    :param name: an optional name for this function. The profile mode
-        will print the time spent in this function.
-    :param rebuild_strict: True (Default) is the safer and better
-        tested setting, in which case `givens` must substitute new
-        variables with the same Type as the variables they replace.
-        False is a you-better-know-what-you-are-doing setting, that
-        permits `givens` to replace variables with new variables of
-        any Type. The consequence of changing a Type is that all
-        results depending on that variable may have a different Type
-        too (the graph is rebuilt from inputs to outputs). If one of
-        the new types does not make sense for one of the Ops in the
-        graph, an Exception will be raised.
-    :type allow_input_downcast: Boolean or None
-    :param allow_input_downcast: True means that the values passed as
-        inputs when calling the function can be silently downcasted to
-        fit the dtype of the corresponding Variable, which may lose
-        precision. False means that it will only be cast to a more
-        general, or precise, type. None (default) is almost like
-        False, but allows downcasting of Python float scalars to
-        floatX.
-    :type profile: None, True, or ProfileStats instance
-    :param profile: accumulate profiling information into a given
-        ProfileStats instance. If argument is `True` then a new
-        ProfileStats instance will be used. This profiling object
-        will be available via self.profile.
-    :param on_unused_input: What to do if a variable in the 'inputs'
-        list is not used in the graph. Possible values are 'raise',
-        'warn', 'ignore' and None.
-    :rtype: Function instance
-    :returns: a callable object that will compute the outputs (given
-        the inputs) and update the implicit function arguments
-        according to the `updates`.
-    :note: Regarding givens: Be careful to make sure that these
-        substitutions are independent--behaviour when Var1 of one pair
-        appears in the graph leading to Var2 in another expression is
-        undefined. Replacements specified with givens are different
-        from optimizations in that Var2 is not expected to be
-        equivalent to Var1.
+    Parameters
+    ----------
+    inputs : list of either Variable or Param instances.
+        Function parameters, these are not allowed to be shared variables.
+    outputs : list or dict of Variables or Out instances.
+        If it is a dict, the keys must be strings. Expressions to compute.
+    mode : string or `Mode` instance.
+        Compilation mode.
+    updates : iterable over pairs (shared_variable, new_expression). List, tuple
+        or OrderedDict.
+        Updates the values for SharedVariable inputs according to these
+        expressions.
+    givens : iterable over pairs (Var1, Var2) of Variables. List, tuple or dict.
+        The Var1 and Var2 in each pair must have the same Type.
+        Specific substitutions to make in the computation graph (Var2 replaces
+        Var1).
+    no_default_updates : either bool or list of Variables
+        If True, do not perform any automatic update on Variables. If False
+        (default), perform them all. Else, perform automatic updates on all
+        Variables that are neither in "updates" nor in "no_default_updates".
+    name : str
+        An optional name for this function. The profile mode will print the time
+        spent in this function.
+    rebuild_strict : bool
+        True (Default) is the safer and better tested setting, in which case
+        `givens` must substitute new variables with the same Type as the
+        variables they replace.
+        False is a you-better-know-what-you-are-doing setting, that permits
+        `givens` to replace variables with new variables of any Type.
+        The consequence of changing a Type is that all results depending on that
+        variable may have a different Type too (the graph is rebuilt from inputs
+        to outputs). If one of the new types does not make sense for one of the
+        Ops in the graph, an Exception will be raised.
+    allow_input_downcast : bool or None
+        True means that the values passed as inputs when calling the function
+        can be silently downcasted to fit the dtype of the corresponding
+        Variable, which may lose precision. False means that it will only be
+        cast to a more general, or precise, type. None (default) is almost like
+        False, but allows downcasting of Python float scalars to floatX.
+    profile : None, True, or ProfileStats instance
+        Accumulate profiling information into a given ProfileStats instance.
+        If argument is `True` then a new ProfileStats instance will be used.
+        This profiling object will be available via self.profile.
+    on_unused_input
+        What to do if a variable in the 'inputs' list is not used in the graph.
+        Possible values are 'raise', 'warn', 'ignore' and None.
+    Returns
+    -------
+    Function instance
+        A callable object that will compute the outputs (given the inputs) and
+        update the implicit function arguments according to the `updates`.
+    Notes
+    -----
+    Regarding givens: Be careful to make sure that these
+    substitutions are independent--behaviour when Var1 of one pair
+    appears in the graph leading to Var2 in another expression is
+    undefined. Replacements specified with givens are different
+    from optimizations in that Var2 is not expected to be
+    equivalent to Var1.
Internal documentation:
...
...
@@ -214,6 +206,7 @@ def function(inputs, outputs=None, mode=None, updates=None, givens=None,
was easier to develop the VM in Python then translate it to C instead
of just writing it in C from scratch.
CVM stands for C Virtual Machine.
"""
if
isinstance
(
outputs
,
dict
):
output_items
=
list
(
outputs
.
items
())
...
...
theano/compile/function_module.py
(diff collapsed)
theano/compile/io.py
(diff collapsed)
theano/compile/mode.py
"""WRITEME
"""
WRITEME
"""
from
__future__
import
print_function
import
logging
...
...
@@ -34,8 +36,9 @@ AddConfigVar('optimizer_requiring',
def check_equal(x, y):
    """
-    Returns True iff x[0] and y[0] are equal (checks the dtype and
-    shape if x and y are numpy.ndarray instances). Used internally.
+    Returns True iff x[0] and y[0] are equal (checks the dtype and shape if x
+    and y are numpy.ndarray instances). Used internally.
    """
    # I put the import here to allow using theano without scipy.
    import scipy.sparse as sp
...
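`check_equal` compares the first elements of two `(value, flags)` pairs. A hedged stdlib-only sketch of the comparison logic, with nested lists standing in for ndarrays (the real helper also compares dtype/shape and handles scipy sparse matrices):

```python
def check_equal(x, y):
    """Return True iff x[0] and y[0] are equal (element-wise for lists)."""
    a, b = x[0], y[0]
    if isinstance(a, list) and isinstance(b, list):
        # Recurse element-wise, wrapping each element back into a 1-tuple
        # so the same entry point handles scalars and nested lists.
        return len(a) == len(b) and all(
            check_equal((u,), (v,)) for u, v in zip(a, b))
    return a == b
```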
...
@@ -125,17 +128,19 @@ def register_optimizer(name, opt):
class AddDestroyHandler(gof.Optimizer):
-    """This optimizer performs two important functions:
+    """
+    This optimizer performs two important functions:
    1) It has a 'requirement' of the destroyhandler. This means that the fgraph
    will include it as a feature for this optimization, and keep this feature
    enabled for subsequent optimizations. All optimizations that work inplace
    on any of their inputs must run *after* this optimization to ensure that
    the DestroyHandler has been included in the fgraph.
    2) It tries to replace each output with an Op that purports to destroy it
    (but it won't I promise). If this replacement succeeds it means that
    there is a bug in theano. It should not be possible to destroy outputs.
    """
    def apply(self, fgraph):
        for o in fgraph.outputs:
...
...
@@ -157,11 +162,13 @@ class AddDestroyHandler(gof.Optimizer):
class AddNoOutputFromInplace(gof.Optimizer):
-    """This optimizer adds to the fgraph a feature that will prevent outputs
+    """
+    This optimizer adds to the fgraph a feature that will prevent outputs
    of a fgraph to be created by performing inplace operations on intermediary
    variables. This is useful when the outputs of the fgraph are preallocated
    to prevent useless copying of the data. Currently, scan preallocates its
    outputs.
    """
    def add_requirements(self, fgraph):
        super(AddNoOutputFromInplace, self).add_requirements(fgraph)
...
...
@@ -169,10 +176,12 @@ class AddNoOutputFromInplace(gof.Optimizer):
class PrintCurrentFunctionGraph(gof.Optimizer):
-    """This optimizer is for debugging.
+    """
+    This optimizer is for debugging.
    Toss it into the optimization pipeline to see the state of things at any
    given point.
    """
    def __init__(self, header):
        self.header = header
...
...
@@ -233,18 +242,23 @@ optdb.register('merge3', gof.MergeOptimizer(),
class Mode(object):
    """
-    The Mode represents a way to optimize and then link a computation
-    graph.
-    * optimizer -> a structure of type Optimizer. An Optimizer may
-      simplify the math, put similar computations together, improve
-      numerical stability and various other improvements.
-    * linker -> a structure of type Linker. A Linker decides which
-      implementations to use (C or Python, for example) and how to
-      string them together to perform the computation.
-    See predefined_linkers, predefined_optimizers and also
-    predefined_modes.
+    The Mode represents a way to optimize and then link a computation graph.
+    Parameters
+    ----------
+    optimizer : a structure of type Optimizer
+        An Optimizer may simplify the math, put similar computations together,
+        improve numerical stability and various other improvements.
+    linker : a structure of type Linker
+        A Linker decides which implementations to use (C or Python, for example)
+        and how to string them together to perform the computation.
+    See Also
+    --------
+    predefined_linkers
+    predefined_optimizers
+    predefined_modes
    """
    def __init__(self, linker=None, optimizer='default'):
...
...
@@ -326,6 +340,7 @@ class Mode(object):
        Keyword arguments can be provided for the linker,
        in which case its `clone` method will be called with these
        arguments.
        """
        new_linker = self.linker.clone(**link_kwargs)
        new_optimizer = self.provided_optimizer
...
...
@@ -412,7 +427,10 @@ def get_default_mode():
def register_mode(name, mode):
-    """Add a `Mode` which can be referred to by `name` in `function`."""
+    """
+    Add a `Mode` which can be referred to by `name` in `function`.
+    """
    if name in predefined_modes:
        raise ValueError('Mode name already taken: %s' % name)
    predefined_modes[name] = mode
theano/compile/monitormode.py
...
...
@@ -8,7 +8,6 @@ from theano.compile.mode import Mode
class MonitorMode(Mode):
"""
`MonitorMode` is a debug mode to easily step through function execution.
...
...
@@ -19,28 +18,28 @@ class MonitorMode(Mode):
A typical use case is to detect the introduction of NaN values in a graph.
For an example of such a use case, see doc/tutorial/debug_faq.txt.
Parameters
----------
pre_func
A function to call before executing a thunk, with arguments:
- the thunk index
- the Apply node
- the thunk to be called
post_func
A function to call after executing a thunk, with the same three
arguments as `pre_func`.
optimizer
The optimizer to use. One may use for instance 'fast_compile' to skip
optimizations.
linker
DO NOT USE. This mode uses its own linker. The parameter is needed to
allow selecting optimizers to use.
"""
    def __init__(self, pre_func=None, post_func=None, optimizer='default',
                 linker=None):
"""
Constructor.
:param pre_func: A function to call before executing a thunk, with
arguments:
- the thunk index
- the Apply node
- the thunk to be called
:param post_func: A function to call after executing a thunk, with the
same three arguments as `pre_func`.
:param optimizer: The optimizer to use. One may use for instance
'fast_compile' to skip optimizations.
:param linker: DO NOT USE. This mode uses its own linker.
The parameter is needed to allow selecting optimizers to use.
"""
        self.pre_func = pre_func
        self.post_func = post_func
        wrap_linker = theano.gof.WrapLinkerMany([theano.gof.OpWiseCLinker()],
...
...
@@ -67,6 +66,7 @@ class MonitorMode(Mode):
    def eval(self, i, node, fn):
        """
        The method that calls the thunk `fn`.
        """
        if self.pre_func is not None:
            self.pre_func(i, node, fn)
...
...
@@ -96,9 +96,9 @@ class MonitorMode(Mode):
"""
Create a new instance of this Mode.
Keyword arguments can be provided for the linker,
but they will be ignored, because ProfileMode needs
to use its own linker.
Keyword arguments can be provided for the linker,
but they will be
ignored, because ProfileMode needs to use its own linker.
"""
        new_mode = type(self)(pre_func=self.pre_func,
                              post_func=self.post_func,
...
...
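`MonitorMode.eval` brackets each thunk call with the optional `pre_func`/`post_func` hooks described above. A hedged sketch of that control flow with plain callables (the wrapper name and toy thunk are illustrative, not Theano's API):

```python
def make_monitored(fn, pre_func=None, post_func=None):
    """Wrap thunk `fn` so hooks see (index, node, fn), as MonitorMode does."""
    def eval(i, node):
        if pre_func is not None:
            pre_func(i, node, fn)      # called just before the thunk runs
        result = fn()
        if post_func is not None:
            post_func(i, node, fn)     # called just after, same arguments
        return result
    return eval


calls = []
monitored = make_monitored(
    lambda: 42,
    pre_func=lambda i, node, fn: calls.append(("pre", i)),
    post_func=lambda i, node, fn: calls.append(("post", i)))
result = monitored(0, "node")
```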
theano/compile/nanguardmode.py
...
...
@@ -16,11 +16,14 @@ def flatten(l):
Parameters
----------
-    l : List/tuple/other objects, might be nested.
+    l : list/tuple/other objects
+        Might be nested.
    Returns
    -------
-    A flattened list of objects
+    object
+        A flattened list of objects.
"""
if
isinstance
(
l
,
(
list
,
tuple
,
collections
.
ValuesView
)):
rval
=
[]
...
...
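The `flatten` helper documented above can be sketched in a few lines of stdlib Python (this version handles lists and tuples only; the real one also accepts `collections.ValuesView`):

```python
def flatten(l):
    """Return a flat list of the objects in `l`, which may be nested."""
    if isinstance(l, (list, tuple)):
        rval = []
        for elem in l:
            rval += flatten(elem)   # recurse into nested containers
        return rval
    return [l]                      # leaf: wrap so += above concatenates
```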
@@ -53,6 +56,7 @@ def contains_nan(arr):
This approach is faster and more memory efficient than the obvious
alternative, calling `np.any(np.isnan(ndarray))`, which requires the
construction of a boolean array with the same shape as the input array.
"""
if
isinstance
(
arr
,
theano
.
gof
.
type
.
CDataType
.
_cdata_type
):
return
False
...
...
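A hedged pure-Python analogue of `contains_nan`, assuming a flat list of floats. The NumPy version discussed above instead computes `np.isnan(np.min(arr))`: `min` propagates NaN, which avoids materialising a same-shape boolean array:

```python
import math

def contains_nan(values):
    """True iff `values` (a flat list of floats) contains a NaN."""
    # Straightforward scan; NumPy's min-based trick is a performance
    # optimisation that this stdlib sketch does not need.
    return any(math.isnan(v) for v in values)
```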
@@ -81,6 +85,7 @@ def contains_inf(arr):
This approach is more memory efficient than the obvious alternative,
calling `np.any(np.isinf(ndarray))`, which requires the construction of a
boolean array with the same shape as the input array.
"""
if
isinstance
(
arr
,
theano
.
gof
.
type
.
CDataType
.
_cdata_type
):
return
False
...
...
@@ -97,14 +102,16 @@ class NanGuardMode(Mode):
    Parameters
    ----------
    nan_is_error : bool
        If True, raise an error anytime a NaN is encountered.
    inf_is_error : bool
        If True, raise an error anytime an Inf is encountered. Note that some
        pylearn2 modules currently use np.inf as a default value (e.g.
        mlp.max_pool) and these will cause an error if inf_is_error is True.
    big_is_error : bool
        If True, raise an error when a value greater than 1e10 is encountered.
    """
    def __init__(self, nan_is_error, inf_is_error, big_is_error=True):
        if cuda.cuda_available:
            self.guard_input = cuda.fvector('nan_guard')
...
...
@@ -135,12 +142,13 @@ class NanGuardMode(Mode):
        var : numpy.ndarray
            The value to be checked.
        nd : theano.gof.Apply
            The Apply node being executed.
        f : callable
            The thunk for the apply node.
        is_input : bool
            If True, `var` is an input to `nd`.
            If False, it is an output.
        """
        error = False
        if nan_is_error:
...
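The three checks that `NanGuardMode` applies to each value can be sketched for a single scalar (the function name is illustrative; the real mode checks whole ndarrays and raises rather than returning a label):

```python
import math

def value_error_kind(x, nan_is_error=True, inf_is_error=True,
                     big_is_error=True):
    """Classify a scalar: 'nan', 'inf', 'big' (>1e10 magnitude), or None."""
    if nan_is_error and math.isnan(x):
        return "nan"
    if inf_is_error and math.isinf(x):
        return "inf"
    if big_is_error and abs(x) > 1e10:
        return "big"
    return None
```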
@@ -193,15 +201,18 @@ class NanGuardMode(Mode):
        def nan_check(i, node, fn):
            """
-            Runs `fn` while checking its inputs and outputs for NaNs / Infs
+            Runs `fn` while checking its inputs and outputs for NaNs / Infs.
            Parameters
            ----------
-            i : currently ignored (TODO: determine why it is here or remove)
+            i :
+                Currently ignored (TODO: determine why it is here or remove).
            node : theano.gof.Apply
-                The Apply node currently being executed
+                The Apply node currently being executed.
            fn : callable
-                The thunk to execute for this Apply node
+                The thunk to execute for this Apply node.
            """
            inputs = fn.inputs
            # TODO: figure out why individual inputs are themselves lists
...
...
theano/compile/ops.py
(diff collapsed)
theano/compile/pfunc.py
(diff collapsed)
theano/compile/profilemode.py
...
...
@@ -122,7 +122,10 @@ class ProfileMode(Mode):
            profile_stats))
    def function_maker(self, i, o, m, *args, **kwargs):
-        """Return an instance of `Profiler_Maker` which init the count"""
+        """
+        Return an instance of `Profiler_Maker` which init the count.
+        """
        assert m is self
        return Profile_Maker(i, o, self, *args, **kwargs)
...
...
@@ -147,7 +150,9 @@ class ProfileMode(Mode):
        self.profile_stats = profile_stats
        def profile_thunk(i, node, th):
-            """ Profile only the execution time
+            """
+            Profile only the execution time.
            """
            global run_cthunk
            if hasattr(th, 'cthunk'):
...
...
@@ -169,7 +174,9 @@ class ProfileMode(Mode):
            self.apply_time[node] += max(dt, 1e-14)
        def profile_thunk2(i, node, th):
-            """ Profile the execution time and the memory size.
+            """
+            Profile the execution time and the memory size.
            """
            global run_cthunk
            if hasattr(th, 'cthunk'):
...
...
@@ -211,7 +218,8 @@ class ProfileMode(Mode):
        self.fn_time = 0
    def print_summary(self, **kwargs):
-        """ Print 3 summaries that show where time is spent. The first shows
+        """
+        Print 3 summaries that show where time is spent. The first shows
        an Apply-wise summary, the second an Op-wise summary and the
        third a type-Op-wise summary.
...
...
@@ -235,10 +243,13 @@ class ProfileMode(Mode):
        There is a hack with the Op-wise summary. Go see it if you
        want to know more.
-        :param kwargs: They are passed to print_summary_ expanded.
-            Currently there is n_apply_to_print,
-            n_ops_to_print and min_memory_size that are
-            accepted.
+        Parameters
+        ----------
+        kwargs
+            They are passed to print_summary_ expanded. Currently there is
+            n_apply_to_print, n_ops_to_print and min_memory_size that are
+            accepted.
        """
        compile_time = sum([ps.compile_time
                            for ps in self.profile_stats.values()])
...
...
@@ -280,18 +291,23 @@ class ProfileMode(Mode):
                               **kwargs)
    def print_diff_summary(self, other, **kwargs):
-        """ As print_summary, but print the difference on two different
+        """
+        As print_summary, but print the difference on two different
        profile mode.
        TODO: Also we don't print the Apply-wise summary as it don't
        work for now.
        TODO: make comparaison with gpu code.
-        :param other: the other instance of ProfileMode that we want
-            to be compared to.
-        :param kwargs: They are passed to print_summary_ expanded.
+        Parameters
+        ----------
+        other
+            The other instance of ProfileMode that we want to be compared to.
+        kwargs
+            They are passed to print_summary_ expanded.
            Currently there is n_apply_to_print, n_ops_to_print and
            min_memory_size that are accepted.
        """
        def diff_dict(a_time, b_time_):
...
...
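`print_diff_summary` rests on a per-key difference of two timing dicts. A hedged sketch of what such a `diff_dict` helper can look like (the real one is internal to ProfileMode and its exact behaviour is not shown in this diff):

```python
def diff_dict(a_time, b_time):
    """Per-key difference of two timing dicts; missing keys count as 0.0."""
    rval = {}
    for key in set(a_time) | set(b_time):
        rval[key] = a_time.get(key, 0.0) - b_time.get(key, 0.0)
    return rval
```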
@@ -343,13 +359,18 @@ class ProfileMode(Mode):
                       min_memory_size=config.ProfileMode.min_memory_size,
                       ):
        """
-        do the actual printing of print_summary and print_diff_summary.
-        :param n_apply_to_print: the number of apply to print. Default 15.
-        :param n_ops_to_print: the number of ops to print. Default 20.
-        :param min_memory_size: Don't print memory profile of apply
-            whose outputs memory size is lower then that.
+        Do the actual printing of print_summary and print_diff_summary.
+        Parameters
+        ----------
+        n_apply_to_print
+            The number of apply to print. Default 15.
+        n_ops_to_print
+            The number of ops to print. Default 20.
+        min_memory_size
+            Don't print memory profile of apply whose outputs memory size is
+            lower than that.
        """
        print("ProfileMode is deprecated! Use the new profiler.")
...
...
@@ -700,9 +721,9 @@ Test them first, as they are not guaranteed to always provide a speedup.""")
"""
Create a new instance of this Mode.
Keyword arguments can be provided for the linker,
in which case its `clone` method will be called with these
arguments.
Keyword arguments can be provided for the linker,
in which case its
`clone` method will be called with these arguments.
"""
        new_linker = self.linker.clone(**link_kwargs)
        new_optimizer = self.provided_optimizer
...
...
@@ -727,10 +748,11 @@ prof_mode_instance_to_print = [predefined_modes["PROFILE_MODE"]]
def atexit_print_default_profile_mode():
-    """Print the summary of the predefined mode ProfileMode if used.
-    This all to have the summary printed at exit when config.mode=ProfileMode
+    """
+    Print the summary of the predefined mode ProfileMode if used.
+    This all to have the summary printed at exit when
+    config.mode=ProfileMode.
    """
    for prof_mode in prof_mode_instance_to_print:
        if prof_mode.local_time > 0:
...
...
theano/compile/profiling.py
"""ProfileStats object for runtime and memory profiling.
"""
ProfileStats object for runtime and memory profiling.
"""
from
__future__
import
print_function
#
...
...
@@ -76,7 +78,9 @@ AddConfigVar('profiling.destination',
def _atexit_print_fn():
-    """Print ProfileStat objects in _atexit_print_list to _atexit_print_file
+    """
+    Print ProfileStat objects in _atexit_print_list to _atexit_print_file.
    """
    to_sum = []
...
...
@@ -135,6 +139,16 @@ class ProfileStats(object):
"""
Object to store runtime and memory profiling information for all of
Theano's operations: compilation, optimization, execution.
Parameters
----------
atexit_print : bool
True means that this object will be printed to stderr (using .summary())
at the end of the program.
**kwargs : misc initializers
These should (but need not) match the names of the class vars declared
in this class.
"""
#
...
...
@@ -212,12 +226,6 @@ class ProfileStats(object):
# param is called flag_time_thunks because most other attributes with time
# in the name are times *of* something, rather than configuration flags.
    def __init__(self, atexit_print=True, flag_time_thunks=None, **kwargs):
-        """
-        atexit_print - bool. True means that this object will be printed to
-        stderr (using .summary()) at the end of the program.
-        **kwargs - misc initializers. These should (but need not) match the
-        names of the class vars declared in this class.
-        """
        if (hasattr(theano, 'sandbox') and
                hasattr(theano.sandbox, 'cuda') and
                theano.sandbox.cuda.cuda_enabled):
...
...
@@ -250,7 +258,10 @@ class ProfileStats(object):
        _atexit_registered = True
    def class_time(self):
-        """dict op -> total time on thunks"""
+        """
+        dict op -> total time on thunks
+        """
        # timing is stored by node, we compute timing by class on demand
        rval = {}
        for node, t in iteritems(self.apply_time):
...
...
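The `class_*` methods all follow the same pattern: fold per-node measurements into per-class totals on demand. A hedged stdlib sketch where a "node" is just a `(class_name, index)` key (the real code keys on `type(node.op)`):

```python
def class_time(apply_time):
    """Aggregate per-node thunk times into per-class totals."""
    rval = {}
    for node, t in apply_time.items():
        cls = node[0]                       # stand-in for type(node.op)
        rval[cls] = rval.get(cls, 0.0) + t  # accumulate on demand
    return rval
```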
@@ -260,7 +271,10 @@ class ProfileStats(object):
        return rval
    def class_callcount(self):
-        """dict op -> total number of thunk calls"""
+        """
+        dict op -> total number of thunk calls
+        """
        # timing is stored by node, we compute timing by class on demand
        rval = {}
        for node, count in iteritems(self.apply_callcount):
...
...
@@ -270,7 +284,10 @@ class ProfileStats(object):
        return rval
    def class_nodes(self):
-        """dict op -> total number of nodes"""
+        """
+        dict op -> total number of nodes
+        """
        # timing is stored by node, we compute timing by class on demand
        rval = {}
        for node, count in iteritems(self.apply_callcount):
...
...
@@ -280,7 +297,10 @@ class ProfileStats(object):
        return rval
    def class_impl(self):
-        """dict op -> total number of nodes"""
+        """
+        dict op -> total number of nodes
+        """
        # timing is stored by node, we compute timing by class on demand
        rval = {}
        for node in self.apply_callcount:
...
...
@@ -295,7 +315,10 @@ class ProfileStats(object):
        return rval
    def op_time(self):
-        """dict op -> total time on thunks"""
+        """
+        dict op -> total time on thunks
+        """
        # timing is stored by node, we compute timing by Op on demand
        rval = {}
        for node, t in iteritems(self.apply_time):
...
...
@@ -304,7 +327,10 @@ class ProfileStats(object):
        return rval
    def fill_node_total_time(self, node, total_times):
-        """node -> fill total time icluding its parents (returns nothing)"""
+        """
+        node -> fill total time including its parents (returns nothing)
+        """
        # timing is stored by node, we compute total time on demand
        total = self.apply_time[node]
        for parent in node.get_parents():
...
...
@@ -315,7 +341,10 @@ class ProfileStats(object):
        total_times[node] = total
    def compute_total_times(self):
-        """dict op -> total time icluding the time for parents"""
+        """
+        dict op -> total time including the time for parents
+        """
        rval = {}
        for node in self.apply_time:
            if node not in rval:
...
...
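The `fill_node_total_time`/`compute_total_times` pair computes, for each node, its own time plus the time of its (transitive) parents. A hedged stdlib sketch with explicit memoization (names and the `get_parents` callback are illustrative):

```python
def compute_total_times(apply_time, get_parents):
    """For each profiled node, total time including its parents' totals."""
    totals = {}

    def fill(node):
        if node in totals:            # memoize: each node computed once
            return totals[node]
        total = apply_time[node]
        for parent in get_parents(node):
            if parent in apply_time:  # only profiled parents contribute
                total += fill(parent)
        totals[node] = total
        return total

    for node in apply_time:
        fill(node)
    return totals
```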
@@ -323,7 +352,10 @@ class ProfileStats(object):
        return rval
    def op_callcount(self):
-        """dict op -> total number of thunk calls"""
+        """
+        dict op -> total number of thunk calls
+        """
        # timing is stored by node, we compute timing by Op on demand
        rval = {}
        for node, count in iteritems(self.apply_callcount):
...
...
@@ -332,7 +364,10 @@ class ProfileStats(object):
        return rval
    def op_nodes(self):
-        """dict op -> total number of nodes"""
+        """
+        dict op -> total number of nodes
+        """
        # timing is stored by node, we compute timing by Op on demand
        rval = {}
        for node, count in iteritems(self.apply_callcount):
...
...
@@ -341,7 +376,10 @@ class ProfileStats(object):
        return rval
    def op_impl(self):
-        """dict op -> 'C' or 'Py' depending how the op is implemented"""
+        """
+        dict op -> 'C' or 'Py' depending how the op is implemented
+        """
        # timing is stored by node, we compute timing by Op on demand
        rval = {}
        for node in self.apply_callcount:
...
...
@@ -711,21 +749,23 @@ class ProfileStats(object):
    def count_running_memory(order, fgraph, nodes_mem):
        """
        Calculate memory with specific node order.
        Return a list including the following values:
        1. node_memory_size
            Sum of the size of all variables that actually allocate
            memory (excluding views, and inplace).
        2. running_memory_size
            The memory allocated after the current apply node.
        3. running_max_memory_size
            The maximum of running_memory_size during the function.
        4. node_memory_saved_by_view
            The sum of memory saved by returning view instead of new
            allocation.
        5. node_memory_saved_by_inplace
            The sum of memory saved by reusing the input instead of
            new allocation.
        """
        from theano.sandbox.cuda import CudaNdarrayType
        # Initial Mem info values [CPU, GPU]
...
...
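The first three quantities above (total allocated, running, and peak memory) follow from a single walk over the node order. A hedged stdlib sketch on a toy schedule (the `sizes` and `freed_after` maps are illustrative stand-ins for what the real code derives from the fgraph):

```python
def count_running_memory(order, sizes, freed_after):
    """Walk nodes in `order`, tracking total, running, and peak memory.

    sizes: node -> bytes its outputs allocate.
    freed_after: node -> nodes whose outputs can be freed once it has run.
    """
    running = 0
    peak = 0
    total_allocated = 0
    for node in order:
        running += sizes[node]          # this node's outputs are allocated
        total_allocated += sizes[node]
        peak = max(peak, running)       # running_max_memory_size
        for dead in freed_after.get(node, ()):
            running -= sizes[dead]      # last consumer ran: free the buffer
    return total_allocated, running, peak
```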
@@ -874,10 +914,14 @@ class ProfileStats(object):
    def min_memory_generator(executable_nodes, viewed_by, view_of):
        """
-        Generate all valid node order from node_list
-        and compute its memory peak.
-        :param executable_nodes: Set of executable nodes
+        Generate all valid node order from node_list and compute its
+        memory peak.
+        Parameters
+        ----------
+        executable_nodes
+            Set of executable nodes.
        """
        global mem_count, mem_bound, max_mem_count
...
...
@@ -1255,9 +1299,13 @@ if False: # old code still to be ported from ProfileMode
"""
Print a readable summary of the stats.
param: n_apply_to_print the number of apply to print. Default 15.
Parameters
----------
n_apply_to_print
The number of apply to print. Default 15.
n_ops_to_print
The number of ops to print. Default 20.
param: n_ops_to_print the number of ops to print. Default 20.
"""
        local_time = sum(self.apply_time.values())
...
...
@@ -1483,11 +1531,13 @@ if False: # old code still to be ported from ProfileMode
        There is a hack with the Op-wise summary. Go see it if you want to know
        more.
-        :param n_apply_to_print: the number of apply to print. Default 15, or
-            n_ops_to_print flag.
-        :param n_ops_to_print: the number of ops to print. Default 20, or
-            n_apply_to_print flag.
+        Parameters
+        ----------
+        n_apply_to_print
+            The number of apply to print. Default 15, or n_ops_to_print flag.
+        n_ops_to_print
+            The number of ops to print. Default 20, or n_apply_to_print flag.
        """
        fct_call_time = self.mode.fct_call_time
        fct_call = self.mode.fct_call
...
@@ -1517,12 +1567,15 @@ if False: # old code still to be ported from ProfileMode
        now.
        TODO: make comparison with gpu code.
-        :param other: the other instance of ProfileMode that we want to be
-            compared to.
-        :param n_apply_to_print: the number of apply to print. Default 15.
-        :param n_ops_to_print: the number of ops to print. Default 20.
+        Parameters
+        ----------
+        other
+            The other instance of ProfileMode that we want to be compared to.
+        n_apply_to_print
+            The number of apply to print. Default 15.
+        n_ops_to_print
+            The number of ops to print. Default 20.
        """
        def diff_dict(a_time, b_time_):
...
...
theano/compile/sharedvalue.py
"""Provide a simple user friendly API to Theano-managed memory"""
"""
Provide a simple user friendly API to Theano-managed memory.
"""
# Standard imports
import
copy
import
logging
...
...
@@ -18,6 +21,32 @@ class SharedVariable(Variable):
     Variable that is (defaults to being) shared between functions that
     it appears in.

+    Parameters
+    ----------
+    name : str
+        The name for this variable (see `Variable`).
+    type : str
+        The type for this variable (see `Variable`).
+    value
+        A value to associate with this variable (a new container will be
+        created).
+    strict
+        True : assignments to .value will not be cast or copied, so they must
+        have the correct type.
+    allow_downcast
+        Only applies if `strict` is False.
+        True : allow assigned value to lose precision when cast during
+        assignment.
+        False : never allow precision loss.
+        None : only allow downcasting of a Python float to a scalar floatX.
+    container
+        The container to use for this variable. Illegal to pass this as well as
+        a value.
+
+    Notes
+    -----
+    For more user-friendly constructor, see `shared`.
+
     """

    # Container object
...
...
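The strict/allow_downcast rules listed in the new class docstring can be sketched for a float32 shared value; `filter_assignment` and its logic below are an illustrative stand-in, not Theano's actual filtering code:

```python
import numpy as np

# Hedged sketch of the strict / allow_downcast rules described above,
# for a hypothetical float32 shared value. Not Theano's real code path.

def filter_assignment(value, strict=False, allow_downcast=None, dtype="float32"):
    arr = np.asarray(value)
    target = np.dtype(dtype)
    if strict:
        if arr.dtype != target:
            raise TypeError("strict=True: value must already be %s" % dtype)
        return arr
    if np.can_cast(arr.dtype, target):      # lossless cast is always allowed
        return arr.astype(target)
    if allow_downcast:                      # caller accepts precision loss
        return arr.astype(target)
    if allow_downcast is None and isinstance(value, float):
        return arr.astype(target)           # Python float -> scalar floatX
    raise TypeError("refusing to downcast %s to %s" % (arr.dtype, dtype))
```

With `allow_downcast=False`, assigning a float64 array raises instead of silently losing precision, which is exactly the contract the docstring spells out.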
@@ -36,29 +65,6 @@ class SharedVariable(Variable):

     def __init__(self, name, type, value, strict,
                  allow_downcast=None, container=None):
-        """
-        :param name: The name for this variable (see `Variable`).
-        :param type: The type for this variable (see `Variable`).
-        :param value: A value to associate with this variable (a new
-            container will be created).
-        :param strict: True -> assignments to .value will not be cast
-            or copied, so they must have the correct type.
-        :param allow_downcast: Only applies if `strict` is False.
-            True -> allow assigned value to lose precision when cast
-            during assignment.
-            False -> never allow precision loss.
-            None -> only allow downcasting of a Python float to a scalar floatX.
-        :param container: The container to use for this
-            variable. Illegal to pass this as well as a value.
-        :note: For more user-friendly constructor, see `shared`
-        """
         super(SharedVariable, self).__init__(type=type, name=name,
                                              owner=None, index=None)
...
...
@@ -79,18 +85,21 @@ class SharedVariable(Variable):
                                   allow_downcast=allow_downcast)

     def get_value(self, borrow=False, return_internal_type=False):
-        """Get the non-symbolic value associated with this SharedVariable.
+        """
+        Get the non-symbolic value associated with this SharedVariable.

-        :param borrow: True to permit returning of an object aliased
-            to internal memory.
-        :param return_internal_type: True to permit the returning of
-            an arbitrary type object used internally to store the
-            shared variable.
+        Parameters
+        ----------
+        borrow : bool
+            True to permit returning of an object aliased to internal memory.
+        return_internal_type : bool
+            True to permit the returning of an arbitrary type object used
+            internally to store the shared variable.

-        Only with borrow=False and return_internal_type=True does this function
-        guarantee that you actually get the internal object.
-        But in that case, you may get different return types when using
-        different compute devices.
+        Only with borrow=False and return_internal_type=True does this function
+        guarantee that you actually get the internal object.
+        But in that case, you may get different return types when using
+        different compute devices.

         """
         if borrow:
...
...
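The borrow/deepcopy contract this docstring describes can be sketched in plain Python; the `Container` class here is a stand-in for Theano's internal storage, not its real implementation:

```python
import copy

# Minimal sketch of the get_value() borrow semantics documented above:
# borrow=True hands out the internal object itself, borrow=False a deep copy.

class Container:
    def __init__(self, value):
        self.value = value

def get_value(container, borrow=False):
    if borrow:
        return container.value          # aliased to internal memory
    return copy.deepcopy(container.value)

c = Container([1, 2, 3])
aliased = get_value(c, borrow=True)
aliased[0] = 99                         # visible through the container
safe = get_value(c, borrow=False)
safe[1] = -1                            # private copy; container unchanged
```

Mutating the borrowed object changes shared state; mutating the copy does not, which is why `borrow=False` is the safe default.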
@@ -99,14 +108,18 @@ class SharedVariable(Variable):
             return copy.deepcopy(self.container.value)

     def set_value(self, new_value, borrow=False):
-        """Set the non-symbolic value associated with this SharedVariable.
+        """
+        Set the non-symbolic value associated with this SharedVariable.

-        :param borrow:
+        Parameters
+        ----------
+        borrow : bool
             True to use the new_value directly, potentially creating problems
             related to aliased memory.

         Changes to this value will be visible to all functions using
         this SharedVariable.

         """
         if borrow:
             self.container.value = new_value
...
...
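The write-side aliasing this docstring warns about has the mirror-image sketch (again with a hypothetical `Container` standing in for Theano's storage):

```python
import copy

# Sketch of the set_value() borrow semantics documented above: with
# borrow=True the caller's object is stored directly, so later external
# mutations alias the shared state; borrow=False stores a deep copy.

class Container:
    def __init__(self, value):
        self.value = value

def set_value(container, new_value, borrow=False):
    if borrow:
        container.value = new_value
    else:
        container.value = copy.deepcopy(new_value)

c = Container(None)
data = [0, 0]
set_value(c, data, borrow=True)
data[0] = 7                     # aliases the container's value
aliased_hit = c.value[0]

set_value(c, data, borrow=False)
data[1] = 8                     # the stored copy is insulated from this
```

This is the "potentially creating problems related to aliased memory" case: after `borrow=True`, every later mutation of `data` leaks into the shared variable.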
@@ -114,15 +127,19 @@ class SharedVariable(Variable):
             self.container.value = copy.deepcopy(new_value)

     def zero(self, borrow=False):
-        """Set the values of a shared variable to 0.
+        """
+        Set the values of a shared variable to 0.

-        :param borrow:
+        Parameters
+        ----------
+        borrow : bool
             True to modify the value of a shared variable directly by using
             its previous value. Potentially this can cause problems
             regarding to the aliased memory.

         Changes done with this function will be visible to all functions using
         this SharedVariable.

         """
         if borrow:
             self.container.value[...] = 0
...
...
@@ -183,7 +200,8 @@ def shared_constructor(ctor, remove=False):

 def shared(value, name=None, strict=False, allow_downcast=None, **kwargs):
-    """Return a SharedVariable Variable, initialized with a copy or
+    """
+    Return a SharedVariable Variable, initialized with a copy or
     reference of `value`.

     This function iterates over
...
...
@@ -196,23 +214,25 @@ def shared(value, name=None, strict=False, allow_downcast=None, **kwargs):
     ``theano.shared`` is a shortcut to this function.

-    :note: By passing kwargs, you effectively limit the set of
-        potential constructors to those that can accept those kwargs.
+    Notes
+    -----
+    By passing kwargs, you effectively limit the set of potential constructors
+    to those that can accept those kwargs.

-    :note:
-        Some shared variable have ``borrow`` as extra kwargs.
-        `See <http://deeplearning.net/software/theano/tutorial/aliasing.\
-        html#borrowing-when-creating-shared-variables>`_ for detail.
+    Some shared variable have ``borrow`` as extra kwargs.
+    `See <http://deeplearning.net/software/theano/tutorial/aliasing.\
+    html#borrowing-when-creating-shared-variables>`_ for details.

-    :note: Some shared variable have ``broadcastable`` as extra kwargs.
-        As shared variable shapes can change, all dimensions default
-        to not being broadcastable, even if ``value`` has a shape of 1
-        along some dimension. This parameter allows you to create
-        for example a `row` or `column` 2d tensor.
+    Some shared variable have ``broadcastable`` as extra kwargs. As shared
+    variable shapes can change, all dimensions default to not being
+    broadcastable, even if ``value`` has a shape of 1 along some dimension.
+    This parameter allows you to create for example a `row` or `column` 2d
+    tensor.

     .. attribute:: constructors

-    A list of shared variable constructors that will be tried in reverse
-    order.
+        A list of shared variable constructors that will be tried in reverse
+        order.

     """
...
...
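The constructor dispatch the docstring describes (try registered constructors in reverse order, fall through on `TypeError`) can be sketched as follows; `make_shared` and the toy constructors are illustrative names, not Theano's internals:

```python
# Sketch of shared()'s dispatch over shared.constructors: try the most
# recently registered constructor first, fall back on TypeError.

def make_shared(value, constructors, **kwargs):
    for ctor in reversed(constructors):
        try:
            return ctor(value, **kwargs)
        except TypeError:
            continue                    # this ctor cannot handle the value
    raise TypeError("No suitable SharedVariable constructor for %r" % (value,))

def int_ctor(value):
    if not isinstance(value, int):
        raise TypeError("not an int")
    return ("int_shared", value)

def generic_ctor(value):
    return ("generic_shared", value)

# int_ctor registered last, so it is tried first; generic_ctor is the fallback
constructors = [generic_ctor, int_ctor]
```

This is also why passing kwargs narrows the candidate set: a constructor whose signature rejects a kwarg raises `TypeError` and is skipped.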
@@ -251,6 +271,9 @@ shared.constructors = []

 @shared_constructor
 def generic_constructor(value, name=None, strict=False, allow_downcast=None):
-    """SharedVariable Constructor"""
+    """
+    SharedVariable Constructor.
+    """
     return SharedVariable(type=generic, value=value, name=name, strict=strict,
                           allow_downcast=allow_downcast)