testgroup / pytensor · Commits · c190056b

Commit c190056b, authored Dec 16, 2014 by Frédéric Bastien

    Merge pull request #2342 from abergeron/multi_fixes

    Multi fixes (pieces of the multi-gpu code)

Parents: 567ceab6, 7a5cb506

Showing 17 changed files with 82 additions and 116 deletions.
doc/extending/cop.txt                        +7   -19
doc/extending/type.txt                       +10  -0
theano/compile/ops.py                        +4   -5
theano/gof/cc.py                             +3   -4
theano/gof/graph.py                          +2   -1
theano/gof/op.py                             +9   -15
theano/gof/tests/test_op.py                  +8   -8
theano/sandbox/cuda/opt.py                   +3   -13
theano/sandbox/cuda/type.py                  +5   -0
theano/sandbox/gpuarray/basic_ops.py         +0   -1
theano/sandbox/gpuarray/opt.py               +2   -4
theano/sandbox/gpuarray/tests/test_scan.py   +1   -0
theano/sandbox/gpuarray/type.py              +8   -0
theano/scan_module/scan_op.py                +4   -29
theano/scan_module/scan_opt.py               +3   -6
theano/tensor/basic.py                       +1   -11
theano/tensor/type.py                        +12  -0
doc/extending/cop.txt

@@ -43,9 +43,6 @@ There are less methods to define for an Op than for a Type:
         that a python exception is set) if your C code needs to
         raise an exception.

-    ``sub['struct_id']``
-        The integer id passed to the various _struct methods.
-
 .. method:: c_code_cleanup(node, name, input_names, output_names, sub)

@@ -99,15 +96,12 @@ There are less methods to define for an Op than for a Type:
     module is initialized, before anything else is executed and is
     specialized for a particular apply of an :ref:`op`.

-.. method:: c_init_code_struct(node, struct_id, sub)
+.. method:: c_init_code_struct(node, name, sub)

     Allows you to specify code that will be inserted in the struct
     constructor of the Op. This is for code which should be
     executed once per thunk (Apply node, more or less).

-    `struct_id` is an integer guaranteed to be unique inside the
-    struct.
-
     `sub` is a dictionary of extras parameters to the
     c_code_init_code_struct method. It contains the following
     values:

@@ -131,31 +125,25 @@ There are less methods to define for an Op than for a Type:
     if the code is the same for each apply of an op. It will be
     inserted at global scope.

-.. method:: c_support_code_struct(node, struct_id)
+.. method:: c_support_code_struct(node, name)

     Allows you to specify helper functions of variables that will
     be specific to one particular thunk. These are inserted at
     struct scope.

-    `struct_id` is an integer guaranteed to be unique inside the
-    struct.
-
     :note:
-        You cannot specify kernels in the code returned by this since
-        that isn't supported by CUDA. You should place your kernels
-        in :meth:`c_support_code()` or :meth:`c_support_code_apply()`
-        and call them from this code.
+        You cannot specify CUDA kernels in the code returned by this
+        since that isn't supported by CUDA. You should place your
+        kernels in :meth:`c_support_code()` or
+        :meth:`c_support_code_apply()` and call them from this code.

-.. method:: c_cleanup_code_struct(node, struct_id)
+.. method:: c_cleanup_code_struct(node, name)

     Allows you to specify code that will be inserted in the struct
     destructor of the Op. This is for cleaninp up allocations and
     stuff like this when the thunk is released (when you "free" a
     compiled function using this op).

-    `struct_id` is an integer guaranteed to be unique inside the
-    struct.
-
 .. method:: infer_shape(node, (i0_shapes,i1_shapes,...))

     Allow optimizations to lift the Shape op over this op. An
...
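As a rough illustration of the renamed argument (a sketch, not Theano's actual linker; the `node0`/`node1` names are invented here), the `name` string is interpolated into per-thunk C declarations so that two Apply nodes never collide:

```python
# Sketch only: stand-ins for the c_*_struct hooks documented above.
# The names "node0"/"node1" are hypothetical; the real linker supplies them.

def c_support_code_struct(name):
    # one counter variable per thunk, suffixed with the node's unique name
    return "npy_uint64 counter%s;" % (name,)

def c_init_code_struct(name):
    return "counter%s = 0;" % (name,)

decls = [c_support_code_struct(n) for n in ("node0", "node1")]
print(decls)  # ['npy_uint64 counternode0;', 'npy_uint64 counternode1;']
```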
doc/extending/type.txt

@@ -138,6 +138,16 @@ default values.
     :return: the number of bytes taken by the object described by
         ``shape_info``.

+.. method:: clone(dtype=None, broadcastable=None)
+
+    Optional, for TensorType-alikes.
+
+    Return a copy of the type with a possibly changed value for
+    dtype and broadcastable (if they aren't `None`).
+
+    :param dtype: New dtype for the copy.
+    :param broadcastable: New broadcastable tuple for the copy.
+
 .. method:: may_share_memory(a, b)

     Optional to run, but mandatory for DebugMode. Return True if the Python
...
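A minimal stand-in for the clone() contract documented above (illustrative only, not Theano's actual TensorType):

```python
class ToyType(object):
    """Toy type with the clone() semantics documented above."""

    def __init__(self, dtype, broadcastable):
        self.dtype = dtype
        self.broadcastable = tuple(broadcastable)

    def clone(self, dtype=None, broadcastable=None):
        # arguments left as None keep the current value
        if dtype is None:
            dtype = self.dtype
        if broadcastable is None:
            broadcastable = self.broadcastable
        return self.__class__(dtype, broadcastable)

t = ToyType('float64', (False, True))
t2 = t.clone(dtype='float32')
print(t2.dtype, t2.broadcastable)  # float32 (False, True)
```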
theano/compile/ops.py

@@ -570,7 +570,7 @@ class Rebroadcast(gof.Op):
     def __hash__(self):
         items = sorted(self.axis.iteritems())  # no ambiguity because each item key is unique
-        return hash(type(self)) ^ hash(tuple(items))
+        return hash((type(self), tuple(items)))

     def __str__(self):
         if len(self.axis) == 0:

@@ -586,10 +586,9 @@ class Rebroadcast(gof.Op):
     def make_node(self, x):
         if self.axis.keys() and (x.ndim <= numpy.max(self.axis.keys())):
             raise ValueError('Trying to rebroadcast non-existent dimension')
-        t = x.type.__class__(dtype=x.type.dtype,
-                             broadcastable=[self.axis.get(i, b)
-                                            for i, b in enumerate(
-                                                x.type.broadcastable)])
+        t = x.type.clone(broadcastable=[self.axis.get(i, b)
+                                        for i, b in enumerate(
+                                            x.type.broadcastable)])
         return gof.Apply(self, [x], [t()])

     def perform(self, node, inp, out_):
...
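The `__hash__` change above swaps XOR-combining for hashing a tuple. A small aside on why that is generally safer (an illustration, not taken from the patch): XOR is order-insensitive and self-cancelling, whereas a tuple hash preserves position:

```python
# XOR mixes hashes symmetrically, so different orderings collide and
# identical components cancel to zero; a tuple hash does neither.
assert hash(1) ^ hash(2) == hash(2) ^ hash(1)  # order is lost under XOR
assert hash(1) ^ hash(1) == 0                  # equal components cancel
assert hash((1, 2)) != hash((2, 1))            # tuple hash keeps order
print("xor collisions demonstrated")
```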
theano/gof/cc.py

@@ -658,7 +658,6 @@ class CLinker(link.Linker):
         # Make the CodeBlock for c_code
         sub['id'] = id
-        sub['struct_id'] = id + 1
         sub['fail'] = failure_code(sub)

         sub_struct = dict()

@@ -692,7 +691,7 @@ class CLinker(link.Linker):
                 " didn't return a string for c_init_code_apply")
         try:
-            struct_init = op.c_init_code_struct(node, id + 1, sub_struct)
+            struct_init = op.c_init_code_struct(node, name, sub_struct)
             assert isinstance(struct_init, basestring), (
                 str(node.op) +
                 " didn't return a string for c_init_code_struct")

@@ -700,7 +699,7 @@ class CLinker(link.Linker):
             pass
         try:
-            struct_support = op.c_support_code_struct(node, id + 1)
+            struct_support = op.c_support_code_struct(node, name)
             assert isinstance(struct_support, basestring), (
                 str(node.op) +
                 " didn't return a string for c_support_code_struct")

@@ -708,7 +707,7 @@ class CLinker(link.Linker):
             pass
         try:
-            struct_cleanup = op.c_cleanup_code_struct(node, id + 1)
+            struct_cleanup = op.c_cleanup_code_struct(node, name)
             assert isinstance(struct_cleanup, basestring), (
                 str(node.op) +
                 " didn't return a string for c_cleanup_code_struct")
...
theano/gof/graph.py

@@ -184,7 +184,8 @@ class Apply(Node):
         :note:
             tags are copied from self to the returned instance.
         """
-        cp = self.__class__(self.op, self.inputs, [output.clone() for output in self.outputs])
+        cp = self.__class__(self.op, self.inputs,
+                            [output.clone() for output in self.outputs])
         cp.tag = copy(self.tag)
         return cp
...
theano/gof/op.py

@@ -322,17 +322,15 @@ class CLinkerOp(CLinkerObject):
         raise utils.MethodNotDefined("c_init_code_apply", type(self),
                                      self.__class__.__name__)

-    def c_init_code_struct(self, node, struct_id, sub):
+    def c_init_code_struct(self, node, name, sub):
         """
         Optional: return a code string specific to the apply
         to be inserted in the struct initialization code.

         :param node: an Apply instance in the graph being compiled
-        :param struct_id: a number that serves to uniquely identify
-                          this code. The c_code will receive another
-                          sub parameter named struct_id that will
-                          contain this name.
+        :param name: a unique name to distinguish you variables from
+                     those of other nodes.
         :param sub: a dictionary of values to substitute in the code.
                     Most notably it contains a 'fail' entry that you

@@ -345,17 +343,15 @@ class CLinkerOp(CLinkerObject):
         raise utils.MethodNotDefined("c_init_code_apply", type(self),
                                      self.__class__.__name__)

-    def c_support_code_struct(self, node, struct_id):
+    def c_support_code_struct(self, node, name):
         """Optional: Return utility code for use by an `Op` that will be
         inserted at struct scope, that can be specialized for the
         support of a particular `Apply` node.

         :param node: an Apply instance in the graph being compiled
-        :param struct_id: a number that serves to uniquely identify
-                          this code. The c_code will receive another
-                          sub parameter named struct_id that will
-                          contain this name.
+        :param name: a unique name to distinguish you variables from
+                     those of other nodes.

         :Exceptions:
          - `MethodNotDefined`: Subclass does not implement this method

@@ -364,17 +360,15 @@ class CLinkerOp(CLinkerObject):
         raise utils.MethodNotDefined("c_support_code_struct",
                                      type(self), self.__class__.__name__)

-    def c_cleanup_code_struct(self, node, struct_id):
+    def c_cleanup_code_struct(self, node, name):
         """
         Optional: return a code string specific to the apply to be
         inserted in the struct cleanup code.

         :param node: an Apply instance in the graph being compiled
-        :param struct_id: a number that serves to uniquely identify
-                          this code. The c_code will receive another
-                          sub parameter named struct_id that will
-                          contain this name.
+        :param name: a unique name to distinguish you variables from
+                     those of other nodes.

         :Exceptions:
          - `MethodNotDefined`: the subclass does not override this method
...
theano/gof/tests/test_op.py

@@ -94,20 +94,20 @@ class StructOp(Op):
     def make_node(self, i):
         return Apply(self, [i], [scalar.uint64()])

-    def c_support_code_struct(self, node, struct_id):
-        return "npy_uint64 counter%d;" % (struct_id,)
+    def c_support_code_struct(self, node, name):
+        return "npy_uint64 counter%s;" % (name,)

-    def c_init_code_struct(self, node, struct_id, sub):
-        return "counter%d = 0;" % (struct_id,)
+    def c_init_code_struct(self, node, name, sub):
+        return "counter%s = 0;" % (name,)

     def c_code(self, node, name, input_names, outputs_names, sub):
         return """
-%(out)s = counter%(sid)s;
-counter%(sid)s++;
-""" % dict(out=outputs_names[0], sid=sub['struct_id'])
+%(out)s = counter%(name)s;
+counter%(name)s++;
+""" % dict(out=outputs_names[0], name=name)

     def c_code_cache_version(self):
-        return (0,)
+        return (1,)


 class TestOp:
...
theano/sandbox/cuda/opt.py

@@ -2000,12 +2000,6 @@ def local_gpu_extract_diagonal(node):
                 gpu_from_host(diag_node.inputs[0]))]
     return False

-def typeConstructor(broadcastable, dtype):
-    if dtype == 'float32':
-        return CudaNdarrayType(broadcastable=broadcastable)
-    else:
-        return tensor.TensorType(broadcastable=broadcastable, dtype=dtype)
-
 @register_opt('scan')
 @local_optimizer([gpu_from_host, scan_op.Scan])
 def gpuScanOptimization(node):

@@ -2065,9 +2059,7 @@ def gpuScanOptimization(node):
             nw_op = scan_op.Scan(scan_ins,
                                  scan_outs,
-                                 info,
-                                 typeConstructor=typeConstructor
-                                 ).make_node(*nw_ins)
+                                 info).make_node(*nw_ins)
             _outputs = nw_op.outputs
             return _outputs

@@ -2113,8 +2105,7 @@ def gpuScanOptimization(node):
             _outputs = scan_op.Scan(
                 scan_ins,
                 scan_outs,
-                info,
-                typeConstructor=typeConstructor).make_node(*nw_ins).outputs
+                info).make_node(*nw_ins).outputs
             outputs = []
             for x, y in zip(_outputs, node.outputs):
                 if isinstance(y.type, CudaNdarrayType):

@@ -2126,8 +2117,7 @@ def gpuScanOptimization(node):
 optdb.register('gpu_scanOp_make_inplace',
-               scan_opt.ScanInplaceOptimizer(typeConstructor=typeConstructor,
-                                             gpu_flag=True),
+               scan_opt.ScanInplaceOptimizer(gpu_flag=True),
                75,
                'gpu',
                'fast_run',
...
theano/sandbox/cuda/type.py

@@ -71,6 +71,11 @@ class CudaNdarrayType(Type):
         self.name = name
         self.dtype_specs()  # error checking is done there

+    def clone(self, dtype=None, broadcastable=None):
+        if broadcastable is None:
+            broadcastable = self.broadcastable
+        return self.__class__(broadcastable, name=self.name, dtype=dtype)
+
     def filter(self, data, strict=False, allow_downcast=None):
         return self.filter_inplace(data, None, strict=strict,
                                    allow_downcast=allow_downcast)
...
theano/sandbox/gpuarray/basic_ops.py

@@ -607,7 +607,6 @@ class GpuAlloc(HideC, Alloc):
     def __init__(self, memset_0=False):
         """memset_0 is only an optimized version. True, it mean the
         value is always 0, so the c code call memset as it is faster.
         """
         self.memset_0 = memset_0
...
theano/sandbox/gpuarray/opt.py

@@ -716,13 +716,11 @@ def local_scan_to_gpua(node):
     _cmodule_key = gof.CLinker().cmodule_key_(local_fgraph, [])
     info['gpu_hash'] = hash(_cmodule_key)

-    nw_op = scan_op.Scan(scan_ins, scan_outs, info,
-                         typeConstructor=GpuArrayType).make_node(*nw_ins)
+    nw_op = scan_op.Scan(scan_ins, scan_outs, info).make_node(*nw_ins)
     return nw_op.outputs

 optdb.register('gpua_scanOp_make_inplace',
-               scan_opt.ScanInplaceOptimizer(typeConstructor=GpuArrayType,
-                                             gpua_flag=True),
+               scan_opt.ScanInplaceOptimizer(gpua_flag=True),
                75,
                'gpua',
                'fast_run',
...
...
theano/sandbox/gpuarray/tests/test_scan.py
浏览文件 @
c190056b
...
@@ -15,6 +15,7 @@ from theano.sandbox.gpuarray.tests.test_basic_ops import mode_with_gpu
...
@@ -15,6 +15,7 @@ from theano.sandbox.gpuarray.tests.test_basic_ops import mode_with_gpu
class
T_Scan
(
TestCase
):
class
T_Scan
(
TestCase
):
def
setUp
(
self
):
def
setUp
(
self
):
utt
.
seed_rng
()
utt
.
seed_rng
()
super
(
T_Scan
,
self
)
.
setUp
()
def
test_one_sequence_one_output_weights_gpu1
(
self
):
def
test_one_sequence_one_output_weights_gpu1
(
self
):
def
f_rnn
(
u_t
,
x_tm1
,
W_in
,
W
):
def
f_rnn
(
u_t
,
x_tm1
,
W_in
,
W
):
...
...
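The added super(...).setUp() call follows the usual cooperative-setUp pattern: without it, a parent class's setup is silently skipped. A generic sketch (plain classes, not the actual test case):

```python
class Base(object):
    def setUp(self):
        self.ready = True  # parent-level initialization

class Child(Base):
    def setUp(self):
        # local setup would go here (e.g. seeding an RNG),
        # then defer to the parent so its setup still runs
        super(Child, self).setUp()

c = Child()
c.setUp()
print(c.ready)  # True
```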
theano/sandbox/gpuarray/type.py

@@ -28,6 +28,14 @@ class GpuArrayType(Type):
             raise TypeError("Unsupported dtype for %s: %s" %
                             (self.__class__.__name__, self.dtype))

+    def clone(self, dtype=None, broadcastable=None):
+        if dtype is None:
+            dtype = self.dtype
+        if broadcastable is None:
+            broadcastable = self.broadcastable
+        return self.__class__(dtype=dtype, broadcastable=broadcastable,
+                              name=self.name)
+
     def __str__(self):
         return "GpuArrayType(%s, %s)" % (self.dtype, self.broadcastable)
...
theano/scan_module/scan_op.py

@@ -47,7 +47,6 @@ class Scan(PureOp):
                  inputs,
                  outputs,
                  info,
-                 typeConstructor=None,
                  ):
         """
         :param inputs: inputs of the inner function of scan

@@ -56,21 +55,6 @@ class Scan(PureOp):
                      the scan op (like number of different types of
                      arguments, name, mode, if it should run on GPU or
                      not, etc.)
-        :param typeConstructor: function that constructs an equivalent
-                                to Theano TensorType
-
-        Note: ``typeConstructor`` had been added to refactor how
-        Theano deals with the GPU. If it runs on the GPU, scan needs
-        to construct certain outputs (those who reside in the GPU
-        memory) as the GPU-specific type. However we can not import
-        gpu code in this file (as it is in sandbox, and not available
-        on each machine) so the workaround is that the GPU
-        optimization passes to the constructor of this class a
-        function that is able to construct a GPU type. This way the
-        class Scan does not need to be aware of the details for the
-        GPU, it just constructs any tensor using this function (which
-        by default constructs normal tensors).
         """
         if 'gpua' not in info:
             info['gpua'] = False

@@ -87,19 +71,13 @@ class Scan(PureOp):
         self.output_types = []
         idx = 0
         jdx = 0
-        tensorConstructor = lambda broadcastable, dtype: TensorType(
-            broadcastable=broadcastable, dtype=dtype)
-        if typeConstructor is None:
-            typeConstructor = tensorConstructor
-
         while idx < self.n_mit_mot_outs:
             # Not that for mit_mot there are several output slices per
             # output sequence
             o = outputs[idx]
             self.output_types.append(
-                typeConstructor(
-                    broadcastable=(False,) + o.type.broadcastable,
-                    dtype=o.type.dtype))
+                o.type.clone(
+                    broadcastable=(False,) + o.type.broadcastable))
             idx += len(self.mit_mot_out_slices[jdx])
             jdx += 1

@@ -109,9 +87,7 @@ class Scan(PureOp):
         for o in outputs[idx:end]:
             self.output_types.append(
-                typeConstructor(
-                    broadcastable=(False,) + o.type.broadcastable,
-                    dtype=o.type.dtype))
+                o.type.clone(
+                    broadcastable=(False,) + o.type.broadcastable))

         # shared outputs + possibly the ending condition
         for o in outputs[end:]:

@@ -232,10 +208,9 @@ class Scan(PureOp):
         if rval.ndim == as_var.ndim:
             rval = as_var.type.filter_variable(rval)
         else:
-            tmp = as_var.type.__class__(
-                broadcastable=tuple(var.broadcastable[:1]) + \
-                              tuple(as_var.broadcastable),
-                dtype=as_var.dtype)
+            tmp = as_var.type.clone(
+                broadcastable=tuple(var.broadcastable[:1]) + \
+                              tuple(as_var.broadcastable))
             rval = tmp.filter_variable(rval)
         return rval
...
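The refactor above replaces the injected `typeConstructor` callback with `o.type.clone(...)`. A toy illustration of why this works (hypothetical classes, not Theano's): clone() dispatches on the actual type of the inner output, so a GPU subclass is preserved without Scan importing any GPU code:

```python
class CPUType(object):
    def __init__(self, broadcastable):
        self.broadcastable = tuple(broadcastable)

    def clone(self, broadcastable=None):
        # self.__class__ keeps the subclass, whatever it is
        if broadcastable is None:
            broadcastable = self.broadcastable
        return self.__class__(broadcastable)

class GPUType(CPUType):
    """Hypothetical GPU-resident variant."""

o = GPUType((False,))
# Scan prepends a time dimension to each output's broadcastable pattern:
out = o.clone(broadcastable=(False,) + o.broadcastable)
print(type(out).__name__, out.broadcastable)  # GPUType (False, False)
```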
theano/scan_module/scan_opt.py

@@ -916,9 +916,8 @@ class PushOutScanOutput(gof.Optimizer):
 class ScanInplaceOptimizer(Optimizer):
     """Graph optimizer for Scan(makes it run inplace)"""
-    def __init__(self, typeConstructor=None, gpu_flag=False, gpua_flag=False):
+    def __init__(self, gpu_flag=False, gpua_flag=False):
         Optimizer.__init__(self)
-        self.typeConstructor = typeConstructor
         self.gpu_flag = gpu_flag
         self.gpua_flag = gpua_flag

@@ -960,8 +959,7 @@ class ScanInplaceOptimizer(Optimizer):
         inputs = ls_begin + ls + ls_end
         new_op = scan_op.Scan(op.inputs,
                               op.outputs,
-                              info,
-                              typeConstructor=self.typeConstructor)
+                              info)
         # Do not call make_node for test_value
         new_outs = new_op(*inputs, **dict(return_list=True))

@@ -2086,8 +2084,7 @@ scan_eqopt2 = theano.gof.EquilibriumDB()
 optdb.register('scan_eqopt1', scan_eqopt1, .1, 'fast_run', 'scan')
 optdb.register('scan_eqopt2', scan_eqopt2, 1.6, 'fast_run', 'scan')
 optdb.register('scanOp_make_inplace',
-               ScanInplaceOptimizer(typeConstructor=None,
-                                    gpu_flag=False),
+               ScanInplaceOptimizer(),
                75,
                'fast_run',
                'inplace',
...
theano/tensor/basic.py

@@ -2403,17 +2403,7 @@ class Alloc(gof.Op):
     This Op is used to replace fill() during optimizations because after shapes
     are lifted, the first argument to fill can often be pruned from the graph.
     """
-    def __init__(self):
-        pass
-
-    def __eq__(self, other):
-        return type(self) == type(other)
-
-    def __hash__(self):
-        return hash(type(self))
-
-    def __str__(self):
-        return self.__class__.__name__
+    __props__ = ()

     def make_node(self, value, *shape):
         v = as_tensor_variable(value)
...
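Replacing the hand-written `__eq__`/`__hash__`/`__str__` with `__props__ = ()` relies on props-based machinery in Theano's Op base class. A simplified sketch of what that machinery does (an approximation written for this note, not Theano's actual implementation):

```python
class PropsMixin(object):
    """Approximation of props-based equality: ops compare equal when they
    are the same class and agree on every attribute named in __props__."""
    __props__ = ()

    def _props(self):
        return tuple(getattr(self, p) for p in self.__props__)

    def __eq__(self, other):
        return type(self) == type(other) and self._props() == other._props()

    def __hash__(self):
        return hash((type(self), self._props()))

    def __str__(self):
        return self.__class__.__name__

class Alloc(PropsMixin):
    __props__ = ()  # stateless: all instances are interchangeable

print(Alloc() == Alloc(), str(Alloc()))  # True Alloc
```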
theano/tensor/type.py

@@ -52,6 +52,18 @@ class TensorType(Type):
                 " AdvancedSubtensor1 sparse_grad. Now use"
                 " theano.sparse_grad(a_tensor[an_int_vector]).")

+    def clone(self, dtype=None, broadcastable=None):
+        """
+        Return a copy of the type optionally with a new dtype or
+        broadcastable pattern.
+        """
+        if dtype is None:
+            dtype = self.dtype
+        if broadcastable is None:
+            broadcastable = self.broadcastable
+        return self.__class__(dtype, broadcastable, name=self.name,
+                              sparse_grad=self.sparse_grad)
+
     def filter(self, data, strict=False, allow_downcast=None):
         """Convert `data` to something which can be associated to a
         `TensorVariable`.
...