testgroup / pytensor · Commits · d1eba87d

Commit d1eba87d, authored Aug 19, 2015 by abergeron

Merge pull request #3294 from harlouci/numpydoc_sandbox_1

Numpydoc sandbox 1

Parents: 477fd7cf 88716ac9
Showing 12 changed files with 815 additions and 482 deletions (+815 −482)
theano/sandbox/cuda/basic_ops.py      +232 −80
theano/sandbox/cuda/blas.py           +198 −131
theano/sandbox/cuda/blocksparse.py    +27 −15
theano/sandbox/cuda/cula.py           +6 −2
theano/sandbox/fourier.py             +15 −10
theano/sandbox/multinomial.py         +7 −2
theano/sandbox/neighbourhoods.py      +54 −52
theano/sandbox/rng_mrg.py             +88 −68
theano/sandbox/scan.py                +42 −31
theano/sandbox/solve.py               +4 −2
theano/sandbox/test_rng_mrg.py        +30 −15
theano/sandbox/theano_object.py       +112 −74
theano/sandbox/cuda/basic_ops.py

@@ -59,7 +59,9 @@ def as_cuda_array(obj):
 class HostFromGpu(GpuOp):
     """
     Implement the transfer from gpu to the cpu.
     """
     check_input = False

     def __eq__(self, other):
@@ -118,7 +120,9 @@ host_from_gpu = HostFromGpu()
 class GpuFromHost(GpuOp):
     """
     Implement the transfer from cpu to the gpu.
     """
     check_input = False

     def __eq__(self, other):
@@ -185,7 +189,9 @@ gpu_from_host = GpuFromHost()
 class GpuElemwise(GpuOp):
     """
     Implement a generic elemwise on the gpu.
     """
     nin = property(lambda self: self.scalar_op.nin)
     nout = property(lambda self: self.scalar_op.nout)
@@ -316,7 +322,9 @@ class GpuElemwise(GpuOp):
 class GpuDimShuffle(GpuOp):
     """
     Implement DimShuffle on the gpu.
     """
     check_broadcast = False

     def __init__(self, input_broadcastable, new_order):
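The `new_order` semantics of DimShuffle (permute axes, with 'x' marking an inserted broadcastable axis) can be sketched with plain NumPy. This is an illustration only, assuming a full permutation plus 'x' entries; `dimshuffle` here is a hypothetical helper, not code from this file:

```python
import numpy as np

def dimshuffle(a, new_order):
    # Permute the existing axes, then insert length-1 (broadcastable)
    # axes wherever new_order contains 'x'.
    kept = [i for i in new_order if i != 'x']
    a = a.transpose(kept)
    out_shape = []
    it = iter(a.shape)
    for o in new_order:
        out_shape.append(1 if o == 'x' else next(it))
    return a.reshape(out_shape)

m = np.zeros((2, 3))
assert dimshuffle(m, (1, 'x', 0)).shape == (3, 1, 2)
```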
@@ -523,39 +531,47 @@ class GpuDimShuffle(GpuOp):
 class GpuCAReduce(GpuOp):
-    """GpuCAReduce is a Reduction along some dimensions by a scalar op.
+    """
+    GpuCAReduce is a Reduction along some dimensions by a scalar op.
     The dimensions along which to reduce is specified by the
     `reduce_mask` that you pass to the constructor. The `reduce_mask`
     is a tuple of booleans (actually integers 0 or 1) that specify for
     each input dimension, whether to reduce it (1) or not (0).
-    For example, when scalar_op is a theano.scalar.basic.Add instance:
-      - reduce_mask == (1,) sums a vector to a scalar
-      - reduce_mask == (1,0) computes the sum of each column in a matrix
-      - reduce_mask == (0,1) computes the sum of each row in a matrix
-      - reduce_mask == (1,1,1) computes the sum of all elements in a 3-tensor.
-    :note: any reduce_mask of all zeros is a sort of 'copy', and may
-           be removed during graph optimization
+    Parameters
+    ----------
+    pre_scalar_op
+        If present, must be a scalar op with only 1 input.
+        We will execute it on the input value before reduction.
+    Notes
+    -----
     This Op is a work in progress.
     This op was recently upgraded from just GpuSum a general CAReduce. Not
     many code cases are supported for scalar_op being anything other than
     scal.Add instances yet.
     Important note: if you implement new cases for this op, be sure to
     benchmark them and make sure that they actually result in a speedup.
     GPUs are not especially well-suited to reduction operations so it is
     quite possible that the GPU might be slower for some cases.
-    pre_scalar_op: if present, must be a scalar op with only 1
-    input. We will execute it on the input value before reduction.
+    Examples
+    --------
+    When scalar_op is a theano.scalar.basic.Add instance:
+      - reduce_mask == (1,) sums a vector to a scalar
+      - reduce_mask == (1,0) computes the sum of each column in a matrix
+      - reduce_mask == (0,1) computes the sum of each row in a matrix
+      - reduce_mask == (1,1,1) computes the sum of all elements in a 3-tensor.
+    .. note:: Any reduce_mask of all zeros is a sort of 'copy', and may
+       be removed during graph optimization.
     """
@@ -620,8 +636,11 @@ class GpuCAReduce(GpuOp):
         """

     def supports_c_code(self, inputs):
-        """ Returns True if the current op and reduce pattern
-        has functioning C code """
+        """
+        Returns True if the current op and reduce pattern has functioning C
+        code.
+        """
         # If we don't even have the right method, we certainly
         # don't support the C code
@@ -781,9 +800,10 @@ class GpuCAReduce(GpuOp):
         return sio.getvalue()

     def _makecall(self, node, name, x, z, fail, pattern=None):
-        """Return a string for making a kernel call.
+        """
+        Return a string for making a kernel call.
         The return value looks something like:
         .. code-block:: c
@@ -806,6 +826,7 @@ class GpuCAReduce(GpuOp):
                 PyErr_Format(PyExc_RuntimeError, "Cuda error: ... );
                 %(fail)s;
             }
         """
         sio = StringIO()
         if pattern is None:
@@ -874,7 +895,8 @@ class GpuCAReduce(GpuOp):
     def _k_decl(self, node, nodename, pattern=None,
                 ndim=None, reduce_mask=None):
-        """Return a string to declare a kernel function
+        """
+        Return a string to declare a kernel function.
         The result will look something like this:
@@ -953,6 +975,7 @@ class GpuCAReduce(GpuOp):
         Otherwise, check that the scalar op is maximum or minimum
         and return first_item. It should be the first element of the reduction.
         As the maximum and minimum of the same value don't change, this work.
         """
         if hasattr(self.scalar_op, 'identity'):
             return str(self.scalar_op.identity)
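The fallback this hunk documents (use the scalar op's identity when it has one, otherwise seed max/min reductions with the first element, since max(v, v) == min(v, v) == v) can be sketched in plain Python. Names here are illustrative, not from this file:

```python
def reduce_init(identity, first_item):
    # Ops like add and mul carry an identity element (0, 1).
    # max and min have no identity over floats, so the reduction is
    # seeded with its first element instead, which is safe because
    # combining a value with itself under max/min leaves it unchanged.
    if identity is not None:
        return identity
    return first_item

assert reduce_init(0, 7.5) == 0      # sum starts at its identity
assert reduce_init(None, 7.5) == 7.5  # max/min start at the first element
```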
@@ -980,16 +1003,27 @@ class GpuCAReduce(GpuOp):
     def _assign_reduce(self, node, name, left, right, sub, pre):
         """
-        node: the node argument to this op's c_code
-        name: the name argument to this op's c_code
-        left: a C code string identifying an lvalue
-        right: a C code string identifying an expression
-        sub: the sub argument to this op's c_code
-        pre: If True, we will add the pre_scalar_op.c_code
-        returns C code to reduce left and right, assigning the
-        result to left."""
+        Parameters
+        ----------
+        node
+            The node argument to this op's c_code.
+        name
+            The name argument to this op's c_code.
+        left
+            A C code string identifying an lvalue.
+        right
+            A C code string identifying an expression.
+        sub
+            The sub argument to this op's c_code.
+        pre
+            If True, we will add the pre_scalar_op.c_code.
+        Returns
+        -------
+        str
+            C code to reduce left and right, assigning the result to left.
+        """
         x, = node.inputs
         dtype = x.dtype
@@ -1019,8 +1053,11 @@ class GpuCAReduce(GpuOp):
         """
         WRITEME
+        Parameters
+        ----------
         node, name, sub: these should be passed through from the original
         call to c_code
         """
         # This code (the code in new_version) is currently ignored.
@@ -1158,9 +1195,11 @@ class GpuCAReduce(GpuOp):
     def c_code_reduce_ccontig(self, sio, node, name, x, z, fail):
         """
         WRITEME
         IG: I believe, based on how this is called in c_code, that it
         is for the case where we are reducing on all axes and x is
         C contiguous.
         """
         if getattr(self.scalar_op, 'identity', None) == 0:
             zero_shp = "cudaMemset(%(z)s->devdata, 0, CudaNdarray_SIZE(%(z)s) * sizeof(float))" % locals()
@@ -1243,8 +1282,14 @@ class GpuCAReduce(GpuOp):
     def c_code_reduce_01X(self, sio, node, name, x, z, fail, N):
         """
-        :param N: the number of 1 in the pattern N=1 -> 01, N=2 -> 011 N=3 ->0111
-        Work for N=1,2,3
+        Parameters
+        ----------
+        N : int
+            The number of 1 in the pattern
+            N=1 -> 01, N=2 -> 011 N=3 -> 0111
+            Works for N=1,2,3.
         """
         assert N in [1, 2, 3]
@@ -2395,7 +2440,9 @@ class GpuCAReduce(GpuOp):
 class GpuReshape(tensor.Reshape, GpuOp):
     """
     Implement Reshape on the gpu.
     """
     # __hash__, __eq__, __str__ come from tensor.Subtensor

     def make_node(self, x, shp):
         host_reshaped = host_from_gpu(x).reshape(shp, ndim=self.ndim)
@@ -2541,7 +2588,9 @@ class GpuReshape(tensor.Reshape, GpuOp):
 class GpuSubtensor(GpuOp, tensor.Subtensor):
     """
     Implement subtensor on the gpu.
     """
     check_broadcast = False
     # __hash__, __eq__, __str__ come from tensor.Subtensor
@@ -2647,7 +2696,9 @@ class GpuSubtensor(GpuOp, tensor.Subtensor):
 class GpuAdvancedSubtensor1(tensor.AdvancedSubtensor1, GpuOp):
     """
     Implement AdvancedSubtensor1 on the gpu.
     """
     # If True or False, we assert that we use the take version or not
     # If None, we choose the best one applicable
     perform_using_take = None
@@ -2762,7 +2813,9 @@ class GpuAdvancedSubtensor1(tensor.AdvancedSubtensor1, GpuOp):
 class GpuAdvancedIncSubtensor1(tensor.AdvancedIncSubtensor1, GpuOp):
     """
     Implement AdvancedIncSubtensor1 on the gpu.
     """

     def make_node(self, x, y, ilist):
         x_ = as_cuda_ndarray_variable(x)
         y_ = as_cuda_ndarray_variable(y)
@@ -2936,13 +2989,17 @@ class GpuAdvancedIncSubtensor1(tensor.AdvancedIncSubtensor1, GpuOp):
 class GpuAdvancedIncSubtensor1_dev20(GpuAdvancedIncSubtensor1):
-    """Implement AdvancedIncSubtensor1 on the gpu, but use function
+    """
+    Implement AdvancedIncSubtensor1 on the gpu, but use function
     only avail on compute capability 2.0 and more recent.
     """

     def make_node(self, x, y, ilist):
-        """It defer from GpuAdvancedIncSubtensor1 in that it make sure
+        """
+        It defer from GpuAdvancedIncSubtensor1 in that it make sure
         the index are of type long.
         """
         x_ = as_cuda_ndarray_variable(x)
         y_ = as_cuda_ndarray_variable(y)
@@ -3132,11 +3189,14 @@ class GpuIncSubtensor(tensor.IncSubtensor, GpuOp):
     """
     Implement IncSubtensor on the gpu.
-    Note: The optimization to make this inplace is in tensor/opt.
+    Notes
+    -----
+    The optimization to make this inplace is in tensor/opt.
     The same optimization handles IncSubtensor and GpuIncSubtensor.
     This Op has c_code too; it inherits tensor.IncSubtensor's c_code.
     The helper methods like do_type_checking, copy_of_x, etc. specialize
     the c_code for this Op.
     """

     def make_node(self, x, y, *inputs):
@@ -3146,22 +3206,32 @@ class GpuIncSubtensor(tensor.IncSubtensor, GpuOp):
         return Apply(self, [x, y] + rval.inputs[2:], [x.type()])

     def do_type_checking(self, node):
-        """ Should raise NotImplementedError if c_code does not support
+        """
+        Should raise NotImplementedError if c_code does not support
         the types involved in this node.
         """
         if not isinstance(node.inputs[0].type, CudaNdarrayType):
             raise NotImplementedError()

     def copy_of_x(self, x):
         """
-        :param x: a string giving the name of a C variable
-            pointing to an array
-        :return: C code expression to make a copy of x
+        Parameters
+        ----------
+        x : str
+            A string giving the name of a C variable pointing to an array.
+        Returns
+        -------
+        str
+            C code expression to make a copy of x.
+        Notes
+        -----
         Base class uses `PyArrayObject *`, subclasses may override for
         different types of arrays.
         """
         return """(CudaNdarray*) CudaNdarray_Copy(%(x)s)""" % locals()
@@ -3170,12 +3240,16 @@ class GpuIncSubtensor(tensor.IncSubtensor, GpuOp):
     def make_view_array(self, x, view_ndim):
         """
-        :param x: a string identifying an array to be viewed
-        :param view_ndim: a string specifying the number of dimensions
-            to have in the view
+        Parameters
+        ----------
+        x : str
+            A string identifying an array to be viewed.
+        view_ndim : str
+            A string specifying the number of dimensions to have in the view.
         This doesn't need to actually set up the view with the
         right indexing; we'll do that manually later.
         """
         ret = """zview = (CudaNdarray*) CudaNdarray_New(%(view_ndim)s);
         if (CudaNdarray_set_device_data(
@@ -3201,18 +3275,29 @@ class GpuIncSubtensor(tensor.IncSubtensor, GpuOp):
         return ret

     def get_helper_c_code_args(self):
-        """ Return a dictionary of arguments to use with helper_c_code"""
+        """
+        Return a dictionary of arguments to use with helper_c_code.
+        """
         return {'c_prefix': 'CudaNdarray',
                 'strides_mul': 4
                 }

     def copy_into(self, view, source):
         """
-        view: string, C code expression for an array
-        source: string, C code expression for an array
-        returns a C code expression to copy source into view, and
-        return 0 on success
+        Parameters
+        ----------
+        view : str
+            C code expression for an array.
+        source : str
+            C code expression for an array.
+        Returns
+        -------
+        str
+            A C code expression to copy source into view, and 0 on success.
         """
         # On the CPU it unbroadcast based on the run time shapes. We
         # need the same behavior on the GPU.
@@ -3245,7 +3330,9 @@ class GpuIncSubtensor(tensor.IncSubtensor, GpuOp):
 class GpuFlatten(gof.HideC, tensor.Flatten, GpuOp):
     """
     Implement Flatten on the gpu.
     """

     def make_node(self, x):
         assert isinstance(x.type, CudaNdarrayType)
         rval = tensor.Flatten.make_node(self, x)
@@ -3257,7 +3344,9 @@ class GpuFlatten(gof.HideC, tensor.Flatten, GpuOp):
 class GpuShape(tensor.Shape, GpuOp):
     """
     Implement Shape on the gpu.
     """

     def make_node(self, x):
         return Apply(self, [x], [tensor.lvector()])
 gpu_shape = GpuShape()
@@ -3266,7 +3355,9 @@ gpu_shape = GpuShape()
 class GpuJoin(tensor.Join, GpuOp):
     """
     Implement Join on the gpu.
     """

     def make_node(self, *axis_and_tensors):
         axis, tensors = axis_and_tensors[0], axis_and_tensors[1:]
         if not tensors:
@@ -3516,7 +3607,11 @@ class GpuSplit(tensor.Split, GpuOp):
 class GpuAllocEmpty(GpuOp):
-    """Implement Alloc on the gpu, but without initializing memory."""
+    """
+    Implement Alloc on the gpu, but without initializing memory.
+    """
     __props__ = ()

     @staticmethod
@@ -3595,12 +3690,14 @@ gpu_alloc_empty = GpuAllocEmpty()
 class GpuAlloc(GpuAllocEmpty):
-    """Implement Alloc on the gpu.
+    """
+    Implement Alloc on the gpu.
     The memset_0 param is an optimization. When True, we call
     cudaMemset that is faster.
     """
     __props__ = ('memset_0',)

     def __init__(self, memset_0=False):
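The distinction these two Ops document has a direct NumPy analogue, shown here as an illustration only (the real Ops allocate device memory and, for `memset_0`, call cudaMemset):

```python
import numpy as np

# GpuAllocEmpty ~ np.empty: reserve memory, contents are undefined,
# so it is only safe when every element will be overwritten later.
buf = np.empty((2, 3), dtype='float32')

# GpuAlloc(memset_0=True) ~ np.zeros: when the fill value is 0, one
# bulk memset over the buffer is cheaper than writing the value
# element by element.
zeros = np.zeros((2, 3), dtype='float32')
assert zeros.sum() == 0.0
```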
@@ -3706,9 +3803,12 @@ gpu_alloc = GpuAlloc()
 class CopyOnNegativeStrides(GpuOp):
     """
-    Checks if the input has contains negative strides. If it
-    does, returns a c contiguous copy.
+    Checks if the input has contains negative strides.
+    If it does, returns a c contiguous copy.
     """
     view_map = {0: [0]}
     check_input = False
     __props__ = ()
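What "negative strides" means can be seen with NumPy, whose stride model CudaNdarray mirrors; reversing a view flips the stride sign, and making a C-contiguous copy restores it. An analogy only, not code from this Op:

```python
import numpy as np

x = np.arange(4, dtype='float32')
rev = x[::-1]                       # a view with a negative stride
assert rev.strides[0] < 0

# What this Op would do for such an input: return a C-contiguous copy.
copied = np.ascontiguousarray(rev)
assert copied.strides[0] > 0 and copied.flags['C_CONTIGUOUS']
assert copied[0] == 3.0
```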
@@ -3781,7 +3881,9 @@ class GpuContiguous(GpuOp):
     """
     Always return a c contiguous output. Copy the input only if it is
     not already c contiguous.
     """
     view_map = {0: [0]}
     check_input = False
@@ -3855,9 +3957,16 @@ gpu_contiguous = GpuContiguous()
 # Those are predifined CudaNdarrayType as done in tensor.basic
 # Useful mostly for test as the gpu op are inserted automatically...
 def scalar(name=None, dtype=None):
-    """Return a symbolic scalar variable.
-    :param dtype: numeric type (None means to use theano.config.floatX)
-    :param name: a name to attach to this variable
+    """
+    Return a symbolic scalar variable.
+    Parameters
+    ----------
+    dtype
+        Numeric type (None means to use theano.config.floatX).
+    name : str
+        A name to attach to this variable.
     """
     if dtype is None:
         dtype = config.floatX
@@ -3867,9 +3976,16 @@ fscalar = CudaNdarrayType(dtype='float32', broadcastable=())
 def vector(name=None, dtype=None):
-    """Return a symbolic vector variable.
-    :param dtype: numeric type (None means to use theano.config.floatX)
-    :param name: a name to attach to this variable
+    """
+    Return a symbolic vector variable.
+    Parameters
+    ----------
+    dtype
+        Numeric type (None means to use theano.config.floatX).
+    name
+        A name to attach to this variable.
     """
     if dtype is None:
         dtype = config.floatX
@@ -3879,9 +3995,16 @@ fvector = CudaNdarrayType(dtype='float32', broadcastable=(False, ))
 def matrix(name=None, dtype=None):
-    """Return a symbolic matrix variable.
-    :param dtype: numeric type (None means to use theano.config.floatX)
-    :param name: a name to attach to this variable
+    """
+    Return a symbolic matrix variable.
+    Parameters
+    ----------
+    dtype
+        Numeric type (None means to use theano.config.floatX).
+    name
+        A name to attach to this variable.
     """
     if dtype is None:
         dtype = config.floatX
@@ -3891,9 +4014,16 @@ fmatrix = CudaNdarrayType(dtype='float32', broadcastable=(False, False))
 def row(name=None, dtype=None):
-    """Return a symbolic row variable (ndim=2, broadcastable=[True,False]).
-    :param dtype: numeric type (None means to use theano.config.floatX)
-    :param name: a name to attach to this variable
+    """
+    Return a symbolic row variable (ndim=2, broadcastable=[True,False]).
+    Parameters
+    ----------
+    dtype
+        Numeric type (None means to use theano.config.floatX).
+    name : str
+        A name to attach to this variable.
     """
     if dtype is None:
         dtype = config.floatX
@@ -3903,9 +4033,16 @@ frow = CudaNdarrayType(dtype='float32', broadcastable=(True, False))
 def col(name=None, dtype=None):
-    """Return a symbolic column variable (ndim=2, broadcastable=[False,True]).
-    :param dtype: numeric type (None means to use theano.config.floatX)
-    :param name: a name to attach to this variable
+    """
+    Return a symbolic column variable (ndim=2, broadcastable=[False,True]).
+    Parameters
+    ----------
+    dtype
+        Numeric type (None means to use theano.config.floatX).
+    name : str
+        A name to attach to this variable.
     """
     if dtype is None:
         dtype = config.floatX
@@ -3915,9 +4052,16 @@ fcol = CudaNdarrayType(dtype='float32', broadcastable=(False, True))
 def tensor3(name=None, dtype=None):
-    """Return a symbolic 3-D variable.
-    :param dtype: numeric type (None means to use theano.config.floatX)
-    :param name: a name to attach to this variable
+    """
+    Return a symbolic 3-D variable.
+    Parameters
+    ----------
+    dtype
+        Numeric type (None means to use theano.config.floatX).
+    name : str
+        A name to attach to this variable.
     """
     if dtype is None:
         dtype = config.floatX
@@ -3927,9 +4071,16 @@ ftensor3 = CudaNdarrayType(dtype='float32', broadcastable=(False,) * 3)
 def tensor4(name=None, dtype=None):
-    """Return a symbolic 4-D variable.
-    :param dtype: numeric type (None means to use theano.config.floatX)
-    :param name: a name to attach to this variable
+    """
+    Return a symbolic 4-D variable.
+    Parameters
+    ----------
+    dtype
+        Numeric type (None means to use theano.config.floatX).
+    name : str
+        A name to attach to this variable.
     """
     if dtype is None:
         dtype = config.floatX
@@ -3992,6 +4143,7 @@ def profile_printer(fct_name, compile_time, fct_call_time, fct_call,
 class GpuEye(GpuOp):

     def __init__(self, dtype=None):
         if dtype is None:
             dtype = config.floatX
theano/sandbox/cuda/blas.py

@@ -217,6 +217,7 @@ batched_dot = BatchedDotOp()
 class GpuDot22(GpuOp):
     """
     Implement dot(2d, 2d) on the gpu.
     """

     def __str__(self):
         return 'GpuDot22'
@@ -299,7 +300,10 @@ class GpuDot22Scalar(GpuOp):
     """
     Implement dot(2d, 2d) * scalar on the gpu.
-    :note: Not used anymore. Keep to allow unpickle of old graph.
+    Notes
+    -----
+    Not used anymore. Keep to allow unpickle of old graph.
     """

     def __str__(self):
         return 'GpuDot22Scalar'
@@ -707,16 +711,22 @@ gpu_ger_inplace = GpuGer(inplace=True)
 class BaseGpuCorrMM(GpuOp):
-    """Base class for `GpuCorrMM`, `GpuCorrMM_gradWeights` and
+    """
+    Base class for `GpuCorrMM`, `GpuCorrMM_gradWeights` and
     `GpuCorrMM_gradInputs`. Cannot be used directly.
-    :param border_mode: one of 'valid', 'full', 'half'; additionally, the
-        padding size could be directly specified by an integer or a pair of
-        integers
-    :param subsample: perform subsampling of the output (default: (1, 1))
-    :param pad: *deprecated*, now you should always use border_mode
+    Parameters
+    ----------
+    border_mode : {'valid', 'full', 'half'}
+        Additionally, the padding size could be directly specified by an
+        integer or a pair of integers
+    subsample
+        Perform subsampling of the output (default: (1, 1)).
+    pad
+        *deprecated*, now you should always use border_mode.
     """
     check_broadcast = False
     __props__ = ('border_mode', 'subsample')
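The border modes named here follow the standard correlation output-size arithmetic: 'valid' pads by 0, 'full' by `k - 1`, and 'half' by `k // 2` (which preserves the input size for odd kernels at stride 1). A sketch of that arithmetic only; `corr_out_len` is a hypothetical helper, not code from this class:

```python
def corr_out_len(in_len, k_len, border_mode, stride):
    # Output length of a 1-D correlation under the three border modes.
    pad = {'valid': 0, 'full': k_len - 1, 'half': k_len // 2}[border_mode]
    return (in_len + 2 * pad - k_len) // stride + 1

assert corr_out_len(7, 3, 'valid', 1) == 5   # shrinks by k - 1
assert corr_out_len(7, 3, 'full', 1) == 9    # grows by k - 1
assert corr_out_len(7, 3, 'half', 1) == 7    # size-preserving, odd k
```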
@@ -757,7 +767,10 @@ class BaseGpuCorrMM(GpuOp):
                 str(self.subsample))

     def flops(self, inp, outp):
-        """ Useful with the hack in profilemode to print the MFlops"""
+        """
+        Useful with the hack in profilemode to print the MFlops.
+        """
         # if the output shape is correct, then this gives the correct
         # flops for any direction, sampling, padding, and border mode
         inputs, filters = inp
...
@@ -794,32 +807,40 @@ class BaseGpuCorrMM(GpuOp):
         Depending on the direction, one of bottom, weights, top will
         receive the output, while the other two serve as inputs.
-        :param bottom: Variable name of the input images in the forward pass,
-            or the gradient of the input images in backprop wrt. inputs
-        :param weights: Variable name of the filters in the forward pass,
-            or the gradient of the filters in backprop wrt. weights
-        :param top: Variable name of the output images / feature maps in the
-            forward pass, or the gradient of the outputs in the backprop passes
-        :param direction: "forward" to correlate bottom with weights and store
-            results in top,
-            "backprop weights" to do a valid convolution of bottom with top
-            (swapping the first two dimensions) and store results in weights,
-            and "backprop inputs" to do a full convolution of top with weights
-            (swapping the first two dimensions) and store results in bottom.
-        :param sub: Dictionary of substitutions useable to help generating the
-            C code.
-        :param height: If self.subsample[0] != 1, a variable giving the height
-            of the filters for direction="backprop weights" or the height of
-            the input images for direction="backprop inputs".
-            If self.border_mode == 'half', a variable giving the height of the
-            filters for direction="backprop weights". Ignored otherwise.
-        :param width: If self.subsample[1] != 1, a variable giving the width
-            of the filters for direction="backprop weights" or the width of the
-            input images for direction="backprop inputs".
-            If self.border_mode == 'half', a variable giving the width of the
-            filters for direction="backprop weights". Ignored otherwise.
+        Parameters
+        ----------
+        bottom
+            Variable name of the input images in the forward pass,
+            or the gradient of the input images in backprop wrt. inputs
+        weights
+            Variable name of the filters in the forward pass,
+            or the gradient of the filters in backprop wrt. weights
+        top
+            Variable name of the output images / feature maps in the
+            forward pass, or the gradient of the outputs in the backprop passes
+        direction : {'forward', 'backprop weights', 'backprop inputs'}
+            "forward" to correlate bottom with weights and store
+            results in top,
+            "backprop weights" to do a valid convolution of bottom with top
+            (swapping the first two dimensions) and store results in weights,
+            and "backprop inputs" to do a full convolution of top with weights
+            (swapping the first two dimensions) and store results in bottom.
+        sub
+            Dictionary of substitutions useable to help generating the
+            C code.
+        height
+            If self.subsample[0] != 1, a variable giving the height of the
+            filters for direction="backprop weights" or the height of the input
+            images for direction="backprop inputs".
+            If self.border_mode == 'half', a variable giving the height of the
+            filters for direction="backprop weights".
+            Ignored otherwise.
+        width
+            If self.subsample[1] != 1, a variable giving the width of the
+            filters for direction="backprop weights" or the width of the
+            input images for direction="backprop inputs".
+            If self.border_mode == 'half', a variable giving the width of the
+            filters for direction="backprop weights".
+            Ignored otherwise.
         """
        dH, dW = self.subsample
        if self.border_mode == "half":
...
@@ -993,9 +1014,13 @@ class BaseGpuCorrMM(GpuOp):
 class GpuCorrMM(BaseGpuCorrMM):
"""GPU correlation implementation using Matrix Multiplication.
"""
GPU correlation implementation using Matrix Multiplication.
:param border_mode: the width of a border of implicit zeros to pad the
Parameters
----------
border_mode
The width of a border of implicit zeros to pad the
input with. Must be a tuple with 2 elements giving the numbers of rows
input with. Must be a tuple with 2 elements giving the numbers of rows
and columns to pad on each side, or a single integer to pad the same
and columns to pad on each side, or a single integer to pad the same
on all sides, or a string shortcut setting the padding at runtime:
on all sides, or a string shortcut setting the padding at runtime:
...
@@ -1004,27 +1029,31 @@ class GpuCorrMM(BaseGpuCorrMM):
...
@@ -1004,27 +1029,31 @@ class GpuCorrMM(BaseGpuCorrMM):
``'half'`` for ``(kernel_rows // 2, kernel_columns // 2)`` (same
``'half'`` for ``(kernel_rows // 2, kernel_columns // 2)`` (same
convolution for odd-sized kernels). Note that the two widths are each
convolution for odd-sized kernels). Note that the two widths are each
applied twice, once per side (left and right, top and bottom).
applied twice, once per side (left and right, top and bottom).
:param subsample: the subsample operation applied to each output image.
subsample
The subsample operation applied to each output image.
Should be a tuple with 2 elements.
Should be a tuple with 2 elements.
`(sv, sh)` is equivalent to `GpuCorrMM(...)(...)[:,:,::sv, ::sh]`,
`(sv, sh)` is equivalent to `GpuCorrMM(...)(...)[:,:,::sv, ::sh]`,
but faster.
but faster.
Set to `(1, 1)` to disable subsampling.
Set to `(1, 1)` to disable subsampling.
:param pad: deprecated alias for `border_mode`.
pad
Deprecated alias for `border_mode`.
:note: Currently, the Op requires the inputs, filters and outputs to be
C-contiguous. Use :func:`gpu_contiguous
Notes
<theano.sandbox.cuda.basic_ops.gpu_contiguous>` on these arguments
-----
if needed.
Currently, the Op requires the inputs, filters and outputs to be
C-contiguous. Use :func:`gpu_contiguous
:note: You can either enable the Theano flag `optimizer_including=conv_gemm`
<theano.sandbox.cuda.basic_ops.gpu_contiguous>` on these arguments
to automatically replace all convolution operations with `GpuCorrMM`
if needed.
or one of its gradients, or you can use it as a replacement for
:func:`conv2d <theano.tensor.nnet.conv.conv2d>`, called as
You can either enable the Theano flag `optimizer_including=conv_gemm`
`GpuCorrMM(subsample=...)(image, filters)`. The latter is currently
to automatically replace all convolution operations with `GpuCorrMM`
faster, but note that it computes a correlation -- if you need to
or one of its gradients, or you can use it as a replacement for
compute a convolution, flip the filters as `filters[:,:,::-1,::-1]`.
:func:`conv2d <theano.tensor.nnet.conv.conv2d>`, called as
`GpuCorrMM(subsample=...)(image, filters)`. The latter is currently
:warning: For 700 series Nvidia GPUs of compute capability 3.5 and CUDA 5.0
faster, but note that it computes a correlation -- if you need to
compute a convolution, flip the filters as `filters[:,:,::-1,::-1]`.
..warning:: For 700 series Nvidia GPUs of compute capability 3.5 and CUDA 5.0
to 6.0, there is a bug in CUBLAS' matrix multiplication function that
to 6.0, there is a bug in CUBLAS' matrix multiplication function that
can make GpuCorrMM or its gradients crash for some input and filter
can make GpuCorrMM or its gradients crash for some input and filter
shapes. So if you have a Tesla K20, Tesla K40, Quadro K6000, GeForce GT
shapes. So if you have a Tesla K20, Tesla K40, Quadro K6000, GeForce GT
...
@@ -1032,6 +1061,7 @@ class GpuCorrMM(BaseGpuCorrMM):
...
@@ -1032,6 +1061,7 @@ class GpuCorrMM(BaseGpuCorrMM):
and experience a crash, switching to CUDA 6.5 or CUDA 4.2 should fix it.
and experience a crash, switching to CUDA 6.5 or CUDA 4.2 should fix it.
If this is not possible, changing the input or filter shapes (e.g., the
If this is not possible, changing the input or filter shapes (e.g., the
batchsize or number of filters) may also work around the CUBLAS bug.
batchsize or number of filters) may also work around the CUBLAS bug.
"""
"""
    def __init__(self, border_mode="valid",
                 subsample=(1, 1),
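To illustrate how `border_mode` and `subsample` interact in the docstrings above, here is a small pure-Python sketch (not part of Theano) that computes the output shape of a GpuCorrMM-style correlation; `corr2d_output_shape` and its argument layout are assumptions made for this example only.

```python
def corr2d_output_shape(img_shape, kern_shape, border_mode=(0, 0), subsample=(1, 1)):
    """Hypothetical helper: output shape of a 2D correlation.

    img_shape: (batch, channels, rows, cols)
    kern_shape: (nkern, channels, krows, kcols)
    """
    b, _, h, w = img_shape
    n, _, kh, kw = kern_shape
    if border_mode == 'half':           # same-size output for odd kernels
        ph, pw = kh // 2, kw // 2
    elif border_mode == 'full':         # pad by kernel size - 1 on each side
        ph, pw = kh - 1, kw - 1
    elif isinstance(border_mode, int):  # same padding on all sides
        ph = pw = border_mode
    else:                               # explicit (rows, cols) padding
        ph, pw = border_mode
    # each output pixel needs a full kernel window inside the padded image,
    # then subsampling keeps every sv-th row and sh-th column
    oh = (h + 2 * ph - kh) // subsample[0] + 1
    ow = (w + 2 * pw - kw) // subsample[1] + 1
    return (b, n, oh, ow)

print(corr2d_output_shape((8, 3, 32, 32), (16, 3, 5, 5)))          # "valid" mode
print(corr2d_output_shape((8, 3, 32, 32), (16, 3, 5, 5), 'half'))  # same-size output
```

Note that the two padding widths are each applied twice, once per side, which is why `2 * ph` and `2 * pw` appear in the formulas.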
...
@@ -1068,11 +1098,13 @@ class GpuCorrMM(BaseGpuCorrMM):
 class GpuCorrMM_gradWeights(BaseGpuCorrMM):
-    """Gradient wrt. filters for `GpuCorrMM`.
-
-    :note: You will not want to use this directly, but rely on
-        Theano's automatic differentiation or graph optimization to
-        use it as needed.
+    """
+    Gradient wrt. filters for `GpuCorrMM`.
+
+    Notes
+    -----
+    You will not want to use this directly, but rely on Theano's automatic
+    differentiation or graph optimization to use it as needed.
     """
...
@@ -1126,11 +1158,13 @@ class GpuCorrMM_gradWeights(BaseGpuCorrMM):
 class GpuCorrMM_gradInputs(BaseGpuCorrMM):
-    """Gradient wrt. inputs for `GpuCorrMM`.
-
-    :note: You will not want to use this directly, but rely on
-        Theano's automatic differentiation or graph optimization to
-        use it as needed.
+    """
+    Gradient wrt. inputs for `GpuCorrMM`.
+
+    Notes
+    -----
+    You will not want to use this directly, but rely on Theano's automatic
+    differentiation or graph optimization to use it as needed.
     """
...
@@ -1180,8 +1214,12 @@ class GpuCorrMM_gradInputs(BaseGpuCorrMM):
 class BaseGpuCorr3dMM(GpuOp):
-    """Base class for `GpuCorr3dMM`, `GpuCorr3dMM_gradWeights` and
-    `GpuCorr3dMM_gradInputs`. Cannot be used directly."""
+    """
+    Base class for `GpuCorr3dMM`, `GpuCorr3dMM_gradWeights` and
+    `GpuCorr3dMM_gradInputs`. Cannot be used directly.
+    """
     __props__ = ('border_mode', 'subsample', 'pad')

     def __init__(self, border_mode="valid",
...
@@ -1245,38 +1283,47 @@ class BaseGpuCorr3dMM(GpuOp):
         Depending on the direction, one of bottom, weights, top will
         receive the output, while the other two serve as inputs.
-        :param bottom: Variable name of the input images in the forward pass,
-            or the gradient of the input images in backprop wrt. inputs
-        :param weights: Variable name of the filters in the forward pass,
-            or the gradient of the filters in backprop wrt. weights
-        :param top: Variable name of the output images / feature maps in the
-            forward pass, or the gradient of the outputs in the backprop passes
-        :param direction: "forward" to correlate bottom with weights and store
-            results in top,
-            "backprop weights" to do a valid convolution of bottom with top
-            (swapping the first two dimensions) and store results in weights,
-            and "backprop inputs" to do a full convolution of top with weights
-            (swapping the first two dimensions) and store results in bottom.
-        :param sub: Dictionary of substitutions useable to help generating the
-            C code.
-        :param height: If self.subsample[0] != 1, a variable giving the height
-            of the filters for direction="backprop weights" or the height of the
-            input images for direction="backprop inputs".
-            If self.pad == 'half', a variable giving the height of the filters
-            for direction="backprop weights".
-            Ignored otherwise.
-        :param width: If self.subsample[1] != 1, a variable giving the width
-            of the filters for direction="backprop weights" or the width of the
-            input images for direction="backprop inputs".
-            If self.pad == 'half', a variable giving the width of the filters
-            for direction="backprop weights".
-            Ignored otherwise.
-        :param depth: If self.subsample[2] != 1, a variable giving the depth
-            of the filters for direction="backprop weights" or the depth of the
-            input images for direction="backprop inputs".
-            If self.pad == 'half', a variable giving the depth of the filters
-            for direction="backprop weights".
-            Ignored otherwise.
+        Parameters
+        ----------
+        bottom
+            Variable name of the input images in the forward pass,
+            or the gradient of the input images in backprop wrt. inputs.
+        weights
+            Variable name of the filters in the forward pass,
+            or the gradient of the filters in backprop wrt. weights.
+        top
+            Variable name of the output images / feature maps in the
+            forward pass, or the gradient of the outputs in the backprop passes.
+        direction : {'forward', 'backprop weights', 'backprop inputs'}
+            "forward" to correlate bottom with weights and store results in top,
+            "backprop weights" to do a valid convolution of bottom with top
+            (swapping the first two dimensions) and store results in weights,
+            and "backprop inputs" to do a full convolution of top with weights
+            (swapping the first two dimensions) and store results in bottom.
+        sub
+            Dictionary of substitutions useable to help generating the C code.
+        height
+            If self.subsample[0] != 1, a variable giving the height
+            of the filters for direction="backprop weights" or the height of the
+            input images for direction="backprop inputs".
+            If self.pad == 'half', a variable giving the height of the filters
+            for direction="backprop weights".
+            Ignored otherwise.
+        width
+            If self.subsample[1] != 1, a variable giving the width
+            of the filters for direction="backprop weights" or the width of the
+            input images for direction="backprop inputs".
+            If self.pad == 'half', a variable giving the width of the filters
+            for direction="backprop weights".
+            Ignored otherwise.
+        depth
+            If self.subsample[2] != 1, a variable giving the depth
+            of the filters for direction="backprop weights" or the depth of the
+            input images for direction="backprop inputs".
+            If self.pad == 'half', a variable giving the depth of the filters
+            for direction="backprop weights".
+            Ignored otherwise.
         """
        if self.border_mode != "valid":
            raise ValueError("mode must be 'valid'")
...
@@ -1503,7 +1550,34 @@ class BaseGpuCorr3dMM(GpuOp):
 class GpuCorr3dMM(BaseGpuCorr3dMM):
"""GPU correlation implementation using Matrix Multiplication.
"""GPU correlation implementation using Matrix Multiplication.
:warning: For 700 series Nvidia GPUs of compute capability 3.5 and CUDA 5.0
Parameters
----------
border_mode
Currently supports "valid" only; "full" can be simulated by setting
`pad="full"` (at the cost of performance), or by using
`GpuCorrMM_gradInputs`.
subsample
The subsample operation applied to each output image. Should be a tuple
with 3 elements. `(sv, sh, sl)` is equivalent to
`GpuCorrMM(...)(...)[:,:,::sv, ::sh, ::sl]`, but faster.
Set to `(1, 1, 1)` to disable subsampling.
pad
The width of a border of implicit zeros to pad the input image with.
Should be a tuple with 3 elements giving the numbers of rows and columns
to pad on each side, or "half" to set the padding
to `(kernel_rows // 2, kernel_columns // 2, kernel_depth // 2)`,
or "full" to set the padding
to `(kernel_rows - 1, kernel_columns - 1, kernel_depth - 1)` at runtime.
Set to `(0, 0, 0)` to disable padding.
Notes
-----
Currently, the Op requires the inputs, filters and outputs to be
C-contiguous. Use :func:`gpu_contiguous
<theano.sandbox.cuda.basic_ops.gpu_contiguous>` on these arguments
if needed.
.. warning:: For 700 series Nvidia GPUs of compute capability 3.5 and CUDA 5.0
to 6.0, there is a bug in CUBLAS' matrix multiplication function that
to 6.0, there is a bug in CUBLAS' matrix multiplication function that
can make GpuCorrMM or its gradients crash for some input and filter
can make GpuCorrMM or its gradients crash for some input and filter
shapes. So if you have a Tesla K20, Tesla K40, Quadro K6000, GeForce GT
shapes. So if you have a Tesla K20, Tesla K40, Quadro K6000, GeForce GT
...
@@ -1511,31 +1585,9 @@ class GpuCorr3dMM(BaseGpuCorr3dMM):
...
@@ -1511,31 +1585,9 @@ class GpuCorr3dMM(BaseGpuCorr3dMM):
and experience a crash, switching to CUDA 6.5 or CUDA 4.2 should fix it.
and experience a crash, switching to CUDA 6.5 or CUDA 4.2 should fix it.
If this is not possible, changing the input or filter shapes (e.g., the
If this is not possible, changing the input or filter shapes (e.g., the
batchsize or number of filters) may also work around the CUBLAS bug.
batchsize or number of filters) may also work around the CUBLAS bug.
"""
"""
     def __init__(self, border_mode="valid",
                  subsample=(1, 1, 1),
                  pad=(0, 0, 0)):
-        """
-        :param border_mode: currently supports "valid" only; "full" can be
-            simulated by setting `pad="full"` (at the cost of performance), or
-            by using `GpuCorrMM_gradInputs`
-        :param subsample: the subsample operation applied to each output image.
-            Should be a tuple with 3 elements.
-            `(sv, sh, sl)` is equivalent to `GpuCorrMM(...)(...)[:,:,::sv, ::sh, ::sl]`,
-            but faster.
-            Set to `(1, 1, 1)` to disable subsampling.
-        :param pad: the width of a border of implicit zeros to pad the input
-            image with. Should be a tuple with 3 elements giving the numbers of
-            rows and columns to pad on each side, or "half" to set the padding
-            to `(kernel_rows // 2, kernel_columns // 2, kernel_depth // 2)`, or "full" to set the
-            padding to `(kernel_rows - 1, kernel_columns - 1, kernel_depth - 1)` at runtime.
-            Set to `(0, 0, 0)` to disable padding.
-
-        :note: Currently, the Op requires the inputs, filters and outputs to be
-            C-contiguous. Use :func:`gpu_contiguous
-            <theano.sandbox.cuda.basic_ops.gpu_contiguous>` on these arguments
-            if needed.
-        """
         super(GpuCorr3dMM, self).__init__(border_mode, subsample, pad)

     def make_node(self, img, kern):
...
@@ -1570,8 +1622,11 @@ class GpuCorr3dMM(BaseGpuCorr3dMM):
 class GpuCorr3dMM_gradWeights(BaseGpuCorr3dMM):
     """Gradient wrt. filters for `GpuCorr3dMM`.

-    :note: You will not want to use this directly, but rely on Theano's
-        automatic differentiation or graph optimization to use it as needed.
+    Notes
+    -----
+    You will not want to use this directly, but rely on Theano's
+    automatic differentiation or graph optimization to use it as needed.
     """

     def __init__(self, border_mode="valid",
...
@@ -1627,8 +1682,11 @@ class GpuCorr3dMM_gradWeights(BaseGpuCorr3dMM):
 class GpuCorr3dMM_gradInputs(BaseGpuCorr3dMM):
     """Gradient wrt. inputs for `GpuCorr3dMM`.

-    :note: You will not want to use this directly, but rely on Theano's
-        automatic differentiation or graph optimization to use it as needed.
+    Notes
+    -----
+    You will not want to use this directly, but rely on Theano's
+    automatic differentiation or graph optimization to use it as needed.
     """

     def __init__(self, border_mode="valid",
...
@@ -1683,6 +1741,48 @@ class GpuCorr3dMM_gradInputs(BaseGpuCorr3dMM):
 class GpuConv(GpuOp):
     """
     Implement the batched and stacked 2d convolution on the gpu.
+
+    Parameters
+    ----------
+    version
+        Each version of c_code implements many kernels for the
+        convolution. By default we try to guess the best one.
+        You can force one version with this parameter. This
+        parameter is used by the tests.
+    direction_hint : {'forward', 'bprop weights', 'bprop inputs'}
+        Serves as a hint for graph optimizers replacing
+        GpuConv by other implementations. If the GpuConv is
+        inserted automatically, we take its value from ConvOp.
+    verbose
+        For value of 1,2 and 3. Print more information during
+        the execution of the convolution. Mostly used for
+        optimization or debugging.
+    kshp
+        The size of the kernel. If provided, can generate
+        faster code. If the GpuConv op is automatically inserted,
+        we take its value automatically from the Conv op.
+    imshp
+        The size of the image. Not used for code generation but
+        allows to select an experimental new version in another repo.
+    max_threads_dim0
+        The maximum number of threads for the block size dimensions 0
+        (blockDim.x) used by the GPU function.
+    nkern
+        The number of kernels. Not used for this op, but can be
+        used by graph optimizers to select a more optimal
+        convolution implementation. If the GpuConv op is inserted
+        automatically, we take its value from the Conv op.
+    bsize
+        The batch size. Not used for this op, but can be used by graph
+        optimizers to select a more optimal convolution implementation.
+        If the GpuConv op is inserted automatically, we take its value from
+        the Conv op.
+    fft_opt
+        Deactivate fft_opt optimization at the op level when set to False.
+        Note that by default fft optimizations aren't enabled.
+        See :ref:`convolution documentation <libdoc_tensor_nnet_conv>`
+        to enable them.
     """
     check_broadcast = False
...
@@ -1708,42 +1808,6 @@ class GpuConv(GpuOp):
                  nkern=None,
                  bsize=None,
                  fft_opt=True):
-        """
-        :param version: each version of c_code implements many kernel for the
-            convolution. By default we try to guess the best one.
-            You can force one version with this parameter. This
-            parameter is used by the tests.
-        :param direction_hint: 'forward', 'bprop weights' or 'bprop inputs'.
-            Serves as a hint for graph optimizers replacing
-            GpuConv by other implementations. If the GpuConv is
-            inserted automatically, we take its value from ConvOp.
-        :param verbose: for value of 1,2 and 3. Print more information during
-            the execution of the convolution. Mostly used for
-            optimization or debugging.
-        :param kshp: The size of the kernel. If provided, can generate
-            faster code. If the GpuConv op is automatically inserted,
-            we take its value automatically from the Conv op.
-        :param imshp: The size of the image. Not used for code generation but
-            allows to select an experimental new version in another
-            repo.
-        :param max_threads_dim0: The maximum number of threads for the
-            block size dimensions 0 (blockDim.x) used by the
-            GPU function.
-        :param nkern: The number of kernels. Not used for this op, but can be
-            used by graph optimizers to select a more optimal
-            convolution implementation. If the GpuConv op is inserted
-            automatically, we take its value from the Conv op.
-        :param bsize: The batch size. Not used for this op, but can be
-            used by graph optimizers to select a more optimal
-            convolution implementation. If the GpuConv op is inserted
-            automatically, we take its value from the Conv op.
-        :param fft_opt: deactivate fft_opt optimization at the op level when
-            set to False. Note that by default fft optimization
-            aren't enabled. See
-            :ref:`convolution documentation <libdoc_tensor_nnet_conv>`
-            to enable them.
-        """
         self.border_mode = border_mode
         if version != -1:
             raise Exception(
...
@@ -1956,6 +2020,7 @@ class GpuConv(GpuOp):
 class GpuDownsampleFactorMax(GpuOp):
     """
     Implement downsample with max on the gpu.
     """
     def __init__(self, ds, ignore_border=False):
         self.ds = tuple(ds)
...
@@ -2149,6 +2214,7 @@ class GpuDownsampleFactorMax(GpuOp):
 class GpuDownsampleFactorMaxGrad(GpuOp):
     """
     Implement the grad of downsample with max on the gpu.
     """
     def __init__(self, ds, ignore_border):
         self.ds = tuple(ds)
...
@@ -2371,6 +2437,7 @@ class GpuDownsampleFactorMaxGrad(GpuOp):
 class GpuDownsampleFactorMaxGradGrad(GpuOp):
     """
     Implement the grad of downsample with max on the gpu.
     """
     __props__ = ('ds', 'ignore_border')
...
...
theano/sandbox/cuda/blocksparse.py
View file @ d1eba87d
...
@@ -30,7 +30,9 @@ class SparseBlockGemvSS(GpuOp):
     This should not be directly called since the interface is subject
     to change without notice. Use the sparse_block_dot_SS() function
     for a stable interface.
     """
     def __init__(self, inplace=False):
         self.inplace = inplace
         if self.inplace:
...
@@ -367,9 +369,11 @@ class SparseBlockOuterSS(GpuOp):
     The i and j are taken from the xIdx and yIdx lists respectively.

     This op should not be called directly since its interface is
-    subject to change without notice.  It is involved in the gradient
+    subject to change without notice. It is involved in the gradient
     of SparseBlockGemvSS.
     """
     def __init__(self, inplace=False):
         self.inplace = inplace
         if self.inplace:
...
@@ -680,28 +684,36 @@ def sparse_block_dot_SS(W, h, inputIdx, b, outputIdx):
...
@@ -680,28 +684,36 @@ def sparse_block_dot_SS(W, h, inputIdx, b, outputIdx):
Parameters
Parameters
----------
----------
var: shape, comment
W : (iBlocks, oBlocks, iSize, oSize)
W: (iBlocks, oBlocks, iSize, oSize), weight matrix
Weight matrix.
h: (batch, iWin, iSize), input from lower layer (sparse)
h : (batch, iWin, iSize)
inputIdx: (batch, iWin), indexes of the input blocks
Input from lower layer (sparse).
b: (oBlocks, oSize), bias vector
inputIdx : (batch, iWin)
outputIdx: (batch, oWin), indexes of the output blocks
Indexes of the input blocks.
b : (oBlocks, oSize)
returns (batch, oWin, oSize), dot(W[i, j], h[i]) + b[j]
Bias vector.
but b[j] is only added once
outputIdx : (batch, oWin)
Indexes of the output blocks.
Notation
--------
Returns
-------
(batch, oWin, oSize)
dot(W[i, j], h[i]) + b[j], but b[j] is only added once.
Notes
-----
- `batch` is the number of examples in a minibatch (batch size).
- `batch` is the number of examples in a minibatch (batch size).
- `iBlocks` is the total number of blocks in the input (from lower layer).
- `iBlocks` is the total number of blocks in the input (from lower layer).
- `iSize` is the size of each of these input blocks.
- `iSize` is the size of each of these input blocks.
- `iWin` is the number of blocks that will be used as inputs. Which blocks
- `iWin` is the number of blocks that will be used as inputs. Which blocks
will be used is specified in `inputIdx`.
will be used is specified in `inputIdx`.
- `oBlocks` is the number or possible output blocks.
- `oBlocks` is the number or possible output blocks.
- `oSize` is the size of each of these output blocks.
- `oSize` is the size of each of these output blocks.
- `oWin` is the number of output blocks that will actually be computed.
- `oWin` is the number of output blocks that will actually be computed.
Which blocks will be computed is specified in `outputIdx`.
Which blocks will be computed is specified in `outputIdx`.
"""
"""
assert
inputIdx
.
ndim
==
h
.
ndim
-
1
assert
inputIdx
.
ndim
==
h
.
ndim
-
1
assert
outputIdx
.
ndim
==
inputIdx
.
ndim
assert
outputIdx
.
ndim
==
inputIdx
.
ndim
if
h
.
ndim
==
2
:
if
h
.
ndim
==
2
:
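The semantics documented for `sparse_block_dot_SS` can be sketched as a plain NumPy reference implementation; `sparse_block_dot_ref` is a hypothetical name introduced here, not a Theano function, and it ignores the GPU and gradient machinery entirely.

```python
import numpy as np

def sparse_block_dot_ref(W, h, inputIdx, b, outputIdx):
    """NumPy sketch of the documented semantics:
    o[n, k] = sum_j dot(h[n, j], W[inputIdx[n, j], outputIdx[n, k]])
              + b[outputIdx[n, k]]
    """
    batch, oWin = outputIdx.shape
    out = np.empty((batch, oWin, W.shape[3]))
    for n in range(batch):
        for k in range(oWin):
            # the bias of each selected output block is added exactly once
            acc = b[outputIdx[n, k]].astype(float).copy()
            for j in range(inputIdx.shape[1]):
                # h[n, j] has shape (iSize,); W[i, o] has shape (iSize, oSize)
                acc = acc + np.dot(h[n, j], W[inputIdx[n, j], outputIdx[n, k]])
            out[n, k] = acc
    return out
```

With all-ones weights and inputs, each output element is just iWin * iSize, which makes the indexing easy to sanity-check by hand.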
...
...
theano/sandbox/cuda/cula.py
View file @ d1eba87d
...
@@ -26,9 +26,13 @@ class GpuSolve(GpuOp):
     """
     CULA GPU solver OP.

-    :param trans: Whether to take the transpose of the input matrix
-        or not.
+    Parameters
+    ----------
+    trans
+        Whether to take the transpose of the input matrix or not.
     """
     __props__ = ('trans',)

     def __init__(self, trans='N'):
...
...
theano/sandbox/fourier.py
View file @ d1eba87d
"""Provides Ops for FFT and DCT.
"""
Provides Ops for FFT and DCT.
"""
"""
import
numpy
import
numpy
...
@@ -23,18 +24,19 @@ grad_todo = GradTodo()
 class FFT(Op):
-    """Fast Fourier Transform
+    """
+    Fast Fourier Transform.

     .. TODO:
-        The current implementation just works for matrix inputs, and permits taking a 1D FFT over
-        either rows or columns. Add support for N-D FFTs as provided by either numpy or FFTW
-        directly.
+        The current implementation just works for matrix inputs, and permits
+        taking a 1D FFT over either rows or columns. Add support for N-D FFTs
+        as provided by either numpy or FFTW
+        directly.

     .. TODO:
         Give the C code that uses FFTW.

     .. TODO:
-        unit tests.
+        Unit tests.
     """
...
@@ -42,7 +44,7 @@ class FFT(Op):
     # don't return the plan object in the 'buf' output

     half = False
-    """Only return the first half (positive-valued) of the frequency components"""
+    """Only return the first half (positive-valued) of the frequency components."""

     __props__ = ("half", "inverse")

     def __init__(self, half=False, inverse=False):
...
@@ -50,7 +52,10 @@ class FFT(Op):
         self.inverse = inverse

     def make_node(self, frames, n, axis):
-        """ compute an n-point fft of frames along given axis """
+        """
+        Compute an n-point fft of frames along given axis.
+        """
         _frames = tensor.as_tensor(frames, ndim=2)
         _n = tensor.as_tensor(n, ndim=0)
         _axis = tensor.as_tensor(axis, ndim=0)
...
@@ -103,8 +108,8 @@ def dct_matrix(rows, cols, unitary=True):
     """
     Return a (rows x cols) matrix implementing a discrete cosine transform.

     This algorithm is adapted from Dan Ellis' Rastmat
-    spec2cep.m, lines 15-20.
+    spec2cep.m, lines 15 - 20.
     """
     rval = numpy.zeros((rows, cols))
     col_range = numpy.arange(cols)
...
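The DCT matrix that `dct_matrix` documents can be sketched in NumPy for the square, unitary case; `dct2_matrix` below is a hypothetical re-derivation of the orthonormal DCT-II construction (the exact scaling used by fourier.py is assumed here), and the orthonormality check shows why the `unitary` scaling of the first row matters.

```python
import numpy as np

def dct2_matrix(n):
    """Hypothetical sketch of an n x n orthonormal DCT-II matrix,
    analogous to dct_matrix(n, n, unitary=True)."""
    j = np.arange(n)
    D = np.empty((n, n))
    for i in range(n):
        # row i samples a cosine at half-integer points (2j+1)/(2n)
        D[i] = np.cos(np.pi * i * (2 * j + 1) / (2.0 * n)) * np.sqrt(2.0 / n)
    D[0] *= np.sqrt(0.5)  # unitary scaling of the DC row
    return D

D = dct2_matrix(4)
```

With this scaling, D is orthonormal, so the inverse transform is simply the transpose.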
...
theano/sandbox/multinomial.py
View file @ d1eba87d
...
@@ -13,7 +13,11 @@ if cuda_available:
 class MultinomialFromUniform(Op):
-    '''Converts samples from a uniform into sample from a multinomial.'''
+    """
+    Converts samples from a uniform into sample from a multinomial.
+    """
+
     __props__ = ("odtype",)

     def __init__(self, odtype):
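The conversion from uniform samples to multinomial samples that MultinomialFromUniform performs can be sketched with NumPy; `multinomial_from_uniform_ref` is a hypothetical reference for the semantics only (it assumes each row of `pvals` sums to 1 and each uniform sample is in [0, 1)).

```python
import numpy as np

def multinomial_from_uniform_ref(pvals, unis):
    """For each row of probabilities, one-hot the first index whose
    cumulative probability exceeds the matching uniform sample."""
    out = np.zeros_like(pvals)
    cum = np.cumsum(pvals, axis=1)
    for n, u in enumerate(unis):
        # searchsorted finds the first bucket whose upper edge exceeds u
        out[n, np.searchsorted(cum[n], u, side='right')] = 1
    return out

print(multinomial_from_uniform_ref(np.array([[0.1, 0.6, 0.3]]),
                                   np.array([0.5])))
```

Drawing many uniforms and averaging the one-hot outputs recovers `pvals`, which is the property the op relies on for sampling.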
...
@@ -164,7 +168,8 @@ class GpuMultinomialFromUniform(MultinomialFromUniform, GpuOp):
     The output is transposed compared to MultinomialFromUniform.
     We must insert a Transpose op after it.

-    The optimization that move it to the gpu do it.
+    The optimization that moves it to the gpu does it.
     """
     def make_node(self, pvals, unis):
...
...
theano/sandbox/neighbourhoods.py
View file @ d1eba87d
"""WARNING: This code is not recommanded. It is not finished, it is
"""
slower then the version in sandbox/neighbours.py, and it do not work
.. warning:: This code is not recommanded. It is not finished, it is
slower than the version in sandbox/neighbours.py, and it does not work
on the GPU.
on the GPU.
We only keep this version here as it is a little bit more generic, so
We only keep this version here as it is a little bit more generic, so
...
@@ -16,66 +17,67 @@ from theano import gof, Op
...
@@ -16,66 +17,67 @@ from theano import gof, Op
class NeighbourhoodsFromImages(Op):
    """
    This extracts neighbourhoods from "images", but in a dimension-generic
    manner.

    In the 2D case, this is similar to downsampling, but instead of reducing
    a group of 2x2 pixels (for example) to a single new pixel in the output,
    you place those 4 pixels in a row.

    For example, say you have this 2x4 image::

        [ [ 0.5, 0.6, 0.7, 0.8 ],
          [ 0.1, 0.2, 0.3, 0.4 ] ]

    and you want to extract 2x2 neighbourhoods. This op would then produce::

        [ [ [ 0.5, 0.6, 0.1, 0.2 ] ],   # the first 2x2 group of pixels
          [ [ 0.7, 0.8, 0.3, 0.4 ] ] ]  # the second one

    So think of a 2D downsampling where each pixel of the resulting array
    is replaced by an array containing the (flattened) pixels of the
    corresponding neighbourhood.

    If you provide a stack of 2D images, or multiple stacks, each image
    will be treated independently, and the first dimensions of the array
    will be preserved as such.

    This also makes sense in the 1D or 3D case. Below I'll still be calling
    those "images", by analogy.
    In the 1D case, you're extracting subsequences from the original sequence.
    In the 3D case, you're extracting cuboids.
    If you ever find a 4D use, tell me! It should be possible, anyhow.

    Parameters
    ----------
    n_dims_before : int
        Number of dimensions preceding the "images".
    dims_neighbourhoods : tuple of ints
        Exact shape of windows to be extracted (e.g. (2,2) in the case above).
        n_dims_before + len(dims_neighbourhoods) should be equal to the
        number of dimensions in the input given to the op.
    strides : tuple of int
        Number of elements to skip when moving to the next neighbourhood,
        for each dimension of dims_neighbourhoods. There can be overlap
        between neighbourhoods, or gaps.
    ignore_border : bool
        If the dimensions of the neighbourhoods don't exactly divide the
        dimensions of the "images", you can either fill the last
        neighbourhood with zeros (False) or drop it entirely (True).
    inverse : bool
        You shouldn't have to use this. Only used by child class
        ImagesFromNeighbourhoods which simply reverses the assignment.

    """
    __props__ = ("n_dims_before", "dims_neighbourhoods", "strides",
                 "ignore_border", "inverse")

    def __init__(self, n_dims_before, dims_neighbourhoods, strides=None,
                 ignore_border=False, inverse=False):
        self.n_dims_before = n_dims_before
        self.dims_neighbourhoods = dims_neighbourhoods
        if strides is not None:
...
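The docstring's 2x2 example can be reproduced in plain NumPy. This is a hypothetical reimplementation for illustration (function name and helper are assumptions, not the op itself), covering only the simple case where the stride equals the window and the border is ignored:

```python
import numpy as np

def extract_neighbourhoods_2d(img, window):
    # Split a 2D array into non-overlapping windows and flatten each one,
    # mirroring the docstring's 2x2 example (stride == window, border dropped).
    wh, ww = window
    h, w = img.shape
    out = []
    for j in range(0, w - ww + 1, ww):
        out.append([img[i:i + wh, j:j + ww].ravel().tolist()
                    for i in range(0, h - wh + 1, wh)])
    return out

img = np.array([[0.5, 0.6, 0.7, 0.8],
                [0.1, 0.2, 0.3, 0.4]])
print(extract_neighbourhoods_2d(img, (2, 2)))
# [[[0.5, 0.6, 0.1, 0.2]], [[0.7, 0.8, 0.3, 0.4]]]
```

This matches the example output in the docstring above: each 2x2 group of pixels becomes one flattened row.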
theano/sandbox/rng_mrg.py
View file @ d1eba87d
"""
Implementation of MRG31k3p random number generator for Theano.

Generator code in SSJ package (L'Ecuyer & Simard).
http://www.iro.umontreal.ca/~simardr/ssj/indexe.html

"""
...
@@ -39,11 +39,14 @@ def matVecModM(A, s, m):
def multMatVect(v, A, m1, B, m2):
    """
    Multiply the first half of v by A with a modulo of m1 and the second half
    by B with a modulo of m2.

    Notes
    -----
    The parameters of dot_modulo are passed implicitly because passing them
    explicitly takes more time than running the function's C-code.

    """
    if multMatVect.dot_modulo is None:
        A_sym = tensor.lmatrix('A')
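The semantics of multMatVect can be sketched in plain NumPy (a toy illustration with hypothetical names, not the Theano op — the matrices below are toys, whereas the real generator uses the MRG31k3p transition matrices and matVecModM to avoid integer overflow):

```python
import numpy as np

def mult_mat_vect(v, A, m1, B, m2):
    # First half of v is transformed by A modulo m1, the second half by B
    # modulo m2 (no overflow guard here, unlike the real matVecModM).
    n = len(v) // 2
    return np.concatenate([A.dot(v[:n]) % m1, B.dot(v[n:]) % m2])

v = np.arange(1, 7, dtype=np.int64)   # [1, 2, 3, 4, 5, 6]
A = 2 * np.eye(3, dtype=np.int64)     # toy transition matrices
B = 3 * np.eye(3, dtype=np.int64)
print(mult_mat_vect(v, A, 7, B, 11))  # [2 4 6 1 4 7]
```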
...
@@ -76,7 +79,8 @@ class DotModulo(Op):
    Efficient and numerically stable implementation of a dot product followed
    by a modulo operation. This performs the same function as matVecModM.

    We do this 2 times on 2 triple inputs, concatenating the outputs.

    """
    __props__ = ()
...
@@ -1014,9 +1018,12 @@ def guess_n_streams(size, warn=False):
    """
    Return a guess at a good number of streams.

    Parameters
    ----------
    warn : bool, optional
        If True, warn when a guess cannot be made (in which case we
        return 60 * 256).

    """
    # TODO: a smart way of choosing the number of streams, see #612.
    # Note that this code was moved out of `MRG_RandomStreams` so that it can
...
@@ -1048,22 +1055,25 @@
class MRG_RandomStreams(object):
    """
    Module component with similar interface to numpy.random
    (numpy.random.RandomState).

    Parameters
    ----------
    seed : int or list of 6 int
        A default seed to initialize the random state.
        If a single int is given, it will be replicated 6 times.
        The first 3 values of the seed must all be less than M1 = 2147483647,
        and not all 0; and the last 3 values must all be less than
        M2 = 2147462579, and not all 0.

    """
    def updates(self):
        return list(self.state_updates)
    def __init__(self, seed=12345, use_cuda=None):
        # A list of pairs of the form (input_r, output_r), representing the
        # update rules of all the random states generated by this RandomStreams.
        self.state_updates = []
...
@@ -1107,14 +1117,18 @@ class MRG_RandomStreams(object):
            raise TypeError("seed should be 1 integer or 6 integers")

    def seed(self, seed=None):
        """
        Re-initialize each random stream.

        Parameters
        ----------
        seed : None or integer in range 0 to 2**30
            Each random stream will be assigned a unique state that depends
            deterministically on this value.

        Returns
        -------
        None

        """
        if seed is None:
...
@@ -1133,14 +1147,20 @@ class MRG_RandomStreams(object):
        old_r.set_value(rstates, borrow=True)

    def inc_rstate(self):
        """
        Update self.rstate to be skipped 2^134 steps forward to the next stream
        start.

        """
        # self.rstate = ff_2p134(self.rstate)
        self.rstate = multMatVect(self.rstate, A1p134, M1, A2p134, M2)
        assert self.rstate.dtype == numpy.int32
    def get_substream_rstates(self, n_streams, dtype, inc_rstate=True):
        """
        Initialize a matrix in which each row is a MRG stream state,
        and they are spaced by 2**72 samples.

        """
        assert isinstance(dtype, str)
        assert n_streams < 2**72
...
@@ -1198,27 +1218,25 @@ class MRG_RandomStreams(object):
        distribution between low and high.

        If the size argument is ambiguous on the number of dimensions,
        ndim may be a plain integer to supplement the missing information.

        Parameters
        ----------
        low
            Lower bound of the interval on which values are sampled.
            If the ``dtype`` arg is provided, ``low`` will be cast into
            dtype. This bound is excluded.
        high
            Higher bound of the interval on which values are sampled.
            If the ``dtype`` arg is provided, ``high`` will be cast into
            dtype. This bound is excluded.
        size
            Can be a list of integer or Theano variable (ex: the shape
            of other Theano Variable).
        dtype
            The output data type. If dtype is not specified, it will be
            inferred from the dtype of low and high, but will be at
            least as precise as floatX.

        """
        low = as_tensor_variable(low)
...
@@ -1300,15 +1318,17 @@ class MRG_RandomStreams(object):
        Example : pvals = [[.98, .01, .01], [.01, .98, .01]] will
        probably result in [[1,0,0],[0,1,0]].

        Notes
        -----
        -`size` and `ndim` are only there to keep the same signature as other
        uniform, binomial, normal, etc.
        TODO : adapt multinomial to take that into account

        -Does not do any value checking on pvals, i.e. there is no
        check that the elements are non-negative, less than 1, or
        sum to 1. Passing pvals = [[-2., 2.]] will result in
        sampling [[0, 0]].

        """
        if pvals is None:
            raise TypeError("You have to specify pvals")
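The one-hot draw described by the pvals example above amounts to an inverse-CDF lookup per row. A plain-NumPy sketch of that sampling rule (a hypothetical helper, not the Theano op, and with the same absence of value checking noted in the docstring):

```python
import numpy as np

def multinomial_one_hot(pvals, unis):
    # One one-hot draw per row: walk the cumulative probabilities and
    # place a 1 where the uniform sample falls.
    out = np.zeros_like(pvals)
    for i, (row, u) in enumerate(zip(pvals, unis)):
        j = np.searchsorted(np.cumsum(row), u)
        out[i, min(j, len(row) - 1)] = 1
    return out

pvals = np.array([[.98, .01, .01], [.01, .98, .01]])
sample = multinomial_one_hot(pvals, np.array([0.5, 0.5]))
print(sample)
# [[1. 0. 0.]
#  [0. 1. 0.]]
```

With uniform samples of 0.5, the most probable category wins in each row, matching the "probably result in [[1,0,0],[0,1,0]]" remark above.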
...
@@ -1342,17 +1362,17 @@ class MRG_RandomStreams(object):
    def normal(self, size, avg=0.0, std=1.0, ndim=None,
               dtype=None, nstreams=None):
        """
        Parameters
        ----------
        size
            Can be a list of integers or Theano variables (ex: the shape
            of another Theano Variable).
        dtype
            The output data type. If dtype is not specified, it will be
            inferred from the dtype of low and high, but will be at
            least as precise as floatX.
        nstreams
            Number of streams.

        """
        # We need an even number of ]0,1[ samples. Then we split them
...
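The comment above ("We need an even number of ]0,1[ samples. Then we split them") suggests a Box-Muller-style transform, where pairs of uniform samples are turned into pairs of normal samples. A plain-NumPy sketch of that idea (an illustration under that assumption, not the rng_mrg implementation):

```python
import numpy as np

def pairs_to_normal(u):
    # Box-Muller: each (u1, u2) pair of uniform ]0,1[ samples yields two
    # independent N(0, 1) draws.
    u1, u2 = u[0::2], u[1::2]
    r = np.sqrt(-2.0 * np.log(u1))
    theta = 2.0 * np.pi * u2
    return np.concatenate([r * np.cos(theta), r * np.sin(theta)])

rng = np.random.RandomState(0)
u = rng.uniform(1e-12, 1.0, size=10000)  # open interval: log(0) is avoided
z = pairs_to_normal(u)
print(z.shape)  # (10000,)
```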
theano/sandbox/scan.py
View file @ d1eba87d
...
@@ -49,13 +49,18 @@ def scan(fn,
    control over the scan op, avoiding certain difficulties that arose from
    missing optimizations.

    Parameters
    ----------
    fn
        Lambda function that describes one step of scan (see the
        official Theano scan function).
    sequences
        Similar to the official Theano's scan. This version
        of scan does not support taps for the sequences (it can only be a
        list of tensors). Scan assumes that sequences have the right length
        and it does not check for this.
    states
        Similar to outputs_info of the official scan function.
        There is one crucial difference though, namely that the `initial`
        key in the dictionary has been replaced by the 'membuf' key. This
        reflects the change of meaning. Instead of passing to scan just
...
@@ -72,37 +77,43 @@ def scan(fn,
    For states that do not require an initial state, one has to provide a
    dictionary with a single key 'steps' that says how many intermediate
    results to store. See examples below for more insight.
    n_steps
        This parameter is mandatory and it will represent the
        number of steps scan will do (scan will not check sequences or any
        other source of information to figure out how many steps it needs
        to do).
    mode
        Same as for the official scan.
    name
        Same as for the official scan.
    profile
        Same as for the official scan.

    Notes
    -----
    - There is no truncate / go_backwards anymore !
    - The outputs returned by scan contain the initial states as well (i.e.
      if I loop over k steps, with my smallest tap for an output -3 and keep
      all intermediate results, my output will be of length k+3).

    Examples
    --------
    (a) if you do not want to store any intermediate results (just the
    last one)::

        # The memory buffer can be the initial state, just that we need to
        # add one extra dimension in front of it
        state = TT.unbroadcast(TT.shape_padleft(x0), 0)
        out, _ = scan(lambda x: x + 1, states=state, n_steps=5)
        # Once we got our result we need to remove the extra dimension
        out = out[0]

    (b) if you want to keep every intermediate result::

        state = TT.alloc(TT.constant(0), 6, x0.shape[0])
        state = TT.set_subtensor(state[0], x0)
        out, _ = scan(lambda x: x + 1, states=state, n_steps=5)
        out = out[1:]

    """
    def wrap_into_list(x):
...
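The preallocated-buffer pattern from example (b) above can be mimicked in plain Python/NumPy (a toy analogue with hypothetical names, not the Theano scan op): the caller allocates the memory buffer, writes the initial state into row 0, and each step fills the next row.

```python
import numpy as np

def scan_with_buffer(fn, state, n_steps):
    # `state` is the caller-allocated memory buffer; row 0 already holds
    # the initial state, and each step writes the next row in place.
    for t in range(n_steps):
        state[t + 1] = fn(state[t])
    return state

x0 = np.array([1.0, 2.0])
buf = np.zeros((6, 2))   # room for the initial state plus 5 steps
buf[0] = x0
out = scan_with_buffer(lambda x: x + 1, buf, 5)[1:]  # drop the initial state
print(out[-1])  # [6. 7.]
```

This mirrors the docstring's point that the returned buffer contains the initial state as well, hence the final `[1:]` slice.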
theano/sandbox/solve.py
View file @ d1eba87d
...
@@ -13,9 +13,11 @@ from theano.tests import unittest_tools as utt
class Solve(gof.Op):
    """
    Find the solution to the linear equation Ax=b.
    A is a 2d matrix and b is a 1d or 2d matrix.
    It uses numpy.linalg.solve to find the solution.

    """
    # TODO: Add class options to use the performance-enhancing flags
...
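The underlying NumPy call that this op wraps is straightforward; a minimal worked example of solving Ax=b:

```python
import numpy as np

# Solve Ax = b for a 2d A and a 1d b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)
print(x)         # [2. 3.]
print(A.dot(x))  # recovers b: [9. 8.]
```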
theano/sandbox/test_rng_mrg.py
View file @ d1eba87d
...
@@ -75,10 +75,11 @@ def test_deterministic():

def test_consistency_randomstreams():
    """
    Verify that the random numbers generated by MRG_RandomStreams
    are the same as the reference (Java) implementation by L'Ecuyer et al.

    """
    seed = 12345
    n_samples = 5
    n_streams = 12
...
@@ -108,9 +109,11 @@ def test_consistency_randomstreams():

def test_consistency_cpu_serial():
    """
    Verify that the random numbers generated by mrg_uniform, serially,
    are the same as the reference (Java) implementation by L'Ecuyer et al.

    """
    seed = 12345
    n_samples = 5
    n_streams = 12
...
@@ -149,9 +152,11 @@ def test_consistency_cpu_serial():

def test_consistency_cpu_parallel():
    """
    Verify that the random numbers generated by mrg_uniform, in parallel,
    are the same as the reference (Java) implementation by L'Ecuyer et al.

    """
    seed = 12345
    n_samples = 5
    n_streams = 12
...
@@ -193,9 +198,11 @@ def test_consistency_cpu_parallel():

def test_consistency_GPU_serial():
    """
    Verify that the random numbers generated by GPU_mrg_uniform, serially,
    are the same as the reference (Java) implementation by L'Ecuyer et al.

    """
    if not cuda_available:
        raise SkipTest('Optional package cuda not available')
    if config.mode == 'FAST_COMPILE':
...
@@ -250,11 +257,12 @@ def test_consistency_GPU_serial():

def test_consistency_GPU_parallel():
    """
    Verify that the random numbers generated by GPU_mrg_uniform, in
    parallel, are the same as the reference (Java) implementation by
    L'Ecuyer et al.

    """
    if not cuda_available:
        raise SkipTest('Optional package cuda not available')
    if config.mode == 'FAST_COMPILE':
...
@@ -310,9 +318,11 @@ def test_consistency_GPU_parallel():

def test_GPU_nstreams_limit():
    """
    Verify that a ValueError is raised when n_streams
    is greater than 2**20 on GPU. This is the value of
    (NUM_VECTOR_OP_THREADS_PER_BLOCK * NUM_VECTOR_OP_BLOCKS).

    """
    if not cuda_available:
        raise SkipTest('Optional package cuda not available')
...
@@ -335,9 +345,11 @@ def test_GPU_nstreams_limit():

def test_consistency_GPUA_serial():
    """
    Verify that the random numbers generated by GPUA_mrg_uniform, serially,
    are the same as the reference (Java) implementation by L'Ecuyer et al.

    """
    from theano.sandbox.gpuarray.tests.test_basic_ops import \
        mode_with_gpu as mode
    from theano.sandbox.gpuarray.type import gpuarray_shared_constructor
...
@@ -387,11 +399,12 @@ def test_consistency_GPUA_serial():

def test_consistency_GPUA_parallel():
    """
    Verify that the random numbers generated by GPUA_mrg_uniform, in
    parallel, are the same as the reference (Java) implementation by
    L'Ecuyer et al.

    """
    from theano.sandbox.gpuarray.tests.test_basic_ops import \
        mode_with_gpu as mode
    from theano.sandbox.gpuarray.type import gpuarray_shared_constructor
...
@@ -855,6 +868,7 @@ def test_multiple_rng_aliasing():
    copy the (random) state between two similar theano graphs. The test is
    meant to detect a previous bug where state_updates was initialized as a
    class-attribute, instead of in the __init__ function.

    """
    rng1 = MRG_RandomStreams(1234)
    rng2 = MRG_RandomStreams(2392)
...
@@ -864,6 +878,7 @@ def test_multiple_rng_aliasing():

def test_random_state_transfer():
    """
    Test that random state can be transferred from one theano graph to another.

    """
    class Graph:
        def __init__(self, seed=123):
...
theano/sandbox/theano_object.py
View file @ d1eba87d
"""
DRAFT: TheanoObject

N.B. the gotcha with this design is listed in the documentation of
`TheanoObject`.

"""
from __future__ import print_function
...
@@ -10,7 +12,10 @@ import numpy
def theano_type(x):
    """
    Return a theano Type instance suitable for containing value `x`.

    """
    if type(x) is int:
        return tensor.lscalar
    else:
...
@@ -18,37 +23,47 @@ def theano_type(x):
class symbolic_fn_callable(object):
    """
    This is the class whose instance you get when you access a symbolic function
    in a `TheanoObject`.

    When you call a symbolic function (`symbolic_fn`) of a TheanoObject,
    the `__call__` of this class handles your request.

    You can also access the symbolic outputs and updates of a symbolic function
    through this class.

    Examples
    --------
    .. code-block:: python

        class T(TheanoObject):
            @symbolic_fn
            def add(self, x):
                ...
                add_outputs = ...
                add_updates = ...
                return RVal(add_outputs, add_updates)

        t = T()
        t.add.outputs(5)  # returns `add_outputs` from when `x=theano_type(5)`
        t.add.updates(5)  # returns `add_updates` from when `x=theano_type(5)`
        t.add.theano_function(5)  # returns the `Function` compiled when
                                  # `x=theano_type(5)`
        t.add(5)  # runs the `Function` compiled when `x=theano_type(5)`
                  # with arguments `(5,)`

    """
    def __init__(self, fn, mode):
        self.fn = fn
        self.mode = mode

    def on(self, o_self):
        """
        Silly method to work with symbolic_fn.__get__.

        """
        self.o_self = o_self
        return self
...
@@ -69,7 +84,9 @@ class symbolic_fn_callable(object):
class symbolic_fn(object):
    """
    A property-like class for decorating symbolic functions in `TheanoObject`.

    """
    def __init__(self, fn, mode=None):
        self.fn = fn
...
@@ -84,9 +101,11 @@ class symbolic_fn(object):
def symbolic_fn_opts(**kwargs):
    """
    Return a decorator for symbolic_functions in a `TheanoObject`.

    `kwargs` passed here are passed to `theano.function` via `symbolic_fn`.

    """
    def deco(f):
        return symbolic_fn(f, **kwargs)
...
@@ -94,16 +113,22 @@ def symbolic_fn_opts(**kwargs):
class RVal(object):
    """
    A Return-Value object for a `symbolic_fn`.

    """
    outputs = []
    """
    The method will compute values for the variables in this list.

    """
    updates = {}
    """The method will update module variables in this dictionary.

    For items ``(k,v)`` in this dictionary, ``k`` must be a `symbolic_member`
    of some module.
    On each call to this compiled function, the value of ``k`` will be replaced
    with the computed value of the Variable ``v``.

    """
    def __init__(self, outputs, updates=None):
...
@@ -115,54 +140,63 @@ class RVal(object):
 class TheanoObject(object):
-    """Base for Theano-supported classes
+    """
+    Base for Theano-supported classes.

     This class provides support for symbolic_fn class attributes.
-    These will be compiled on demand so that they can be used just like normal (non-symbolic)
-    methods.
+    These will be compiled on demand so that they can be used just like normal
+    (non-symbolic) methods.

-    The symbolic functions in a TheanoObject can share member variables that have been created
-    using the `symbolic_member` method.
+    The symbolic functions in a TheanoObject can share member variables that
+    have been created using the `symbolic_member` method.

-    :note: Other variables (ones not created using ``self.symbolic_member``) referred to in the
-        body of a symbolic function will *not* be shared between symbolic functions, or between
-        symbolic functions and this class. These other variables will be locked away in the
-        closure of a symbolic function when that function is compiled.
+    Notes
+    -----
+    Other variables (ones not created using ``self.symbolic_member``) referred
+    to in the body of a symbolic function will *not* be shared between symbolic
+    functions, or between symbolic functions and this class. These other
+    variables will be locked away in the closure of a symbolic function when
+    that function is compiled.

-    :warning: It is not recommended for code to interleave
+    .. warning:: It is not recommended for code to interleave
         (a) changes to non-symbolic instance variables with
         (b) calls to symbolic functions that use those instance variables.
-        A symbolic function may be
-        compiled multiple times because it must be compiled for each set of argument types.
-        Each time the function is compiled, the values of non-symbolic variables will be locked
-        into the compiled function. Subsequent changes to those non-symbolic instance variables
-        will not have any effect on the behaviour of the already-compiled symbolic function.
+        A symbolic function may be compiled multiple times because it must be
+        compiled for each set of argument types.
+        Each time the function is compiled, the values of non-symbolic variables
+        will be locked into the compiled function. Subsequent changes to those
+        non-symbolic instance variables will not have any effect on the
+        behaviour of the already-compiled symbolic function.

     :todo: Is there an efficient way of recognizing when a compiled symbolic
         function is stale, wrt the current values of the class's instance
         variables?

-        - One option is to re-evaluate symbolic functions symbolically and see if the graph can be
-        completely merged with the original graph. This is not fast enough to do all the time by
-        default though.
+        - One option is to re-evaluate symbolic functions symbolically and see
+        if the graph can be completely merged with the original graph. This is
+        not fast enough to do all the time by default though.

     """
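The pitfall in the warning above can be illustrated without Theano at all: once a plain Python value has been captured in a closure at "compile" time, later changes to the instance attribute no longer reach the already-built function. The class and method names below (`Model`, `compiled_scale_fn`) are hypothetical, not part of this module — this is a minimal sketch of the staleness behaviour the docstring describes.

```python
class Model:
    def __init__(self):
        self.scale = 2          # a non-symbolic instance variable
        self._compiled = None   # built lazily, like a symbolic_fn

    def compiled_scale_fn(self):
        # The current value of self.scale is locked into the closure here,
        # mirroring how compilation freezes non-symbolic variables.
        if self._compiled is None:
            scale = self.scale
            self._compiled = lambda x: scale * x
        return self._compiled

m = Model()
f = m.compiled_scale_fn()
assert f(3) == 6
m.scale = 10                          # changed after "compilation" ...
assert m.compiled_scale_fn()(3) == 6  # ... but the old value was captured
```

The second assertion is exactly the stale-function scenario: the attribute changed, the compiled function did not.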
     def __init__(self):
         self.module_method_cache = {}

     def _get_method_impl(self, fn, o_self, args, kwargs, mode):
-        """Retrieve information about the symbolic function (`fn`) in TheanoObject instance
-        `o_self`, being evaluated on arguments `args` and `kwargs`.
+        """
+        Retrieve information about the symbolic function (`fn`) in TheanoObject
+        instance `o_self`, being evaluated on arguments `args` and `kwargs`.

-        :rtype: dict with entries 'theano_function', 'outputs', 'updates'
-        :return: the theano function compiled for these arguments, the symbolic outputs of that
-            function, and the symbolic updates performed by that function.
+        Returns
+        -------
+        dict with entries 'theano_function', 'outputs', 'updates'
+            The theano function compiled for these arguments, the symbolic
+            outputs of that function, and the symbolic updates performed by
+            that function.

-        :note: This function caches return values in self.`module_method_cache`.
+        Notes
+        -----
+        This function caches return values in self.`module_method_cache`.

-        :todo: This may at some point become a class-level cache rather than an instance-level
-            cache.
+        :todo: This may at some point become a class-level cache rather than
+            an instance-level cache.

         """
         if kwargs:
             ...

@@ -203,15 +237,19 @@ class TheanoObject(object):
         return cache[key]
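The cache that `_get_method_impl` maintains — one compiled function per distinct combination of argument types, keyed in an instance-level dict — can be sketched in plain Python. `CachedDispatcher`, `_compile_for_types`, and `call` are illustrative names invented for this sketch, not the module's actual API; the stand-in "compilation" just picks a specialized lambda.

```python
class CachedDispatcher:
    def __init__(self):
        self.method_cache = {}  # analogous to module_method_cache

    def _compile_for_types(self, arg_types):
        # Stand-in for Theano compilation: build a function specialized
        # to this particular set of argument types.
        if arg_types == (int, int):
            return lambda a, b: a + b
        return lambda a, b: str(a) + str(b)

    def call(self, *args):
        key = tuple(type(a) for a in args)
        if key not in self.method_cache:
            # compile once per distinct set of argument types
            self.method_cache[key] = self._compile_for_types(key)
        return self.method_cache[key](*args)

d = CachedDispatcher()
assert d.call(1, 2) == 3          # compiles the (int, int) entry
assert d.call("a", "b") == "ab"   # compiles the (str, str) entry
assert len(d.method_cache) == 2   # one cached entry per type signature
```

Keying on argument types is also why the staleness warning above matters: every new type signature triggers another compilation, each freezing whatever non-symbolic values are current at that moment.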
     def symbolic_member(self, ival, name=None):
-        """Create a Variable instance to hold value `ival`.
+        """
+        Create a Variable instance to hold value `ival`.

         This function also immediately creates a Container object for ival.
-        When the returned Variable is used as input to a `TheanoObject` `symbolic_fn`, (but
-        does not appear as an argument to that symbolic_fn), then this Container will be used to
-        retrieve (and store) values for the Variable.
+        When the returned Variable is used as input to a `TheanoObject`
+        `symbolic_fn` (but does not appear as an argument to that symbolic_fn),
+        then this Container will be used to retrieve (and store) values for the
+        Variable.

-        This Variable's Container's contents can be retrieved by its `get()` method.
-        This Variable's Container's contents can be written using its `set(newval)` method.
+        This Variable's Container's contents can be retrieved by its `get()`
+        method.
+        This Variable's Container's contents can be written using its
+        `set(newval)` method.

         """
         if type(ival) is not int:
             ...
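The Container behaviour that `symbolic_member` relies on — a shared storage cell whose contents are read with `get()` and written with `set(newval)`, so that several symbolic functions observe the same member — can be sketched as follows. This is an illustrative stand-in written for this sketch, not Theano's actual `Container` class.

```python
class Container:
    """Minimal shared storage cell, illustrating get()/set() semantics."""
    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def set(self, newval):
        self._value = newval

# Two "symbolic functions" sharing one member through its container:
member = Container(5)

def inc():
    return member.get() + 1

def dec():
    return member.get() - 1

assert inc() == 6 and dec() == 4
member.set(100)                    # both functions see the update
assert inc() == 101 and dec() == 99
```

This shared-cell pattern is what distinguishes `symbolic_member` variables from the closure-captured ones described in the class docstring's warning: writes through `set()` are visible to every function that reads through the same container.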