Commit 8c86ef0d authored by Pascal Lamblin

Merge pull request #2147 from nouiz/mixed

[BUG] Fix TypeList missing view_map and fix crashes.
...@@ -140,10 +140,12 @@ default values.

.. method:: may_share_memory(a, b)

Optional to run, but mandatory for DebugMode. Return True if the python
objects `a` and `b` could share memory. Return False
otherwise. It is used to debug when Ops didn't declare memory
aliasing between variables. Can be a static method.
It is highly recommended to implement, and is mandatory for Types in
Theano, as our buildbot runs in DebugMode.

For each method, the *default* is what ``Type`` defines
for you. So, if you create an instance of ``Type`` or an
......
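For a Type whose values are numpy ndarrays, a minimal `may_share_memory` could simply delegate to `numpy.may_share_memory`. This is a sketch only; `MyNdarrayType` is a hypothetical name, not a Theano class:

```python
import numpy

class MyNdarrayType(object):
    """Illustrative Type whose values are numpy ndarrays."""

    @staticmethod
    def may_share_memory(a, b):
        # Only ndarrays can alias each other here; anything else
        # (None, Python scalars, ...) is assumed not to share memory.
        if isinstance(a, numpy.ndarray) and isinstance(b, numpy.ndarray):
            return numpy.may_share_memory(a, b)
        return False

x = numpy.zeros(4)
view = x[1:]  # a slice is a view: it shares memory with its base
assert MyNdarrayType.may_share_memory(x, view)
assert not MyNdarrayType.may_share_memory(x, x.copy())
```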
...@@ -31,10 +31,11 @@ TODO: Give examples on how to use these things! They are pretty complicated.

with batches of multi-channel 2D images, available for CPU and GPU.
Most of the more efficient GPU implementations listed below can be used
as an automatic replacement for nnet.conv2d by enabling specific graph
optimizations. It flips the kernel.

- :func:`conv2d_fft <theano.sandbox.cuda.fftconv.conv2d_fft>` This
is a GPU-only version of nnet.conv2d that uses an FFT transform
to perform the work. It flips the kernel, like ``conv2d``.
conv2d_fft should not be used directly as
it does not provide a gradient. Instead, use nnet.conv2d and
allow Theano's graph optimizer to replace it by the FFT version
by setting
...@@ -64,7 +65,7 @@ TODO: Give examples on how to use these things! They are pretty complicated.

<http://deeplearning.net/software/pylearn2/library/linear.html>`_
implementation, but it can also be used `directly from within Theano
<http://benanne.github.io/2014/04/03/faster-convolutions-in-theano.html>`_
as a manual replacement for nnet.conv2d. It does not flip the kernel.

- :func:`GpuCorrMM <theano.sandbox.cuda.blas.GpuCorrMM>`
This is a GPU-only 2d correlation implementation taken from
`caffe <https://github.com/BVLC/caffe/blob/master/src/caffe/layers/conv_layer.cu>`_
...@@ -103,7 +104,7 @@ TODO: Give examples on how to use these things! They are pretty complicated.

- :func:`conv3D <theano.tensor.nnet.Conv3D.conv3D>`
3D Convolution applying multi-channel 3D filters to batches of
multi-channel 3D images. It does not flip the kernel.

- :func:`conv3d_fft <theano.sandbox.cuda.fftconv.conv3d_fft>`
GPU-only version of conv3D using FFT transform. conv3d_fft should
not be called directly as it does not provide a gradient.
...@@ -124,7 +125,8 @@ TODO: Give examples on how to use these things! They are pretty complicated.

- :func:`conv3d2d <theano.tensor.nnet.conv3d2d.conv3d>`
Another conv3d implementation that uses conv2d with data reshaping.
It is faster in some cases than conv3d, and works on the GPU.
It flips the kernel.

.. autofunction:: theano.tensor.nnet.conv.conv2d
.. autofunction:: theano.sandbox.cuda.fftconv.conv2d_fft
......
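The "flips the kernel" distinction running through the list above boils down to convolution vs. cross-correlation. A tiny NumPy sketch (not Theano's implementation) shows that true convolution is correlation with the kernel flipped along both axes:

```python
import numpy

def corr2d_valid(img, k):
    """Plain 2D cross-correlation, 'valid' mode (no kernel flip)."""
    kh, kw = k.shape
    oh = img.shape[0] - kh + 1
    ow = img.shape[1] - kw + 1
    out = numpy.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

def conv2d_valid(img, k):
    """True convolution: correlate with the kernel flipped on both axes."""
    return corr2d_valid(img, k[::-1, ::-1])

img = numpy.arange(16.).reshape(4, 4)
k = numpy.array([[1., 2.], [3., 4.]])
# For a non-symmetric kernel the two operations disagree...
assert not numpy.allclose(conv2d_valid(img, k), corr2d_valid(img, k))
# ...and flipping the kernel converts one into the other.
assert numpy.allclose(conv2d_valid(img, k), corr2d_valid(img, k[::-1, ::-1]))
```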
...@@ -666,14 +666,26 @@ def _optcheck_fgraph(input_specs, output_specs, accept_inplace=False):

return fgraph, map(SymbolicOutput, updates), equivalence_tracker
class DataDestroyed():
# This is a singleton class. We put it in the storage_map when the
# variable's value has been destroyed, to prevent reusing a bad
# value for it.
pass

data_destroyed = DataDestroyed()
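The diff introduces a sentinel object rather than storing None, so that "destroyed" can be told apart from "never computed". The pattern in isolation, with a simplified stand-in for the real storage_map:

```python
class DataDestroyed(object):
    # Singleton sentinel: placed in the storage_map when a variable's
    # value has been destroyed, so the stale value cannot be reused.
    pass

data_destroyed = DataDestroyed()

# Each variable maps to a one-element storage cell, as in Theano.
storage_map = {'r': [None], 's': [42]}

# An op with a destroy_map clobbers 's'; mark it explicitly:
storage_map['s'][0] = data_destroyed

# 'is' checks now distinguish "destroyed" from "never computed" (None):
assert storage_map['s'][0] is data_destroyed
assert storage_map['r'][0] is None
```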
def _check_inputs(node, storage_map, r_vals, dr_vals, active_nodes,
clobber_dr_vals=True,
perform=None, warn_input_not_reused=True):
"""Raise BadDestroyMap if necessary, update dr_vals

Returns a list of output variables that actually worked inplace
(their value is aliased to the value of at least one input).

It modifies the storage_map to remove node.inputs variables that
have been destroyed.
"""
destroyed_idx_list = []
destroy_map = getattr(node.op, 'destroy_map', {})
...@@ -736,7 +748,8 @@ def _check_inputs(node, storage_map, r_vals, dr_vals, active_nodes,

raise Exception('failure in topological ordering')
if clobber_dr_vals:
dr_vals[r] = (storage_map[r][0], node)  # no copy, this is the last use of this variable
# make sure that dr_vals[r] doesn't get used again
storage_map[r][0] = data_destroyed
else:
raise BadDestroyMap(node, r_idx, r_vals[r],
storage_map[r][0], perform)
...@@ -766,8 +779,15 @@ def _check_viewmap(node, storage_map):

# case...
for ii, inode in enumerate(node.inputs):
in_storage = storage_map[inode][0]
if in_storage is data_destroyed:
# If the input has been destroyed, it can't be a view,
# so there is no need to check. Also, we don't have the
# original value, so we wouldn't be able to perform this
# check anyway.
continue
if hasattr(inode.type, 'may_share_memory') and\
inode.type.may_share_memory(outstorage, in_storage):
nodeid = id(inode)
bad_alias[nodeid] = ii
......
...@@ -16,7 +16,8 @@ import theano

from theano import gof
from theano.gof.python25 import partial
import theano.compile.mode
from theano.compile.io import (
In, SymbolicInput, SymbolicInputKit, SymbolicOutput)
from theano.compile.ops import deep_copy_op, view_op
import logging
...@@ -29,15 +30,20 @@ class UnusedInputError(Exception):

"""
pass

def alias_root(v):
"""Return the variable to which v is aliased by view_maps and destroy_maps"""
if v.owner is None:
return v
vmap = getattr(v.owner.op, 'view_map', {})
dmap = getattr(v.owner.op, 'destroy_map', {})
outpos = v.owner.outputs.index(v)
v_views = vmap.get(outpos, []) + dmap.get(outpos, [])
if len(v_views) > 1:
raise NotImplementedError(
str(v) + " is a view/destroyed version of more than one input. "
"Currently, we only support the case where an output is a view or "
"a destroyed version of one input.")
elif v_views:
return alias_root(v.owner.inputs[v_views[0]])
else:
...@@ -106,10 +112,11 @@ class Supervisor:

return True
for r in self.protected + list(fgraph.outputs):
if fgraph.destroyers(r):
raise gof.InconsistencyError(
"Trying to destroy a protected Variable.", r)

def std_fgraph(input_specs, output_specs, accept_inplace=False):
"""
Makes a FunctionGraph corresponding to the input specs and the output
specs. Any SymbolicInput in the input_specs, if its update field
...@@ -134,7 +141,8 @@ def std_fgraph(input_specs, output_specs, accept_inplace=False):

for node in fgraph.apply_nodes:
if getattr(node.op, 'destroy_map', None):
if not accept_inplace:
raise TypeError("Graph must not contain inplace operations",
node, node.op)
else:
fgraph.attach_feature(gof.DestroyHandler())
break
...@@ -155,6 +163,7 @@ def std_fgraph(input_specs, output_specs, accept_inplace=False):

std_fgraph.features = [gof.toolbox.PreserveNames]

class AliasedMemoryError(Exception):
"""Memory is aliased that should not be"""
pass
......
...@@ -317,6 +317,12 @@ class Shape_i(gof.Op):

__props__ = ("i",)

def __init__(self, i):
# As i will be used in the hash, and ndarrays are not hashable,
# we need to convert it to an int, which is hashable.
if isinstance(i, numpy.ndarray):
assert "int" in str(i.dtype)
assert i == int(i)
i = int(i)
self.i = i

def __str__(self):
......
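The conversion added to Shape_i.__init__ matters because ndarrays are unhashable, while `__props__`-based hashing needs a hashable `i`. A standalone illustration of the same guard:

```python
import numpy

i = numpy.array(2)            # e.g. an index that arrived as a 0-d ndarray
assert "int" in str(i.dtype)  # same dtype guard as in the diff
assert i == int(i)            # the conversion loses nothing

try:
    hash(i)
    hashable = True
except TypeError:             # ndarrays are not hashable
    hashable = False
assert not hashable

i = int(i)                    # plain ints hash fine, so __props__ works
assert hash(i) == hash(2)
```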
...@@ -44,6 +44,14 @@ class MyType(Type):

raise ValueError("Invalid value")
return x

# Added to make those tests pass in DebugMode
@staticmethod
def may_share_memory(a, b):
# As this represents a string and strings are immutable, they
# never share memory in the DebugMode sense. This is needed as
# Python reuses strings internally.
return False

class MyOp(Op):
......
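Returning False above is sound because strings are immutable: even when CPython interns two equal strings into a single object, no write through one can corrupt the other. A quick illustration (interning is a CPython implementation detail, so `sys.intern` is used to make the reuse explicit):

```python
import sys

a = sys.intern("theano_type_value")
b = sys.intern("theano_type_value")

# One object, "shared" in the identity sense...
assert a is b
# ...but immutability means neither can be mutated through the other,
# so may_share_memory can soundly return False for string-valued Types.
assert a == "theano_type_value"
```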
...@@ -2516,10 +2516,7 @@ class GpuAdvancedIncSubtensor1(tensor.AdvancedIncSubtensor1, GpuOp):

if x_.type.ndim == 0:
raise TypeError('cannot index into a scalar')
return Apply(self, [x_, y_, ilist_], [x_.type()])

# CudaNdarray_Subscript() doesn't support Advanced slicing.
# But we can't use the parent version that loops on each index
...@@ -2685,10 +2682,7 @@ class GpuAdvancedIncSubtensor1_dev20(GpuAdvancedIncSubtensor1):

if x_.type.ndim == 0:
raise TypeError('cannot index into a scalar')
return Apply(self, [x_, y_, ilist_], [x_.type()])

def c_code_cache_version(self):
return (2,)
......
...@@ -7,7 +7,7 @@ from theano.gof.type import CDataType

from theano.compat import PY3
from theano.compat.six import StringIO
from theano.sandbox.cuda.type import CudaNdarrayType
from theano.sandbox.cuda import GpuOp, active_device_number, device_properties
from theano.sandbox.cuda.basic_ops import (as_cuda_ndarray_variable,
gpu_contiguous)
from theano.sandbox.cuda.blas import GpuConv
...@@ -16,6 +16,23 @@ from theano.sandbox.cuda.nnet import GpuSoftmax

from theano.sandbox.cuda.nvcc_compiler import NVCC_compiler

def dnn_available():
if dnn_available.avail is None:
dev = active_device_number()
if device_properties(dev)['major'] < 3:
dnn_available.msg = "Device not supported by cuDNN"
dnn_available.avail = False
else:
dnn_available.msg = "Can not find the cuDNN library"
dnn_available.avail = theano.gof.cmodule.GCC_compiler.try_flags(
["-l", "cudnn"])
return dnn_available.avail
dnn_available.avail = None
dnn_available.msg = None

class DnnBase(GpuOp):
"""
Creates a handle for cudnn and pulls in the cudnn libraries and headers.
......
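dnn_available caches its verdict in attributes on the function object itself, so the expensive probe runs only once per process. The pattern in isolation; `expensive_probe` is a stand-in for the real device/compiler checks:

```python
def feature_available():
    # Compute once, then reuse the cached result and message.
    if feature_available.avail is None:
        ok = expensive_probe()  # stand-in for device/compiler checks
        feature_available.msg = None if ok else "probe failed"
        feature_available.avail = ok
    return feature_available.avail

feature_available.avail = None
feature_available.msg = None

calls = []

def expensive_probe():
    calls.append(1)
    return True

assert feature_available() is True
assert feature_available() is True  # second call hits the cache
assert len(calls) == 1              # probe ran only once
```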
...@@ -27,6 +27,9 @@ class GpuCrossentropySoftmaxArgmax1HotWithBias(GpuOp):

def make_node(self, x, b, y_idx):
#N.B. won't work when we don't cast y_idx to float anymore
x = as_cuda_ndarray_variable(x)
b = as_cuda_ndarray_variable(b)
y_idx = as_cuda_ndarray_variable(y_idx)
nll = y_idx.type()
sm = x.type()
am = y_idx.type()
...@@ -237,6 +240,9 @@ class GpuCrossentropySoftmax1HotWithBiasDx(GpuOp):

return self.__class__.__name__

def make_node(self, dy, sm, y_idx):
dy = as_cuda_ndarray_variable(dy)
sm = as_cuda_ndarray_variable(sm)
y_idx = as_cuda_ndarray_variable(y_idx)
return Apply(self, [dy, sm, y_idx], [sm.type()])

def c_code_cache_version(self):
...@@ -379,6 +385,7 @@ class GpuSoftmax(GpuOp):

return self.__class__.__name__

def make_node(self, x):
x = as_cuda_ndarray_variable(x)
return Apply(self, [x], [x.type()])

def infer_shape(self, node, shape):
...@@ -543,6 +550,7 @@ class GpuSoftmaxWithBias(GpuOp):

return self.__class__.__name__

def make_node(self, x, b):
x = as_cuda_ndarray_variable(x)
return Apply(self, [x, b], [x.type()])

def infer_shape(self, node, shape):
......
...@@ -578,8 +578,8 @@ def test_gemm_valid():

def test_dnn_valid():
if not cuda.dnn.dnn_available():
raise SkipTest(cuda.dnn.dnn_available.msg)
for t in _test_valid(GpuDnnConv, mode=theano_mode.including("cudnn")):
yield t
...@@ -659,8 +659,8 @@ def test_gemm_full():

def test_dnn_full():
if not cuda.dnn.dnn_available():
raise SkipTest(cuda.dnn.dnn_available.msg)
for t in _test_full(GpuDnnConv, mode=theano_mode.including("cudnn")):
yield t
...@@ -711,8 +711,8 @@ def test_gemm_subsample():

def test_dnn_subsample():
if not cuda.dnn.dnn_available():
raise SkipTest(cuda.dnn.dnn_available.msg)
for t in _test_subsample(GpuDnnConv, theano_mode.including('cudnn')):
yield t
...@@ -909,6 +909,10 @@ def conv_grad(mode, bs, ch, nf, rImg1, rImg2, rFlt1, rFlt2, subsample, op):

def test_conv_grads():
if cuda.device_properties(cuda.active_device_number())['major'] < 3:
ops = [gemm_op]
else:
ops = [gemm_op, dnn_op]
for mode in 'valid', 'full':
for bs in [1, 5]:
for ch in [4]:
...@@ -918,7 +922,7 @@ def test_conv_grads():

for rFlt1 in [1, 2]:
for rFlt2 in [1, 2]:
for subsample in (1, 1), (1, 2), (2, 2):
for op in ops:
yield (conv_grad, mode, bs, ch, nf,
rImg1, rImg2, rFlt1, rFlt2,
subsample, op)
......
...@@ -301,6 +301,9 @@ class test_SoftMax(unittest.TestCase):

self._cmp(0, 10, f, f_gpu)

def test_cudnn_softmax(self):
if not cuda.dnn.dnn_available():
raise SkipTest(cuda.dnn.dnn_available.msg)

def cmp(n, m, f, f_gpu):
data = numpy.arange(n * m, dtype='float32').reshape(n, m)
gdata = numpy.asarray(data)[:, :, None, None]
......
...@@ -174,9 +174,9 @@ def conv3d(signals, filters,

:param filters_shape: None or a tuple/list with the shape of filters
:param border_mode: The only one tested is 'valid'.

:note: Another way to define signals: (batch, time, in channel, row, column)
Another way to define filters: (out channel, time, in channel, row, column)

:note: See `conv3d_fft`_ or `conv3d2d`_ for GPU implementations.

:see: Someone made a script that shows how to swap the axes between
both 3d convolution implementations in Theano. See the last
......
...@@ -822,9 +822,17 @@ class ShapeFeature(object):

s_i.owner.inputs[0].owner and
isinstance(s_i.owner.inputs[0].owner.op, T.Shape)):
assert s_i.ndim == 0
assert len(s_i.owner.op.idx_list) == 1

# The current Subtensor always puts constant indices in the graph.
# This was not true in the past, so call the Subtensor function
# that will return the right index.
idx = theano.tensor.subtensor.get_idx_list(s_i.owner.inputs,
s_i.owner.op.idx_list)
assert len(idx) == 1
idx = idx[0]

try:
i = get_scalar_constant_value(idx)
s_i = Shape_i(i)(s_i.owner.inputs[0].owner.inputs[0])
except NotScalarConstantError:
pass
......
...@@ -1737,9 +1737,8 @@ class AdvancedIncSubtensor1(Op):

'cannot %s x subtensor with ndim=%s'
' by y with ndim=%s to x subtensor with ndim=%s ' % (
opname, x_.type.ndim, y_.type.ndim))

return Apply(self, [x_, y_, ilist_], [x_.type()])

def perform(self, node, inp, out_):
# TODO opt to make this inplace
......
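The simplification to `x_.type()` above reflects that incrementing rows of x selected by ilist yields an array with x's own shape and type, not one derived from ilist. A NumPy sketch of the semantics, with `np.add.at` standing in for the Op's perform:

```python
import numpy as np

x = np.zeros((3, 4))
y = np.ones((2, 4))
ilist = np.array([0, 2])

out = x.copy()
np.add.at(out, ilist, y)  # out[ilist] += y, safe with repeated indices

assert out.shape == x.shape          # output mirrors x, not ilist
assert np.allclose(out[ilist], 1.0)  # selected rows were incremented
assert np.allclose(out[1], 0.0)      # untouched rows unchanged
```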
...@@ -64,6 +64,11 @@ class T_extending(unittest.TestCase):

def values_eq_approx(self, x, y, tolerance=1e-4):
return abs(x - y) / (abs(x) + abs(y)) < tolerance
# Added to make those tests pass in DebugMode
@staticmethod
def may_share_memory(a, b):
return a is b
double = Double()

...@@ -87,6 +92,11 @@ class T_extending(unittest.TestCase):

def __str__(self):
return "double"
# Added to make those tests pass in DebugMode
@staticmethod
def may_share_memory(a, b):
return a is b
double = Double()

...@@ -194,6 +204,11 @@ class T_extending(unittest.TestCase):

def __str__(self):
return "double"
# Added to make those tests pass in DebugMode
@staticmethod
def may_share_memory(a, b):
return a is b
double = Double()

class BinaryDoubleOp(gof.Op):

...@@ -341,6 +356,11 @@ class T_extending(unittest.TestCase):

def c_cleanup(self, name, sub):
return ""
# Added to make those tests pass in DebugMode
@staticmethod
def may_share_memory(a, b):
return a is b
double = Double()
......
import copy
import numpy

from type import TypedListType
import theano
from theano.gof import Apply, Constant, Op, Variable
from theano.tensor.type_other import SliceType
from theano import tensor as T
from theano.compile.debugmode import _lessbroken_deepcopy

class _typed_list_py_operators:
...@@ -34,7 +36,7 @@ class _typed_list_py_operators:

def count(self, elem):
return count(self, elem)

# name "index" is already used by an attribute
def ind(self, elem):
return index_(self, elem)
...@@ -51,6 +53,8 @@ TypedListType.Variable = TypedListVariable

class GetItem(Op):
# See doc in instance of this Op or function after this class definition.
view_map = {0: [0]}

def __eq__(self, other):
return type(self) == type(other)
...@@ -112,6 +116,13 @@ class Append(Op):

self.inplace = inplace
if self.inplace:
self.destroy_map = {0: [0]}
# TODO: make destroy_handler support having views and
# destroyed versions of multiple inputs.
# self.view_map = {0: [1]}
else:
# TODO: make destroy_handler support multiple views.
# self.view_map = {0: [0, 1]}
self.view_map = {0: [0]}
def __eq__(self, other):
return type(self) == type(other) and self.inplace == other.inplace

...@@ -129,12 +140,15 @@ class Append(Op):

out[0] = list(x)
else:
out[0] = x
# need to copy toAppend due to destroy_handler limitation
toAppend = _lessbroken_deepcopy(toAppend)
out[0].append(toAppend)

def __str__(self):
return self.__class__.__name__
# DISABLED AS WE NEED TO UPDATE IT TO COPY toAppend().
def _c_code_(self, node, name, inp, out, sub):
x_name, toAppend = inp[0], inp[1]
output_name = out[0]
fail = sub['fail']
...@@ -174,6 +188,13 @@ class Extend(Op):

self.inplace = inplace
if self.inplace:
self.destroy_map = {0: [0]}
# TODO: make destroy_handler support having views and
# destroyed version of multiple inputs.
# self.view_map = {0: [1]}
else:
# TODO: make destroy_handler support multiple view
# self.view_map = {0: [0, 1]}
self.view_map = {0: [0]}
def __eq__(self, other):
return type(self) == type(other) and self.inplace == other.inplace

...@@ -191,12 +212,17 @@ class Extend(Op):

out[0] = list(x)
else:
out[0] = x
# need to copy toAppend due to destroy_handler limitation
if toAppend:
o = out[0]
for i in toAppend:
o.append(_lessbroken_deepcopy(i))
def __str__(self):
return self.__class__.__name__

# DISABLED AS WE NEED TO UPDATE IT TO COPY toAppend().
def _c_code_(self, node, name, inp, out, sub):
x_name, toAppend = inp[0], inp[1]
output_name = out[0]
fail = sub['fail']
...@@ -222,7 +248,7 @@ class Extend(Op):

Py_INCREF(%(output_name)s);
""" % locals()

def c_code_cache_version_(self):
return (1,)

extend = Extend()
...@@ -240,6 +266,13 @@ class Insert(Op):

self.inplace = inplace
if self.inplace:
self.destroy_map = {0: [0]}
# TODO: make destroy_handler support having views and
# destroyed version of multiple inputs.
# self.view_map = {0: [2]}
else:
# TODO: make destroy_handler support multiple view
# self.view_map = {0: [0, 2]}
self.view_map = {0: [0]}
def __eq__(self, other):
return type(self) == type(other) and self.inplace == other.inplace

...@@ -262,12 +295,15 @@ class Insert(Op):

out[0] = list(x)
else:
out[0] = x
# need to copy toInsert due to destroy_handler limitation
toInsert = _lessbroken_deepcopy(toInsert)
out[0].insert(index, toInsert)
def __str__(self):
return self.__class__.__name__

# DISABLED AS WE NEED TO UPDATE IT TO COPY toInsert().
def _c_code_(self, node, name, inp, out, sub):
x_name, index, toInsert = inp[0], inp[1], inp[2]
output_name = out[0]
fail = sub['fail']
...@@ -308,6 +344,8 @@ class Remove(Op):

self.inplace = inplace
if self.inplace:
self.destroy_map = {0: [0]}
else:
self.view_map = {0: [0]}
def __eq__(self, other):
return type(self) == type(other) and self.inplace == other.inplace
...@@ -360,6 +398,8 @@ class Reverse(Op):

self.inplace = inplace
if self.inplace:
self.destroy_map = {0: [0]}
else:
self.view_map = {0: [0]}
def __eq__(self, other):
return type(self) == type(other) and self.inplace == other.inplace
......
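The _lessbroken_deepcopy calls added throughout this file guard against exactly the aliasing shown below; plain copy.deepcopy stands in here for Theano's internal helper:

```python
import copy

elem = [3, 4]

aliased = [[1, 2]]
aliased.append(elem)          # no copy: output aliases the input element
elem[0] = 99                  # "destroying" the input...
assert aliased[1][0] == 99    # ...silently changes the output too

elem = [3, 4]
safe = [[1, 2]]
safe.append(copy.deepcopy(elem))  # copy first, as Append.perform now does
elem[0] = 99
assert safe[1][0] == 3            # output is insulated from the input
```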
...@@ -21,13 +21,13 @@ except ImportError:

scipy_imported = False

# taken from tensors/tests/test_basic.py
def rand_ranged_matrix(minimum, maximum, shape):
return numpy.asarray(numpy.random.rand(*shape) * (maximum - minimum)
+ minimum, dtype=theano.config.floatX)

# taken from sparse/tests/test_basic.py
def random_lil(shape, dtype, nnz):
rval = sp.lil_matrix(shape, dtype=dtype)
huge = 2 ** 30
...@@ -35,7 +35,7 @@ def random_lil(shape, dtype, nnz): ...@@ -35,7 +35,7 @@ def random_lil(shape, dtype, nnz):
# set non-zeros in random locations (row x, col y) # set non-zeros in random locations (row x, col y)
idx = numpy.random.random_integers(huge, size=2) % shape idx = numpy.random.random_integers(huge, size=2) % shape
value = numpy.random.rand() value = numpy.random.rand()
#if dtype *int*, value will always be zeros! # if dtype *int*, value will always be zeros!
if "int" in dtype: if "int" in dtype:
value = int(value * 100) value = int(value * 100)
    # The call to tuple is needed as scipy 0.13.1 does not support
@@ -84,8 +84,9 @@ class test_get_item(unittest.TestCase):
        x = rand_ranged_matrix(-1000, 1000, [100, 101])
        y = rand_ranged_matrix(-1000, 1000, [100, 101])
        self.assertTrue(numpy.array_equal(f([x],
                                            numpy.asarray(0, dtype='int64')),
                                          x))

    def test_interface(self):
        mySymbolicMatricesList = TypedListType(T.TensorType(
@@ -99,8 +100,9 @@ class test_get_item(unittest.TestCase):
        x = rand_ranged_matrix(-1000, 1000, [100, 101])
        self.assertTrue(numpy.array_equal(f([x],
                                            numpy.asarray(0, dtype='int64')),
                                          x))

        z = mySymbolicMatricesList[0]
@@ -258,8 +260,10 @@ class test_insert(unittest.TestCase):
        y = rand_ranged_matrix(-1000, 1000, [100, 101])
        self.assertTrue(numpy.array_equal(f([x],
                                            numpy.asarray(1, dtype='int64'),
                                            y),
                                          [x, y]))

    def test_sanity_check(self):
        mySymbolicMatricesList = TypedListType(T.TensorType(
@@ -292,8 +296,10 @@ class test_insert(unittest.TestCase):
        y = rand_ranged_matrix(-1000, 1000, [100, 101])
        self.assertTrue(numpy.array_equal(f([x],
                                            numpy.asarray(1, dtype='int64'),
                                            y),
                                          [x, y]))

class test_remove(unittest.TestCase):
@@ -443,8 +449,8 @@ class test_index(unittest.TestCase):
    def test_sparse(self):
        if not scipy_imported:
            raise SkipTest('Optional package SciPy not installed')
        mySymbolicSparseList = TypedListType(
            sparse.SparseType('csr', theano.config.floatX))()
        mySymbolicSparse = sparse.csr_matrix()

        z = Index()(mySymbolicSparseList, mySymbolicSparse)
@@ -509,8 +515,8 @@ class test_count(unittest.TestCase):
    def test_sparse(self):
        if not scipy_imported:
            raise SkipTest('Optional package SciPy not installed')
        mySymbolicSparseList = TypedListType(
            sparse.SparseType('csr', theano.config.floatX))()
        mySymbolicSparse = sparse.csr_matrix()

        z = Count()(mySymbolicSparseList, mySymbolicSparse)
@@ -76,6 +76,21 @@ class TypedListType(gof.Type):
        return True
    def may_share_memory(self, a, b):
        if a is b:
            return True
        # As a list can contain other elements, if a or b isn't a
        # list, we still need to check whether that element shares
        # memory with any element of the other list.
        if not isinstance(a, list):
            a = [a]
        if not isinstance(b, list):
            b = [b]
        for idx1 in range(len(a)):
            for idx2 in range(len(b)):
                if self.ttype.may_share_memory(a[idx1], b[idx2]):
                    return True
        return False
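The `may_share_memory` method added above reduces the list case to pairwise element checks: wrap any bare element in a one-item list, then report aliasing if any pair of elements may share memory. A rough standalone sketch of the same logic, using `numpy.may_share_memory` as a stand-in for `self.ttype.may_share_memory` (the function name here is illustrative, not Theano's API):

```python
import numpy

def list_may_share_memory(a, b, elem_check=numpy.may_share_memory):
    # Same reduction as TypedListType.may_share_memory: wrap bare
    # elements so a single array can be checked against a whole list,
    # then test every pair of elements.
    if a is b:
        return True
    if not isinstance(a, list):
        a = [a]
    if not isinstance(b, list):
        b = [b]
    for x in a:
        for y in b:
            if elem_check(x, y):
                return True
    return False

base = numpy.arange(6)
view = base[2:]              # a view aliasing base's buffer
fresh = numpy.arange(6)      # an independent buffer
assert list_may_share_memory([base], view)
assert not list_may_share_memory([base], [fresh])
```

DebugMode relies on this check to catch Ops that touch aliased buffers without declaring it, which is why the docs recommend implementing it for every Type.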
    def c_declare(self, name, sub, check_input=True):
        return """
        PyListObject* %(name)s;