Commit 80b00e7f authored by David Warde-Farley

Merge pull request #497 from delallea/minor

Minor stuff
@@ -110,7 +110,7 @@ Deprecation (will be removed in Theano 0.5, warning generated if you use them):
    (list/tuple/TensorVariable).
  * Currently tensor.grad returns a list when wrt is a list/tuple of
    more than 1 element.
Deprecated in 0.4.0 (reminder: warning generated if you use them):
...
.. _NEWS:
Updates in the Trunk since the last release:
Documentation
 * Added tutorial documentation on how to extend Theano.
@@ -8,7 +8,7 @@ Documentation
   http://deeplearning.net/software/theano/tutorial/extending_theano.html
   (Frédéric B.)
Interface changes
 * theano.function does not accept duplicate inputs, so function([x, x], ...)
   does not work anymore. (Pascal L.)
 * theano.function now raises an error if some of the provided inputs are
@@ -17,14 +17,14 @@ Interface change
   ``on_unused_input={'raise', 'warn', 'ignore'}`` to control this.
   (Pascal L.)
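The ``on_unused_input={'raise', 'warn', 'ignore'}`` policy described above can be sketched in plain Python. This is a hypothetical helper for illustration only, not Theano's implementation; the name `check_unused_inputs` and its signature are assumptions:

```python
import warnings


def check_unused_inputs(inputs, used, on_unused_input="raise"):
    """Apply a {'raise', 'warn', 'ignore'} policy to inputs that are
    never used in the computed graph (illustrative sketch)."""
    unused = [i for i in inputs if i not in used]
    if not unused:
        return []
    if on_unused_input == "raise":
        raise ValueError("unused inputs: %r" % (unused,))
    elif on_unused_input == "warn":
        warnings.warn("unused inputs: %r" % (unused,))
    elif on_unused_input != "ignore":
        raise ValueError("on_unused_input must be 'raise', 'warn' "
                         "or 'ignore'")
    return unused
```

With 'ignore' the unused names are simply returned; with 'raise' the caller gets an immediate error, which is the safer default.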
New Features
 * debugprint new param ids=["CHAR", "id", "int", ""]
   This makes the printed identifier the Python id, a unique char, a
   unique int, or omits it entirely. We changed the default to "CHAR"
   as this is more readable.
 * debugprint new param stop_on_name=[False, True]. If True, we don't print
   anything below an intermediate variable that has a name. Defaults to False.
 * debugprint no longer prints the "|" symbol in a column after the last input.
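The ids modes above can be illustrated with a small sketch of how unique identifiers might be assigned to graph nodes. This is not Theano's debugprint itself; `make_id_assigner` is a hypothetical name and the "CHAR" labelling (A, B, ..., Z, AA, ...) is one plausible scheme:

```python
from itertools import count
from string import ascii_uppercase


def make_id_assigner(ids="CHAR"):
    """Return a function mapping each node to a printable identifier,
    mimicking the ids={'CHAR', 'id', 'int', ''} modes (a sketch)."""
    seen = {}
    counter = count()

    def assign(node):
        if ids == "":
            return ""          # don't print an identifier at all
        if node not in seen:
            n = next(counter)
            if ids == "CHAR":
                # Spreadsheet-style labels: A..Z, then AA, AB, ...
                label, n = "", n + 1
                while n:
                    n, rem = divmod(n - 1, 26)
                    label = ascii_uppercase[rem] + label
                seen[node] = label
            elif ids == "int":
                seen[node] = str(n)     # a unique small integer
            elif ids == "id":
                seen[node] = str(id(node))  # the Python id
        return seen[node]

    return assign
```

Repeated nodes keep their first label, which is what makes the identifiers useful for spotting shared subgraphs in the printout.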
=============
Release Notes
...
@@ -180,7 +180,7 @@ Here is the state of that vision as of 24 October 2011 (after Theano release
 * Example of use: Determine if we should move computation to the
   GPU or not depending on the input size.
 * Possible implementation note: allow a Theano Variable in the env to
   have more than 1 owner.
 * We have a CUDA backend for tensors of type `float32` only.
 * Efforts have begun towards a generic GPU ndarray (GPU tensor) (started in the
...
@@ -6,6 +6,7 @@
__docformat__ = "restructuredtext en"
import time, copy, sys, copy_reg, gc, os
from itertools import izip
from StringIO import StringIO
import numpy
@@ -510,7 +511,7 @@ def debugprint(r, prefix='', depth=-1, done=None, print_type=False,
    :param prefix: prefix to each line (typically some number of spaces)
    :param depth: maximum recursion depth (Default -1 for unlimited).
    :param done: dict of Apply instances that have already been printed
                 and their associated printed ids
    :param print_type: whether to print the Variable type after the other info
    :param file: file-like object to which to print
    :param print_destroy_map: whether to print the op destroy_map after other info
@@ -521,7 +522,7 @@ def debugprint(r, prefix='', depth=-1, done=None, print_type=False,
        int - print integer character
        CHAR - print capital character
        "" - don't print an identifier
    :param stop_on_name: When True, if a node in the graph has a name,
                         we don't print anything below it.
    """
@@ -1931,35 +1932,39 @@ class _Maker(FunctionMaker):  # inheritance buys a few helper functions
        """
        Create a function.
        defaults -> a list matching the inputs list and providing default
                    values. If the default for an input is None, then that
                    input is a required input. For an input with an update,
                    the default acts as initialization.
        trustme -> disables some exceptions, used internally
        """
        if defaults is None:
            defaults = [None] * len(self.inputs)
        # List of independent one-element lists, will be passed to the linker.
        input_storage = []
        _defaults = []
        # The following loop is to fill in the input_storage and _defaults
        # lists.
        for (input, indices, subinputs), default in izip(self.indices,
                                                         defaults):
            __default = default
            if isinstance(default, gof.Container):
                # If the default is a gof.Container, this means we want to
                # share the same storage. This is done by appending
                # default.storage to input_storage.
                if indices is not None:
                    raise TypeError("Cannot take a Container instance as "
                                    "default for a SymbolicInputKit.")
                input_storage.append(default.storage)
                default = None
                required = False
            elif isinstance(input, SymbolicInputKit):
                # If the input is a SymbolicInputKit, it represents more than
                # one storage unit. The indices and subinputs lists represent
                # which of the kit's inputs are active in this graph, so we
                # make as many storage units as needed
                if isinstance(default, (list, tuple)) \
                        and all(isinstance(x, gof.Container) for x in default):
                    if len(default) == len(indices):
@@ -1967,7 +1972,9 @@ class _Maker(FunctionMaker):  # inheritance buys a few helper functions
                    elif len(default) > len(indices):
                        input_storage += [default[i].storage for i in indices]
                    else:
                        raise ValueError(
                            'Not enough storage for SymbolicInputKit',
                            input, indices, default)
                    default = _NODEFAULT
                else:
                    input_storage += [[None] for i in indices]
@@ -1977,8 +1984,10 @@ class _Maker(FunctionMaker):  # inheritance buys a few helper functions
            # Filling _defaults. Each entry is a tuple of three elements:
            # (required, refeed, value)
            # - required means that the user must provide a value when calling
            #   the function
            # - refeed means that we want to put the default back in the
            #   storage after each function call
            # - value is the value that will be put in the storage initially
            # Even though a SymbolicInputKit represents more than one input,
@@ -2001,7 +2010,9 @@ class _Maker(FunctionMaker):  # inheritance buys a few helper functions
                        _defaults.append((False, False, None))
                    else:
                        # This might catch some bugs early
                        raise ValueError(
                            "A default (initial) value is required for an "
                            "input which can update itself.", input)
                else:
                    _defaults.append((False, False, default))
            else:
@@ -2066,8 +2077,8 @@ class DebugMode(Mode):
    If there are internal errors, this mode will raise a
    `DebugModeError` exception.
    :remark: The work of debugging is implemented by the `_Maker`, `_Linker`,
             and `_VariableEquivalenceTracker` classes.
    """
@@ -2084,7 +2095,8 @@ class DebugMode(Mode):
    check_py_code = config.DebugMode.check_py
    """
    Should we evaluate (and check) the `perform` implementations?
    Always checked if no `c_code`.
    """
    check_isfinite = config.DebugMode.check_finite
@@ -2102,7 +2114,9 @@ class DebugMode(Mode):
    # This function will be used to create a FunctionMaker in
    # function_module.function
    def function_maker(self, i, o, m, *args, **kwargs):
        """
        Return an instance of `_Maker` which handles much of the debugging work
        """
        assert m is self
        return _Maker(i, o, self.optimizer, self, *args, **kwargs)
@@ -2114,13 +2128,18 @@ class DebugMode(Mode):
                 check_isfinite=None,
                 require_matching_strides=None,
                 linker=None):
        """Initialize member variables.
        If any of these arguments (except optimizer) is not None, it overrides
        the class default.
        The linker argument is not used. It is set there to allow
        Mode.requiring() and some other functions to work with DebugMode too.
        """
        if linker is not None and not issubclass(linker, _Linker):
            raise Exception("DebugMode can only use its own linker! You "
                            "should not provide one.", linker)
        super(DebugMode, self).__init__(
            optimizer=optimizer,
@@ -2142,6 +2161,7 @@ class DebugMode(Mode):
        self.require_matching_strides = require_matching_strides
        if not (self.check_c_code or self.check_py_code):
            raise ValueError('DebugMode has to check at least one of c and py '
                             'code')

register_mode('DEBUG_MODE', DebugMode(optimizer='fast_run'))
@@ -56,13 +56,13 @@ for val in keys.values():
nbs_mod = {}  # nb seen -> how many keys
nbs_mod_to_key = {}  # nb seen -> keys
more_than_one = 0
for mod, kk in mods.iteritems():
    val = len(kk)
    nbs_mod.setdefault(val, 0)
    nbs_mod[val] += 1
    if val > 1:
        more_than_one += 1
    nbs_mod_to_key[val] = kk

if DISPLAY_MOST_FREQUENT_DUPLICATE_CCODE:
@@ -87,7 +87,7 @@ uniq = len(mods)
useless = total - uniq
print "mod.{cpp,cu} total:", total
print "mod.{cpp,cu} uniq:", uniq
print "mod.{cpp,cu} with more than 1 copy:", more_than_one
print "mod.{cpp,cu} useless:", useless, float(useless)/total*100, "%"
print "nb directory", len(dirs)
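The statistics this script prints (total modules, unique bodies, bodies with more than one copy, and the "useless" surplus) can be computed compactly. A modern-Python sketch, where `key_to_module` is a hypothetical stand-in for the script's mapping from compiled key to module body:

```python
from collections import Counter


def duplicate_stats(key_to_module):
    """Summarize duplicated compiled modules (sketch of the script's
    counting logic): returns (total, uniq, more_than_one, useless)."""
    mods = Counter(key_to_module.values())  # module body -> nb of copies
    total = sum(mods.values())              # all compiled copies
    uniq = len(mods)                        # distinct module bodies
    more_than_one = sum(1 for nb in mods.values() if nb > 1)
    useless = total - uniq                  # copies beyond the first
    return total, uniq, more_than_one, useless
```

`useless` is the compilation work that deduplicating the cache would have saved, which is exactly what the percentage in the printout measures.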
@@ -103,7 +103,8 @@ def mysend(subject, file):
        float64_time = start
        start = ""
    s = ("Summary of the output:\n\n" + filter_output(open(file)) +
         "\n\nFull output:\n\n" + s)
    img = MIMEText(s)
    fp.close()
    msg.attach(img)
...
@@ -46,7 +46,7 @@ def debugprint(obj, depth=-1, print_type=False,
        int - print integer character
        CHAR - print capital character
        "" - don't print an identifier
    :param stop_on_name: When True, if a node in the graph has a name,
                         we don't print anything below it.
    :returns: string if `file` == 'str', else file arg
...
@@ -846,7 +846,7 @@ CudaNdarray_conv_full(const CudaNdarray *img, const CudaNdarray * kern, CudaNdar
    if(version==-1 && nb_split>1) version=4;
    else if(version==-1) version=3;
    else if(version==3 && nb_split!=1) version=4;  // we force version 4 when we need more than 1 split, so that it always executes.
    assert(version!=3 || nb_split==1);
    assert(version!=5 || kern_len>1);
...
@@ -183,7 +183,7 @@ conv_full_patch_stack( float* img, float* kern, float* out,
/**
 * As conv_patch_stack, but used for the full convolution by padding the image in shared memory.
 * I keep it separated from conv_patch as we take 19-20 registers, which is more than the 10/16 max for each thread and thus could lower the occupancy.
 * Implementation of the valid convolution that keeps the full image and the full kernel in shared memory.
 * Each thread computes only one value for the output if split is true. Otherwise it computes ceil((float)out_len/N) pixels.
 * thread block size=out_wid, nb_rows (optimized value is ceil(out_len/N))
@@ -195,7 +195,7 @@ conv_full_patch_stack( float* img, float* kern, float* out,
 * nstack: the size of the stack, used to compute the image to load.
 * template flipped_kern: if true, we "flip" the kernel as in a real convolution, else we don't
 * template c_contiguous: if true, the image and kernel are c_contiguous (uses fewer registers)
 * template split: if true, each thread computes more than 1 output pixel.
 * template low_mem: if true, like split but uses less dynamic shared memory and more registers.
 * if you set split and low_mem to true, we will use the low_mem version!
 */
...
@@ -204,7 +204,7 @@ __device__ void store_or_accumulate(float& dst,const float value ){
 * nkern: the number of kernels, used to compute the output image to store the result
 * nstack: the size of the stack, used to compute the image to load.
 * template flipped_kern: if true, we "flip" the kernel as in a real convolution, else we don't
 * template split: if true, each thread computes more than 1 output pixel
 *                 When true, allows for an output image bigger than 512 pixels.
 *                 Uses more registers.
 */
@@ -273,7 +273,7 @@ conv_patch( float* img, float* kern, float* out,
 * As conv_patch, but implements the stack in the kernel.
 * I keep it separated from conv_patch as we take more registers and this could lower the occupancy.
 * Implementation of the valid convolution that keeps the full image and the full kernel in shared memory.
 * Each thread computes only one output value if split==false, else it computes more than 1 value.
 * thread block size=out_wid, out_len/X (X is any number, optimized value is ceil(out_len/N))
 * grid block size=batch_id, nkern
 * dynamic shared memory: img_len*img_wid+(preload_full_kern?KERNEL_LEN:1)*kern_wid
@@ -287,7 +287,7 @@ conv_patch( float* img, float* kern, float* out,
 * template KERN_WIDTH: if 0, will work for any kern_wid, else it specializes to this kern_wid as an optimization
 * template img_c_contiguous_2d: if true, the img is column- and row-contiguous
 * template kern_c_contiguous_2d: if true, the kernel is column- and row-contiguous
 * template split: if true, each thread generates more than 1 output pixel, but uses more registers.
 * template preload_full_kern: if true, we load the full kernel in shared memory, else we load 1 row at a time.
 * template subsample: if false, removes some computation needed when dx or dy!=1.
 */
...
@@ -3240,9 +3240,9 @@ int CudaNdarray_sger(float alpha, const CudaNdarray * x, const CudaNdarray * y,
    if(x_strides == 0){
        if(CudaNdarray_HOST_DIMS(x)[0] != 1){
            PyErr_Format(PyExc_RuntimeError,
                         "CudaNdarray_sger: Invalid input x (should not happen)."
                         " We received a CudaNdarray vector with a stride of 0"
                         " that has more than 1 element!");
            return -1;
        }
        x_strides = 1;
@@ -3256,9 +3256,9 @@ int CudaNdarray_sger(float alpha, const CudaNdarray * x, const CudaNdarray * y,
    if(y_strides == 0){
        if(CudaNdarray_HOST_DIMS(y)[0] != 1){
            PyErr_Format(PyExc_RuntimeError,
                         "CudaNdarray_sger: Invalid input y (should not happen)."
                         " We received a CudaNdarray vector with a stride of 0"
                         " that has more than 1 element!");
            return -1;
        }
        y_strides = 1;
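The stride-0 check in CudaNdarray_sger amounts to a simple rule: a zero stride is only acceptable for a length-1 vector, which can then be handed to BLAS with stride 1. A Python sketch of that rule, with a hypothetical helper name:

```python
def effective_stride(dim, stride):
    """Validate and normalize a vector stride for a BLAS call
    (sketch of the C check in CudaNdarray_sger): a stride of 0 is
    only legal when the vector has a single element, in which case
    stride 1 is substituted."""
    if stride == 0:
        if dim != 1:
            raise RuntimeError("vector with a stride of 0 "
                               "that has more than 1 element")
        return 1
    return stride
```

A zero stride on a multi-element vector would make every element alias the same memory, so rejecting it early turns a silent wrong answer into a clear error.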
...
@@ -257,7 +257,8 @@ def test_downsample():
        for ds in (2, 2), (3, 2), (1, 1):
            if ds[0] > shp[2]: continue
            if ds[1] > shp[3]: continue
            # GpuDownsampleFactorMax doesn't like having more than 512 columns
            # in the output tensor.
            if float(shp[3])/ds[1] > 512: continue
            for ignore_border in (True, False):
                print 'test_downsample', shp, ds, ignore_border
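The skip conditions in test_downsample can be collected into one predicate. A sketch with a hypothetical name, where `shp` is the usual 4-tuple (batch, channels, rows, cols) and `ds` the downsample factor pair; the 512-column cap mirrors the GpuDownsampleFactorMax limitation noted in the comment:

```python
def should_test(shp, ds, max_cols=512):
    """Decide whether a (shape, downsample) pair is testable
    (sketch of the skip logic in test_downsample)."""
    # Pool size must not exceed the input in either dimension.
    if ds[0] > shp[2] or ds[1] > shp[3]:
        return False
    # The GPU op can't handle more than max_cols output columns.
    if float(shp[3]) / ds[1] > max_cols:
        return False
    return True
```

Expressing the skips as a predicate makes the constraint on output width explicit rather than buried in `continue` statements.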
...
@@ -1000,7 +1000,7 @@ class Scan(PureOp):
                if i < n_steps:
                    # The reason I don't use out[idx][0][:i] is because for
                    # certain outputs (those with multiple taps),
                    # outs[idx][0] has more than n_steps entries, with the
                    # initial state at the beginning. When indexing in it I
                    # usually have to do something like
                    # outs[idx][0][i+offset]. To do something similar here,
...
...@@ -1060,7 +1060,7 @@ static PyObject *__pyx_pf_6theano_11scan_module_12scan_perform_0get_version(PyOb ...@@ -1060,7 +1060,7 @@ static PyObject *__pyx_pf_6theano_11scan_module_12scan_perform_0get_version(PyOb
*/ */
static PyObject *__pyx_pf_6theano_11scan_module_12scan_perform_1perform(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/ static PyObject *__pyx_pf_6theano_11scan_module_12scan_perform_1perform(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/
static char __pyx_doc_6theano_11scan_module_12scan_perform_1perform[] = "\n Parameters\n ----------\n n_shared_outs: unsigned int\n Number of arugments that correspond to shared variables with\n updates\n n_mit_mot_outs: unsigned int\n Sum over the number of output taps for each mit_mot sequence\n n_seqs: unsigned int\n Number of sequences provided as input\n n_mit_mot : unsigned int\n Number of mit_mot arguemnts\n n_mit_sot: unsigned int\n Number of mit_sot arguments\n n_sit_sot: unsigned int\n Number of sit sot arguemnts\n n_nit_sot: unsigned int\n Number of nit_sot arguments\n n_steps: unsigned int\n Number of steps to loop over\n mintaps: int32 ndarray (can also be a simple python list if that is better !)\n For any of the mit_mot, mit_sot, sit_sot says which is the furtherst\n away input tap from current position. For example, if the taps where [-2,\n -5, -9], the mintap would be -9. For sit_sot this is always -1 since\n is the only allowed tap.\n tap_array: int32 ndarray( can be replaced by a list of list in python if better)\n For each of the mit_mot, mit_sot, sit_sot (the first dimension) says\n which are the corresponding input taps. While this is a matrix, not all\n values in a row are needed and tap_array_len is there to say up to\n which entry we are dealing with valid taps ( afterwards there are\n just 0s to ensure the fix format)\n tap_array_len: int32 ndarray( can be replaced by a list if better)\n For each of the mit_mot, mit_sot, sit_sot says how many input taps\n each has. 
For sit_sot this will always be 1.\n vector_seqs: int32 ndarray (can be replaced by a list of bools if better)\n For each sequence the corresponding entry is either a 1, is the\n sequence is a vector or 0 if it has more then 1 dimension\n vector_outs: int32 ndarray( can be replaced by list of bools if better)\n For each output ( mit_mot, mit_sot, si""t_sot, nit_sot in this order)\n the entry is 1 if the corresponding argument is a 1 dimensional\n tensor, 0 otherwise.\n mit_mot_out_slices : int32 ndarray( can be replaced by list of lists)\n Same as tap_array, but for the output taps of mit_mot sequences\n mit_mot_out_nslices: int32 ndarray (Can be replaced by a list)\n Same as tap_array_len, but is the number of output taps of the\n mit_mot sequences (i.e. it corresponds to mit_mot_out_slices)\n fn: callable\n This is the linker, i.e. the function that will loop over the\n computational graph and call the perform of each operation. For this\n linker there is a c version in gof/lazy_linker.c that will be the\n starting point of implementing this funciton in C ( we need to take\n all the code around the call of this function and put in C inside\n that code)\n fnct: python object\n Only used to attach some timings for the profile mode ( can be\n skiped if we don't care about Theano's profile mode)\n inplace\n Boolean that says if things should be computed inplace or if they\n should not.\n args: list of ndarrays (and random states)\n The inputs of scan in a given order ( n_steps, sequences, mit_mot,\n mit_sot, sit_sot, nit_sot, shared_outs, other_args)\n outs: list of 1 element list ( or storage objects?)\n This is where we need to copy our outputs ( we don't return the\n results, though we can change the code such that we return, and\n figure things out on the outside - python)\n self: python object\n The scan op itself. I only use it to attach to it some timing\n informations .. 
but I don;t need to.\n\n "; static char __pyx_doc_6theano_11scan_module_12scan_perform_1perform[] = "\n Parameters\n ----------\n n_shared_outs: unsigned int\n Number of arugments that correspond to shared variables with\n updates\n n_mit_mot_outs: unsigned int\n Sum over the number of output taps for each mit_mot sequence\n n_seqs: unsigned int\n Number of sequences provided as input\n n_mit_mot : unsigned int\n Number of mit_mot arguemnts\n n_mit_sot: unsigned int\n Number of mit_sot arguments\n n_sit_sot: unsigned int\n Number of sit sot arguemnts\n n_nit_sot: unsigned int\n Number of nit_sot arguments\n n_steps: unsigned int\n Number of steps to loop over\n mintaps: int32 ndarray (can also be a simple python list if that is better !)\n For any of the mit_mot, mit_sot, sit_sot says which is the furtherst\n away input tap from current position. For example, if the taps where [-2,\n -5, -9], the mintap would be -9. For sit_sot this is always -1 since\n is the only allowed tap.\n tap_array: int32 ndarray( can be replaced by a list of list in python if better)\n For each of the mit_mot, mit_sot, sit_sot (the first dimension) says\n which are the corresponding input taps. While this is a matrix, not all\n values in a row are needed and tap_array_len is there to say up to\n which entry we are dealing with valid taps ( afterwards there are\n just 0s to ensure the fix format)\n tap_array_len: int32 ndarray( can be replaced by a list if better)\n For each of the mit_mot, mit_sot, sit_sot says how many input taps\n each has. 
For sit_sot this will always be 1.\n vector_seqs: int32 ndarray (can be replaced by a list of bools if better)\n For each sequence the corresponding entry is either a 1, is the\n sequence is a vector or 0 if it has more than 1 dimension\n vector_outs: int32 ndarray( can be replaced by list of bools if better)\n For each output ( mit_mot, mit_sot, si""t_sot, nit_sot in this order)\n the entry is 1 if the corresponding argument is a 1 dimensional\n tensor, 0 otherwise.\n mit_mot_out_slices : int32 ndarray( can be replaced by list of lists)\n Same as tap_array, but for the output taps of mit_mot sequences\n mit_mot_out_nslices: int32 ndarray (Can be replaced by a list)\n Same as tap_array_len, but is the number of output taps of the\n mit_mot sequences (i.e. it corresponds to mit_mot_out_slices)\n fn: callable\n This is the linker, i.e. the function that will loop over the\n computational graph and call the perform of each operation. For this\n linker there is a c version in gof/lazy_linker.c that will be the\n starting point of implementing this funciton in C ( we need to take\n all the code around the call of this function and put in C inside\n that code)\n fnct: python object\n Only used to attach some timings for the profile mode ( can be\n skiped if we don't care about Theano's profile mode)\n inplace\n Boolean that says if things should be computed inplace or if they\n should not.\n args: list of ndarrays (and random states)\n The inputs of scan in a given order ( n_steps, sequences, mit_mot,\n mit_sot, sit_sot, nit_sot, shared_outs, other_args)\n outs: list of 1 element list ( or storage objects?)\n This is where we need to copy our outputs ( we don't return the\n results, though we can change the code such that we return, and\n figure things out on the outside - python)\n self: python object\n The scan op itself. I only use it to attach to it some timing\n informations .. but I don;t need to.\n\n ";
static PyMethodDef __pyx_mdef_6theano_11scan_module_12scan_perform_1perform = {__Pyx_NAMESTR("perform"), (PyCFunction)__pyx_pf_6theano_11scan_module_12scan_perform_1perform, METH_VARARGS|METH_KEYWORDS, __Pyx_DOCSTR(__pyx_doc_6theano_11scan_module_12scan_perform_1perform)};
static PyObject *__pyx_pf_6theano_11scan_module_12scan_perform_1perform(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {
unsigned int __pyx_v_n_shared_outs;
...
...@@ -125,7 +125,7 @@ def perform(
each has. For sit_sot this will always be 1.
vector_seqs: int32 ndarray (can be replaced by a list of bools if better)
For each sequence the corresponding entry is 1 if the
sequence is a vector, or 0 if it has more than 1 dimension
vector_outs: int32 ndarray (can be replaced by list of bools if better)
For each output (mit_mot, mit_sot, sit_sot, nit_sot in this order)
the entry is 1 if the corresponding argument is a 1 dimensional
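The sit_sot tap structure described in this docstring (a single input tap of -1) can be illustrated with a toy pure-Python loop. This is a hypothetical sketch of the looping pattern, not Theano's actual scan_perform code, and the helper name is made up:

```python
def toy_sit_sot_scan(fn, seq, init, n_steps):
    """Toy sit_sot loop: each step sees seq[t] and the previous
    output (input tap -1), as the docstring above describes."""
    outs = [init]
    for t in range(n_steps):
        outs.append(fn(seq[t], outs[-1]))
    return outs[1:]  # drop the initial state, keep one output per step

# cumulative sum as a scan: out[t] = seq[t] + out[t-1]
print(toy_sit_sot_scan(lambda x, prev: x + prev, [1, 2, 3, 4], 0, 4))
# [1, 3, 6, 10]
```

A mit_sot op would carry several such taps (e.g. [-2, -5, -9]), which is what tap_array and mintaps encode.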
...
...@@ -959,7 +959,7 @@ class GetItemScalar(gof.op.Op):
Implement a subtensor of a sparse variable that takes two scalars as
index and returns a scalar
:see: GetItem2d to return more than one element.
"""
def __eq__(self, other):
return (type(self) == type(other))
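A GetItemScalar-style lookup can be sketched in plain numpy over a COO-style triplet representation. This is a toy illustration of the semantics, not the sparse module's real implementation, and the helper name is made up:

```python
import numpy as np

# toy COO sparse matrix: parallel (row, col, value) arrays
rows = np.array([0, 1])
cols = np.array([1, 0])
vals = np.array([1.0, 2.0])

def get_item_scalar(rows, cols, vals, i, j):
    """Return element (i, j) of the COO matrix, 0.0 if unstored."""
    mask = (rows == i) & (cols == j)
    return float(vals[mask][0]) if mask.any() else 0.0

print(get_item_scalar(rows, cols, vals, 1, 0))  # 2.0 (stored entry)
print(get_item_scalar(rows, cols, vals, 1, 1))  # 0.0 (implicit zero)
```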
...
...@@ -5840,7 +5840,7 @@ def sort(a, axis=-1, kind='quicksort', order=None):
Tensor to be sorted
axis : Tensor
Axis along which to sort. If None, the array is flattened before sorting.
kind : {'quicksort', 'mergesort', 'heapsort'}, optional
...@@ -5848,7 +5848,7 @@ def sort(a, axis=-1, kind='quicksort', order=None):
order : list, optional
When `a` is a structured array, this argument specifies which
fields to compare first, second, and so on. This list does not
need to include all of the fields.
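Since theano.tensor.sort mirrors numpy.sort, the `axis=None` and `order` semantics documented above can be checked directly in numpy — a sketch of the documented behaviour, not Theano code:

```python
import numpy as np

a = np.array([[3, 1], [2, 4]])
print(np.sort(a, axis=-1))    # default: sort each row -> [[1 3], [2 4]]
print(np.sort(a, axis=None))  # flatten first, then sort -> [1 2 3 4]

# 'order' picks the comparison fields for a structured array
people = np.array([(b'bob', 30), (b'amy', 30), (b'cid', 25)],
                  dtype=[('name', 'S3'), ('age', 'i4')])
by_age_then_name = np.sort(people, order=['age', 'name'])
print(by_age_then_name['name'])  # [b'cid' b'amy' b'bob']
```

The `order` list need not name every field; unnamed fields are simply never used as tie-breakers before the listed ones.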
...
...@@ -5110,7 +5110,7 @@ class test_arithmetic_cast(unittest.TestCase):
warnings.filterwarnings('ignore', message='Division of two integer',
category=DeprecationWarning)
try:
for cfg in ('numpy+floatX', ):  # Used to test 'numpy' as well.
config.cast_policy = cfg
for op in (operator.add, operator.sub, operator.mul,
operator.div, operator.floordiv):
...@@ -5237,7 +5237,7 @@ class test_broadcast(unittest.TestCase):
def test_broadcast_bigdim(self):
def f():
x = matrix()
addbroadcast(x, 2)
self.assertRaises(ValueError, f)
def test_unbroadcast_addbroadcast(self):
...@@ -5246,41 +5246,41 @@ class test_broadcast(unittest.TestCase):
and fuse consecutive Rebroadcast op
"""
x = matrix()
assert unbroadcast(x, 0) is x
assert unbroadcast(x, 1) is x
assert unbroadcast(x, 1, 0) is x
assert unbroadcast(x, 0, 1) is x
assert addbroadcast(x, 0) is not x
assert addbroadcast(x, 1) is not x
assert addbroadcast(x, 1, 0).owner.inputs[0] is x
assert unbroadcast(addbroadcast(x, 0), 0) is x
assert addbroadcast(unbroadcast(x, 0), 0) is not x
x = row()
assert unbroadcast(x, 0) is not x
assert unbroadcast(x, 1) is x
assert unbroadcast(x, 1, 0) is not x
assert unbroadcast(x, 0, 1) is not x
assert addbroadcast(x, 0) is x
assert addbroadcast(x, 1).owner.inputs[0] is x
assert addbroadcast(x, 1, 0).owner.inputs[0] is x
assert addbroadcast(x, 0, 1).owner.inputs[0] is x
assert unbroadcast(addbroadcast(x, 1), 1) is x
assert addbroadcast(unbroadcast(x, 1), 1) is not x
# The first unbroadcast removes the broadcast flag, so the second
# should not add a new op
assert unbroadcast(unbroadcast(x, 0), 0).owner.inputs[0] is x
# Test that consecutive Rebroadcast ops are fused
x = TensorType(dtype='float64', broadcastable=(True, True))()
assert unbroadcast(unbroadcast(x, 1), 0).owner.inputs[0] is x
assert addbroadcast(unbroadcast(x, 1), 0).owner.inputs[0] is x
assert addbroadcast(unbroadcast(x, 0), 0) is x
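The broadcastable flags these assertions manipulate correspond to numpy's length-1 dimensions. A quick numpy sketch of why a row pattern like `(True, False)` broadcasts along its first axis (pure numpy, independent of Theano):

```python
import numpy as np

row = np.zeros((1, 5))    # like theano's row(): dim 0 is broadcastable
mat = np.ones((3, 5))     # like matrix(): no broadcastable dims
print((row + mat).shape)  # (3, 5): the length-1 axis stretches to match

col = np.zeros((3, 1))
print((col + mat).shape)  # (3, 5): a length-1 axis 1 stretches as well
```

addbroadcast/unbroadcast only change the symbolic flag that promises (or stops promising) that a dimension has length 1; they do not reshape any data.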
def test_patternbroadcast(self):
# Test that patternbroadcast with an empty broadcasting pattern works
...@@ -5295,7 +5295,7 @@ class test_broadcast(unittest.TestCase):
x = matrix()
y = addbroadcast(x, 0)
f = theano.function([x], y.shape)
assert (f(numpy.zeros((1, 5), dtype=config.floatX)) == [1, 5]).all()
topo = f.maker.env.toposort()
if theano.config.mode != 'FAST_COMPILE':
assert len(topo) == 2
...@@ -5305,7 +5305,7 @@ class test_broadcast(unittest.TestCase):
x = matrix()
y = unbroadcast(x, 0)
f = theano.function([x], y.shape)
assert (f(numpy.zeros((2, 5), dtype=config.floatX)) == [2, 5]).all()
topo = f.maker.env.toposort()
if theano.config.mode != 'FAST_COMPILE':
assert len(topo) == 3
...@@ -5316,7 +5316,7 @@ class test_broadcast(unittest.TestCase):
x = row()
y = unbroadcast(x, 0)
f = theano.function([x], y.shape)
assert (f(numpy.zeros((1, 5), dtype=config.floatX)) == [1, 5]).all()
topo = f.maker.env.toposort()
if theano.config.mode != 'FAST_COMPILE':
assert len(topo) == 2
...@@ -5588,7 +5588,7 @@ class test_sort(unittest.TestCase):
def setUp(self):
self.rng = numpy.random.RandomState(seed=utt.fetch_seed())
self.m_val = self.rng.rand(3, 2)
self.v_val = self.rng.rand(4)
def test1(self):
...
...@@ -91,14 +91,14 @@ class RopLop_checker(unittest.TestCase):
(i.e. the tensor with which you multiply the
Jacobian). It should be a tuple of ints.
If the Op has more than 1 input, one of them must be mx, while
the others must be shared variables / constants. We will test only
against the input self.mx, so you must call
check_mat_rop_lop/check_rop_lop for the other inputs.
We expect all inputs/outputs to have dtype floatX.
If you want to test an Op with an output matrix, add a sum
after the Op you want to test.
"""
vx = numpy.asarray(self.rng.uniform(size=self.mat_in_shape),
...
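What such a checker verifies — that the symbolic R-op matches the Jacobian-times-vector product — can be sketched numerically for an elementwise op. This is a hand-rolled finite-difference illustration, not the Theano test code:

```python
import numpy as np

def rop_sin(x, v):
    """R-op of elementwise sin: the Jacobian is diag(cos(x)),
    so J(x) @ v reduces to cos(x) * v."""
    return np.cos(x) * v

rng = np.random.RandomState(0)
x = rng.uniform(size=(3,))
v = rng.uniform(size=(3,))

eps = 1e-6
fd = (np.sin(x + eps * v) - np.sin(x)) / eps  # finite-difference J @ v
print(np.allclose(fd, rop_sin(x, v), atol=1e-5))  # True
```

The L-op check is the mirror image: it compares the symbolic v @ J against the same Jacobian assembled column by column.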