Commit 8d2c43de authored by lamblin

Merge pull request #1011 from delallea/minor

Minor fixes
@@ -7,15 +7,15 @@ Old Release Notes
 Theano 0.5 (23 February 2012)
 =============================
 
-Highlight:
+Highlights:
 * Moved to github: http://github.com/Theano/Theano/
-* Old trac ticket moved to assembla ticket: http://www.assembla.com/spaces/theano/tickets
+* Old trac tickets moved to assembla tickets: http://www.assembla.com/spaces/theano/tickets
 * Theano vision: http://deeplearning.net/software/theano/introduction.html#theano-vision (Many people)
 * Theano with GPU works in some cases on Windows now. Still experimental. (Sebastian Urban)
 * Faster dot() call: New/Better direct call to cpu and gpu ger, gemv, gemm
   and dot(vector, vector). (James, Frédéric, Pascal)
 * C implementation of Alloc. (James, Pascal)
-* theano.grad() now also work with sparse variable. (Arnaud)
+* theano.grad() now also works with sparse variables. (Arnaud)
 * Macro to implement the Jacobian/Hessian with theano.tensor.{jacobian,hessian} (Razvan)
 * See the Interface changes.
@@ -28,14 +28,14 @@ Interface Behavior Changes:
   a warning since Theano 0.3 released Nov. 23rd, 2010)
 * The current output dtype of sum with input dtype [u]int* is now always [u]int64.
   You can specify the output dtype with a new dtype parameter to sum.
-  The output dtype is the one using for the summation.
-  There is no warning in previous Theano version about this.
+  The output dtype is the one used for the summation.
+  There is no warning in previous Theano versions about this.
   The consequence is that the sum is done in a dtype with more precision than before.
-  So the sum could be slower, but will be more resistent to overflow.
+  So the sum could be slower, but will be more resistant to overflow.
   This new behavior is the same as numpy. (Olivier, Pascal)
 * When using a GPU, detect faulty nvidia drivers. This was detected
   when running Theano tests. Now this is always tested. Faulty
-  drivers results in wrong results for reduce operations. (Frederic B.)
+  drivers result in wrong results for reduce operations. (Frederic B.)
 
 Interface Features Removed (most were deprecated):
@@ -69,7 +69,7 @@ New deprecation (will be removed in Theano 0.6, warning generated if you use the
 Bug fixes (incorrect results):
 * On CPU, if the convolution had received explicit shape information,
-  they where not checked at runtime. This caused wrong result if the
+  they were not checked at runtime. This caused wrong result if the
   input shape was not the one expected. (Frederic, reported by Sander
   Dieleman)
 * Theoretical bug: in some case we could have GPUSum return bad value.
@@ -95,21 +95,21 @@ Bug fixes (incorrect results):
 
 Scan fixes:
 * computing grad of a function of grad of scan (reported by Justin Bayer, fix by Razvan)
-  before : most of the time crash, but could be wrong value with bad number of dimensions (so a visible bug)
-  now : do the right thing.
+  before: most of the time crash, but could be wrong value with bad number of dimensions (so a visible bug)
+  now: do the right thing.
 * gradient with respect to outputs using multiple taps (reported by Timothy, fix by Razvan)
-  before : it used to return wrong values
-  now : do the right thing.
+  before: it used to return wrong values
+  now: do the right thing.
   Note: The reported case of this bug was happening in conjunction with the
   save optimization of scan that give run time errors. So if you didn't
   manually disable the same memory optimization (number in the list4),
   you are fine if you didn't manually request multiple taps.
 * Rop of gradient of scan (reported by Timothy and Justin Bayer, fix by Razvan)
-  before : compilation error when computing R-op
-  now : do the right thing.
+  before: compilation error when computing R-op
+  now: do the right thing.
 * save memory optimization of scan (reported by Timothy and Nicolas BL, fix by Razvan)
-  before : for certain corner cases used to result in a runtime shape error
-  now : do the right thing.
+  before: for certain corner cases used to result in a runtime shape error
+  now: do the right thing.
 * Scan grad when the input of scan has sequences of different lengths. (Razvan, reported by Michael Forbes)
 * Scan.infer_shape now works correctly when working with a condition for the number of loops.
   In the past, it returned n_steps as the length, which is not always true. (Razvan)
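
A minimal sketch of the kind of scan gradient these fixes concern, assuming a current Theano install (the recurrence and names are illustrative, not code from this commit):

```python
import theano
import theano.tensor as T

x = T.vector('x')
# A simple recurrence: double the state at each of 5 steps.
results, updates = theano.scan(fn=lambda prev: 2 * prev,
                               outputs_info=x,
                               n_steps=5)
cost = results[-1].sum()
# Differentiating through scan; grad-of-grad and multiple-tap cases
# used to crash or return wrong values before the fixes above.
g = T.grad(cost, x)
f = theano.function([x], g)
```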
@@ -118,10 +118,10 @@ Scan fixes:
 New features:
 * AdvancedIncSubtensor grad defined and tested (Justin Bayer)
 * Adding 1D advanced indexing support to inc_subtensor and set_subtensor (James Bergstra)
-* tensor.{zeros,ones}_like now support the dtype param as numpy (Frederic)
+* tensor.{zeros,ones}_like now supports the dtype param as numpy (Frederic)
 * Added configuration flag "exception_verbosity" to control the verbosity of exceptions (Ian)
 * theano-cache list: list the content of the theano cache (Frederic)
-* theano-cache unlock: remove the Theano lock (Olivier)
+* theano-cache unlock: remove the Theano cache lock (Olivier)
 * tensor.ceil_int_div to compute ceil(a / float(b)) (Frederic)
 * MaxAndArgMax.grad now works with any axis (The op supports only 1 axis) (Frederic)
   * used by tensor.{max,min,max_and_argmax}
@@ -142,12 +142,12 @@ New features:
 * theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.mem_info() return free and total gpu memory (Frederic)
 * Theano flags compiledir_format. Keep the same default as before: compiledir_%(platform)s-%(processor)s-%(python_version)s. (Josh Bleecher Snyder)
   * We also support the "theano_version" substitution.
-* IntDiv C code (faster and allow this elemwise to be fused with other elemwise) (Pascal)
+* IntDiv C code (faster and allows this elemwise to be fused with other elemwise) (Pascal)
 * Internal filter_variable mechanism in Type. (Pascal, Ian)
   * Ifelse works on sparse.
   * It makes use of gpu shared variable more transparent with theano.function updates and givens parameter.
 * Added a_tensor.transpose(axes) axes is optional (James)
-  * theano.tensor.transpose(a_tensor, kwargs) We where ignoring kwargs, now it is used as the axes.
+  * theano.tensor.transpose(a_tensor, kwargs) We were ignoring kwargs, now it is used as the axes.
 * a_CudaNdarray_object[*] = int, now works (Frederic)
 * tensor_variable.size (as numpy) computes the product of the shape elements. (Olivier)
 * sparse_variable.size (as scipy) computes the number of stored values. (Olivier)
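
A minimal sketch of the `transpose` and `size` additions listed above, assuming they mirror the numpy behavior as the notes say (variable names are illustrative):

```python
import theano.tensor as T

a = T.matrix('a')
b = a.transpose()      # axes are optional; with no axes, reverses them as numpy does
c = a.transpose(1, 0)  # explicit axes
n = a.size             # symbolic product of the shape elements, as in numpy
```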
@@ -168,11 +168,11 @@ New features:
 * 'theano-cache list' prints the number of compiled modules per op class (Frederic B.)
 * The Theano flag "nvcc.fastmath" is now also used for the cuda_ndarray.cu file.
 * Add the header_dirs to the hard part of the compilation key. This is
-  currently used only by cuda, but if we use library that are only headers,
+  currently used only by cuda, but if we use libraries that are only headers,
   this can be useful. (Frederic B.)
 * The Theano flag "nvcc.flags" is now included in the hard part of the key.
-  This mean that now we recompile all modules for each value of "nvcc.flags".
-  A change in "nvcc.flags" used to be ignored for module that were already
+  This means that now we recompile all modules for each value of "nvcc.flags".
+  A change in "nvcc.flags" used to be ignored for modules that were already
   compiled. (Frederic B.)
 * Alloc, GpuAlloc are not always pre-computed (constant_folding optimization)
   at compile time if all their inputs are constant.
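
A hedged sketch of setting the nvcc-related flags discussed above; Theano flags are normally set before the first import, and the specific flag values here are illustrative:

```python
import os

# Changing nvcc.flags now invalidates the compilation cache key,
# so previously compiled modules are rebuilt (see the note above).
os.environ['THEANO_FLAGS'] = 'nvcc.fastmath=True,nvcc.flags=-O3'
import theano
```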
@@ -209,7 +209,7 @@ Crashes fixed:
 * Runtime crash related to an optimization with subtensor of alloc (reported by Razvan, fixed by Frederic)
 * Fix dot22scalar cast of integer scalars (Justin Bayer, Frédéric, Olivier)
 * Fix runtime crash in gemm, dot22. FB
-* Fix on 32bits computer: make sure all shape are int64.(Olivier)
+* Fix on 32 bit computer: make sure all shapes are int64. (Olivier)
 * Fix to deque on python 2.4 (Olivier)
 * Fix crash when not using C code (or using DebugMode) (not used by
   default) with numpy 1.6*. Numpy has a bug in the reduction code that
@@ -287,10 +287,9 @@ Others:
   The other accepted value is "raise" to raise an error when this happens. (Frederic)
 * The buidbot now raises optimization/shape errors instead of just printing a warning. (Frederic)
 * better pycuda tests (Frederic)
-* check_blas.py now accept the shape and the number of iteration as parameter (Frederic)
+* check_blas.py now accepts the shape and the number of iterations as parameter (Frederic)
 * Fix opt warning when the opt ShapeOpt is disabled (enabled by default) (Frederic)
 * More internal verification on what each op.infer_shape return. (Frederic, James)
-* Argmax dtype to int64 (Olivier)
 * Improved docstring and basic tests for the Tile Op (David).
 
 Reviewers (alphabetical order):
@@ -165,7 +165,7 @@ Note: There is no short term plan to support multi-node computation.
 Theano Vision State
 ===================
 
-Here is the state of that vision as of 1 October 2012 (after Theano release
+Here is the state of that vision as of October 1st, 2012 (after Theano release
 0.6rc1):
 
 * We support tensors using the `numpy.ndarray` object and we support many operations on them.
@@ -196,8 +196,8 @@ Here is the state of that vision as of 1 October 2012 (after Theano release
 * The profiler used by cvm is less complete than `ProfileMode`.
 * SIMD parallelism on the CPU comes from the compiler.
-* Multi-core parallelism is only supported Conv2d. If the external BLAS implementation supports it,
-  there is also, gemm, gemv and ger that are parallelized.
+* Multi-core parallelism is only supported by Conv2d. If the external BLAS implementation supports it,
+  there are also, gemm, gemv and ger that are parallelized.
 * No multi-node support.
 * Many, but not all NumPy functions/aliases are implemented.
   * http://www.assembla.com/spaces/theano/tickets/781
@@ -251,7 +251,7 @@ import theano and print the config variable, as in:
     Default False
 
-    Do the vm/cvm linker profile the execution of Theano function?
+    Do the vm/cvm linkers profile the execution of Theano functions?
 
 .. attribute:: profile_optimizer
 
@@ -259,7 +259,7 @@ import theano and print the config variable, as in:
     Default False
 
-    Do the vm/cvm linker profile the optimization phase when compiling a Theano function?
+    Do the vm/cvm linkers profile the optimization phase when compiling a Theano function?
 
 .. attribute:: config.lib.amdlibm
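
A minimal sketch of enabling the profiling attributes documented above; they are normally set through `THEANO_FLAGS` before `theano` is imported:

```python
import os

# Profile both the execution of compiled functions and the
# optimization phase of compilation.
os.environ['THEANO_FLAGS'] = 'profile=True,profile_optimizer=True'
import theano
```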
@@ -135,7 +135,7 @@ then be used like a normal Python function.
 variables to the values to substitute for them, and it returned
 the numerical value of the expression.
 
-:func:`eval` will be slow the first time you call it on a variable--
+:func:`eval` will be slow the first time you call it on a variable --
 it needs to call :func:`function` to compile the expression behind
 the scenes. Subsequent calls to :func:`eval` on that same variable
 will be fast, because the variable caches the compiled function.
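
A minimal sketch of the caching behavior described above (variable names are illustrative):

```python
import theano.tensor as T

x = T.dscalar('x')
y = x + 1
y.eval({x: 2.0})  # slow: compiles a function behind the scenes
y.eval({x: 3.0})  # fast: reuses the cached compiled function
```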
@@ -34,13 +34,12 @@ Also the ``-march=native`` flag must be used with care if you have NFS. In that
 Faster Theano function
 ----------------------
 
-You can set the Theano `allow_gc` to `False` to get a speed up by
-using more memory. By default, Theano free intermediate results when
-we don't need them anymore. Doing so prevent us from reusing this
-memory. So disabling the gc will keep all intermediate results memory
-space to allow to reuse them during the next call to the same Theano
-function if they are of the good shape. The shape could change if the
-shape of the inputs change.
+You can set the Theano flag `allow_gc` to `False` to get a speed-up by using
+more memory. By default, Theano frees intermediate results when we don't need
+them anymore. Doing so prevents us from reusing this memory. So disabling the
+garbage collection will keep all intermediate results' memory space to allow to
+reuse them during the next call to the same Theano function, if they are of the
+correct shape. The shape could change if the shapes of the inputs change.
 
 Faster Small Theano function
 ----------------------------
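
A minimal sketch of the `allow_gc` trade-off described above; the flag can also be set with `THEANO_FLAGS=allow_gc=False` before import:

```python
import theano

# Trade memory for speed: keep intermediate result buffers alive
# between calls so they can be reused when input shapes stay the same.
theano.config.allow_gc = False
```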
@@ -221,7 +221,7 @@ class Apply(Node):
         return new_node
 
     def get_parents(self):
-        return list( self.inputs )
+        return list(self.inputs)
 
     #convenience properties
     nin = property(lambda self: len(self.inputs), doc='same as len(self.inputs)')
@@ -387,8 +387,8 @@ class Variable(Node):
     def get_parents(self):
         if self.owner is not None:
-            return [ self.owner ]
-        return [ ]
+            return [self.owner]
+        return []
 
     def env_getter(self):
         warnings.warn("Variable.env is deprecated, it has been renamed 'fgraph'",
@@ -405,8 +405,7 @@ class Variable(Node):
                       stacklevel=2)
         del self.fgraph
 
-
-    def eval(self, inputs_to_values = None):
+    def eval(self, inputs_to_values=None):
         """ Evaluates this variable.
 
         inputs_to_values: a dictionary mapping theano Variables to values.
@@ -418,13 +417,12 @@ class Variable(Node):
         if not hasattr(self, '_fn'):
             self._fn_inputs = inputs_to_values.keys()
             self._fn = theano.function(self._fn_inputs, self)
-        args = [ inputs_to_values[param] for param in self._fn_inputs ]
+        args = [inputs_to_values[param] for param in self._fn_inputs]
         rval = self._fn(*args)
         return rval
 
     env = property(env_getter, env_setter, env_deleter)
@@ -1030,6 +1028,7 @@ def view_roots(r):
     else:
         return [r]
 
+
 def list_of_nodes(inputs, outputs):
     """ Return the apply nodes of the graph between inputs and outputs """
     return stack_search(
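
A hedged sketch exercising the graph helpers touched above, assuming the module path `theano.gof.graph` (the file this hunk modifies):

```python
import theano.tensor as T
from theano.gof import graph

x = T.vector('x')
y = (2 * x).sum()

nodes = graph.list_of_nodes([x], [y])  # apply nodes between inputs and outputs
parents = y.owner.get_parents()        # inputs of the apply node that made y
```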
@@ -1052,7 +1052,7 @@ CudaNdarray_TakeFrom(CudaNdarray * self, PyObject *args){
     // We are not 100% sure that cudaMemcpy wait that the async gpu kernel are
     // finished before doing the transfer. So we add this explicit sync as it
     // is pretty fast. In a python loop, I ran 1 000 000 call in 1 second.
-    // It is better to be save and not significatively slower then not safe.
+    // It is better to be safe and not significatively slower than unsafe.
     cudaThreadSynchronize();
 
     err = cudaMemcpy(&cpu_err_var, err_var, sizeof(int),
@@ -13,10 +13,10 @@ except ImportError:
     sys.stderr.write("WARNING: scipy can't be imported."
                      " We disable the sparse matrix code.")
 
-from type import *
+from theano.sparse.type import *
 
 if enable_sparse:
-    from basic import *
-    import opt
-    import sharedvar
-    from sharedvar import sparse_constructor as shared
+    from theano.sparse.basic import *
+    from theano.sparse import opt
+    from theano.sparse import sharedvar
+    from theano.sparse.sharedvar import sparse_constructor as shared
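
With the absolute imports above in place, the package is used as before; a minimal usage sketch, assuming scipy is installed so `enable_sparse` is true:

```python
import theano
from theano import sparse

x = sparse.csr_matrix(name='x', dtype='float64')  # symbolic sparse matrix
y = sparse.dense_from_sparse(x)                   # convert to a dense tensor
```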