* Allow GpuSoftmax and GpuSoftmaxWithBias to work with bigger input.
Bugs fixed:
* In one case an AdvancedSubtensor1 could be converted to a GpuAdvancedIncSubtensor1 instead of GpuAdvancedSubtensor1.
It probably didn't happen due to the order of optimizations, but that order is not guaranteed to be the same on all computers.
* Derivative of set_subtensor was wrong.
* Derivative of Alloc was wrong.
Crash fixed:
* On an unusual Python 2.4.4 on Windows
* When using a C cache copied from another location
* On Windows 32 bits when setting a complex64 to 0.
* Compilation crash with CUDA 4
* When copying the compilation cache from one computer to another
* This can be useful for using Theano on a computer without a compiler.
* GPU:
* Compilation crash fixed under Ubuntu 11.04
* Compilation crash fixed with CUDA 4.0
Known bug:
* CAReduce with NaN in its inputs does not return the correct output (`Ticket <http://trac-hg.assembla.com/theano/ticket/763>`_).
* This is used in tensor.{max,mean,prod,sum} and in the grad of PermuteRowElements.
* This is not a new bug, just a bug discovered since the last release that we didn't have time to fix.
Deprecation (will be removed in Theano 0.5, warning generated if you use them):
* The string mode (accepted only by theano.function()) FAST_RUN_NOGC. Use Mode(linker='c|py_nogc') instead.
* The string mode (accepted only by theano.function()) STABILIZE. Use Mode(optimizer='stabilize') instead.
* scan interface change:
* The use of `return_steps` for specifying how many entries of the output
scan returns has been deprecated
* The same thing can be done by applying a subtensor on the output
returned by scan to select a certain slice
* The inner function (that scan receives) should return its outputs and
updates following this order:
[outputs], [updates], [condition]. One can skip any of the three if not
used, but the order has to stay unchanged.
* tensor.grad(cost, wrt) will return an object of the "same type" as wrt
(list/tuple/TensorVariable).
* Currently tensor.grad returns a list when wrt is a list/tuple of
more than 1 element.
Deprecated in 0.4.0 (Reminder, warning generated if you use them):
* Dividing integers with / is deprecated: use // for integer division, or
cast one of the integers to a float type if you want a float result (you may
also change this behavior with config.int_division).
* tag.shape attribute deprecated (#633)
* CudaNdarray_new_null is deprecated in favour of CudaNdarray_New
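As an illustration of the integer-division item above, the following plain-Python sketch shows the difference between floor division and a float result. It does not import Theano; it only mirrors the two options the note describes (`//`, or casting one operand to a float), and the variable names are hypothetical.

```python
# Plain-Python illustration (no Theano involved); names are hypothetical.
a, b = 7, 2

# Option 1: explicit integer (floor) division.
floor_quotient = a // b

# Option 2: cast one operand to a float to get a float result.
float_quotient = float(a) / b

# Floor division rounds toward negative infinity, not toward zero.
negative_floor = -7 // 2

print(floor_quotient, float_quotient, negative_floor)
```

Writing `//` or an explicit cast makes the intended result type visible at the call site, which is presumably why the note recommends it over a bare `/`.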
Sandbox:
* MRG random generator now implements the same casting behavior as the regular random generator.
Sandbox New features (not enabled by default):
* New Linkers (theano flags linker={vm,cvm})
* The new linker allows lazy evaluation of the new ifelse op, meaning we compute only the true or false branch depending on the condition. This can speed up some types of computation.
* Uses a new profiling system (that currently tracks less stuff)
* The cvm is implemented in C, so it lowers Theano's overhead.
* The vm is implemented in Python, so it can help debugging in some cases.
* In the future, the default will be the cvm.
* Some new not yet well tested sparse ops: theano.sparse.sandbox.{SpSum, Diag, SquareDiagonal, ColScaleCSC, RowScaleCSC, Remove0, EnsureSortedIndices, ConvolutionIndices}
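The lazy evaluation mentioned for the new linkers can be pictured with a small plain-Python sketch. This is not Theano's implementation and does not use its API: `eager_switch`, `lazy_ifelse` and `expensive` are hypothetical names, and the thunks (zero-argument lambdas) stand in for the two branches of an ifelse.

```python
# Plain-Python sketch of eager vs. lazy branch evaluation (hypothetical names).
calls = []

def expensive(name, value):
    # Record that this branch was actually computed.
    calls.append(name)
    return value

def eager_switch(cond, true_value, false_value):
    # Both arguments are already computed before this function runs.
    return true_value if cond else false_value

def lazy_ifelse(cond, true_thunk, false_thunk):
    # Only the selected thunk is called, so only one branch is computed.
    return true_thunk() if cond else false_thunk()

eager_result = eager_switch(True, expensive("true", 1), expensive("false", 2))
eager_calls = list(calls)

calls.clear()
lazy_result = lazy_ifelse(True,
                          lambda: expensive("true", 1),
                          lambda: expensive("false", 2))
lazy_calls = list(calls)

print(eager_calls, lazy_calls)
```

In the eager version both branches are computed even though one result is discarded; the lazy version skips the untaken branch, which is where the speed-up described above would come from.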
Documentation:
* How to compute the `Jacobian, Hessian, Jacobian times a vector, Hessian times a vector <http://deeplearning.net/software/theano/tutorial/gradients.html>`_.
* Slides for a 3-hour class with exercises, given at the HPCS2011 Conference in Montreal.
Others:
* Logger name renamed to be consistent.
* Logger function simplified and made more consistent.
* Fixed errors being replaced by other, unrelated errors when using the compute_test_value Theano flag.
* Compilation cache enhancements.
* Made compatible with NumPy 1.6 and SciPy 0.9
* Fixed tests that failed when NumPy has new dtypes that are not supported by Theano.
* Fixed some tests when SciPy is not available.
* Don't compile anything when Theano is imported. Compile support code when we compile the first C code.
* Python 2.4 fix:
* Fix the file theano/misc/check_blas.py
* For Python 2.4.4 on Windows, replaced float("inf") with numpy.inf.
* Removes useless inputs to a scan node
* Beautification mostly, making the graph more visible. Such inputs would appear as a consequence of other optimizations
Core:
* There is a new mechanism that lets an Op permit one of its
inputs to be aliased to another destroyed input. This will generally
result in incorrect calculation, so it should be used with care! The
right way to use it is when the caller can guarantee that even if
these two inputs look aliased, they actually will never overlap. This
mechanism can be used, for example, by a new alternative approach to
implementing Scan. If an op has an attribute called
"destroyhandler_tolerate_aliased" then this is what's going on.
IncSubtensor is thus far the only Op to use this mechanism.
Interface Behavior Change (was deprecated and generated a warning since Theano 0.3 released the 23 Nov 2010):
* The default value of the parameter axis of
theano.{max,min,argmax,argmin,max_and_argmax} is now the same as
numpy: None, i.e. operate on all dimensions of the tensor.
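To make the meaning of `axis=None` concrete, here is a plain-Python sketch on a nested list. It imports neither Theano nor NumPy; `max_2d` is a hypothetical helper that only mimics the convention described above for a 2-D input.

```python
# Plain-Python sketch of the axis convention (hypothetical helper, 2-D only).
def max_2d(rows, axis=None):
    if axis is None:
        # Reduce over all dimensions: a single scalar result.
        return max(value for row in rows for value in row)
    if axis == 0:
        # Reduce over rows: one result per column.
        return [max(column) for column in zip(*rows)]
    if axis == 1:
        # Reduce over columns: one result per row.
        return [max(row) for row in rows]
    raise ValueError("axis must be None, 0 or 1 for a 2-D input")

data = [[1, 5, 2],
        [4, 3, 6]]

overall = max_2d(data)           # axis=None
per_column = max_2d(data, 0)
per_row = max_2d(data, 1)

print(overall, per_column, per_row)
```

Code that relied on a different implicit default would need to pass `axis` explicitly to keep its previous behavior.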
Interface Feature Removed (was deprecated):
* The string modes FAST_RUN_NOGC and STABILIZE are no longer accepted. They were accepted only by theano.function(). Use Mode(linker='c|py_nogc') or Mode(optimizer='stabilize') instead.
* tensor.grad(cost, wrt) now returns an object of the "same type" as wrt
(list/tuple/TensorVariable).
* a few tag.shape and Join.vec_length left.
* scan interface change: RP
* The use of `return_steps` for specifying how many entries of the output
scan has been deprecated
...
[outputs], [updates], [condition]. One can skip any of the three if not
used, but the order has to stay unchanged.
* shared.value is removed; use shared.set_value() or shared.get_value() instead.
New Deprecation (will be removed in Theano 0.6, warning generated if you use them):
* tensor.shared() renamed to tensor._shared (Olivier D.)
* You probably want to call theano.shared()!
Interface Bug Fix:
* In some cases Rop should have returned a list of 1 Theano variable, but returned that variable directly.
* The Theano flag "home" is no longer used, as it was a duplicate. If you use it, Theano should raise an error.
New features:
* Added 1d advanced indexing support to inc_subtensor and set_subtensor (James)
* tensor.{zeros,ones}_like now support the dtype parameter, as in NumPy (Fred)
* Config flag "exception_verbosity" to control the verbosity of exceptions (Ian)
* theano-cache list: list the content of the Theano cache (Fred)
* tensor.ceil_int_div FB
* MaxAndArgMax.grad now works with any axis (the op supports only 1 axis) FB
* used by tensor.{max,min,max_and_argmax}
* tensor.{all,any} RP
* tensor.roll, as in NumPy (Matthew Rocklin, DWF)
* on Windows work. Still experimental. (Sebastian Urban)
* IfElse now allows a list/tuple as the result of the if/else branches.
* They must have the same length and corresponding types. RP
* argmax dtype as int64. OD
New Optimizations:
* AdvancedSubtensor1 reuses preallocated memory if available (scan, c|py_nogc linker) (Fred)
* tensor_variable.size (as numpy) product of the shape elements OD
* sparse_variable.size (as scipy) the number of stored values. OD
* dot22, dot22scalar work with complex (Fred)
* Doc on how to wrap in Theano an existing Python function (in numpy, scipy, ...) Fred
* added arccos IG
* sparse dot with full output. (Yann Dauphin)
* Optimized to Usmm and UsmmCscDense in some case (YD)
* Note: theano.dot, sparse.dot return a structured_dot grad(
* Generate Gemv/Gemm more often JB
* scan move computation outside the inner loop when the remove everything from the inner loop RP
* scan optimizations are done earlier. This allows other optimizations to be applied. FB, RP, GD
* exp(x) * sigmoid(-x) is now correctly optimized to a more stable form.
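The note on exp(x) * sigmoid(-x) does not say which stable form is produced. As a hedged illustration, the plain-Python sketch below uses the algebraic identity exp(x) * sigmoid(-x) = exp(x) / (1 + exp(x)) = sigmoid(x) and shows why the direct expression is fragile. The function names are hypothetical and nothing here calls Theano.

```python
import math

def naive(x):
    # Direct evaluation of exp(x) * sigmoid(-x), with sigmoid(-x) = 1 / (1 + exp(x)).
    # math.exp(x) raises OverflowError for large positive x.
    return math.exp(x) * (1.0 / (1.0 + math.exp(x)))

def stable(x):
    # Mathematically equal to sigmoid(x); only exponentials of non-positive
    # values are computed, so they cannot overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

small_difference = abs(naive(0.5) - stable(0.5))

try:
    naive(1000.0)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True

large_value = stable(1000.0)
print(small_difference, naive_overflowed, large_value)
```

For moderate inputs the two functions agree to rounding error; only the rewritten form survives large positive inputs.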
GPU:
* GpuAdvancedSubtensor1 supports broadcasted dimensions
Bugs fixed:
* On CPU, if the convolution had received explicit shape information, it was not checked at run time. This caused wrong results if the input shape was not the one expected. (Fred, reported by Sander Dieleman)
* Scan grad when the inputs of scan are sequences of different lengths. (RP, reported by Michael Forbes)
* Scan.infer_shape now works correctly when working with a condition for the number of loops. In the past, it returned n_steps as the shape, which is not always true. RP
* Theoretical bug: in some cases GPUSum could return bad values. We were not able to reproduce the error.
* Patterns affected ({0,1}*nb dim, 0 no reduction on this dim, 1 reduction on this dim):
01, 011, 0111, 010, 10, 001, 0011, 0101: FB
* Division by zero in verify_grad. This hid a bug in the grad of Images2Neibs. (JB)
* theano.sandbox.neighbors.Images2Neibs grad was returning wrong values. The grad is now disabled and returns an error. FB
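The reduction patterns listed for the GPUSum item above are strings with one character per dimension, where 1 marks a reduced dimension. A small plain-Python sketch of how to read them follows; `reduced_axes` is a hypothetical helper for illustration and is not part of Theano.

```python
# Hypothetical helper: list the dimensions a reduction pattern reduces over.
def reduced_axes(pattern):
    # One character per dimension: "1" means reduce, "0" means keep.
    return [axis for axis, flag in enumerate(pattern) if flag == "1"]

affected = ["01", "011", "0111", "010", "10", "001", "0011", "0101"]
for pattern in affected:
    print(pattern, "->", reduced_axes(pattern))
```

For example, `011` describes a 3-dimensional input reduced over its last two dimensions.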
Crash fixed:
* T.mean crash at graph building time, by Ian G.
* "Interactive debugger" crash fix (Ian, Fred)
* "Interactive Debugger" renamed to "Using Test Values"
* Do not call gemm with strides 0, some blas refuse it. (PL)
* Optimization crash with gemm and complex. (Fred)
* Gpu crash with elemwise Fred
* Compilation crash with amdlibm and the gpu. Fred
* IfElse crash Fred
* Execution crash fix in AdvancedSubtensor1 on 32-bit computers (PL)
* gpu compilation crash on MacOS X OD
* gpu compilation crash on MacOS X Fred
* Support for OSX Enthought Python Distribution 7.x (Graham Taylor, OD)
* When the subtensor inputs had 0 dimensions and the outputs 0 dimensions
* Crash when the step to subtensor was not 1 in conjunction with some optimization
Sandbox New features (not enabled by default):
* CURAND_RandomStreams for uniform and normal (not picklable, GPU only) (James)
Documentation:
* Many updates by many people: Olivier Delalleau, Fred, RP, David,
* Updates to install doc on MacOS (OD)
* Updates to install doc on Windows (DWF, OD)
* Doc on how to use scan to loop with a condition as the number of iterations RP
Others:
* Better error messages in many places: David Warde-Farley, Ian, Fred, Olivier D.
* pep8: James,
* min_informative_str to print graph: Ian G.
* Fix catching of exceptions. (Sometimes we caught interrupts): Fred, David, Ian, OD,
* Better support for utf strings (David WF)
* Fix pydotprint with a function compiled with a ProfileMode (Fred)
* Was broken with change to the profiler.
* Warning when people have old cache entries (OD)
* More tests for join on the GPU and CPU.
* Don't request to load the gpu module by default in scan module. RP
* Better opt that lift transpose around dot JB
* Fix some import problems
* Filtering update JB
Reviewers:
James, David, Ian, Fred, Razvan, delallea