* DebugMode print more info when there is an error. (Frederic B.)
* Better profiling of test time with `theano-nose --time-profile`. (Frederic B.)
* Detection of infinite loop with global optimizer. (Pascal L.)
* DebugMode.check_preallocated_output now also work on Theano function output. (Pascal L.)
* DebugMode will now complain when the strides of CudaNdarray of dimensions of 1 are not 0. (Frederic B.)
Speed-ups:
* c_code for SpecifyShape op. (Frederic B.)
* cross-entropy optimization now work when specify_shape is used. (Pascal L.)
* The Scan optimization ScanSaveMem and PushOutDot1 applied more frequently. (Razvan P, reported Abalkin)
A skipped optimization warning was printed.
* dot(vector, vector) now faster with some BLAS implementation. (Eric Hunsberger)
OpenBLAS and possibly others didn't call {s,d}dot internally when we called {s,d}gemv.
MKL was doing this.
* Compilation speed up: Take the compiledir lock only for op that generate c_code. (Frederic B)
* More scan optimization (Razvan P.)
* Opt to make RNN fast in Theano.
* Optimize some case of dot, by moving them outside of Scan.
* Move some sequences outside of scan too.
* Merge more scan inputs, mostly byproduct of other Scan optimizations.
* c_code for theano.sparse.AddSD. (Rami Al-Rfou', Vivek Kulkarni)
Crash Fixes:
* Fix crash about dimshuffle. (abalkin)
* Fix crash at compilation. (Olivier D.)
* Fix openmp detection. (Pascal L.)
Resulted in a crash with EPD on Windows.
* Fix for new BLAS interface in SciPy. (Olivier D.)
Fix crash with some development version of SciPy.
* GpuSum work with bigger shape when summing on the first dim on 3d tensor. (Frederic B., reported Chris Currivan)
* Windows compilation crash fix. (Frederic B.)
* Make CrossentropySoftmax1HotWithBiasDx and CrossentropySoftmaxArgmax1HotWithBias support uint* dtype. (Frederic B., reported by Mark Fenner)
* Fix GpuSoftmax and GpuSoftmaxWithBias crash on GTX285. (Frederic B.)
* Fix crash due to a race condition when importing theano. (Ian G.)
* Fix crash from path problem with `theano-nose --batch`. (Abalkin)
* Fix crash with tensor.roll(Var, iscalar). (Frederic B., reported by Jeremiah Lowin)
* Fix compilation crash with llvm on Mac. (Abalkin)
* Fix the grad of Scan that told wrongly that there is no connection between cost and parameters. (Razvan P.)
* The infer shape mechanism now force that broadcasted dimensions have a shape know to be equivalent to one during compilation.
Sometimes, we where not able knowing this before run time and resulted in crash. (Frederic B.)
* Fix compilation problems on GPU on Windows. (Frederic B.)
* Fix copy on the GPU with big shape for 4d tensor (Pascal L.)
* GpuSubtensor didn't set the stride to 0 for dimensions of 1. This could lead to check failing later that caused a crash. (Frederic B., reported by vmichals)
Theoretical bugfix (bug that won't happen with current Theano code, but if you messed with the internal, could have affected you):
* GpuContiguous, GpuAlloc, GpuDownSampleGrad, Conv2d now check the preallocated outputs strides before using it. (Pascal L.)
* GpuDownSample, GpuDownSampleGrad didn't work correctly with negative strides in their output due to problem with nvcc (Pascal L, reported by abalkin?)
Others:
* Fix race condition when determining if g++ is available. (Abalkin)
* Documentation improvements. (Many people including David W-F, abalkin, Amir Elaguizy, Olivier D., Frederic B.)
* The current GPU back-end have a new function CudaNdarray_prep_output(CudaNdarray ** arr, int nd, const int * dims) (Ian G)
=============
Release Notes
=============
Theano 0.6rc2 (November 21th, 2012)
===================================
Highlights:
* Fix for a few regressions introduced in 0.6rc1.
* A few new features.
* Speed-ups.
* Scan fixes.
* Crash fixes.
* A few small interface changes.
Commiters for this rc2 only:
Razvan Pascanu
Pascal Lamblin
Frederic Bastien
Ian Goodfellow
Jeremiah Lowin
Caglar Gulcehre
Jey Kottalam
Matthew Rocklin
abalkin
Regressions in 0.6rc1 fixed:
* Fixed the scan gradient dtype issue. In 0.6rc1, some upcast were inserted. (Razvan P.)
* Now grad() will do as before 0.6rc1 for float, i.e. the grad dtype will be the same as the inputs inside the graph. If you ask for the direct grad, it will return the computed dtype. (Pascal L.)
Wrong results fixes:
* Scan fix in some case didn't returned the good results. (Razvan P., reported by Jeremiah L.)
This happened if you had a state with only neg tap and the output of the state was a function of some sequence.
If you had multiple states, there was no problem.
* Fixed bug in Scan with multiple outputs,
where one output would sometimes overwrite another one. (Razvan P.)
* Clip.grad treated the gradient with respect to the clipping boundary as always 0. (Ian G.)
Interface changes:
* We do not support anymore unaligned ndarray in Python code. (Frederic B.)
We did not support it in C code and supporting it in Python code made
the detection harder.
* Now we only officially support SciPy 0.7.2 and NumPy 1.5.0 (Frederic B.)
We weren't and aren't testing with older versions.
* The theano.sparse.SparseType is available even when SciPy is not (Frederic B.)
* Fixed issue where members of consider_constant grad parameter
were treated differently from Constant variables. (Ian G.)
* Removed the parameter g_cost from theano.grad(). (Ian G.)
Use the new more powerful parameter known_grads instead.
NumPy interface support:
* theano.tensor.where is an alias for theano.tensor.switch to support NumPy semantic. (Ian G.)
* TensorVariable objects now have dot, argmin, argmax, clip, conj, repeat, trace, std, round,
ravel and argsort functions and the real and imag properties as numpy.ndarray objects.
The functionality was already available in Theano. (abalkin)
Speed-ups:
* A C version of the SoftMax op (Razvan P.)
There was C code for the softmax with bias code.
* Faster GpuIncSubtensor (Ian G.)
* Faster copy on the GPU for 4d tensor. (Ian G.)
* The fix of flatten infer_shape re-enables an optimization (Pascal L.)
* The bug was introduced in 0.6rc1.
* Enable inc_subtensor on the GPU when updating it with a float64 dtype. (Ian G.)
It was causing an optimization warning.
* Make DeepCopy reuse preallocated memory. (Frederic B.)
* Move the convolution to the GPU when the image shape and logical image shape differ. (Frederic Bastien)
* C code for the View Op (Razvan P., Pascal L.)
New Features:
* Added a monitoring mode "MonitorMode" as a debugging tool. (Olivier D.)
* Allow integer axes when keepdims==True (Jeremiah Lowin)
* Added erfinv and erfcinv op. (Jey Kottalam)
* Added tensor.batched_dot(). (Caglar Gulcehre)
It uses scan behind the scenes, but makes doing this easier.
* theano.get_constant_value(x) (Frederic B.)
This tries to have x as a constant int.
This does some constant folding to try to convert x into an int.