* Fix wrong results of GpuDownsampleFactorMaxGrad on Mac OS X. (Pascal L.)
* Auto-detect and work around a bug in BLAS on Mac OS X (Pascal L.)
* Work around a bug in Mac OS X. If two compiled modules had the same name, the OS or Python did not always pick the right one, even when we used the right handle to it. (Pascal L.)
We now use a hash in the Python module name and in %(nodename)s, so that helper functions in the support code for different Ops always have different names.
* Fix infinite loop related to Scan on the GPU. (Pascal L.)
* Fix ConstructSparseFromList.infer_shape, (Pascal L., reported by Rami Al-Rfou')
* Fix reductions that upcast the input over no axis, e.g. calling theano.sum() on a scalar when the original dtype isn't float64 or [u]int64. (introduced in the development version after the 0.6rc3 release) (Frederic B.)
They produced bad results because we did not upcast the inputs in the code, we just copied them.
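For reference, the analogous NumPy behavior that the fixed reduction is expected to match: NumPy upcasts small integer dtypes to the platform default integer when summing, rather than copying the input dtype.

```python
import numpy as np

# NumPy upcasts small integer dtypes when summing (e.g. int8 -> platform
# default int). The bug above copied the input instead of upcasting it.
x = np.array(5, dtype='int8')
s = x.sum()
print(s, s.dtype)  # value preserved, dtype upcast from int8
```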
New Features:
* Python 3.3 compatible (abalkin, Gabe Schwartz, Frederic B.)
* A new profiler (Frederic B.)
The new profiler can now profile memory, with the Theano flag profile_memory=True.
ProfileMode can no longer profile memory, and prints a message saying so.
We now raise an error when profiling with the GPU enabled if the environment
variable that forces the driver to sync kernel launches is not set correctly,
since otherwise the profile information is useless.
The new profiler supports enabling/disabling the garbage collection.
* Add tensor.tri, tensor.triu, and tensor.tril functions that wrap the NumPy equivalents (Jeremiah Lowin)
* Fix import theano crash when g++ isn't there (OD)
* Fix crash related to rebuild of Theano graph (Pascal L., reported by Divine Eguzouwa)
* Fix crash during compilation (David Warde-Farley)
* Crash fix in the grad of GPU op in corner case (Pascal L.)
* Crash fix on MacOS X (Robert Kern)
* theano.misc.gnumpy_utils.garray_to_cudandarray() now sets strides correctly for dimensions of 1. (Frederic B., reported by Justin Bayer)
* Fix crash during optimization with consecutive sum and some combination of axis (Frederic B., reported by Çağlar Gülçehre)
* Fix crash with keepdims and negative axis (Frederic B., reported by David W.-F.)
* Fix crash of theano.[sparse.]dot(x,y) when x or y is a vector. (Frederic B., reported by Zsolt Bitvai)
* Fix optimization crash/disabling with ifelse on the GPU (Frederic B., reported by Ryan Price)
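As a usage sketch for the new profiler entry above (my_script.py is a placeholder; CUDA_LAUNCH_BLOCKING is the usual CUDA variable for forcing synchronous kernel launches, assuming a CUDA back-end):

```shell
# Profile run time and memory with the new profiler (flags from the entry
# above). CUDA_LAUNCH_BLOCKING=1 forces the driver to sync kernel launches
# so GPU timings are meaningful.
CUDA_LAUNCH_BLOCKING=1 THEANO_FLAGS=profile=True,profile_memory=True python my_script.py
```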
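For reference, the new tensor.tri/triu/tril functions above wrap the NumPy functions of the same name, whose semantics are:

```python
import numpy as np

# NumPy equivalents of the new tensor.tri/triu/tril wrappers:
print(np.tri(3, dtype='int64'))   # lower-triangular matrix of ones
a = np.arange(9).reshape(3, 3)
print(np.triu(a))                 # zero out entries below the diagonal
print(np.tril(a))                 # zero out entries above the diagonal
```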
Others:
* Theano flags are now evaluated lazily, only if requested (Frederic B.)
* Fix tests when g++ is not available (Frederic B.)
* Typo/pep8 (Olivier D., Frederic B.)
* Update doc (Ben McCann)
* Doc compatibility guide (abalkin)
* Add manual instructions for OpenBLAS on Ubuntu (Jianri Li)
* Doc fixes (Yaroslav Halchenko)
* Better error message (Ian Goodfellow)
* More doc (Frederic B.)
* Fix error reporting with GpuConv (Frederic B., reported by Heng Luo and Nicolas Pinto)
* Update BLAS compilation doc on windows to use OpenBLAS (Olivier D.)
* The infer_shape tester method now warns if the shape values could hide errors. (Frederic B.)
* Travis-CI now tests the parts that need SciPy with SciPy available (Frederic B.)
* Export some functions that work on CudaNdarray for Windows (Frederic B.)
* If the user specifies an -arch=sm_* value in the Theano flags for the GPU, don't add another one (Frederic B., Pascal L.)
* If a C thunk returns an error, check whether a Python exception is set; otherwise, set a default one (Pascal L.)
* Fix a crash introduced in the development version (Wei LI)
* Added BLAS benchmark result (Frederic B., Ben McCann)
* Fix code comment (Hannes Schulz)
* More stable tests (Frederic B.)
* Add utt.assert_allclose(a, b) to give better error messages. (Frederic B.)
* Better error message (Pascal L., Frederic B.)
* Better error message with compute_test_value (Frederic B., reported by John Salvatier)
* Stochastic order behavior fix (Frederic B.)
* Simpler initial graph for subtensor infer shape (Olivier D.)
The optimization already simplified it, but this allows easier reading of the graph before optimization.
* Better detection of non-aligned ndarrays (Frederic B.)
* Update MRG multinomial gradient to the new interface (Mehdi Mirza)
* Implement Image2Neibs.perform() to help debug (Frederic B.)
* Remove Theano flags from the compilation key (Frederic B.)
* Make theano-nose work on executable *.py files. (Alistair Muldal)
* Make theano-nose work with older nose version (Frederic B.)
* Add extra debug info in verify_grad() (Frederic B.)
Todo for the final release:
* update the NEWS.txt file.
* https://github.com/Theano/Theano/pull/1450
* More stable test (Frederic B.)
* https://github.com/Theano/Theano/pull/1441
* On error during execution of a Theano function, now always print the apply node and its inputs' shapes, strides and types. If the Theano flag exception_verbosity=high is set, also print debugprint(op, stop_on_name=True, print_type=True)
* https://github.com/Theano/Theano/pull/1238
This refers to the bug reported by Akio Takano on the mailing list. The problem occurs when one tries to replace x with a function of x, f(x).
The fix first replaces x with some tmp_x, then replaces tmp_x with f(x).
Reported by Akio Takano, done by Razvan.
* https://github.com/Theano/Theano/pull/1236
Install script with anaconda 1.3.1 (Pascal L.)
=============
Release Notes
=============
Theano 0.6rc3 (February 14th, 2013)
===================================
Highlights:
* Windows related fixes.
* Speed-ups.
* Crash fixes.
* A few small interface changes.
* GPU memory leak fix.
* A few corner cases fixes without incidence.
* More Theano determinism
* tensor.{dot,tensordot} more complete/faster/GPU friendly.
* tensor.tensordot now supports Rop/Lop
* tensor.dot supports n-dimensional inputs, as in NumPy
* DebugMode prints more info when there is an error. (Frederic B.)
* Better profiling of test time with `theano-nose --time-profile`. (Frederic B.)
* Detection of infinite loops with the global optimizer. (Pascal L.)
* DebugMode.check_preallocated_output now also works on Theano function outputs. (Pascal L.)
* DebugMode will now complain when the strides of CudaNdarray dimensions of size 1 are not 0. (Frederic B.)
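The n-dimensional tensor.dot highlight above follows NumPy's rule: the last axis of the first argument is contracted with the second-to-last axis of the second. As a reminder of those semantics (shown with NumPy, not the Theano API):

```python
import numpy as np

# NumPy's n-d dot rule, which tensor.dot now follows: the last axis of a
# is contracted with the second-to-last axis of b.
a = np.ones((2, 3, 4))
b = np.ones((4, 5))
c = np.dot(a, b)
print(c.shape)  # (2, 3, 5); each entry sums 4 ones -> 4.0
```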
Speed-ups:
* c_code for SpecifyShape op. (Frederic B.)
* The cross-entropy optimization now works when specify_shape is used. (Pascal L.)
* The Scan optimizations ScanSaveMem and PushOutDot1 are applied more frequently. (Razvan P., reported by abalkin)
A skipped optimization warning was printed.
* dot(vector, vector) is now faster with some BLAS implementations. (Eric Hunsberger)
OpenBLAS, and possibly others, didn't call {s,d}dot internally when we called {s,d}gemv; MKL did.
* Compilation speed-up: take the compiledir lock only for ops that generate c_code. (Frederic B.)
* More scan optimization (Razvan P.)
* Opt to make RNN fast in Theano.
* Optimize some case of dot, by moving them outside of Scan.
* Move some sequences outside of scan too.
* Merge more scan inputs, mostly byproduct of other Scan optimizations.
* c_code for theano.sparse.AddSD. (Rami Al-Rfou', Vivek Kulkarni)
Crash Fixes:
* Fix crash about dimshuffle. (abalkin)
* Fix crash at compilation. (Olivier D.)
* Fix openmp detection. (Pascal L.)
Resulted in a crash with EPD on Windows.
* Fix for new BLAS interface in SciPy. (Olivier D.)
Fix crash with some development version of SciPy.
* GpuSum works with bigger shapes when summing over the first dim of a 3d tensor. (Frederic B., reported by Chris Currivan)
* Windows compilation crash fix. (Frederic B.)
* Make CrossentropySoftmax1HotWithBiasDx and CrossentropySoftmaxArgmax1HotWithBias support uint* dtype. (Frederic B., reported by Mark Fenner)
* Fix GpuSoftmax and GpuSoftmaxWithBias crash on GTX285. (Frederic B.)
* Fix crash due to a race condition when importing theano. (Ian G.)
* Fix crash from path problem with `theano-nose --batch`. (Abalkin)
* Fix crash with tensor.roll(Var, iscalar). (Frederic B., reported by Jeremiah Lowin)
* Fix compilation crash with llvm on Mac. (Abalkin)
* Fix the grad of Scan that wrongly reported no connection between cost and parameters. (Razvan P.)
* The infer_shape mechanism now forces broadcasted dimensions to have a shape known to be equal to one during compilation.
Sometimes we could not know this before run time, and it resulted in crashes. (Frederic B.)
* Fix compilation problems on GPU on Windows. (Frederic B.)
* Fix copy on the GPU with big shape for 4d tensor (Pascal L.)
* GpuSubtensor didn't set the stride to 0 for dimensions of 1. This could lead to check failing later that caused a crash. (Frederic B., reported by vmichals)
Theoretical bugfixes (bugs that won't happen with current Theano code, but could have affected you if you messed with the internals):
* GpuContiguous, GpuAlloc, GpuDownSampleGrad and Conv2d now check the strides of preallocated outputs before using them. (Pascal L.)
* GpuDownSample and GpuDownSampleGrad didn't work correctly with negative strides in their output, due to a problem with nvcc (Pascal L., reported by abalkin?)
Others:
* Fix race condition when determining if g++ is available. (Abalkin)
* Documentation improvements. (Many people including David W-F, abalkin, Amir Elaguizy, Olivier D., Frederic B.)
* The current GPU back-end has a new function CudaNdarray_prep_output(CudaNdarray ** arr, int nd, const int * dims) (Ian G.)
Theano 0.6rc2 (November 21st, 2012)
===================================
Highlights:
* Fix for a few regressions introduced in 0.6rc1.
* A few new features.
* Speed-ups.
* Scan fixes.
* Crash fixes.
* A few small interface changes.
Committers for this rc2 only:
Razvan Pascanu
Pascal Lamblin
Frederic Bastien
Ian Goodfellow
Jeremiah Lowin
Caglar Gulcehre
Jey Kottalam
Matthew Rocklin
abalkin
Regressions in 0.6rc1 fixed:
* Fixed the Scan gradient dtype issue. In 0.6rc1, some upcasts were inserted. (Razvan P.)
* grad() now behaves as before 0.6rc1 for floats, i.e. the grad dtype inside the graph will be the same as the input's dtype. If you ask for the direct grad, it returns the computed dtype. (Pascal L.)
Wrong results fixes:
* Scan, in some cases, did not return the right results. (Razvan P., reported by Jeremiah L.)
This happened if you had a state with only negative taps whose output was a function of some sequence.
If you had multiple states, there was no problem.
* Fixed bug in Scan with multiple outputs,
where one output would sometimes overwrite another one. (Razvan P.)
* Clip.grad treated the gradient with respect to the clipping boundary as always 0. (Ian G.)
Interface changes:
* We no longer support unaligned ndarrays in Python code. (Frederic B.)
We did not support it in C code and supporting it in Python code made
the detection harder.
* Now we only officially support SciPy 0.7.2 and NumPy 1.5.0 (Frederic B.)
We weren't and aren't testing with older versions.
* The theano.sparse.SparseType is available even when SciPy is not (Frederic B.)
* Fixed issue where members of consider_constant grad parameter
were treated differently from Constant variables. (Ian G.)
* Removed the parameter g_cost from theano.grad(). (Ian G.)
Use the new more powerful parameter known_grads instead.
NumPy interface support:
* theano.tensor.where is an alias for theano.tensor.switch to support NumPy semantics. (Ian G.)
* TensorVariable objects now have dot, argmin, argmax, clip, conj, repeat, trace, std, round,
ravel and argsort methods, and the real and imag properties, like numpy.ndarray objects.
The functionality was already available in Theano. (abalkin)
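A quick sketch of the numpy.where semantics that the new theano.tensor.where alias mirrors (shown here with NumPy itself):

```python
import numpy as np

# where(cond, a, b) selects elementwise: a where cond is true, else b.
# theano.tensor.where aliases tensor.switch to match this semantic.
x = np.array([-2, -1, 0, 1, 2])
y = np.where(x > 0, x, 0)
print(y)  # [0 0 0 1 2]
```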
Speed-ups:
* A C version of the SoftMax op (Razvan P.)
C code already existed for the softmax-with-bias case.
* Faster GpuIncSubtensor (Ian G.)
* Faster copy on the GPU for 4d tensor. (Ian G.)
* The fix of flatten's infer_shape re-enables an optimization (Pascal L.)
The bug was introduced in 0.6rc1.
* Enable inc_subtensor on the GPU when updating it with a float64 dtype. (Ian G.)
It was causing an optimization warning.
* Make DeepCopy reuse preallocated memory. (Frederic B.)
* Move the convolution to the GPU when the image shape and logical image shape differ. (Frederic Bastien)
* C code for the View Op (Razvan P., Pascal L.)
New Features:
* Added a monitoring mode "MonitorMode" as a debugging tool. (Olivier D.)
* Allow integer axes when keepdims==True (Jeremiah Lowin)
* Added erfinv and erfcinv op. (Jey Kottalam)
* Added tensor.batched_dot(). (Caglar Gulcehre)
It uses scan behind the scenes, but makes doing this easier.
* theano.get_constant_value(x) (Frederic B.)
This tries to obtain the value of x as a constant int, doing some constant folding to convert x into an int.
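As a reminder of the keepdims semantics mentioned in the "Allow integer axes when keepdims==True" entry above (shown with NumPy, whose behavior Theano follows):

```python
import numpy as np

# keepdims=True keeps the reduced axis as size 1, so the result
# broadcasts against the original array.
x = np.arange(6).reshape(2, 3)
s = x.sum(axis=1, keepdims=True)
print(s.shape)        # (2, 1)
print((x / s).shape)  # broadcasts back to (2, 3)
```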
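The tensor.batched_dot entry above computes one matrix product per leading (batch) index. A NumPy sketch of those semantics (not the Theano API itself):

```python
import numpy as np

# batched_dot semantics for 3d inputs: out[i] = dot(a[i], b[i]) for each
# batch index i. einsum expresses the same contraction without a loop.
rng = np.random.RandomState(0)
a = rng.rand(5, 2, 3)
b = rng.rand(5, 3, 4)
out = np.einsum('ijk,ikl->ijl', a, b)
ref = np.array([np.dot(a[i], b[i]) for i in range(5)])
print(np.allclose(out, ref))  # True
```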