* CudaNdarray_new_null is deprecated in favour of CudaNdarray_New
* Dividing integers with / is deprecated: use // for integer division, or
cast one of the integers to a float type if you want a float result (you may
also change this behavior with config.int_division).
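
The integer-division note above can be illustrated with a minimal sketch. It uses plain Python 3 integers rather than Theano variables, so it only shows the two spellings the note recommends; the variable names are made up for illustration.

```python
# Plain-Python illustration (assumption: Python 3 semantics, no Theano involved).
a, b = 7, 2

# Use // when an integer (floor) result is wanted.
floor_result = a // b

# Cast one operand to float when a float result is wanted.
float_result = float(a) / b

print(floor_result, float_result)
```
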
New features:
...
...
@@ -67,8 +48,38 @@ Optimizations:
* SetSubtensor(x, x[idx], idx) -> x (when x is a constant)
* subtensor(alloc,...) -> alloc
* Many new scan optimizations (TODO: list them)
* Lower scan execution overhead with a Cython implementation
* Removed scan double compilation (by using the new Op.make_thunk mechanism)
* Pushes out computation from the inner graph to the outer graph. For now, it only pushes out computations whose inputs are strictly non_sequence inputs and constants.
* Merges scan ops that go over the same number of steps (and have the same condition).
* The scan ops must be independent of each other (that is, neither is an input of the other).
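
The push-out and merge optimizations listed above can be pictured with a plain-Python analogy. This is only a sketch of the idea, not Theano's implementation: the function names and the loop bodies are hypothetical, and ordinary loops stand in for scan ops.

```python
# Plain-Python analogy; none of these names are Theano APIs.

def scan_naive(xs, w, b):
    # w * b depends only on non-sequence inputs, yet it is recomputed each step.
    out = []
    for x in xs:
        out.append(x + w * b)
    return out


def scan_pushed_out(xs, w, b):
    # The step-invariant term is computed once, outside the loop.
    wb = w * b
    return [x + wb for x in xs]


def merged_scans(xs, ys):
    # Two independent loops over the same number of steps, run as one loop.
    doubled, squared = [], []
    for x, y in zip(xs, ys):
        doubled.append(2 * x)
        squared.append(y * y)
    return doubled, squared
```

Both `scan_naive` and `scan_pushed_out` return the same values; the second simply avoids repeating work that does not change between steps. The merge is only valid here because neither loop consumes the other's output.
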
GPU:
* PyCUDA/Theano bridge and `documentation <http://deeplearning.net/software/theano/tutorial/pycuda.html>`_.
* New function to easily convert pycuda GPUArray objects to and from CudaNdarray objects
* Fixed a bug that occurred when you created a view of a manually created CudaNdarray that is itself a view of a GPUArray.
* Removed a warning when nvcc is not available and the user did not request it.