* The random number generator in theano/sandbox/rng_mrg.py did not always return the same sequence of numbers on the CPU and GPU.
* In some cases, a small fraction of the returned sequence was garbage,
but the garbage itself looked random, so usages that did not depend heavily on the statistical properties may have been unaffected.
* In Python mode (not the default mode), when the input of an elemwise operation was an empty ndarray, we were not returning an empty ndarray.
* Fixed some segfaults at exit with GPU code.
* Some bugs in Scan:
* Scan was incorrectly caching the number of steps to execute.
This affected you only if you changed the number of steps of an already-compiled Scan op; a constant number of steps was fine.
* others: Razvan?
* In GpuConv, errors in conv_patch_stack_reduce when the entire kernel does not fit into shared memory.
The error went undetected before because its impact was smaller than the relative tolerance of 1e-3 used in the tests; the relative tolerance is now 1e-5.
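The empty-ndarray fix above brings elemwise operations in line with NumPy semantics. A minimal NumPy sketch of the expected behaviour (illustrative only; Theano's Python-mode elemwise now matches it):

```python
import numpy as np

# Elemwise operations on an empty ndarray should return an empty
# ndarray of the matching shape, not fail or return something else.
x = np.zeros((0, 3))          # empty: zero rows, three columns
y = np.exp(x) + 2 * x         # arbitrary elemwise expression

assert y.shape == (0, 3)      # shape is preserved
assert y.size == 0            # still empty
```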
Crash fixed:
* Fixed an exception that made Theano crash when taking the gradient of DimShuffle in some particular cases.
* Fixed a compilation crash for GpuElemwise on tensors with a high number of dimensions (~6 or more).
* Disabled a C code generator that made gcc crash on complex types.
* Crash in optimization when an Op has no input.
* The output shape is now computed correctly for matrix-vector multiplication on the GPU.
* Fixed a crash in Scan when using plain numbers as inputs instead of symbolic variables.
* In GpuSum, fixed a bug in the calculation of n_blocks for the 10 pattern
(summing over the rows of a matrix).
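For reference, the matrix-vector output shape that the GPU path now computes correctly follows the usual rule: an (m, n) matrix times a length-n vector yields a length-m vector. A NumPy sketch of that rule:

```python
import numpy as np

A = np.random.rand(3, 4)      # (m, n) matrix
x = np.random.rand(4)         # length-n vector

y = A.dot(x)                  # matrix-vector product
assert y.shape == (3,)        # result is a length-m vector, not (3, 1)
```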
Optimization:
* New SpecifyShape op that allows passing more shape information in the graph.
* Sped up gemv by working around SciPy's gemv slowness when the matrix is in C order (the default).
* Remove a join of only one element.
* During optimization, consider one more case in get_constant_value.
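The gemv workaround exploits the fact that transposing a C-contiguous matrix yields a Fortran-contiguous view of the same memory at no cost, so the Fortran-order BLAS routine can be called with a transpose flag instead of copying the matrix. A sketch of the idea using SciPy's BLAS bindings (illustrative only; not the exact code Theano generates):

```python
import numpy as np
from scipy.linalg.blas import dgemv

A = np.random.rand(5, 3)          # C order (the NumPy default)
x = np.random.rand(3)

# A.T is a free Fortran-contiguous view of the same memory ...
assert A.flags['C_CONTIGUOUS'] and A.T.flags['F_CONTIGUOUS']

# ... so instead of copying A into Fortran order, call gemv on the
# transposed view with trans=1: (A.T).T @ x == A @ x, no copy needed.
y = dgemv(1.0, A.T, x, trans=1)
assert np.allclose(y, A.dot(x))
```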
GPU:
* cuda_shared.value = X now works inplace!
* cuda_shared_var.set_value(new_ndarray) will overwrite the old value inplace in the most common case.
* Allow creating a CudaNdarraySharedVariable from a CudaNdarray.
* New init_gpu_device Theano flag.
* Fuse GpuElemwise more often (in the case where there are so many inputs that fusing them all would exceed the 256-byte limit on parameters to a GPU function).
* A CPU join of only one element that previously was not moved to the GPU is now moved.
New features:
* Tensor.reshape now makes dimensions of length 1 broadcastable (fixes #434).
* Tensor.prod now implements the gradient
* DebugMode now warns if an Op declared itself as returning a view of the input but did not do so.
* This behaviour was a problem because it can block other Ops from working in-place on the same inputs, lowering memory reuse.
* Sparse.structured_dot now works when both matrices are sparse
* The Sparse type is now supported by the shape op, and the ShapeFeature optimizer works correctly with it.
* New 3D convolution ops, with CPU and GPU implementations.
* New colors in pydotprint.
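In Theano the broadcastable pattern is part of a tensor's type, and the runtime behaviour of the length-1 dimensions that reshape now marks broadcastable mirrors NumPy broadcasting. A small NumPy sketch of why those flags matter:

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # the length-1 axis can broadcast
row = np.arange(4).reshape(1, 4)

out = col + row                    # (3, 1) + (1, 4) broadcasts to (3, 4)
assert out.shape == (3, 4)
```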
Documentation:
* Documented lib.amdlibm and (new) init_gpu_device config variables.
* A new page (written for 0.3, but an error was hiding it on the web page) on the memory aliasing contract of Theano.
* Revision to the Windows installation instructions.
* The cuda documentation is now generated on the web server.
* Better documentation of .theanorc and its sections.
Unit tests:
* Stop usage of deprecated functions or syntax in the unit tests.
* Better testing of GPU convolution nets.
* Make more tests able to use different random seeds.
* Tests of sparse now use default mode, not a hard-coded one.
* Remove some tests of unimplemented features.
Other:
* The name of the compiledir now includes the Python version, to make life easier for people who use several Python versions.
* Added theano.tensor.std as a shortcut to sqrt(var(input=input, axis=axis)).
* Whitespace, tabulation and indentation clean-up in the code.
* Better detection of memory sharing between variables.
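The theano.tensor.std shortcut relies on the standard identity std = sqrt(var), applied along the same axis. In NumPy terms:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

# std is just the square root of the variance along the same axis,
# which is what the theano.tensor.std shortcut computes symbolically.
assert np.isclose(np.std(x), np.sqrt(np.var(x)))
```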
Download
--------
You can download Theano from http://pypi.python.org/pypi/Theano.
Description
-----------
Theano is a Python library that allows you to define, optimize, and
evaluate mathematical expressions involving multi-dimensional arrays efficiently.
Theano 0.3 (2010-11-23)
-----------------------
This is the first major release of Theano since 0.1. Version 0.2 development started internally but it was never advertised as a release.
There have been so many changes since 0.1 that we have lost track of many of them. Below is a *partial* list of changes since 0.1.
* GPU code using NVIDIA's CUDA framework is now generated for many Ops.
* Some interface changes since 0.1:
* A new "shared variable" system to allow reusing memory space between Theano functions.
* A new memory contract has been formally written for Theano, for people who want to minimize memory copies.
* The old module system has been deprecated.
* By default, inputs to a Theano function will not be silently downcasted (e.g. from float64 to float32).
* An error is now raised when using the result of a logical operation on a Theano variable in an 'if' (i.e. an implicit call to __nonzero__).
* An error is now raised when we receive a non-aligned ndarray as input to a function (this is not supported).
* An error is raised when the list of dimensions passed to dimshuffle() contains duplicates or is otherwise not sensible.
* Call NumPy BLAS bindings for gemv operations in addition to the already supported gemm.
* If gcc is unavailable at import time, Theano now falls back to a Python-based emulation mode after raising a warning.
* An error is now raised when tensor.grad is called on a non-scalar Theano variable (in the past we would implicitly do a sum on the tensor to make it a scalar).
* Added support for "erf" and "erfc" functions.
* The current default value of the parameter axis of theano.{max,min,argmax,argmin,max_and_argmax} is deprecated. We now use the default NumPy behavior of operating on the entire tensor.
* Theano is now available from PyPI and installable through "easy_install" or "pip".
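The no-silent-downcast rule above guards against silent precision loss. A small NumPy illustration of what a float64 to float32 downcast discards (illustrative of the hazard, not of Theano's casting code):

```python
import numpy as np

x64 = np.float64(0.1)
x32 = np.float32(x64)      # explicit downcast drops precision

# The downcast value is no longer equal to the original float64,
# which is why Theano now requires the cast to be explicit.
assert np.float64(x32) != x64
```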