Commit e0fd6770, authored by Olivier Delalleau

Copied NEWS.txt into doc dir to update online doc

Parent 76c0c7c0
Theano 0.3 (2010-11-23)
-----------------------
This is the first major release of Theano since 0.1. Version 0.2 development started internally but it was never advertised as a release.
There have been so many changes since 0.1 that we have lost track of many of them. Below is a *partial* list of changes since 0.1.
* GPU code using NVIDIA's CUDA framework is now generated for many Ops.
* Some interface changes since 0.1:
* A new "shared variable" system to allow reusing memory space between Theano functions.
* A new memory contract has been formally written for Theano, for people who want to minimize memory copies.
* The old module system has been deprecated.
* By default, inputs to a Theano function will not be silently downcast (e.g. from float64 to float32).
* An error is now raised when using the result of a logical operation on a Theano variable in an 'if' (i.e. an implicit call to __nonzero__).
* An error is now raised when we receive a non-aligned ndarray as input to a function (this is not supported).
* An error is raised when the list of dimensions passed to dimshuffle() contains duplicates or is otherwise not sensible.
* Call NumPy BLAS bindings for gemv operations in addition to the already supported gemm.
* If gcc is unavailable at import time, Theano now falls back to a Python-based emulation mode after raising a warning.
* An error is now raised when tensor.grad is called on a non-scalar Theano variable (in the past we would implicitly do a sum on the tensor to make it a scalar).
* Added support for "erf" and "erfc" functions.
* The current default value of the parameter axis of theano.{max,min,argmax,argmin,max_and_argmax} is deprecated. We now use the default NumPy behavior of operating on the entire tensor.
* Theano is now available from PyPI and installable through "easy_install" or "pip".

Modifications in the trunk since the last release
Theano 0.4.0 (2011-06-27)
--------------------------------------------------
Change in output memory storage for Ops:
If you implemented custom Ops, with either C or Python implementation,
this will concern you.
The contract for memory storage of Ops has been changed. In particular,
it is no longer guaranteed that output memory buffers are either empty,
or allocated by a previous execution of the same Op.
Right now, here is the situation:
* For Python implementation (perform), what is inside output_storage
may have been allocated from outside the perform() function, for
instance by another node (e.g., Scan) or the Mode. If that was the
case, the memory can be assumed to be C-contiguous (for the moment).
* For C implementations (c_code), nothing has changed yet.
In a future version, the content of the output storage, both for Python and C
versions, will either be NULL, or have the following guarantees:
* It will be a Python object of the appropriate Type (for instance, a
numpy.ndarray for a Tensor variable, or a CudaNdarray for a GPU variable)
* It will have the correct number of dimensions, and correct dtype
However, its shape and memory layout (strides) will not be guaranteed.
When that change is made, the config flag DebugMode.check_preallocated_output
will help you find implementations that are not up-to-date.
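A minimal sketch of a defensive Python implementation under the contract described above. It uses plain NumPy only; the function name `perform_double` and its simplified signature are hypothetical stand-ins for an Op's perform() method, not Theano's actual API. The idea is to reuse whatever is in output_storage only when its shape and dtype match, and to write through the buffer without assuming a particular memory layout.

```python
import numpy as np

def perform_double(inputs, output_storage):
    """Toy stand-in for an Op's perform(): stores 2 * x in output_storage[0][0]."""
    (x,) = inputs
    out = output_storage[0][0]
    # Reuse the existing buffer only if it is an ndarray of the right dtype
    # and shape; shape and strides are exactly what is not guaranteed.
    if (
        not isinstance(out, np.ndarray)
        or out.dtype != x.dtype
        or out.shape != x.shape
    ):
        out = np.empty(x.shape, dtype=x.dtype)
        output_storage[0][0] = out
    # Writing via out= works for any stride pattern, contiguous or not.
    np.multiply(x, 2, out=out)

x = np.arange(6, dtype="float64").reshape(2, 3)

# Case 1: empty storage, a fresh buffer is allocated.
storage = [[None]]
perform_double([x], storage)

# Case 2: preallocated buffer with the wrong shape, it is replaced.
storage_bad = [[np.zeros((5,), dtype="float64")]]
perform_double([x], storage_bad)

# Case 3: correctly shaped but non-contiguous buffer, it is reused in place.
backing = np.zeros((2, 6), dtype="float64")
view = backing[:, ::2]
storage_view = [[view]]
perform_double([x], storage_view)
```

The third case is the one that tends to expose implementations relying on C-contiguous outputs, since the result has to be written through a strided view.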
Deprecation:
* tag.shape attribute deprecated (#633)
* CudaNdarray_new_null is deprecated in favour of CudaNdarray_New
* Dividing integers with / is deprecated: use // for integer division, or
cast one of the integers to a float type if you want a float result (you may
also change this behavior with config.int_division).
* Removed (already deprecated) sandbox/compile module
* Removed (already deprecated) incsubtensor and setsubtensor functions,
inc_subtensor and set_subtensor are to be used instead.
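As a small illustration of the integer division note above, the two explicit spellings look like this. The sketch uses plain Python integers rather than Theano variables, so it only shows the intent of each form, not Theano's own behavior.

```python
a, b = 7, 2

# Explicit integer division with //.
int_result = a // b

# Explicit float result: cast one operand to float before dividing.
float_result = float(a) / b
```

Writing the intended form explicitly keeps the result independent of how `/` is configured.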
Bugs fixed:
* In CudaNdarray.__{iadd,idiv}__, when the operation is not implemented, the error is now returned.
* THEANO_FLAGS='optimizer=None' now works as expected
* Fixed memory leak in error handling on GPU-to-host copy
* Fix relating specifically to Python 2.7 on Mac OS X
* infer_shape can now handle Python longs
* Trying to compute x % y with one or more arguments being complex now
raises an error.
* The output of random samples computed with uniform(..., dtype=...) is
guaranteed to be of the specified dtype instead of potentially being of a
higher-precision dtype.
* The perform() method of DownsampleFactorMax did not give the right result
when reusing output storage. This happened only if you used the Theano flag
'linker=c|py_nogc' or manually specified the mode to be 'c|py_nogc'.
Crash fixed:
* Work around a bug in gcc 4.3.0 that made the compilation of 2D convolution
crash.
* Some optimizations crashed when the "ShapeOpt" optimization was disabled.
Optimization:
* Optimize all cases of a subtensor followed by a subtensor.
GPU:
* Move fused elemwise ops to the GPU even when they contain dtypes other than
float32 (except float64), as long as the inputs and outputs are float32.
* This allows elemwise comparisons to be moved to the GPU if the result is
cast to float32 afterwards.
* Implemented CudaNdarray.ndim to match the ndarray interface.
* Fixed slowdown caused by multiple chained views on CudaNdarray objects
* CudaNdarray_alloc_contiguous changed so as to never try to free
memory on a view: new "base" property
* Safer decref behaviour in CudaNdarray in case of failed allocations
* New GPU implementation of tensor.basic.outer
* Multinomial random variates now available on GPU
New features:
* ProfileMode
* profile the scan overhead
* simple hook system to add profiler
* reordered the output to be in the order of more general to more specific
* DebugMode now checks Ops with different patterns of preallocated memory,
configured by config.DebugMode.check_preallocated_output.
* var[vector of indices] now works (the grad works recursively, the direct
grad works inplace, and it works on the GPU).
* Limitation: works only on the outermost dimension.
* New way to test the graph as it is built, which makes it easy to find the
source of shape mismatch errors:
`http://deeplearning.net/software/theano/tutorial/debug_faq.html#interactive-debugger`__
* cuda.root inferred if nvcc is on the path, otherwise defaults to
/usr/local/cuda
* Better graph printing for graphs involving a scan subgraph
* Casting behavior can be controlled through config.cast_policy,
new (experimental) mode.
* Smarter C module cache, avoiding erroneous usage of the wrong C
implementation when some options change, and avoiding recompiling the
same module multiple times in some situations.
* The "theano-cache clear" command now clears the cache more thoroughly.
* More extensive linear algebra ops (CPU only) that wrap scipy.linalg
now available in the sandbox.
* CUDA devices 4 - 16 should now be available if present.
* infer_shape support for the View op, better infer_shape support in Scan
* infer_shape is supported in all cases of subtensor
* tensor.grad now gives an error by default when computing the gradient
wrt a node that is disconnected from the cost (not in the graph, or
no continuous path from that op to the cost).
* New tensor.isnan and isinf functions.
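The "var[vector of index]" item in the list above can be pictured with the NumPy analogue below. This is only a sketch of the indexing semantics on the first dimension; the array names are illustrative, and the gradient accumulation shown with `np.add.at` is a NumPy illustration of the underlying arithmetic, not a description of how Theano implements it.

```python
import numpy as np

data = np.arange(12, dtype="float64").reshape(4, 3)  # 4 rows, 3 columns
idx = np.array([3, 0, 3])                            # row indices, repeats allowed

# Indexing with a vector selects rows along the outermost (first) dimension.
rows = data[idx]

# Gradient of sum(rows) with respect to data: each selected row receives a
# contribution of ones, and a row selected twice receives it twice.
grad = np.zeros_like(data)
np.add.at(grad, idx, np.ones_like(rows))
```

Here `rows` has shape (3, 3), and `grad` is nonzero only in rows 0 and 3, with row 3 counted twice because it appears twice in `idx`.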
Documentation:
* Better commenting of cuda_ndarray.cu
* Fixes in the scan documentation: add missing declarations/print statements
* Better error message on failed __getitem__
* Updated documentation on profile mode
* Better documentation of testing on Windows
* Better documentation of the 'run_individual_tests' script
Unit tests:
* Stricter float comparison by default
* Reuse the tensor subtensor tests for GPU tensors (more GPU tests)
* Tests that check for aliased function inputs and ensure appropriate copying
(#374)
* Better test of copies in CudaNdarray
* New tests relating to the new base pointer requirements
* Better scripts to run tests individually or in batches
* Some tests are now run whenever cuda is available and not just when it has
been enabled before
* Tests display fewer pointless warnings.
Other:
* Correctly set the broadcast flag to True in the output variable of
a Reshape op when the new shape contains an int 1.
* pydotprint: high contrast mode is now the default, option to print
more compact node names.
* pydotprint: now truncates labels that are too long.
* More compact printing (ignore leading "Composite" in op names)