Copied NEWS.txt into doc dir to update online doc

e0fd6770 · Olivier Delalleau · 76c0c7c0 · e0fd6770
--- a/doc/NEWS.txt
+++ b/doc/NEWS.txt
-Theano 0.3 (2010-11-23)
+Modifications in the trunk since the last release
-----------------------
+Theano 0.4.0 (2011-06-27)
-This is the first major release of Theano since 0.1. Version 0.2 development started internally but it was never advertised as a release.
+--------------------------------------------------
-There have been so many changes since 0.1 that we have lost track of many of them. Below is a *partial* list of changes since 0.1.
+Change in output memory storage for Ops:
+ If you implemented custom Ops, with either C or Python implementation,
- * GPU code using NVIDIA's CUDA framework is now generated for many Ops.
+ this will concern you.
- * Some interface changes since 0.1:
-     * A new "shared variable" system to allow reusing memory space between Theano functions.
+ The contract for memory storage of Ops has been changed. In particular,
-         * A new memory contract has been formally written for Theano, for people who want to minimize memory copies.
+ it is no longer guaranteed that output memory buffers are either empty,
-     * The old module system has been deprecated.
+ or allocated by a previous execution of the same Op.
-     * By default, inputs to a Theano function will not be silently downcasted (e.g. from float64 to float32).
-     * An error is now raised when using the result of logical operation on Theano variable in an 'if' (i.e. an implicit call to __nonzeros__).
+ Right now, here is the situation:
-     * An error is now raised when we receive a non-aligned ndarray as input to a function (this is not supported).
+ * For Python implementations (perform), what is inside output_storage
-     * An error is raised when the list of dimensions passed to dimshuffle() contains duplicates or is otherwise not sensible.
+   may have been allocated from outside the perform() function, for
-     * Call NumPy BLAS bindings for gemv operations in addition to the already supported gemm.
+   instance by another node (e.g., Scan) or the Mode. If that was the
-     * If gcc is unavailable at import time, Theano now falls back to a Python-based emulation mode after raising a warning.
+   case, the memory can be assumed to be C-contiguous (for the moment).
-     * An error is now raised when tensor.grad is called on a non-scalar Theano variable (in the past we would implicitly do a sum on the tensor to make it a scalar).
+ * For C implementations (c_code), nothing has changed yet.
-     * Added support for "erf" and "erfc" functions.
- * The current default value of the parameter axis of theano.{max,min,argmax,argmin,max_and_argmax} is deprecated. We now use the default NumPy behavior of operating on the entire tensor.
+ In a future version, the content of the output storage, both for Python and C
- * Theano is now available from PyPI and installable through "easy_install" or "pip".
+ versions, will either be NULL, or have the following guarantees:
+ * It will be a Python object of the appropriate Type (for instance, a
+   numpy.ndarray for a Tensor variable, or a CudaNdarray for a GPU variable).
+ * It will have the correct number of dimensions and the correct dtype.
+ However, its shape and memory layout (strides) will not be guaranteed.
+ When that change is made, the config flag DebugMode.check_preallocated_output
+ will help you find implementations that are not up-to-date.
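The future output-storage contract described above can be sketched as follows. This is a minimal illustration, not Theano code: perform_double is a hypothetical stand-alone function that only mimics the (inputs, output_storage) part of a perform() signature, using NumPy. It reuses a preallocated buffer only when its shape and dtype match, and it writes through `out=` so that an unusual memory layout (strides) does not matter.

```python
import numpy as np


def perform_double(inputs, output_storage):
    """Hypothetical perform()-style function computing 2 * x."""
    (x,) = inputs
    (out,) = output_storage  # one-element list holding the output buffer
    buf = out[0]
    # The buffer may be None or may have been allocated elsewhere, so its
    # type, shape and dtype are checked before reuse instead of being trusted.
    if (
        not isinstance(buf, np.ndarray)
        or buf.shape != x.shape
        or buf.dtype != x.dtype
    ):
        buf = np.empty(x.shape, dtype=x.dtype)
        out[0] = buf
    # Writing with out= works for any strides, contiguous or not.
    np.multiply(x, 2, out=buf)
```

A buffer of the right shape is written in place, while a missing or mismatched one is replaced by a fresh allocation.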
+Deprecation:
+ * tag.shape attribute deprecated (#633)
+ * CudaNdarray_new_null is deprecated in favour of CudaNdarray_New
+ * Dividing integers with / is deprecated: use // for integer division, or
+   cast one of the integers to a float type if you want a float result (you may
+   also change this behavior with config.int_division).
+ * Removed (already deprecated) sandbox/compile module
+ * Removed (already deprecated) incsubtensor and setsubtensor functions,
+   inc_subtensor and set_subtensor are to be used instead.
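The integer-division item above asks for an explicit choice between the two possible results. A small sketch in plain Python, where ordinary integers stand in for Theano integer variables (an assumption made so the example runs without Theano; the helper names are hypothetical):

```python
def floor_divide(a, b):
    """Integer result: use // explicitly."""
    return a // b


def true_divide(a, b):
    """Float result: cast one operand to float before dividing."""
    return float(a) / b


print(floor_divide(7, 2))  # 3
print(true_divide(7, 2))   # 3.5
```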
+Bugs fixed:
+ * CudaNdarray.__{iadd,idiv}__ now return the error when the operation is not
+   implemented.
+ * THEANO_FLAGS='optimizer=None' now works as expected
+ * Fixed memory leak in error handling on GPU-to-host copy
+ * Fix relating specifically to Python 2.7 on Mac OS X
+ * infer_shape can now handle Python longs
+ * Trying to compute x % y with one or more arguments being complex now
+   raises an error.
+ * The output of random samples computed with uniform(..., dtype=...) is
+   guaranteed to be of the specified dtype instead of potentially being of a
+   higher-precision dtype.
+ * The perform() method of DownsampleFactorMax did not give the right result
+   when reusing output storage. This happened only if you used the Theano flag
+   'linker=c|py_nogc' or manually specified the mode to be 'c|py_nogc'.
+Crash fixed:
+ * Work around a bug in gcc 4.3.0 that made the compilation of 2d convolution
+   crash.
+ * Some optimizations crashed when the "ShapeOpt" optimization was disabled.
+Optimization:
+ * Optimize all cases of a subtensor followed by another subtensor.
+GPU:
+ * Fused elemwise ops that contain dtypes other than float32 (except float64)
+   are now moved to the GPU if their inputs and outputs are float32.
+   * This allows elemwise comparisons to be moved to the GPU if their result
+     is cast to float32 afterwards.
+ * Implemented CudaNdarray.ndim to have the same interface as ndarray.
+ * Fixed slowdown caused by multiple chained views on CudaNdarray objects
+ * CudaNdarray_alloc_contiguous changed so as to never try to free
+   memory on a view: new "base" property
+ * Safer decref behaviour in CudaNdarray in case of failed allocations
+ * New GPU implementation of tensor.basic.outer
+ * Multinomial random variates now available on GPU
+New features:
+ * ProfileMode
+    * profile the scan overhead
+    * simple hook system to add profiler
+    * reordered the output to be in the order of more general to more specific
+ * DebugMode now checks Ops with different patterns of preallocated memory,
+   configured by config.DebugMode.check_preallocated_output.
+ * var[vector of index] now works (the grad works recursively, the direct grad
+   works inplace, and it works on the GPU)
+    * limitation: works only on the outermost dimensions.
+ * New way to test the graph as we build it. This makes it easy to find the
+   source of shape mismatch errors:
+   `http://deeplearning.net/software/theano/tutorial/debug_faq.html#interactive-debugger`__
+ * cuda.root inferred if nvcc is on the path, otherwise defaults to
+   /usr/local/cuda
+ * Better graph printing for graphs involving a scan subgraph
+ * Casting behavior can be controlled through config.cast_policy,
+   a new (experimental) mode.
+ * Smarter C module cache, avoiding erroneous usage of the wrong C
+   implementation when some options change, and avoiding recompiling the
+   same module multiple times in some situations.
+ * The "theano-cache clear" command now clears the cache more thoroughly.
+ * More extensive linear algebra ops (CPU only) that wrap scipy.linalg
+   now available in the sandbox.
+ * CUDA devices 4 - 16 should now be available if present.
+ * infer_shape support for the View op, better infer_shape support in Scan
+ * infer_shape supported in all cases of subtensor
+ * tensor.grad now gives an error by default when computing the gradient
+   wrt a node that is disconnected from the cost (not in the graph, or
+   no continuous path from that op to the cost).
+ * New tensor.isnan and isinf functions.
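The "var[vector of index]" feature listed above is modeled on NumPy-style indexing. A NumPy-only sketch of selecting entries along the outermost dimension with a vector of indices (NumPy arrays stand in for Theano variables, which is an assumption made for illustration):

```python
import numpy as np

x = np.arange(12).reshape(4, 3)  # 4 rows, 3 columns
idx = np.array([0, 2, 2])        # vector of row indices; repeats are allowed

rows = x[idx]                    # picks rows 0, 2 and 2, giving shape (3, 3)
```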
+Documentation:
+ * Better commenting of cuda_ndarray.cu
+ * Fixes in the scan documentation: add missing declarations/print statements
+ * Better error message on failed __getitem__
+ * Updated documentation on profile mode
+ * Better documentation of testing on Windows
+ * Better documentation of the 'run_individual_tests' script
+Unit tests:
+ * Stricter float comparison by default
+ * Reuse the subtensor tests for tensors on GPU tensors (more GPU tests)
+ * Tests that check for aliased function inputs and ensure appropriate copying
+   (#374)
+ * Better test of copies in CudaNdarray
+ * New tests relating to the new base pointer requirements
+ * Better scripts to run tests individually or in batches
+ * Some tests are now run whenever cuda is available and not just when it has
+   been enabled before
+ * Tests display fewer pointless warnings.
+Other:
+ * Correctly set the broadcast flag to True in the output var of
+   a Reshape op when we receive an int 1 in the new shape.
+ * pydotprint: high contrast mode is now the default, option to print
+   more compact node names.
+ * pydotprint: Now truncates labels that are too long.
+ * More compact printing (ignore leading "Composite" in op names)