Commit 14fcbe82 authored by Frédéric Bastien

Merge pull request #1464 from delallea/minor

Minor fixes
......@@ -14,97 +14,97 @@ Theano Development version
Highlights:
* Python 3.3 compatibility with buildbot.
* Full advanced indexing support.
* Better Windows 64 bits support.
* Better Windows 64 bit support.
* New profiler.
* Better error message that help debugging.
* Better error messages that help debugging.
Installation:
* Canopy support (direct link to MKL):
* On Linux and Mac OSX (Frédéric B., Robert Kern)
* On Windows (Edward Shi, Frédéric B.)
* Anaconda instruction (Pascal L., Frederic B.)
* Anaconda instructions (Pascal L., Frederic B.)
* Doc Ubuntu 13.04 (Frederic B.)
Commiters for this rc3 only:
Committers for this rc3 only:
Bug fix:
Bug fixes:
* Fix wrong result of GpuDownsampleFactorMaxGrad on Mac OSX. (Pascal L.)
* Auto-Detect and work around a bug in BLAS on MacOS X (Pascal L.)
* Work around bug in MacOS X. If 2 compiled module had the same name, the os or python was not always the right one event when we use the right handle to it. (Pascal L.)
* Work around bug in MacOS X. If 2 compiled modules had the same name, the OS or Python was not always the right one even when we used the right handle to it. (Pascal L.)
Use this hash in the Python module, and in %(nodename)s, so that different helper functions in the support code for different Ops will always have different names.
* Fix infinit loop related to Scan on the GPU. (Pascal L.)
* Fix ConstructSparseFromList.infer_shape, (Pascal L., reported by Rami Al-Rfou')
* Fix infinite loop related to Scan on the GPU. (Pascal L.)
* Fix ConstructSparseFromList.infer_shape (Pascal L., reported by Rami Al-Rfou')
* (introduced in the development version after 0.6rc3 release) (Frederic B.)
Reduction that upcast the input on no axis (ex: call theano.sum() on a scalar when the original dtype isn't float64 or [u]int64). It produce bad results as we don't upcast the inputs in the code, we just copy it.
* Fix some case of theano.clone() with we git it replacement of x that is a function of x. (Razvan P., reported by Akio Takano)
Reduction that upcasts the input on no axis (ex: call theano.sum() on a scalar when the original dtype isn't float64 or [u]int64). It produced bad results as we don't upcast the inputs in the code, we just copy them.
* Fix some cases of theano.clone() when we get a replacement of x that is a function of x. (Razvan P., reported by Akio Takano)
New Features:
* Python 3.3 compatible (abalkin, Gabe Schwartz, Frederic B.)
* A new profiler (Frederic B.)
The new profiler now can profile the memory with the Theano flag profile_memory=True.
The ProfileMode now can't profile memory anymore and print a message about it.
The ProfileMode now can't profile memory anymore and prints a message about it.
Now we raise an error if we try to profile when the gpu is enabled if we didn't set
correctly the env variable to force the driver to sync the kernel launch.
Otherwise the profile information are useless.
The new profiler support the enabling/disable of the garbage collection.
The new profiler supports the enabling/disabling of the garbage collection.
* Adds tensor.tri, tensor.triu, and tensor.tril functions that wrap Numpy equivalents (Jeremiah Lowin)
* Adds tensor.nonzero, tensor.flatnonzero functions that wrap Numpy equivalents (Jeremiah Lowin)
* Adds tensor.nonzero_values to get around lack of advanced indexing for nonzero elements (Jeremiah Lowin)
* Make {inc,set}_subtensor work on output of take. (Pascal L.)
* When device=cpu and force_device=True, force that we disable the gpu. (Frederic B.)
* Better Windows 64 bits support for indexing/reshaping (Pascal L.)
* Better Windows 64 bit support for indexing/reshaping (Pascal L.)
* Full advanced indexing support (John Salvatier, seberg)
* Add theano.tensor.stacklist(). Recursivly stack lists of tensors to maintain similar structure (Matthew R.)
* Add Theano flag value: on_opt_error=pdb (Olivier D.)
* GpuSoftmax[WithBias] for bigger row. (Frederic B.)
* Make Erfinv work on the GPU (Guillaume Desjardin, Pascal L.)
* Add "theano-cache basecompiledir purge" (Pascal L.)
This purge the all the compiledir that are in the base compiledir.
* A_tensor_variable.zeros_like() now support the dtype parameter (Pascal L.)
This purges all the compiledirs that are in the base compiledir.
* A_tensor_variable.zeros_like() now supports the dtype parameter (Pascal L.)
* More stable reduce operations by default (Pascal L.)
Add an accumulator dtype to CAReduceDtype (acc_dtype)
by default, acc_dtype is float64 for float32 inputs
then, cast to specified output dtype (float32 for float32 inputs)
by default, acc_dtype is float64 for float32 inputs,
then cast to specified output dtype (float32 for float32 inputs)
* Test default blas flag before using it (Pascal L.)
This make it work correctly by default if no blas library is installed.
* Add cuda.unuse() to help test that need to enable/disable the GPU (Fred)
* Add theano.tensor.nnet.ultra_fast_sigmoid and the opt(disabled by default) local_ultra_fast_sigmoid. (Frederic B.)
* Add theano.tensor.nnet.hard_sigmoid and the opt(disabled by default) local_hard_sigmoid. (Frederic B.)
This makes it work correctly by default if no blas library is installed.
* Add cuda.unuse() to help tests that need to enable/disable the GPU (Fred)
* Add theano.tensor.nnet.ultra_fast_sigmoid and the opt (disabled by default) local_ultra_fast_sigmoid. (Frederic B.)
* Add theano.tensor.nnet.hard_sigmoid and the opt (disabled by default) local_hard_sigmoid. (Frederic B.)
* Add class theano.compat.python2x.Counter() (Mehdi Mirza)
* Allow a_cuda_ndarray += another_cuda_ndarray for 6d tensor. (Frederic B.)
* Make the op ExtractDiag work on the GPU. (Frederic B.)
* New op theano.tensor.chi2sf (Ethan Buchman) TODO ??? LICENSES????
* Lift Flatten/Reshape toward input on unary elemwise. (Frederic B.)
This make the "log(1-sigmoid) -> softplus" stability optimization being applied with a flatten/reshape in the middle.
This makes the "log(1-sigmoid) -> softplus" stability optimization being applied with a flatten/reshape in the middle.
* Make MonitorMode use the default optimizers config and allow it to change used optimizers (Frederic B.)
* Add support for ScalarOp.c_support_code in GpuElemwise. (Frederic B.)
* Also make the Psi function run on GPU. (Frederic B.)
* Make tensor.outer(x,y) work when ndim != 1 as numpy.outer.
* Kron op: Speed up/generalize/GPU friendly. (Frederic B.)
(It is not an op anymore, but reuse current op)
(It is not an op anymore, but reuses current op)
* Add gpu max for pattern (0, 1) and added all gpu max pattern for gpu min. (Frederic B.)
* Add GpuEye (Frederic B.)
* Make GpuCrossentropySoftmaxArgmax1HotWithBias and GpuCrossentropySoftmax1HotWithBiasDx work for bigger inputs (Frederic B., reported by Ryan Price)
* Finish and move out of sandbox theano.sparse.basic.true_dot (Nicolas Bouchard, Frederic B.)
And document all sparse dot variant.
And document all sparse dot variants.
Interface Deprecation (a warning is printed):
* The mode ProfileMode is now deprecated, use the Theano flag profile=True to remplace it.
* The mode ProfileMode is now deprecated, use the Theano flag profile=True to replace it.
* New theano.sparse_grad() interface to get the sparse grad of a_tensor[an_int_vector]. (Frederic B.)
This can speed up the sparse computation when a small fraction of a_tensor is taken.
This can speed up the sparse computations when a small fraction of a_tensor is taken.
Deprecate the old interface for this. (Frederic B.)
Interface Change:
Interface Changes:
* Add -m32 or -m64 in the module cache key and add the python bitwidth in the compiledir path. (Pascal L.)
* mrg.normal now have the parameter size mandatory. It was crashing with the default value of None. (Olivier D.)
* Remove the deprecated passing of multiple mode to theano function. (Frederic B.)
* mrg.normal now has the parameter size mandatory. It was crashing with the default value of None. (Olivier D.)
* Remove the deprecated passing of multiple modes to theano function. (Frederic B.)
New Interface (reuse existing functionality):
New Interface (reuses existing functionality):
* Add hostname as a var in compiledir_format (Frederic B.)
New debug feature:
New debug features:
Speed-ups:
* Faster GpuAdvancedIncSubtensor1 on Fermi GPU (and up) on matrix. (Vivek Kulkarni)
* Faster GPUAdvancedIncSubtensor1 in some cases on all GPU (Vivek Kulkarni)
......@@ -114,19 +114,19 @@ Speed-ups:
* Add MakeVector.c_code (Fred)
* Add Shape.c_code (Fred)
* Optimize Elemwise when all the inputs are fortran (Frederic B.)
We now generate an fortran output and use vectorisable code.
We now generate a fortran output and use vectorisable code.
* Add ScalarOp.c_code_contiguous interface and do a default version. (Frederic B.)
This could optimize elemwise by helping the compiler generate SIMD instruction.
* Use ScalarOp.c_code_contiguous with amdlibm. (Frederic B.)
This speed up exp, pow, sin, cos, log, log2, log10 and sigmoid when the input is contiguous in memory.
* A fix that remove an local_setsubtensor_of_allocs optimization warning and enable it in that case. (Frederic B., reported by John Salvatier)
This speeds up exp, pow, sin, cos, log, log2, log10 and sigmoid when the input is contiguous in memory.
* A fix that removes a local_setsubtensor_of_allocs optimization warning and enables it in that case. (Frederic B., reported by John Salvatier)
* Make inv_as_solve optimization work (Matthew Rocklin)
Crash fixes:
* AdvancedSubtensor1: allow broadcasted index vector. (Frederic B., reported by Jeremiah Lowin)
* Fix compute_test_value for ifelse (Olivier D., reported by Bitton Tenessi)
* Fix import error with some version of NumPy (Olivier D.)
* Fix import error with some versions of NumPy (Olivier D.)
* Fix Scan grad exception (Razvan P., reported by Nicolas BL)
* Fix compute_test_value for a non_sequence when calling the gradient of Scan (Pascal L., reported by Bitton Tenessi).
* Crash fix in Scan following interface change in 0.6rc2 (Razvan P.)
......@@ -144,7 +144,7 @@ Crash fixes:
* Crash fix in the grad of GPU op in corner case (Pascal L.)
* Crash fix on MacOS X (Robert Kern)
* theano.misc.gnumpy_utils.garray_to_cudandarray() set strides correctly for dimensions of 1. (Frederic B., reported by Justin Bayer)
* Fix crash during optimization with consecutive sum and some combination of axis (Frederic B., reported by Çağlar Gülçehre)
* Fix crash during optimization with consecutive sums and some combination of axis (Frederic B., reported by Çağlar Gülçehre)
* Fix crash with keepdims and negative axis (Frederic B., reported by David W.-F.)
* Fix crash of theano.[sparse.]dot(x,y) when x or y is a vector. (Frederic B., reported by Zsolt Bitvai)
* Fix opt crash/disabled with ifelse on the gpu (Frederic B, reported by Ryan Price)
......@@ -152,23 +152,23 @@ Crash fixes:
Others:
* Theano flags are now evaluated lazyly, only if requeted (Frederic B.)
* Theano flags are now evaluated lazily, only if requested (Frederic B.)
* Fix test when g++ is not avail (Frederic B.)
* Typo/pep8 (Olivier D., Frederic B.)
* Update doc (Ben McCann)
* Doc compatibility guide (abalkin)
* Doc the MPI and load op (Frederic B.)
* Add manual instruction for OpenBLAS on Ubuntu by (Jianri Li )
* Doc the MPI and load Ops (Frederic B.)
* Add manual instructions for OpenBLAS on Ubuntu by (Jianri Li )
* Doc fixes (Yaroslav Halchenko)
* Better/more error message (Frederic B., Pascal L., Ian Goodfellow)
* Better/more error messages (Frederic B., Pascal L., Ian Goodfellow)
* More doc (Frederic B.)
* Fix Error reporting with GpuConv (Frederic B., reported by Heng Luo and Nicolas Pinto)
* Update BLAS compilation doc on windows to use OpenBLAS (Olivier D.)
* The infer_shape tester method now warn if the shapes values could hide errors. (Frederic B.)
* Now travis-ci test with scipy the part that need it (Frederic B.)
* Export some function that work on CudaNdarray for windows (Frederic B.)
* If the user specify an -arch=sm_* value in the Theano flags for the gpu, don't add one (Frederic B., Pascal L.)
* If a c thunk return an error, check if a python exception is set. Otherwise, set a default one (Pascal L.)
* The infer_shape tester method now warns if the shapes values could hide errors. (Frederic B.)
* Now travis-ci tests with scipy the parts that need it (Frederic B.)
* Export some functions that work on CudaNdarray for windows (Frederic B.)
* If the user specifies a -arch=sm_* value in the Theano flags for the gpu, don't add one (Frederic B., Pascal L.)
* If a C thunk returns an error, check if a python exception is set. Otherwise, set a default one (Pascal L.)
* Crash fix introduced in the development version (Wei LI)
* Added BLAS benchmark result (Frederic B., Ben McCann)
* Fix code comment (Hannes Schulz)
......@@ -177,11 +177,11 @@ Others:
* Better error message with compute_test_value (Frederic, reported by John Salvatier)
* Stochastic order behavior fix (Frederic B.)
* Simpler initial graph for subtensor infer shape (Olivier D.)
The optimization was doing the optimization, but this allow better reading of the graph before optimization.
* Better detectiont of non-aligned ndarray (Frederic B.)
* Updae MRG multinomial gradient to the new interface (Mehdi Mirza)
The optimization was doing the optimization, but this allows better reading of the graph before optimization.
* Better detection of non-aligned ndarray (Frederic B.)
* Update MRG multinomial gradient to the new interface (Mehdi Mirza)
* Implement Image2Neibs.perform() to help debug (Frederic B.)
* Remove Theano flags from the compilation key (Frederic B.)
* Remove some Theano flags from the compilation key (Frederic B.)
* Make theano-nose work on executable *.py files. (Alistair Muldal)
* Make theano-nose work with older nose version (Frederic B.)
* Add extra debug info in verify_grad() (Frederic B.)
......
......@@ -850,7 +850,7 @@ You can then proceed to the :ref:`windows_basic` or the :ref:`windows_bleeding_e
Alternative: Canopy
###################
Another software from Enthought that install all Theano dependancy.
Another software from Enthought that installs all Theano dependencies.
If you are affiliated with a university (as student or employee), you
can download the installation for free.
......@@ -863,8 +863,8 @@ can download the installation for free.
- In Canopy Package Manager, search and install packages "mingw 4.5.2" and "libpython 1.2"
- (Needed only for Theano 0.6rc3 or earlier)
The "libpython 1.2" package installs files `libpython27.a` and `libmsvcr90.a` to
`C:\Users\<USER>\AppData\Local\Enthought\Canopy\User\libs`. Copy the two files to
`C:\Users\<USER>\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.0.0.1160.win-x86_64\libs`.
`C:\\Users\\<USER>\\AppData\\Local\\Enthought\\Canopy\\User\\libs`. Copy the two files to
`C:\\Users\\<USER>\\AppData\\Local\\Enthought\\Canopy\\App\\appdata\\canopy-1.0.0.1160.win-x86_64\libs`.
- (Needed only for Theano 0.6rc3 or earlier) Set the Theano flags
``blas.ldflags=-LC:\Users\<USER>\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.0.0.1160.win-x86_64\Scripts -lmk2_core -lmk2_intel_thread -lmk2_rt``.
......
......@@ -288,12 +288,12 @@ can be achieved as follows:
To help understand what is happening in your graph, you can
disable the ``local_elemwise_fusion`` and all ``inplace``
optimizations. The first is a speed optimization that merge elemwise
operations together. This make it harder to know which particular
elemwise cause the problem. The second optimization make some ops
output overwrite its input. So, if an op create a bad output, you
won't be able see the input that was overwriten in the ``post_fun``
function. To disable those optimization (with a Theano version after
optimizations. The first is a speed optimization that merges elemwise
operations together. This makes it harder to know which particular
elemwise causes the problem. The second optimization makes some ops'
outputs overwrite their inputs. So, if an op creates a bad output, you
will not be able to see the input that was overwriten in the ``post_func``
function. To disable those optimizations (with a Theano version after
0.6rc3), define the MonitorMode like this:
.. code-block:: python
......@@ -311,7 +311,7 @@ function. To disable those optimization (with a Theano version after
mode with MonitorMode, as you need to define what you monitor.
To be sure all inputs of the node are available during the call to
``post_func``, you also must disable the garbage collector. Otherwise,
``post_func``, you must also disable the garbage collector. Otherwise,
the execution of the node can garbage collect its inputs that aren't
needed anymore by the Theano function. This can be done with the Theano
flag:
......
......@@ -37,14 +37,14 @@ class MonitorMode(Mode):
:param optimizer: The optimizer to use. One may use for instance
'fast_compile' to skip optimizations.
:param linker: DO NOT USE. This mode use its own linker.
:param linker: DO NOT USE. This mode uses its own linker.
The parameter is needed to allow selecting optimizers to use.
"""
self.pre_func = pre_func
self.post_func = post_func
wrap_linker = theano.gof.WrapLinkerMany([theano.gof.OpWiseCLinker()],
[self.eval])
if optimizer is 'default':
if optimizer == 'default':
optimizer = theano.config.optimizer
if (linker is not None and
not isinstance(linker.mode, MonitorMode)):
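
A note on the `is` to `==` change in this hunk: `is` tests object identity, while `==` tests value equality, and two equal strings are not guaranteed to be the same object. The sketch below illustrates the difference outside Theano. `resolve_optimizer` and its `configured` argument are hypothetical names used only for illustration; they are not part of the Theano API.

```python
def resolve_optimizer(optimizer, configured="fast_run"):
    """Hypothetical helper: map the sentinel 'default' to a configured name."""
    # Compare by value. An identity test (`is`) only succeeds when both
    # operands are the very same object, which equal strings need not be.
    if optimizer == "default":
        return configured
    return optimizer


# Build the string at run time so it need not share the literal's object.
built = "".join(["def", "ault"])
print(resolve_optimizer(built))           # value comparison still matches
print(resolve_optimizer("fast_compile"))  # any other name passes through
```

With an identity test, whether the first call is recognized as the default would depend on how the interpreter happens to allocate strings, which is why the value comparison is the safer choice here.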
......
......@@ -442,7 +442,8 @@ def pfunc(params, outputs=None, mode=None, updates=None, givens=None,
if len(updates) > 0 and any(isinstance(v, Variable)
for v in iter_over_pairs(updates)):
raise ValueError(
"The updates parameter must an OrderedDict/dict or a list of list/tuple with 2 elements")
"The updates parameter must be an OrderedDict/dict or a list of "
"lists/tuples with 2 elements")
# transform params into theano.compile.In objects.
inputs = [_pfunc_param_to_in(p, allow_downcast=allow_input_downcast)
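
The error message in this hunk names two accepted shapes for `updates`: an OrderedDict/dict, or a list of 2-element lists/tuples. As a rough illustration only, the two forms could be checked and normalized as below. This is not Theano's `iter_over_pairs`; `normalize_updates` is a hypothetical helper, and plain strings stand in for shared variables and update expressions.

```python
from collections import OrderedDict


def normalize_updates(updates):
    """Hypothetical helper: return a list of (variable, new_value) pairs."""
    # dict and OrderedDict both expose their pairs through items().
    if isinstance(updates, dict):
        return list(updates.items())
    pairs = []
    for item in updates:
        # Every entry of the list form must be a 2-element list or tuple.
        if not isinstance(item, (list, tuple)) or len(item) != 2:
            raise ValueError(
                "The updates parameter must be an OrderedDict/dict or a "
                "list of lists/tuples with 2 elements")
        pairs.append(tuple(item))
    return pairs


print(normalize_updates(OrderedDict([("w", "w - lr * g")])))
print(normalize_updates([["w", "w - lr * g"], ("b", "b - lr * h")]))
```

A flat list such as `["w", "w - lr * g"]` is rejected by this sketch, which is the kind of mistake the reworded message is meant to describe.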
......
......@@ -1457,7 +1457,7 @@ def std_lib_dirs_and_libs():
# directories.
python_lib_dirs = [os.path.join(os.path.dirname(python_inc), 'libs')]
if "Canopy" in python_lib_dirs[0]:
# Canopy store libpython27.a and libmsccr90.a in this directory.
# Canopy stores libpython27.a and libmsccr90.a in this directory.
# For some reason, these files are needed when compiling Python
# modules, even when libpython27.lib and python27.dll are
# available, and the *.a files have to be found earlier than
......@@ -1467,7 +1467,7 @@ def std_lib_dirs_and_libs():
for f, lib in [('libpython27.a', 'libpython 1.2'),
('libmsvcr90.a', 'mingw 4.5.2')]:
if not os.path.exists(os.path.join(libdir, f)):
print ("Your python version is from Canopy. " +
print ("Your Python version is from Canopy. " +
"You need to install the package '" + lib +
"' from Canopy package manager."
)
......