Commit ea3d6101 authored by Rami Al-Rfou

Merge branch 'master' of https://github.com/Theano/Theano into grad_advinc_subtensor

.. _NEWS:

=============
Release Notes
=============

Theano 0.6rc2 (November 21st, 2012)
===================================
Highlights:
* Fixes for a few regressions introduced in 0.6rc1.
* A few new features.
* Speed-ups.
* Scan fixes.
* Crash fixes.
* A few small interface changes.
Committers for this rc2 only:
Razvan Pascanu
Pascal Lamblin
Frederic Bastien
Ian Goodfellow
Jeremiah Lowin
Caglar Gulcehre
Jey Kottalam
Matthew Rocklin
abalkin
Regressions in 0.6rc1 fixed:
* Fixed the scan gradient dtype issue. In 0.6rc1, some upcasts were inserted. (Razvan P.)
* grad() now behaves as it did before 0.6rc1 for floats, i.e. the gradient dtype inside the graph will be the same as the input dtype. If you ask for the gradient directly, it will return the computed dtype. (Pascal L.)
Wrong results fixed:
* Scan in some cases did not return the correct results. (Razvan P., reported by Jeremiah L.)
  This happened when a state had only negative taps and its output was a function of some sequence.
  With multiple states, there was no problem.
* Fixed a bug in Scan with multiple outputs,
  where one output would sometimes overwrite another one. (Razvan P.)
* Clip.grad treated the gradient with respect to the clipping boundary as always 0. (Ian G.)
Interface changes:
* Unaligned ndarrays are no longer supported in Python code. (Frederic B.)
  They were never supported in C code, and supporting them in Python code
  made the detection harder.
* We now officially support only scipy >= 0.7.2 and numpy >= 1.5.0. (Frederic B.)
  We were not, and are not, testing with older versions.
* theano.sparse.SparseType is available even when scipy is not. (Frederic B.)
* Fixed an issue where members of the consider_constant grad parameter
  were treated differently from Constant variables. (Ian G.)
* Removed the g_cost parameter of theano.grad(). (Ian G.)
  Use the new, more powerful known_grads parameter instead.
NumPy interface support:
* theano.tensor.where is an alias for theano.tensor.switch, to support NumPy semantics. (Ian G.)
* TensorVariable objects now have dot, argmin, argmax, clip, conj, repeat, trace, std, round,
  ravel and argsort methods and the real and imag properties, like numpy.ndarray objects.
  The functionality was already available in Theano. (abalkin)
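As a sketch of the element-wise selection semantics that ``theano.tensor.where``/``switch`` mirror, illustrated here with NumPy itself (the alias simply forwards to ``switch``; variable names are illustrative only):

```python
import numpy as np

# numpy.where(cond, a, b) selects from `a` where cond is True, else from `b`.
# theano.tensor.where builds the symbolic equivalent of this element-wise
# selection, since it is an alias for theano.tensor.switch.
cond = np.array([True, False, True])
a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

result = np.where(cond, a, b)
print(result)  # [ 1 20  3]
```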
Speed-ups:
* A C version of the SoftMax op. (Razvan P.)
  There was already C code for the softmax-with-bias op.
* Faster GpuIncSubtensor. (Ian G.)
* Faster copies on the GPU for 4d tensors. (Ian G.)
* The fix of flatten's infer_shape re-enables an optimization. (Pascal L.)
  The bug was introduced in 0.6rc1.
* Enabled inc_subtensor on the GPU when updating it with a float64 dtype. (Ian G.)
  It was previously causing an optimization warning.
* Made DeepCopy reuse preallocated memory. (Frederic B.)
* Move the convolution to the GPU when the image shape and logical image shape differ. (Frederic Bastien)
* C code for the View op. (Razvan P., Pascal L.)
New features:
* Added a monitoring mode "MonitorMode" as a debugging tool. (Olivier D.)
* Allow integer axes when keepdims==True. (Jeremiah Lowin)
* Added erfinv and erfcinv ops. (Jey Kottalam)
* Added tensor.batched_dot(). (Caglar Gulcehre)
  It uses Scan behind the scenes, but makes doing this easier.
* theano.get_constant_value(x). (Frederic B.)
  This tries to obtain x as a constant int.
  It does some constant folding to try to convert x into an int.
  Used by some optimizations.
* Added theano.tensor.io.{MPIRecv,MPIRecvWait,MPISend,MPISendWait}. (Matthew Rocklin)
  Theano does not use them automatically. It is up to you to use them and split your computation.
* Added theano.sandbox.linalg.eig. (abalkin)
* Started some support for Python 3. (abalkin)
  setup.py now supports Python 3; it calls 2to3 during the setup.
  Python 3 is not fully supported, as we did not update the C code.
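The contraction that ``tensor.batched_dot`` performs can be sketched with NumPy (a sketch of the semantics only, not of the Scan-based implementation; shapes and names are illustrative):

```python
import numpy as np

# batched_dot(A, B) computes, for each index i along the first (batch) axis,
# the matrix product A[i] @ B[i].
A = np.random.rand(4, 2, 3)   # batch of 4 matrices, each 2x3
B = np.random.rand(4, 3, 5)   # batch of 4 matrices, each 3x5

out = np.einsum('bij,bjk->bik', A, B)   # shape (4, 2, 5)

# Equivalent explicit loop over the batch axis (what a Scan-style
# implementation iterates over):
ref = np.stack([A[i].dot(B[i]) for i in range(4)])
assert np.allclose(out, ref)
```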
Crash fixes:
* Fixed a crash related to scan.grad due to the new mechanism. (Ian G.)
* Fixed an optimization warning. Now it gets optimized. (Frederic B.)
* Fixed a crash introduced in 0.6rc1 in theano.grad. (Ian G.)
* Fixed a crash introduced in 0.6rc1 in the grad of scan. (Razvan P.)
* Fixed a crash introduced in 0.6rc1 in the grad of clip. (Ian G.)
  Also implemented the gradient on the min/max bounds.
* Fixed a crash in the grad of tensor.switch for ints. (Ian G.)
* Fixed a crash when mixing shared variables on the GPU and sparse dot. (Pascal L.)
* Fixed a crash where sparse.dot would sometimes return a dtype number
  that is equivalent but not the one expected. (Pascal L., reported by Rami Al-Rfou)
* Better error messages. (Ian G.)
* Moved all sparse random functions back to the sandbox, as they do not keep their state inside Theano. (Pascal L.)
  They were moved out of the sandbox in 0.6rc1.
* LoadFromDisk now only supports some memmap modes. (Pascal L.)
  Otherwise, it was causing errors, segmentation faults or wrong results.
* Fixed an import problem on PiCloud. (Jeremiah Lowin)
  You need to use the c|py linker with the default
  environment. Otherwise, you need to create your own environment.
* Fixed a crash during optimization when taking a subtensor of a constant with a non-constant index. (Ian G.)
* Better handling of and error messages for gradients on integers. (Ian G.)
* Fixed a crash where Scan assumed all TypeErrors raised by the grad function were due to undefined gradients. (Ian G.)
https://github.com/Theano/Theano/wiki/Devnews

Other:
* Doc typo fixes, doc updates, better error messages: Olivier D., David W.F., Frederic B., James B., Matthew Rocklin, Ian G., abalkin.
=============
Release Notes
......
...@@ -72,7 +183,7 @@ Deprecation:
This was a predecessor of SharedVariable with a less pythonic philosophy.
Interface changes:
- * Now the base version requirements are numpy >= 1.5.0 and the optional scipy >= 0.8.
+ * Now the base version requirements are numpy >= 1.5.0 and the optional scipy >= 0.7.2.
* In Theano 0.5, we removed the deprecated sharedvar.value property.
  Now we raise an error if you access it. (Frederic B.)
* theano.function does not accept duplicate inputs, so function([x, x], ...)
......
...@@ -53,7 +53,7 @@ copyright = '2008--2012, LISA lab'
# The short X.Y version.
version = '0.6'
# The full version, including alpha/beta/rc tags.
- release = '0.6rc1'
+ release = '0.6rc2'
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
......
...@@ -249,6 +249,8 @@ following methods:
1) They must be Variable instances.
2) When they are types that have dtypes, they must never have an integer dtype.
+ The output gradients passed *to* Op.grad will also obey these constraints.
Integers are a tricky subject. Integers are the main reason for having DisconnectedType,
NullType or zero gradient. When you have an integer as an argument to your grad method,
recall the definition of a derivative to help you decide what value to return:
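For reference, the limit definition being invoked here (standard calculus, not Theano-specific):

```latex
f'(x) = \lim_{\epsilon \to 0} \frac{f(x + \epsilon) - f(x)}{\epsilon}
```

For an integer-typed input, a perturbation by an arbitrarily small epsilon is not representable, which is why the text points to DisconnectedType, NullType or a zero gradient instead.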
......
...@@ -57,8 +57,8 @@ Theano also provides :func:`theano.printing.pydotprint` that creates a png image
The parameter in T.dscalar('x') in the first line is the name of this variable
in the graph. This name is used when printing the graph to make it more readable.
- If no name is provided the variable x is printed as its type as. In this example
- <TensorType(float64, scalar)>.
+ If no name is provided the variable x is printed as its type as returned by
+ x.type(). In this example - <TensorType(float64, scalar)>.
The name parameter can be any string. There are no naming restrictions:
in particular, you can have many variables with the same name.
...@@ -86,7 +86,7 @@ The line ``|x [@C`` means the variable named ``x`` with debugprint identifier
your graph, their different debugprint identifier will be your clue.
The line ``|TensorConstant{2.0} [@B]`` means that there is a constant 2.0
- wit this debugprint identifier.
+ with this debugprint identifier.
The line ``Elemwise{mul,no_inplace} [@A] ''`` is indented less than
the other ones, because it means there is a variable computed by multiplying
...@@ -121,7 +121,7 @@ Elemwise{mul} [@A] ''
|Elemwise{mul} [@B] ''
|Elemwise{pow} [@C] ''
- If the depth parameter is provided, it limits the nuber of levels that are
+ If the depth parameter is provided, it limits the number of levels that are
shown.
......
...@@ -14,7 +14,7 @@ own Theano code, and even (it happens) in Theano's internals, in
Isolating the Problem/Testing Theano Compiler
---------------------------------------------
You can run your Theano function in a :ref:`DebugMode<using_debugmode>`.
This tests the Theano optimizations and helps to find where NaN, inf and other problems come from.
...@@ -56,12 +56,12 @@ following example.
# compile and call the actual function
f = theano.function([x], h2)
f(numpy.random.rand(5, 10))
Running the above code generates the following error message:
.. code-block:: bash
Definition in:
File "/u/desjagui/workspace/PYTHON/theano/gof/opt.py", line 1102, in apply
lopt_change = self.process_node(fgraph, node, lopt)
File "/u/desjagui/workspace/PYTHON/theano/gof/opt.py", line 882, in process_node
...@@ -83,8 +83,8 @@ Running the above code generates the following error message:
thunk()
File "/u/desjagui/workspace/PYTHON/Theano/theano/gof/cc.py", line 1111, in execute
raise exc_type, exc_value, exc_trace
ValueError: ('Shape mismatch: x has 10 cols but y has 20 rows',
_dot22(x, <TensorType(float64, matrix)>), [_dot22.0],
_dot22(x, InplaceDimShuffle{1,0}.0), 'Sequence id of Apply node=4')
Needless to say, the above is not very informative and does not provide much in
...@@ -114,7 +114,7 @@ following error message, which properly identifies *line 23* as the culprit.
Traceback (most recent call last):
File "test2.py", line 23, in <module>
h1 = T.dot(x,func_of_W1)
File "/u/desjagui/workspace/PYTHON/Theano/theano/gof/op.py", line 360, in __call__
node.op.perform(node, input_vals, output_storage)
File "/u/desjagui/workspace/PYTHON/Theano/theano/tensor/basic.py", line 4458, in perform
...@@ -167,8 +167,8 @@ Theano provides a 'Print' op to do this.
Since Theano runs your program in a topological order, you won't have precise
control over the order in which multiple ``Print()`` ops are evaluated. For a more
precise inspection of what's being computed where, when, and how, see the discussion
- :ref:`faq_wraplinker`.
+ :ref:`faq_monitormode`.
.. warning::
...@@ -196,7 +196,7 @@ You can read about them in :ref:`libdoc_printing`.
"The Function I Compiled is Too Slow, what's up?"
-------------------------------------------------
First, make sure you're running in ``FAST_RUN`` mode. Even though
``FAST_RUN`` is the default mode, insist by passing ``mode='FAST_RUN'``
to ``theano.function`` (or ``theano.make``) or by setting :attr:`config.mode`
to ``FAST_RUN``.
...@@ -206,7 +206,7 @@ Second, try the Theano :ref:`using_profilemode`. This will tell you which
Tips:
* Use the flags ``floatX=float32`` to require type *float32* instead of *float64*;
  Use the Theano constructors matrix(),vector(),... instead of dmatrix(), dvector(),...
  since they respectively involve the default types *float32* and *float64*.
* Check in the ``profile`` mode that there is no ``Dot`` op in the post-compilation
...@@ -216,48 +216,79 @@ Tips:
of type *float64*.
- .. _faq_wraplinker:
+ .. _faq_monitormode:
- "How do I Step through a Compiled Function with the WrapLinker?"
- ----------------------------------------------------------------
+ "How do I Step through a Compiled Function?"
+ --------------------------------------------
- This is not exactly a FAQ, but the doc is here for now...
- It's pretty easy to roll-your-own evaluation mode.
- Check out this one:
+ You can use ``MonitorMode`` to inspect the inputs and outputs of each
+ node being executed when the function is called. The code snippet below
+ shows how to print all inputs and outputs:
.. code-block:: python
- class PrintEverythingMode(Mode):
-     def __init__(self):
-         def print_eval(i, node, fn):
-             print i, node, [input[0] for input in fn.inputs],
-             fn()
-             print [output[0] for output in fn.outputs]
-         wrap_linker = theano.gof.WrapLinkerMany([theano.gof.OpWiseCLinker()], [print_eval])
-         super(PrintEverythingMode, self).__init__(wrap_linker, optimizer='fast_run')
+ import theano
+
+ def inspect_inputs(i, node, fn):
+     print i, node, [input[0] for input in fn.inputs],
+
+ def inspect_outputs(i, node, fn):
+     print [output[0] for output in fn.outputs]
- When you use ``mode=PrintEverythingMode()`` as the mode for ``Function`` or ``Method``,
- then you should see [potentially a lot of] output. Every ``Apply`` node will be printed out,
- along with its position in the graph, the arguments to the functions ``perform`` or
- ``c_code`` and the output it computed.
- >>> x = T.dscalar('x')
- >>> f = function([x], [5 * x], mode=PrintEverythingMode())
- >>> f(3)
- >>> # print: 0 Elemwise{mul,no_inplace}(5, x) [array(5, dtype=int8), array(3.0)] [array(15.0)]
- >>> # print: [array(15.0)]
+ x = theano.tensor.dscalar('x')
+ f = theano.function([x], [5 * x],
+                     mode=theano.compile.MonitorMode(
+                         pre_func=inspect_inputs,
+                         post_func=inspect_outputs))
+ f(3)
+ # The code will print the following:
+ # 0 Elemwise{mul,no_inplace}(TensorConstant{5.0}, x) [array(5.0), array(3.0)] [array(15.0)]
+
+ When using these ``inspect_inputs`` and ``inspect_outputs`` functions
+ with ``MonitorMode``, you should see [potentially a lot of] printed output.
+ Every ``Apply`` node will be printed out,
+ along with its position in the graph, the arguments to the functions ``perform`` or
+ ``c_code`` and the output it computed.
Admittedly, this may be a huge amount of
output to read through if you are using big tensors... but you can choose to
- put logic inside of the *print_eval* function that would, for example, print
+ add logic that would, for instance, print
something out only if a certain kind of op were used, at a certain program
position, or only if a particular value showed up in one of the inputs or outputs.
- Use your imagination :)
+ A typical example is to detect when NaN values are added into computations, which
+ can be achieved as follows:
+
+ .. code-block:: python
+
+     import numpy
+     import theano
+
+     def detect_nan(i, node, fn):
+         for output in fn.outputs:
+             if numpy.isnan(output[0]).any():
+                 print '*** NaN detected ***'
+                 theano.printing.debugprint(node)
+                 print 'Inputs : %s' % [input[0] for input in fn.inputs]
+                 print 'Outputs: %s' % [output[0] for output in fn.outputs]
+                 break
+
+     x = theano.tensor.dscalar('x')
+     f = theano.function([x], [theano.tensor.log(x) * x],
+                         mode=theano.compile.MonitorMode(
+                             post_func=detect_nan))
+     f(0)  # log(0) * 0 = -inf * 0 = NaN
+
+ # The code above will print:
+ # *** NaN detected ***
+ # Elemwise{Composite{[mul(log(i0), i0)]}} [@A] ''
+ #  |x [@B]
+ # Inputs : [array(0.0)]
+ # Outputs: [array(nan)]
.. TODO: documentation for link.WrapLinkerMany
- This can be a really powerful debugging tool. Note the call to *fn* inside the call to
- *print_eval*; without it, the graph wouldn't get computed at all!
How to Use pdb
--------------
......
...@@ -153,6 +153,13 @@ short name Full constructor
``ProfileMode`` ``compile.profilemode.ProfileMode()`` C implementations where available, all available graph transformations, print profile information.
================= =============================================================== ===============================================================================
+ .. Note::
+
+     For debugging purpose, there also exists a ``MonitorMode`` (which has no
+     short name). It can be used to step through the execution of a function:
+     see :ref:`the debugging FAQ<faq_monitormode>` for details.
Linkers
=======
......
...@@ -14,13 +14,13 @@ try:
except ImportError:
    from distutils.core import setup
try:
    from distutils.command.build_py import build_py_2to3 \
        as build_py
    from distutils.command.build_scripts import build_scripts_2to3 \
        as build_scripts
except ImportError:
    from distutils.command.build_py import build_py
    from distutils.command.build_scripts import build_scripts
CLASSIFIERS = """\
...@@ -55,7 +55,7 @@ PLATFORMS = ["Windows", "Linux", "Solaris", "Mac OS-X", "Unix"]
MAJOR = 0
MINOR = 6
MICRO = 0
- SUFFIX = "rc1"  # Should be blank except for rc's, betas, etc.
+ SUFFIX = "rc2"  # Should be blank except for rc's, betas, etc.
ISRELEASED = False
VERSION = '%d.%d.%d%s' % (MAJOR, MINOR, MICRO, SUFFIX)
......
...@@ -21,6 +21,8 @@ from module import *
import debugmode  # register DEBUG_MODE
from debugmode import DebugMode
+ from monitormode import MonitorMode
from profilemode import ProfileMode
from theano.compile.sharedvalue import shared, shared_constructor, SharedVariable
......
...@@ -55,9 +55,12 @@ class OpFromGraph(gof.Op):
if grad_depth > 0:
    output_grads = [t() for t in self.output_types]
-     gd = G.grad_sources_inputs(zip(self.outputs, output_grads),
-                                self.inputs)
-     gs = map(gd.get, self.inputs)
+     # OpFromGraph doesn't implement a connection_pattern, so for now we regard
+     # all inputs and outputs as connected. This will compute the right numerical
+     # value for the gradients but could fail to raise the disconnected inputs error
+     # in some cases.
+     gs = G.grad(cost=None, known_grads=dict(zip(self.outputs, output_grads)),
+                 wrt=self.inputs, disconnected_inputs='ignore')
    self.grad_ops = []
    for g in gs:
        if g is None:
......
# Note: this code was initially copied from the 'pyutools' package by its
# original author, and re-licensed under Theano's license.
import theano
from theano.compile.mode import Mode
class MonitorMode(Mode):
"""
`MonitorMode` is a debug mode to easily step through function execution.
Its default behavior is to behave like the 'FAST_RUN' mode. By providing
either a `pre_func` (called before a node is executed) or a `post_func`
(called after a node is executed) monitoring function, the user can inspect
node behavior.
A typical use case is to detect the introduction of NaN values in a graph.
For an example of such a use case, see doc/tutorial/debug_faq.txt.
"""
def __init__(self, pre_func=None, post_func=None, optimizer='fast_run'):
"""
Constructor.
:param pre_func: A function to call before executing a thunk, with
arguments:
- the thunk index
- the Apply node
- the thunk to be called
:param post_func: A function to call after executing a thunk, with the
same three arguments as `pre_func`.
:param optimizer: The optimizer to use. One may use for instance
'fast_compile' to skip optimizations.
"""
self.pre_func = pre_func
self.post_func = post_func
wrap_linker = theano.gof.WrapLinkerMany([theano.gof.OpWiseCLinker()],
[self.eval])
super(MonitorMode, self).__init__(wrap_linker, optimizer=optimizer)
def eval(self, i, node, fn):
"""
The method that calls the thunk `fn`.
"""
if self.pre_func is not None:
self.pre_func(i, node, fn)
fn()
if self.post_func is not None:
self.post_func(i, node, fn)
import numpy
import theano
def test_detect_nan():
"""
Test the code snippet example that detects NaN values.
"""
nan_detected = [False]
def detect_nan(i, node, fn):
for output in fn.outputs:
if numpy.isnan(output[0]).any():
print '*** NaN detected ***'
theano.printing.debugprint(node)
print 'Inputs : %s' % [input[0] for input in fn.inputs]
print 'Outputs: %s' % [output[0] for output in fn.outputs]
nan_detected[0] = True
break
x = theano.tensor.dscalar('x')
f = theano.function([x], [theano.tensor.log(x) * x],
mode=theano.compile.MonitorMode(
post_func=detect_nan))
f(0) # log(0) * 0 = -inf * 0 = NaN
assert nan_detected[0]
...@@ -40,7 +40,7 @@ def debugprint(obj, depth=-1, print_type=False,
:type depth: integer
:param depth: print graph to this depth (-1 for unlimited)
:type print_type: boolean
- :param print_type: wether to print the type of printed objects
+ :param print_type: whether to print the type of printed objects
:type file: None, 'str', or file-like object
:param file: print to this file ('str' means to return a string)
:type ids: str
...@@ -531,11 +531,11 @@ def pydotprint(fct, outfile=None,
label each edge between an input and the Apply node with the
input's index.
- green boxes are inputs variables to the graph
- blue boxes are outputs variables of the graph
- grey boxes are variables that are not outputs and are not used
+ Green boxes are inputs variables to the graph,
+ blue boxes are outputs variables of the graph,
+ grey boxes are variables that are not outputs and are not used,
red ellipses are transfers from/to the gpu (ops with names GpuFromHost,
- HostFromGpu)
+ HostFromGpu).
"""
if colorCodes is None:
......
...
@@ -221,7 +221,8 @@ class Scan(PureOp):
            'following error has been encountered: The '
            '%s %s (argument number %d) has dtype '
            '%s and %d dimension(s). The corresponding slice %s '
            'however has dtype %s and %d dimension(s) (it should '
            'have the same dtype and one fewer dimensions). This '
            'should never happen, please '
            'report to theano-dev mailing list'
        )
...
@@ -1261,11 +1262,9 @@ class Scan(PureOp):
                    if x in diff_inputs]
        for x in consider_inps:
            try:
                gmp[x] = gradient.grad(cost=None,
                                       known_grads={y: g_y}, wrt=x)
            except gradient.NullTypeGradError:
                # It means the gradient is undefined (which implies
                # it is connected)
                gmp[x] = x
...
@@ -1374,11 +1373,21 @@ class Scan(PureOp):
                self.inner_nitsot_outs(self_outputs))

        def compute_gradient(y, g_y):
            if 'int' in str(g_y.dtype):
                raise TypeError("Gradients may never be integers but g_y "
                                "has type " + str(g_y.type))

            wrt = [x for x in theano.gof.graph.inputs([y])
                   if x in diff_inputs]
            grads = gradient.grad(
                cost=None,
                known_grads={y: g_y},
                wrt=wrt, consider_constant=wrt,
                disconnected_inputs='ignore',
                return_disconnected='None')
            gmp = dict(zip(wrt, grads))
            rval = [gmp.get(p, None) for p in diff_inputs]
            return rval

        dC_dinps_t = [None for inp in diff_inputs]
        disconnected_dC_dinps_t = [True for inp in diff_inputs]
        dC_dXts = []
...
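The `known_grads` mechanism used in `compute_gradient` is just the chain rule started from a supplied output gradient instead of a scalar cost. A minimal numeric sketch of the idea (the function and variable names are ours, for illustration only):

```python
# Suppose y = x ** 2 somewhere inside a larger graph, and we are handed
# g_y = dC/dy for an unspecified cost C. The chain rule gives
# dC/dx = g_y * dy/dx, without ever constructing C itself -- which is what
# grad(cost=None, known_grads={y: g_y}) does symbolically.
def grad_from_known(x, g_y):
    dy_dx = 2.0 * x  # derivative of y = x ** 2 at x
    return g_y * dy_dx


g_x = grad_from_known(3.0, 0.5)  # 0.5 * 2 * 3 = 3.0
```

This is why Scan can backpropagate through its inner graph: the outer graph hands it `g_y` for each inner output, and no explicit cost is ever needed.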
...
@@ -464,13 +464,27 @@ def _allclose(a, b, rtol=None, atol=None):
    return numpy.allclose(a, b, atol=atol_, rtol=rtol_)


class NotConstantError(TypeError):
    """
    Raised by get_constant_value if called on something that is
    not constant.

    For now it is a TypeError, to maintain the old interface
    that get_constant_value should raise a TypeError in this
    situation. However, this is unsafe because get_constant_value
    could inadvertently raise a TypeError if it has a bug.
    So we should eventually make NotConstantError derive
    from Exception directly, and modify all code that uses
    get_constant_value to catch this more specific exception.
    """
    pass


def get_constant_value(v):
    """return the constant scalar(0-D) value underlying variable `v`

    If v is the output of dimshuffles, fills, allocs, rebroadcasts, cast
    this function digs through them.

    If `v` is not some view of constant data, then raise a NotConstantError.

    :note: There may be another function similar to this one in the
           code, but I'm not sure where it is.
...
@@ -490,7 +504,7 @@ def get_constant_value(v):
            numpy.complex(data)  # works for all numeric scalars
            return data
        except Exception:
            raise NotConstantError(
                'v.data is non-numeric, non-scalar, or has more than one'
                ' unique value', v)
    if v.owner:
...
@@ -518,9 +532,17 @@ def get_constant_value(v):
            v.owner.op.perform(v.owner, [const], ret)
            return ret[0][0]
        if isinstance(v.owner.op, Subtensor) and v.ndim == 0:
            # This condition depends on Subtensor always embedding constant
            # indices in the Op rather than making them inputs to the Apply node
            if isinstance(v.owner.inputs[0], TensorConstant) and \
               len(v.owner.inputs) == 1:
                try:
                    return v.owner.inputs[0].data.__getitem__(
                        tuple(v.owner.op.idx_list))
                except IndexError:
                    raise IndexError(str(tuple(v.owner.op.idx_list)) +
                                     " is not a valid index into " +
                                     str(v.owner.inputs[0].data))

            # The index list 'idx_list' should have length the same
            # shape as the input.
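The backward-compatibility trick behind `NotConstantError` (a new, more specific exception subclassing the old one, so existing `except TypeError` handlers keep working) can be sketched in plain Python. The `get_value` helper here is a hypothetical stand-in for `get_constant_value`, not the real function:

```python
class NotConstantError(TypeError):
    """More specific error that old `except TypeError` handlers still catch."""
    pass


def get_value(v):
    # Hypothetical stand-in for get_constant_value.
    if not isinstance(v, (int, float)):
        raise NotConstantError('not a constant', v)
    return v


# Old-style callers keep working unchanged...
try:
    get_value('x')
    caught_old = False
except TypeError:
    caught_old = True

# ...while new callers can catch the precise class and will not be fooled
# by an unrelated TypeError raised from a bug inside get_value.
try:
    get_value('x')
    caught_new = False
except NotConstantError:
    caught_new = True
```

Once all callers migrate to the specific class, the base can safely be changed from `TypeError` to `Exception`, exactly as the docstring above proposes.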
...
@@ -1614,6 +1636,9 @@ class _tensor_py_operators:
    def flatten(self, ndim=1):
        return flatten(self, ndim)

    def ravel(self):
        return flatten(self)

    # CASTING
    def astype(self, dtype):
        return cast(self, dtype)
...
@@ -1712,6 +1737,8 @@ class _tensor_py_operators:
    def __rdot__(right, left):
        return dot(left, right)

    dot = __dot__

    def sum(self, axis=None, dtype=None, keepdims=False):
        """See `theano.tensor.sum`"""
        return sum(self, axis=axis, dtype=dtype, keepdims=keepdims)
...
@@ -1736,6 +1763,10 @@ class _tensor_py_operators:
        """See `theano.tensor.var`"""
        return var(self, axis, keepdims=keepdims)

    def std(self, axis=None, keepdims=False):
        """See `theano.tensor.std`"""
        return std(self, axis, keepdims=keepdims)

    def min(self, axis=None, keepdims=False):
        """See `theano.tensor.min`"""
        return min(self, axis, keepdims=keepdims)
...
@@ -1744,6 +1775,40 @@ class _tensor_py_operators:
        """See `theano.tensor.max`"""
        return max(self, axis, keepdims=keepdims)

    def argmin(self, axis=None, keepdims=False):
        """See `theano.tensor.argmin`"""
        return argmin(self, axis, keepdims=keepdims)

    def argmax(self, axis=None, keepdims=False):
        """See `theano.tensor.argmax`"""
        return argmax(self, axis, keepdims=keepdims)

    def argsort(self, axis=-1, kind='quicksort', order=None):
        """See `theano.tensor.sort.argsort`"""
        from theano.tensor.sort import argsort
        return argsort(self, axis, kind, order)

    def clip(self, a_min, a_max):
        "Clip (limit) the values in an array."
        return clip(self, a_min, a_max)

    def conj(self):
        """See `theano.tensor.conj`"""
        return conj(self)

    def repeat(self, repeats, axis=None):
        """See `theano.tensor.repeat`"""
        from theano.tensor.extra_ops import repeat
        return repeat(self, repeats, axis)

    def round(self, mode="half_away_from_zero"):
        """See `theano.tensor.round`"""
        return round(self, mode)

    def trace(self):
        from theano.sandbox.linalg import trace
        return trace(self)

    # TO TRUMP NUMPY OPERATORS
    __array_priority__ = 1000
...
@@ -2949,12 +3014,12 @@ def psi(a):
@_scal_elemwise_with_nfunc('real', 1, -1)
def real(z):
    """Return real component of complex-valued tensor `z`"""
_tensor_py_operators.real = property(real)


@_scal_elemwise_with_nfunc('imag', 1, -1)
def imag(z):
    """Return imaginary component of complex-valued tensor `z`"""
_tensor_py_operators.imag = property(imag)


@_scal_elemwise_with_nfunc('angle', 1, -1)
def angle(z):
...
@@ -3782,7 +3847,7 @@ class AdvancedIndexingError(TypeError):
class Subtensor(Op):
    """Return a subtensor view

    The inputs array is the tensor x, followed by scalar integer types.
    TODO: WRITEME: how are the scalar integer variables formatted?

    This class uses a relatively complex internal representation of the inputs
...
@@ -3791,7 +3856,7 @@ class Subtensor(Op):
    idx_list: instance variable TODO: WRITEME: is this a list or a tuple?
                                    (old docstring gives two conflicting
                                    descriptions)
              elements are either integers, theano scalar types, or slices.
              one element per "explicitly named dimension"
              TODO: WRITEME: what is an "explicitly named dimension"?
...
@@ -3800,7 +3865,11 @@ class Subtensor(Op):
              if slice:
                  start/stop/step members of each slice are integer indices
                  into the inputs array or None
                  integer indices may be actual integers or theano scalar types

    Note that the idx_list defines the Op, so two Subtensor instances are
    considered to be different Ops if they have different idx_list fields.
    This means that the entries in it are theano Types, not theano Variables.

    @todo: add support for advanced tensor indexing (in Subtensor_dx too).
...
@@ -3818,6 +3887,17 @@ class Subtensor(Op):
    @staticmethod
    def collapse(idxs, cond):
        """
        idxs: a list of indices or slices.
        cond: a callable that returns a bool

        returns: idxs, with the slices flattened out into a list.
                 if cond is true for an entry, does not flatten it.
        """
        ret = []

        def helper(entry):
...
@@ -3830,10 +3910,20 @@ class Subtensor(Op):
        for idx in idxs:
            helper(idx)
        return ret
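A minimal, standalone sketch of what a `collapse` helper like this does, following only the docstring above (our own simplified version, assuming entries are plain values or `slice` objects, and that `cond` selects which entries to keep — not Theano's exact implementation):

```python
def collapse(idxs, cond):
    """Flatten a mixed list of indices and slices into a flat list,
    keeping only the entries for which cond(entry) is true."""
    ret = []

    def helper(entry):
        if cond(entry):
            ret.append(entry)
        elif isinstance(entry, slice):
            # Recurse into the slice's components instead of keeping it whole.
            for part in (entry.start, entry.stop, entry.step):
                helper(part)

    for idx in idxs:
        helper(idx)
    return ret


# Keep the integer entries; the slice is flattened and its None step dropped.
flat = collapse([2, slice(0, 5, None)], lambda e: isinstance(e, int))
```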
    @staticmethod
    def convert(entry, slice_ok=True):
        """
        The "idx_list" field is unique to each Subtensor instance.
        It is not unique to each Apply node, so it should not refer to
        specific Variables. This method changes references to Variables
        into references to Types.

        TODO: WRITEME: This method also accepts "entry" already being a Type;
        when would that happen?
        """
        invalid_scal_types = [scal.float64, scal.float32]
        scal_types = [scal.int64, scal.int32, scal.int16, scal.int8]
        tensor_types = [lscalar, iscalar, wscalar, bscalar]
...
...
@@ -722,20 +722,19 @@ class Elemwise(Op):
    def _bgrad(self, inputs, ograds):
        # returns grad, with respect to broadcasted versions of inputs

        prev_setting = theano.config.compute_test_value
        try:
            theano.config.compute_test_value = 'off'

            def as_scalar(t):
                if isinstance(t.type, (NullType, DisconnectedType)):
                    return t
                return Scalar(t.type.dtype)()

            scalar_inputs = map(as_scalar, inputs)
            scalar_ograds = map(as_scalar, ograds)
            scalar_igrads = self.scalar_op.grad(scalar_inputs, scalar_ograds)
            for igrad in scalar_igrads:
                assert igrad is not None
...
...
@@ -801,10 +801,9 @@ class ConvOp(OpenMPOp):
            # mimic what happens inside theano.grad: get the input gradient
            # of the final cost wrt all variables involved.
            return theano.gradient.grad(cost=None,
                                        known_grads={node: gz},
                                        wrt=[inputs, kerns])

        if self.dx not in (1, 2) or self.dy not in (1, 2):
            raise NotImplementedError(
...
...
@@ -1046,7 +1046,7 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
        # Verify the gradient when providing output gradient
        h = theano.function([x, y, a],
                            T.grad(expr, x, known_grads={expr: a * x.sum()}),
                            mode=mode)
        try:
            assert 8 <= len(h.maker.fgraph.toposort()) <= 17
            validate_grad_graph(h)
...
...
@@ -14,7 +14,7 @@ builtin_min = __builtin__.min
from nose.plugins.skip import SkipTest
import numpy
from numpy.testing import dec, assert_array_equal, assert_allclose
from numpy.testing.noseclasses import KnownFailureTest

import theano
...
@@ -7001,6 +7001,85 @@ class TestInferShape(utt.InferShapeTester):
                         [tile(adtens4, aivec_val, ndim)],
                         [adtens4_val], Tile)
class TestTensorInstanceMethods(unittest.TestCase):
    def setUp(self):
        self.vars = matrices('X', 'Y')
        self.vals = [rand(2, 2), rand(2, 2)]

    def test_argmin(self):
        X, _ = self.vars
        x, _ = self.vals
        assert_array_equal(X.argmin().eval({X: x}), x.argmin())

    def test_argmax(self):
        X, _ = self.vars
        x, _ = self.vals
        assert_array_equal(X.argmax().eval({X: x}), x.argmax())

    def test_argsort(self):
        X, _ = self.vars
        x, _ = self.vals
        assert_array_equal(X.argsort().eval({X: x}), x.argsort())
        assert_array_equal(X.argsort(1).eval({X: x}), x.argsort(1))

    def test_clip(self):
        X, Y = self.vars
        x, y = self.vals
        Z = X.clip(0.5 - Y, 0.5 + Y)
        z = x.clip(0.5 - y, 0.5 + y)
        assert_array_equal(Z.eval({X: x, Y: y}), z)

    def test_dot(self):
        X, Y = self.vars
        x, y = self.vals
        assert_array_equal(x.dot(y), X.dot(Y).eval({X: x, Y: y}))
        Z = X.dot(Y)
        z = x.dot(y)
        assert_array_equal(x.dot(z), X.dot(Z).eval({X: x, Z: z}))

    def test_real_imag(self):
        X, Y = self.vars
        x, y = self.vals
        Z = X + Y * 1j
        z = x + y * 1j
        assert_array_equal(Z.real.eval({Z: z}), x)
        assert_array_equal(Z.imag.eval({Z: z}), y)

    def test_conj(self):
        X, Y = self.vars
        x, y = self.vals
        Z = X + Y * 1j
        z = x + y * 1j
        assert_array_equal(Z.conj().eval({Z: z}), z.conj())

    def test_round(self):
        X, _ = self.vars
        x, _ = self.vals
        assert_array_equal(X.round().eval({X: x}), x.round())

    def test_std(self):
        X, _ = self.vars
        x, _ = self.vals
        # std() is implemented as a theano graph and does not pass its
        # args directly to numpy. This sometimes results in small
        # differences, so we use an allclose test.
        assert_allclose(X.std().eval({X: x}), x.std())

    def test_repeat(self):
        X, _ = self.vars
        x, _ = self.vals
        assert_array_equal(X.repeat(2).eval({X: x}), x.repeat(2))

    def test_trace(self):
        X, _ = self.vars
        x, _ = self.vals
        assert_array_equal(X.trace().eval({X: x}), x.trace())

    def test_ravel(self):
        X, _ = self.vars
        x, _ = self.vals
        assert_array_equal(X.ravel().eval({X: x}), x.ravel())


if __name__ == '__main__':
...
...
@@ -6,7 +6,6 @@ import unittest
import theano
from theano import gof

from theano import gradient
from theano.tensor.nnet.Conv3D import conv3D
from theano import config
...
@@ -16,6 +15,16 @@ from theano.gof.null_type import NullType

one = theano.tensor.as_tensor_variable(1.)


def grad_sources_inputs(sources, inputs):
    """
    This implements the old grad_sources_inputs function in terms of
    the new interface so the tests don't need to be rewritten.
    """
    if inputs is None:
        inputs = theano.gof.graph.inputs([source[0] for source in sources])
    return dict(zip(inputs,
                    theano.gradient.grad(cost=None,
                                         known_grads=dict(sources),
                                         wrt=inputs,
                                         consider_constant=inputs)))
class testgrad_sources_inputs(unittest.TestCase):
    def test_retNone1(self):
...
@@ -369,35 +378,6 @@ class test_grad(unittest.TestCase):
        # If we made it to here without an exception, then the
        # connection_pattern functionality worked correctly

    def test_downcast_dtype(self):
        # Test that the gradient of a cost wrt a float32 variable does not
        # get upcasted to float64.
...
@@ -418,6 +398,161 @@ class test_grad(unittest.TestCase):
        # be downcasted to float32, so dc_dx should also be float32
        assert dc_dx.dtype == 'float32'
    def test_grad_constant(self):
        # Test that the gradient handles Constants and consider_constant
        # variables consistently
        x = theano.tensor.scalar()
        y = theano.tensor.scalar()
        z_x = x + y
        z_one = one + y
        g_x = theano.tensor.grad(z_x, x, consider_constant=[x])
        g_one = theano.tensor.grad(z_one, one)

        f = theano.function([x, y], [g_x, g_one])

        g_x, g_one = f(1, .5)

        if not np.allclose(g_x, g_one):
            raise AssertionError("Gradient using consider_constant is " +
                                 str(g_x) + " but gradient with respect to "
                                 "the same Constant is " + str(g_one))


def test_known_grads():
    # Tests that the grad method with no known_grads
    # matches what happens if you put its own known_grads
    # in for each variable

    full_range = theano.tensor.arange(10)
    x = theano.tensor.scalar('x')
    t = theano.tensor.iscalar('t')
    ft = full_range[t]
    ft.name = 'ft'
    coeffs = theano.tensor.vector('c')
    ct = coeffs[t]
    ct.name = 'ct'
    p = x ** ft
    p.name = 'p'
    y = ct * p
    y.name = 'y'
    cost = theano.tensor.sqr(y)
    cost.name = 'cost'

    layers = [
        [cost],
        [y],
        [ct, p],
        [ct, x, ft],
        [coeffs, t, full_range, x]
    ]

    inputs = [coeffs, t, x]

    rng = np.random.RandomState([2012, 11, 15])
    values = [rng.randn(10), rng.randint(10), rng.randn()]
    values = [np.cast[ipt.dtype](value)
              for ipt, value in zip(inputs, values)]

    true_grads = theano.tensor.grad(cost, inputs,
                                    disconnected_inputs='ignore')
    true_grads = theano.function(inputs, true_grads)
    true_grads = true_grads(*values)

    for layer in layers:
        print 'Testing by separately computing ', layer
        first = theano.tensor.grad(cost, layer, disconnected_inputs='ignore')
        known = dict(zip(layer, first))
        full = theano.tensor.grad(cost=None, known_grads=known,
                                  wrt=inputs, disconnected_inputs='ignore')
        full = theano.function(inputs, full)
        full = full(*values)
        assert len(true_grads) == len(full)
        for a, b, var in zip(true_grads, full, inputs):
            if not np.allclose(a, b):
                print 'Failure'
                print a
                print b
                print var
                print layer
                for v in known:
                    print v, ':', theano.function(inputs, known[v])(*values)
                assert False


def test_dxdx():
    # Tests that the gradient of a scalar with respect to itself is 1
    # I use an integer in this case because people keep changing this
    # gradient to be 0 on integers but according to our interpretation
    # of the gradient as defined in the Op contract, it should be 1.
    # If you feel the need to change this unit test you are probably
    # modifying the Op contract and should definitely get the approval
    # of multiple people on theano-dev.

    x = theano.tensor.iscalar()
    g = theano.tensor.grad(x, x)

    g = g.eval({x: 12})

    assert np.allclose(g, 1.)


def test_known_grads_integers():
    # Tests that known_grads works on integers

    x = theano.tensor.iscalar()
    g_expected = theano.tensor.scalar()

    g_grad = theano.gradient.grad(cost=None,
                                  known_grads={x: g_expected},
                                  wrt=x)

    f = theano.function([g_expected], g_grad)

    x = -3
    gv = np.cast[theano.config.floatX](.6)

    g_actual = f(gv)

    assert np.allclose(g_actual, gv)


def test_undefined_cost_grad():
    # Tests that if we say the cost is not differentiable via the
    # known_grads mechanism, it is treated as such by the rest of the
    # system.
    # This is so that Ops that are built around minigraphs like OpFromGraph
    # and scan can implement Op.grad by passing ograds to known_grads

    x = theano.tensor.iscalar()
    y = theano.tensor.iscalar()
    cost = x + y
    assert cost.dtype in theano.tensor.discrete_dtypes
    try:
        grads = theano.tensor.grad(cost, [x, y],
                                   known_grads={cost: NullType()()})
    except theano.gradient.NullTypeGradError:
        return
    raise AssertionError("An undefined gradient has been ignored.")


def test_disconnected_cost_grad():
    # Tests that if we say the cost is disconnected via the
    # known_grads mechanism, it is treated as such by the rest of the
    # system.
    # This is so that Ops that are built around minigraphs like OpFromGraph
    # and scan can implement Op.grad by passing ograds to known_grads

    x = theano.tensor.iscalar()
    y = theano.tensor.iscalar()
    cost = x + y
    assert cost.dtype in theano.tensor.discrete_dtypes
    try:
        grads = theano.tensor.grad(
            cost, [x, y],
            known_grads={cost: gradient.DisconnectedType()()},
            disconnected_inputs='raise')
    except theano.gradient.DisconnectedInputError:
        return
    raise AssertionError("A disconnected gradient has been ignored.")


if __name__ == '__main__':
    unittest.main()
...
@@ -341,15 +341,9 @@ class test_RopLop(RopLop_checker):
        rop_out2 = tensor.Rop((m, v, m + v), [m, v], [m_, v_])
        assert isinstance(rop_out2, tuple)
        assert len(rop_out2) == 3

        all_outs = []
        for o in rop_out1, rop_out2:
            all_outs.extend(o)
        f = theano.function([m, v, m_, v_], all_outs)
        f(mval, vval, m_val, v_val)