Commit a01aca11 authored by Ian Goodfellow

merged

===========================
Announcing Theano 0.3.1
===========================
This is a bug/crash fix and small feature release.
The upgrade is recommended for everybody.
For those using the bleeding edge version in the
mercurial repository, we encourage you to update to the `0.3.1` tag.
Deleting old cache
------------------
Since the default path of the cache directory for compiled objects has
changed, we encourage you to delete the previous one.
The easiest way to do that is to execute::

    python -c 'import theano; print theano.config.base_compiledir'

and then call ``rm -rf`` on the printed directory.
A new cache directory will then be created next time you import theano.
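The two steps above can be scripted in plain Python (a hypothetical helper, not part of Theano; the real path must come from ``theano.config.base_compiledir``):

```python
import os
import shutil
import tempfile

def delete_compile_cache(base_compiledir):
    """Remove a Theano compile-cache directory if it exists.

    `base_compiledir` stands for the path printed by
    `theano.config.base_compiledir`; a fresh cache is recreated
    automatically on the next `import theano`.
    """
    if os.path.isdir(base_compiledir):
        shutil.rmtree(base_compiledir)
    return not os.path.exists(base_compiledir)

# Demonstrate on a throwaway directory standing in for the real cache path.
fake_cache = tempfile.mkdtemp()
removed = delete_compile_cache(fake_cache)
```

Calling the helper again on the now-missing directory is harmless, since it only removes the directory when it exists.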
What's New
----------
[Include the content of NEWS.txt here]
Download
--------
You can download Theano from http://pypi.python.org/pypi/Theano.
Description
-----------
Theano is a Python library that allows you to define, optimize, and
efficiently evaluate mathematical expressions involving
multi-dimensional arrays. It is built on top of NumPy. Theano
features:
* tight integration with NumPy: a similar interface to NumPy's.
numpy.ndarrays are also used internally in Theano-compiled functions.
* transparent use of a GPU: perform data-intensive computations up to
140x faster than on a CPU (support for float32 only).
* efficient symbolic differentiation: Theano can compute derivatives
for functions of one or many inputs.
* speed and stability optimizations: avoid nasty bugs when computing
expressions such as log(1+ exp(x) ) for large values of x.
* dynamic C code generation: evaluate expressions faster.
* extensive unit-testing and self-verification: includes tools for
detecting and diagnosing bugs and/or potential problems.
Theano has been powering large-scale computationally intensive
scientific research since 2007, but it is also approachable
enough to be used in the classroom (IFT6266 at the University of Montreal).
Resources
---------
About Theano:
http://deeplearning.net/software/theano/
About NumPy:
http://numpy.scipy.org/
About Scipy:
http://www.scipy.org/
Machine Learning Tutorial with Theano on Deep Architectures:
http://deeplearning.net/tutorial/
Acknowledgments
---------------
I would like to thank all contributors of Theano. For this particular
release, the people who have helped resolve many outstanding issues:
(in alphabetical order) Frederic Bastien, Arnaud Bergeron, James
Bergstra, Josh Bleecher Snyder, Olivier Delalleau, Guillaume
Desjardins, Dumitru Erhan, Ian Goodfellow, Pascal Lamblin, Razvan
Pascanu and Francois Savard and David Warde-Farley.
Also, thank you to all NumPy and Scipy developers, as Theano builds on
their strength.
All questions/comments are always welcome on the Theano
mailing-lists ( http://deeplearning.net/software/theano/ )
Release Notes
=============
Theano 0.3 (2010-11-23)
=======================
This is the first major release of Theano since 0.1. Version 0.2 development started internally but it was never advertised as a release.
There have been so many changes since 0.1 that we have lost track of many of them. Below is a *partial* list of changes since 0.1.
* GPU code using NVIDIA's CUDA framework is now generated for many Ops.
* Some interface changes since 0.1:
* A new "shared variable" system to allow reusing memory space between Theano functions.
* A new memory contract has been formally written for Theano, for people who want to minimize memory copies.
* The old module system has been deprecated.
* By default, inputs to a Theano function will not be silently downcast (e.g. from float64 to float32).
* An error is now raised when using the result of a logical operation on a Theano variable in an ``if`` statement (i.e. an implicit call to ``__nonzero__``).
* An error is now raised when we receive a non-aligned ndarray as input to a function (this is not supported).
* An error is raised when the list of dimensions passed to dimshuffle() contains duplicates or is otherwise not sensible.
* Call NumPy BLAS bindings for gemv operations in addition to the already supported gemm.
* If gcc is unavailable at import time, Theano now falls back to a Python-based emulation mode after raising a warning.
* An error is now raised when tensor.grad is called on a non-scalar Theano variable (in the past we would implicitly do a sum on the tensor to make it a scalar).
* Added support for "erf" and "erfc" functions.
* The current default value of the parameter axis of theano.{max,min,argmax,argmin,max_and_argmax} is deprecated. We now use the default NumPy behavior of operating on the entire tensor.
* Theano is now available from PyPI and installable through "easy_install" or "pip".
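The new axis default mirrors NumPy's behavior, which the following snippet illustrates (this is NumPy itself, not Theano, purely for comparison):

```python
import numpy as np

a = np.array([[1, 5],
              [3, 2]])

whole = np.max(a)               # no axis: reduce over the entire tensor -> 5
per_column = np.max(a, axis=0)  # reduce along axis 0 -> array([3, 5])
```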
Theano 0.1
==========
Modifications in the trunk since the last release
-------------------------------------------------

In trunk since 0.3.1 release
----------------------------

GPU:
* Move to the GPU fused elemwise operations that have dtypes other than float32 in them (except float64), if the inputs and outputs are float32.
* This allows moving elemwise comparisons to the GPU if we cast the result to float32 afterwards.
Theano 0.3.1 (2011-02-21)
----------------------------
Deprecation:
* The theano shared variable attribute ``value`` is deprecated; use ``get_value()`` or ``set_value()`` instead!
  See http://deeplearning.net/software/theano/tutorial/aliasing.html
Bugs fixed:
* The random number generator in theano/sandbox/rng_mrg.py did not always return the same sequence of numbers on the CPU and GPU.
* In some cases, there was a (possibly large) fraction of non-random garbage in the returned sequence.
* In Python mode (not the default mode), when the input of an elemwise operation was an empty ndarray, we were not returning an empty ndarray.
* Scan cached the number of steps. This caused no problem, because each time you called scan the number of steps was refreshed.
  The problem arose when you called ScanGrad, which used the cached number of steps without refreshing it.
  To be affected by this bug, one had to compile two graphs, one containing a Scan and the other the corresponding GradScan,
  call the first function to cache the number of steps, and then call the second function with a different number of steps.
* In GpuConv, errors in conv_patch_stack_reduce when the entire kernel does not fit into shared memory.
  The error was not found before, as its impact was less than the relative tolerance of 1e-3. Now the relative tolerance is 1e-5.
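To see why tightening the tolerance matters, here is a NumPy sketch (illustrative numbers, not the actual GpuConv values): an error of 5e-4 passes a comparison at rtol=1e-3 but fails at rtol=1e-5.

```python
import numpy as np

expected = np.float64(1.0)
buggy = np.float64(1.0005)  # hypothetical result carrying a 5e-4 error

loose = np.allclose(buggy, expected, rtol=1e-3)  # bug hidden at the old tolerance
tight = np.allclose(buggy, expected, rtol=1e-5)  # bug caught at the new tolerance
```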
Crash fixed:
* An exception that made Theano crash when taking the gradient of DimShuffle in a particular case.
* Compilation crash for GpuElemwise with tensors with a high number of dimensions (~6 or more).
* Disabled a C code generator that made gcc crash on complex types.
* Crash in optimization when an Op has no input.
* Output shape is now computed correctly for matrix-vector multiplication on the GPU.
* In Scan, when using numbers as inputs instead of symbolic variables.
* In GradScan, when there is only one input in the Scan.
* In GpuSum, a bug in the calculation of n_blocks for the 10 pattern (sum over the rows of a matrix).
* Some segfaults at exit with GPU code.
Optimization:
* New SpecifyShape op that allows passing more shape info in the graph.
* Speed up gemv by working around scipy's gemv slowness when the matrix is in C order (the default).
* Remove joins of only one element.
* During optimization, consider one more case in get_constant_value.

GPU:
* cuda_shared.value = X now works inplace!
* cuda_shared_var.set_value(new_ndarray) will overwrite the old value inplace in the most common case.
* Allow creating a CudaNdarraySharedVariable from a CudaNdarray.
* New init_gpu_device theano flag.
* Fuse GpuElemwise more often (in the case where there are so many inputs that fusing them all would exceed the 256-byte limit on parameters to a gpu function).
* CPU join of only one element that was not moved to the GPU.
New features:
* tensor.reshape now makes dimensions of length 1 broadcastable.
* tensor.prod now implements the gradient.
* DebugMode now warns if an Op declared itself as returning a view of an input but did not actually do so.
  * This behaviour is a problem, because it can block other Ops from operating inplace on the same inputs, which could lower the reuse of memory.
* Sparse.structured_dot now works when both matrices are sparse.
* Sparse type is now supported by the shape op, and the ShapeFeature optimizer works correctly with them.
* New 3D convolution ops, with CPU and GPU implementations.
* New colors in pydotprint.
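The effect of a broadcastable length-1 dimension (as ``tensor.reshape`` now produces) can be illustrated with NumPy, where any length-1 axis broadcasts; this is a NumPy analogy, not Theano code:

```python
import numpy as np

col = np.arange(3.0).reshape(3, 1)  # shape (3, 1): the trailing axis has length 1
row = np.array([10.0, 20.0])        # shape (2,)

out = col + row                     # the length-1 axis broadcasts against row
out.shape                           # -> (3, 2)
```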
Documentation:
* Documented lib.amdlibm and (new) init_gpu_device config variables.
* A new page (it was written for 0.3, but an error was hiding it on the web page) on the memory aliasing contract of Theano.
* Revision to the Windows installation instructions.
* The cuda documentation is now generated on the web server.
* Better documentation of .theanorc and its sections.
Unit tests:
* Stop usage of deprecated functions or syntax in the unit tests.
* Better testing of GPU convolution nets.
* Make more tests able to use different random seeds.
* Tests of sparse now use default mode, not a hard-coded one.
* Remove some tests of unimplemented features.
Other:
* The name of the compiledir now includes the Python version, to make it easier for people with many Python versions.
* Added theano.tensor.std as a shortcut to sqrt(var(input=input, axis=axis)).
* Whitespace, tabulation and indentation clean-up in the code.
* Better detection of memory sharing between variables.
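The ``theano.tensor.std`` shortcut computes the same quantity as the explicit composition; in NumPy terms (an analogy, since this snippet uses NumPy rather than Theano):

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 6.0]])

longhand = np.sqrt(np.var(a, axis=0))  # sqrt(var(input, axis))
shortcut = np.std(a, axis=0)           # same result in a single call
```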
master_doc = 'index'

# General substitutions.
project = 'Theano'
copyright = '2008--2011, LISA lab'

# The default replacements for |version| and |release|, also used in various
# other places throughout the built documents.
#
# The short X.Y version.
version = '0.3.1'
# The full version, including alpha/beta/rc tags.
release = '0.3.1'

# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
There are fewer methods to define for an Op than for a Type:

.. method:: infer_shape(node, (i0_shapes,i1_shapes,...))

    Allow optimizations to lift the Shape op over this op. For example, if we
    compute an op only to take its shape, we will be able to obtain the shape
    without computing the op itself. It must return a tuple with one tuple
    containing the shape of each output.
        raise TypeError('%s only works on doubles' % self.name)
        return gof.Apply(self, [x, y], [double()])

    def perform(self, node, inp, out):
        x, y = inp
        z, = out
        z[0] = self.fn(x, y)

    def __str__(self):
        return self.name

    def c_code(self, node, name, inp, out, sub):
        x, y = inp
        z, = out
        return self.ccode % locals()
        raise TypeError('%s only works on doubles' % self.name)
        return gof.Apply(self, [x, y], [double()])

    def perform(self, node, inp, out):
        x, y = inp
        z, = out
        z[0] = self.fn(x, y)

    def __str__(self):
.. note::

    The tests should be run with the ``THEANO_FLAGS`` ``device=cpu`` (default).
    Otherwise, they will generate false errors. If you have a GPU, it will
    automatically be used to run GPU-related tests.

    If you want the GPU-related tests to run on a specific GPU device, and not
    the default one, you should use :attr:`~config.init_gpu_device`, for
    instance ``THEANO_FLAGS=init_gpu_device=gpu1``.

All tests should pass except those marked as ``KnownFailureTest``. If some
test fails on your machine, you are encouraged to tell us what went wrong on
the ``theano-users@googlegroups.com`` mailing list.
.. note::

    ``warn.ignore_bug_before=all`` removes warnings that you don't need to see
    here. It is also recommended for a new user to set this flag to a
    different value in their ``.theanorc`` file. See
    :attr:`config.warn.ignore_bug_before` for more details.

Troubleshooting: Make sure you have a BLAS library
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
used within a MinGW Shell (not available if you only installed Python(x,y)).

    C:\Users\login>echo %PYTHONPATH%
    C:\Users\login\Theano

- Create a new ``.theanorc`` text file (or ``.theanorc.txt``, whichever is
  easier for you to create under Windows) in your user profile directory (the
  directory you are in when you start a new command prompt with ``cmd``),
  containing the following two lines:

    [blas]
    ldflags =
You do not need to do the following now, because it is not usually needed, but
if later on, when running Theano, you see an error message that looks like:

*error: 'assert' was not declared in this scope*

then you will have to add another section:

.. code-block:: cfg

    [gcc]
    cxxflags = -IC:\MinGW\include
- You are now ready to run Theano.
  It will use NumPy for dot products, which is still pretty fast (see below for
  optional instructions on how to compile your own BLAS library).
How to make a release
==================================================

Get a fresh copy of the repository
==================================

Clone the code::

    hg clone http://hg.assembla.com/theano Theano-0.X

It does not have to be in your PYTHONPATH.
Update the version number
=========================
Edit ``setup.py`` to contain the newest version number::

    cd Theano-0.X
    vi setup.py  # Edit the MAJOR, MINOR, MICRO and SUFFIX

The homepage must link to the download URL, for PyPI to correctly get the
code. Edit ``doc/index.txt`` to contain a link to what will be the download
URL::

    vi doc/index.txt  # Edit the link to downloads/Theano-0.X.tar.gz
``conf.py`` in the ``doc/`` directory should be updated in the following ways:

* Change the ``version`` and ``release`` variables to the new version number.
* Change the upper copyright year to the current year if necessary.

``NEWS.txt`` usually contains the name and date of the release; change them
too.
Tag the release
===============
You will need to commit the previous changes, tag the resulting version, and
push that into the original repository. The syntax is something like the
following::

    hg commit -m "modifications for 0.X release" setup.py doc/conf.py NEWS.txt
    hg tag 0.X
    hg push
The documentation will be automatically regenerated in the next few hours.
Generate and upload the package
===============================
On PyPI
-------
Now change ``ISRELEASED`` in setup.py to ``True``.

Finally, use setuptools to register and upload the release::

    python setup.py register sdist --formats=gztar,zip upload

This command registers and uploads the package on pypi.python.org. To be able
to do that, you must be registered on PyPI (you can create a new account, or
use OpenID), and be listed among the "Package Index Owners" of Theano.
On freshmeat
------------
The Theano project page at freshmeat is `here <http://freshmeat.net/projects/theano>`__.
The package itself is not uploaded to freshmeat; the only thing to update is
the description and tags.

You can request the rights to add a release from an admin (for instance Fred),
pointing them to `the "roles" page
<http://freshmeat.net/projects/theano/roles>`__. Then, create a new release from
`the "releases" page <http://freshmeat.net/projects/theano/releases>`__.
On mloss.org
------------
Project page is at http://mloss.org/software/view/241/.
Account jaberg is listed as submitter.
1. Log in as jaberg to mloss.
2. Search for theano and click the logo.
3. Press 'update this project' on the left and change

   - the version number
   - the download link
   - the description of what has changed

4. Press save.

Make sure the "what's changed" text isn't too long, because it will show up on
the front page of mloss. I think you have to indent bullet lines by 4 spaces
in the description.

You can "update this project" and save as many times as needed to get the
revision text right. Just do not change the version number.
Finally
-------
Change ``ISRELEASED`` back to ``False``.
Announce the release
====================
Generate an e-mail from the template in ``EMAIL.txt``, including content
from ``NEWS.txt``, and send it to the following mailing lists:

* theano-users
* theano-announce
* numpy-discussion@scipy.org
* scipy-user@scipy.org
.. envvar:: THEANO_FLAGS

    This is a list of comma-delimited key=value pairs that control
    Theano's behavior.

    For example, in bash, you can override your :envvar:`THEANORC` defaults
    for <myscript>.py by typing this:
.. attribute:: device

    String value: either ``'cpu'``, ``'gpu'``, ``'gpu0'``, ``'gpu1'``,
    ``'gpu2'``, or ``'gpu3'``

    Default device for computations. If ``gpu*``, change the default to try
    to move computation to it and to put shared variables of float32 on it.

    Choose the default compute device for theano graphs. Setting this to a
    ``gpu*`` string will make Theano try by default to move computation to it.
    It will also make Theano put shared variables of float32 on it by default.
    ``'gpu'`` lets the driver select the GPU to use, while ``'gpu?'`` makes
    Theano try to use a specific device. If we are not able to use the GPU,
    either we fall back on the CPU or an error is raised, depending on the
    :attr:`force_device` flag.
.. attribute:: force_device

    Bool value: either ``True`` or ``False``

    Default: ``False``

    If ``True``, we raise an error if we cannot use the specified
    :attr:`device`. If ``False``, we fall back to the CPU.
.. attribute:: init_gpu_device

    String value: either ``''``, ``'gpu'``, ``'gpu0'``, ``'gpu1'``,
    ``'gpu2'``, or ``'gpu3'``

    Initialize the gpu device to use.
    When its value is ``gpu*``, the theano flag :attr:`device` must be
    ``"cpu"``. Unlike :attr:`device`, setting this flag to a specific GPU will
    not make Theano try to use this device by default; in particular it will
    **not** move computations, nor shared variables, to the specified GPU.

    This flag is useful to run GPU-specific tests on a particular GPU, instead
    of using the default one.

.. attribute:: floatX
        return x * numpy.log(x)

    def impl(self, x):
        return XlogX.st_impl(x)

    def grad(self, inp, grads):
        x, = inp
        gz, = grads
        return [gz * (1 + scalar.log(x))]

    def c_code(self, node, name, inp, out, sub):
        x, = inp
        z, = out
        if node.inputs[0].type in [scalar.float32, scalar.float64]:
            return """%(z)s =
                %(x)s == 0.0
np_array += 1  # now it is an array of 2.0 s

s_default.get_value()  # -> array([1.0, 1.0])
s_false.get_value()    # -> array([1.0, 1.0])
s_true.get_value()     # -> array([2.0, 2.0])

If we are running this with the CPU as the device,
then changes we make to ``np_array`` *right away* will show up in
``s_true.get_value()``, because numpy arrays are mutable and ``s_true`` is
using the ``np_array`` object as its internal buffer.
will terminate the aliasing).

It is safe practice (and a good idea) to use ``borrow=True`` in a shared
variable constructor when the shared variable stands for a large object (in
terms of memory footprint) and you do not want to create copies of it in
memory.

It is not a reliable technique to use ``borrow=True`` to modify shared variables
by side-effect, because with some devices (e.g. GPU devices) this technique will
This code introduces a few new concepts. The ``shared`` function constructs
so-called :term:`shared variables`. These are hybrid symbolic and non-symbolic
variables. Shared variables can be used in symbolic expressions just like
the objects returned by ``dmatrices(...)``, but they also have an internal
value that defines the value taken by this symbolic variable in *all* the
functions that use it. It is called a *shared* variable because its value is
shared between many functions. The value can be accessed and modified by the
``.get_value()`` and ``.set_value()`` methods. We will come back to this soon.

The other new thing in this code is the ``updates`` parameter of function.
``updates`` is a list of pairs of the form (shared-variable, new expression).

Anyway, let's try it out!
.. If you modify this code, also change:
.. theano/tests/test_tutorial.py:T_examples.test_examples_8

>>> state.get_value()
array(0)
>>> accumulator(1)
array(0)
>>> state.get_value()
array(1)
>>> accumulator(300)
array(1)
>>> state.get_value()
array(301)

It is possible to reset the state. Just use the ``.set_value()`` method:

>>> state.set_value(-1)
>>> accumulator(3)
array(-1)
>>> state.get_value()
array(2)
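The accumulator semantics above (return the old state, then apply the update) can be mimicked in plain Python; this is only an analogy with hypothetical names, not Theano's implementation:

```python
class SharedValue(object):
    """A tiny stand-in for a Theano shared variable (illustrative only)."""
    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = value

def make_accumulator(state):
    """Build a function that, like the compiled Theano one, returns the
    old state and then adds `inc` to it."""
    def accumulator(inc):
        old = state.get_value()
        state.set_value(old + inc)
        return old
    return accumulator

state = SharedValue(0)
accumulator = make_accumulator(state)
accumulator(1)     # returns 0; state becomes 1
accumulator(300)   # returns 1; state becomes 301
state.get_value()  # -> 301
```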
As we mentioned above, you can define more than one function to use the same
shared variable. These functions can both update the value.

>>> decrementor = function([inc], state, updates=[(state, state-inc)])
>>> decrementor(2)
array(2)
>>> state.get_value()
array(0)
You might be wondering why the updates mechanism exists. You can always
... for the purpose of one particular function.

.. If you modify this code, also change:
.. theano/tests/test_tutorial.py:T_examples.test_examples_8
>>> fn_of_state = state * 2 + inc >>> fn_of_state = state * 2 + inc
>>> foo = T.lscalar() # the type (lscalar) must match the shared variable we >>> foo = T.lscalar() # the type (lscalar) must match the shared variable we
>>> # are replacing with the ``givens`` list >>> # are replacing with the ``givens`` list
>>> skip_shared = function([inc, foo], fn_of_state, >>> skip_shared = function([inc, foo], fn_of_state,
givens=[(state, foo)]) givens=[(state, foo)])
>>> skip_shared(1, 3) # we're using 3 for the state, not state.value >>> skip_shared(1, 3) # we're using 3 for the state, not state.value
array(7) array(7)
>>> state.value # old state still there, but we didn't use it >>> state.get_value() # old state still there, but we didn't use it
array(0) array(0)
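Stripped of the symbolic machinery, ``givens`` amounts to evaluating the expression with a substitute plugged in for the shared variable, for that call only, leaving the shared state itself untouched. A minimal sketch (hypothetical names, not the Theano API):

```python
# fn_of_state stands in for the symbolic expression state * 2 + inc.
def fn_of_state(state, inc):
    return state * 2 + inc

shared_state = 0  # the "real" shared value

def skip_shared(inc, foo):
    # use foo in place of the shared state for this call only
    return fn_of_state(foo, inc)

print(skip_shared(1, 3))  # 7: 3*2 + 1
print(shared_state)       # 0: old state still there, we didn't use it
```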
The givens parameter can be used to replace any symbolic variable, not just a
@@ -411,9 +412,11 @@ Seeding Streams

Random variables can be seeded individually or collectively.

You can seed just one random variable by seeding or assigning to the
``.rng`` attribute, using ``.rng.set_value()``.

>>> rng_val = rv_u.rng.get_value(borrow=True)   # Get the rng for rv_u
>>> rng_val.seed(89234)                         # seeds the generator
>>> rv_u.rng.set_value(rng_val, borrow=True)    # Assign back seeded rng
You can also seed *all* of the random variables allocated by a :class:`RandomStreams`
object by that object's ``seed`` method. This seed will be used to seed a
@@ -431,10 +434,12 @@ update the state of the generators used in function ``f`` above.

For example:

>>> state_after_v0 = rv_u.rng.get_value().get_state()
>>> nearly_zeros()  # this affects rv_u's generator
>>> v1 = f()
>>> rng = rv_u.rng.get_value(borrow=True)
>>> rng.set_state(state_after_v0)
>>> rv_u.rng.set_value(rng, borrow=True)
>>> v2 = f()  # v2 != v1
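The ``get_state()``/``set_state()`` round-trip above mirrors what Python's own ``random`` module offers, so the same save-and-rewind idea can be shown stand-alone (this is an analogue, not Theano's RandomStreams):

```python
import random

random.seed(89234)
state_after_v0 = random.getstate()  # capture the generator state
v1 = random.random()                # advances the generator
random.setstate(state_after_v0)     # roll the generator back
v2 = random.random()                # replays the same draw
print(v1 == v2)  # True
```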
...
@@ -133,7 +133,8 @@ matrix ``W`` and a bias ``b``, you can define:

    def __getstate__(self):
        return (self.W, self.b)

    def __setstate__(self, state):
        W, b = state
        self.W = W
        self.b = b
@@ -146,7 +147,8 @@ functions to reflect the change in name:

    def __getstate__(self):
        return (self.weights, self.bias)

    def __setstate__(self, state):
        W, b = state
        self.weights = W
        self.bias = b

...
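The diff replaces the tuple-parameter form ``def __setstate__(self, (W, b))`` with explicit unpacking, since tuple parameters were removed in Python 3 (PEP 3113). A self-contained sketch of the renamed-attribute pattern above, with a hypothetical `Layer` class:

```python
import pickle

class Layer:
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def __getstate__(self):
        # old pickles stored the state as a (W, b) tuple
        return (self.weights, self.bias)

    def __setstate__(self, state):
        W, b = state              # explicit unpacking (PEP 3113)
        self.weights = W          # map onto the new attribute names
        self.bias = b

layer = Layer([1.0, 2.0], 0.5)
copy_ = pickle.loads(pickle.dumps(layer))
print(copy_.weights, copy_.bias)  # [1.0, 2.0] 0.5
```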
@@ -47,7 +47,7 @@ AUTHOR_EMAIL = "theano-dev@googlegroups.com"
PLATFORMS = ["Windows", "Linux", "Solaris", "Mac OS-X", "Unix"]

MAJOR = 0
MINOR = 3
MICRO = 1
SUFFIX = ""  # Should be blank except for rc's, betas, etc.
ISRELEASED = False

...
@@ -69,8 +69,8 @@ FancyModule = Module

from printing import \
    pprint, pp
import scan as scan_module
from scan import scan, map, reduce, foldl, foldr, Scan, ScanGrad

import tensor
import scalar

...
@@ -381,6 +381,7 @@ class InvalidValueError(DebugModeError):
        client_node = self.client_node
        hint = self.hint
        specific_hint = self.specific_hint
        context = debugprint(r, prefix='  ', depth=12, file=StringIO()).getvalue()
        return """InvalidValueError
        type(variable) = %(type_r)s
        variable       = %(r)s
@@ -393,7 +394,8 @@ class InvalidValueError(DebugModeError):
        isfinite = %(v_isfinite)s
        client_node = %(client_node)s
        hint = %(hint)s
        specific_hint = %(specific_hint)s
        context = ...\n%(context)s
        """ % locals()

########################
@@ -403,8 +405,9 @@ class InvalidValueError(DebugModeError):
########################

def debugprint(r, prefix='', depth=-1, done=None, print_type=False,
               file=sys.stdout, print_destroy_map=False, print_view_map=False,
               order=[]):
    """Print the graph leading to `r` to given depth.

    :param r: Variable instance
@@ -415,6 +418,7 @@ def debugprint(r, prefix='', depth=-1, done=None, print_type=False, file=sys.std
    :param file: file-like object to which to print
    :param print_destroy_map: whether to print the op destroy_map after other info
    :param print_view_map: whether to print the op view_map after other info
    :param order: If not empty, print the index of each node in the toposort.
    """
    if depth==0:
        return
@@ -452,22 +456,28 @@ def debugprint(r, prefix='', depth=-1, done=None, print_type=False, file=sys.std
        if view_map_str and view_map_str!='{}':
            view_map_str='v='+view_map_str
        o=''
        if order:
            o = str(order.index(r.owner))
        if len(a.outputs) == 1:
            print >> file, '%s%s [@%i]%s \'%s\' %s %s %s' % (prefix, a.op, id(r),
                                                             type_str, r_name,
                                                             destroy_map_str,
                                                             view_map_str,
                                                             o)
        else:
            print >> file, '%s%s.%i [@%i]%s \'%s\' %s %s %s' % (prefix, a.op,
                                                                a.outputs.index(r),
                                                                id(r), type_str,
                                                                r_name,
                                                                destroy_map_str,
                                                                view_map_str,
                                                                o)
        if id(a) not in done:
            done.add(id(a))
            for i in a.inputs:
                debugprint(i, prefix+' |', depth=depth-1, done=done,
                           print_type=print_type, file=file, order=order)
    else:
        #this is a variable
        print >> file, '%s%s [@%i]%s' % (prefix, r, id(r), type_str)
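The shape of ``debugprint`` can be seen in a minimal sketch (not Theano's implementation): recurse through a graph to a given depth, tracking visited nodes in a ``done`` set so shared subgraphs are expanded only once. Here a node is just a hypothetical ``(name, inputs)`` tuple.

```python
# Minimal depth-limited, deduplicating graph printer.
def sketch_debugprint(node, prefix='', depth=-1, done=None, lines=None):
    if depth == 0:          # depth=-1 means "unlimited": it never hits 0
        return lines
    if done is None:
        done = set()
    if lines is None:
        lines = []
    name, inputs = node     # a node is (name, [input nodes])
    lines.append('%s%s [@%i]' % (prefix, name, id(node)))
    if id(node) not in done:
        done.add(id(node))  # expand each apply node's inputs only once
        for i in inputs:
            sketch_debugprint(i, prefix + ' |', depth - 1, done, lines)
    return lines

x = ('x', [])
y = ('y', [])
add = ('add', [x, y])
for line in sketch_debugprint(add):
    print(line)
```

The ``prefix + ' |'`` accumulation is what produces the indented tree the real ``debugprint`` output shows.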
@@ -532,20 +542,16 @@ def _check_inputs(node, storage_map, r_vals, dr_vals, active_nodes, clobber_dr_v
        for oo,ii in vmap.iteritems():
            out_var = storage_map[node.outputs[oo]][0]
            in_var = storage_map[node.inputs[ii[0]]][0]
            # We don't try to optimize simple scalar and empty ndarray,
            # as this is not worth our time. This happens at least in
            # Subtensor when the output is a scalar, but this depends on
            # the version of numpy!
            if getattr(out_var,'size',2)<=1:
                continue
            if isinstance(node.op, theano.compile.mode.OutputGuard):
                # This class is not in the final graph.
                continue
            if not _may_share_memory(out_var, in_var):
                opt_warning("input idx %d marked as viewed but new memory allocated by node '%s'"%(ii[0],str(node)))

    for r_idx, r in enumerate(node.inputs):
@@ -1678,6 +1684,9 @@ class DebugMode(Mode):
        If any of these arguments (except optimizer) is not None, it overrides the class default.
        The linker argument is not used. It is set there to allow Mode.requiring() and some other functions to work with DebugMode too.
        """
        if linker is not None and not issubclass(linker, _Linker):
            raise Exception("DebugMode can use only its own linker! Don't give him one to use it.", linker)
        super(DebugMode, self).__init__(
                optimizer=optimizer,
                linker=_Linker)
...
@@ -223,6 +223,8 @@ class DeepCopyOp(theano.gof.Op):
            }
            """%locals()
        else:
            super(DeepCopyOp, self).c_code(node, name, inames, onames, sub)

deep_copy_op = DeepCopyOp()
@@ -380,9 +382,9 @@ class Function(object):
            finder[i] = c
            finder[input.variable] = c
            if input.name not in finder:
                finder[input.name] = c
            else:
                finder[input.name] = DUPLICATE
            if input.name is None:
                n_unnamed_inputs += 1
            else:
@@ -408,9 +410,9 @@ class Function(object):
            finder[i] = f
            finder[input] = f
            if input.name not in finder:
                finder[input.name] = f
            else:
                finder[input.name] = DUPLICATE
            #backport
            #finder[input.name] = f if input.name not in finder else DUPLICATE
            #setters.append(f)
@@ -421,9 +423,9 @@ class Function(object):
            finder[sin.variable] = c
            finder[sin.name] = c
            if sin.name not in finder:
                finder[sin.name] = c
            else:
                finder[sin.name] = DUPLICATE
            #backport
            #finder[sin.name] = c if sin.name not in finder else DUPLICATE
            inv_finder[c] = input
@@ -511,7 +513,7 @@ class Function(object):
        # Set positional arguments
        i = 0
        for arg in args:
            #TODO: provide a Param option for skipping the filter if we
            #      really want speed.
            s = self.input_storage[i]
@@ -523,7 +525,7 @@ class Function(object):
                        allow_downcast=s.allow_downcast)
            except Exception, e:
                e.args = tuple(list(e.args)+["Bad input argument at index %d" % i])
                raise
            s.provided += 1
            i+=1
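The pattern above, extending ``e.args`` with extra context before re-raising so the caller sees which argument failed, also works in modern Python. A stand-alone sketch with a hypothetical ``convert`` helper:

```python
def convert(values):
    out = []
    for i, v in enumerate(values):
        try:
            out.append(float(v))
        except Exception as e:
            # append context to the exception, then re-raise it unchanged
            e.args = tuple(list(e.args) + ["Bad input argument at index %d" % i])
            raise
    return out

try:
    convert(["1.5", "oops"])
except ValueError as e:
    print(e.args[-1])  # Bad input argument at index 1
```

Unlike wrapping in a new exception, this preserves the original exception type and traceback.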
@@ -626,8 +628,8 @@ class Function(object):
        dt_call=time.time()-t0
        if hasattr(self.maker.mode,'fct_call_time'):
            self.maker.mode.fct_call_time[self] += dt_call
            self.maker.mode.fct_call[self] += 1
        self.maker.mode.call_time += dt_call
        self.maker.mode.fn_time += dt_fn
@@ -732,9 +734,9 @@ class SanityCheckFunction(Function):
            if not self.check_equal(c1.value, c2.value):
                name = c2.name
                if name:
                    the_name = name
                else:
                    the_name = ""
                raise ValueError("Input #%i%s using %s and %s differs."
                                 % (i,
                #backport
@@ -751,9 +753,9 @@ class SanityCheckFunction(Function):
            if not self.check_equal(r1, r2):
                name = c2.name
                if name:
                    the_name = name
                else:
                    the_name = ""
                raise ValueError("Variable #%i%s using %s and %s differs."
                                 % (i,
                #backport
@@ -868,9 +870,11 @@ class FunctionMaker(object):
        optimizer, linker = mode.optimizer, copy.copy(mode.linker)

        # optimize the env
        start_optimizer = time.time()
        optimizer(env)
        end_optimizer = time.time()
        mode.optimizer_time += end_optimizer - start_optimizer
        _logger.debug('Optimizing took %f seconds' % (end_optimizer - start_optimizer))

        # This loop was inserted to remove aliasing between outputs when they all
        # evaluate to the same value. Originally it was OK for outputs to be aliased,
@@ -978,21 +982,23 @@ class FunctionMaker(object):
        # Get a function instance
        start_linker = time.time()
        _fn, _i, _o = self.linker.make_thunk(input_storage = input_storage_lists)
        end_linker = time.time()
        _logger.debug('Linker took %f seconds' % (end_linker - start_linker))
        self.mode.linker_time += end_linker - start_linker

        fn = self.function_builder(_fn, _i, _o, self.indices, self.outputs, defaults, self.unpack_single, self.return_none, self)
        return fn
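The instrumentation this commit adds follows a simple pattern: each compilation phase is bracketed by a pair of ``time.time()`` calls and the elapsed time is accumulated on the mode object. A stand-alone sketch (hypothetical ``Mode`` and ``compile_fn``, not Theano's classes):

```python
import time

class Mode:
    def __init__(self):
        self.optimizer_time = 0.0
        self.linker_time = 0.0

def compile_fn(mode, optimize, link):
    start_optimizer = time.time()
    optimize()
    mode.optimizer_time += time.time() - start_optimizer  # accumulate per phase

    start_linker = time.time()
    thunk = link()
    mode.linker_time += time.time() - start_linker
    return thunk

mode = Mode()
thunk = compile_fn(mode, optimize=lambda: None, link=lambda: (lambda: 42))
print(thunk())  # 42
```

Accumulating (``+=``) rather than assigning lets repeated compilations under the same mode build up total per-phase times.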
def _pickle_FunctionMaker(fm):
    if fm.return_none:
        outputs = None
    else:
        if fm.unpack_single:
            outputs = fm.outputs[0]
        else:
            outputs = fm.outputs
    #backport
    #outputs = None if fm.return_none else (fm.outputs[0] if fm.unpack_single else fm.outputs)
@@ -1086,10 +1092,10 @@ def orig_function(inputs, outputs, mode=None, accept_inplace = False, name=None)
        # TODO This may need to be changed to use containers as defaults.
        retval = []
        for default in defaults:
            if isinstance(default, gof.Container):
                retval += [copy.copy(default.value)]
            else:
                retval += [copy.copy(default)]
        return retval
        #backport
        #return [copy.copy(default.value) if isinstance(default, gof.Container) else
@@ -1111,9 +1117,9 @@ def orig_function(inputs, outputs, mode=None, accept_inplace = False, name=None)
    fn.name = name
    if hasattr(mode,'fct_call_time'):
        mode.fct_call_time.setdefault(fn,0)
    if hasattr(mode,'fct_call'):
        mode.fct_call.setdefault(fn,0)
    return fn
@@ -1226,4 +1232,3 @@ def get_info_on_inputs(named_inputs, n_unnamed_inputs):
                         get_plural(n_unnamed_inputs),
                         get_plural(n_unnamed_inputs)))
    return msg
@@ -26,7 +26,7 @@ def check_equal(x, y):
        #I put the import here to allow using theano without scipy.
        import scipy.sparse as sp
        x, y = x[0], y[0]
        # TODO: bug in current scipy, two sparse matrices are never equal, remove when moving to 0.7
        if sp.issparse(x):
            x = x.todense()
@@ -99,11 +99,15 @@ class OutputGuard(gof.Op):
        return type(self) == type(other)

    def __hash__(self):
        return hash(type(self))

    def perform(self, node, inp, out):
        x, = inp
        z, = out
        z[0] = x

    def __str__(self):
        return '%s' % self.__class__.__name__

    def c_code(self, node, nodename, inp, out, sub):
        x, = inp
        z, = out
        return """
        Py_XDECREF(%(z)s);
        %(z)s = %(x)s;
@@ -152,7 +156,7 @@ class PrintCurrentEnv(gof.Optimizer):
        theano.printing.debugprint(env.outputs)

optdb = gof.SequenceDB()
optdb.register('merge1', gof.MergeOptimizer(),
        0, 'fast_run', 'fast_compile')
optdb.register('canonicalize', gof.EquilibriumDB(),  # rearranges elemwise expressions
        1, 'fast_run', 'fast_compile')
@@ -162,7 +166,7 @@ optdb.register('Print1.21', PrintCurrentEnv('Post-canonicalize'),
        1.21,)  # 'fast_run', 'fast_compile')
optdb.register('stabilize', gof.EquilibriumDB(),  # replace unstable subgraphs
        1.5, 'fast_run')
optdb.register('Print1.51', PrintCurrentEnv('Post-stabilize'),
        1.51,)  # 'fast_run', 'fast_compile')
optdb.register('specialize', gof.EquilibriumDB(),  # misc special cases for speed
@@ -175,7 +179,7 @@ optdb.register('specialize_device', gof.EquilibriumDB(),  # misc specia
        48.6, 'fast_run')  # must be after gpu stuff at 48.5
optdb.register('merge2', gof.MergeOptimizer(),  # especially constant merge
        49, 'fast_run')
optdb.register('add_destroy_handler', AddDestroyHandler(),
        49.5, 'fast_run', 'inplace')
optdb.register('merge3', gof.MergeOptimizer(),  # final pass just to make sure
        100, 'fast_run')
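The SequenceDB registrations above rely on a simple idea: each pass is registered with a numeric position and the passes run in ascending order, so ``merge1`` (0) runs before ``canonicalize`` (1), and ``merge3`` (100) runs last. A toy sketch of that idea (not Theano's SequenceDB; tags are stored but not filtered here):

```python
class SequenceDB:
    def __init__(self):
        self.registry = []

    def register(self, name, opt, position, *tags):
        self.registry.append((position, name, opt, tags))

    def run(self, graph):
        # apply passes in ascending position order, regardless of
        # registration order
        for position, name, opt, tags in sorted(self.registry,
                                                key=lambda e: e[0]):
            graph = opt(graph)
        return graph

optdb = SequenceDB()
optdb.register('merge3', lambda g: g + ['merge3'], 100, 'fast_run')
optdb.register('merge1', lambda g: g + ['merge1'], 0, 'fast_run', 'fast_compile')
optdb.register('canonicalize', lambda g: g + ['canonicalize'], 1, 'fast_run')
print(optdb.run([]))  # ['merge1', 'canonicalize', 'merge3']
```

Fractional positions (1.21, 1.51, 48.6, 49.5) let new passes slot between existing ones without renumbering everything.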
@@ -196,7 +200,7 @@ class Mode(object):
    See predefined_linkers, predefined_optimizers and also
    predefined_modes.
    """

    def __init__(self, linker = config.linker, optimizer = config.optimizer):
        self.__setstate__((linker, optimizer))

        #self.provided_optimizer - typically the `optimizer` arg. But if the `optimizer` arg is
@@ -209,7 +213,8 @@ class Mode(object):
    def __getstate__(self):
        return (self.provided_linker, self.provided_optimizer)

    def __setstate__(self, state):
        linker, optimizer = state
        self.provided_linker = linker
        self.provided_optimizer = optimizer
        if isinstance(linker, str) or linker is None:
@@ -222,6 +227,8 @@ class Mode(object):
        self._optimizer = optimizer
        self.call_time = 0
        self.fn_time = 0
        self.optimizer_time = 0
        self.linker_time = 0

    def __str__(self):
        return "Mode(linker = %s, optimizer = %s)" % (self.provided_linker, self.provided_optimizer)
@@ -239,8 +246,8 @@ class Mode(object):
            linker = predefined_linkers[linker]
        if isinstance(optimizer, str) or optimizer is None:
            optimizer = predefined_optimizers[optimizer]
        return (linker, optimizer)

    def including(self, *tags):
        link, opt = self.get_linker_optimizer(self.provided_linker, self.provided_optimizer)
        #N.B. opt might be a Query instance, not sure what else it might be...
@@ -282,11 +289,15 @@ def get_mode(orig_string):
    if string in ['Mode','ProfileMode','DebugMode']:
        if instanciated_default_mode:
            return instanciated_default_mode
        if string == 'DebugMode':
            #need to import later to break circular dependency.
            from debugmode import DebugMode
            #DebugMode use its own linker.
            ret = DebugMode(optimizer=config.optimizer)
        else:
            # The import is needed in case string is ProfileMode
            from profilemode import ProfileMode
            ret = eval(string+'(linker=config.linker, optimizer=config.optimizer)')
    elif not predefined_modes.has_key(string):
        raise Exception("No predefined mode exist for string: %s"%string)
@@ -303,6 +314,8 @@ def get_mode(orig_string):
    #must tell python to print the summary at the end.
    if string == 'ProfileMode':
        #need to import later to break circular dependency.
        from profilemode import prof_mode_instance_to_print
        prof_mode_instance_to_print.append(ret)

    return ret
@@ -318,4 +331,3 @@ def register_mode(name, mode):
    if name in predefined_modes:
        raise ValueError('Mode name already taken: %s' % name)
    predefined_modes[name] = mode

import warnings
warnings.warn("theano.compile.sandbox no longer provides shared, shared_constructor, and pfunc. They have been moved to theano.compile.", DeprecationWarning)
from theano.compile.sharedvalue import shared, shared_constructor
from theano.compile.pfunc import pfunc
"""Provide a simple user friendly API to Theano-managed memory"""

__docformat__ = 'restructuredtext en'

# Standard imports
import copy
import logging
import traceback
import warnings

# Theano imports
from theano import config
from theano.configparser import (TheanoConfigParser, AddConfigVar, EnumStr,
        StrParam, IntParam, FloatParam, BoolParam)
from theano.gof import Container, Variable, generic

_logger = logging.getLogger('theano.compile.sharedvalue')
_logger.setLevel(logging.DEBUG)
def debug(*msg): _logger.debug(' '.join(str(m) for m in msg))
@@ -14,13 +21,11 @@ def warn(*msg): _logger.warn(' '.join(str(m) for m in msg))
def warning(*msg): _logger.warning(' '.join(str(m) for m in msg))
def error(*msg): _logger.error(' '.join(str(m) for m in msg))

AddConfigVar('shared.value_borrows',
        ("DEPRECATED. You should not use the 'value' property of shared"
         " variables, but use the .get_value() and .set_value() methods."
         " False: shared variables 'value' property is guaranteed to not"
         " alias theano-managed memory. True: no guarantee, but faster."),
        BoolParam(True))

class SharedVariable(Variable):
@@ -82,7 +87,7 @@ class SharedVariable(Variable):
    def get_value(self, borrow=False, return_internal_type=False):
        """Get the non-symbolic value associated with this SharedVariable.

        :param borrow:
            True to permit returning of an object aliased to internal memory.

        :param return_internal_type:
            True to permit the returning of an arbitrary type object used internally to store
@@ -91,7 +96,7 @@ class SharedVariable(Variable):
        Only with borrow=False and return_internal_type=True does this function guarantee that
        you actually get the internal object. But in that case, you may get different return
        types when using different compute devices.
        """
        if borrow:
            return self.container.value
@@ -101,10 +106,10 @@ class SharedVariable(Variable):
    def set_value(self, new_value, borrow=False):
        """Set the non-symbolic value associated with this SharedVariable.

        :param borrow:
            True to use the new_value directly, potentially creating problems
            related to aliased memory.

        Changes to this value will be visible to all functions using this SharedVariable.
        """
        if borrow:
@@ -115,7 +120,7 @@ class SharedVariable(Variable):
    def clone(self):
        cp = self.__class__(
                name=self.name,
                type=self.type,
                value=None,
                strict=None,
                container=self.container)
@@ -123,18 +128,26 @@ class SharedVariable(Variable):
        return cp
    def _value_get(self):
        warnings.warn(("The .value property of shared variables is deprecated."
                       " You should use the .get_value() method instead."),
                      stacklevel=2)
        return self.get_value(borrow=config.shared.value_borrows, return_internal_type=False)

    def _value_set(self, new_value):
        warnings.warn(("The .value property of shared variables is deprecated."
                       " You should use the .set_value() method instead."),
                      stacklevel=2)
        return self.set_value(new_value, borrow=config.shared.value_borrows)

    #TODO: USE A CONFIG VARIABLE TO set these get/set methods to the non-borrowing versions
    # Semantically things are clearer when using non-borrow versions. That should be the
    # default. The default support transparently (if slowly) when the 'raw' value is in a
    # different memory space (e.g. GPU or other machine).
    value = property(_value_get, _value_set,
            doc=("DEPRECATED. Shortcut for self.get_value() and "
                 "self.set_value(). "
                 "The `borrow` argument to these methods is read from "
                 "`theano.config.shared.value_borrows`. "
                 "You should call get_value() and set_value() directly."))
    def filter_update(self, update):
@@ -170,10 +183,10 @@ def shared(value, name=None, strict=False, allow_downcast=None, **kwargs):
    This function iterates over constructor functions (see `shared_constructor`) to find a
    suitable SharedVariable subclass.

    :note:
        By passing kwargs, you effectively limit the set of potential constructors to those that
        can accept those kwargs.
    """
    for ctor in reversed(shared.constructors):
        try:
@@ -194,4 +207,3 @@ def generic_constructor(value, name=None, strict=False, allow_downcast=None):
    """SharedVariable Constructor"""
    return SharedVariable(type=generic, value=value, name=name, strict=strict,
                          allow_downcast=allow_downcast)
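The dispatch loop in `shared()` above tries the most recently registered constructor first and falls through on failure, so a generic fallback registered early always loses to a more specific constructor registered later. A hedged, stand-alone sketch of that mechanism (`make_shared`, `registry`, and both constructors are invented names, not Theano's):

```python
# Hypothetical sketch of the dispatch loop in shared(): try the most
# recently registered constructor first; a TypeError means "not my type,
# or I do not accept these kwargs", so fall through to the next one.
def make_shared(value, constructors, **kwargs):
    for ctor in reversed(constructors):
        try:
            return ctor(value, **kwargs)
        except TypeError:
            continue
    raise TypeError('No suitable constructor for %r' % (value,))

def generic_ctor(value):
    return ('generic', value)

def int_ctor(value):
    if not isinstance(value, int):
        raise TypeError('int_ctor only handles ints')
    return ('int', value)

registry = [generic_ctor, int_ctor]   # int_ctor registered last, tried first

print(make_shared(5, registry))       # ('int', 5)
print(make_shared('hi', registry))    # ('generic', 'hi')
```

Passing extra kwargs narrows the candidate set exactly as the docstring says: any constructor whose signature rejects them raises `TypeError` and is skipped.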
@@ -31,14 +31,18 @@ class BROKEN_ON_PURPOSE_Add(gof.Op):
        r = gof.Apply(self, [a, b], [a.type()])
        return r

-    def perform(self, node, (a, b), (out,)):
+    def perform(self, node, inp, out_):
+        a, b = inp
+        out, = out_
        z = a+b
        #ERROR TO ADD THIS CRAPPY OFFSET
        if self.py_offset:
            out[0] = z+0.5
        else: out[0] = z

-    def c_code(self, node, name, (a, b), (z,), sub):
+    def c_code(self, node, name, inp, out, sub):
+        a, b = inp
+        z, = out
        return """
        if (%(a)s->nd != 1) {PyErr_SetString(PyExc_NotImplementedError, "rank(a) != 1"); %(fail)s;}
        if (%(b)s->nd != 1) {PyErr_SetString(PyExc_NotImplementedError, "rank(b) != 1"); %(fail)s;}
@@ -75,7 +79,7 @@ class BROKEN_ON_PURPOSE_Add(gof.Op):
# inconsistent is a invalid op, whose perform and c_code do not match
inconsistent = BROKEN_ON_PURPOSE_Add(False)

# off_by_half is a good op, that is different from theano.sparse.sd_csc
off_by_half = BROKEN_ON_PURPOSE_Add(True)
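`BROKEN_ON_PURPOSE_Add` exists so the tests can verify that DebugMode's `check_c_code` catches an op whose Python `perform` and C `c_code` disagree. The core idea, reduced to plain Python with no Theano dependency (all names here are invented; the 0.5 offset mimics the deliberate bug above):

```python
# Sketch of the check DebugMode's check_c_code performs: evaluate a
# reference implementation and an alternative one on the same inputs
# and flag any mismatch beyond a tolerance.
def reference_add(a, b):
    return [x + y for x, y in zip(a, b)]

def broken_add(a, b):          # stands in for the buggy c_code path
    return [x + y + 0.5 for x, y in zip(a, b)]

def consistent(f_ref, f_alt, a, b, tol=1e-8):
    r1, r2 = f_ref(a, b), f_alt(a, b)
    return all(abs(x - y) <= tol for x, y in zip(r1, r2))

assert consistent(reference_add, reference_add, [1.0, 2.0], [3.0, 4.0])
assert not consistent(reference_add, broken_add, [1.0, 2.0], [3.0, 4.0])
```

DebugMode does the analogous comparison between each op's Python and C outputs at runtime, which is why compiling `inconsistent` under `check_c_code=True` must raise.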
class WeirdBrokenOp(gof.Op):
    """
@@ -100,7 +104,9 @@ class WeirdBrokenOp(gof.Op):
        r = gof.Apply(self, [a_], [a_.type()])
        return r

-    def dontuse_perform(self, node, (a,), (out,)):
+    def dontuse_perform(self, node, inp, out_):
+        a, = inp
+        out, = out_
        if self.behaviour == 'times2':
            out[0] = a * 2
        elif self.behaviour == 'times2_inplace':
@@ -113,7 +119,9 @@ class WeirdBrokenOp(gof.Op):
        else:
            raise ValueError(self.behaviour)

-    def c_code(self, node, name, (a,), (z,), sub):
+    def c_code(self, node, name, inp, out, sub):
+        a, = inp
+        z, = out
        if "inplace" in self.behaviour:
            z_code = """
            {Py_XDECREF(%(z)s);}
    a = theano.tensor.dvector()
    b = theano.tensor.dvector()

    f_good = theano.function([a, b],
            off_by_half(a, b),
            mode=debugmode.DebugMode(check_c_code=True))
    f_inconsistent = theano.function([a,b],
            inconsistent(a, b),
            mode=debugmode.DebugMode(check_c_code=True))

    #this should evaluate with no error
@@ -189,7 +197,7 @@ def test_badclinkeroutput():
        return #TEST PASS

    assert False #an error should have been detected
def test_badoptimization():
    @gof.local_optimizer([theano.tensor.add])
@@ -204,7 +212,7 @@ def test_badoptimization():
    a = theano.tensor.dvector()
    b = theano.tensor.dvector()
    f = theano.function([a, b], a+b,
            mode=debugmode.DebugMode(optimizer=opt, check_c_code=True))
    try:
@@ -235,8 +243,8 @@ def test_stochasticoptimization():
    b = theano.tensor.dvector()
    try:
        f = theano.function([a, b],
                theano.tensor.add(a, b),
                mode=debugmode.DebugMode(optimizer=opt, check_c_code=True))
    except debugmode.StochasticOrder:
        return #TEST PASS
@@ -253,7 +261,9 @@ def test_baddestroymap():
        def make_node(self, a, b):
            c = a.type()
            return gof.Apply(self, [a,b], [c])

-        def perform(self, node, (a,b), (c,)):
+        def perform(self, node, inp, out):
+            a, b = inp
+            c, = out
            c[0] = a
            c[0] += b
@@ -283,14 +293,18 @@ class Test_ViewMap(unittest.TestCase):
        def make_node(self, a, b):
            c = b.type()
            return gof.Apply(self, [a,b], [c])

-        def perform(self, node, (a,b), (c,)):
+        def perform(self, node, inp, out):
+            a, b = inp
+            c, = out
            c[0] = b

    class BadAddSlice(gof.Op):
        def make_node(self, a, b):
            c = b.type()
            return gof.Apply(self, [a,b], [c])

-        def perform(self, node, (a,b), (c,)):
+        def perform(self, node, inp, out):
+            a, b = inp
+            c, = out
            c[0] = b[1:3]

    def test_badviewmap_ref(self):
@@ -343,7 +357,9 @@ class Test_ViewMap(unittest.TestCase):
            c = a.type()
            d = a.type()
            return gof.Apply(self, [a,b], [c,d])

-        def perform(self, node, (a,b), (c,d)):
+        def perform(self, node, inp, out):
+            a, b = inp
+            c, d = out
            c[0] = a
            d[0] = a[1:]
@@ -364,7 +380,9 @@ class Test_ViewMap(unittest.TestCase):
            c = a.type()
            d = a.type()
            return gof.Apply(self, [a,b], [c,d])

-        def perform(self, node, (a,b), (c,d)):
+        def perform(self, node, inp, out):
+            a, b = inp
+            c, d = out
            r = a * 2
            c[0] = r
            d[0] = r[1:]
@@ -387,7 +405,9 @@ class Test_ViewMap(unittest.TestCase):
            c = a.type()
            d = a.type()
            return gof.Apply(self, [a,b], [c,d])

-        def perform(self, node, (a,b), (c,d)):
+        def perform(self, node, inp, out):
+            a, b = inp
+            c, d = out
            r = a * 1
            c[0] = r
            d[0] = r[1:]
@@ -409,7 +429,9 @@ class Test_ViewMap(unittest.TestCase):
            c = a.type()
            d = a.type()
            return gof.Apply(self, [a,b], [c,d])

-        def perform(self, node, (a,b), (c,d)):
+        def perform(self, node, inp, out):
+            a, b = inp
+            c, d = out
            r = a * 1
            c[0] = r[:-1]
            d[0] = r[1:]
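The mechanical change repeated throughout this diff replaces tuple parameters in `perform`/`c_code`/`grad` signatures, e.g. `def perform(self, node, (a, b), (c,))`, with explicit unpacking of `inp` and `out`. Tuple parameters were removed from the language by PEP 3113 (Python 3), and the rewritten form behaves identically. A small runnable illustration (`DemoOp` and its storage layout are simplified stand-ins, not Theano's real Op machinery):

```python
# Each output storage cell is a one-element list, as in Theano's perform().
class DemoOp:
    def perform(self, node, inp, out):
        a, b = inp          # was: (a, b) in the signature
        c, = out            # was: (c,) in the signature
        c[0] = a + b        # write the result into the storage cell

storage = [None]
DemoOp().perform(None, (2, 3), (storage,))
print(storage[0])   # 5
```

Because the unpacking happens on the first line of the body, no call site has to change.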
...
@@ -9,8 +9,9 @@ from theano.compile import function
from theano import tensor
from theano import tensor as T
import random, theano
import numpy as N
+from numpy.testing.noseclasses import KnownFailureTest

PatternOptimizer = lambda p1, p2, ign=True: gof.OpKeyOptimizer(gof.PatternSub(p1, p2), ignore_newtrees=ign)
@@ -33,7 +34,7 @@ class T_function(unittest.TestCase):
        fn = function([], None) #ok
        rval = fn()
        if rval == []:
-            print >> sys.stderr, 'WARNING: ticket #254'
+            raise KnownFailureTest('See #254: Using None as function output leads to [] return value')
        else:
            assert rval is None
@@ -45,7 +46,7 @@ class T_function(unittest.TestCase):
        x,s = T.scalars('xs')
        fn = function([x], [x])
        self.failUnlessRaises(TypeError,fn,1,2)

    def test_missing_inputs(self):
        MissingInputException = TypeError
@@ -184,7 +185,7 @@ class T_function(unittest.TestCase):
    def test_weird_names(self):
        a,x,s = T.scalars('xxx')

        checkfor(self, lambda:function([In(a,name=[])],[]), TypeError)

        def t():
@@ -290,7 +291,7 @@ class T_function(unittest.TestCase):
        """
        a = T.dmatrix()
        aval = numpy.random.rand(3,3)

        # when borrow=False, test that a destroy map cannot alias output to input
        f = theano.function([In(a, borrow=False)], Out(a+1, borrow=True))
        assert numpy.all(f(aval) == aval+1)
@@ -447,7 +448,7 @@ class T_picklefunction(unittest.TestCase):
            config.mode = old_default_mode
            config.optimizer = old_default_opt
            config.linker = old_default_link

        if g == 'ok':
            return
@@ -540,7 +541,7 @@ class T_picklefunction(unittest.TestCase):
    def test_pickle_class_with_functions(self):
        blah = SomethingToPickle()
        assert blah.f2.container[blah.s].storage is blah.f1.container[blah.s].storage

        try:
            blah2 = copy.deepcopy(blah)
@@ -550,7 +551,7 @@ class T_picklefunction(unittest.TestCase):
            else:
                raise

        assert blah2.f2.container[blah2.s].storage is blah2.f1.container[blah2.s].storage

        assert blah.f1[blah.s] == blah2.f1[blah2.s]
@@ -598,4 +599,3 @@ if __name__ == '__main__':
        assert b
    t.failUnless = fu
    t.test_deepcopy_shared_container()
@@ -73,7 +73,7 @@ class TanhRnn(Op):
    This class implements the recurrent part of a recurrent neural network.
    There is not a neat way to include this in a more fine-grained way in Theano at the moment,
    so to get something working, I'm implementing a relatively complicated Op that could be
    broken down later into constituents.

    Anyway, this Op implements recursive computation of the form:
@@ -81,7 +81,7 @@ class TanhRnn(Op):
    .. latex-eqn:
        z_t &= \tanh( z_{t-1} A + x_{t-1})

    For z0 a vector, and x a TxM matrix, it returns a matrix z of shape (T+1, M),
    in which z[0] = z0.
    """
@@ -104,9 +104,10 @@ class TanhRnn(Op):
        z = x.type() #make a new symbolic variable with the same type as x
        return Apply(self, [x, z0, A], [z])

-    def perform(self, node, (x,z0,A), out):
+    def perform(self, node, inp, out):
+        x, z0, A = inp
        assert x is not None
        assert z0 is not None
        assert A is not None
        T,M = x.shape
        z = N.zeros((T+1, M))
@@ -115,7 +116,9 @@ class TanhRnn(Op):
            z[i+1] = N.tanh(N.dot(z[i], A) + x[i])
        out[0][0] = z

-    def grad(self, (x, z0, A), (gz,)):
+    def grad(self, inp, grads):
+        x, z0, A = inp
+        gz, = grads
        z = tanh_rnn(x, z0, A)
        gz_incl_rnn, gx = tanh_rnn_grad(A, z, gz)
        return [gx, gz_incl_rnn[0], (T.dot(z[:-1].T, gx))]
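The recursion that `TanhRnn.perform` implements, `z[t+1] = tanh(z[t]·A + x[t])` with `z[0] = z0`, can be sketched in dependency-free Python with nested lists in place of numpy arrays (`tanh_rnn_forward` is an illustrative name, not part of the codebase):

```python
import math

def tanh_rnn_forward(x, z0, A):
    """x: T x M rows, z0: length-M state, A: M x M weights.
    Returns a (T+1) x M list of states with z[0] == z0."""
    M = len(z0)
    z = [list(z0)]
    for xt in x:
        prev = z[-1]
        # next state: tanh of (previous state times A, plus the input row)
        nxt = [math.tanh(sum(prev[j] * A[j][i] for j in range(M)) + xt[i])
               for i in range(M)]
        z.append(nxt)
    return z

# With zero input, zero initial state, and an identity A, every state
# stays at tanh(0) == 0 and the output has T+1 == 2 rows.
z = tanh_rnn_forward(x=[[0.0, 0.0]], z0=[0.0, 0.0],
                     A=[[1.0, 0.0], [0.0, 1.0]])
```

The extra leading row is why the Op's docstring says a T x M input yields a (T+1) x M output.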
@@ -136,7 +139,8 @@ class TanhRnnGrad(Op):
    def make_node(self, A, z, gz):
        return Apply(self, [A,z,gz], (z.type(), gz.type()))

-    def perform(self, node, (A, z, gz), out):
+    def perform(self, node, inp, out):
+        A, z, gz = inp
        Tp1,M = z.shape
        T = Tp1 - 1
        gx = N.zeros((T, M))
@@ -275,7 +279,7 @@ def test_WEIRD_STUFF():
    print rnn1.minimizer.step.maker.inputs
    print rnn2.minimizer.step.maker.inputs
#    for i in range(1,len(rnn1.minimizer.step.maker.inputs)):
#        print "valid update:",theano.printing.pp(rnn1.minimizer.step.maker.inputs[i].update),
@@ -284,7 +288,7 @@ def test_WEIRD_STUFF():
#        print rnn2.minimizer.step.maker.inputs[i].update.name
#    print dir(rnn1.minimizer.step.maker.inputs[5].update)
#    print dir(rnn2.minimizer.step.maker.inputs[5].update)

    niter=3
...
@@ -24,7 +24,7 @@ class T_module(unittest.TestCase):
                super(Blah, self).__init__()
                self.stepsize = T.value(stepsize)
                x = T.dscalar()

                self.step = Method([x], x - self.stepsize)

        B = Blah(0.0)
@@ -80,27 +80,27 @@ class T_module(unittest.TestCase):
                m1.ly[0],
                m1.lly[0][0],
                m1.tx[0], #8
                m1.ty[0], m1.tlx[0][0],
                m1.tly[0][0], m1.ttx[0][0], m1.tty[0][0], m1.tdx[0]['x'],
                m1.tdy[0]['y'], m1.dx['x'],
                m1.dy['y'], m1.dlx['x'][0], m1.dly['y'][0],
                m1.dtx['x'][0], m1.dty['y'][0], m1.ddx['x']['x'],
                m1.ddy['y']['y']]):
            assert isinstance(obj,(gof.Variable))

        inst=m1.make()

        def get_l():
            return [inst.lx, inst.ly, inst.tx, inst.ty, inst.dx, inst.dy, inst.llx, inst.lly, inst.ltx, inst.lty, inst.ldx, inst.ldy, inst.tlx, inst.tly, inst.ttx, inst.tty, inst.tdx, inst.tdy, inst.dly, inst.dlx, inst.dty, inst.dtx, inst.ddy, inst.ddx]

        def get_l2():
#            return [inst.lx[0], inst.ly[0], inst.tx[0], inst.ty[0], inst.dx['x'], inst.dy['y'], inst.llx[0][0], inst.lly[0][0], inst.ltx[0][0], inst.lty[0][0], inst.ldx[0]['x'], inst.ldy[0]['y'], inst.tlx[0][0], inst.tly[0][0], inst.ttx[0][0], inst.tty[0][0], inst.tdx, inst.tdy, inst.dly, inst.dlx, inst.dty, inst.dtx, inst.ddy, inst.ddx]
            return [inst.lx, inst.ly, inst.tx, inst.ty, inst.llx[0], inst.lly[0], inst.ltx[0], inst.lty[0], inst.ldx[0], inst.ldy[0], inst.tlx[0], inst.tly[0], inst.ttx[0], inst.tty[0], inst.tdx[0], inst.tdy[0], inst.dly['y'], inst.dlx['x'], inst.dty['y'], inst.dtx['x']]#, inst.ddy['y'], inst.ddx['x']]

        #test that we can access the data
        inst.x
        inst.y

        for i in get_l():
            assert i
@@ -194,7 +194,7 @@ class T_module(unittest.TestCase):
        print 'value test'
        local_test(lambda:T.value(1),lambda:T.value(2))

    def test_method_in_list_or_dict(self):
        """Test that a Method which is only included via a list or dictionary is still treated as if it
        were a toplevel attribute
@@ -253,7 +253,7 @@ class T_module(unittest.TestCase):
            assert isinstance(f[0][0],theano.compile.function_module.Function)
        for f in inst.dly['y'][0],inst.dty['y'][0], inst.dlz['z'][0],inst.dtz['z'][0], inst.ddy['y']['y'], inst.ddz['z']['z']:
            assert isinstance(f,theano.compile.function_module.Function)

    def test_shared_members(self):
        """Test that under a variety of tricky conditions, the shared-ness of Variables and Members
        is respected."""
@@ -302,7 +302,7 @@ class T_module(unittest.TestCase):
        assert f==4

    def test_shared_members_N(self):
        """Test that Members can be shared an arbitrary number of times between
        many submodules and internal data structures."""
        def populate_module(m,x):
            m.x=x
@@ -413,9 +413,9 @@ class T_module(unittest.TestCase):
    def test_member_method_inputs(self):
        """Test that module Members can be named as Method inputs, in which case the function will
        *not* use the storage allocated for the Module's version of that Member.
        """

        # test that explicit Method inputs don't use shared storage
        M = Module()
        M.x = T.dscalar()
@@ -436,14 +436,14 @@ class T_module(unittest.TestCase):
        Method inputs"""
        if config.mode == 'FAST_COMPILE':
            return

        M = Module()
        M.x = T.dvector()
        M.y = T.dvector()
        xval= numpy.asarray([0, 0.5])

        M.f = Method([io.In(M.x,
            mutable=True,
            update=(M.x - M.y),
            value=xval)], M.x + M.y)
        m = M.make()
@@ -540,52 +540,52 @@ def test_multiple_references():
    class A(theano.Module):
        def __init__(self, sub_module):
            super(A, self).__init__()
            self.sub_module = sub_module

        def _instance_initialize(self, obj):
            print 'Initializing A'

    class B(theano.Module):
        def __init__(self, sub_module):
            super(B, self).__init__()
            self.sub_module = sub_module

        def _instance_initialize(self, obj):
            print 'Initializing B'

    class C(theano.Module):
        def __init__(self):
            super(C, self).__init__()
            self.value = theano.tensor.scalar()

        def _instance_initialize(self, obj):
            print 'Initializing C'
            obj.value = 0

        def _instance_set(self, obj, value):
            print 'Setting C'
            obj.value = value

    class D(theano.Module):
        def __init__(self):
            super(D, self).__init__()
            self.c = C()
            self.a = A(self.c)
            self.b = B(self.c)
            # Workaround for bug exhibited in a previous email.
            self.bug = theano.tensor.scalar()

        def _instance_initialize(self, obj):
            print 'Initializing D'
            obj.c.set(1)

    d = D()
@@ -735,6 +735,8 @@ def test_pickle_aliased_memory():
    sio = StringIO.StringIO()
    handler = logging.StreamHandler(sio)
    logging.getLogger('theano.compile.function_module').addHandler(handler)
+    # Silence original handler when intentionally generating warning messages
+    logging.getLogger('theano').removeHandler(theano.logging_default_handler)
    try:
        m.f.pickle_aliased_memory_strategy = 'warn'
        m.g.pickle_aliased_memory_strategy = 'warn'
@@ -742,6 +744,7 @@ def test_pickle_aliased_memory():
        assert sio.getvalue().startswith('aliased relat')
    finally:
        logging.getLogger('theano.compile.function_module').removeHandler(handler)
+        logging.getLogger('theano').addHandler(theano.logging_default_handler)
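The test above captures log output by attaching a `StreamHandler` backed by an in-memory buffer to one logger, asserting on the captured text, and detaching the handler in a `finally` block so later tests see a clean logger. The same pattern with only the stdlib (the logger name and message here are made up):

```python
import io
import logging

log = logging.getLogger('example.module')   # hypothetical logger name
sio = io.StringIO()
handler = logging.StreamHandler(sio)
log.addHandler(handler)
try:
    # anything logged on this logger now also lands in the StringIO buffer
    log.warning('aliased relationship detected')
    assert sio.getvalue().startswith('aliased relat')
finally:
    log.removeHandler(handler)   # leave the logger as we found it
```

Removing the default handler first, as the commit does, keeps the intentionally generated warnings out of the test run's normal output.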
    try:
        m.f.pickle_aliased_memory_strategy = 'raise'
@@ -772,7 +775,7 @@ def test_default_instance_initialize():
    """
    Testing the default _instance_initialize provided by module.
    """

    class M1(Module):
        def __init__(self):
            super(M1, self).__init__()
@@ -830,4 +833,3 @@ if __name__ == '__main__':
#    t.test_shared_members()
#    tests = unittest.TestLoader().loadTestsFromModule("T_test_module")
#    tests.debug()
@@ -42,10 +42,10 @@ class Test_SharedVariable(unittest.TestCase):
        # generic can hold anything even when strict=True
        u = shared('asdf', strict=False)
        v = shared('asdf', strict=True)

-        u.value = 88
-        v.value = 88
+        u.set_value(88)
+        v.set_value(88)

    def test_create_numpy_strict_false(self):
@@ -96,14 +96,14 @@ class Test_SharedVariable(unittest.TestCase):
                strict=False)

        # check that assignments to value are casted properly
-        u.value = [3,4]
-        assert type(u.value) is numpy.ndarray
-        assert str(u.value.dtype) == 'float64'
-        assert numpy.all(u.value == [3,4])
+        u.set_value([3,4])
+        assert type(u.get_value()) is numpy.ndarray
+        assert str(u.get_value(borrow=True).dtype) == 'float64'
+        assert numpy.all(u.get_value() == [3,4])

        # check that assignments of nonsense fail
        try:
-            u.value = 'adsf'
+            u.set_value('adsf')
            assert 0
        except ValueError:
            pass
@@ -114,7 +114,8 @@ class Test_SharedVariable(unittest.TestCase):
        assert u.get_value(borrow=True) is uval

    def test_scalar_strict(self):
-        def f(var, val): var.value = val
+        def f(var, val):
+            var.set_value(val)

        b = shared(numpy.int64(7), strict=True)
        assert b.type == theano.tensor.lscalar
@@ -123,7 +124,7 @@ class Test_SharedVariable(unittest.TestCase):
        b = shared(numpy.int32(7), strict=True)
        assert b.type == theano.tensor.iscalar
        self.failUnlessRaises(TypeError, f, b, 8.23)

        b = shared(numpy.int16(7), strict=True)
        assert b.type == theano.tensor.wscalar
        self.failUnlessRaises(TypeError, f, b, 8.23)
@@ -154,7 +155,8 @@ class Test_SharedVariable(unittest.TestCase):
    def test_tensor_strict(self):
-        def f(var, val): var.value = val
+        def f(var, val):
+            var.set_value(val)

        b = shared(numpy.int64([7]), strict=True)
        assert b.type == theano.tensor.lvector
# Since downcasting of a value now raises an Exception, # Since downcasting of a value now raises an Exception,
def f(var, val): var.value = val def f(var, val):
var.set_value(val)
b = shared(numpy.int64(7), allow_downcast=True) b = shared(numpy.int64(7), allow_downcast=True)
assert b.type == theano.tensor.lscalar assert b.type == theano.tensor.lscalar
f(b,8.23) f(b,8.23)
assert b.value==8 assert b.get_value()==8
b = shared(numpy.int32(7), allow_downcast=True) b = shared(numpy.int32(7), allow_downcast=True)
assert b.type == theano.tensor.iscalar assert b.type == theano.tensor.iscalar
f(b,8.23) f(b,8.23)
assert b.value==8 assert b.get_value()==8
b = shared(numpy.int16(7), allow_downcast=True) b = shared(numpy.int16(7), allow_downcast=True)
assert b.type == theano.tensor.wscalar assert b.type == theano.tensor.wscalar
f(b,8.23) f(b,8.23)
assert b.value==8 assert b.get_value()==8
b = shared(numpy.int8(7), allow_downcast=True) b = shared(numpy.int8(7), allow_downcast=True)
assert b.type == theano.tensor.bscalar assert b.type == theano.tensor.bscalar
f(b,8.23) f(b,8.23)
assert b.value==8 assert b.get_value()==8
b = shared(numpy.float64(7.234), allow_downcast=True) b = shared(numpy.float64(7.234), allow_downcast=True)
assert b.type == theano.tensor.dscalar assert b.type == theano.tensor.dscalar
f(b,8) f(b,8)
assert b.value==8 assert b.get_value()==8
b = shared(numpy.float32(7.234), allow_downcast=True) b = shared(numpy.float32(7.234), allow_downcast=True)
assert b.type == theano.tensor.fscalar assert b.type == theano.tensor.fscalar
f(b,8) f(b,8)
assert b.value==8 assert b.get_value()==8
b = shared(numpy.float(7.234), allow_downcast=True) b = shared(numpy.float(7.234), allow_downcast=True)
assert b.type == theano.tensor.dscalar assert b.type == theano.tensor.dscalar
f(b,8) f(b,8)
assert b.value==8 assert b.get_value()==8
b = shared(7.234, allow_downcast=True) b = shared(7.234, allow_downcast=True)
assert b.type == theano.tensor.dscalar assert b.type == theano.tensor.dscalar
f(b,8) f(b,8)
assert b.value==8 assert b.get_value()==8
c = shared(numpy.zeros((5,5), dtype='float32'), allow_downcast=True) c = shared(numpy.zeros((5,5), dtype='float32'), allow_downcast=True)
self.failUnlessRaises(TypeError, f, b, numpy.random.rand(5,5)) self.failUnlessRaises(TypeError, f, b, numpy.random.rand(5,5))
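The tests above rely on two behaviours of `shared(..., allow_downcast=True)`: a lossy assignment such as 8.23 into an integer-typed variable is truncated to 8, while without `allow_downcast` the same assignment raises a `TypeError`. A hypothetical, Theano-free sketch of that contract (the `SharedScalar` class is invented for illustration, not part of Theano):

```python
# Minimal sketch of the allow_downcast contract exercised by the tests above.
class SharedScalar:
    def __init__(self, value, allow_downcast=False):
        self.dtype = type(value)            # remember the original type
        self.allow_downcast = allow_downcast
        self._value = value

    def set_value(self, new_value):
        # Same-type assignments are always accepted.
        if isinstance(new_value, self.dtype):
            self._value = new_value
            return
        # Lossy assignments require explicit permission.
        if not self.allow_downcast:
            raise TypeError("refusing to downcast %r to %s"
                            % (new_value, self.dtype.__name__))
        self._value = self.dtype(new_value)  # e.g. int(8.23) -> 8

    def get_value(self):
        return self._value

b = SharedScalar(7, allow_downcast=True)
b.set_value(8.23)
assert b.get_value() == 8
```

The real Theano tests check the same pattern once per dtype (`lscalar`, `iscalar`, and so on); the sketch only models the scalar case.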
@@ -254,37 +257,38 @@ class Test_SharedVariable(unittest.TestCase):
     def test_tensor_floatX(self):
-        def f(var, val): var.value = val
+        def f(var, val):
+            var.set_value(val)
         b = shared(numpy.int64([7]), allow_downcast=True)
         assert b.type == theano.tensor.lvector
         f(b,[8.23])
-        assert b.value == 8
+        assert b.get_value() == 8
         b = shared(numpy.int32([7]), allow_downcast=True)
         assert b.type == theano.tensor.ivector
         f(b,[8.23])
-        assert b.value == 8
+        assert b.get_value() == 8
         b = shared(numpy.int16([7]), allow_downcast=True)
         assert b.type == theano.tensor.wvector
         f(b,[8.23])
-        assert b.value == 8
+        assert b.get_value() == 8
         b = shared(numpy.int8([7]), allow_downcast=True)
         assert b.type == theano.tensor.bvector
         f(b,[8.23])
-        assert b.value == 8
+        assert b.get_value() == 8
         b = shared(numpy.float64([7.234]), allow_downcast=True)
         assert b.type == theano.tensor.dvector
         f(b,[8])
-        assert b.value == 8
+        assert b.get_value() == 8
         b = shared(numpy.float32([7.234]), allow_downcast=True)
         assert b.type == theano.tensor.fvector
         f(b,[8])
-        assert b.value == 8
+        assert b.get_value() == 8
         # numpy.float([7.234]) does not work
         # b = shared(numpy.float([7.234]))
@@ -299,10 +303,7 @@ class Test_SharedVariable(unittest.TestCase):
         b = shared(numpy.asarray([7.234],dtype=theano.config.floatX), allow_downcast=True)
         assert b.dtype == theano.config.floatX
         f(b,[8])
-        assert b.value == 8
+        assert b.get_value() == 8
         c = shared(numpy.zeros((5,5), dtype='float32'), allow_downcast=True)
         self.failUnlessRaises(TypeError, f, b, numpy.random.rand(5,5))
@@ -28,7 +28,7 @@ AddConfigVar('init_gpu_device',
         "Unlike 'device', setting this option will NOT move computations, "
         "nor shared variables, to the specified GPU. "
         "It can be used to run GPU-specific tests on a particular GPU."),
-        EnumStr('', 'gpu0', 'gpu1', 'gpu2', 'gpu3',
+        EnumStr('', 'gpu', 'gpu0', 'gpu1', 'gpu2', 'gpu3',
                 allow_override=False)
        )
@@ -49,14 +49,13 @@ try:
     subprocess.Popen('gcc', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
     # Keep the default linker the same as the one for the mode FAST_RUN
     AddConfigVar('linker',
-            "Default linker. If not None, will use this linker with the Mode "+
-            "object(not ProfileMode or DebugMode)",
+            "Default linker used if the theano flags mode is Mode or ProfileMode",
             EnumStr('c|py', 'py', 'c', 'c|py_nogc', 'c&py'))
 except OSError:
     # gcc is not present, linker should default to python only
     AddConfigVar('linker',
-            "Default linker. If not None, will use this linker with the Mode object(not ProfileMode or DebugMode)",
+            "Default linker used if the theano flags mode is Mode or ProfileMode",
             EnumStr('py', 'c|py', 'c', 'c|py_nogc', 'c&py'))
     warning('GCC not detected ! Theano will be unable to execute optimized '+
             'C-implementations (for both CPU and GPU) and will default to '+
             'Python implementations. Performance will be severely degraded.')
...
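As I read the config code above, `EnumStr` takes its default as the first argument and rejects values outside the listed set, which is why the `linker` default differs depending on whether gcc was found (`'c|py'` versus `'py'`). A stripped-down, hypothetical stand-in for that behaviour (not Theano's actual class):

```python
# Hypothetical mini-EnumStr: first argument is the default, all arguments
# form the set of allowed values; anything else is rejected.
class EnumStr:
    def __init__(self, default, *options):
        self.allowed = (default,) + options
        self.value = default

    def set(self, value):
        if value not in self.allowed:
            raise ValueError("%r not in %r" % (value, self.allowed))
        self.value = value

# Mirrors the gcc-available branch above: default is 'c|py'.
linker = EnumStr('c|py', 'py', 'c', 'c|py_nogc', 'c&py')
assert linker.value == 'c|py'
linker.set('py')
assert linker.value == 'py'
```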
-#For flag of bool type, we consider the string 'False','false' and '0' as False
-# and the string 'True', 'true', '1' as true.
-#We alsoaccept the bool type as its corresponding value!
-#Normally numpy consider only the empty string as false, but this give
-# impression that it work when it do different people expected.
+# For flags of bool type, we consider the strings 'False', 'false' and '0'
+# as False, and the strings 'True', 'true' and '1' as True.
+# We also accept the bool type as its corresponding value!

 import os, StringIO, sys
 import ConfigParser
 import logging
+import warnings

 _logger = logging.getLogger('theano.config')

+class TheanoConfigWarning(Warning):
+    def warn(cls, message, stacklevel=0):
+        warnings.warn(message, cls, stacklevel=stacklevel + 3)
+    warn = classmethod(warn)

+# Check for deprecated environment variables
 for key in os.environ:
     if key.startswith("THEANO"):
         if key not in ("THEANO_FLAGS", "THEANORC"):
-            print >> sys.stderr, "ERROR: Ignoring deprecated environment variable", key
+            TheanoConfigWarning.warn("Ignoring deprecated environment variable %s" % key)

-THEANO_FLAGS=os.getenv("THEANO_FLAGS","")
-# The THEANO_FLAGS environement variable should be a list of comma-separated
-# [section.]option[=value] entries. If the section part is omited, their should be only one
-# section with that contain the gived option.
+THEANO_FLAGS = os.getenv("THEANO_FLAGS", "")
+# The THEANO_FLAGS environment variable should be a list of comma-separated
+# [section.]option=value entries. If the section part is omitted, there should
+# be only one section that contains the given option.

+def parse_config_string(config_string, issue_warnings=True):
+    """
+    Parses a config string composed of comma-separated key=value components into a dict.
+    """
+    config_dict = {}
+    for kv_pair in config_string.split(','):
+        kv_pair = kv_pair.strip()
+        if not kv_pair:
+            continue
+        kv_tuple = kv_pair.split('=', 1)
+        if len(kv_tuple) == 1:
+            if issue_warnings:
+                TheanoConfigWarning.warn("Config key '%s' has no value, ignoring it" % kv_tuple[0], stacklevel=1)
+        else:
+            k, v = kv_tuple
+            # subsequent values for k will override earlier ones
+            config_dict[k] = v
+    return config_dict

+THEANO_FLAGS_DICT = parse_config_string(THEANO_FLAGS, issue_warnings=True)

 # THEANORC can contain a colon-delimited list of config files, like
 # THEANORC=~lisa/.theanorc:~/.theanorc
@@ -27,31 +53,17 @@ THEANO_FLAGS=os.getenv("THEANO_FLAGS","")
 # precedence over those in files on the left.
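The new `parse_config_string` can be exercised on its own. Here is the same logic as a standalone Python 3 sketch, with a plain `print` standing in for `TheanoConfigWarning`:

```python
# Standalone sketch of the parse_config_string added in this commit:
# split on commas, strip whitespace, skip empty components, warn about
# keys with no value, and let later occurrences of a key win.
def parse_config_string(config_string, issue_warnings=True):
    """Parse comma-separated key=value components into a dict."""
    config_dict = {}
    for kv_pair in config_string.split(','):
        kv_pair = kv_pair.strip()
        if not kv_pair:
            continue
        kv_tuple = kv_pair.split('=', 1)
        if len(kv_tuple) == 1:
            if issue_warnings:
                print("Config key '%s' has no value, ignoring it" % kv_tuple[0])
        else:
            k, v = kv_tuple
            # subsequent values for k override earlier ones
            config_dict[k] = v
    return config_dict

flags = parse_config_string("device=cpu, floatX=float32,device=gpu0,,badkey")
assert flags == {'device': 'gpu0', 'floatX': 'float32'}
```

Note that `device=gpu0` overrides the earlier `device=cpu`, the empty component is skipped, and `badkey` (no `=`) only triggers a warning.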
 def config_files_from_theanorc():
     rval = [os.path.expanduser(s) for s in os.getenv('THEANORC', '~/.theanorc').split(os.pathsep)]
     if os.getenv('THEANORC') is None and sys.platform == "win32":
-        #To don't need to change the filename and make it open easily
+        # Also look for a .txt variant, so the file keeps a familiar
+        # extension and opens easily on Windows.
         rval.append(os.path.expanduser('~/.theanorc.txt'))
     return rval

 theano_cfg = ConfigParser.SafeConfigParser({'USER':os.getenv("USER", os.path.split(os.path.expanduser('~'))[-1])})
 theano_cfg.read(config_files_from_theanorc())
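The THEANORC lookup above can be sketched with the environment passed in explicitly, which makes it testable without touching `os.environ`. This version hard-codes `':'` as the separator for determinism (the real code uses `os.pathsep`), and the paths are made up:

```python
import os.path

# Sketch of config_files_from_theanorc: THEANORC may hold a colon-delimited
# list of config files; on Windows without THEANORC set, a .txt variant of
# the default file is also searched.
def config_files_from_theanorc(environ, platform="linux"):
    raw = environ.get('THEANORC', '~/.theanorc')
    rval = [os.path.expanduser(s) for s in raw.split(':')]
    if 'THEANORC' not in environ and platform == "win32":
        rval.append(os.path.expanduser('~/.theanorc.txt'))
    return rval

paths = config_files_from_theanorc({'THEANORC': '/etc/theanorc:/home/user/.theanorc'})
assert paths == ['/etc/theanorc', '/home/user/.theanorc']
```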
-def parse_env_flags(flags, name , default_value=None):
-    #The value in the env variable THEANO_FLAGS override the previous value
-    val = default_value
-    for flag in flags.split(','):
-        if not flag:
-            continue
-        sp=flag.split('=',1)
-        if sp[0]==name:
-            if len(sp)==1:
-                val=True
-            else:
-                val=sp[1]
-    val=str(val)
-    return val
 def fetch_val_for_key(key):
     """Return the overriding config value for a key.

-    A successful search returs a string value.
+    A successful search returns a string value.
     An unsuccessful search raises a KeyError

     The (decreasing) priority order is:
@@ -61,23 +73,10 @@ def fetch_val_for_key(key):
     """

     # first try to find it in the FLAGS
-    rval = None
-    for name_val in THEANO_FLAGS.split(','):
-        if not name_val:
-            continue
-        name_val_tuple=name_val.split('=',1)
-        if len(name_val_tuple)==1:
-            name, val = name_val_tuple, str(True)
-        else:
-            name, val = name_val_tuple
-        if name == key:
-            # rval might be overriden by a later definition in THEANO_FLAGS
-            rval = val
-    # If an rval is found, it should be a string
-    if rval is not None:
-        return rval
+    try:
+        return THEANO_FLAGS_DICT[key]
+    except KeyError:
+        pass

     # next try to find it in the config file
...
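The simplified lookup above checks `THEANO_FLAGS_DICT` first and only falls back to the config file on a `KeyError`. The priority order in isolation, with made-up flag and config dicts standing in for the real sources:

```python
# Sketch of fetch_val_for_key's priority order: THEANO_FLAGS beats the
# config file; a miss in both propagates KeyError, as documented.
def fetch_val_for_key(key, flags_dict, config_dict):
    try:
        return flags_dict[key]          # first: THEANO_FLAGS
    except KeyError:
        pass
    return config_dict[key]             # next: the config file

flags = {'device': 'gpu0'}
cfg = {'device': 'cpu', 'floatX': 'float64'}
assert fetch_val_for_key('device', flags, cfg) == 'gpu0'   # flags win
assert fetch_val_for_key('floatX', flags, cfg) == 'float64'
```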
@@ -158,7 +158,7 @@ from opt import (Optimizer, optimizer, SeqOptimizer,
 from optdb import \
     DB, Query, \
-    EquilibriumDB, SequenceDB
+    EquilibriumDB, SequenceDB, ProxyDB

 from toolbox import \
     Bookkeeper, History, Validator, ReplaceValidate, NodeFinder, PrintListener
...
@@ -961,8 +961,8 @@ class CLinker(link.Linker):
             preargs.remove('-DREPLACE_WITH_AMDLIBM')
         if 'amdlibm' in libs:
             libs.remove('amdlibm')
+        try:
             module = c_compiler(
                     module_name=mod.name,
                     src_code = mod.code(),
                     location=location,
@@ -970,6 +970,9 @@ class CLinker(link.Linker):
                     lib_dirs=self.lib_dirs(),
                     libs=libs,
                     preargs=preargs)
+        except Exception, e:
+            e.args += (str(self.env),)
+            raise
         finally:
             release_lock()
...
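The `try/except` added above tags a failed compilation with the graph that triggered it before re-raising, so the traceback keeps its origin while gaining context. The same idiom in isolation (the error message and env string are invented for the example):

```python
# Annotate an exception with extra context and re-raise: e.args is extended,
# the original traceback is preserved by the bare `raise`.
def compile_module(src):
    raise RuntimeError("gcc failed")     # stand-in for c_compiler(...)

def link(src, env_repr):
    try:
        return compile_module(src)
    except Exception as e:
        e.args += (env_repr,)            # append context, keep the traceback
        raise

try:
    link("int main(){}", "<Env with 3 nodes>")
except RuntimeError as e:
    assert e.args == ("gcc failed", "<Env with 3 nodes>")
```

(The diff uses the Python 2 spelling `except Exception, e:`; the sketch uses the modern `except Exception as e:`.)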
"""Generate and compile C modules for Python, """Generate and compile C modules for Python,
""" """
import os, tempfile, StringIO, sys, logging, subprocess, cPickle, atexit, time, shutil, stat import os, tempfile, StringIO, sys, logging, subprocess, cPickle, atexit, time, shutil, stat
import distutils.sysconfig import distutils.sysconfig
...@@ -220,8 +220,8 @@ class ModuleCache(object): ...@@ -220,8 +220,8 @@ class ModuleCache(object):
will be deleted in an atexit() handler. will be deleted in an atexit() handler.
If the ``version`` is neither 0 nor (), then the module will be kept in the cache between If the ``version`` is neither 0 nor (), then the module will be kept in the cache between
processes. processes.
An unversioned module is not deleted by the process that creates it. Deleting such modules An unversioned module is not deleted by the process that creates it. Deleting such modules
does not work on NFS filesystems because the tmpdir in which the library resides is in use does not work on NFS filesystems because the tmpdir in which the library resides is in use
until the end of the process' lifetime. Instead, unversioned modules are left in their until the end of the process' lifetime. Instead, unversioned modules are left in their
...@@ -234,7 +234,7 @@ class ModuleCache(object): ...@@ -234,7 +234,7 @@ class ModuleCache(object):
module_from_name = {} module_from_name = {}
"""maps a module filename to the loaded module object""" """maps a module filename to the loaded module object"""
entry_from_key = {} entry_from_key = {}
"""Maps keys to the filename of a .so/.pyd. """Maps keys to the filename of a .so/.pyd.
""" """
...@@ -262,7 +262,7 @@ class ModuleCache(object): ...@@ -262,7 +262,7 @@ class ModuleCache(object):
self.entry_from_key = dict(self.entry_from_key) self.entry_from_key = dict(self.entry_from_key)
self.stats = [0, 0, 0] self.stats = [0, 0, 0]
if force_fresh is not None: if force_fresh is not None:
self.force_fresh = force_fresh self.force_fresh = force_fresh
self.loaded_key_pkl = set() self.loaded_key_pkl = set()
self.refresh() self.refresh()
@@ -294,6 +294,8 @@ class ModuleCache(object):
         Also, remove malformed cache directories.
         """
+        too_old_to_use = []
+
         compilelock.get_lock()
         try:
             # add entries that are not in the entry_from_key dictionary
@@ -316,11 +318,14 @@ class ModuleCache(object):
                     try:
                         entry = module_name_from_dir(root)
                     except ValueError: # there is a key but no dll!
-                        warning("ModuleCache.refresh() Found key without dll in cache, deleting it.", key_pkl)
+                        if not root.startswith("/tmp"):
+                            # Under /tmp, files are removed periodically by the OS,
+                            # so it is normal for this to happen from time to time.
+                            warning("ModuleCache.refresh() Found key without dll in cache, deleting it.", key_pkl)
                         info("Erasing broken cache directory", key_pkl)
                         shutil.rmtree(root)
                         continue
-                    if (time_now - last_access_time(module_name_from_dir(root)))<self.age_thresh_use:
+                    if (time_now - last_access_time(entry))<self.age_thresh_use:
                         debug('refresh adding', key_pkl)
                         try:
                             key = cPickle.load(open(key_pkl, 'rb'))
@@ -347,6 +352,9 @@ class ModuleCache(object):
                         # assert that we haven't already got this entry somehow
                         assert entry not in self.module_from_name
                         self.loaded_key_pkl.add(key_pkl)
+                    else:
+                        too_old_to_use.append(entry)

             # remove entries that are not in the filesystem
             items_copy = list(self.entry_from_key.iteritems())
@@ -359,7 +367,7 @@ class ModuleCache(object):
                     gone = True
                 if gone:
                     # assert that we didn't have one of the deleted files
                     # loaded up and in use.
                     # If so, it should not have been deleted. This should be considered a
                     # failure of the OTHER process, that deleted it.
                     if entry in self.module_from_name:
@@ -374,12 +382,17 @@ class ModuleCache(object):
                     # printing a warning, removing evidence that we ever saw this mystery
                     # key.
                     pkl_file_to_remove = os.path.join(os.path.dirname(entry), 'key.pkl')
-                    warning('Removing key file %s because the corresponding module is gone from the file system.' % pkl_file_to_remove)
+                    if not root.startswith("/tmp"):
+                        # Under /tmp, files are removed periodically by the OS,
+                        # so it is normal for this to happen from time to time.
+                        warning('Removing key file %s because the corresponding module is gone from the file system.' % pkl_file_to_remove)
                     self.loaded_key_pkl.remove(pkl_file_to_remove)
         finally:
             compilelock.release_lock()

+        return too_old_to_use
     def module_from_key(self, key, fn=None):
         """
         :param fn: a callable object that will return a module for the key (it is called only if the key isn't in
@@ -478,24 +491,28 @@ class ModuleCache(object):
         seconds ago will be erased.
         """
         if age_thresh_del is None:
             age_thresh_del = self.age_thresh_del
         compilelock.get_lock()
         try:
-            # update the age of modules that have been accessed by other processes
-            self.refresh()
+            # Update the age of modules that have been accessed by other processes,
+            # and get all modules too old to use (not loaded in self.entry_from_key).
+            too_old_to_use = self.refresh()
+            too_old_to_use = [(None, entry) for entry in too_old_to_use]
             time_now = time.time()
             # the .items() is important here:
             # we need to get a copy of the whole list of keys and entries
             items_copy = list(self.entry_from_key.iteritems())
-            for key, entry in items_copy:
+            for key, entry in items_copy + too_old_to_use:
                 age = time_now - last_access_time(entry)
                 if age > age_thresh_del:
                     # TODO: we are assuming that modules that haven't been accessed in over
                     # age_thresh_del are not currently in use by other processes, but that could be
                     # false for long-running jobs...
                     assert entry not in self.module_from_name
-                    del self.entry_from_key[key]
+                    if key is not None:
+                        del self.entry_from_key[key]
                     parent = os.path.dirname(entry)
                     assert parent.startswith(os.path.join(self.dirname, 'tmp'))
                     info("clear_old removing cache dir", parent)
@@ -516,7 +533,7 @@ class ModuleCache(object):
         filesystem.
         """
         items_copy = list(self.entry_from_key.iteritems())
         for key, entry in items_copy:
             version, rest = key
             if not version:
                 del self.entry_from_key[key]
@@ -525,13 +542,13 @@ class ModuleCache(object):
                 # because an unversioned entry should never have been loaded via refresh
                 assert entry in self.module_from_name
                 del self.module_from_name[entry]
                 parent = os.path.dirname(entry)
                 assert parent.startswith(os.path.join(self.dirname, 'tmp'))
                 info("clear_unversioned removing cache dir", parent)
                 _rmtree(parent)
         time_now = time.time()
         for filename in os.listdir(self.dirname):
             if filename.startswith('tmp'):
@@ -648,9 +665,9 @@ def gcc_module_compile_str(module_name, src_code, location=None, include_dirs=[]
     #TODO: Do not do the dlimport in this function
     if preargs is None:
         preargs = []
     else:
         preargs = list(preargs)
     if sys.platform != 'win32':
         # Under Windows it looks like fPIC is useless. Compiler warning:
@@ -658,7 +675,7 @@ def gcc_module_compile_str(module_name, src_code, location=None, include_dirs=[]
         preargs.append('-fPIC')
     no_opt = False
     include_dirs = include_dirs + std_include_dirs()
     libs = std_libs() + libs
     lib_dirs = std_lib_dirs() + lib_dirs
     if sys.platform == 'win32':
@@ -674,7 +691,7 @@ def gcc_module_compile_str(module_name, src_code, location=None, include_dirs=[]
     python_inc = distutils.sysconfig.get_python_inc()
     libname = os.path.basename(python_inc)
     #DSE Patch 1 for supporting OSX frameworks; add -framework Python
     if sys.platform=='darwin' :
         preargs.extend(['-undefined','dynamic_lookup'])
         # link with the framework library *if specifically requested*
@@ -688,11 +705,11 @@ def gcc_module_compile_str(module_name, src_code, location=None, include_dirs=[]
         preargs.extend(['-m%s' % n_bits])
         debug("OS X: compiling for %s bit architecture" % n_bits)
     # sometimes, the linker cannot find -lpython so we need to tell it
     # explicitly where it is located
     # this returns somepath/lib/python2.x
     python_lib = distutils.sysconfig.get_python_lib(plat_specific=1, \
             standard_lib=1)
     python_lib = os.path.dirname(python_lib)
     if python_lib not in lib_dirs:
         lib_dirs.append(python_lib)
@@ -722,7 +739,7 @@ def gcc_module_compile_str(module_name, src_code, location=None, include_dirs=[]
     #print >> sys.stderr, config.gcc.cxxflags.split(' ')
     cmd.extend(cxxflags)
     cmd.extend('-I%s'%idir for idir in include_dirs)
     cmd.extend(['-o',lib_filename])
     cmd.append(cppfilename)
     cmd.extend(['-L%s'%ldir for ldir in lib_dirs])
     cmd.extend(['-l%s'%l for l in libs])
@@ -747,4 +764,3 @@ def gcc_module_compile_str(module_name, src_code, location=None, include_dirs=[]
 def icc_module_compile_str(*args):
     raise NotImplementedError()
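`gcc_module_compile_str` assembles the compiler invocation as a list of arguments: `-I` include dirs, the `-o` output, the source file, then `-L` library dirs and `-l` libraries. A simplified sketch of that assembly, with illustrative flags and paths (not the exact Theano command line):

```python
# Sketch of the gcc command assembly in gcc_module_compile_str: the order of
# arguments (-I, -o, source, -L, -l) follows the snippet above.
def build_gcc_cmd(module_name, cppfilename, include_dirs, lib_dirs, libs, preargs):
    cmd = ['g++', '-shared'] + list(preargs)
    cmd.extend('-I%s' % d for d in include_dirs)
    cmd.extend(['-o', module_name + '.so'])
    cmd.append(cppfilename)
    cmd.extend('-L%s' % d for d in lib_dirs)
    cmd.extend('-l%s' % l for l in libs)
    return cmd

cmd = build_gcc_cmd('mod', 'mod.cpp', ['/usr/include/python2.7'],
                    ['/usr/lib'], ['python2.7'], ['-fPIC'])
assert '-I/usr/include/python2.7' in cmd
assert cmd[cmd.index('-o') + 1] == 'mod.so'
assert cmd[-1] == '-lpython2.7'
```

Building the command as a list (rather than a shell string) is what lets the real code pass it to `subprocess` without quoting issues.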
"""WRITEME""" """WRITEME"""
import sys import sys
if sys.version_info[:2] >= (2,5): if sys.version_info[:2] >= (2,5):
from collections import defaultdict from collections import defaultdict
# otherwise it's implemented in python25.py # otherwise it's implemented in python25.py
...@@ -12,7 +12,7 @@ from theano.gof import deque ...@@ -12,7 +12,7 @@ from theano.gof import deque
from env import InconsistencyError from env import InconsistencyError
class ProtocolError(Exception): class ProtocolError(Exception):
"""WRITEME""" """WRITEME"""
pass pass
...@@ -35,7 +35,7 @@ class DestroyHandler(object): ...@@ -35,7 +35,7 @@ class DestroyHandler(object):
def on_prune(self, env, op): def on_prune(self, env, op):
self.map[env].on_prune(env, op) self.map[env].on_prune(env, op)
def on_change_input(self, env, node, i, r, new_r): def on_change_input(self, env, node, i, r, new_r):
self.map[env].on_change_input(env, node, i, r, new_r) self.map[env].on_change_input(env, node, i, r, new_r)
...@@ -65,10 +65,10 @@ def _dfs_toposort(i, r_out, orderings): ...@@ -65,10 +65,10 @@ def _dfs_toposort(i, r_out, orderings):
iset = set(i) iset = set(i)
if 0: if 0:
def expand(obj): def expand(obj):
rval = [] rval = []
if obj not in iset: if obj not in iset:
if isinstance(obj, graph.Variable): if isinstance(obj, graph.Variable):
if obj.owner: if obj.owner:
rval = [obj.owner] rval = [obj.owner]
if isinstance(obj, graph.Apply): if isinstance(obj, graph.Apply):
...@@ -137,7 +137,7 @@ def getroot(r, view_i): ...@@ -137,7 +137,7 @@ def getroot(r, view_i):
For views: Return non-view variable which is ultimatly viewed by r. For views: Return non-view variable which is ultimatly viewed by r.
For non-views: return self. For non-views: return self.
""" """
try: try:
return getroot(view_i[r], view_i) return getroot(view_i[r], view_i)
except KeyError: except KeyError:
return r return r
...@@ -170,7 +170,7 @@ def fast_inplace_check(inputs): ...@@ -170,7 +170,7 @@ def fast_inplace_check(inputs):
protected_inputs = sum(protected_inputs,[])#flatten the list protected_inputs = sum(protected_inputs,[])#flatten the list
protected_inputs.extend(env.outputs) protected_inputs.extend(env.outputs)
inputs = [i for i in inputs if inputs = [i for i in inputs if
not isinstance(i,graph.Constant) not isinstance(i,graph.Constant)
and not env.destroyers(i) and not env.destroyers(i)
and i not in protected_inputs] and i not in protected_inputs]
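`getroot` above follows the `view_i` chain until it reaches a variable that is not itself a view, i.e. the variable that actually owns the storage. The same recursion, run on plain strings standing in for graph variables:

```python
# Standalone version of getroot: view_i maps each view to the variable it
# views; missing keys mean "not a view", so the recursion bottoms out there.
def getroot(r, view_i):
    """Return the non-view variable ultimately viewed by r (or r itself)."""
    try:
        return getroot(view_i[r], view_i)
    except KeyError:
        return r

# 'c' is a view of 'b', which is a view of 'a'; 'a' owns the storage.
view_i = {'c': 'b', 'b': 'a'}
assert getroot('c', view_i) == 'a'
assert getroot('a', view_i) == 'a'   # non-views return themselves
```

This is why destroying any view in a chain must be treated as destroying the root: every variable in the chain ultimately aliases the root's storage.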
@@ -185,7 +185,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
     When an Op uses its view_map property to declare that an output may be aliased
     to an input, then if that output is destroyed, the input is also considered to be
     destroyed. The view_maps of several Ops can feed into one another and form a directed graph.
     The consequence of destroying any variable in such a graph is that all variables in the graph
     must be considered to be destroyed, because they could all be referring to the same
     underlying storage. In the current implementation, that graph is a tree, and the root of
@@ -195,7 +195,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
     the foundation of that variable as being destroyed, with the `root_destroyer` property.
     """
     droot = {}
     """
     destroyed view + nonview variables -> foundation
     """
@@ -237,7 +237,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
         self.view_i = {} # variable -> variable used in calculation
         self.view_o = {} # variable -> set of variables that use this one as a direct input
         # clients: how many times does an apply use a given variable
         self.clients = {} # variable -> apply -> ninputs
         self.stale_droot = True
         self.debug_all_apps = set()
@@ -290,7 +290,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
         delattr(self.env, 'destroyers')
         delattr(self.env, 'destroy_handler')
         self.env = None
     def on_import(self, env, app):
         """Add Apply instance to set which must be computed"""
@@ -318,7 +318,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
         for i, output in enumerate(app.outputs):
             self.clients.setdefault(output, {})
         self.stale_droot = True
     def on_prune(self, env, app):
@@ -350,18 +350,18 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
                     self.view_o[i].remove(o)
                     if not self.view_o[i]:
                         del self.view_o[i]
         self.stale_droot = True
def on_change_input(self, env, app, i, old_r, new_r): def on_change_input(self, env, app, i, old_r, new_r):
"""app.inputs[i] changed from old_r to new_r """ """app.inputs[i] changed from old_r to new_r """
if app == 'output': if app == 'output':
# app == 'output' is special key that means Env is redefining which nodes are being # app == 'output' is special key that means Env is redefining which nodes are being
# considered 'outputs' of the graph. # considered 'outputs' of the graph.
pass pass
else: else:
if app not in self.debug_all_apps: raise ProtocolError("change without import") if app not in self.debug_all_apps: raise ProtocolError("change without import")
#UPDATE self.clients #UPDATE self.clients
self.clients[old_r][app] -= 1 self.clients[old_r][app] -= 1
if self.clients[old_r][app] == 0: if self.clients[old_r][app] == 0:
...@@ -388,7 +388,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper): ...@@ -388,7 +388,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
del self.view_o[old_r] del self.view_o[old_r]
self.view_o.setdefault(new_r,set()).add(output) self.view_o.setdefault(new_r,set()).add(output)
self.stale_droot = True self.stale_droot = True
    def validate(self, env):
...@@ -400,7 +400,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
        """
        #print '\nVALIDATE'
        if self.destroyers:
            try:
                ords = self.orderings(env)
            except Exception, e:
...@@ -423,18 +423,18 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
            pass
        return True

    def orderings(self, env):
        """Return orderings induced by destructive operations.

        Raise InconsistencyError when
        a) attempting to destroy an indestructible variable, or
        b) attempting to destroy a value multiple times, or
        c) an Apply destroys (illegally) one of its own inputs by aliasing
        """
        rval = {}

        if self.destroyers:
            # BUILD DATA STRUCTURES
            # CHECK for multiple destructions during construction of variables
...@@ -444,7 +444,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
            #print "view_i", self.view_i
            #print "view_o", self.view_o

            # check for destruction of constants
            illegal_destroy = [r for r in droot if
                    getattr(r.tag, 'indestructible', False) or
                    isinstance(r, graph.Constant)]
...@@ -472,7 +472,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
            # add_inplace(x, x). An Op that can still work in this case should
            # declare so via the 'tolerate_same' attribute.
            #
            # tolerate_same should be a list of pairs of the form
            # [(idx0, idx1), (idx0, idx2), ...]
            # The first element of each pair is the index of a destroyed
            # variable.
...@@ -491,7 +491,7 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
                for i, input in enumerate(app.inputs):
                    if input in root_impact \
                            and (i not in tolerated or input is not destroyed_variable):
                        raise InconsistencyError("Input aliasing: %s (%i, %i)"
                                % (app, destroyed_idx, i))

            # add the rule: app must be preceded by all other Apply instances that
...@@ -505,4 +505,3 @@ class DestroyHandlerHelper2(toolbox.Bookkeeper):
            rval[app] = root_clients
        return rval
...@@ -77,7 +77,7 @@ class Env(utils.object2):
        """
        Create an Env which operates on the subgraph bound by the inputs and outputs
        sets.

        This class keeps a pointer to the inputs and outputs, and also modifies them.

        #TODO: document what variables are[not] set in the env when a feature is added via the
...@@ -86,10 +86,10 @@ class Env(utils.object2):
        """
        self._features = []

        # All nodes in the subgraph defined by inputs and outputs are cached in nodes
        self.nodes = set()

        # Ditto for variables
        self.variables = set()
...@@ -136,7 +136,7 @@ class Env(utils.object2):
        """ WRITEME
        Cleans up all of this Env's nodes and variables so they are not
        associated with this Env anymore.

        The Env should not be used anymore after disown is called.

        This may not clean everything this Env's features set in the
...@@ -232,7 +232,7 @@ class Env(utils.object2):
                raise Exception("%s is already owned by another env" % r)
            if r.owner is None and not isinstance(r, graph.Value) and r not in self.inputs:
                raise TypeError("An input of the graph was not provided and not given a value", r)

        for node in new_nodes:
            assert node not in self.nodes
            self.__setup_node__(node)
...@@ -274,7 +274,7 @@ class Env(utils.object2):
        self.nodes.remove(node)
        self.variables.difference_update(node.outputs)
        self.execute_callbacks('on_prune', node)

        for i, input in enumerate(node.inputs):
            self.__remove_clients__(input, [(node, i)])
        #self.__prune_r__(node.inputs)
...@@ -306,7 +306,7 @@ class Env(utils.object2):
        if not r.type == new_r.type:
            raise TypeError("The type of the replacement must be the same as the type of the original Variable.", r, new_r)

        node.inputs[i] = new_r
        self.__import_r__([new_r])
        self.__add_clients__(new_r, [(node, i)])
        prune = self.__remove_clients__(r, [(node, i)], False)
...@@ -348,7 +348,7 @@ class Env(utils.object2):
    ### features ###

    def extend(self, feature):
        """WRITEME
        Adds a feature to this env. The feature may define one
...@@ -358,7 +358,7 @@ class Env(utils.object2):
        if feature in self._features:
            return  # the feature is already present
        attach = getattr(feature, 'on_attach', None)
        if attach is not None:
            try:
                attach(self)
            except toolbox.AlreadyThere:
...@@ -381,7 +381,7 @@ class Env(utils.object2):
    ### callback utils ###

    def execute_callbacks(self, name, *args, **kwargs):
        """WRITEME
        Calls
...@@ -446,7 +446,7 @@ class Env(utils.object2):
            ords.setdefault(op, []).extend(prereqs)
        order = graph.io_toposort(env.inputs, env.outputs, ords)
        return order

    def nclients(self, r):
        """WRITEME Same as len(self.clients(r))."""
        return len(self.clients(r))
...@@ -523,10 +523,3 @@ class Env(utils.object2):
        for feature in self._features:
            e.extend(feature)
        return e, equiv
...@@ -417,8 +417,10 @@ def stack_search(start, expand, mode='bfs', build_inv = False):
        raise ValueError('mode should be bfs or dfs', mode)
    rval_set = set()
    rval_list = list()
-    if mode is 'bfs': start_pop = start.popleft
-    else: start_pop = start.pop
+    if mode == 'bfs':
+        start_pop = start.popleft
+    else:
+        start_pop = start.pop
    expand_inv = {}
    while start:
        l = start_pop()
...
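The two-line fix in this hunk replaces an `is` string comparison (which only works by accident of CPython string interning) with `==`. As a self-contained sketch of the surrounding traversal — assuming a plain adjacency mapping instead of a Theano graph, and keying the seen-set on `id()` as the surrounding code suggests — the routine behaves roughly like this:

```python
from collections import deque

def stack_search(start, expand, mode='bfs'):
    """Traverse a graph breadth- or depth-first.

    `start` is a deque of initial nodes; `expand(node)` returns an
    iterable of neighbours (or a falsy value for leaves).  Returns the
    list of reached nodes in visiting order.
    """
    if mode not in ('bfs', 'dfs'):
        raise ValueError('mode should be bfs or dfs', mode)
    rval_set = set()
    rval_list = []
    # BFS consumes from the left of the deque, DFS from the right.
    start_pop = start.popleft if mode == 'bfs' else start.pop
    while start:
        l = start_pop()
        if id(l) not in rval_set:
            rval_list.append(l)
            rval_set.add(id(l))
            expand_l = expand(l)
            if expand_l:
                start.extend(expand_l)
    return rval_list
```

For example, on the graph `{1: [2, 3], 2: [4]}`, BFS from 1 visits `[1, 2, 3, 4]` while DFS visits `[1, 3, 2, 4]`.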
...@@ -85,7 +85,7 @@ class Linker(object):
        of that env. If inplace is True, the calculations will operate in the
        same storage the env uses, else independent storage will be allocated
        for the function.

        Example::
            e = x + y
            env = Env([x, y], [e])
...@@ -114,13 +114,13 @@ class Linker(object):
        execute.thunk = thunk
        execute.inputs = inputs
        execute.outputs = outputs
        return execute


#TODO: Move this class to the compile module, where it is used (and for which it exists).
class Container(object):
    """This class joins a variable with its computed value.

    It is used in linkers, especially for the inputs and outputs of a Function.
    """
    def __init__(self, r, storage, readonly=False, strict=False,
...@@ -146,7 +146,7 @@ class Container(object):
        else:
            self.type = r.type
        if name is None:
            self.name = r.name
        self.storage = storage
        self.readonly = readonly
...@@ -195,7 +195,7 @@ def map_storage(env, order, input_storage, output_storage):
    :rtype: 3-tuple
    :returns: (list of storage for inputs, list of storage for outputs, and the `storage_map`)

    This function iterates over the nodes in `order` and ensures that for every
    input and output `Variable`, there is a unique storage container. This is
...@@ -258,22 +258,22 @@ def streamline(env, thunks, order, post_thunk_old_storage = None, no_recycling =
    :param no_recycling: storage elements that cannot be 'recycled' by repeatedly executing the
        program. These storage elements are cleared before re-running.
    :param profiler: deprecated

    :param nice_errors: run in such a way that the double-traceback is printed. This costs a
        bit of performance in the inner python loop.
    """
    if profiler is not None:
        raise NotImplementedError()
    if len(thunks) != len(order):
        raise ValueError('Length of thunks and order must match',
                (len(thunks), len(order)))

    if post_thunk_old_storage:
        if len(thunks) != len(post_thunk_old_storage):
            raise ValueError('Length of thunks and post_thunk_old_storage must match',
                    (len(thunks), len(post_thunk_old_storage)))

    def streamline_default_f():
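The length checks above guard the default runner whose body is collapsed in this hunk. A minimal sketch of what it does, assuming the one-element-list storage convention used by `map_storage` (the helper name below is hypothetical, not the actual nested function):

```python
def run_streamlined(thunks, post_thunk_old_storage):
    # Call each compiled thunk in topological order; after each one,
    # clear the one-element storage cells whose last user has now run,
    # so large intermediate results can be garbage-collected early.
    for thunk, old_storage in zip(thunks, post_thunk_old_storage):
        thunk()
        for cell in old_storage:
            cell[0] = None
```

Each storage cell is a one-element list, so clearing `cell[0]` drops the reference without invalidating the cell itself, which later calls reuse.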
...@@ -319,10 +319,10 @@ class LocalLinker(Linker):
        return self.make_all(profiler = profiler,
                input_storage = input_storage,
                output_storage = output_storage)[:3]

    def make_all(self, profiler, input_storage, output_storage):
        # By convention, subclasses of LocalLinker should implement this function!
        #
        # This function should return a tuple of 5 things
        # 1. function to run the program
        # 2. input storage
...@@ -338,7 +338,7 @@ def gc_helper(node_list):
    :rtype: a 2-tuple
    :returns: FIRST, the set of Variable instances which are computed by node_list, and SECOND a
        dictionary that maps each Variable instance to the last node to use it as an input.

    This is used to allow garbage collection within graphs.
    """
    # for freeing memory
...@@ -350,7 +350,7 @@ def gc_helper(node_list):
        for output in node.outputs:
            computed.add(output)
    return computed, last_user
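Pieced together from the visible lines, the whole helper is essentially the following sketch (the `Node` stand-in in the example is an assumption; real callers pass Apply instances with `inputs` and `outputs` attributes):

```python
def gc_helper(node_list):
    """Return (computed, last_user): the set of variables produced by the
    nodes in node_list, and a map from each input variable to the last
    node that consumes it, so its storage can be freed after that node."""
    computed = set()
    last_user = {}
    for node in node_list:
        for input in node.inputs:
            # later nodes overwrite earlier entries, leaving the last user
            last_user[input] = node
        for output in node.outputs:
            computed.add(output)
    return computed, last_user
```

This is exactly the map `PerformLinker` consults below when building `post_thunk_old_storage`: a variable's storage is cleared right after its last user runs.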
class PerformLinker(LocalLinker):
    """WRITEME
...@@ -366,7 +366,7 @@ class PerformLinker(LocalLinker):
    def accept(self, env, no_recycling = []):
        """
        :param env: a PerformLinker can accept only one Env instance at a time.

        :param no_recycling: WRITEME

        :returns: self (TODO: WHY? Who calls this function?)
...@@ -415,11 +415,11 @@ class PerformLinker(LocalLinker):
            thunks.append(thunk)
            if self.allow_gc:
                post_thunk_old_storage.append([storage_map[input]
                    for input in node.inputs
                    if (input in computed) and (input not in env.outputs) and node == last_user[input]])

        if no_recycling is True:
            # True seems like some special code for *everything*?? -JB
            # FunctionMaker always passes a list I think -JB
            no_recycling = storage_map.values()
...@@ -429,7 +429,7 @@ class PerformLinker(LocalLinker):
        # The function that actually runs your program is one of the f's in streamline.
        f = streamline(env, thunks, order, post_thunk_old_storage, no_recycling = no_recycling, profiler = profiler)
        f.allow_gc = self.allow_gc  #HACK: this is a way of passing an arg to Function.__call__
        add_clear_storage(f, computed, storage_map)
...@@ -490,12 +490,12 @@ class WrapLinker(Linker):
        @type env: gof.Env
        @param env: the env which we will link

        @type no_recycling: a list of Variables that belong to env.
        @param no_recycling: If a Variable is in no_recycling, L{WrapLinker} will clear
            the output storage associated to it (for each linker in linkers) during
            the computation to avoid reusing it.
        """
        if self.env is not None and self.env is not env:
            return type(self)(self.linkers, self.wrapper).accept(env, no_recycling)
...@@ -562,5 +562,3 @@ def WrapLinkerMany(linkers, wrappers):
        for f in wrappers:
            f(*args)
    return WrapLinker(linkers, wrapper)
...@@ -42,10 +42,10 @@ class CLinkerObject(object):
        Provide search paths for headers, in addition to those in any relevant environment
        variables.

        Hint: for unix compilers, these are the things that get '-I' prefixed in the compiler
        cmdline.

        :Exceptions:
         - `MethodNotDefined`: Subclass does not implement this method
...@@ -78,10 +78,10 @@ class CLinkerObject(object):
        Provide search paths for libraries, in addition to those in any relevant environment
        variables (e.g. LD_LIBRARY_PATH).

        Hint: for unix compilers, these are the things that get '-L' prefixed in the compiler
        cmdline.

        :Exceptions:
         - `MethodNotDefined`: Subclass does not implement this method
...@@ -148,9 +148,9 @@ class CLinkerObject(object):
    def c_no_compile_args(self):
        """Optional: Return a list of incompatible gcc compiler arguments.

        We will remove those arguments from the command line of gcc. So if
        another Op adds a compile arg in the graph that is incompatible
        with this Op, the incompatible arg will not be used.
        Useful for instance to remove -ffast-math.

        EXAMPLE
...@@ -178,7 +178,7 @@ class CLinkerOp(CLinkerObject):
        Returns C code that does the computation associated to this `Op`,
        given names for the inputs and outputs.

        :Parameters:
         `node` : Apply instance
            WRITEME
...@@ -207,8 +207,8 @@ class CLinkerOp(CLinkerObject):
        QUESTION: is this function optional?

        This is a convenient place to clean up things allocated by c_code().

        :Parameters:
         `node` : Apply instance
            WRITEME
...@@ -224,7 +224,7 @@ class CLinkerOp(CLinkerObject):
         `sub` : dict of strings
            extra symbols defined in `CLinker` sub symbols (such as 'fail').
            WRITEME
            WRITEME

        :Exceptions:
...@@ -256,11 +256,11 @@ class PureOp(object):
    An :term:`Op` is a type of operation.

    `Op` is an abstract class that documents the interface for theano's data transformations.
    It has many subclasses, such as
    `sparse dot <http://pylearn.org/epydoc/theano.sparse.Dot-class.html>`__,
    and `Shape <http://pylearn.org/epydoc/theano.tensor.Shape-class.html>`__.

    These subclasses are meant to be instantiated.
    An instance has several responsibilities:

    - making `Apply` instances, which mean "apply this type of operation to some particular inputs" (via `make_node`),
...@@ -278,7 +278,7 @@ class PureOp(object):
    """

    default_output = None
    """
    configuration variable for `__call__`

    A subclass should not change this class variable, but instead over-ride it with a subclass
...@@ -304,7 +304,7 @@ class PureOp(object):
        raise utils.MethodNotDefined("make_node", type(self), self.__class__.__name__)

    def __call__(self, *inputs, **kwargs):
        """Optional: Return some or all output[s] of `make_node`.

        It is called by code such as:
...@@ -313,8 +313,8 @@ class PureOp(object):
            x = tensor.matrix()

            # tensor.exp is an Op instance, calls Op.__call__(self=<instance of exp>, inputs=(x,))
            y = tensor.exp(x)

        This class implements a convenience function (for graph-building) which uses
        `default_output`, but subclasses are free to override this function and ignore
        `default_output`.
...@@ -344,7 +344,7 @@ class PureOp(object):
        output storage. Return None.

        :Parameters:
         `node` : Apply instance
            contains the symbolic inputs and outputs
         `inputs` : list
            sequence of inputs (immutable)
...@@ -362,6 +362,6 @@ class PureOp(object):
        """
        raise utils.MethodNotDefined("perform", type(self), self.__class__.__name__)


class Op(utils.object2, PureOp, CLinkerOp):
    """Convenience class to bundle `PureOp` and `CLinkerOp`"""
    pass
...@@ -94,7 +94,7 @@ class FromFunctionOptimizer(Optimizer):
        env.extend(toolbox.ReplaceValidate())

    def print_summary(self, stream=sys.stdout, level=0):
        print >> stream, "%s%s id=%i" % (' ' * level,
                str(self.apply),
                id(self))
...@@ -236,7 +236,7 @@ class _metadict:
class MergeOptimizer(Optimizer):
    """
    Merges parts of the graph that are identical and redundant.

    The basic principle is that if two Applies have ops that compare equal, and identical
    inputs, then they do not both need to be computed. The clients of one are transferred to
    the other and one of them is removed from the graph. This procedure is carried out in
...@@ -264,9 +264,9 @@ class MergeOptimizer(Optimizer):
                sig = c.signature()
                other_c = const_sig_inv.get(sig, None)
                if other_c is not None:
                    # multiple names will clobber each other..
                    # we adopt the convention of keeping the last name
                    if c.name:
                        other_c.name = c.name
                    env.replace_validate(c, other_c, reason='Constant Merge')
                else:
...@@ -286,7 +286,7 @@ class MergeOptimizer(Optimizer):
            # should at least contain `node` itself!
            #
            if node.inputs:
                assert len(node.inputs[0].clients) > 0
                assert (node, 0) in node.inputs[0].clients

                merge_candidates = [(nodes_seen[c], c) for (c, i) in node.inputs[0].clients if c in nodes_seen]
            else:
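The constant-merge loop in this hunk can be sketched in isolation. In the hypothetical helper below, the `Const` stand-in and the tuple signature are assumptions for illustration; the real code uses `c.signature()` and applies replacements through `env.replace_validate`:

```python
def merge_constants(constants):
    """Group constants by signature; return a map from each duplicate to
    the canonical (first-seen) instance it should be replaced by."""
    const_sig_inv = {}
    replacements = {}
    for c in constants:
        sig = (type(c.value), c.value)  # stand-in for c.signature()
        other_c = const_sig_inv.get(sig, None)
        if other_c is not None:
            # multiple names will clobber each other; keep the last name
            if c.name:
                other_c.name = c.name
            replacements[c] = other_c
        else:
            const_sig_inv[sig] = c
    return replacements
```

Replacing duplicates with one canonical constant is what lets the later Apply-merging step see "identical inputs" and fold identical nodes together.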
...@@ -352,7 +352,7 @@ def MergeOptMerge(opt):
class LocalOptimizer(object):
    """A class for node-based optimizations.

    Instances should implement the transform function,
    and be passed to configure an env-based Optimizer instance.
    """
...@@ -396,7 +396,7 @@ class FromFunctionLocalOptimizer(LocalOptimizer):
    def __str__(self):
        return getattr(self, '__name__', '<FromFunctionLocalOptimizer instance>')

    def print_summary(self, stream=sys.stdout, level=0):
        print >> stream, "%s%s id=%i" % (' ' * level,
                str(self.transform),
                id(self))
...@@ -439,7 +439,7 @@ class _LocalOpKeyOptGroup(LocalOptGroup):
        if any(not hasattr(opt, 'op_key') for opt in optimizers):
            raise TypeError("All LocalOptimizers passed here must have an op_key method.")
        CompositeLocalOptimizer.__init__(self, optimizers)

    def op_key(self):
        return [opt.op_key() for opt in self.opts]
...@@ -510,8 +510,8 @@ class OpRemove(LocalOptimizer):
        return "%s(x) -> x" % (self.op)

    def print_summary(self, stream=sys.stdout, level=0):
        print >> stream, "%s%s(%s) id=%i" % (' ' * level,
                self.__class__.__name__,
                str(self.op),
                id(self))
...@@ -519,7 +519,7 @@ class OpRemove(LocalOptimizer):
class PatternSub(LocalOptimizer):
    """WRITEME
    @todo update

    Replaces all occurrences of the input pattern by the output pattern:

        input_pattern ::= (op, <sub_pattern1>, <sub_pattern2>, ...)
...@@ -531,7 +531,7 @@ class PatternSub(LocalOptimizer):
        sub_pattern ::= int
        sub_pattern ::= float
        constraint ::= lambda env, expr: additional matching condition

        output_pattern ::= (op, <output_pattern1>, <output_pattern2>, ...)
        output_pattern ::= string
        output_pattern ::= int
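A hedged sketch of the matching side of this grammar, using plain nested tuples for expressions. The real PatternSub walks Apply nodes and also supports `constraint` callables, both omitted here:

```python
def match(pattern, expr, bindings=None):
    """Match a PatternSub-style input pattern against an expression tree.

    Tuples are (op, sub_pattern, ...); strings are variables that bind
    to whole subtrees; any other value must compare equal to the
    expression.  Returns the bindings dict, or None on failure.
    """
    if bindings is None:
        bindings = {}
    if isinstance(pattern, tuple):
        # op must match and arities must agree
        if not (isinstance(expr, tuple) and expr[0] == pattern[0]
                and len(expr) == len(pattern)):
            return None
        for p, e in zip(pattern[1:], expr[1:]):
            if match(p, e, bindings) is None:
                return None
        return bindings
    if isinstance(pattern, str):
        # a variable: bind it, or check consistency with a prior binding
        if pattern in bindings:
            return bindings if bindings[pattern] == expr else None
        bindings[pattern] = expr
        return bindings
    # int/float leaf: must be equal
    return bindings if pattern == expr else None
```

For example, `match(('add', 'x', ('mul', 'y', 2)), ('add', 5, ('mul', 7, 2)))` binds `x` to 5 and `y` to 7; the bindings would then be substituted into the output pattern to build the replacement.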
...@@ -574,7 +574,7 @@ class PatternSub(LocalOptimizer): ...@@ -574,7 +574,7 @@ class PatternSub(LocalOptimizer):
:param in_pattern: the input pattern that we want to replace :param in_pattern: the input pattern that we want to replace
:param out_pattern: the replacement pattern :param out_pattern: the replacement pattern
:param allow_multiple_clients: if False, the pattern matching will fail :param allow_multiple_clients: if False, the pattern matching will fail
if one of the subpatterns has more than if one of the subpatterns has more than
one client. one client.
:param pdb: if True, we invoke pdb when the first node in the pattern match. :param pdb: if True, we invoke pdb when the first node in the pattern match.
""" """
@@ -705,8 +705,8 @@ class PatternSub(LocalOptimizer):
        return str(self)
    def print_summary(self, stream=sys.stdout, level=0):
        print >> stream, "%s%s(%s, %s) id=%i" % (' ' * level,
                                                 self.__class__.__name__,
                                                 str(self.in_pattern),
                                                 str(self.out_pattern),
                                                 id(self))
@@ -721,7 +721,7 @@ class PatternSub(LocalOptimizer):
class NavigatorOptimizer(Optimizer):
    """Abstract class
    """
    @staticmethod
    def warn(exc, nav, repl_pairs, local_opt):
@@ -748,14 +748,14 @@ class NavigatorOptimizer(Optimizer):
    def __init__(self, local_opt, ignore_newtrees='auto', failure_callback=None):
        """
        :param local_opt: a LocalOptimizer to apply over an Env (or None is OK too).
        :param ignore_newtrees:
            - True: new subgraphs returned by an optimization are not candidates for optimization
            - False: new subgraphs returned by an optimization are candidates for optimization
            - 'auto': let the local_opt set this parameter via its 'reentrant' attribute.
        :param failure_callback:
            a function that takes (exception, navigator, [(old, new),
            (old, new), ...]) and we call it if there's an exception.
            If the trouble is from local_opt.transform(), the new variables will be None.
            If the trouble is from validation (the new types don't match for
@@ -896,7 +896,7 @@ class TopoOptimizer(NavigatorOptimizer):
            if node is not current_node:
                try: q.remove(node)
                except ValueError: pass
        u = self.attach_updater(env, importer, pruner)
        try:
            while q:
@@ -920,7 +920,7 @@ class OpKeyOptimizer(NavigatorOptimizer):
        if not hasattr(local_opt, 'op_key'):
            raise TypeError("LocalOptimizer for OpKeyOptimizer must have an 'op_key' method.")
        NavigatorOptimizer.__init__(self, local_opt, ignore_newtrees, failure_callback)
    def apply(self, env):
        op = self.local_opt.op_key()
        if isinstance(op, (list, tuple)):
@@ -961,19 +961,19 @@ from utils import D
class ChangeTracker:
    def __init__(self):
        self.changed = False
    def on_import(self, env, node):
        self.changed = True
    def on_change_input(self, env, node, i, r, new_r):
        self.changed = True
    def reset(self):
        self.changed = False
    def on_attach(self, env):
        env.change_tracker = self
class EquilibriumOptimizer(NavigatorOptimizer):
    def __init__(self,
                 optimizers,
@@ -1026,7 +1026,7 @@ class EquilibriumOptimizer(NavigatorOptimizer):
                gopt.apply(env)
                if env.change_tracker.changed:
                    changed = True
            #apply local optimizer
            for node in start_from:
                assert node in env.outputs
@@ -1041,7 +1041,7 @@ class EquilibriumOptimizer(NavigatorOptimizer):
                if node is not current_node:
                    try: q.remove(node)
                    except ValueError: pass
            u = self.attach_updater(env, importer, pruner)
            try:
                while q:
@@ -1098,7 +1098,10 @@ def _check_chain(r, chain):
        r = r.owner.inputs[chain.pop()]
    #print 'check_chain', _check_chain.n_calls
    #_check_chain.n_calls += 1
-    return r
+    # The return value will be used as a Boolean, but some Variables cannot
+    # be used as Booleans (the results of comparisons, for instance)
+    return (r is not None)
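The change above replaces `return r` with `return (r is not None)` because the caller uses the result in a Boolean context. A minimal sketch of why that matters (`SymbolicVar` is a hypothetical stand-in for a symbolic variable, like the result of a Theano comparison, that refuses Boolean coercion):

```python
class SymbolicVar:
    """Stand-in for a symbolic variable with no fixed truth value."""
    def __bool__(self):
        raise TypeError("symbolic variables have no fixed truth value")

def check_found(r):
    # Wrong: `if r:` would call bool(r) and raise for SymbolicVar.
    # Right: test identity against None, which always yields a real bool.
    return r is not None
```

With the identity test, `if check_found(...):` is safe no matter what object the chain walk returned.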
#_check_chain.n_calls = 0
def check_chain(r, *chain):
@@ -1137,6 +1140,3 @@ class PureThenInplaceOptimizer(Optimizer):
        self.pure(env)
        env.extend(dh.DestroyHandler())
        self.inplace(env)
@@ -32,12 +32,18 @@ class DB(object):
        # this is not always the case.
        if not isinstance(obj, (DB, opt.Optimizer, opt.LocalOptimizer)):
            raise TypeError('Object cannot be registered in OptDB', obj)
+        if name in self.__db__:
+            raise ValueError('The name of the object cannot be an existing tag or the name of another existing object.', obj, name)
+        # This restriction is there because in many places we suppose that
+        # something in the DB is there only once.
+        if getattr(obj, 'name', "") in self.__db__:
+            raise ValueError('''You can't register the same optimization
+multiple times in a DB. Tried to register "%s" again under the new name "%s".
+Use theano.gof.ProxyDB to work around that''' % (obj.name, name))
        if self.name is not None:
            tags = tags + (self.name,)
        obj.name = name
-        if name in self.__db__:
-            raise ValueError('The name of the object cannot be an existing tag or the name of another existing object.', obj, name)
        self.__db__[name] = set([obj])
        self._names.add(name)
@@ -223,3 +229,15 @@ class SequenceDB(DB):
        return sio.getvalue()
+class ProxyDB(DB):
+    """
+    This is needed as we can't register the same DB multiple times at
+    different positions in a SequenceDB.
+    """
+    def __init__(self, db):
+        assert isinstance(db, DB), ""
+        self.db = db
+    def query(self, *tags, **kwtags):
+        return self.db.query(*tags, **kwtags)
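The DB/ProxyDB pair above can be sketched with a minimal registry (names here are illustrative, not Theano's API): registering an already-registered object under a new name is rejected, and a proxy wrapper lets the same underlying object appear in several positions because the proxy carries its own `name`.

```python
class Registry:
    """Toy registry: one name per object, one object per name."""
    def __init__(self):
        self._db = {}
    def register(self, name, obj):
        if name in self._db:
            raise ValueError("name already used", name)
        if getattr(obj, "name", None) in self._db:
            raise ValueError("object already registered; wrap it in a Proxy", name)
        obj.name = name
        self._db[name] = obj

class Proxy:
    """Forwards to a wrapped object without sharing its name attribute."""
    def __init__(self, obj):
        self.wrapped = obj

class Opt:
    pass
```

Registering `Proxy(o)` succeeds where registering `o` twice fails, which mirrors the "Use theano.gof.ProxyDB to work around that" message in the diff.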
@@ -24,7 +24,7 @@ class TDouble(Type):
        return """
        %(name)s = 0;
        %(name)s_bad_thing = malloc(100000);
        //printf("Initializing %(name)s\\n");
        """ % locals()
    def c_literal(self, data):
@@ -40,7 +40,7 @@ class TDouble(Type):
        %(name)s_bad_thing = NULL;
        //printf("Extracting %(name)s\\n");
        """ % dict(locals(), **sub)
    def c_sync(self, name, sub):
        return """
        Py_XDECREF(py_%(name)s);
@@ -71,7 +71,7 @@ class MyOp(Op):
    def __init__(self, nin, name):
        self.nin = nin
        self.name = name
    def make_node(self, *inputs):
        assert len(inputs) == self.nin
        inputs = map(as_variable, inputs)
@@ -83,8 +83,9 @@ class MyOp(Op):
    def __str__(self):
        return self.name
-    def perform(self, node, inputs, (out, )):
+    def perform(self, node, inputs, out_):
+        out, = out_
        out[0] = self.impl(*inputs)
    def c_code_cache_version(self):
        return ()
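The recurring signature change in this commit, from `def perform(self, node, inputs, (out, )):` to `out_` plus an explicit `out, = out_`, removes tuple parameter unpacking, a Python 2 feature dropped from the language by PEP 3113. Both spellings behave identically where the old one still parses; a sketch (`sum(inputs)` stands in for `self.impl(*inputs)`):

```python
def perform(node, inputs, out_):
    out, = out_           # was written as a tuple parameter: (out,)
    out[0] = sum(inputs)  # stand-in for self.impl(*inputs)
```

The output storage is still a one-element list inside a one-tuple, exactly as before; only where the unpacking happens changes.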
@@ -98,30 +99,38 @@ class Binary(MyOp):
    def __init__(self):
        MyOp.__init__(self, 2, self.__class__.__name__)
class Add(Binary):
-    def c_code(self, node, name, (x, y), (z, ), sub):
+    def c_code(self, node, name, inp, out, sub):
+        x, y = inp
+        z, = out
        return "%(z)s = %(x)s + %(y)s;" % locals()
    def impl(self, x, y):
        return x + y
add = Add()
class Sub(Binary):
-    def c_code(self, node, name, (x, y), (z, ), sub):
+    def c_code(self, node, name, inp, out, sub):
+        x, y = inp
+        z, = out
        return "%(z)s = %(x)s - %(y)s;" % locals()
    def impl(self, x, y):
        return -10 # erroneous (most of the time)
sub = Sub()
class Mul(Binary):
-    def c_code(self, node, name, (x, y), (z, ), sub):
+    def c_code(self, node, name, inp, out, sub):
+        x, y = inp
+        z, = out
        return "%(z)s = %(x)s * %(y)s;" % locals()
    def impl(self, x, y):
        return x * y
mul = Mul()
class Div(Binary):
-    def c_code(self, node, name, (x, y), (z, ), sub):
+    def c_code(self, node, name, inp, out, sub):
+        x, y = inp
+        z, = out
        return "%(z)s = %(x)s / %(y)s;" % locals()
    def impl(self, x, y):
        return x / y
@@ -185,7 +194,7 @@ def test_clinker_dups_inner():
    lnk = CLinker().accept(Env([x, y, z], [e]))
    fn = lnk.make_function()
    assert fn(1.0, 2.0, 3.0) == 8.0
######################
@@ -254,9 +263,11 @@ def test_duallinker_mismatch():
################################
# Test that failure code works #
################################
class AddFail(Binary):
-    def c_code(self, node, name, (x, y), (z, ), sub):
+    def c_code(self, node, name, inp, out, sub):
+        x, y = inp
+        z, = out
        fail = sub['fail']
        return """%(z)s = %(x)s + %(y)s;
        PyErr_SetString(PyExc_RuntimeError, "failing here");
......
@@ -32,7 +32,7 @@ class MyOp(Op):
        self.name = name
        if impl:
            self.impl = impl
    def make_node(self, *inputs):
        assert len(inputs) == self.nin
        inputs = map(as_variable, inputs)
@@ -44,8 +44,9 @@ class MyOp(Op):
    def __str__(self):
        return self.name
-    def perform(self, node, inputs, (out, )):
+    def perform(self, node, inputs, out_):
+        out, = out_
        out[0] = self.impl(*inputs)
add = MyOp(2, 'Add', lambda x, y: x + y)
@@ -85,7 +86,7 @@ class TestPerformLinker:
        i[1].data = 2
        fn()
        assert o[0].data == 1.5
    def test_function(self):
        x, y, z = inputs()
        e = mul(add(x, y), div(x, y))
@@ -130,7 +131,7 @@ class TestWrapLinker:
        nodes = []
        def wrap(i, node, th):
            nodes.append(node.op)
        x, y, z = inputs()
        e = mul(add(x, y), div(x, y))
        fn, i, o = wrap_linker(Env([x, y, z], [e]), [PerformLinker(allow_gc=False)], wrap).make_thunk()
@@ -154,8 +155,8 @@ class TestWrapLinker:
        fn()
        assert nodes == [div, add, mul]
        assert o[0].data == 1.5
@@ -3,6 +3,7 @@
#C=a*C+dot(A,B)*b
#A,B,C matrix
#a,b scalar
+import os
s="""
result for shapes=(2000,2000) and iters=100
@@ -32,6 +33,10 @@ def execute(execute=True, verbose=True):
        print ' blas.ldflags=', theano.config.blas.ldflags
        print ' compiledir=', theano.config.compiledir
        print ' floatX=', theano.config.floatX
+        print 'Some env flags:'
+        print ' MKL_NUM_THREADS=', os.getenv('MKL_NUM_THREADS')
+        print ' OMP_NUM_THREADS=', os.getenv('OMP_NUM_THREADS')
+        print ' GOTO_NUM_THREADS=', os.getenv('GOTO_NUM_THREADS')
        print
        print 'Numpy config: (used when the theano flag "blas.ldflags" is empty)'
        numpy.show_config()
@@ -83,25 +88,37 @@ if __name__ == "__main__":
    print """
        Some results that you can compare against. They were 10 executions of gemm in float64 with matrices of shape 2000x2000 on FC9.
-        Cpu tested: Xeon E5345, Xeon E5430, Xeon E5450, Core 2 E8500, Core i7 930(hyper-threads enabled)
+        Cpu tested: Xeon E5345, Xeon E5430, Xeon E5450(3Ghz), Xeon X5560(2.8Ghz, hyper-threads enabled?),
+                    Core 2 E8500, Core i7 930(2.8Ghz, hyper-threads enabled), Core i7 950(3.07GHz, hyper-threads enabled)
        Lib tested:
            * numpy with ATLAS from distribution(FC9) package (1 thread)
            * manually compiled numpy and ATLAS with 2 threads
-            * goto with 1, 2, 4 and 8 threads.
+            * goto 1.26 with 1, 2, 4 and 8 threads.
-                           Xeon   Xeon   Xeon   Core2  i7
-        lib/nb threads     E5345  E5430  E5450  E8500  930
-        numpy_FC9_atlas/1  39.2s  35.0s  30.7s  29.6s  21.5s
-        goto/1             18.7s  16.1s  14.2s  13.7s  16.1s
-        numpy_MAN_atlas/2  12.0s  11.6s  10.2s   9.2s   9.0s
-        goto/2              9.5s   8.1s   7.1s   7.3s   8.1s
-        goto/4              4.9s   4.4s   3.7s     -    4.1s
-        goto/8              2.7s   2.4s   2.0s     -    4.1s
+                           Xeon   Xeon   Xeon   Core2  i7     i7     Xeon
+        lib/nb threads     E5345  E5430  E5450  E8500  930    950    X5560
+        numpy_FC9_atlas/1  39.2s  35.0s  30.7s  29.6s  21.5s  19.60s
+        goto/1             18.7s  16.1s  14.2s  13.7s  16.1s  14.67s
+        numpy_MAN_atlas/2  12.0s  11.6s  10.2s   9.2s   9.0s
+        goto/2              9.5s   8.1s   7.1s   7.3s   8.1s   7.4s
+        goto/4              4.9s   4.4s   3.7s     -    4.1s   3.8s
+        goto/8              2.7s   2.4s   2.0s     -    4.1s   3.8s
+        openblas/1         14.04s
+        openblas/2          7.16s
+        openblas/4          3.71s
+        openblas/8          3.70s
+        mkl 10.2.2.025/1   13.7s
+        mkl 10.2.2.025/2    7.6s
+        mkl 10.2.2.025/4    4.0s
+        mkl 10.2.2.025/8    2.0s
+        mkl 11.0.083/1      7.97s
        Test time in float32 with cuda 3.0.14
        (cuda version 3.2RC and up are supposed to have faster gemm on the GTX4?? card)
        cpu/cuda version
+        GTX580/3.2  0.20s
        GTX480/3.2  0.24s
        GTX480/3.0  0.27s
        GTX470/3.2  0.29s
......
@@ -7,16 +7,15 @@ cat /proc/cpuinfo |grep processor
free
uname -a
-t0=`THEANO_FLAGS=blas.ldflags= OMP_NUM_THREADS=1 time python misc/check_blas.py --quiet`
-t1=`OMP_NUM_THREADS=1 time python misc/check_blas.py --quiet`
-t2=`OMP_NUM_THREADS=2 time python misc/check_blas.py --quiet`
-t4=`OMP_NUM_THREADS=4 time python misc/check_blas.py --quiet`
-t8=`OMP_NUM_THREADS=8 time python misc/check_blas.py --quiet`
-echo "numpy gemm took: $t0"
-echo "theano gemm 1 thread took: $t1"
-echo "theano gemm 2 thread took: $t2"
-echo "theano gemm 4 thread took: $t4"
-echo "theano gemm 8 thread took: $t8"
+TIME_PREFIX=time
+VAR=OMP_NUM_THREADS
+echo "numpy gemm takes:"
+THEANO_FLAGS=blas.ldflags= $TIME_PREFIX python misc/check_blas.py --quiet
+for i in 1 2 4 8
+do
+    export $VAR=$i
+    x=`$TIME_PREFIX python misc/check_blas.py --quiet`
+    echo "theano gemm with $VAR=$i took: ${x}s"
+done
#Fred to test distro numpy at LISA: PYTHONPATH=/u/bastienf/repos:/usr/lib64/python2.5/site-packages THEANO_FLAGS=blas.ldflags= OMP_NUM_THREADS=8 time python misc/check_blas.py
\ No newline at end of file
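The shell loop above repeatedly times `misc/check_blas.py` under different thread counts. Its inner computation, per the header comment `C=a*C+dot(A,B)*b`, can be sketched in plain NumPy (a toy version of what `check_blas.py` benchmarks; the sizes here are reduced so the sketch runs quickly, and `time_gemm` is an illustrative name, not the script's API):

```python
import time
import numpy

def time_gemm(n=200, iters=10, alpha=0.5, beta=0.3):
    """Time `iters` evaluations of C = alpha*C + dot(A, B)*beta."""
    rng = numpy.random.RandomState(0)
    a = rng.rand(n, n)
    b = rng.rand(n, n)
    c = numpy.zeros((n, n))
    t0 = time.time()
    for _ in range(iters):
        c = alpha * c + numpy.dot(a, b) * beta
    return time.time() - t0
```

The timings depend heavily on the BLAS backend NumPy links against and on the `*_NUM_THREADS` environment variables the script exports, which is exactly what the table of results above compares.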
#!/bin/bash
#we set the compiledir to the /Tmp dir to make the test faster by bypassing the nfs network.
date
ROOT_CWD=/Tmp/nightly_build
COMPILEDIR=/Tmp/lisa_theano_compile_dir_theano
NOSETESTS=/usr/bin/nosetests
echo "nb element in the compiledir:"
ls ${COMPILEDIR}|wc -l
FLAGS=warn.argmax_pushdown_bug=False,warn.gpusum_01_011_0111_bug=False,warn.sum_sum_bug=False,warn.sum_div_dimshuffle_bug=False,compiledir=${COMPILEDIR}
export PYTHONPATH=${ROOT_CWD}:$PYTHONPATH
cd ${ROOT_CWD}
echo "executing nosetests with mode=FAST_COMPILE"
THEANO_FLAGS=${FLAGS},mode=FAST_COMPILE ${NOSETESTS} Theano
echo "nb element in the compiledir:"
ls ${COMPILEDIR}|wc -l
echo "executing nosetests with mode=FAST_RUN"
THEANO_FLAGS=${FLAGS},mode=FAST_RUN ${NOSETESTS} --with-coverage --cover-package=theano Theano
echo "nb element in the compiledir:"
ls ${COMPILEDIR}|wc -l
echo "executing nosetests with mode=FAST_RUN,floatX=float32"
THEANO_FLAGS=${FLAGS},mode=FAST_RUN,floatX=float32 ${NOSETESTS} Theano
echo "nb element in the compiledir:"
ls ${COMPILEDIR}|wc -l
#We change the seed and record it every day to test different combinations. We record it to be able to reproduce bugs caused by a given seed. We don't want multiple tests in DEBUG_MODE each day as this takes too long.
seed=$RANDOM
echo "executing nosetests with mode=DEBUG_MODE with seed of the day $seed"
THEANO_FLAGS=${FLAGS},unittests.rseed=$seed,mode=DEBUG_MODE,DebugMode.check_strides=0,DebugMode.patience=3 ${NOSETESTS} Theano
echo "nb element in the compiledir:"
ls ${COMPILEDIR}|wc -l
date
\ No newline at end of file
#!/bin/env python
# Import smtplib for the actual sending function
import smtplib
import os.path
import sys
from theano.misc.buildbot_filter import filter_output
# me == the sender's email address
# family = the list of all recipients' email addresses
family=['theano-buildbot@googlegroups.com']
me='lisa@iro.umontreal.ca'
#These files contain the output of the do_nightly_build script.
files=["/tmp/do_nightly_build_theano", "/tmp/do_nightly_build_pylearn", "/tmp/do_nightly_build_deeplearning"]
print files
print sys.argv
if len(sys.argv)==2:
files=[x+sys.argv[1] for x in files]
print files
# Here are the email package modules we'll need
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
COMMASPACE = ', '
def mysend(subject, file):
# Create the container (outer) email message.
if not os.path.isfile(file):
print "Error: no file",file
return
msg = MIMEMultipart()
msg['From'] = me
msg['To'] = COMMASPACE.join(family)
msg.preamble = 'The output of the buildbot'
    # Open the file and attach its contents as plain text.
fp = open(file, 'rb')
s=fp.read()
failures=0
errors=0
ran=False
nb_ran=0
skip=0
speed_failure=0
show_speed_failure=False
knownfail=0
gpu_time = None
float32_time = None
float64_time = None
for token in s.split():
token=token.strip('(,)')
if token.startswith("failures="):
failures+=int(token[9:])
elif token.startswith("errors="):
            errors+=int(token[7:])
elif token == "Ran":
ran=True
elif token.startswith("SKIP="):
skip+=int(token[5:])
elif token == "KnownFailureTest:":
knownfail+=1
elif token.startswith("speed_failure_"):
speed_failure+=int(token.split('=')[1])
show_speed_failure=True
elif ran:
ran=False
try:
nb_ran+=int(token)
except Exception, e:
print e
start = ""
for line in s.splitlines():
if gpu_time is None and line.startswith("gpu % expected/get"):
start=line
elif float32_time is None and line.startswith("float32 % expected/get"):
start=line
elif float64_time is None and line.startswith("float64 % expected/get"):
start=line
elif start:
start+=line
if start[-1]=="]":
if start.startswith("gpu % expected/get"):
gpu_time = start
start = ""
elif start.startswith("float32 % expected/get"):
float32_time = start
start = ""
elif start.startswith("float64 % expected/get"):
float64_time = start
start = ""
s="KnownFailure are removed from Error. \n Resume of the output:\n"+filter_output(open(file))+"Full output:\n"+s
img = MIMEText(s)
fp.close()
msg.attach(img)
errors-=knownfail
# Send the email via our own SMTP server.
if show_speed_failure:
msg['Subject'] = subject+" Fail="+str(failures)+" Err="+str(errors)+" Ran="+str(nb_ran)+" Skip="+str(skip)+" KnownFail="+str(knownfail)+ " SpeedFailure="+str(speed_failure)
else:
msg['Subject'] = subject+" Fail="+str(failures)+" Err="+str(errors)+" Ran="+str(nb_ran)+" Skip="+str(skip)+" KnownFail="+str(knownfail)
print msg['Subject']
s = smtplib.SMTP()
s.connect()
s.sendmail(me, family, msg.as_string())
s.close()
print "Finished sending email for",subject
mysend('Theano buildbot',files[0])
mysend('Pylearn buildbot',files[1])
mysend('Deep Learning Tutorial buildbot',files[2])
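The core of `mysend` above is a token scan over the buildbot output that tallies the counts reported by nosetests. Condensed into a standalone function (`tally` is an illustrative name; the parsing mirrors the script's token handling):

```python
def tally(text):
    """Count tests run, failures, and errors from nosetests-style output."""
    failures = errors = nb_ran = 0
    ran = False
    for token in text.split():
        token = token.strip('(,)')  # "(failures=2," -> "failures=2"
        if token.startswith("failures="):
            failures += int(token[len("failures="):])
        elif token.startswith("errors="):
            errors += int(token[len("errors="):])
        elif token == "Ran":
            ran = True           # the next token is the test count
        elif ran:
            ran = False
            try:
                nb_ran += int(token)
            except ValueError:
                pass
    return nb_ran, failures, errors
```

On a line such as `Ran 42 tests in 1.2s` followed by `FAILED (failures=2, errors=1)`, this yields `(42, 2, 1)`, the numbers the script then places in the email subject.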
@@ -255,13 +255,13 @@ class Reindenter:
        return line
    # Line-eater for tokenize.
-    def tokeneater(self, type, token, (sline, scol), end, line,
+    def tokeneater(self, type, token, pos, end, line,
                   INDENT=tokenize.INDENT,
                   DEDENT=tokenize.DEDENT,
                   NEWLINE=tokenize.NEWLINE,
                   COMMENT=tokenize.COMMENT,
                   NL=tokenize.NL):
+        sline, scol = pos
        if type == NEWLINE:
            # A program statement, or ENDMARKER, will eventually follow,
            # after some (possibly empty) run of tokens of the form
......
@@ -10,46 +10,40 @@ from theano.tensor.basic import TensorType
try:
    import scipy.sparse
+    from theano.sparse.basic import SparseType
+    def _is_sparse(a):
+        return scipy.sparse.issparse(a)
except ImportError:
-    #scipy not imported, there can be only ndarray
-    def may_share_memory(a, b, raise_other_type=True):
-        if not isinstance(a, numpy.ndarray) or not isinstance(b, numpy.ndarray):
-            if raise_other_type:
-                raise TypeError("may_share_memory support only ndarray when scipy is not available")
-            return False
-        return numpy.may_share_memory(a,b)
-else:
-    #scipy imported, there can be ndarray and sparse type
-    from theano.sparse.basic import _is_sparse, SparseType
-    def may_share_memory(a, b, raise_other_type=True):
-        a_ndarray = isinstance(a, numpy.ndarray)
-        b_ndarray = isinstance(b, numpy.ndarray)
-        try:
-            a_sparse = _is_sparse(a)
-        except NotImplementedError:
-            a_sparse = False
-        try:
-            b_sparse = _is_sparse(b)
-        except NotImplementedError:
-            b_sparse = False
-        a_cuda = False
-        b_cuda = False
-        if a.__class__.__name__ == "CudaNdarray":
-            a_cuda = True
-        if b.__class__.__name__ == "CudaNdarray":
-            b_cuda = True
+    #scipy not imported, there can be only ndarray and cudandarray
+    def _is_sparse(a):
+        return False
+import theano.sandbox.cuda as cuda
+if cuda.cuda_available:
+    def _is_cuda(a):
+        return isinstance(a, cuda.CudaNdarray)
+else:
+    def _is_cuda(a):
+        return False
+def may_share_memory(a, b, raise_other_type=True):
+    a_ndarray = isinstance(a, numpy.ndarray)
+    b_ndarray = isinstance(b, numpy.ndarray)
+    a_sparse = _is_sparse(a)
+    b_sparse = _is_sparse(b)
+    a_cuda = _is_cuda(a)
+    b_cuda = _is_cuda(b)
    if not(a_ndarray or a_sparse or a_cuda) or not(b_ndarray or b_sparse or b_cuda):
        if raise_other_type:
            raise TypeError("may_share_memory support only ndarray and scipy.sparse and CudaNdarray type")
        return False
    if a_ndarray and b_ndarray:
        return TensorType.may_share_memory(a,b)
    if a_cuda and b_cuda:
        from theano.sandbox.cuda.type import CudaNdarrayType
        return CudaNdarrayType.may_share_memory(a,b)
    if a_cuda or b_cuda:
        return False
    return SparseType.may_share_memory(a,b)
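The refactor above turns `may_share_memory` into a classify-then-dispatch function: small `_is_*` predicates decide each argument's kind, and the comparison valid for that pair is chosen afterwards. The same shape, reduced to plain NumPy (no scipy or CUDA here; this is a sketch of the dispatch pattern, not Theano's function):

```python
import numpy

def may_share_memory(a, b, raise_other_type=True):
    """Return True if `a` and `b` may overlap in memory."""
    a_nd = isinstance(a, numpy.ndarray)
    b_nd = isinstance(b, numpy.ndarray)
    if not a_nd or not b_nd:
        # Unknown type: raise or answer conservatively, as the caller asks.
        if raise_other_type:
            raise TypeError("only ndarray is supported in this sketch")
        return False
    return numpy.may_share_memory(a, b)
```

Keeping the type predicates separate from the pairwise dispatch is what lets the real version add sparse and CUDA kinds without special-casing every combination inline.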
@@ -7,9 +7,9 @@ The PycudaElemwiseSourceModuleOp is a Theano op use pycuda code generated with p
The PycudaElemwiseKernelOp op uses pycuda code generated with pycuda.elementwise.ElementwiseKernel. It must be wrapped by TheanoElementwiseKernel.
There is a test in test_pycuda.py.
This doesn't work with broadcasting or non-contiguous memory, as pycuda doesn't support them, but we make sure we don't introduce problems.
If the memory is non-contiguous, we create a new copy that is contiguous.
If there are broadcasted dimensions, we raise an error.
"""
@@ -47,7 +47,7 @@ class TheanoElementwiseKernel(pycuda.elementwise.ElementwiseKernel):
        if isinstance(arguments, str):
            arguments = [theano_parse_c_arg(arg) for arg in arguments.split(",")]
        pycuda.elementwise.ElementwiseKernel.__init__(self, arguments, operation, name, keep, options, **kwargs)
    def __call__(self, *args):
        vectors = []
@@ -124,9 +124,10 @@ class PycudaElemwiseSourceModuleOp(Op):
        self.pycuda_fct = mod.get_function(fct_name)
        return out_node
-    def perform(self, node, inputs, (z,)):
+    def perform(self, node, inputs, out):
        #TODO support broadcast!
        #TODO assert all input have the same shape
+        z, = out
        if z[0] is None or z[0].shape != inputs[0].shape:
            z[0] = theano.sandbox.cuda.CudaNdarray.zeros(inputs[0].shape)
        self.pycuda_fct(inputs[0], inputs[1], z[0], block=(inputs[0].shape[0], inputs[0].shape[1], 1))
@@ -182,7 +183,7 @@ class PycudaElemwiseKernelOp(Op):
        in_name = ["i" + str(id) for id in range(len(inputs))]
        out_name = ["o" + str(id) for id in range(self.nout)]
        c_code = self.scalar_op.c_code(out_node, "some_name", tuple([n + "[i]" for n in in_name]), tuple(n + "[i]" for n in out_name), {})
        self.pycuda_fct = TheanoElementwiseKernel(
            ", ".join([var.type.dtype_specs()[1] + " *" + name for var, name in zip(inputs, in_name) + zip(out_node.outputs, out_name)]),
            c_code,
@@ -191,8 +192,9 @@ class PycudaElemwiseKernelOp(Op):
            #include <numpy/arrayobject.h>""")
        return out_node
-    def perform(self, node, inputs, (z,)):
+    def perform(self, node, inputs, out):
        #TODO assert all input have the same shape
+        z, = out
        if z[0] is None or z[0].shape != inputs[0].shape:
            z[0] = theano.sandbox.cuda.CudaNdarray.zeros(inputs[0].shape)
        i = inputs + z
......
@@ -56,8 +56,8 @@ def test_pycuda_elemwise_kernel():
    assert any([isinstance(node.op, theano.sandbox.cuda.GpuElemwise) for node in f.maker.env.toposort()])
    assert any([isinstance(node.op, PycudaElemwiseKernelOp) for node in f2.maker.env.toposort()])
-    val1 = numpy.random.rand(5,5)
-    val2 = numpy.random.rand(5,5)
+    val1 = numpy.asarray(numpy.random.rand(5,5), dtype='float32')
+    val2 = numpy.asarray(numpy.random.rand(5,5), dtype='float32')
    #val1 = numpy.ones((5,5))
    #val2 = numpy.arange(25).reshape(5,5)
    assert (f(val1,val2) == f2(val1,val2)).all()
......
@@ -47,18 +47,21 @@ def debugprint(obj, depth=-1, print_type=False, file=None):
     _file = file
     done = set()
     results_to_print = []
+    order = []
     if isinstance(obj, gof.Variable):
         results_to_print.append(obj)
     elif isinstance(obj, gof.Apply):
         results_to_print.extend(obj.outputs)
     elif isinstance(obj, Function):
         results_to_print.extend(obj.maker.env.outputs)
+        order = obj.maker.env.toposort()
     elif isinstance(obj, (list, tuple)):
         results_to_print.extend(obj)
     else:
         raise TypeError("debugprint cannot print an object of this type", obj)
     for r in results_to_print:
-        debugmode.debugprint(r, depth=depth, done=done, print_type=print_type, file=_file)
+        debugmode.debugprint(r, depth=depth, done=done, print_type=print_type,
+                             file=_file, order=order)
     if file is _file:
         return file
     elif file=='str':
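The `debugprint` change above threads the function's toposort (execution) order through to the per-node printer. A minimal, self-contained sketch of such a dependency-ordered walk, using plain dicts as hypothetical stand-ins for Theano variables (`owner`) and Apply nodes (`inputs`); the real `env.toposort()` does the equivalent over the compiled graph:

```python
def toposort(outputs):
    # Depth-first walk from the output variables: every Apply node is
    # emitted after all the nodes that produce its inputs.
    order = []
    seen = set()

    def visit(node):
        if node is None or id(node) in seen:
            return
        seen.add(id(node))
        for inp in node["inputs"]:
            visit(inp["owner"])   # recurse into producers first
        order.append(node)

    for out in outputs:
        visit(out["owner"])
    return order

# Tiny graph: z = mul(add(x, x)) with x an input variable
x = {"owner": None}
add = {"inputs": [x, x]}
y = {"owner": add}
mul = {"inputs": [y]}
z = {"owner": mul}
assert toposort([z]) == [add, mul]  # producers come before consumers
```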
@@ -71,16 +74,16 @@ def _print_fn(op, xin):
     for attr in op.attrs:
         temp = getattr(xin, attr)
         if callable(temp):
             pmsg = temp()
         else:
             pmsg = temp
         print op.message, attr,'=', pmsg

 class Print(Op):
     """This identity-like Op has the side effect of printing a message followed by its inputs
     when it runs. Default behaviour is to print the __str__ representation. Optionally, one
     can pass a list of the input member functions to execute, or attributes to print.

     @type message: String
     @param message: string to prepend to the output
     @type attrs: list of Strings
@@ -122,7 +125,7 @@ class Print(Op):
         return (1,)

 class PrinterState(gof.utils.scratchpad):
     def __init__(self, props = {}, **more_props):
         if isinstance(props, gof.utils.scratchpad):
             self.__update__(props)
@@ -311,9 +314,9 @@ class PPrinter:
             i += 1
             if output.name is not None or output in outputs:
                 if output.name is None:
                     name = 'out[%i]' % outputs.index(output)
                 else:
                     name = output.name
                 #backport
                 #name = 'out[%i]' % outputs.index(output) if output.name is None else output.name
             current = output
@@ -370,16 +373,14 @@ pprint.assign(lambda pstate, r: hasattr(pstate, 'target') and pstate.target is n
 pp = pprint

-def pydotprint(fct, outfile=os.path.join(config.compiledir,'theano.pydotprint.png'),
-               compact=True, mode=None, format='png', with_ids=False):
+def pydotprint(fct, outfile=None,
+               compact=True, format='png', with_ids=False):
     """
     print to a file in png format the graph of op of a compile theano fct.

     :param fct: the theano fct returned by theano.function.
     :param outfile: the output file where to put the graph.
     :param compact: if True, will remove intermediate var that don't have name.
-    :param mode: if a ProfileMode, add to each Apply label (s in apply,% in apply in total op time, % in fct time)
-                 Otherwise ignore it
     :param format: the file format of the output.

     In the graph, box are an Apply Node(the execution of an op) and ellipse are variable.
@@ -388,25 +389,31 @@ def pydotprint(fct, outfile=os.path.join(config.compiledir,'theano.pydotprint.pn
     We print the op of the apply in the Apply box with a number that represent the toposort order of application of those Apply.
     If an Apply have more then 1 input, print add a label to the edge that in the index of the inputs.

-    green ellipse are input to the graph
-    blue ellipse are output of the graph
-    grey ellipse are var generated by the graph that are not output and are not used.
+    green ellipses are inputs to the graph
+    blue ellipses are outputs of the graph
+    grey ellipses are var generated by the graph that are not output and are not used.
+    red ellipses are transfer to/from the gpu.
+        op with those name GpuFromHost, HostFromGpu
     """
+    if outfile is None:
+        outfile = os.path.join(config.compiledir,'theano.pydotprint.' +
+                               config.device + '.' + format)
+
+    mode = fct.maker.mode
     if not isinstance(mode,ProfileMode) or not mode.fct_call.has_key(fct):
-        mode=None
+        mode = None
     try:
         import pydot as pd
     except:
         print "failed to import pydot. Yous must install pydot for this function to work."
         return
     g=pd.Dot()
     var_str={}
     all_strings = set()
     def var_name(var):
         if var in var_str:
             return var_str[var]
         if var.name is not None:
             varstr = 'name='+var.name+" "+str(var.type)
         elif isinstance(var,gof.Constant):
@@ -445,7 +452,8 @@ def pydotprint(fct, outfile=os.path.join(config.compiledir,'theano.pydotprint.pn
             prof_str=' (%.3fs,%.3f%%,%.3f%%)'%(time,pt,pf)
         applystr = str(node.op).replace(':','_')
         if (applystr in all_strings) or with_ids:
-            applystr = applystr+' id='+str(topo.index(node))+prof_str
+            applystr = applystr+' id='+str(topo.index(node))
+        applystr += prof_str
         all_strings.add(applystr)
         apply_name_cache[node] = applystr
         return applystr
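The `apply_name` fix above makes the profile suffix unconditional: previously `prof_str` was appended only in the deduplication branch, so nodes with unique labels lost their timing annotation. A small self-contained sketch of the corrected logic (the function signature here is illustrative, not Theano's API):

```python
def apply_name(op_str, index, all_strings, with_ids=False, prof_str=''):
    applystr = op_str
    if applystr in all_strings or with_ids:
        # a node with this label already exists: disambiguate with the
        # node's toposort index
        applystr += ' id=' + str(index)
    applystr += prof_str  # profile suffix now appended in every case
    all_strings.add(applystr)
    return applystr

seen = set()
assert apply_name('Elemwise{add}', 0, seen) == 'Elemwise{add}'
# second node with the same op string gets an id suffix
assert apply_name('Elemwise{add}', 3, seen) == 'Elemwise{add} id=3'
```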
@@ -461,12 +469,18 @@ def pydotprint(fct, outfile=os.path.join(config.compiledir,'theano.pydotprint.pn
         var_shape='box'
     for node_idx,node in enumerate(topo):
         astr=apply_name(node)
-        g.add_node(pd.Node(astr,shape=apply_shape))
+        if node.op.__class__.__name__ in ('GpuFromHost','HostFromGpu'):
+            # highlight CPU-GPU transfers to simplify optimization
+            g.add_node(pd.Node(astr,color='red',shape=apply_shape))
+        else:
+            g.add_node(pd.Node(astr,shape=apply_shape))
         for id,var in enumerate(node.inputs):
             varstr=var_name(var)
-            label=''
+            label=str(var.type)
             if len(node.inputs)>1:
-                label=str(id)
+                label=str(id)+' '+label
             if var.owner is None:
                 g.add_node(pd.Node(varstr,color='green',shape=var_shape))
                 g.add_edge(pd.Edge(varstr,astr, label=label))
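The branch added above draws `GpuFromHost` and `HostFromGpu` nodes in red so host-to-device copies stand out when tuning a graph. The decision can be sketched as a tiny helper (a hypothetical refactoring, not part of Theano):

```python
def node_color(op_class_name):
    # GPU transfer ops get highlighted; everything else keeps pydot's
    # default color (None means "unset" here).
    if op_class_name in ('GpuFromHost', 'HostFromGpu'):
        return 'red'
    return None

assert node_color('GpuFromHost') == 'red'
assert node_color('Elemwise{add}') is None
```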
@@ -475,14 +489,14 @@ def pydotprint(fct, outfile=os.path.join(config.compiledir,'theano.pydotprint.pn
             else:
                 #no name, so we don't make a var ellipse
                 g.add_edge(pd.Edge(apply_name(var.owner),astr, label=label))

         for id,var in enumerate(node.outputs):
             varstr=var_name(var)
             out = any([x[0]=='output' for x in var.clients])
-            label=''
+            label=str(var.type)
             if len(node.outputs)>1:
-                label=str(id)
+                label=str(id)+' '+label
             if out:
                 g.add_edge(pd.Edge(astr, varstr, label=label))
                 g.add_node(pd.Node(varstr,color='blue',shape=var_shape))
@@ -581,8 +595,3 @@ def pydot_var(vars, outfile=os.path.join(config.compiledir,'theano.pydotprint.pn
     g.write_png(outfile, prog='dot')
     print 'The output file is available at',outfile