Commit 09bce224 authored by Olivier Breuleux

merge

@@ -193,3 +193,9 @@ How to reuse (overwrite) a storage tensor

``theano.compile.io.Out(gw1, borrow = True)`` for that value in
``compile.function``
=========================================
ProfileMode
=========================================
*** write up how to use it ***
@@ -5,43 +5,49 @@

Theano
======
Theano is a Python library that allows you to define, optimize, and
efficiently evaluate mathematical expressions involving multi-dimensional
arrays. It can be extended to support other types. Theano melds some
aspects of a computer algebra system (CAS) with aspects of an optimizing
compiler. It can even transform some or all of the expression into C code
and compile it into native machine instructions. This combination of CAS
with optimizing compilation is particularly useful for computational
fields in which complicated mathematical expressions are evaluated
numerous times over large data sets.

Theano was written at the LISA_ lab to support the development of
efficient machine learning algorithms while minimizing human
time. We use it especially in gradient-based learning techniques.
Theano supports a range of numerical types in multiple dimensions and
a number of well-tested operations. It also allows you to compute the
gradient of an expression with respect to another. Symbolic expressions
may be compiled into functions, which work on the same data structures
as numpy_, allowing for easy interoperability.
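What "the gradient of an expression with respect to another" buys you can be pictured with a small plain-numpy check (an illustration only, not Theano's API): for a cost like ``sum((x*w - 1)**2)``, which also appears in the ``mat_reciprocal`` test later in this commit, the symbolically derived gradient with respect to ``w`` is ``2*(x*w - 1)*x``, and a finite-difference estimate confirms it.

```python
import numpy as np

# Plain-numpy sketch (not Theano code): compare the hand-derived
# symbolic gradient of cost = sum((x*w - 1)**2) w.r.t. w against a
# central finite-difference estimate.

def cost(x, w):
    return np.sum((x * w - 1.0) ** 2)

def grad_w(x, w):
    # symbolic derivative: d/dw_ij sum((x*w - 1)^2) = 2*(x_ij*w_ij - 1)*x_ij
    return 2.0 * (x * w - 1.0) * x

x = np.array([[0.5, 2.0], [1.5, 0.2]])
w = np.array([[0.1, 0.3], [0.7, 0.9]])

eps = 1e-6
approx = np.zeros_like(w)
for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        wp = w.copy(); wp[i, j] += eps
        wm = w.copy(); wm[i, j] -= eps
        approx[i, j] = (cost(x, wp) - cost(x, wm)) / (2 * eps)

assert np.allclose(grad_w(x, w), approx, atol=1e-4)
```

Theano performs the symbolic-differentiation step automatically, so the hand derivation above is exactly the work it saves you.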
Theano's compiler applies many optimizations of varying complexity
to these symbolic expressions. These optimizations include, but are
not limited to:
* constant folding
* merging of similar subgraphs, to avoid calculating the same values more than once
* simple arithmetic simplification (``x*y/x -> y``)
* inserting efficient BLAS_ operations
* using inplace operations wherever it is safe to do so.
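Why a rewrite as trivial as ``x*y/x -> y`` matters can be seen in plain numpy (an illustration of the idea, not Theano code): evaluating the unsimplified form literally can overflow even though the simplified form is exact.

```python
import numpy as np

# Plain-numpy illustration (not Theano code): the literal evaluation of
# x*y/x overflows to inf for large operands, while the symbolically
# simplified form y is exact.
x = np.float64(1e200)
y = np.float64(1e200)

with np.errstate(over='ignore'):
    literal = (x * y) / x   # x*y overflows to inf, so the quotient is inf

simplified = y              # what the rewrite x*y/x -> y produces

assert np.isinf(literal)
assert simplified == 1e200
```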
Theano defines several optimizations which improve the numerical
stability of computations. It also provides a framework to add and test
new optimizers.
Theano was named after the `Greek mathematician`_, who may have
been Pythagoras' wife.
Theano is released under a BSD license (:ref:`link <license>`)

Sneak peek
==========

Here is a simple example of how to use Theano. It doesn't show
off many of Theano's features, but it illustrates concretely what
Theano is.

@@ -67,8 +73,7 @@ Theano is.
Theano is not a programming language in the normal sense because you
write a program in Python that builds expressions for Theano. Still
it is like a programming language in the sense that you have to

- declare variables (``a,b``) and give their types
@@ -77,8 +82,8 @@ have to

- compile expression graphs to functions in order to use them for computation.

It is good to think of ``theano.function`` as the interface to a
compiler which builds a callable object from a purely symbolic graph.
One of theano's most important features is that ``theano.function``
can optimize a graph and even compile some or all of it into native
machine instructions.
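The build-then-compile idea can be sketched in a few lines of ordinary Python (the classes here are a toy stand-in, not Theano's implementation): expressions are first recorded as a graph, and only the call to ``function`` turns that graph into something executable, which is the point where a real compiler gets the chance to optimize.

```python
# Toy sketch (hypothetical classes, not Theano's implementation) of the
# idea behind theano.function: build a symbolic graph first, then
# "compile" it into an ordinary Python callable.

class Var:
    def __init__(self, name):
        self.name = name
    def __add__(self, other):
        return Add(self, other)   # building an expression records a node

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

def function(inputs, output):
    """Turn the recorded graph into a callable (the 'compile' step)."""
    def evaluate(node, env):
        if isinstance(node, Var):
            return env[node.name]
        return evaluate(node.left, env) + evaluate(node.right, env)
    def f(*args):
        env = dict(zip([v.name for v in inputs], args))
        return evaluate(output, env)
    return f

x, y = Var('x'), Var('y')
z = x + y                 # nothing is computed yet; z is a graph
f = function([x, y], z)   # now we have a callable
assert f(2, 3) == 5
```

In real Theano the compile step is also where graph optimization and C-code generation happen; the toy version only interprets the graph.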
@@ -95,18 +100,18 @@ package, so what does Theano do that Python and numpy do not?

  parts your expression graph into native machine code, which runs
  much faster than python.

- *symbolic differentiation*: Theano can automatically build symbolic
  graphs for computing gradients.

- *stability optimizations*: Theano can recognize numerically unstable
  expressions and compute them with more stable algorithms.
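A classic instance of such a stability rewrite, shown here in plain numpy rather than Theano code: ``log(1 + exp(x))`` overflows for large ``x``, while the mathematically equivalent ``logaddexp`` form stays finite.

```python
import numpy as np

# Plain-numpy illustration (not Theano code) of a stability rewrite:
# log(1 + exp(x)) overflows for large x, but log(exp(0) + exp(x)),
# computed stably by np.logaddexp, does not.
x = 800.0

with np.errstate(over='ignore'):
    naive = np.log(1.0 + np.exp(x))   # exp(800) overflows -> inf

stable = np.logaddexp(0.0, x)         # == log(1 + exp(x)), evaluated stably

assert np.isinf(naive)
assert abs(stable - 800.0) < 1e-6
```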
There exists another symbolic package in Python, namely sympy_. Theano
is different from sympy in the sense that while Theano allows symbolic
manipulation it puts more emphasis on the evaluation of these expressions
and being able to repeatedly evaluate them on many different inputs. Theano
is also better suited to handling very large tensors which have no
assumed structures.

If numpy_ is to be compared to MATLAB_ and sympy_ to Mathematica_,
Theano is a sort of hybrid of the two which tries to make the best of

@@ -145,10 +150,9 @@ issues that concern the end users.
Questions, comments, praise, criticism as well as bug reports should
be submitted to these mailing lists.

We welcome all kinds of contributions. If you have any questions
regarding how to extend Theano, please feel free to ask on the theano-dev_
mailing list.

......

@@ -12,15 +12,31 @@ Requirements
In order to use Theano, the following libraries and software will need
to be installed:

Linux or OS-X operating system
    We develop mainly on 64-bit Linux machines. 32-bit architectures are
    not well-tested.

python >= 2.5

`numpy <http://numpy.scipy.org/>`_ >= 1.2
    Earlier versions have memory leaks.

`SciPy <http://scipy.org>`_
    Specifically numpy, sparse, and weave. We recommend scipy
    >=0.7 if you are using sparse matrices, because scipy.sparse
    is buggy in 0.6. (scipy.csc_matrix dot has a bug with singleton
    dimensions. There may be more bugs.)

The following libraries and software are optional:

g++, python-dev
    Highly recommended, to compile generated C code.

`nose <http://somethingaboutorange.com/mrl/projects/nose/>`_
    Recommended, to run Theano's test-suite.

`sphinx <http://sphinx.pocoo.org/>`_ >=0.5.1, `pygments <http://pygments.org/>`_
    Used to build documentation. latex and dvipng
    are also necessary for math to show up as images.

`mercurial <http://www.selenic.com/mercurial/>`_
    To download the bleeding-edge source.

------------
......
@@ -37,19 +37,15 @@ objects).

>>> x = T.dscalar('x')
>>> y = T.dscalar('y')

In Theano, all symbols must be typed. In particular, ``T.dscalar``
is the type we assign to "0-dimensional arrays (`scalar`) of doubles
(`d`)". It is a Theano :term:`Type`.

``dscalar`` is not a class. Therefore, neither ``x`` nor ``y``
are actually instances of ``dscalar``. They are instances of
:api:`TensorResult <theano.tensor.basic.TensorResult>`. ``x`` and ``y``
are, however, assigned the theano Type ``dscalar`` in their ``type``
field, as you can see here:
>>> type(x)
<class 'theano.tensor.basic.TensorResult'>
@@ -60,9 +56,14 @@ Tensor(float64, scalar)

>>> x.type == T.dscalar
True

You can learn more about the structures in Theano in
the :ref:`advtutorial` and in :ref:`graphstructures`.

By calling ``T.dscalar`` with a string argument, you create a
:term:`Result` representing a floating-point scalar quantity with the
given name. If you provide no argument, the symbol will be unnamed. Names
are not required, but they can aid debugging.
-------------------------------------------

**Step 2**
@@ -83,14 +84,14 @@ x + y

**Step 3**

The last step is to create a function taking ``x`` and ``y`` as inputs
and giving ``z`` as output:

>>> f = function([x, y], z)

The first argument to ``function`` is a list of :term:`Results <Result>`
that will be provided as inputs to the function. The second argument
is a single Result *or* a list of Results. For either case, the second
argument is what we want to see as output when we apply the function.

``f`` may then be used like a normal Python function.
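The two calling conventions (a single output Result versus a list of them) can be mimicked with a small plain-Python stand-in (a toy analogy, not Theano itself): a single output yields a function returning one value, a list yields a list of values.

```python
# Toy stand-in (not Theano): mirror function([x, y], z) vs
# function([x, y], [z, w]). Here expressions are just callables on a
# dict of input values.

def make_function(inputs, outputs):
    single = not isinstance(outputs, list)
    exprs = [outputs] if single else outputs
    def f(*args):
        env = dict(zip(inputs, args))
        results = [e(env) for e in exprs]
        return results[0] if single else results
    return f

z = lambda env: env['x'] + env['y']
w = lambda env: env['x'] * env['y']

f = make_function(['x', 'y'], z)        # single output -> single value
g = make_function(['x', 'y'], [z, w])   # list of outputs -> list of values

assert f(2, 3) == 5
assert g(2, 3) == [5, 6]
```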
......
@@ -17,7 +17,7 @@ installed:

>>> from theano import *

Many of the symbols you will need to use are in the ``tensor`` subpackage
of theano. Let's import that subpackage under a handy name. I like
``T``.
......
@@ -195,8 +195,10 @@ def _optcheck_env(input_specs, output_specs, accept_inplace = False):

    inputs, outputs = gof.graph.clone(orig_inputs, orig_outputs)
    equivalence_tracker = _ResultEquivalenceTracker()
    env = gof.env.Env(inputs, outputs,
            #DestroyHandler is not needed because it is actually installed by an optimization
            # after canonicalization. This results in a big speed gain.
            #features=[equivalence_tracker, gof.DestroyHandler(do_imports_on_attach=False)])
            features=[equivalence_tracker])
    if not accept_inplace:
        for node in env.nodes:
......
"""Driver of graph construction, optimization, and linking. """Driver of graph construction, optimization, and linking.
""" """
__docformat__ = "restructuredtext en"
import copy_reg import copy_reg
import cPickle import cPickle
......
Diff collapsed.
#!/usr/bin/env python
import numpy as N
from theano import Op, Apply, tensor as T, Module, Method, Mode, compile
from theano.gof import OpSub, TopoOptimizer
from pylearn.algorithms.minimizer import make_minimizer # minimizer
from theano.printing import Print
from theano.tests import unittest_tools

####################
# Library-type stuff
@@ -15,8 +13,6 @@ from theano.tests import unittest_tools

from theano.compile import module
from theano import tensor as T
from pylearn.algorithms.minimizer import minimizer_factory

class StochasticGradientDescent(module.FancyModule):
    """Fixed stepsize gradient descent"""
    def __init__(self, args, cost, params, gradients=None, stepsize=None, WEIRD_STUFF=True):
@@ -29,18 +25,18 @@ class StochasticGradientDescent(module.FancyModule):

        self.stepsize_init = None

        if stepsize is None:
            self.stepsize = (T.dscalar())
        elif isinstance(stepsize, T.TensorResult):
            self.stepsize = stepsize
        else:
            if self.WEIRD_STUFF:
                #TODO: why is this necessary? why does the else clause not work?
                # self.stepsize = module.Member(T.dscalar(), init = stepsize)
                self.stepsize = (T.dscalar())
                self.stepsize_init = stepsize
            else:
                # self.stepsize = module.Member(T.value(stepsize))
                self.stepsize = (T.constant(stepsize)) #work!

        if self.stepsize.ndim != 0:
            raise ValueError('stepsize must be a scalar', stepsize)
@@ -63,7 +59,6 @@ class StochasticGradientDescent(module.FancyModule):

        pass
@minimizer_factory('sgd')
def sgd_minimizer(stepsize=None, **args):
    def m(i,c,p,g=None):
        return StochasticGradientDescent(i, c, p, stepsize=stepsize, **args)
@@ -101,6 +96,9 @@ class TanhRnn(Op):

        return Apply(self, [x, z0, A], [z])

    def perform(self, node, (x,z0,A), out):
        assert x is not None
        assert z0 is not None
        assert A is not None
        T,M = x.shape
        z = N.zeros((T+1, M))
        z[0] = z0
@@ -161,10 +159,10 @@ class ExampleRNN(Module):

        self.n_vis = n_vis

        #recurrent weight matrix in latent space
        self.z0 = (T.dvector())
        self.w = (T.dmatrix())
        self.params = [self.z0, self.w]

        #input and target
        x, y = T.dmatrix(), T.dmatrix()
@@ -176,6 +174,7 @@ class ExampleRNN(Module):

        self.minimizer = minimizer([x, y], self.cost, self.params)

    def _instance_initialize(self, obj):
        print 'INITIALIZE EXAMPLE RNN'
        n_vis = self.n_vis
        rng = N.random.RandomState(unittest_tools.fetch_seed(2342))
@@ -185,14 +184,14 @@ class ExampleRNN(Module):

        obj.minimizer.initialize()
def test_example_rnn():
    minimizer_fn = sgd_minimizer(stepsize = 0.001)

    n_vis = 5
    n_out = 3
    n_hid = 4

    rnn_module = ExampleRNN(n_vis, minimizer_fn)
    rnn = rnn_module.make()

    rng = N.random.RandomState(unittest_tools.fetch_seed(7722342))
    x = rng.randn(10,n_vis)
@@ -212,6 +211,7 @@ def test_example_rnn():

            print i, rnn.minimizer.step_cost(x, y), rnn.minimizer.stepsize
        else:
            rnn.minimizer.step_cost(x, y)
assert rnn.minimizer.step_cost(x,y) < -20 #it starts around -.28
def test_WEIRD_STUFF():
    n_vis = 3
@@ -224,8 +224,8 @@ def test_WEIRD_STUFF():

    LAG = 4
    y[LAG:] = x[:-LAG, 0:n_vis]

    minimizer_fn1 = sgd_minimizer(stepsize = 0.001, WEIRD_STUFF = False)
    minimizer_fn2 = sgd_minimizer(stepsize = 0.001, WEIRD_STUFF = True)

    rnn_module1 = ExampleRNN(n_vis, minimizer_fn1)
    rnn_module2 = ExampleRNN(n_vis, minimizer_fn2)

    rnn1 = rnn_module1.make(mode='FAST_RUN')
......
@@ -84,6 +84,9 @@ class Apply(utils.object2):

        else:
            raise TypeError("The 'outputs' argument to Apply must contain Result instances with no owner, not %s" % output)

        self._creation_idx = _creation_idx[0]
        _creation_idx[0] += 1

    def default_output(self):
        """Returns the default output for this node.
@@ -123,9 +126,6 @@ class Apply(utils.object2):

        return self

    def __hash__(self):
        if not hasattr(self, '_creation_idx'):
            self._creation_idx = _creation_idx[0]
            _creation_idx[0] += 1
        return self._creation_idx
......
@@ -473,15 +473,6 @@ class GemmLocalOptimizer(LocalOptimizer):

                return [T.add(*new_add_inputs)]
        return False

    @staticmethod
    def failure_callback(exc, nav, repl_pairs):
        """WRITEME"""
        if not isinstance(exc, InconsistencyError):
            traceback.print_exc()
        else:
            #print 'GEMM caused cycle, it happens.'
            pass

    @staticmethod
    def _as_scalar(res):
        """Return None or a TensorResult whose type is in T.float_scalar_types"""
@@ -579,11 +570,11 @@ class GemmLocalOptimizer(LocalOptimizer):

# TODO: This could be an EquilibriumOptimizer, but I don't know how to combine an OpKeyOptimizer and
# an EquilibriumOptimizer.
compile.optdb.register('inplace_gemm_0', OpKeyOptimizer(GemmLocalOptimizer(),
        failure_callback=OpKeyOptimizer.warn_inplace), 70.00, 'fast_run', 'inplace', 'gemm')
compile.optdb.register('inplace_gemm_1', OpKeyOptimizer(GemmLocalOptimizer(),
        failure_callback=OpKeyOptimizer.warn_inplace), 70.01, 'fast_run', 'inplace', 'gemm')
compile.optdb.register('inplace_gemm_2', OpKeyOptimizer(GemmLocalOptimizer(),
        failure_callback=OpKeyOptimizer.warn_inplace), 70.02, 'fast_run', 'inplace', 'gemm')

class Dot22(GemmRelated):
    """Compute a matrix-matrix product.
......
@@ -1305,14 +1305,26 @@ class test_matinv(unittest.TestCase):

            ssd, gw = fn(x,w)
            #print ssd, x*w, x, w
            if i == 0:
                ssd0 = ssd
            w -= 0.4 * gw
        return ssd0, ssd

    def test_reciprocal(self):
        """Matrix reciprocal by gradient descent"""
        ssd0,ssd = self.mat_reciprocal(3)
numpy.random.seed(unittest_tools.fetch_seed(1))
# hand-coded numpy implementation for verification
x = numpy.random.rand(3,3)+0.1
w = numpy.random.rand(3,3)
myssd0 = numpy.sum((x*w - numpy.ones((3,3)))**2.0)
for i in xrange(300):
gw = 2*(x*w - numpy.ones((3,3)))*x # derivative of dMSE/dw
myssd = numpy.sum((x*w - numpy.ones((3,3)))**2)
w -= 0.4 * gw
self.failUnlessAlmostEqual(ssd0, myssd0)
self.failUnlessAlmostEqual(ssd, myssd)
class t_dot(unittest.TestCase):
    def setUp(self):
......