Commit a024466f authored by Olivier Breuleux

merge

@@ -12,9 +12,9 @@ Type's contract

In Theano's framework, a Type is any object which defines the following
methods. To obtain the default methods described below, the Type should
be an instance of :api:`theano.gof.Type` or should be an instance of a
subclass of :api:`theano.gof.Type`. If you will write all methods yourself,
you need not use an instance of :api:`theano.gof.Type`.

Methods with default arguments must be defined with the same signature,
i.e. the same default argument names and values. If you wish to add
@@ -73,9 +73,9 @@ default values.

- *Default*: ``make_variable``

For each method, the *default* is what :api:`theano.gof.Type` defines
for you. So, if you create an instance of :api:`theano.gof.Type` or an
instance of a subclass of :api:`theano.gof.Type`, you
must define ``filter``. You might want to override ``values_eq_approx``,
as well as ``values_eq``. The other defaults generally need not be
overridden.
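As a rough sketch of this contract in plain Python (a hypothetical ``Double`` stand-in rather than a real ``theano.gof.Type`` subclass; the 1e-4 tolerance mirrors the one discussed below for ``values_eq_approx``):

```python
class Double(object):
    """Hypothetical Type representing a Python float.

    Illustrates the contract only; a real Type would subclass
    theano.gof.Type and also define make_variable, etc.
    """

    def filter(self, x, strict=False):
        # Return x as a float, raising TypeError if that is impossible.
        # With strict=True, refuse to cast at all.
        if strict and not isinstance(x, float):
            raise TypeError('Expected a float!')
        return float(x)

    def values_eq(self, x, y):
        # Exact equality of two values of this Type.
        return x == y

    def values_eq_approx(self, x, y):
        # Approximate equality with a relative tolerance of 1e-4.
        return abs(x - y) < 1e-4 * (abs(x) + abs(y) + 1e-12)
```

For example, ``Double().filter(4)`` returns ``4.0``, while ``Double().filter(4, strict=True)`` raises ``TypeError``.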
@@ -135,7 +135,7 @@ chose to be 1e-4.

.. note::

   ``values_eq`` is never actually used by Theano, but it might be used
   internally in the future. Equality testing in DebugMode is done
   using ``values_eq_approx``.

**Putting them together**
@@ -185,15 +185,15 @@ instances of ``Double`` are technically the same Type. However, different

>>> double1 == double2
False

Theano compares Types using ``==`` to see if they are the same. If
the inputs of two different :ref:`Applies <apply>` are the same
and the two :ref:`op`\ s applied to them compare equal, then only one of
those ops needs to be evaluated. (This is sometimes called :term:`merging
<merge>` and is done by the :api:`MergeOptimizer`.)

There are several ways to make sure graphs are merged properly:

#. Define ``Double.__eq__`` so that instances of type Double
   are equal. For example:

   .. code-block:: python

      def __eq__(self, other):
          return type(self) is Double and type(other) is Double

#. Override ``Double.__new__`` to always return the same instance.
#. Hide the Double class and only advertise a single instance of it.

Here we will prefer the final option, because it's the simplest.
Note, though, that Ops in the Theano codebase often define ``__eq__``
themselves.
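The first two options can be sketched in plain Python (hypothetical classes, not Theano's actual implementation):

```python
class DoubleAllEqual(object):
    # Option 1: define __eq__ (and a consistent __hash__) so that
    # every instance compares equal to every other instance.
    def __eq__(self, other):
        return type(self) is DoubleAllEqual and type(other) is DoubleAllEqual

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # instances that compare equal must also hash equal
        return hash(DoubleAllEqual)


class DoubleSingleton(object):
    # Option 2: override __new__ so that construction always
    # returns one shared instance.
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = object.__new__(cls)
        return cls._instance
```

Option 3 amounts to constructing one instance (say ``double = Double()``) in the module and publishing only that instance.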
Untangling some concepts

...
@@ -8,7 +8,7 @@ Graph Structures

Theano represents symbolic mathematical computations as graphs. These
graphs are composed of interconnected :ref:`apply` and :ref:`variable`
nodes. They are associated with *function application* and *data*,
respectively. Operations are represented by :ref:`op` instances and data
types are represented by :ref:`type` instances. Here is a piece of code
and a diagram showing the structure built by that piece of code. This
should help you understand how these pieces fit together:
@@ -40,24 +40,26 @@ bi-partite, directed, acyclic graph. Variables point to the Apply nodes
representing the function application producing them via their
``owner`` field. These Apply nodes point in turn to their input and
output Variables via their ``inputs`` and ``outputs`` fields.
(Apply instances also contain a list of references to their ``outputs``,
but those pointers don't count in this graph.)

The ``owner`` field of both ``x`` and ``y`` points to ``None`` because
they are not the result of another computation. If one of them were the
result of another computation, its ``owner`` field would point to another
blue box like ``z`` does, and so on.

Note that the ``Apply`` instance's ``outputs`` points to
``z``, and ``z.owner`` points back to the ``Apply`` instance.
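The pointer structure described above can be mimicked with toy stand-ins (simplified classes, not Theano's real ``Variable``/``Apply``):

```python
class Variable(object):
    def __init__(self, name=None):
        self.name = name
        self.owner = None     # Apply node that produced this Variable, if any


class Apply(object):
    def __init__(self, op, inputs):
        self.op = op
        self.inputs = inputs  # list of input Variables
        out = Variable()
        out.owner = self      # the output points back at its Apply node
        self.outputs = [out]


x = Variable('x')
y = Variable('y')
node = Apply('add', [x, y])
z = node.outputs[0]

assert x.owner is None and y.owner is None  # x, y are not computed
assert z.owner is node                      # z was produced by node
assert node.inputs == [x, y]
```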
An explicit example
===================

In this example we will compare two ways of defining the same graph.
First, a short bit of code will build an expression (graph) the *normal*
way, with most of the graph construction being done automatically.
Second, we will walk through a longer re-coding of the same thing
without any shortcuts, which makes the graph construction explicit.

**Short example**
@@ -87,7 +89,7 @@ This is what you would type to build the graph explicitly:

# Instantiate a type that represents a matrix of doubles
float64_matrix = TensorType(dtype='float64',                 # double
                            broadcastable=(False, False))    # matrix

# We make the Variable instances we need.
x = Variable(type=float64_matrix, name='x')
@@ -332,5 +334,5 @@ eligible to participate in numerous optimizations: constant inlining
in C code, constant folding, etc.

A constant does not need to be specified in a :ref:`function`'s list
of inputs. In fact, doing so will raise an exception.
@@ -159,23 +159,7 @@ subclass of Module:

.. literalinclude:: ../examples/module/accumulator.py

This is just like the previous example except slightly fancier.
@@ -203,30 +187,7 @@ boilerplate code.

All we need to do to use this mechanism is to give a method called
``_instance_print_state`` to our Module class.

.. literalinclude:: ../examples/module/mechanism1.py

Any method named ``_instance_XXX`` will cause the object
obtained through a call to ``make`` to have a method called ``XXX``.
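A toy version of the dispatch (hypothetical code, not Theano's actual ``make`` implementation) shows how each ``_instance_XXX`` method could become a method named ``XXX`` on the made object:

```python
class Instance(object):
    """Bare container standing in for the object returned by make()."""


def _bind(method, inst):
    # Wrap `method` so the made instance is passed as its first argument.
    def bound(*args):
        return method(inst, *args)
    return bound


class Module(object):
    def make(self):
        inst = Instance()
        for name in dir(self):
            # every _instance_XXX method becomes a method XXX on inst
            if name.startswith('_instance_'):
                short = name[len('_instance_'):]
                setattr(inst, short, _bind(getattr(self, name), inst))
        return inst


class Accumulator(Module):
    def _instance_print_state(self, acc):
        return 'state is: %s' % acc.state


m = Accumulator()
acc = m.make()
acc.state = 0.0
```

``acc.print_state()`` now returns ``'state is: 0.0'``.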
@@ -246,37 +207,7 @@ If a number of instance methods are going to be defined, and especially if you
will want to inherit from the kind of class that gets instantiated by
``make``, you might prefer to use the InstanceType mechanism.

.. literalinclude:: ../examples/module/mechanism2.py
Adding custom initialization
============================

@@ -293,39 +224,7 @@ can override the default with your own method, which has to be called

Here is an example where we take width and height arguments to
initialize a state with a matrix of zeros:

.. literalinclude:: ../examples/module/accumulator.py
Nesting Modules

@@ -335,25 +234,7 @@ Probably the most powerful feature of theano's modules is that one can be
included as an attribute of another so that the storage of each is available
to both.

.. literalinclude:: ../examples/module/nested.py

As you read through examples of Theano code, you will probably see many
instances of Modules being nested in this way.
from theano.compile import Module, Method
import theano.tensor as T

class Accumulator(Module):
    def __init__(self):
        super(Accumulator, self).__init__()  # don't forget this
        self.inc = T.dscalar()
        self.state = T.dscalar()
        self.new_state = self.inc + self.state
        self.add = Method(inputs=self.inc,
                          outputs=self.new_state,
                          updates={self.state: self.new_state})
        self.sub = Method(inputs=self.inc,
                          outputs=None,
                          updates={self.state: self.state - self.inc})

if __name__ == '__main__':
    m = Accumulator()
    acc = m.make(state=0)
import numpy
from theano.compile import Module, Method
import theano.tensor as T

class MatrixAccumulator(Module):
    def __init__(self):
        super(MatrixAccumulator, self).__init__()  # don't forget this
        self.inc = T.dscalar()
        self.state = T.dmatrix()
        self.new_state = self.inc + self.state
        self.add = Method(inputs=self.inc,
                          outputs=self.new_state,
                          updates={self.state: self.new_state})
        self.sub = Method(inputs=self.inc,
                          outputs=None,
                          updates={self.state: self.state - self.inc})

    def _instance_print_state(self, acc):
        print '%s is: %s' % (self.state, acc.state)

    def _instance_initialize(self, acc, nrows, ncols):
        acc.state = numpy.zeros((nrows, ncols))

if __name__ == '__main__':
    m = MatrixAccumulator()
    acc = m.make(2, 5)  # this calls m._instance_initialize(acc, 2, 5)
    acc.print_state()
    # OUTPUT:
    # state is: [[ 0.  0.  0.  0.  0.]
    #            [ 0.  0.  0.  0.  0.]]
from theano.compile import Module, Method
import theano.tensor as T

class Accumulator(Module):
    def __init__(self):
        super(Accumulator, self).__init__()  # don't forget this
        self.inc = T.dscalar()
        self.state = T.dscalar()
        self.new_state = self.inc + self.state
        self.add = Method(inputs=self.inc,
                          outputs=self.new_state,
                          updates={self.state: self.new_state})
        self.sub = Method(inputs=self.inc,
                          outputs=None,
                          updates={self.state: self.state - self.inc})

    def _instance_print_state(self, acc):
        print '%s is: %s' % (self.state, acc.state)

if __name__ == '__main__':
    m = Accumulator()
    acc = m.make(state=0)
    acc.print_state()  # --> prints "state is: 0.0"
from theano.compile import Module, ModuleInstance, Method
import theano.tensor as T

class AccumulatorInstance(ModuleInstance):
    def print_state(self):
        # self.component points to the Module from which this was compiled.
        print '%s is: %s' % (self.component.state, self.state)

class Accumulator(Module):
    # This line tells theano to instantiate an AccumulatorInstance
    # when make() is called.
    InstanceType = AccumulatorInstance

    def __init__(self):
        super(Accumulator, self).__init__()  # don't forget this
        self.inc = T.dscalar()
        self.state = T.dscalar()
        self.new_state = self.inc + self.state
        self.add = Method(inputs=self.inc,
                          outputs=self.new_state,
                          updates={self.state: self.new_state})
        self.sub = Method(inputs=self.inc,
                          outputs=None,
                          updates={self.state: self.state - self.inc})

if __name__ == '__main__':
    m = Accumulator()
    acc = m.make(state=0)
    acc.print_state()  # --> prints "state is: 0.0"
import theano
import theano.tensor as T

M = theano.Module()
M.a, M.b, M.c = [T.dvector() for i in (1, 2, 3)]

P = theano.Module()
P.m = M  # include a module by nesting

x = T.dvector()
P.f = theano.Method([x], None, {M.b: M.b + x})

p = P.make()  # this converts both M and P because M was nested within P
p.m.b = [4, 5, 6]
p.f(3)
print p.m.b
# prints array([7.,8.,9.])
import os
import unittest

def makeTester(fname):
    class Test(unittest.TestCase):
        def test_example(self):
            print 'Executing file', self.fname
            execfile(self.fname, {})
    Test.__name__ = fname
    Test.fname = fname
    return Test

def test_module_doc():
    """
    This test executes all of the Module code examples.
    It goes through the directory and executes all .py files.
    """
    for fname in os.listdir('.'):
        if fname.endswith('.py'):
            print 'Executing', fname
            execfile(fname, {})
import errno
import os
import platform
import re

@@ -37,7 +38,7 @@ def set_compiledir(path=None):

        except OSError, e:
            # Maybe another parallel execution of theano was trying to create
            # the same directory at the same time.
            if e.errno != errno.EEXIST:
                raise
    # PROBLEM: sometimes the first approach based on os.system('touch')
...
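The corrected comparison matters: the diff adds ``import errno`` and qualifies the constant, since the bare name ``EEXIST`` was undefined in that module and the old code would raise a ``NameError`` instead of tolerating the race. A minimal sketch of the pattern (hypothetical ``ensure_dir`` helper, not Theano's actual function):

```python
import errno
import os


def ensure_dir(path):
    # Create `path`, tolerating the case where another process
    # created it between our decision and this call.
    try:
        os.makedirs(path)
    except OSError as e:
        # Only "already exists" is benign; re-raise anything else.
        if e.errno != errno.EEXIST:
            raise
```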
from __future__ import absolute_import

import sys
import time

import numpy

from ..gof.cutils import run_cthunk
from ..gof.link import WrapLinker, WrapLinkerMany, raise_with_op
from ..compile.mode import Mode


class Todo(Exception):
    """todo"""


#WrapLinker wrappers

if 0:
    from ..gradient import numeric_grad

    def cmp_outputs(i, node, *thunks):
        """WrapLinker wrapper: raise an exception if outputs are different.

        numpy.ndarrays of floating point types are compared approximately,
        rather than exactly.
        """
        class MisMatch(Exception):
            """Output mismatch"""

        # define a comparison function, which works for all the variables
        # in a graph
        # TODO: consider factoring this out (and maybe passing args
        #       explicitly instead of by closure)
        def my_check_equal(x, y):
            if type(x) != type(y):
                raise MisMatch("Output type mismatch", (x, y))
            if hasattr(x, 'dtype'):
                # was: isinstance(x, numpy.ndarray), which doesn't
                # catch numpy.float64
                if x.dtype != y.dtype or x.shape != y.shape:
                    raise MisMatch("ndarray type/shape.", (x, y))
                if str(x.dtype).startswith('float'):
                    # otherwise we need to adjust our constant below...
                    # but to what?
                    assert str(x.dtype) == 'float64'
                    abs_rel_err = numeric_grad.abs_rel_err(x, y)
                    max_abs_rel_err = numpy.max(abs_rel_err)
                    if max_abs_rel_err > 1.0e-7:
                        raise MisMatch('max_abs_rel_err exceeds tolerance',
                                       (max_abs_rel_err, x, y))
                elif str(x.dtype).startswith('complex'):
                    raise Todo()
                else:
                    if not numpy.all(x == y):
                        raise MisMatch
            else:
                print 'wtf??', type(x), type(y), node.op
                if x != y:
                    print 'wow!! wtf??'
                    raise MisMatch("Output mismatch.", (x, y))

        # loop over all the thunks:
        # ensure that the outputs from the first thunk match the outputs
        # from all subsequent thunks
        n_thunks = len(thunks)
        if n_thunks > 1:
            th0 = thunks[0]
            for th in thunks[1:]:
                for out0, outN in zip(th0.outputs, th.outputs):
                    my_check_equal(out0[0], outN[0])


#TODO: better name for 'f'
def numpy_wrapper(f):
    def wrapper(i, node, *thunks):
        """WrapLinker wrapper: raise an exception if a NaN is found in
        outputs
        """
        for thunk in thunks:
            for output in thunk.outputs:
                if hasattr(output[0], 'dtype'):
                    if f(output[0]):
                        raise Exception('uh oh', (i, node, thunk, output[0]))
    return wrapper

numpy_any_isinf = numpy_wrapper(lambda a: numpy.any(numpy.isinf(a)))
numpy_any_isnan = numpy_wrapper(lambda a: numpy.any(numpy.isnan(a)))
numpy_notall_isfinite = numpy_wrapper(
        lambda a: not numpy.all(numpy.isfinite(a)))


def run_all(i, node, *thunks):
    """WrapLinker wrapper: run the thunks
    """
    for th in thunks:
        th()


def DualLinker(linkers):
    # still in sandbox pending ticket 247
    # when value_cmp is implemented, then cmp_outputs can be rewritten in a
    # solid way, and the DualLinker can be this simple.
    return WrapLinkerMany(linkers, [run_all, cmp_outputs])


####
#
# The Stats and Profiler classes used to be in gof/link.
# But Stats was not used I think, and Profiler has been implemented using
# the wraplinker.
#
# -JB20090119
###

class Stats:
    """WRITEME"""
    def __init__(self):
        self.ncalls = 0
        self.time = 0
        self.nfailures = 0
        self.time_failures = 0

    def inc_ncalls(self, v): self.ncalls += v
    def inc_time(self, v): self.time += v
    def inc_nfailures(self, v): self.nfailures += v
    def inc_time_failures(self, v): self.time_failures += v


class Profiler:
    """WRITEME
    Collects performance statistics on a function on a per-L{Op}
    or per-L{Op}-class basis.
    """
    def __init__(self, ignore=[], by_class=True):
        """
        Creates a L{Profiler}. If by_class is True, stats will
        be collected for each L{Op} class, adding the totals for
        each occurrence of that L{Op} in the computation. If
        by_class is False, each node will be timed individually.

        All L{Op} classes or L{Op}s (depending on the value of by_class)
        listed in ignore will not be timed.
        """
        self.ignore = ignore
        self.stats = {}
        self.by_class = by_class

    def profile_env(self, f, env):
        """WRITEME"""
        stats = self.stats.setdefault('TOTAL', Stats())
        n, t = stats.inc_ncalls, stats.inc_time
        failed = False
        start = time.time()
        try:
            f()
            end = time.time()
        except:
            end = time.time()
            n, t = stats.inc_nfailures, stats.inc_time_failures
            failed = True
            ety, eva, etr = sys.exc_info()
        n(1)
        t(end - start)
        if failed:
            raise ety, eva, etr

    def profile_op(self, f, op):
        """WRITEME"""
        if self.by_class:
            entry = op.__class__
        else:
            entry = op
        stats = self.stats.setdefault(entry, Stats())
        n, t = stats.inc_ncalls, stats.inc_time
        failed = False
        start = time.time()
        try:
            f()
            end = time.time()
        except:
            end = time.time()
            n, t = stats.inc_nfailures, stats.inc_time_failures
            failed = True
            exc = sys.exc_info()
        if entry not in self.ignore:
            n(1)
            t(end - start)
        if failed:
            raise_with_op(op, exc)

    def print_stats(self, sort_by='time'):
        """WRITEME"""
        def compare_fn((op1, stat1), (op2, stat2)):
            x1 = getattr(stat2, sort_by)
            x2 = getattr(stat1, sort_by)
            if x1 > x2:
                return 1
            elif x1 < x2:
                return -1
            else:
                return 0
        totals = self.stats['TOTAL']
        print 'CPU usage statistics'
        print "  %-25s %9s %12s %12s %12s" % (
                ("Op%s" % (self.by_class and ' class' or '')),
                'NCALLS', 'PER_CALL', 'TOTAL', 'CPU%')
        for op, stat in sorted(self.stats.items(), compare_fn):
            if op == 'TOTAL':
                continue
            to_print = (self.by_class
                        and (op.__module__ + "." + op.__name__)
                        or str(op))
            print "  %-25s %9i %12.5f %12.5f %12.5f" % (
                    to_print, stat.ncalls, stat.time / stat.ncalls,
                    stat.time, stat.time / totals.time)
        stat = self.stats['TOTAL']
        print "  %-25s %9i %12.5f %12.5f %12.5f" % (
                'TOTAL (includes overhead)', stat.ncalls,
                stat.time / stat.ncalls, stat.time,
                stat.time / totals.time)