Commit d65d9f79 authored by james@X40

test_naacl passes

...@@ -193,3 +193,9 @@ How to reuse (overwrite) a storage tensor
``theano.compile.io.Out(gw1, borrow = True)`` for that value in
``compile.function``
=========================================
ProfileMode
=========================================
*** write up how to use it ***
...@@ -5,43 +5,49 @@
Theano
======
Theano is a Python library that allows you to define, optimize, and
efficiently evaluate mathematical expressions involving multi-dimensional
arrays. It can be extended to support other types. Theano melds some
aspects of a computer algebra system (CAS) with aspects of an optimizing
compiler. It can even transform some or all of an expression into C code
and compile it into native machine instructions. This combination of CAS
with optimizing compilation is particularly useful for computational
fields in which complicated mathematical expressions are evaluated
numerous times over large data sets.

Theano was written at the LISA_ lab to support the development of
efficient machine learning algorithms while minimizing human
time. We use it especially in gradient-based learning techniques.

Theano supports a range of numerical types in multiple dimensions and
a number of well-tested operations. It also allows you to compute the
gradient of an expression with respect to another. Symbolic expressions
may be compiled into functions, which work on the same data structures
as numpy_, allowing for easy interoperability.

Theano's compiler applies many optimizations of varying complexity
to these symbolic expressions. These optimizations include, but are
not limited to:

* constant folding
* merging of similar subgraphs, to avoid calculating the same values more than once
* simple arithmetic simplification (``x*y/x -> y``)
* inserting efficient BLAS_ operations
* using inplace operations wherever it is safe to do so

Theano defines several optimizations which improve the numerical
stability of computations. It also provides a framework to add and test
new optimizers.
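Two of the rewrites listed above, constant folding and the ``x*y/x -> y``
simplification, can be sketched on a toy expression tree. This is a
hypothetical stand-in for illustration, not Theano's actual graph-rewriting
machinery:

```python
# Toy symbolic expressions: numbers are constants, strings are
# variables, tuples are (op, left, right) nodes.
def simplify(e):
    if not isinstance(e, tuple):
        return e
    op, a, b = e
    a, b = simplify(a), simplify(b)
    # constant folding: both operands are known numbers
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return {'add': a + b, 'mul': a * b, 'div': a / b}[op]
    # arithmetic simplification: x*y/x -> y
    if op == 'div' and isinstance(a, tuple) and a[0] == 'mul':
        if a[1] == b:
            return a[2]
        if a[2] == b:
            return a[1]
    return (op, a, b)

print(simplify(('div', ('mul', 'x', 'y'), 'x')))  # -> y
print(simplify(('add', 2, 3)))                    # -> 5
```

A real rewriter works on a shared graph rather than a tree, which is what
makes the subgraph-merging optimization possible.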
Theano was named after the `Greek mathematician`_, who may have
been Pythagoras' wife.
Theano is released under a BSD license (:ref:`link <license>`)

Sneak peek
==========

Here is a simple example of how to use Theano. It doesn't show
off many of Theano's features, but it illustrates concretely what
Theano is.
...@@ -66,9 +72,8 @@ Theano is.
Theano is not a programming language in the normal sense because you
write a program in Python that builds expressions for Theano. Still,
it is like a programming language in the sense that you have to

- declare variables (``a,b``) and give their types
...@@ -77,8 +82,8 @@ have to
- compile expression graphs to functions in order to use them for computation.

It is good to think of ``theano.function`` as the interface to a
compiler which builds a callable object from a purely symbolic graph.
One of Theano's most important features is that ``theano.function``
can optimize a graph and even compile some or all of it into native
machine instructions.
...@@ -95,18 +100,18 @@ package, so what does Theano do that Python and numpy do not?
  parts of your expression graph into native machine code, which runs
  much faster than Python.

- *symbolic differentiation*: Theano can automatically build symbolic
  graphs for computing gradients.

- *stability optimizations*: Theano can recognize numerically unstable
  expressions and compute them with more stable algorithms.
There exists another symbolic package in Python, namely sympy_. Theano
is different from sympy in the sense that while Theano allows symbolic
manipulation, it puts more emphasis on the evaluation of these expressions
and on being able to repeatedly evaluate them on many different inputs. Theano
is also better suited to handling very large tensors which have no
assumed structure.

If numpy_ is to be compared to MATLAB_ and sympy_ to Mathematica_,
Theano is a sort of hybrid of the two which tries to make the best of
...@@ -145,10 +150,9 @@ issues that concern the end users.
Questions, comments, praise, criticism as well as bug reports should
be submitted to these mailing lists.

We welcome all kinds of contributions. If you have any questions
regarding how to extend Theano, please feel free to ask on the theano-dev_
mailing list.
...
...@@ -826,18 +826,15 @@ def default_initialize(self, init = {}, **kwinit):
    for k, initv in dict(init, **kwinit).iteritems():
        self[k] = initv

class ComponentDictInstanceNoInit(CompositeInstance):
    """Component Instance that allows new items to be added"""

    def __setitem__(self, item, value):
        if item not in self.__items__:
            # Set it if it's not there
            # TODO: is this needed here? move to ModuleInstance?
            self.__items__[item] = value
        else:
            super(ComponentDictInstanceNoInit, self).__setitem__(item, value)

    def __str__(self):
        strings = []
...@@ -849,6 +846,12 @@ class ComponentDictInstance(CompositeInstance):
            strings.append('%s%s' % (pre, str(v).replace('\n', '\n' + ' '*len(pre))))
        return '{%s}' % '\n'.join(strings).replace('\n', '\n ')

class ComponentDictInstance(ComponentDictInstanceNoInit):
    """
    ComponentDictInstance is meant to be instantiated by ComponentDict.
    """

    def initialize(self, init={}, **kwinit):
        for k, initv in dict(init, **kwinit).iteritems():
            self[k] = initv
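The ``initialize`` idiom above merges a positional dict with keyword
arguments via ``dict(init, **kwinit)``; ``iteritems()`` is Python 2. A
standalone Python 3 sketch of the same pattern, with ``store`` as a
hypothetical target mapping:

```python
def initialize(store, init={}, **kwinit):
    # merge the positional dict with keyword overrides, then assign each key
    for k, initv in dict(init, **kwinit).items():
        store[k] = initv

d = {}
initialize(d, {'a': 1}, b=2)
print(d)  # -> {'a': 1, 'b': 2}
```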
...@@ -990,7 +993,7 @@ class Curry:
        self.meth = getattr(self.obj, self.name)

class ModuleInstance(ComponentDictInstanceNoInit):
    """
    WRITEME
...@@ -1087,19 +1090,18 @@ class Module(ComponentDict):
        if not isinstance(inst, ModuleInstance):
            raise TypeError('The InstanceType of a Module should inherit from ModuleInstance',
                    (self, type(inst)))
        for methodname in dir(self):
            # Any method with a name like '_instance_XXX' is added to
            # the object built under the name obj.XXX
            if methodname.startswith('_instance_'):
                new_methodname = methodname[len('_instance_'):]
                if not hasattr(inst, new_methodname):
                    curried = Curry(self, methodname, inst)
                    # setattr doesn't work here because we overrode __setattr__
                    # setattr(inst, new_methodname, curried)
                    inst.__dict__[new_methodname] = curried
                    assert getattr(inst, new_methodname) == curried
                    #print 'ADDING METHOD', method, 'to', id(inst), new_methodname, getattr(inst, new_methodname)
        return inst
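The install loop above can be sketched in isolation, with
``functools.partial`` standing in for ``Curry`` and hypothetical
``Factory``/``Instance`` classes: any ``_instance_XXX`` method found on the
factory is installed on the built object under the name ``XXX``, with the
instance curried in as a leading argument, unless the instance already
provides that attribute.

```python
import functools

class Factory(object):
    def _instance_greet(self, inst, name):
        return 'hello ' + name

class Instance(object):
    pass

def build(factory, inst):
    for methodname in dir(factory):
        if methodname.startswith('_instance_'):
            new_methodname = methodname[len('_instance_'):]
            if not hasattr(inst, new_methodname):
                # partial(bound_method, inst) plays the role of Curry
                inst.__dict__[new_methodname] = functools.partial(
                        getattr(factory, methodname), inst)
    return inst

inst = build(Factory(), Instance())
print(inst.greet('world'))  # -> hello world
```

Assigning into ``inst.__dict__`` directly mirrors the original code, which
cannot use ``setattr`` because ``__setattr__`` is overridden.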
    def _instance_initialize(self, inst, init = {}, **kwinit):

...
...@@ -1305,14 +1305,26 @@ class test_matinv(unittest.TestCase):
            ssd, gw = fn(x,w)
            #print ssd, x*w, x, w
            if i == 0:
                ssd0 = ssd
            w -= 0.4 * gw
        return ssd0, ssd

    def test_reciprocal(self):
        """Matrix reciprocal by gradient descent"""
        ssd0, ssd = self.mat_reciprocal(3)

        numpy.random.seed(unittest_tools.fetch_seed(1))
        # hand-coded numpy implementation for verification
        x = numpy.random.rand(3,3)+0.1
        w = numpy.random.rand(3,3)
        myssd0 = numpy.sum((x*w - numpy.ones((3,3)))**2.0)
        for i in xrange(300):
            gw = 2*(x*w - numpy.ones((3,3)))*x  # dMSE/dw
            myssd = numpy.sum((x*w - numpy.ones((3,3)))**2)
            w -= 0.4 * gw
        self.failUnlessAlmostEqual(ssd0, myssd0)
        self.failUnlessAlmostEqual(ssd, myssd)
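The hand-coded numpy check above also runs standalone. A Python 3 sketch of
the same elementwise "matrix reciprocal" descent on ``sum((x*w - 1)**2)``,
using an arbitrary seed of 42 (the test itself seeds via
``unittest_tools.fetch_seed(1)``):

```python
import numpy

rng = numpy.random.RandomState(42)  # arbitrary seed for this sketch
x = rng.rand(3, 3) + 0.1
w = rng.rand(3, 3)
ones = numpy.ones((3, 3))
ssd0 = numpy.sum((x * w - ones) ** 2)
for i in range(300):
    gw = 2 * (x * w - ones) * x  # dMSE/dw
    w -= 0.4 * gw
ssd = numpy.sum((x * w - ones) ** 2)
# each element's error shrinks by a factor (1 - 0.8*x**2), which lies in
# (0, 1) for x in (0.1, 1.1), so the descent reduces the cost
assert ssd < ssd0
```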
class t_dot(unittest.TestCase):
    def setUp(self):

...
...@@ -179,6 +179,7 @@ class QuadraticDenoisingAA(module.Module):
        #self.validate = theano.Method(self.input, [self.cost, self.output])

    def _instance_initialize(self, obj, input_size, hidden_size, seed, lr, qfilter_relscale):
        print 'QDAA init'
        """
        qfilter_relscale is the initial range for any quadratic filters (relative to the linear
        filter's initial range)
...@@ -326,9 +327,6 @@ class Module_Nclass(module.FancyModule):
class ConvolutionalMLPInstance(module.FancyModuleInstance, Loss01):
    #initialize is called by Module.make
    def initialize(self, input_size, input_representation_size, hidden_representation_size, output_size, lr, seed, noise_level, qfilter_relscale):
        R = N.random.RandomState(unittest_tools.fetch_seed(seed))
...@@ -341,19 +339,29 @@ class ConvolutionalMLPInstance(module.FancyModuleInstance, Loss01):
        # for layer in obj.layers:
        #    if layer.lr is None:
        #        layer.lr = lr
        assert self.input_representations[-1] is not self.input_representations[0]
        assert self.input_representations[-1].w1 is self.input_representations[0].w1
        for i in self.input_representations:
            # i.initialize(input_size=self.input_size, hidden_size=self.input_representation_size, seed=R.random_integers(2**30), noise_level=noise_level, qfilter_relscale=qfilter_relscale)
            i.initialize(input_size=self.input_size,
                    hidden_size=self.input_representation_size, noise_level=noise_level,
                    seed=int(R.random_integers(2**30)), lr=lr, qfilter_relscale=qfilter_relscale)
            print type(i.w1)
            assert isinstance(i.w1, N.ndarray)
        for i in self.input_representations[1:]:
            print type(i.w1)
            assert isinstance(i.w1, N.ndarray)
            assert (i.w1 == self.input_representations[0].w1).all()
            assert (i.w2 == self.input_representations[0].w2).all()
            assert (i.b1 == self.input_representations[0].b1).all()
            assert (i.b2 == self.input_representations[0].b2).all()
            assert all((a==b).all() for a, b in zip(i.qfilters, self.input_representations[0].qfilters))
        self.hidden.initialize(input_size=(len(self.inputs) * self.input_representation_size),
                hidden_size=self.hidden_representation_size, noise_level=noise_level,
                seed=int(R.random_integers(2**30)), lr=lr, qfilter_relscale=qfilter_relscale)
        self.output.initialize(n_in=self.hidden_representation_size, n_out=self.output_size, lr=lr, seed=R.random_integers(2**30))
...@@ -401,6 +409,7 @@ class ConvolutionalMLP(module.FancyModule):
                _qfilters = self.input_representations[0].qfilters
                )
            )
        assert self.input_representations[-1].w1 is self.input_representations[0].w1
        self.input_representation = T.concatenate([i.hidden for i in self.input_representations], axis=1)
        self.hidden = QDAA(
...@@ -445,7 +454,7 @@ class ConvolutionalMLP(module.FancyModule):
        finetuning_cost = self.output.cost
        finetuning_gradients = T.grad(finetuning_cost, finetuning_params)
        finetuning_updates = dict((p, p - self.lr * g) for p, g in zip(finetuning_params, finetuning_gradients))
        self.finetuning_update = module.Method(self.inputs + [self.targ], self.output.cost, finetuning_updates)
        #self.validate = module.Method(self.inputs + [self.targ], [self.output.cost, self.output.argmax, self.output.max_pr])
        #self.softmax_output = module.Method(self.inputs, self.output.softmax_unsupervised)
...@@ -537,8 +546,8 @@ def test_naacl_model(iters_per_unsup=10, iters_per_sup=10,
        s0, s1 = [str(j) for j in m.pretraining_update(*inputs)]
        print 'huh?', i, iters_per_unsup, iters_per_unsup * (i+1), s0, s1
        if iters_per_unsup == 10:
            assert s0.startswith('0.40304459240')
            assert s1.startswith('0.074898707938')

    print 'FINETUNING GRAPH'
    print 'SUPERVISED PHASE COSTS (%s)'%optimizer
...@@ -548,9 +557,9 @@ def test_naacl_model(iters_per_unsup=10, iters_per_sup=10,
        s0 = str(m.finetuning_update(*(inputs + [targets])))
        print iters_per_sup * (i+1), s0
        if iters_per_sup == 10:
            assert s0.startswith('15.65111049')  # should check the first 8 decimals only

def jtest_main():
    from theano import gof
    JTEST = theano.compile.mode.optdb.query(*sys.argv[2:])
    print 'JTEST', JTEST
...@@ -558,3 +567,23 @@ if __name__ == '__main__':
    optimizer = eval(sys.argv[1])
    test_naacl_model(optimizer, 10, 10, realistic=False)

def real_main():
    test_naacl_model()

def profile_main():
    # This is the main function for profiling.
    # We've renamed our original main() above to real_main().
    import cProfile, pstats, StringIO
    prof = cProfile.Profile()
    prof = prof.runctx("real_main()", globals(), locals())
    stream = StringIO.StringIO()
    stats = pstats.Stats(prof)
    stats.sort_stats("time")  # or "cumulative"
    stats.print_stats(80)  # 80 = how many entries to print
    # The rest is optional:
    # stats.print_callees()
    # stats.print_callers()

if __name__ == '__main__':
    real_main()
    #profile_main()
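``profile_main`` above uses Python 2 idioms (the ``StringIO`` module,
``print`` statements). A Python 3 sketch of the same cProfile/pstats
pattern, profiling a stand-in ``workload`` instead of ``real_main()``:

```python
import cProfile
import io
import pstats

def workload():  # stand-in for real_main()
    return sum(i * i for i in range(100000))

prof = cProfile.Profile()
prof.runctx("workload()", globals(), locals())
# capture the report in a string instead of printing to stdout
stream = io.StringIO()
stats = pstats.Stats(prof, stream=stream)
stats.sort_stats("time")  # or "cumulative"
stats.print_stats(80)     # number of entries to print
print(stream.getvalue())
```

Passing ``stream=`` to ``pstats.Stats`` is what actually routes the report
into the ``StringIO`` buffer; in the Python 2 version above the buffer is
created but never wired in, so the report goes to stdout.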