Commit d65d9f79 authored by james@X40

test_naacl passes

...@@ -193,3 +193,9 @@ How to reuse (overwrite) a storage tensor
``theano.compile.io.Out(gw1, borrow = True)`` for that value in
``compile.function``
=========================================
ProfileMode
=========================================
*** write up how to use it ***
...@@ -5,43 +5,49 @@
Theano
======
Theano is a Python library that allows you to define, optimize, and
efficiently evaluate mathematical expressions involving multi-dimensional
arrays. It can be extended to support other types. Theano melds some
aspects of a computer algebra system (CAS) with aspects of an optimizing
compiler. It can even transform some or all of an expression into C code
and compile it into native machine instructions. This combination of CAS
with optimizing compilation is particularly useful for computational
fields in which complicated mathematical expressions are evaluated
numerous times over large data sets.

Theano was written at the LISA_ lab to support the development of
efficient machine learning algorithms while minimizing human
time. We use it especially in gradient-based learning techniques.

Theano supports a range of numerical types in multiple dimensions and
a number of well-tested operations. It also allows you to compute the
gradient of an expression with respect to another. Symbolic expressions
may be compiled into functions, which work on the same data structures
as numpy_, allowing for easy interoperability.

Theano's compiler applies many optimizations of varying complexity
to these symbolic expressions. These optimizations include, but are
not limited to:

* constant folding
* merging of similar subgraphs, to avoid calculating the same values more than once
* simple arithmetic simplification (``x*y/x -> y``)
* inserting efficient BLAS_ operations
* using inplace operations wherever it is safe to do so

Theano defines several optimizations which improve the numerical
stability of computations. It also provides a framework to add and test
new optimizers.
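Two of the rewrites listed above, constant folding and the ``x*y/x -> y``
simplification, can be sketched on a toy expression tree. This is a
hypothetical stand-in for illustration, not Theano's actual graph-rewriting
machinery:

```python
# Toy symbolic expressions: numbers are constants, strings are
# variables, tuples are (op, left, right) nodes.
def simplify(e):
    if not isinstance(e, tuple):
        return e
    op, a, b = e
    a, b = simplify(a), simplify(b)
    # constant folding: both operands are known numbers
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return {'add': a + b, 'mul': a * b, 'div': a / b}[op]
    # arithmetic simplification: x*y/x -> y
    if op == 'div' and isinstance(a, tuple) and a[0] == 'mul':
        if a[1] == b:
            return a[2]
        if a[2] == b:
            return a[1]
    return (op, a, b)

print(simplify(('div', ('mul', 'x', 'y'), 'x')))  # -> y
print(simplify(('add', 2, 3)))                    # -> 5
```

A real rewriter works on a shared graph rather than a tree, which is what
makes the subgraph-merging optimization possible.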
Theano was named after the `Greek mathematician`_, who may have
been Pythagoras' wife.
Theano is released under a BSD license (:ref:`link <license>`)

Sneak peek
==========

Here is a simple example of how to use Theano. It doesn't show
off many of Theano's features, but it illustrates concretely what
Theano is.
...@@ -66,9 +72,8 @@ Theano is.
Theano is not a programming language in the normal sense because you
write a program in Python that builds expressions for Theano. Still,
it is like a programming language in the sense that you have to

- declare variables (``a,b``) and give their types
...@@ -77,8 +82,8 @@ have to
- compile expression graphs to functions in order to use them for computation.

It is good to think of ``theano.function`` as the interface to a
compiler which builds a callable object from a purely symbolic graph.
One of Theano's most important features is that ``theano.function``
can optimize a graph and even compile some or all of it into native
machine instructions.
...@@ -95,18 +100,18 @@ package, so what does Theano do that Python and numpy do not?
  parts of your expression graph into native machine code, which runs
  much faster than Python.

- *symbolic differentiation*: Theano can automatically build symbolic
  graphs for computing gradients.

- *stability optimizations*: Theano can recognize numerically unstable
  expressions and compute them with more stable algorithms.
There exists another symbolic package in Python, namely sympy_. Theano
is different from sympy in the sense that while Theano allows symbolic
manipulation, it puts more emphasis on the evaluation of these expressions
and on being able to repeatedly evaluate them on many different inputs. Theano
is also better suited to handling very large tensors which have no
assumed structure.

If numpy_ is to be compared to MATLAB_ and sympy_ to Mathematica_,
Theano is a sort of hybrid of the two which tries to make the best of
...@@ -145,10 +150,9 @@ issues that concern the end users.
Questions, comments, praise, criticism as well as bug reports should
be submitted to these mailing lists.

We welcome all kinds of contributions. If you have any questions
regarding how to extend Theano, please feel free to ask on the theano-dev_
mailing list.
...
...@@ -826,18 +826,15 @@ def default_initialize(self, init = {}, **kwinit):
    for k, initv in dict(init, **kwinit).iteritems():
        self[k] = initv

class ComponentDictInstanceNoInit(CompositeInstance):
    """Component Instance that allows new items to be added"""

    def __setitem__(self, item, value):
        if item not in self.__items__:
            # Set it if it's not there
            # TODO: is this needed here? move to ModuleInstance?
            self.__items__[item] = value
        else:
            super(ComponentDictInstanceNoInit, self).__setitem__(item, value)

    def __str__(self):
        strings = []
...@@ -849,6 +846,12 @@ class ComponentDictInstance(CompositeInstance):
            strings.append('%s%s' % (pre, str(v).replace('\n', '\n' + ' '*len(pre))))
        return '{%s}' % '\n'.join(strings).replace('\n', '\n ')

class ComponentDictInstance(ComponentDictInstanceNoInit):
    """
    ComponentDictInstance is meant to be instantiated by ComponentDict.
    """

    def initialize(self, init={}, **kwinit):
        for k, initv in dict(init, **kwinit).iteritems():
            self[k] = initv
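The ``initialize`` idiom above merges a positional dict with keyword
arguments via ``dict(init, **kwinit)``; ``iteritems()`` is Python 2. A
standalone Python 3 sketch of the same pattern, with ``store`` as a
hypothetical target mapping:

```python
def initialize(store, init={}, **kwinit):
    # merge the positional dict with keyword overrides, then assign each key
    for k, initv in dict(init, **kwinit).items():
        store[k] = initv

d = {}
initialize(d, {'a': 1}, b=2)
print(d)  # -> {'a': 1, 'b': 2}
```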
...@@ -990,7 +993,7 @@ class Curry:
        self.meth = getattr(self.obj, self.name)

class ModuleInstance(ComponentDictInstanceNoInit):
    """
    WRITEME
...@@ -1087,19 +1090,18 @@ class Module(ComponentDict):
        if not isinstance(inst, ModuleInstance):
            raise TypeError('The InstanceType of a Module should inherit from ModuleInstance',
                    (self, type(inst)))
        for methodname in dir(self):
            # Any method with a name like '_instance_XXX' is added to
            # the object built under the name obj.XXX
            if methodname.startswith('_instance_'):
                new_methodname = methodname[len('_instance_'):]
                if not hasattr(inst, new_methodname):
                    curried = Curry(self, methodname, inst)
                    # setattr doesn't work here because we overrode __setattr__
                    # setattr(inst, new_methodname, curried)
                    inst.__dict__[new_methodname] = curried
                    assert getattr(inst, new_methodname) == curried
                    #print 'ADDING METHOD', method, 'to', id(inst), new_methodname, getattr(inst, new_methodname)
        return inst
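The install loop above can be sketched in isolation, with
``functools.partial`` standing in for ``Curry`` and hypothetical
``Factory``/``Instance`` classes: any ``_instance_XXX`` method found on the
factory is installed on the built object under the name ``XXX``, with the
instance curried in as a leading argument, unless the instance already
provides that attribute.

```python
import functools

class Factory(object):
    def _instance_greet(self, inst, name):
        return 'hello ' + name

class Instance(object):
    pass

def build(factory, inst):
    for methodname in dir(factory):
        if methodname.startswith('_instance_'):
            new_methodname = methodname[len('_instance_'):]
            if not hasattr(inst, new_methodname):
                # partial(bound_method, inst) plays the role of Curry
                inst.__dict__[new_methodname] = functools.partial(
                        getattr(factory, methodname), inst)
    return inst

inst = build(Factory(), Instance())
print(inst.greet('world'))  # -> hello world
```

Assigning into ``inst.__dict__`` directly mirrors the original code, which
cannot use ``setattr`` because ``__setattr__`` is overridden.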
    def _instance_initialize(self, inst, init = {}, **kwinit):

...
...@@ -1305,14 +1305,26 @@ class test_matinv(unittest.TestCase):
            ssd, gw = fn(x,w)
            #print ssd, x*w, x, w
            if i == 0:
                ssd0 = ssd
            w -= 0.4 * gw
        return ssd0, ssd

    def test_reciprocal(self):
        """Matrix reciprocal by gradient descent"""
        ssd0, ssd = self.mat_reciprocal(3)

        numpy.random.seed(unittest_tools.fetch_seed(1))
        # hand-coded numpy implementation for verification
        x = numpy.random.rand(3,3)+0.1
        w = numpy.random.rand(3,3)
        myssd0 = numpy.sum((x*w - numpy.ones((3,3)))**2.0)
        for i in xrange(300):
            gw = 2*(x*w - numpy.ones((3,3)))*x  # dMSE/dw
            myssd = numpy.sum((x*w - numpy.ones((3,3)))**2)
            w -= 0.4 * gw
        self.failUnlessAlmostEqual(ssd0, myssd0)
        self.failUnlessAlmostEqual(ssd, myssd)
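The hand-coded numpy check above also runs standalone. A Python 3 sketch of
the same elementwise "matrix reciprocal" descent on ``sum((x*w - 1)**2)``,
using an arbitrary seed of 42 (the test itself seeds via
``unittest_tools.fetch_seed(1)``):

```python
import numpy

rng = numpy.random.RandomState(42)  # arbitrary seed for this sketch
x = rng.rand(3, 3) + 0.1
w = rng.rand(3, 3)
ones = numpy.ones((3, 3))
ssd0 = numpy.sum((x * w - ones) ** 2)
for i in range(300):
    gw = 2 * (x * w - ones) * x  # dMSE/dw
    w -= 0.4 * gw
ssd = numpy.sum((x * w - ones) ** 2)
# each element's error shrinks by a factor (1 - 0.8*x**2), which lies in
# (0, 1) for x in (0.1, 1.1), so the descent reduces the cost
assert ssd < ssd0
```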
class t_dot(unittest.TestCase):
    def setUp(self):

...
...@@ -179,6 +179,7 @@ class QuadraticDenoisingAA(module.Module):
        #self.validate = theano.Method(self.input, [self.cost, self.output])

    def _instance_initialize(self, obj, input_size, hidden_size, seed, lr, qfilter_relscale):
        print 'QDAA init'
        """
        qfilter_relscale is the initial range for any quadratic filters (relative to the linear
        filter's initial range)
...@@ -326,9 +327,6 @@ class Module_Nclass(module.FancyModule):
class ConvolutionalMLPInstance(module.FancyModuleInstance, Loss01):
    #initialize is called by Module.make
    def initialize(self, input_size, input_representation_size, hidden_representation_size, output_size, lr, seed, noise_level, qfilter_relscale):
        R = N.random.RandomState(unittest_tools.fetch_seed(seed))
...@@ -341,19 +339,29 @@ class ConvolutionalMLPInstance(module.FancyModuleInstance, Loss01):
        # for layer in obj.layers:
        #    if layer.lr is None:
        #        layer.lr = lr
        assert self.input_representations[-1] is not self.input_representations[0]
        assert self.input_representations[-1].w1 is self.input_representations[0].w1
        for i in self.input_representations:
            # i.initialize(input_size=self.input_size, hidden_size=self.input_representation_size, seed=R.random_integers(2**30), noise_level=noise_level, qfilter_relscale=qfilter_relscale)
            i.initialize(input_size=self.input_size,
                    hidden_size=self.input_representation_size, noise_level=noise_level,
                    seed=int(R.random_integers(2**30)), lr=lr, qfilter_relscale=qfilter_relscale)
            print type(i.w1)
            assert isinstance(i.w1, N.ndarray)
        for i in self.input_representations[1:]:
            print type(i.w1)
            assert isinstance(i.w1, N.ndarray)
            assert (i.w1 == self.input_representations[0].w1).all()
            assert (i.w2 == self.input_representations[0].w2).all()
            assert (i.b1 == self.input_representations[0].b1).all()
            assert (i.b2 == self.input_representations[0].b2).all()
            assert all((a==b).all() for a, b in zip(i.qfilters, self.input_representations[0].qfilters))
        self.hidden.initialize(input_size=(len(self.inputs) * self.input_representation_size),
                hidden_size=self.hidden_representation_size, noise_level=noise_level,
                seed=int(R.random_integers(2**30)), lr=lr, qfilter_relscale=qfilter_relscale)
        self.output.initialize(n_in=self.hidden_representation_size, n_out=self.output_size, lr=lr, seed=R.random_integers(2**30))
...@@ -401,6 +409,7 @@ class ConvolutionalMLP(module.FancyModule):
                _qfilters = self.input_representations[0].qfilters
                )
            )
        assert self.input_representations[-1].w1 is self.input_representations[0].w1
        self.input_representation = T.concatenate([i.hidden for i in self.input_representations], axis=1)
        self.hidden = QDAA(
...@@ -445,7 +454,7 @@ class ConvolutionalMLP(module.FancyModule):
        finetuning_cost = self.output.cost
        finetuning_gradients = T.grad(finetuning_cost, finetuning_params)
        finetuning_updates = dict((p, p - self.lr * g) for p, g in zip(finetuning_params, finetuning_gradients))
        self.finetuning_update = module.Method(self.inputs + [self.targ], self.output.cost, finetuning_updates)
        #self.validate = module.Method(self.inputs + [self.targ], [self.output.cost, self.output.argmax, self.output.max_pr])
        #self.softmax_output = module.Method(self.inputs, self.output.softmax_unsupervised)
...@@ -537,8 +546,8 @@ def test_naacl_model(iters_per_unsup=10, iters_per_sup=10,
        s0, s1 = [str(j) for j in m.pretraining_update(*inputs)]
        print 'huh?', i, iters_per_unsup, iters_per_unsup * (i+1), s0, s1
        if iters_per_unsup == 10:
            assert s0.startswith('0.40304459240')
            assert s1.startswith('0.074898707938')

    print 'FINETUNING GRAPH'
    print 'SUPERVISED PHASE COSTS (%s)'%optimizer
...@@ -548,9 +557,9 @@ def test_naacl_model(iters_per_unsup=10, iters_per_sup=10,
        s0 = str(m.finetuning_update(*(inputs + [targets])))
        print iters_per_sup * (i+1), s0
        if iters_per_sup == 10:
            assert s0.startswith('15.65111049')  # should check the first 8 decimals only

def jtest_main():
    from theano import gof
    JTEST = theano.compile.mode.optdb.query(*sys.argv[2:])
    print 'JTEST', JTEST
...@@ -558,3 +567,23 @@ if __name__ == '__main__':
    optimizer = eval(sys.argv[1])
    test_naacl_model(optimizer, 10, 10, realistic=False)

def real_main():
    test_naacl_model()

def profile_main():
    # This is the main function for profiling.
    # We've renamed our original main() above to real_main().
    import cProfile, pstats, StringIO
    prof = cProfile.Profile()
    prof = prof.runctx("real_main()", globals(), locals())
    stream = StringIO.StringIO()
    stats = pstats.Stats(prof)
    stats.sort_stats("time")  # or "cumulative"
    stats.print_stats(80)  # 80 = how many entries to print
    # The rest is optional:
    # stats.print_callees()
    # stats.print_callers()

if __name__ == '__main__':
    real_main()
    #profile_main()
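``profile_main`` above uses Python 2 idioms (the ``StringIO`` module,
``print`` statements). A Python 3 sketch of the same cProfile/pstats
pattern, profiling a stand-in ``workload`` instead of ``real_main()``:

```python
import cProfile
import io
import pstats

def workload():  # stand-in for real_main()
    return sum(i * i for i in range(100000))

prof = cProfile.Profile()
prof.runctx("workload()", globals(), locals())
# capture the report in a string instead of printing to stdout
stream = io.StringIO()
stats = pstats.Stats(prof, stream=stream)
stats.sort_stats("time")  # or "cumulative"
stats.print_stats(80)     # number of entries to print
print(stream.getvalue())
```

Passing ``stream=`` to ``pstats.Stats`` is what actually routes the report
into the ``StringIO`` buffer; in the Python 2 version above the buffer is
created but never wired in, so the report goes to stdout.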