提交 d65d9f79 authored 作者: james@X40's avatar james@X40

test_naacl passes

......@@ -193,3 +193,9 @@ How to reuse (overwrite) a storage tensor
``theano.compile.io.Out(gw1, borrow = True)`` for that value in
``compile.function``
=========================================
ProfileMode
=========================================
*** write up how to use it ***
......@@ -5,43 +5,49 @@
Theano
======
Theano is a Python library aiming to allow definition, optimization
and efficient evaluation of mathematical expressions involving
multi-dimensional arrays (though it may be extended to support many
other types). Theano melds some aspects of a computer algebra system
(CAS) with aspects of an optimizing compiler. This is particularly
useful in fields such as machine learning where complicated algorithms
must be run over large amounts of data.
Theano supports a wide range of numerical types in multiple
dimensions, a rapidly growing number of well-tested operations as well
as utilities to compute the gradient of an expression with respect to
another. Symbolic expressions may be compiled into functions, which
work merrily on the same data structures as numpy_, allowing for easy
interoperability.
Theano's compiler applies many optimizations of varying
complexity. These optimizations include, but are not limited to
constant folding, merging of similar subgraphs (to avoid calculating
the same values more than once), simple arithmetic simplification
(``x*y/x -> y``), inserting efficient BLAS_ operations and using
inplace operations wherever it is safe to do so. Theano also defines
several optimizations which improve the numerical stability of
computations and it provides a framework to add and test new
optimizers.
Theano was written at the LISA_ to support the development of
Theano is a Python library that allows you to definite, optimize, and
efficiently evaluate mathematical expressions involving multi-dimensional
arrays. It can be extended to support other types. Theano melds some
aspects of a computer algebra system (CAS) with aspects of an optimizing
compiler. It can even transform some or all of the expression into C code
and compile it into native machine instructions. This combination of CAS
with optimizing compilation is particularly useful for computational
fields in which complicated mathematical expressions are evaluated
numerous times over large data sets.
Theano was written at the LISA_ lab to support the development of
efficient machine learning algorithms while minimizing human
time. Theano was named after the `Greek mathematician`_ who may have
been Pythagoras' wife.
time. We use it especially in gradient-based learning techniques.
Theano supports a range of numerical types in multiple dimensions and
a number of well-tested operations. It also allows you to compute the
gradient of an expression with respect to another. Symbolic expressions
may be compiled into functions, which work on the same data structures
as numpy_, allowing for easy interoperability.
Theano's compiler applies many optimizations of varying complexity
to these symbolic expressions. These optimizations include, but are
not limited to:
* constant folding
* merging of similar subgraphs, to avoid calculating the same values more than once
* simple arithmetic simplification (``x*y/x -> y``)
* inserting efficient BLAS_ operations
* using inplace operations wherever it is safe to do so.
Theano defines several optimizations which improve the numerical
stability of computations. It also provides a framework to add and test
new optimizers.
Theano was named after the `Greek mathematician`_, who may have
been Pythagoras' wife.
Theano is released under a BSD license (:ref:`link <license>`)
Sneak peek
==========
Here's a very simple example of how to use Theano. It doesn't show
Here is a simple example of how to use Theano. It doesn't show
off many of Theano's features, but it illustrates concretely what
Theano is.
......@@ -66,9 +72,8 @@ Theano is.
Theano is not a programming language in the normal sense because you
write a program in Python that builds expressions for Theano. Still
it is like a programming language in the sense that to use theano, you
have to
write a program in Python that builds expressions for Theano. Still
it is like a programming language in the sense that you have to
- declare variables (``a,b``) and give their types
......@@ -77,8 +82,8 @@ have to
- compile expression graphs to functions in order to use them for computation.
It is good to think of ``theano.function`` as the interface to a
compiler which builds a callable object from a purely symbolic graph;
one of theano's most important features is that ``theano.function``
compiler which builds a callable object from a purely symbolic graph.
One of theano's most important features is that ``theano.function``
can optimize a graph and even compile some or all of it into native
machine instructions.
......@@ -95,18 +100,18 @@ package, so what does Theano do that Python and numpy do not?
parts your expression graph into native machine code, which runs
much faster than python.
- *symbolic differentiation*: Theano can convert a symbolic graph
build symbolic graphs for computing gradients.
- *symbolic differentiation*: Theano can automatic build symbolic graphs
for computing gradients.
- *stability optimizations*: Theano can recognize numerically unstable
expressions and compute them with more stable algorithms.
There also exists symbolic packages in Python, namely sympy_. Theano
is different from them in the sense that while it allows symbolic
manipulation it puts more emphasis on the evaluation of these
expressions and being able to repeatedly evaluate them on many
different sets of inputs. It is also better suited to handling very
large tensors which have no assumed structures.
There exist another symbolic package in Python, namely sympy_. Theano
is different from sympy in the sense that while Theano allows symbolic
manipulation it puts more emphasis on the evaluation of these expressions
and being able to repeatedly evaluate them on many different inputs. Theano
is also better suited to handling very large tensors which have no
assumed structures.
If numpy_ is to be compared to MATLAB_ and sympy_ to Mathematica_,
Theano is a sort of hybrid of the two which tries to make the best of
......@@ -145,10 +150,9 @@ issues that concern the end users.
Questions, comments, praise, criticism as well as bug reports should
be submitted to these mailing lists.
We welcome all kinds of contributions. Our `task list`_ is full of
interesting ideas awaiting a champion. If you have any questions
regarding how to extend Theano, please feel free to ask on the
theano-dev_ mailing list.
We welcome all kinds of contributions. If you have any questions
regarding how to extend Theano, please feel free to ask on the theano-dev_
mailing list.
......
......@@ -826,18 +826,15 @@ def default_initialize(self, init = {}, **kwinit):
for k, initv in dict(init, **kwinit).iteritems():
self[k] = initv
class ComponentDictInstance(CompositeInstance):
"""
ComponentDictInstance is meant to be instantiated by ComponentDict.
"""
class ComponentDictInstanceNoInit(CompositeInstance):
"""Component Instance that allows new items to be added"""
def __setitem__(self, item, value):
if item not in self.__items__:
# Set it if it's not there
# TODO: is this needed here? move to ModuleInstance?
self.__items__[item] = value
else:
super(ComponentDictInstance, self).__setitem__(item, value)
super(ComponentDictInstanceNoInit, self).__setitem__(item, value)
def __str__(self):
strings = []
......@@ -849,6 +846,12 @@ class ComponentDictInstance(CompositeInstance):
strings.append('%s%s' % (pre, str(v).replace('\n', '\n' + ' '*len(pre))))
return '{%s}' % '\n'.join(strings).replace('\n', '\n ')
class ComponentDictInstance(ComponentDictInstanceNoInit):
"""
ComponentDictInstance is meant to be instantiated by ComponentDict.
"""
def initialize(self, init={}, **kwinit):
for k, initv in dict(init, **kwinit).iteritems():
self[k] = initv
......@@ -990,7 +993,7 @@ class Curry:
self.meth = getattr(self.obj, self.name)
class ModuleInstance(ComponentDictInstance):
class ModuleInstance(ComponentDictInstanceNoInit):
"""
WRITEME
......@@ -1087,19 +1090,18 @@ class Module(ComponentDict):
if not isinstance(inst, ModuleInstance):
raise TypeError('The InstanceType of a Module should inherit from ModuleInstance',
(self, type(inst)))
print 'BUILD', self
for methodname in dir(self):
# Any method with a name like '_instance_XXX' is added to
# the object built under the name obj.XXX
if methodname.startswith('_instance_'):
print 'INSTALLING', inst, methodname
new_methodname = methodname[len('_instance_'):]
new_obj = Curry(self, methodname, inst)
# setattr doesn't work here because we overrode __setattr__
# setattr(inst, new_methodname, new_obj)
inst.__dict__[new_methodname] = new_obj
assert getattr(inst, new_methodname) == new_obj
#print 'ADDING METHOD', method, 'to', id(inst), new_methodname, getattr(inst, new_methodname)
if not hasattr(inst, new_methodname):
curried = Curry(self, methodname, inst)
# setattr doesn't work here because we overrode __setattr__
# setattr(inst, new_methodname, curried)
inst.__dict__[new_methodname] = curried
assert getattr(inst, new_methodname) == curried
#print 'ADDING METHOD', method, 'to', id(inst), new_methodname, getattr(inst, new_methodname)
return inst
def _instance_initialize(self, inst, init = {}, **kwinit):
......
......@@ -1305,14 +1305,26 @@ class test_matinv(unittest.TestCase):
ssd, gw = fn(x,w)
#print ssd, x*w, x, w
if i == 0:
str0 = str(ssd)
ssd0 = ssd
w -= 0.4 * gw
return str0, str(ssd)
return ssd0, ssd
def test_reciprocal(self):
"""Matrix reciprocal by gradient descent"""
self.assertEqual(('6.10141615619', '0.00703816291711'), self.mat_reciprocal(3))
ssd0,ssd = self.mat_reciprocal(3)
numpy.random.seed(unittest_tools.fetch_seed(1))
# hand-coded numpy implementation for verification
x = numpy.random.rand(3,3)+0.1
w = numpy.random.rand(3,3)
myssd0 = numpy.sum((x*w - numpy.ones((3,3)))**2.0)
for i in xrange(300):
gw = 2*(x*w - numpy.ones((3,3)))*x # derivative of dMSE/dw
myssd = numpy.sum((x*w - numpy.ones((3,3)))**2)
w -= 0.4 * gw
self.failUnlessAlmostEqual(ssd0, myssd0)
self.failUnlessAlmostEqual(ssd, myssd)
class t_dot(unittest.TestCase):
def setUp(self):
......
......@@ -179,6 +179,7 @@ class QuadraticDenoisingAA(module.Module):
#self.validate = theano.Method(self.input, [self.cost, self.output])
def _instance_initialize(self, obj, input_size, hidden_size, seed, lr, qfilter_relscale):
print 'QDAA init'
"""
qfilter_relscale is the initial range for any quadratic filters (relative to the linear
filter's initial range)
......@@ -326,9 +327,6 @@ class Module_Nclass(module.FancyModule):
class ConvolutionalMLPInstance(module.FancyModuleInstance, Loss01):
#initialize is called by Module.make
def initialize(self, input_size, input_representation_size, hidden_representation_size, output_size, lr, seed, noise_level, qfilter_relscale):
print 'INITIALIZING'
# ASK JAMES: Is the following necessary?
# super(ConvolutionalMLPInstance, self)._instance_initialize(obj, **kwargs)
R = N.random.RandomState(unittest_tools.fetch_seed(seed))
......@@ -341,19 +339,29 @@ class ConvolutionalMLPInstance(module.FancyModuleInstance, Loss01):
# for layer in obj.layers:
# if layer.lr is None:
# layer.lr = lr
assert self.input_representations[-1] is not self.input_representations[0]
assert self.input_representations[-1].w1 is self.input_representations[0].w1
for i in self.input_representations:
# i.initialize(input_size=self.input_size, hidden_size=self.input_representation_size, seed=R.random_integers(2**30), noise_level=noise_level, qfilter_relscale=qfilter_relscale)
i.initialize(input_size=self.input_size, hidden_size=self.input_representation_size, noise_level=noise_level, seed=R.random_integers(2**30), lr=lr, qfilter_relscale=qfilter_relscale)
i.initialize(input_size=self.input_size,
hidden_size=self.input_representation_size, noise_level=noise_level,
seed=int(R.random_integers(2**30)), lr=lr, qfilter_relscale=qfilter_relscale)
print type(i.w1)
assert isinstance(i.w1, N.ndarray)
for i in self.input_representations[1:]:
print type(i.w1)
assert isinstance(i.w1, N.ndarray)
assert (i.w1 == self.input_representations[0].w1).all()
assert (i.w2 == self.input_representations[0].w2).all()
assert (i.b1 == self.input_representations[0].b1).all()
assert (i.b2 == self.input_representations[0].b2).all()
assert all((a==b).all() for a, b in zip(i.qfilters, self.input_representations[0].qfilters))
self.hidden.initialize(input_size=(len(self.inputs) * self.input_representation_size), hidden_size=self.hidden_representation_size, noise_level=noise_level, seed=R.random_integers(2**30), lr=lr, qfilter_relscale=qfilter_relscale)
self.hidden.initialize(input_size=(len(self.inputs) * self.input_representation_size),
hidden_size=self.hidden_representation_size, noise_level=noise_level,
seed=int(R.random_integers(2**30)), lr=lr, qfilter_relscale=qfilter_relscale)
self.output.initialize(n_in=self.hidden_representation_size, n_out=self.output_size, lr=lr, seed=R.random_integers(2**30))
......@@ -401,6 +409,7 @@ class ConvolutionalMLP(module.FancyModule):
_qfilters = self.input_representations[0].qfilters
)
)
assert self.input_representations[-1].w1 is self.input_representations[0].w1
self.input_representation = T.concatenate([i.hidden for i in self.input_representations], axis=1)
self.hidden = QDAA(
......@@ -445,7 +454,7 @@ class ConvolutionalMLP(module.FancyModule):
finetuning_cost = self.output.cost
finetuning_gradients = T.grad(finetuning_cost, finetuning_params)
finetuning_updates = dict((p, p - self.lr * g) for p, g in zip(finetuning_params, finetuning_gradients))
###DEBUG: self.finetuning_update = module.Method(self.inputs + [self.targ], self.output.cost, finetuning_updates)
self.finetuning_update = module.Method(self.inputs + [self.targ], self.output.cost, finetuning_updates)
#self.validate = module.Method(self.inputs + [self.targ], [self.output.cost, self.output.argmax, self.output.max_pr])
#self.softmax_output = module.Method(self.inputs, self.output.softmax_unsupervised)
......@@ -537,8 +546,8 @@ def test_naacl_model(iters_per_unsup=10, iters_per_sup=10,
s0, s1 = [str(j) for j in m.pretraining_update(*inputs)]
print 'huh?', i, iters_per_unsup, iters_per_unsup * (i+1), s0, s1
if iters_per_unsup == 10:
assert s0.startswith('0.40218760858')
assert s1.startswith('0.074450801777')
assert s0.startswith('0.40304459240')
assert s1.startswith('0.074898707938')
print 'FINETUNING GRAPH'
print 'SUPERVISED PHASE COSTS (%s)'%optimizer
......@@ -548,9 +557,9 @@ def test_naacl_model(iters_per_unsup=10, iters_per_sup=10,
s0 = str(m.finetuning_update(*(inputs + [targets])))
print iters_per_sup * (i+1), s0
if iters_per_sup == 10:
assert s0.startswith('15.65127763')#should check for the 8 decimal only.
assert s0.startswith('15.65111049')#should check for the 8 decimal only.
if __name__ == '__main__':
def jtest_main():
from theano import gof
JTEST = theano.compile.mode.optdb.query(*sys.argv[2:])
print 'JTEST', JTEST
......@@ -558,3 +567,23 @@ if __name__ == '__main__':
optimizer = eval(sys.argv[1])
test_naacl_model(optimizer, 10, 10, realistic=False)
def real_main():
test_naacl_model()
def profile_main():
# This is the main function for profiling
# We've renamed our original main() above to real_main()
import cProfile, pstats, StringIO
prof = cProfile.Profile()
prof = prof.runctx("real_main()", globals(), locals())
stream = StringIO.StringIO()
stats = pstats.Stats(prof)
stats.sort_stats("time") # Or cumulative
stats.print_stats(80) # 80 = how many to print
# The rest is optional.
# stats.print_callees()
# stats.print_callers()
if __name__ == '__main__':
real_main()
#profile_main()
Markdown 格式
0%
您添加了 0 到此讨论。请谨慎行事。
请先完成此评论的编辑!
注册 或者 后发表评论