Commit 09bce224 authored by Olivier Breuleux

merge

@@ -193,3 +193,9 @@ How to reuse (overwrite) a storage tensor

``theano.compile.io.Out(gw1, borrow = True)`` for that value in
``compile.function``
=========================================
ProfileMode
=========================================
*** write up how to use it ***
@@ -5,43 +5,49 @@

Theano
======
Theano is a Python library that allows you to define, optimize, and
efficiently evaluate mathematical expressions involving multi-dimensional
arrays. It can be extended to support other types. Theano melds some
aspects of a computer algebra system (CAS) with aspects of an optimizing
compiler. It can even transform some or all of the expression into C code
and compile it into native machine instructions. This combination of CAS
with optimizing compilation is particularly useful for computational
fields in which complicated mathematical expressions are evaluated
numerous times over large data sets.

Theano was written at the LISA_ lab to support the development of
efficient machine learning algorithms while minimizing human
time. We use it especially in gradient-based learning techniques.
Theano supports a range of numerical types in multiple dimensions and
a number of well-tested operations. It also allows you to compute the
gradient of an expression with respect to another. Symbolic expressions
may be compiled into functions, which work on the same data structures
as numpy_, allowing for easy interoperability.
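What "the gradient of an expression with respect to another" buys you can be pictured with a small plain-numpy check (an illustration only, not Theano's API): for a cost like ``sum((x*w - 1)**2)``, which also appears in the ``mat_reciprocal`` test later in this commit, the symbolically derived gradient with respect to ``w`` is ``2*(x*w - 1)*x``, and a finite-difference estimate confirms it.

```python
import numpy as np

# Plain-numpy sketch (not Theano code): compare the hand-derived
# symbolic gradient of cost = sum((x*w - 1)**2) w.r.t. w against a
# central finite-difference estimate.

def cost(x, w):
    return np.sum((x * w - 1.0) ** 2)

def grad_w(x, w):
    # symbolic derivative: d/dw_ij sum((x*w - 1)^2) = 2*(x_ij*w_ij - 1)*x_ij
    return 2.0 * (x * w - 1.0) * x

x = np.array([[0.5, 2.0], [1.5, 0.2]])
w = np.array([[0.1, 0.3], [0.7, 0.9]])

eps = 1e-6
approx = np.zeros_like(w)
for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        wp = w.copy(); wp[i, j] += eps
        wm = w.copy(); wm[i, j] -= eps
        approx[i, j] = (cost(x, wp) - cost(x, wm)) / (2 * eps)

assert np.allclose(grad_w(x, w), approx, atol=1e-4)
```

Theano performs the symbolic-differentiation step automatically, so the hand derivation above is exactly the work it saves you.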
Theano's compiler applies many optimizations of varying complexity
to these symbolic expressions. These optimizations include, but are
not limited to:
* constant folding
* merging of similar subgraphs, to avoid calculating the same values more than once
* simple arithmetic simplification (``x*y/x -> y``)
* inserting efficient BLAS_ operations
* using inplace operations wherever it is safe to do so.
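Why a rewrite as trivial as ``x*y/x -> y`` matters can be seen in plain numpy (an illustration of the idea, not Theano code): evaluating the unsimplified form literally can overflow even though the simplified form is exact.

```python
import numpy as np

# Plain-numpy illustration (not Theano code): the literal evaluation of
# x*y/x overflows to inf for large operands, while the symbolically
# simplified form y is exact.
x = np.float64(1e200)
y = np.float64(1e200)

with np.errstate(over='ignore'):
    literal = (x * y) / x   # x*y overflows to inf, so the quotient is inf

simplified = y              # what the rewrite x*y/x -> y produces

assert np.isinf(literal)
assert simplified == 1e200
```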
Theano defines several optimizations which improve the numerical
stability of computations. It also provides a framework to add and test
new optimizers.
Theano was named after the `Greek mathematician`_, who may have
been Pythagoras' wife.
Theano is released under a BSD license (:ref:`link <license>`)

Sneak peek
==========

Here is a simple example of how to use Theano. It doesn't show
off many of Theano's features, but it illustrates concretely what
Theano is.

@@ -67,8 +73,7 @@ Theano is.
Theano is not a programming language in the normal sense because you
write a program in Python that builds expressions for Theano. Still
it is like a programming language in the sense that you have to

- declare variables (``a,b``) and give their types
@@ -77,8 +82,8 @@ have to

- compile expression graphs to functions in order to use them for computation.

It is good to think of ``theano.function`` as the interface to a
compiler which builds a callable object from a purely symbolic graph.
One of theano's most important features is that ``theano.function``
can optimize a graph and even compile some or all of it into native
machine instructions.
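The build-then-compile idea can be sketched in a few lines of ordinary Python (the classes here are a toy stand-in, not Theano's implementation): expressions are first recorded as a graph, and only the call to ``function`` turns that graph into something executable, which is the point where a real compiler gets the chance to optimize.

```python
# Toy sketch (hypothetical classes, not Theano's implementation) of the
# idea behind theano.function: build a symbolic graph first, then
# "compile" it into an ordinary Python callable.

class Var:
    def __init__(self, name):
        self.name = name
    def __add__(self, other):
        return Add(self, other)   # building an expression records a node

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

def function(inputs, output):
    """Turn the recorded graph into a callable (the 'compile' step)."""
    def evaluate(node, env):
        if isinstance(node, Var):
            return env[node.name]
        return evaluate(node.left, env) + evaluate(node.right, env)
    def f(*args):
        env = dict(zip([v.name for v in inputs], args))
        return evaluate(output, env)
    return f

x, y = Var('x'), Var('y')
z = x + y                 # nothing is computed yet; z is a graph
f = function([x, y], z)   # now we have a callable
assert f(2, 3) == 5
```

In real Theano the compile step is also where graph optimization and C-code generation happen; the toy version only interprets the graph.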
@@ -95,18 +100,18 @@ package, so what does Theano do that Python and numpy do not?

  parts your expression graph into native machine code, which runs
  much faster than python.

- *symbolic differentiation*: Theano can automatically build symbolic
  graphs for computing gradients.

- *stability optimizations*: Theano can recognize numerically unstable
  expressions and compute them with more stable algorithms.
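A classic instance of such a stability rewrite, shown here in plain numpy rather than Theano code: ``log(1 + exp(x))`` overflows for large ``x``, while the mathematically equivalent ``logaddexp`` form stays finite.

```python
import numpy as np

# Plain-numpy illustration (not Theano code) of a stability rewrite:
# log(1 + exp(x)) overflows for large x, but log(exp(0) + exp(x)),
# computed stably by np.logaddexp, does not.
x = 800.0

with np.errstate(over='ignore'):
    naive = np.log(1.0 + np.exp(x))   # exp(800) overflows -> inf

stable = np.logaddexp(0.0, x)         # == log(1 + exp(x)), evaluated stably

assert np.isinf(naive)
assert abs(stable - 800.0) < 1e-6
```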
There exists another symbolic package in Python, namely sympy_. Theano
is different from sympy in the sense that while Theano allows symbolic
manipulation it puts more emphasis on the evaluation of these expressions
and being able to repeatedly evaluate them on many different inputs. Theano
is also better suited to handling very large tensors which have no
assumed structures.

If numpy_ is to be compared to MATLAB_ and sympy_ to Mathematica_,
Theano is a sort of hybrid of the two which tries to make the best of

@@ -145,10 +150,9 @@ issues that concern the end users.
Questions, comments, praise, criticism as well as bug reports should
be submitted to these mailing lists.

We welcome all kinds of contributions. If you have any questions
regarding how to extend Theano, please feel free to ask on the theano-dev_
mailing list.

......

@@ -12,15 +12,31 @@ Requirements
In order to use Theano, the following libraries and software will need
to be installed:

Linux or OS-X operating system
    We develop mainly on 64-bit Linux machines. 32-bit architectures are
    not well-tested.

python >= 2.5

`numpy <http://numpy.scipy.org/>`_ >= 1.2
    Earlier versions have memory leaks.

`SciPy <http://scipy.org>`_
    Specifically numpy, sparse, and weave. We recommend scipy
    >=0.7 if you are using sparse matrices, because scipy.sparse
    is buggy in 0.6. (scipy.csc_matrix dot has a bug with singleton
    dimensions. There may be more bugs.)

The following libraries and software are optional:

g++, python-dev
    Highly recommended, to compile generated C code.

`nose <http://somethingaboutorange.com/mrl/projects/nose/>`_
    Recommended, to run Theano's test-suite.

`sphinx <http://sphinx.pocoo.org/>`_ >=0.5.1, `pygments <http://pygments.org/>`_
    Used to build documentation. latex and dvipng
    are also necessary for math to show up as images.

`mercurial <http://www.selenic.com/mercurial/>`_
    To download the bleeding-edge source.

------------
......
@@ -37,19 +37,15 @@ objects).

>>> x = T.dscalar('x')
>>> y = T.dscalar('y')

In Theano, all symbols must be typed. In particular, ``T.dscalar``
is the type we assign to "0-dimensional arrays (`scalar`) of doubles
(`d`)". It is a Theano :term:`Type`.

``dscalar`` is not a class. Therefore, neither ``x`` nor ``y``
are actually instances of ``dscalar``. They are instances of
:api:`TensorResult <theano.tensor.basic.TensorResult>`. ``x`` and ``y``
are, however, assigned the theano Type ``dscalar`` in their ``type``
field, as you can see here:
>>> type(x)
<class 'theano.tensor.basic.TensorResult'>
@@ -60,9 +56,14 @@ Tensor(float64, scalar)

>>> x.type == T.dscalar
True

You can learn more about the structures in Theano in
the :ref:`advtutorial` and in :ref:`graphstructures`.

By calling ``T.dscalar`` with a string argument, you create a
:term:`Result` representing a floating-point scalar quantity with the
given name. If you provide no argument, the symbol will be unnamed. Names
are not required, but they can aid debugging.
-------------------------------------------

**Step 2**
@@ -83,14 +84,14 @@ x + y

**Step 3**

The last step is to create a function taking ``x`` and ``y`` as inputs
and giving ``z`` as output:

>>> f = function([x, y], z)

The first argument to ``function`` is a list of :term:`Results <Result>`
that will be provided as inputs to the function. The second argument
is a single Result *or* a list of Results. For either case, the second
argument is what we want to see as output when we apply the function.

``f`` may then be used like a normal Python function.
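The two calling conventions (a single output Result versus a list of them) can be mimicked with a small plain-Python stand-in (a toy analogy, not Theano itself): a single output yields a function returning one value, a list yields a list of values.

```python
# Toy stand-in (not Theano): mirror function([x, y], z) vs
# function([x, y], [z, w]). Here expressions are just callables on a
# dict of input values.

def make_function(inputs, outputs):
    single = not isinstance(outputs, list)
    exprs = [outputs] if single else outputs
    def f(*args):
        env = dict(zip(inputs, args))
        results = [e(env) for e in exprs]
        return results[0] if single else results
    return f

z = lambda env: env['x'] + env['y']
w = lambda env: env['x'] * env['y']

f = make_function(['x', 'y'], z)        # single output -> single value
g = make_function(['x', 'y'], [z, w])   # list of outputs -> list of values

assert f(2, 3) == 5
assert g(2, 3) == [5, 6]
```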
......
@@ -17,7 +17,7 @@ installed:

>>> from theano import *

Many of the symbols you will need to use are in the ``tensor`` subpackage
of theano. Let's import that subpackage under a handy name. I like
``T``.
......
@@ -195,8 +195,10 @@ def _optcheck_env(input_specs, output_specs, accept_inplace = False):

    inputs, outputs = gof.graph.clone(orig_inputs, orig_outputs)
    equivalence_tracker = _ResultEquivalenceTracker()
    env = gof.env.Env(inputs, outputs,
            #DestroyHandler is not needed because it is actually installed by an optimization
            # after canonicalization. This results in a big speed gain.
            #features=[equivalence_tracker, gof.DestroyHandler(do_imports_on_attach=False)])
            features=[equivalence_tracker])
    if not accept_inplace:
        for node in env.nodes:
......
"""Driver of graph construction, optimization, and linking. """Driver of graph construction, optimization, and linking.
""" """
__docformat__ = "restructuredtext en"
import copy_reg import copy_reg
import cPickle import cPickle
......
Diff collapsed.
#!/usr/bin/env python
import numpy as N
from theano import Op, Apply, tensor as T, Module, Method, Mode, compile
from theano.gof import OpSub, TopoOptimizer
from pylearn.algorithms.minimizer import make_minimizer # minimizer
from theano.printing import Print
from theano.tests import unittest_tools

####################
# Library-type stuff
@@ -15,8 +13,6 @@ from theano.tests import unittest_tools

from theano.compile import module
from theano import tensor as T
from pylearn.algorithms.minimizer import minimizer_factory

class StochasticGradientDescent(module.FancyModule):
    """Fixed stepsize gradient descent"""
    def __init__(self, args, cost, params, gradients=None, stepsize=None, WEIRD_STUFF=True):
@@ -29,18 +25,18 @@ class StochasticGradientDescent(module.FancyModule):

        self.stepsize_init = None

        if stepsize is None:
            self.stepsize = (T.dscalar())
        elif isinstance(stepsize, T.TensorResult):
            self.stepsize = stepsize
        else:
            if self.WEIRD_STUFF:
                #TODO: why is this necessary? why does the else clause not work?
                # self.stepsize = module.Member(T.dscalar(), init = stepsize)
                self.stepsize = (T.dscalar())
                self.stepsize_init = stepsize
            else:
                # self.stepsize = module.Member(T.value(stepsize))
                self.stepsize = (T.constant(stepsize)) #work!

        if self.stepsize.ndim != 0:
            raise ValueError('stepsize must be a scalar', stepsize)
@@ -63,7 +59,6 @@ class StochasticGradientDescent(module.FancyModule):

        pass
@minimizer_factory('sgd')
def sgd_minimizer(stepsize=None, **args):
    def m(i,c,p,g=None):
        return StochasticGradientDescent(i, c, p, stepsize=stepsize, **args)
@@ -101,6 +96,9 @@ class TanhRnn(Op):

        return Apply(self, [x, z0, A], [z])

    def perform(self, node, (x,z0,A), out):
        assert x is not None
        assert z0 is not None
        assert A is not None
        T,M = x.shape
        z = N.zeros((T+1, M))
        z[0] = z0
@@ -161,10 +159,10 @@ class ExampleRNN(Module):

        self.n_vis = n_vis

        #recurrent weight matrix in latent space
        self.z0 = (T.dvector())
        self.w = (T.dmatrix())
        self.params = [self.z0, self.w]

        #input and target
        x, y = T.dmatrix(), T.dmatrix()
@@ -176,6 +174,7 @@ class ExampleRNN(Module):

        self.minimizer = minimizer([x, y], self.cost, self.params)

    def _instance_initialize(self, obj):
        print 'INITIALIZE EXAMPLE RNN'
        n_vis = self.n_vis
        rng = N.random.RandomState(unittest_tools.fetch_seed(2342))
@@ -185,14 +184,14 @@ class ExampleRNN(Module):

        obj.minimizer.initialize()
def test_example_rnn():
    minimizer_fn = sgd_minimizer(stepsize = 0.001)

    n_vis = 5
    n_out = 3
    n_hid = 4

    rnn_module = ExampleRNN(n_vis, minimizer_fn)
    rnn = rnn_module.make()

    rng = N.random.RandomState(unittest_tools.fetch_seed(7722342))
    x = rng.randn(10,n_vis)
@@ -212,6 +211,7 @@ def test_example_rnn():

            print i, rnn.minimizer.step_cost(x, y), rnn.minimizer.stepsize
        else:
            rnn.minimizer.step_cost(x, y)
assert rnn.minimizer.step_cost(x,y) < -20 #it starts around -.28
def test_WEIRD_STUFF():
    n_vis = 3
@@ -224,8 +224,8 @@ def test_WEIRD_STUFF():

    LAG = 4
    y[LAG:] = x[:-LAG, 0:n_vis]

    minimizer_fn1 = sgd_minimizer(stepsize = 0.001, WEIRD_STUFF = False)
    minimizer_fn2 = sgd_minimizer(stepsize = 0.001, WEIRD_STUFF = True)

    rnn_module1 = ExampleRNN(n_vis, minimizer_fn1)
    rnn_module2 = ExampleRNN(n_vis, minimizer_fn2)

    rnn1 = rnn_module1.make(mode='FAST_RUN')
......
@@ -84,6 +84,9 @@ class Apply(utils.object2):

        else:
            raise TypeError("The 'outputs' argument to Apply must contain Result instances with no owner, not %s" % output)

        self._creation_idx = _creation_idx[0]
        _creation_idx[0] += 1

    def default_output(self):
        """Returns the default output for this node.
@@ -123,9 +126,6 @@ class Apply(utils.object2):

        return self

    def __hash__(self):
        if not hasattr(self, '_creation_idx'):
            self._creation_idx = _creation_idx[0]
            _creation_idx[0] += 1
        return self._creation_idx
......
@@ -473,15 +473,6 @@ class GemmLocalOptimizer(LocalOptimizer):

                return [T.add(*new_add_inputs)]
        return False

    @staticmethod
    def failure_callback(exc, nav, repl_pairs):
        """WRITEME"""
        if not isinstance(exc, InconsistencyError):
            traceback.print_exc()
        else:
            #print 'GEMM caused cycle, it happens.'
            pass

    @staticmethod
    def _as_scalar(res):
        """Return None or a TensorResult whose type is in T.float_scalar_types"""
@@ -579,11 +570,11 @@ class GemmLocalOptimizer(LocalOptimizer):

# TODO: This could be an EquilibriumOptimizer, but I don't know how to combine an OpKeyOptimizer and
# an EquilibriumOptimizer.
compile.optdb.register('inplace_gemm_0', OpKeyOptimizer(GemmLocalOptimizer(),
        failure_callback=OpKeyOptimizer.warn_inplace), 70.00, 'fast_run', 'inplace', 'gemm')
compile.optdb.register('inplace_gemm_1', OpKeyOptimizer(GemmLocalOptimizer(),
        failure_callback=OpKeyOptimizer.warn_inplace), 70.01, 'fast_run', 'inplace', 'gemm')
compile.optdb.register('inplace_gemm_2', OpKeyOptimizer(GemmLocalOptimizer(),
        failure_callback=OpKeyOptimizer.warn_inplace), 70.02, 'fast_run', 'inplace', 'gemm')

class Dot22(GemmRelated):
    """Compute a matrix-matrix product.
......
@@ -1305,14 +1305,26 @@ class test_matinv(unittest.TestCase):

            ssd, gw = fn(x,w)
            #print ssd, x*w, x, w
            if i == 0:
                ssd0 = ssd
            w -= 0.4 * gw
        return ssd0, ssd

    def test_reciprocal(self):
        """Matrix reciprocal by gradient descent"""
        ssd0,ssd = self.mat_reciprocal(3)
numpy.random.seed(unittest_tools.fetch_seed(1))
# hand-coded numpy implementation for verification
x = numpy.random.rand(3,3)+0.1
w = numpy.random.rand(3,3)
myssd0 = numpy.sum((x*w - numpy.ones((3,3)))**2.0)
for i in xrange(300):
gw = 2*(x*w - numpy.ones((3,3)))*x # derivative of dMSE/dw
myssd = numpy.sum((x*w - numpy.ones((3,3)))**2)
w -= 0.4 * gw
self.failUnlessAlmostEqual(ssd0, myssd0)
self.failUnlessAlmostEqual(ssd, myssd)
class t_dot(unittest.TestCase):
    def setUp(self):
......