Commit 4c6c0034 authored by Olivier Breuleux

merge

...@@ -3,6 +3,7 @@ syntax: glob
*~
\#*\#
doc/oplist.txt
doc/typelist.txt
compiled/*.cpp
cutils_ext.cpp
html
......
THEANO

Documentation and related material is in Trac:
http://lgcm.iro.umontreal.ca:8000/theano/wiki/WikiStart
The lisa twiki is deprecated for documenting Theano.

Requirements:
  scipy [version?]
  numpy [version?]
  Python >= 2.5 (for the built-in function `all`)
==============
README: theano
==============
.. contents::
Project Description
===================
Theano is a python library for manipulating and evaluating expressions, especially matrix-valued ones.
What does Theano do that Python and numpy do not?
- *execution speed optimizations*: Theano can use `g++` to compile parts of your expression graph into native machine code, which runs much faster than Python.
- *symbolic differentiation*: Theano can build symbolic graphs for computing gradients.
- *stability optimizations*: Theano can recognize numerically unstable expressions and compute them with more stable algorithms.
Here's a very simple example of how to use Theano. It doesn't show off many of Theano's features, but it illustrates concretely what Theano is.
.. code-block:: python

    import theano
    from theano import tensor

    a = tensor.fscalar()  # declare a symbolic floating-point scalar
    b = tensor.fscalar()  # declare a symbolic floating-point scalar
    c = a + b             # create a simple expression

    f = theano.function([a, b], [c])  # convert the expression into a callable
                                      # object that takes (a, b) values as
                                      # input and computes a value for c

    assert 4.0 == f(1.5, 2.5)  # bind 1.5 to 'a', 2.5 to 'b', and evaluate 'c'
Theano is not a programming language in the normal sense, because you write a program in Python that builds expressions for Theano. Still, it is like a programming language in the sense that to use Theano, you have to
- declare variables (`a`, `b`) and give their types
- build expressions for how to put those variables together
- compile expression graphs to functions in order to use them for computation.
It is good to think of `theano.function` as the interface to a compiler which builds a callable object from a purely symbolic graph.
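As a rough sketch of that idea in plain Python (a toy analogy, not Theano's actual implementation), such a "compiler" walks a symbolic expression tree and returns a callable:

```python
# Toy illustration of the symbolic-graph-to-callable idea behind
# theano.function.  All class and function names here are made up.

class Var:
    """A symbolic placeholder with no value until the function is called."""
    def __init__(self, name):
        self.name = name
    def __add__(self, other):
        return Add(self, other)

class Add:
    """A symbolic node recording that two sub-expressions are summed."""
    def __init__(self, left, right):
        self.left, self.right = left, right

def evaluate(expr, env):
    """Recursively evaluate a symbolic expression given input bindings."""
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    return expr  # a plain constant

def function(inputs, output):
    """Return a callable that binds values to inputs, then evaluates output."""
    def fn(*values):
        env = {v.name: x for v, x in zip(inputs, values)}
        return evaluate(output, env)
    return fn

a = Var('a')
b = Var('b')
c = a + b                  # building an expression, not computing anything
f = function([a, b], c)    # "compiling" the graph into a callable
assert f(1.5, 2.5) == 4.0  # binding values and evaluating
```

Theano's real compiler does far more (type checking, optimization, C code generation), but the declare/build/compile shape is the same.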
License
-------
Theano is licensed under a BSD-like license. See the LICENSE file in the project root folder.
Installation
============
(See also the :wiki:`InstallationNotes` on the wiki.)
Software Requirements
---------------------
- linux or OS-X operating system
- python 2.5
- SciPy (specifically numpy, sparse, weave); numpy version >= 1.1 fixes a memory leak.
- docutils, pygments (optional, to build documentation)
- mercurial (optional, to download the source)
- g++, python-dev (optional, to compile generated C code)
- `psyco <http://psyco.sourceforge.net/>`__ can make your python code much faster, if you are on a 32-bit x86 architecture. If you use compiled C code, this can be less important.
Downloading Theano
------------------
There are two ways to get the source: mercurial (required for library developers) and unix tar.
There are no stable releases yet.
*To get the source via mercurial,* you must have `mercurial <http://www.selenic.com/mercurial/wiki/>`__ installed.
Get the source and run the auto-tests like this:
.. code-block:: bash

    hg clone http://pylearn.org/hg/theano theano
    cd theano
    python autotest.py
To update your library to the latest on pylearn.org, change directory (`cd`) to this `theano` folder and type
.. code-block:: bash

    hg pull -u
*To get the source via unix tar*, you can download the latest source directly as a gzip'd tar file:
`<http://pylearn.org/hg/theano/archive/tip.tar.gz>`__.
Two environment variables are used to control automatic code generation.
(It is possible to use Theano in a way that avoids all automatic code generation, but the functions you make using `theano.function` will execute more slowly.)
- `THEANO_BLAS_LDFLAGS`:
a space-separated list of library names to link against for BLAS functions. Default: `-lblas`
- `THEANO_COMPILEDIR`:
a directory with read/write access permissions, where theano will store
autogenerated code and c modules. Default: `$HOME/.theano`. If this
directory does not exist, or does not have the correct permissions, then
theano will try to create it with the correct permissions. If that fails,
an exception will be raised and no C code will be compiled.
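For example, a minimal shell setup might look like this (the flag values are only illustrative; use whatever matches your BLAS installation):

```shell
# Link against the plain reference BLAS (the default shown in the docs);
# ATLAS users might instead need something like '-lcblas -latlas'.
export THEANO_BLAS_LDFLAGS="-lblas"

# Put autogenerated C code in the default per-user cache directory.
export THEANO_COMPILEDIR="$HOME/.theano"
mkdir -p "$THEANO_COMPILEDIR"
```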
Setup on Linux
++++++++++++++
Setup on OS-X
+++++++++++++
- Install `MacPorts <http://www.macports.org/>`__
- `sudo port install gcc42 py25-zlib py25-numpy py25-scipy mercurial`.
Note that compiling gcc42 takes a significant time (hours) so it's probably
not the best solution if you're in a rush! In my (Doomie) experience, scipy
failed to compile the first time I tried the command, but the second time
it compiled just fine. Same thing with py25-zlib.
- Install some kind of BLAS library (TODO: how?)
- Set THEANO_BLAS_LDFLAGS to something which will link against said BLAS
library. (e.g., `THEANO_BLAS_LDFLAGS='-lcblas -latlas -lgfortran'`).
Setup on Windows
++++++++++++++++
No one has done this yet. WRITEME.
Tips for running at LISA
++++++++++++++++++++++++
Use the fast BLAS library that Fred installed, by setting
`THEANO_BLAS_LDFLAGS=-lgoto`.
Tips for running on a cluster
+++++++++++++++++++++++++++++
Use something like the following in your .bashrc:
.. code-block:: bash

    # use the intel math-kernel library for BLAS routines
    export THEANO_BLAS_LDFLAGS=-lmkl

    # use up to two threads in the MKL routines
    export OMP_NUM_THREADS=2

    # IMPORTANT!
    # Use the local temporary directory as a cache.
    # If several jobs start simultaneously and use a common
    # cache, then the cache may be corrupted.
    # Theano is not process-safe or thread-safe in this sense.
    export THEANO_COMPILEDIR=/ltmp/<username>_theano
Running the Test Suite
======================
Test your installation by running the autotests. Type at the shell:
.. code-block:: bash

    cd theano
    python2.5 autotest.py
All tests should pass.
Using Theano
============
Now that you've got theano installed and running, check out the `n00b tutorial <doc/n00b.html>`__ for how to use it.
Getting Help
============
If these installation instructions don't work, search the theano-users archive for similar cases. If you don't find a solution, write to theano-users and explain the situation.
.. header:: |THEANO| - README_ - Download_ - Documentation_ - Wiki_ - `Task List`_
.. _README: README.html
.. _Download: README.html#downloading-theano
.. _Documentation: doc/index.html
.. _Wiki: http://pylearn.org/theano
.. _task list: http://lgcm.iro.umontreal.ca/theano/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority
.. |THEANO| image:: http://lgcm.iro.umontreal.ca/theano/chrome/site/theano_logo.png
:target: http://pylearn.org/auto_theano
:alt: THEANO
:align: top
:class: borderless
:width: 60
:height: 18
...@@ -533,64 +533,6 @@ DotTester = make_tester(name = 'DotTester',
# rationale: it's tricky, and necessary every time you want to verify
# gradient numerically
def verify_grad(testcase, op, pt, n_tests=1, rng=numpy.random, eps=0.0000001, tol=0.0001,
        linker='c&py'):
    """testcase.failUnless(analytic gradient matches finite-diff gradient)"""
    pt = [numpy.asarray(p) for p in pt]
    for test_num in xrange(n_tests):
        tensor_pt = [constant(p).type('input %i' % i) for i, p in enumerate(pt)]
        o = safe_make_node(op, *[tpt.copy() for tpt in tensor_pt])
        if hasattr(o, 'outputs'):
            o_outputs = o.outputs
        else:
            o_outputs = o
        if len(o_outputs) > 1:
            # we could loop over outputs, making a random projection R for each,
            # but this doesn't handle the case where not all the outputs are
            # differentiable... so I leave this as TODO for now -JB.
            raise NotImplementedError('cant (yet) autotest gradient of op with multiple outputs')
        o_fn = function(tensor_pt, o_outputs[0], mode=compile.Mode(optimizer=None, linker=linker))
        o_fn_out = o_fn(*pt)
        random_projection = rng.rand(*o_fn_out.shape)
        t_r = as_tensor(random_projection)
        # random projection of o onto t_r
        cost = sum(t_r * o_outputs[0])
        cost_fn = function(tensor_pt, cost, mode=compile.Mode(optimizer=None, linker=linker))
        num_grad = gradient.numeric_grad(cost_fn, pt)
        symbolic_grad = grad(cost, tensor_pt, as_tensor(1.0, name='g_cost'))
        grad_fn = function(tensor_pt, symbolic_grad, mode=compile.Mode(optimizer=None, linker=linker))
        analytic_grad = grad_fn(*pt)
        if not isinstance(analytic_grad, (list, tuple)):
            analytic_grad = [analytic_grad]
        if num_grad.max_err(analytic_grad) > tol:
            raise Exception(verify_grad.E_grad)
verify_grad.E_grad = 'gradient error exceeded tolerance'
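The idea behind `verify_grad` can be sketched without Theano at all: compare an analytic derivative against a centered finite difference and fail if they disagree. A minimal one-dimensional sketch (hypothetical helper names, plain Python):

```python
# Sketch of the finite-difference gradient check performed by verify_grad.

def numeric_grad(f, x, eps=1e-7):
    """Centered finite-difference estimate of df/dx at the point x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def verify_grad_1d(f, df, pts, tol=1e-4):
    """Raise if the analytic gradient df disagrees with the numeric one."""
    for x in pts:
        err = abs(numeric_grad(f, x) - df(x))
        if err > tol:
            raise Exception('gradient error exceeded tolerance: %g' % err)

# f(x) = x**2, so df/dx = 2x: this check passes silently.
verify_grad_1d(lambda x: x * x, lambda x: 2 * x, [0.5, 1.0, -3.0])
```

The real `verify_grad` does the same comparison on a random projection of the op's output, which reduces the multi-output case to a single scalar cost.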
#useful mostly for unit tests
...@@ -945,29 +887,100 @@ class T_subtensor(unittest.TestCase):
class T_Join_and_Split(unittest.TestCase):
    """
    Split is tested by each verify_grad method.
    """

    class Join1(Op):
        def make_node(self, *inputs):
            inputs = [as_tensor(t) for t in inputs]
            outputs = [lscalar()] + [i.type() for i in inputs]
            return Apply(self, inputs, outputs)
        def perform(self, node, inputs, outputs):
            outputs[0][0] = 1
            for i, o in zip(inputs, outputs[1:]):
                o[0] = i.copy()
        def grad(self, inputs, g_outputs):
            return g_outputs[1:]

    def setUp(self):
        Join.debug = False

    def test_join_scalar(self):
        a = as_tensor(1)
        b = as_tensor(2)
        try:
            s = join(0, a, b)
        except:
            return
        self.fail()

    def test_stack_scalar(self):
        a = as_tensor(1)
        b = as_tensor(2)
        c = as_tensor(3)
        s = stack(a, b, c)
        want = numpy.array([1, 2, 3])
        self.failUnless((eval_outputs([s]) == want).all())

    def test_join_vector(self):
        a = as_tensor(numpy.array([1, 2, 3]))
        b = as_tensor(numpy.array([7, 8, 9]))
        s = join(0, a, b)
        want = numpy.array([1, 2, 3, 7, 8, 9])
        self.failUnless((eval_outputs([s]) == want).all())

    def test_stack_vector(self):
        a = as_tensor(numpy.array([1, 2, 3]))
        b = as_tensor(numpy.array([7, 8, 9]))
        s = stack(a, b)
        want = numpy.array([[1, 2, 3], [7, 8, 9]])
        self.failUnless((eval_outputs([s]) == want).all())

    def test_join_matrix0(self):
        a = as_tensor(numpy.array([[1, 2, 3], [4, 5, 6]]))
        b = as_tensor(numpy.array([[7, 8, 9]]))
        s = join(0, a, b)
        want = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
        self.failUnless((eval_outputs([s]) == want).all())

    def test_join_matrix1(self):
        av = numpy.array([[1, 2, 3], [4, 5, 6]], dtype='float32')
        bv = numpy.array([[7], [8]], dtype='float32')
        a = as_tensor(av)
        b = as_tensor(bv)
        s = join(1, a, b)
        want = numpy.array([[1, 2, 3, 7], [4, 5, 6, 8]], dtype='float32')
        self.failUnless((eval_outputs([s]) == want).all())
        verify_grad(self, lambda a, b: join(1, a, b), [av, bv], eps=1.0e-4, tol=1.0e-3)

    def test_join_matrixV(self):
        """variable join axis"""
        v = numpy.array([[1., 2., 3.], [4., 5., 6.]])
        a = as_tensor(v.copy())
        b = as_tensor(v.copy())
        ax = lscalar()
        s = join(ax, a, b)
        f = function([ax], [s])

        want = numpy.array([[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6]])
        got = f(0)
        self.failUnless((got == want).all(), (got, want))

        want = numpy.array([[1, 2, 3, 1, 2, 3], [4, 5, 6, 4, 5, 6]])
        got = f(1)
        self.failUnless((got == want).all(), (got, want))

        verify_grad(self, lambda a, b: join(0, a, b), [v, 2*v])
        verify_grad(self, lambda a, b: join(1, a, b), [v, 2*v])
class _test_comparison(unittest.TestCase):
...@@ -1761,10 +1774,10 @@ class T_op_cache(unittest.TestCase):
        self.failUnless(numpy.all(fn_py(a) == fn_c_or_py(a)))

if __name__ == '__main__':
    if 0:
        unittest.main()
    else:
        testcase = AbsInplaceTester
        suite = unittest.TestLoader()
        suite = suite.loadTestsFromTestCase(testcase)
......
"""Convenient driver of graph construction, optimization, and linking."""

import copy_reg
import cPickle
from functools import partial

import numpy
import gof
import sys
from copy import copy

def check_equal(x, y):
    """
...@@ -57,6 +60,12 @@ predefined_linkers = {

default_linker = 'c|py'

def register_linker(name, linker):
    """Add a `Linker` which can be referred to by `name` in `Mode`."""
    if name in predefined_linkers:
        raise ValueError('Linker name already taken: %s' % name)
    predefined_linkers[name] = linker
# If a string is passed as the optimizer argument in the constructor
# for Mode, it will be used as the key to retrieve the real optimizer
...@@ -64,13 +73,15 @@ default_linker = 'c|py'

predefined_optimizers = {
    None : lambda env: None,
    'merge' : gof.MergeOptimizer(),
    }
default_optimizer = 'merge'

def register_optimizer(name, opt):
    """Add an `Optimizer` which can be referred to by `name` in `Mode`."""
    if name in predefined_optimizers:
        raise ValueError('Optimizer name already taken: %s' % name)
    predefined_optimizers[name] = opt
class Mode(object):
    """
...@@ -110,15 +121,14 @@ class Mode(object):

# If a string is passed as the mode argument in function or
# FunctionMaker, the Mode will be taken from this dictionary using the
# string as the key
predefined_modes = {'FAST_COMPILE': Mode('py', 'merge')}
default_mode = 'FAST_COMPILE'

def register_mode(name, mode):
    """Add a `Mode` which can be referred to by `name` in `function`."""
    if name in predefined_modes:
        raise ValueError('Mode name already taken: %s' % name)
    predefined_modes[name] = mode
...@@ -508,9 +518,6 @@ class FunctionMaker(object):
        return fn

def _pickle_FunctionMaker(fm):
    return (_constructor_FunctionMaker, (fm.inputs, fm.outputs, fm.mode, fm.accept_inplace))
...@@ -527,8 +534,6 @@ copy_reg.pickle(slice, _pickle_slice)

DUPLICATE = ['DUPLICATE']  # unique id object used as a placeholder for duplicate entries

class Function(object):
......
=====================
Developer Start Guide
=====================
- Learn about the basics of using mercurial.
- Learn some `non-basic python`_ to understand what's going on in some of the
  trickier files (like tensor.py).
- BasicNumpy_ essential things to know about numpy.
- Learn to write reStructuredText_ for epydoc_.
- ExternalTools - packages that play well with Numpy
- EssentialUnitTest - essential usage of python.unittest
Accounts
========
To obtain developer access: send an email to an admin with a username and
temporary password. Pending approval, this will give you access to both the
repository and Trac. You should then change your password in the
`preferences <http://pylearn.org/theano/prefs>`__ tab - do *not* use a
valuable password! We are using plain-text HTTP, which is not secure.
Theano code
===========
The code that makes up Theano is in a single repository available in
`<http://pylearn.org/hg/theano>`__.
As a developer, you should clone this repository like this:
- `hg clone 'http://username:password@pylearn.org/hg/theano' theano`
Setting up your environment
===========================
Some notes on the environment variable $PYTHONPATH.
If theano lives in $DEV/theano, you should have $DEV in your $PYTHONPATH. You should *not* have $DEV/theano in your $PYTHONPATH.
Olivier Breuleux explains:
$PYTHONPATH should contain a ":"-separated list of paths, each of which contains one or several Python packages, in the order in which you would like Python to search for them. If a package has sub-packages of interest to you, do _not_ add them in the path: it is not portable, might shadow other packages or short-circuit important things in its __init__.
I advise to never import theano's files from outside theano itself (and I think that is good advice for Python packages in general). Use "from theano import tensor" instead of "import tensor". ... $PYTHONPATH ... should only contain paths to complete packages, so you don't get surprises if I add files that enter in conflict with other packages.
When you install a package, only the package name can be imported directly. If you want a sub-package, you must import it from the main package. That's how it will work in 99.9% of installs because it is the default. Therefore, if you stray from this practice, your code will not be portable. Also, some ways to circumvent circular dependencies might make it so you have to import files in a certain order, which is best handled by the package's own __init__.py.
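Concretely, with a hypothetical checkout under `$HOME/dev/theano`, the right setting looks like this:

```shell
DEV="$HOME/dev"    # hypothetical location of your clones

# Right: the path contains the *parent* of the theano package,
# so 'from theano import tensor' works.
export PYTHONPATH="$DEV:$PYTHONPATH"

# Wrong (do not do this): putting $DEV/theano on the path lets you
# 'import tensor' directly, which is not portable and can shadow
# other packages.
```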
.. _non-basic python: http://lgcm.iro.umontreal.ca/theano/wiki/NonbasicPython
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _epydoc: http://epydoc.sourceforge.net/
.. _basicnumpy: http://lgcm.iro.umontreal.ca/theano/wiki/BasicNumpy
...@@ -29,12 +29,21 @@ except:
# real ``epydoc`` package.  So remove ``sys.path[0]``, which contains the
# directory of the script.
import sys, os.path

# I leave this in place actually, so that I can import pygments_code_block_directive
#script_path = os.path.abspath(sys.path[0])
#sys.path = [p for p in sys.path if os.path.abspath(p) != script_path]

import epydoc.docwriter.xlink as xlink
from docutils.core import publish_cmdline, default_description

try:
    # .. code-block:: python should look nice with this
    import pygments_code_block_directive
except Exception, e:
    print >> sys.stderr, "Failed to import pygments", e

description = ('Generates (X)HTML documents with API documentation links. '
               + default_description)

publish_cmdline(reader=xlink.ApiLinkReader(), writer_name='html',
......
#!/bin/bash
APIRST2HTML=apirst2html.py
EPYDOC_ARGS='--external-api=api --external-api-file=api:../html/api/api-objects.txt --external-api-root=api:epydoc/'
mkdir html 2> /dev/null
for RST in graph ; do
$APIRST2HTML $EPYDOC_ARGS $RST.txt html/$RST.html
done
td.linenos { background-color: #f0f0f0; padding-right: 10px; }
span.lineno { background-color: #f0f0f0; padding: 0 5px 0 5px; }
pre { line-height: 125%; }
body { background: #ffffff; }
body .c { color: #808080 } /* Comment */
body .err { color: #F00000; background-color: #F0A0A0 } /* Error */
body .k { color: #008000; font-weight: bold } /* Keyword */
body .o { color: #303030 } /* Operator */
body .cm { color: #808080 } /* Comment.Multiline */
body .cp { color: #507090 } /* Comment.Preproc */
body .c1 { color: #808080 } /* Comment.Single */
body .cs { color: #cc0000; font-weight: bold } /* Comment.Special */
body .gd { color: #A00000 } /* Generic.Deleted */
body .ge { font-style: italic } /* Generic.Emph */
body .gr { color: #FF0000 } /* Generic.Error */
body .gh { color: #000080; font-weight: bold } /* Generic.Heading */
body .gi { color: #00A000 } /* Generic.Inserted */
body .go { color: #808080 } /* Generic.Output */
body .gp { color: #c65d09; font-weight: bold } /* Generic.Prompt */
body .gs { font-weight: bold } /* Generic.Strong */
body .gu { color: #800080; font-weight: bold } /* Generic.Subheading */
body .gt { color: #0040D0 } /* Generic.Traceback */
body .kc { color: #008000; font-weight: bold } /* Keyword.Constant */
body .kd { color: #008000; font-weight: bold } /* Keyword.Declaration */
body .kp { color: #003080; font-weight: bold } /* Keyword.Pseudo */
body .kr { color: #008000; font-weight: bold } /* Keyword.Reserved */
body .kt { color: #303090; font-weight: bold } /* Keyword.Type */
body .m { color: #6000E0; font-weight: bold } /* Literal.Number */
body .s { background-color: #fff0f0 } /* Literal.String */
body .na { color: #0000C0 } /* Name.Attribute */
body .nb { color: #007020 } /* Name.Builtin */
body .nc { color: #B00060; font-weight: bold } /* Name.Class */
body .no { color: #003060; font-weight: bold } /* Name.Constant */
body .nd { color: #505050; font-weight: bold } /* Name.Decorator */
body .ni { color: #800000; font-weight: bold } /* Name.Entity */
body .ne { color: #F00000; font-weight: bold } /* Name.Exception */
body .nf { color: #0060B0; font-weight: bold } /* Name.Function */
body .nl { color: #907000; font-weight: bold } /* Name.Label */
body .nn { color: #0e84b5; font-weight: bold } /* Name.Namespace */
body .nt { color: #007000 } /* Name.Tag */
body .nv { color: #906030 } /* Name.Variable */
body .ow { color: #000000; font-weight: bold } /* Operator.Word */
body .w { color: #bbbbbb } /* Text.Whitespace */
body .mf { color: #6000E0; font-weight: bold } /* Literal.Number.Float */
body .mh { color: #005080; font-weight: bold } /* Literal.Number.Hex */
body .mi { color: #0000D0; font-weight: bold } /* Literal.Number.Integer */
body .mo { color: #4000E0; font-weight: bold } /* Literal.Number.Oct */
body .sb { background-color: #fff0f0 } /* Literal.String.Backtick */
body .sc { color: #0040D0 } /* Literal.String.Char */
body .sd { color: #D04020 } /* Literal.String.Doc */
body .s2 { background-color: #fff0f0 } /* Literal.String.Double */
body .se { color: #606060; font-weight: bold; background-color: #fff0f0 } /* Literal.String.Escape */
body .sh { background-color: #fff0f0 } /* Literal.String.Heredoc */
body .si { background-color: #e0e0e0 } /* Literal.String.Interpol */
body .sx { color: #D02000; background-color: #fff0f0 } /* Literal.String.Other */
body .sr { color: #000000; background-color: #fff0ff } /* Literal.String.Regex */
body .s1 { background-color: #fff0f0 } /* Literal.String.Single */
body .ss { color: #A06000 } /* Literal.String.Symbol */
body .bp { color: #007020 } /* Name.Builtin.Pseudo */
body .vc { color: #306090 } /* Name.Variable.Class */
body .vg { color: #d07000; font-weight: bold } /* Name.Variable.Global */
body .vi { color: #3030B0 } /* Name.Variable.Instance */
body .il { color: #0000D0; font-weight: bold } /* Literal.Number.Integer.Long */
...@@ -12,18 +12,12 @@ Subtitle

Here is some stuff.

.. code-block:: python

    def fib(n):
        if n == 0:
            return 1
        if n == 1:
            return 1
        return fib(n-1) + fib(n-2)
=====================================
Theano Project Documentation Overview
=====================================
Documentation is divided broadly into two kinds: user documentation and
developer documentation.
- `Using Theano` covers how to *use* what is already in the Theano library to
build graphs and evaluate them.
- `Extending Theano` introduces how Theano works and explains how to add new
data and expression types, as well as optimizations to accompany them.
- `Hacking Theano` introduces you to what's under the hood: the compilation
process, the Env, C code generation.
Using Theano
============
- First of all, read the `n00b guide`_. It is a cut-and-paste, tutorial-style intro to what Theano can do.
- Familiarize yourself with the `glossary of terminology`_.
- Join `theano-users`_.
- Learn to use the typelist_, and the oplist_. These are the building blocks
of theano expression graphs.
- Browse through some of the `Howto`_ recipes on the wiki.
.. _Howto:
.. _theano-users: http://groups.google.com/group/theano-users?pli=1
.. _theano-dev: http://groups.google.com/group/theano-dev?pli=1
.. _n00b guide: n00b.html
.. _glossary of terminology: glossary.html
.. _typelist: typelist.html
.. _oplist: oplist.html
Extending Theano
================
- Read about `How Theano Works <UserAdvanced.html>`__. This introduces the
major interface data structures: Op, Type, Result, Apply.
- How to make a new Op.
- How to make a new Optimization.
- How to make a new data Type.
Hacking Theano
==============
- `Get Started as a Developer <DevStartGuide.html>`__ by setting up mercurial, getting a few accounts,
setting up your environment, and getting some background in mercurial, python,
and numpy.
- Join `theano-dev`_ to participate in development discussion.
- Pick a task from the `task list`_, or suggest one on `theano-users`_.
Features/ideas are generally discussed on `theano-users`_. Technical
discussions of how to actually implement something should be on
`theano-dev`_.
- Browse `Theano's API <../api/>`__.
- Keep an eye on the `Mercurial Changelog <http://pylearn.org/hg/theano>`__.
- Send us your work as a patch to `theano-dev`_ or commit directly to the trunk.
.. _task list: http://lgcm.iro.umontreal.ca/theano/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
=============
n00b Tutorial
=============
.. contents::
*This documentation is still in progress (2008-09-19).*
Introduction
============
Great. You know `What theano is`_, and you've even `installed it`_.
But how do you use it?
.. _`What theano is`: http://lgcm.iro.umontreal.ca/theano/wiki/WhatIsTheano
.. _`installed it`: http://lgcm.iro.umontreal.ca/theano/wiki/InstallationNotes
If you have never used Theano before, we recommend you read over this tutorial start-to-finish. This will give you a sense of what you can do with Theano, and how.
Afterwards, we encourage you to read the documentation in accompanying links, which will allow you to understand the underlying concepts behind Theano better.
Scalar example
==============
In the following example, we will build the function `f(x) = x + 1.5` and then evaluate it:
.. code-block:: python

    import theano
    import theano.tensor as tensor

    # Declare a symbolic constant
    c = tensor.constant(1.5)

    # Declare a symbolic floating-point scalar
    x = tensor.fscalar()

    # The symbolic result y is computed by adding x to c
    y = x + c

    # f is a function we build to compute output y given input x.
    # f(x) = y
    #      = x + c
    #      = x + 1.5
    f = theano.function([x], [y])

    # We now bind 2.5 to an internal copy of x, evaluate an internal y,
    # and return its value.
    # We assert that f(2.5) == 2.5 + 1.5 == 4.0
    assert 4.0 == f(2.5)
In the example above, `c`, `x`, and `y` are each a *symbolic* result_. They
are symbolic because they stand for variables and have a type_, but
do not necessarily store actual values. Not yet, at least. (To give them
values, we will have to *evaluate* them. More on this below.)

.. _result: glossary.html#result
.. _type: glossary.html#type

Since we are using the addition operator (`x + c`) here on symbolic results, the
output `y` is also symbolic. The `+` corresponds to an *operation* in Theano
terminology, or *op* for short.
We use these results and ops to construct a `symbolic graph`_. The graph is
symbolic because we declare what it computes, but do not actually perform any
computation. Some type-checking is done while we build our graphs, so if you
try to do something really crazy you'll see an exception right away.
.. _symbolic graph: glossary.html#symbolicgraph
To actually use our graph for computation, we have to compile (or build) it into
a function `f`. The compiled function is actually capable of performing
computation. So after we have built `f`, we use it to compute the value of `y`
from an input value for `x`. Some argument checking is only possible at run
time, so if you ask for impossible things (e.g. the logarithm of a negative
number, or the sum of matrices with different shapes) then you will get
exceptions from the compiled function. These exceptions can be tricky to
understand, but we feel your pain and we are working hard to make these errors
easier to fix.
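As a sketch of the shape-mismatch case described above (shown here with plain numpy, since numpy applies the same run-time shape rule and the exact Theano exception text may vary):

```python
import numpy

a = numpy.ones((2, 3))
b = numpy.ones((3, 2))
try:
    a + b  # shapes (2, 3) and (3, 2) are incompatible, so this fails at run time
    raised = False
except ValueError:
    raised = True
```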
*TODO: Is concrete the opposite of symbolic? Do we actually have a term for this?*
*TODO: Go over TerminologyGlossary and make sure we touch on / link to most basic concepts in the above.*
*It would be worth thinking through the order in which these terms should be introduced.
Can we inline the text?*
*Note: Theano has two types of [DefineScalar scalar].*
Matrix example
==============
In the following example, we will build a function to evaluate the dot product `f(x, w) = dot(x, w)`.
*TODO: Are there ways we can nicely format the matrix math?*
.. code-block:: python
import theano
import theano.tensor as tensor
# Define the symbolic results
x_sym = tensor.matrix()
w_sym = tensor.matrix()
y_sym = tensor.dot(x_sym, w_sym)
f = theano.function([x_sym, w_sym], [y_sym])
from numpy import asarray
# Now, choose concrete x and w values.
# x = [[1 2 3]
# [4 5 6]]
x = asarray([[1, 2, 3], [4, 5, 6]])
# w = [[ 1 2]
# [-1 -2]
# [ 3 3]]
w = asarray([[1, 2], [-1, -2], [3, 3]])
# f(x, w) = [[ 8. 7.]
# [ 17. 16.]]
# .all() checks the equality over all matrix entries.
assert (f(x, w) == asarray([[8, 7], [17, 16]])).all()
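To see where these numbers come from, each output entry of the dot product is a row-by-column inner product; for example, the (0, 0) entry, worked by hand:

```python
import numpy

x = numpy.asarray([[1, 2, 3], [4, 5, 6]])
w = numpy.asarray([[1, 2], [-1, -2], [3, 3]])

# the (0, 0) entry of dot(x, w) is the inner product of x's row 0 with w's column 0
entry_00 = 1 * 1 + 2 * (-1) + 3 * 3
assert entry_00 == 8
assert numpy.dot(x, w)[0, 0] == entry_00
```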
*TODO: Explain the matrix and other interesting things going on here.*
*TODO: Explain that we have a lot of numpy functionality reimplemented. Link to
numpy docs and say familiarity won't hurt. Also link to list of available ops.*
Broadcasting example
====================
Broadcasting is a subtle but important numpy concept that Theano's tensor
types follow as well: when arrays of different (but compatible) shapes are
combined, the smaller one is virtually replicated along the missing dimensions
to match the larger one. Here is an example of how broadcasting works.
*WRITEME: Extend to above example to add a vector.*
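Until that example is written, here is a sketch of the broadcasting rule in plain numpy (Theano's tensor types follow the same convention):

```python
import numpy

m = numpy.asarray([[1., 2., 3.],
                   [4., 5., 6.]])      # shape (2, 3)
v = numpy.asarray([10., 20., 30.])     # shape (3,)

# v is virtually replicated along the first axis to match m's shape,
# so each row of m gets v added to it.
result = m + v
expected = numpy.asarray([[11., 22., 33.],
                          [14., 25., 36.]])
assert (result == expected).all()
```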
Gradient example
================
We are going to write some gradient-based learning code.
You may now wish to review some
`matrix conventions <http://pylearn.org/pylearn/wiki/MatrixConventions>`__.
(Hint: Each row is a training instance, each column is a feature dimension.)
*WRITEME: A simple logistic regression example.*
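Until the logistic regression example is written, here is a minimal numpy sketch of the core idea behind gradient-based learning and gradient checking: for `f(w) = sum(w**2)` the analytic gradient `2*w` should agree with one-sided finite differences (the same scheme Theano's own `numeric_grad` helper uses):

```python
import numpy

def f(w):
    return numpy.sum(w ** 2)

w = numpy.asarray([1.0, -2.0, 3.0])
analytic = 2 * w          # d/dw sum(w**2) = 2w

# one-sided finite differences with a fixed step size
eps = 1e-7
numeric = numpy.empty_like(w)
f_w = f(w)
for i in range(w.size):
    orig = w[i]
    w[i] = orig + eps
    numeric[i] = (f(w) - f_w) / eps
    w[i] = orig

# relative error should be tiny when the analytic gradient is correct
rel_err = numpy.abs(numeric - analytic) / (numpy.abs(numeric) + numpy.abs(analytic) + 1e-10)
assert rel_err.max() < 1e-4
```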
State example
=============
In this example, we'll look at a complete logistic regression model, with
training by simple gradient descent.
.. code-block:: python
import numpy
import theano
from theano import tensor as T

def build_logistic_regression_model(n_in, n_out, l2_coef=30.0):
    # DECLARE SOME VARIABLES
    x = T.matrix()  # our points, one point per row
    y = T.matrix()  # store our labels as place codes (label 3 of 5 is vector [00100])
    w = T.matrix()  # the linear transform to apply to our input points
    b = T.vector()  # a vector of biases, which make our transform affine instead of linear
    stepsize = T.scalar('stepsize')  # a stepsize for gradient descent

    # REGRESSION MODEL AND COST TO MINIMIZE
    prediction = T.softmax(T.dot(x, w) + b)
    cross_entropy = -T.sum(y * T.log(prediction) + (1 - y) * T.log(1.0 - prediction), axis=1)
    cost = T.sum(cross_entropy) + l2_coef * T.sum(T.sum(w * w))

    # GET THE GRADIENTS NECESSARY TO FIT OUR PARAMETERS
    grad_w, grad_b = T.grad(cost, [w, b])

    # COMPILE THE TRAINING AND PREDICTION FUNCTIONS
    update_fn = theano.function(
        inputs=[x, y, stepsize,
                In(w,
                   name='w',
                   value=numpy.zeros((n_in, n_out)),
                   update=w - stepsize * grad_w,
                   mutable=True,
                   strict=True),
                In(b,
                   name='b',
                   value=numpy.zeros(n_out),
                   update=b - stepsize * grad_b,
                   mutable=True,
                   strict=True)
               ],
        outputs=cost,
        mode='EXPENSIVE_OPTIMIZATIONS')

    apply_fn = theano.function(
        inputs=[x, In(w, value=update_fn.storage[w]), In(b, value=update_fn.storage[b])],
        outputs=[prediction])

    return update_fn, apply_fn

# USUALLY THIS WOULD BE IN A DIFFERENT FUNCTION/CLASS
# FIT SOME DUMMY DATA: 100 points with 10 attributes and 3 potential labels
update_fn, apply_fn = build_logistic_regression_model(n_in=10, n_out=3, l2_coef=30.0)

x_data = numpy.random.randn(100, 10)
y_data = numpy.random.randn(100, 3)
y_data = numpy.asarray(y_data == numpy.max(y_data, axis=1)[:, None], dtype='int64')

print "Model Training ..."
for iteration in xrange(1000):
    print "  iter", iteration, "cost", update_fn(x_data, y_data, stepsize=0.0001)

print "Model Predictions"
print apply_fn(x_data)
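The Theano code above needs the library itself to run; as a rough, self-contained numpy sketch of the same training loop (softmax regression on dummy data, fit by gradient descent; all names here are illustrative, not Theano API):

```python
import numpy

def softmax(z):
    # numerically stable row-wise softmax
    e = numpy.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = numpy.random.RandomState(0)
x_data = rng.randn(100, 10)              # 100 points, 10 attributes
labels = rng.randint(0, 3, size=100)     # 3 potential labels
y_data = numpy.eye(3)[labels]            # place codes: label 1 of 3 -> [0 1 0]

w = numpy.zeros((10, 3))
b = numpy.zeros(3)
stepsize = 0.5

for iteration in range(200):
    prediction = softmax(numpy.dot(x_data, w) + b)
    # gradient of the mean cross-entropy wrt the pre-softmax activations
    grad_z = (prediction - y_data) / len(x_data)
    w -= stepsize * numpy.dot(x_data.T, grad_z)
    b -= stepsize * grad_z.sum(axis=0)

accuracy = (softmax(numpy.dot(x_data, w) + b).argmax(axis=1) == labels).mean()
```

The training accuracy on this random data should comfortably beat the 1/3 chance level, since we are fitting the training set itself.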
Summary
=======
*TODO: Rewrite above examples to use doctest strings?*
*TODO: Go through above and link all terms, either to wiki documentation or to
epydoc documentation.*
*TODO: It would be useful to actually have example files like this in the source
code. The question is how to automatically extract the source files and inline
them into this documentation.*
.. header:: |THEANO| - README_ - Download_ - Documentation_ - Wiki_ - `Task List`_
.. |THEANO| image:: http://lgcm.iro.umontreal.ca/theano/chrome/site/theano_logo.png
:target: http://pylearn.org/auto_theano
:alt: THEANO
:align: top
:class: borderless
:width: 60
:height: 18
.. _README: ../README.html
.. _Download: ../README.html#downloading-theano
.. _Documentation: index.html
.. _Wiki: http://pylearn.org/theano
.. _task list: http://lgcm.iro.umontreal.ca/theano/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority
#!/usr/bin/python
# :Author: a Pygments author|contributor; Felix Wiemann; Guenter Milde
# :Date: $Date: 2007-06-13 12:20:42 +0200 (Wed, 13 Jun 2007) $
# :Copyright: This module has been placed in the public domain.
#
# This is a merge of `Using Pygments in ReST documents`_ from the pygments_
# documentation, and a `proof of concept`_ by Felix Wiemann.
#
# ========== ===========================================================
# 2007-06-01 Removed redundancy from class values.
# 2007-06-04 Merge of successive tokens of same type
# (code taken from pygments.formatters.others).
# 2007-06-05 Separate docutils formatter script
# Use pygments' CSS class names (like the html formatter)
# allowing the use of pygments-produced style sheets.
# 2007-06-07 Merge in the formatting of the parsed tokens
# (misnamed as docutils_formatter) as class DocutilsInterface
# 2007-06-08 Failsafe implementation (fallback to a standard literal block
# if pygments not found)
# ========== ===========================================================
#
# ::
"""Define and register a code-block directive using pygments
"""
# Requirements
# ------------
# ::
from docutils import nodes
from docutils.parsers.rst import directives
try:
import pygments
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters.html import _get_ttype_class
from pygments.styles import get_style_by_name
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter
# Customisation
# -------------
#
# Do not insert inline nodes for the following tokens.
# (You could add e.g. Token.Punctuation like ``['', 'p']``.) ::
unstyled_tokens = ['']
# DocutilsInterface
# -----------------
#
# This interface class combines code from
# pygments.formatters.html and pygments.formatters.others.
#
# It does not require anything of docutils and could also become a part of
# pygments::
class DocutilsInterface(object):
"""Parse `code` string and yield "classified" tokens.
Arguments
code -- string of source code to parse
language -- formal language the code is written in.
Merge subsequent tokens of the same token-type.
Yields the tokens as ``(ttype_class, value)`` tuples,
where ttype_class is taken from pygments.token.STANDARD_TYPES and
corresponds to the class argument used in pygments html output.
"""
def __init__(self, code, language):
self.code = code
self.language = language
def lex(self):
# Get lexer for language (use text as fallback)
try:
lexer = get_lexer_by_name(self.language)
except ValueError:
# info: "no pygments lexer for %s, using 'text'"%self.language
lexer = get_lexer_by_name('text')
return pygments.lex(self.code, lexer)
def join(self, tokens):
"""join subsequent tokens of same token-type
"""
tokens = iter(tokens)
(lasttype, lastval) = tokens.next()
for ttype, value in tokens:
if ttype is lasttype:
lastval += value
else:
yield(lasttype, lastval)
(lasttype, lastval) = (ttype, value)
yield(lasttype, lastval)
def __iter__(self):
"""parse code string and yield "clasified" tokens
"""
try:
tokens = self.lex()
except IOError:
print "INFO: Pygments lexer not found, using fallback"
# TODO: write message to INFO
yield ('', self.code)
return
for ttype, value in self.join(tokens):
yield (_get_ttype_class(ttype), value)
# code_block_directive
# --------------------
# ::
def code_block_directive(name, arguments, options, content, lineno,
content_offset, block_text, state, state_machine):
"""parse and classify content of a code_block
"""
language = arguments[0]
# create a literal block element and set class argument
if 0:
code_block = nodes.literal_block(classes=["code-block", language])
code_block += nodes.raw('<b>hello</b> one', 'hello two')
else:
code_block = nodes.literal_block(classes=["code-block", language])
# parse content with pygments and add to code_block element
for cls, value in DocutilsInterface(u'\n'.join(content), language):
if cls in unstyled_tokens:
# insert as Text to decrease the verbosity of the output.
code_block += nodes.Text(value, value)
else:
code_block += nodes.inline(value, value, classes=[cls])
if 0:
v = highlight(u'\n'.join(content), PythonLexer(),
HtmlFormatter(style='colorful', full=True, cssfile='blah.css'))
print help(nodes.Inline)
return [code_block]
# Register Directive
# ------------------
# ::
code_block_directive.arguments = (1, 0, 1)
code_block_directive.content = 1
directives.register_directive('code-block', code_block_directive)
# .. _doctutils: http://docutils.sf.net/
# .. _pygments: http://pygments.org/
# .. _Using Pygments in ReST documents: http://pygments.org/docs/rstdirective/
# .. _proof of concept:
# http://article.gmane.org/gmane.text.docutils.user/3689
#
# Test output
# -----------
#
# If called from the command line, call the docutils publisher to render the
# input::
except ImportError:
    import sys
    print >> sys.stderr, "Failed to import pygments"
if __name__ == '__main__':
from docutils.core import publish_cmdline, default_description
description = "code-block directive test output" + default_description
try:
import locale
locale.setlocale(locale.LC_ALL, '')
except:
pass
# Uncomment the desired output format:
publish_cmdline(writer_name='pseudoxml', description=description)
# publish_cmdline(writer_name='xml', description=description)
# publish_cmdline(writer_name='html', description=description)
# publish_cmdline(writer_name='latex', description=description)
# publish_cmdline(writer_name='newlatex2e', description=description)
/*
* :Author: Your Name
* :Contact: Your Email Address
* :Copyright: This stylesheet has been placed in the public domain.
*
* Stylesheet for use with Docutils. [Optionally place a more
* detailed description here.]
* */
@import url(html4css1.css); /* for basic rst stuff */
@import url(colorful.css); /* for source highlighting */
/* Your customizations go here. For example: */
/*
h1, h2, h3, h4, h5, h6, p.topic-title {
font-family: sans-serif }
*/
Title
=====
Some text.
Subtitle
--------
More stuff_.
.. _stuff: http://www.google.com
__docformat__ = "restructuredtext en"

import sys
import gof


def print_title(title_string, under_char, over_char=''):
    l = len(title_string)
    if over_char:
        print over_char * l
    print title_string
    if under_char:
        print under_char * l
    print ""


def print_hline():
    print '-' * 80


class Entry:
    """Structure for generating the oplist file"""

    symbol = None
    name = None
    module = None
    docstring = None
    tags = []

    def __init__(self, symbol, name, current_module):
        self.symbol = symbol
        self.name = name
        self.module = symbol.__module__
        self.docstring = symbol.__doc__
        self.tags = ['module:%s' % current_module.__name__] + getattr(symbol, '__oplist_tags', [])

    def mini_desc(self, maxlen=50):
        """Return a short description of the op"""
        def chomp(s):
            """interpret and left-align a docstring"""
            debug = 0
            r = []
            leadspace = True
            for c in s:
                if leadspace and c in ' \n\t':
                    continue
                else:
                    leadspace = False
                if c == '\n':
                    if debug:
                        print >> sys.stderr, 'breaking'
                    break
                if c in '\t*`':
                    c = ' '
                r.append(c)
            if debug:
                print >> sys.stderr, r
            return "".join(r)

        minmax = 5
        assert maxlen >= minmax
        if not self.docstring:
            return ""
        elif len(self.docstring) < maxlen:
            return chomp(self.docstring)
        else:
            return "%s ..." % chomp(self.docstring[:maxlen - minmax])

    apilink = property(lambda self: ":api:`%s.%s`" % (self.module, self.name))
    """Return the ReST link into the epydoc of this symbol"""


class EntryOp(Entry):
    def __init__(self, symbol, *args):
        if symbol is gof.Op:
            raise TypeError('not an Op subclass')
        if not issubclass(symbol, gof.Op):
            raise TypeError('not an Op subclass')
        Entry.__init__(self, symbol, *args)


class EntryConstructor(Entry):
    def __init__(self, symbol, name, module):
        is_op = isinstance(symbol, gof.Op)
        is_ctor = symbol in getattr(module, '__oplist_constructor_list', [])
        if not (is_op or is_ctor):
            raise TypeError('not a constructor', symbol)
        Entry.__init__(self, symbol, name, module)


def search_entries(module_list):
    ops = []
    constructors = []
    for module in module_list:
        symbol_name_list = [s for s in dir(module) if not s[0] == '_']
        for symbol_name in symbol_name_list:
            symbol = getattr(module, symbol_name)
            try:
                ops.append(EntryOp(symbol, symbol_name, module))
            except TypeError:
                try:
                    constructors.append(EntryConstructor(symbol, symbol_name, module))
                except TypeError:
                    pass
    return ops, constructors


def print_entries(ops, constructors):
    tags = {}
    for o in ops + constructors:
        for t in o.tags:
            tags.setdefault(t, []).append(o)
    for t in tags:
        print_title(t, '=')
        tagged_ops = [op for op in tags[t] if isinstance(op, EntryOp)]
        if len(tagged_ops):
            print_title('Op Classes', '-')
            for op in tagged_ops:
                print "- %s" % op.apilink
                print "  %s" % op.mini_desc()
                print ""
        tagged_ops = [op for op in tags[t] if isinstance(op, EntryConstructor)]
        if len(tagged_ops):
            print_title('Op Constructors', '-')
            for op in tagged_ops:
                print "- %s" % op.apilink
                print "  %s" % op.mini_desc()
                print ""


if __name__ == "__main__":
    """Generate the op list"""
    import scalar, sparse, tensor
    print_title("Op List", "~", "~")
    print """
This page lists the `Op Classes` and `constructors` that are provided by the Theano library.

`Op Classes` derive from :api:`Op`, whereas `constructors` are typically `Op Class` instances, but may be true Python functions.
In the future, this list may distinguish `constructors` that are Op instances from true Python functions.
"""
    print_hline()
    print ""
    print ".. contents:: "
    print ""

    ops, constructors = search_entries([scalar, sparse, tensor])
    print_entries(ops, constructors)
    print ""

    for line in open("doc/header.txt"):
        print line[:-1]
from gen_oplist import print_title, print_hline
if __name__ == '__main__':
    print_title("Type List", "~", "~")
    print "*THIS PAGE IS A PLACEHOLDER: WRITEME*"
    print ""
    print_hline()
    print ""
    print ".. contents::"
    print ""

    print_title("Type Classes", '=')
    print "- scalar.Scalar\n"
    print "- tensor.Tensor\n"
    print "- sparse.Sparse\n"

    print_title("Type Instances", '=')
    print "- scalar.int8\n"
    print "- tensor.lvector\n"
    print "- sparse.??\n"
    print ""

    for line in open("doc/header.txt"):
        print line[:-1]
...@@ -110,62 +110,4 @@ def grad_sources_inputs(sources, graph_inputs):
        gmap[r] = g_r
    return gmap
class numeric_grad:
    def __init__(self, f, pt, eps=1.0e-7):
        """Return the gradient of f at pt.

        This function computes the gradient by one-sided finite differences of a
        fixed step size (eps).

        It is assumed that f(...) will return a scalar.
        It is assumed that all f's inputs are numpy.ndarray objects.
        """
        gf = [numpy.ndarray(x.shape) for x in pt]
        f_pt = f(*pt)
        if isinstance(f_pt, (list, tuple)):
            f_pt = [numpy.copy(x) for x in f_pt]
        else:
            f_pt = numpy.copy(f_pt)
        for idx in xrange(len(gf)):
            if len(pt[idx].shape) == 0:
                orig = pt[idx]
                pt[idx] = numpy.asarray(pt[idx] + eps)
                f_eps = f(*pt)
                gf[idx] = numpy.asarray((f_eps - f_pt) / eps)
                pt[idx] = orig
            elif len(pt[idx].shape) == 1:
                for i in xrange(pt[idx].shape[0]):
                    orig = pt[idx][i]
                    pt[idx][i] = pt[idx][i] + eps
                    f_eps = f(*pt)
                    gf[idx][i] = numpy.asarray((f_eps - f_pt) / eps)
                    pt[idx][i] = orig
            elif len(pt[idx].shape) == 2:
                for i in xrange(pt[idx].shape[0]):
                    for j in xrange(pt[idx].shape[1]):
                        orig = pt[idx][i, j]
                        pt[idx][i, j] = pt[idx][i, j] + eps
                        f_eps = f(*pt)
                        gf[idx][i, j] = numpy.asarray((f_eps - f_pt) / eps)
                        pt[idx][i, j] = orig
            else:
                raise NotImplementedError()
        self.gf = gf

    @staticmethod
    def abs_rel_err(a, b, eps=1.0e-10):
        """Return a small number when a and b are close, relative to how big they are"""
        return abs(a - b) / (abs(a) + abs(b) + eps)

    def max_err(self, g_pt):
        """Return the biggest relative error between g_pt and self.gf"""
        assert len(g_pt) == len(self.gf)
        errs = []
        for a, b in zip(g_pt, self.gf):
            errs.append(numpy.max(numeric_grad.abs_rel_err(a, b)))
        return max(errs)
======
Theano
======
---------------------------------------------------------------
An optimizing compiler for matrix valued expressions in Python
---------------------------------------------------------------
Theano is an optimizing compiler in Python, built to evaluate complicated
expressions (especially matrix-valued ones) as quickly as possible.
It was written at LISA_ to explore techniques for machine learning.
Our project uses the name to honour the ancient Greek mathematician.
--------------------------------------------------------------------------------
.. _not in the normal sense: :wiki:`WhatIsTheano`
Overview
========
**To get up & running quickly** see README_.
All **documentation** can be reached from the `Theano Project Documentation Overview`_.
As developers of an open source project, we rely on **feedback** for
determining what features to implement, and what documentation needs to be
improved. The best forum for feedback is the theano-users_ mailing list.
All **discussion** about theano also takes place on the theano-users_ mailing list.
If you find a **bug**, please file a `bug report`_ or send email to
the theano-users_ mailing list. **Patch** submissions should be
sent to theano-dev_.
We welcome all kinds of **contributions**. Our `task list`_ is
full of interesting ideas awaiting a champion. If you have any
questions regarding how to extend Theano, please feel free to ask on
the Theano-dev_ mailing list.
Theano is in active development and should be considered **experimental**.
APIs are subject to change at any time.
Download
========
We recommend that you use the `latest snapshot`_.
Better yet, use `mercurial`_ to keep your installation fresh.
The snapshots usually contain *more features* and *fewer bugs* than the
"official" releases |---| they're not only for developers!
.. class:: credits
Docs by docutils_ and epydoc_.
Project by Mercurial_ and TRAC_.
Powered by Python_ and SciPy_.
Coded at the LISA_ lab.
.. class:: hidden
Google should index the mailing lists:
`theano-users <http://groups.google.com/group/theano-users?pli=1>`__,
and
`theano-dev <http://groups.google.com/group/theano-dev?pli=1>`__.
.. |---| unicode:: U+02014 .. em dash
:trim:
.. _latest snapshot: http://pylearn.org/hg/theano/archive/tip.tar.gz
.. _bug report: http://lgcm.iro.umontreal.ca/theano/newticket
.. _theano-users: http://groups.google.com/group/theano-users?pli=1
.. _theano-dev: http://groups.google.com/group/theano-dev?pli=1
.. _reStructuredText: rst.html
.. _task list: http://lgcm.iro.umontreal.ca/theano/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority
.. _README: README.html
.. _Quick-Start: README.html#quick-start
.. _Theano Project Documentation Overview: doc/index.html
.. _Mercurial: http://www.selenic.com/mercurial/wiki/
.. _docutils: http://docutils.sourceforge.net
.. _epydoc: http://epydoc.sourceforge.net/
.. _scipy: http://scipy.org/
.. _Python: http://www.python.org/
.. _TRAC: http://trac.edgewall.org/
.. _LISA: http://www.iro.umontreal.ca/rubrique.php3?id_rubrique=27
.. |TRAC| image:: http://www.edgewall.org/gfx/trac_logo.png
:target: http://www.edgewall.org/
:alt: Trac Logo
:align: middle
:class: borderless
:width: 193
:height: 32
.. |Python| image:: python.png
:alt: Python Logo
:align: middle
:class: borderless
:width: 193
:height: 32
.. |LISA| image:: http://www.iro.umontreal.ca/images/neurone_chip2.jpg
:target: http://www.iro.umontreal.ca/rubrique.php3?id_rubrique=27
:width: 193
:height: 32
:alt: LISA Logo
:align: middle
:class: borderless
.. header:: |THEANO| - README_ - Download_ - Documentation_ - Wiki_ - `Task List`_
.. _Download: README.html#downloading-theano
.. _Documentation: doc/index.html
.. _Wiki: http://pylearn.org/theano
.. |THEANO| image:: http://lgcm.iro.umontreal.ca/theano/chrome/site/theano_logo.png
:target: http://pylearn.org/auto_theano
:alt: THEANO
:align: top
:class: borderless
:width: 60
:height: 18
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:
#!/bin/bash

mkdir -p html/api && mkdir -p html/doc

# this builds some stuff or something... basically makes the rest work properly
# for a reason I don't understand. -JB 20080924
python __init__.py

#runs if you called $./local.build_html.sh epydoc
if [ " $1" != " rst" ]; then
    ./epydoc --config local.epydoc
fi

#runs if you called $./local.build_html.sh rst
if [ " $1" != " epydoc" ]; then
    APIRST2HTML=doc/apirst2html.py
    EPYDOC_ARGS='--external-api=api --external-api-file=api:html/api/api-objects.txt --external-api-root=api:../api/ --link-stylesheet'

    # install the stylesheets
    HTML4CSS1='/usr/lib/python2.5/site-packages/docutils/writers/html4css1/html4css1.css'
    cp $HTML4CSS1 html/html4css1.css
    cp doc/colorful.css html/colorful.css
    cp doc/style.css html/style.css

    #generate the index & readme files
    echo "$APIRST2HTML $EPYDOC_ARGS index.txt html/index.html..."
    $APIRST2HTML -stg $EPYDOC_ARGS --stylesheet=style.css index.txt html/index.html
    echo "$APIRST2HTML $EPYDOC_ARGS README.txt html/README.html..."
    $APIRST2HTML -stg $EPYDOC_ARGS --stylesheet=style.css README.txt html/README.html

    #generate the oplist in ReST format
    echo "gen oplist..."
    python gen_oplist.py > doc/oplist.txt
    python gen_typelist.py > doc/typelist.txt

    #generate html files for all the ReST documents in doc/
    echo "gen doc/*.txt..."
    for RST in doc/*.txt; do
        BASENAME=$(basename $RST .txt)
        echo "gen doc/$BASENAME.txt..."
        $APIRST2HTML -stg $EPYDOC_ARGS --stylesheet=../style.css doc/$BASENAME.txt html/doc/$BASENAME.html
    done
fi
from theano import tensor as T
from theano import scalar as S
from theano import gof
from copy import copy
import sys
......
...@@ -2,6 +2,7 @@
__docformat__ = "restructuredtext en"

import __builtin__
import sys  # for sys.maxint
import inspect
import functools
...@@ -20,17 +21,26 @@ import elemwise
import scalar as scal
from gof.python25 import partial
import compile

### set up the external interface
from elemwise import Elemwise, DimShuffle, CAReduce, Sum

__oplist_constructor_list = []
"""List of functions to be listed as op constructors in the oplist (`gen_oplist`, doc/oplist.txt)."""

def constructor(f):
    """Add `f` to :doc:`oplist`.

    Make `f` appear as a constructor in the oplist (`gen_oplist`, doc/oplist.txt).
    """
    __oplist_constructor_list.append(f)
    return f

def __oplist_tag(thing, tag):
    tags = getattr(thing, '__oplist_tags', [])
    tags.append(tag)
    thing.__oplist_tags = tags
def as_tensor(x, name = None):
...@@ -91,7 +101,7 @@ def constant(x):
    except:
        raise TypeError("Could not convert %s to Tensor" % x, type(x))

def value(x, name=None):
    """Return a symbolic `Value` with default value `x`

    :Exceptions:
...@@ -102,8 +112,12 @@ def value(x):
    else:
        x_ = numpy.asarray(x)
        try:
            if name is None:
                return TensorValue(Tensor(dtype = x_.dtype,
                                          broadcastable = [d == 1 for d in x_.shape]), x_)
            else:
                return TensorValue(Tensor(dtype = x_.dtype,
                                          broadcastable = [d == 1 for d in x_.shape]), x_, name=name)
        except:
            raise TypeError("Could not convert %s to Tensor" % x, type(x))
...@@ -203,7 +217,7 @@ class Tensor(Type):
        - `name`: str
          A pretty name to identify this `Result` when printing and debugging
        """
        return TensorResult(self, name = name)

    def __str__(self):
...@@ -481,6 +495,18 @@ class _tensor_py_operators:
        raise TypeError('Tensor does not support iteration. '
                        'Maybe you are using builtin.sum instead of theano.tensor.sum? (Maybe .max?)')

    # CONVENIENT ACCESS TO TYPE PROPERTIES
    ndim = property(lambda self: self.type.ndim)
    """The rank of this tensor."""

    broadcastable = property(lambda self: self.type.broadcastable)
    """The broadcastable signature of this tensor.

    See :doc:`broadcasting` for details.
    """

    dtype = property(lambda self: self.type.dtype)
    """The dtype of this tensor."""

class TensorResult(Result, _tensor_py_operators):
...@@ -523,11 +549,12 @@ def _elemwise(scalar_op, name, doc_prefix=''):
    return straight, inplace

def _redefine(real_symbol_value, module='tensor'):
    """Replace the value associated with a function symbol.

    This is useful to trick epydoc into doing what we want. It's a hack.
    """
    real_symbol_value.__module__ = 'tensor'
    def decorator(f):
        return real_symbol_value
    return decorator
...@@ -557,6 +584,7 @@ def _scal_elemwise(symbol):
    #for the meaning of this see the ./epydoc script
    # it makes epydoc display rval as if it were a function, not an object
    rval.__epydoc_asRoutine = symbol
    rval.__module__ = 'tensor'
    return rval
...@@ -606,6 +634,8 @@ def cast(t, dtype):
#to be removed as we get the epydoc routine-documenting thing going -JB 20080924
def _conversion(real_value):
    __oplist_tag(real_value, 'casting')
    real_value.__module__ = 'tensor'
    return real_value

convert_to_int8 = _conversion(elemwise.Elemwise(scal.Identity(scal.specific_out(scal.int8))))
...@@ -1314,32 +1344,101 @@ class SetSubtensor(Subtensor):
        x.__setitem__(cdata, y)
        out[0] = x
class Split(Op):
    """Partition a `TensorResult` along some axis.

    .. python::

        x = vector()
        splits = lvector()
        # you have to declare right away how many split_points there will be.
        ra, rb, rc = split(x, axis=0, points=splits, n_splits=3)

        f = compile([x, splits], [ra, rb, rc])

        a, b, c = f([0,1,2,3,4,5,6], [3, 2, 1])

        #a == [0,1,2]
        #b == [3, 4]
        #c == [5]
    """

    len_splits = None
    """A Split instance will have this many outputs, and require that the splits argument to
    `perform` have exactly this many elements.
    """

    def __init__(self, len_splits):
        self.len_splits = int(len_splits)

    def make_node(self, x, axis, splits):
        """WRITEME"""
        x = as_tensor(x)
        axis = as_tensor(axis)
        splits = as_tensor(splits)

        if splits.type != lvector:
            raise TypeError('splits must have type tensor.lvector', splits.type)
        if axis.type != lscalar:
            raise TypeError('axis must have type lscalar', axis.type)

        inputs = [x, axis, splits]
        outputs = [x.type() for i in xrange(self.len_splits)]
        return Apply(self, inputs, outputs)
class MakeVector(Op):
    """WRITEME"""
    def __init__(self, stype):
        self.stype = stype
    def make_node(self, *inputs):
        inputs = map(as_tensor, inputs)
        assert all(a.type == self.stype for a in inputs)
        return Apply(self, inputs, [Tensor(broadcastable = (False,),
                                           dtype = self.stype.dtype)()])
    def perform(self, node, inputs, (out,)):
        out[0] = numpy.asarray(inputs)
    def grad(self, inputs, (gout,)):
        return [None]*len(inputs)

make_lvector = MakeVector(lscalar)
"""WRITEME"""
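The numeric semantics of `MakeVector.perform` above is just packing its scalar inputs into a rank-1 array. A plain-numpy sketch (the helper name is hypothetical; this is not part of Theano):

```python
import numpy

def make_vector_perform(scalars):
    # Pack a sequence of scalars into a rank-1 ndarray,
    # mirroring what MakeVector.perform does with its inputs.
    return numpy.asarray(scalars)

v = make_vector_perform([1, 2, 3])
# v is a length-3 vector backed by a single ndarray
```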
    def perform(self, node, (x, axis, splits), outputs):
        """WRITEME"""
        try:
            len_along_axis = x.shape[axis]
        except:
            raise ValueError('Split.perform() with axis=(%s) is invalid for x.shape==(%s)'
                             % (axis, x.shape))
        if len(splits) != self.len_splits:
            raise ValueError('In Split.perform(), len(splits) != len_splits.',
                             (len(splits), self.len_splits))

        # Checking is done, let's roll the splitting algorithm!
        # Basically we step along the given axis of x, extracting subtensors of size splits[i]
        # as we go along.
        general_key = [slice(None, None, None) for s in x.shape]
        lower_idx = 0
        for i in xrange(self.len_splits):
            upper_idx = lower_idx + splits[i]
            general_key[axis] = slice(lower_idx, upper_idx, None)
            outputs[i][0] = x.__getitem__(general_key).copy()
            lower_idx = upper_idx

    def grad(self, (x, axis, splits), g_outputs):
        """Join the gradients along the axis that was used to split x."""
        return [join(axis, *g_outputs), None, None]
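The slicing loop in `Split.perform` can be checked against plain numpy. A standalone sketch (not the Theano Op, just the same stepping-along-an-axis scheme), using the values from the class docstring:

```python
import numpy

def split_along_axis(x, axis, splits):
    # Step along `axis` of x, extracting a subtensor of size splits[i]
    # at each step -- the same general_key/slice scheme as Split.perform.
    key = [slice(None)] * x.ndim
    outputs = []
    lower = 0
    for s in splits:
        key[axis] = slice(lower, lower + s)
        outputs.append(x[tuple(key)].copy())
        lower += s
    return outputs

a, b, c = split_along_axis(numpy.arange(7), axis=0, splits=[3, 2, 1])
# a == [0, 1, 2], b == [3, 4], c == [5], matching the docstring example
```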
class Join(Op):
    """Concatenate two `TensorResult`s along some axis.

    These tensors must have the same shape along all dimensions other than this axis.
    Of course, TensorResult instances don't have a shape, so this error can't be caught until
    runtime.  See `perform()`.

    .. python::

        x, y, z = tensor.matrix(), tensor.matrix(), tensor.matrix()
        u = tensor.vector()

        r = join(0, x, y, z)
        c = join(1, x, y, z)
        join(2, x, y, z)    # WRONG: the axis has to be an index into the shape
        join(0, x, u)       # WRONG: tensors have to have the same rank to be joined
    """
    def make_node(self, *axis_and_tensors):
        """
        WRITEME
        """
        axis, tensors = axis_and_tensors[0], axis_and_tensors[1:]
        as_tensor_args = [as_tensor(x) for x in tensors]
        dtypes = [x.type.dtype for x in as_tensor_args]
...@@ -1367,20 +1466,39 @@ class Concatenate(Op):
        bcastable[:] = as_tensor_args[0].type.broadcastable
        bcastable[axis] = False
        inputs = [as_tensor(axis)] + as_tensor_args
        if inputs[0].type != lscalar:
            raise TypeError('Axis could not be cast to lscalar', axis)
        outputs = [tensor(dtype = dtypes[0],
                          broadcastable = bcastable)]
        return Apply(self, inputs, outputs)
    def perform(self, node, axis_and_tensors, (out, )):
        """
        WRITEME
        """
        axis, tensors = axis_and_tensors[0], axis_and_tensors[1:]
        out[0] = numpy.concatenate(tensors, axis = axis)

    def grad(self, axis_and_tensors, (gz,)):
        """The gradient wrt a join op is a `Split`, used to partition the gradient along the
        `axis` which was used for joining.
        """
        axis, tensors = axis_and_tensors[0], axis_and_tensors[1:]
        if 'float' in tensors[0].dtype or 'complex' in tensors[0].dtype:
            # the join is differentiable: split the gradient along the join axis
            split = Split(len(tensors))
            return [None] + split(gz, axis, stack(*[shape(x)[axis] for x in tensors]))
        else:
            # assume that this isn't differentiable
            return [None] * (1 + len(tensors))

    def _native_grad(self, axis_and_tensors, (gz,)):
        """WRITEME"""
        axis, tensors = axis_and_tensors[0], axis_and_tensors[1:]
        sizes_along_axis = [shape(x)[axis] for x in tensors]
        n_dims = len(shape(tensors[0]))
        idx = [0]
        for s in sizes_along_axis:
            idx.append(idx[-1] + s)
...@@ -1390,91 +1508,172 @@ class Concatenate(Op):
                 [slice(None)] * (n_dims - axis - 1)] \
             for k in range(len(sizes_along_axis))]
    def vec_length(self, node):
        assert isinstance(node.owner.op, Join)
        if node.ndim != 1:
            raise TypeError('argument must be symbolic vector')
        inputs = node.owner.inputs
        axis, tensors = inputs[0], inputs[1:]
        # if node is a vector, axis must be 0
        # the question is whether all the inputs are broadcastable.
        if all(i.broadcastable[0] for i in tensors):
            return len(tensors)
@_redefine_asRoutine(Join())
def join(axis, *tensors):
    """
    Convenience function to concatenate `Tensor`s along the given axis.

    :Parameters:
     - `tensors` : list of tensors (or list-like)
       A list of tensors to be concatenated along the given axis.
     - `axis` : int (symbolic or literal)
       On which dimension should the tensors be joined?  The `axis` must be a valid index into
       the shape of the tensors to be concatenated.
       The `axis` parameter may either be an integer or an object that can be converted to a
       scalar using `as_scalar`(`axis`).  In the former case, the axis is fixed at construction,
       while in the latter it may vary over time depending on the value of the `axis` variable.

    The shapes of the tensors to be concatenated must be all identical, except in the dimension
    (`axis`) on which they are to be joined.
    """
@constructor
def leftpad_shape(tensor, n_ones):
    """Reshape `tensor` by left-padding the shape with `n_ones` 1s"""
    pattern = ['x']*n_ones + [i for i in range(tensor.type.ndim)]
    return DimShuffle(tensor.broadcastable, pattern)(tensor)

@constructor
def stack(*tensors):
    """Insert the arguments as slices into a tensor of 1 rank greater.

    For example, if `a`, `b` and `c` are vectors of length N, then `stack(a, b, c)`
    is a matrix of shape (3, N) whose rows are `a`, `b` and `c`.
    """
    return join(0, *[leftpad_shape(t, 1) for t in tensors])
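In numeric terms, `stack` is a join along a new leading axis: each argument is first left-padded with a length-1 dimension, then joined on axis 0. A plain-numpy sketch of that intended semantics (standalone; not the symbolic ops themselves):

```python
import numpy

def stack_numeric(*arrays):
    # Numeric analogue of stack(): left-pad each array's shape with a
    # length-1 axis (like leftpad_shape), then join along that new axis 0.
    padded = [a[numpy.newaxis, ...] for a in arrays]
    return numpy.concatenate(padded, axis=0)

m = stack_numeric(numpy.array([1, 2]), numpy.array([3, 4]), numpy.array([5, 6]))
# m has shape (3, 2); its rows are the three input vectors
```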
@constructor
def concatenate(tensor_list, axis=0):
    """Alias for `join`(axis, *tensor_list).

    This function is similar to `join`, but uses the signature of numpy's concatenate
    function.

    :Exceptions:
     - `TypeError` : the tensor_list must be a tuple or list
    """
    # Check someone did not make the common mistake to do something like:
    #   c = concatenate(x, y)
    # instead of
    #   c = concatenate((x, y))
    if not isinstance(tensor_list, (tuple, list)):
        raise TypeError("The 'tensors' argument must be either a tuple "
                        "or a list, make sure you did not forget () or [] around "
                        "arguments of concatenate.", tensor_list)
    return join(axis, *tensor_list)
def get_vector_length(v):
    """Return the run-time length of a symbolic vector.

    :Parameters:
     - `v` : A rank-1 Tensor result.

    :Exceptions:
     - `TypeError` : `v` hasn't the proper type.
     - `ValueError` : No special case applies, the length is not known.

    In general this is not possible, but for a number of special cases the length can be
    determined at compile / graph-construction time.  This function implements these special
    cases.
    """
    if v.ndim != 1:
        raise TypeError('argument must be symbolic vector')
    if isinstance(v, gof.Constant) and v.type.ndim == 1:
        return len(v.data)
    if v.owner and isinstance(v.owner.op, Join):
        try:
            return join.vec_length(v)
        except:
            pass
    if v.owner and v.owner.op == shape:
        return v.owner.inputs[0].type.ndim
    raise ValueError("length not known")

if 0: #vertical and horizontal stacking are deprecated. Better to use stack() and join().
    class VerticalStack(Op):
        """
        Vertically stack two L{Tensor}s.
        Stack two L{Tensor}s along the first axis (row wise).  These
        L{Tensor}s must have the same shape along all dimensions but the
        first.

        @attention: Because we use vstack as the implementation, if the
        inputs have 1-dimension, the output will have 2-dimensions.
        """
        def make_node(self, x, y):
            x = as_tensor(x)
            y = as_tensor(y)
            assert x.type.dtype == y.type.dtype
            if x.type.broadcastable[1:] != y.type.broadcastable[1:]:
                raise NotImplementedError
            inputs = [x, y]
            bcastable = (False, ) + x.type.broadcastable[1:]
            outputs = [tensor(dtype = x.type.dtype,
                              broadcastable = bcastable)]
            return Apply(self, inputs, outputs)
        def perform(self, node, (x, y), (out, )):
            assert x.ndim == y.ndim
            # Make sure every dimension (save the first) is the same
            for i in range(x.ndim): assert i == 0 or x.shape[i] == y.shape[i]
            out[0] = numpy.vstack([x, y])
        def grad(self, (x, y), (gz,)):
            """
            @todo: Make VSplit (or this grad implementation) its own L{Op},
            that way we can do more sanity-checking::

                assert x.ndim == y.ndim
                # Make sure every dimension (save the first) is the same
                for i in range(x.data.ndim): assert i == 0 or x.data.shape[i] == y.shape[i]
                etc...
            """
            xs = shape(x)
            ys = shape(y)
            return gz[:xs[0]], gz[xs[0]:]
    vertical_stack = VerticalStack()

    def horizontal_stack(x, y):
        """
        Horizontally stack two L{Tensor}s.
        Stack two L{Tensor}s along the second axis (column wise).  These
        L{Tensor}s must have the same shape along all dimensions but the
        second.

        @note: Unlike VerticalStack, we assume that the L{Tensor}s have
        two dimensions.
        """
        assert x.type.ndim == 2
        assert y.type.ndim == 2
        return transpose(vertical_stack(x.T, y.T))
else:
    pass
#########################
...@@ -1845,3 +2044,146 @@ def grad(cost, wrt, g_cost=None):
    else:
        return gmap.get(wrt, zero(wrt))
class numeric_grad:
    """WRITEME"""
    def __init__(self, f, pt, eps=1.0e-7):
        """Return the gradient of f at pt.

        This function computes the gradient by a one-sided finite differences of a
        fixed step size (eps).

        It is assumed that f(...) will return a scalar.
        It is assumed that all f's inputs are numpy.ndarray objects.
        """
        def prod(inputs):
            rval = 1
            for i in inputs:
                rval *= i
            return rval

        packed_pt = False
        if not isinstance(pt, (list, tuple)):
            pt = [pt]
            packed_pt = True

        apt = [numpy.array(p) for p in pt]

        shapes = [p.shape for p in apt]
        dtypes = [str(p.dtype) for p in apt]
        if not dtypes == [dtypes[0]] * len(apt):
            raise TypeError('All function arguments must have same dtype')

        total_size = __builtin__.sum(prod(sh) for sh in shapes)

        #create un-initialized memory
        x = numpy.ndarray((total_size,), dtype=dtypes[0])
        gx = numpy.ndarray((total_size,), dtype=dtypes[0])

        #set up aliases so that apt[i] is backed by memory in x
        # and self.gf is backed by memory in gx
        cur_pos = 0
        self.gf = []
        for i,p in enumerate(apt):
            p_size = prod(p.shape)
            # set up alias
            apt[i] = x[cur_pos:cur_pos+p_size].reshape(p.shape)
            self.gf.append(gx[cur_pos:cur_pos+p_size].reshape(p.shape))
            # initialize with p's value
            apt[i][:] = p
            cur_pos += p_size

        f_x = f(*[p.copy() for p in apt])

        # now iterate over the elements of x, and call f on apt.
        x_copy = x.copy()
        for i in xrange(total_size):
            x[:] = x_copy
            x[i] += eps
            f_eps = f(*apt)
            gx[i] = numpy.asarray((f_eps - f_x)/eps)

        if packed_pt:
            self.gf = self.gf[0]

    @staticmethod
    def abs_rel_err(a,b,eps=1.0e-10):
        """Return a small number when a and b are close, relative to how big they are"""
        return abs(a-b) / (abs(a)+abs(b)+eps)

    def max_err(self, g_pt):
        """Return the biggest relative error between g_pt and self.gf"""
        assert len(g_pt) == len(self.gf)
        errs = []
        for a, b in zip(g_pt, self.gf):
            errs.append(numpy.max(numeric_grad.abs_rel_err(a,b)))
        return numpy.max(errs)
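The one-sided finite-difference scheme used by `numeric_grad` (perturb each coordinate by `eps` and divide the change in `f` by `eps`) can be shown in a minimal standalone form, without the memory-aliasing machinery above (the helper name is illustrative, not part of Theano):

```python
import numpy

def finite_diff_grad(f, x, eps=1.0e-7):
    # One-sided finite differences: g[i] ~= (f(x + eps*e_i) - f(x)) / eps
    x = numpy.asarray(x, dtype='float64')
    f_x = f(x)
    g = numpy.zeros_like(x)
    for i in range(x.size):
        x_eps = x.copy()
        x_eps.flat[i] += eps
        g.flat[i] = (f(x_eps) - f_x) / eps
    return g

g = finite_diff_grad(lambda v: (v ** 2).sum(), numpy.array([1.0, 2.0]))
# g approximates the true gradient [2.0, 4.0] to within O(eps)
```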
def verify_grad(testcase, op, pt, n_tests=1, rng=numpy.random, eps=1.0e-7, tol=0.0001,
                linker='c&py'):
    """ WRITEME

    testcase.failUnless(analytic gradient matches finite-diff gradient)
    """
    pt = [numpy.array(p) for p in pt]

    def function(inputs, output):
        return compile.function(inputs, output,
                                mode=compile.Mode(optimizer = None, linker = linker),
                                accept_inplace=True)

    for test_num in xrange(n_tests):
        tensor_pt = [value(p.copy(), name='input %i'%i) for i,p in enumerate(pt)]

        #op can be either a function or an actual Op instance
        o_output = op(*tensor_pt)

        if isinstance(o_output, list):
            raise NotImplementedError('cant (yet) autotest gradient of op with multiple outputs')
            # we could make loop over outputs making random projections R for each,
            # but this doesn't handle the case where not all the outputs are
            # differentiable... so I leave this as TODO for now -JB.

        o_fn = function(tensor_pt, o_output)
        o_fn_out = o_fn(*[p.copy() for p in pt])
        random_projection = rng.rand(*o_fn_out.shape)
        t_r = as_tensor(random_projection)

        #random projection of o onto t_r
        cost = sum(t_r * o_output)
        cost_fn = function(tensor_pt, cost)

        num_grad = numeric_grad(cost_fn, [p.copy() for p in pt], eps)

        symbolic_grad = grad(cost, tensor_pt, as_tensor(1.0, name='g_cost'))

        grad_fn = function(tensor_pt, symbolic_grad)

        analytic_grad = grad_fn(*pt)
        if not isinstance(analytic_grad, (list, tuple)):
            analytic_grad = [analytic_grad]

        max_err = num_grad.max_err(analytic_grad)
        if max_err > tol:
            raise Exception(verify_grad.E_grad, (max_err, tol))

verify_grad.E_grad = 'gradient error exceeded tolerance'
"""This error is raised when a gradient is calculated, but incorrect."""
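`verify_grad` reduces a (possibly multi-valued) output to a scalar by a random projection: `cost = sum(R * output)`, whose gradient can then be checked both analytically and numerically. A plain-numpy sketch of why this works, using a toy elementwise-square function (everything here is illustrative, not Theano API):

```python
import numpy

rng = numpy.random.RandomState(42)

def output_fn(x):
    # toy vector-valued function: elementwise square
    return x ** 2

x = numpy.array([1.0, 2.0, 3.0])
R = rng.rand(3)                      # fixed random projection, as in verify_grad
cost = (R * output_fn(x)).sum()      # scalar cost = sum(R * f(x))

# Chain rule: d(cost)/dx = R * f'(x) = R * 2x, so the projected gradient
# of the scalar cost exposes errors in the gradient of the original output.
analytic = R * 2.0 * x
```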
...@@ -12,6 +12,7 @@ import numpy as N
import operator
import itertools
import sys
import compile #to register the optimizer built by this file

# Utilities
...@@ -40,9 +41,10 @@ gemm_pattern_1 = gof.PatternSub((T._sub_inplace,
# gemm: (d,a,b,c,s) -> d = d*s + a*dot(b,c)
# Transforms dot(a, b) into gemm(zeros(2)(hstack(shape(a)[:1], shape(b)[1:])), 1.0, a, b, 1.0)
# The construction of the 'gemm' node may fail if, for example, a and b are not both matrices.
dot_to_gemm = gof.PatternSub((T.dot, 'a', 'b'),
                             (T.gemm, (T.Zeros(2),
                                       (T.stack,
                                        (T.Subtensor([slice(0, 1)]), (T.shape, 'a')),
                                        (T.Subtensor([slice(1, 2)]), (T.shape, 'b')))),
                              T.constant(1.0), 'a', 'b', T.constant(1.0)),
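The gemm semantics named in the comment above, `(d,a,b,c,s) -> d = d*s + a*dot(b,c)`, and the `dot_to_gemm` rewrite (plain `dot` as gemm with a zero accumulator) can be sanity-checked numerically. A sketch with an illustrative helper (not the Theano `gemm` Op):

```python
import numpy

def gemm_numeric(d, a, b, c, s):
    # gemm semantics from the comment above: (d, a, b, c, s) -> d*s + a*dot(b, c)
    return d * s + a * numpy.dot(b, c)

bm = numpy.array([[1.0, 2.0], [3.0, 4.0]])
cm = numpy.array([[5.0, 6.0], [7.0, 8.0]])
# dot(b, c) expressed as gemm with a zero accumulator, as dot_to_gemm does:
zeros = numpy.zeros((bm.shape[0], cm.shape[1]))
r = gemm_numeric(zeros, 1.0, bm, cm, 1.0)
# r equals numpy.dot(bm, cm)
```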
...@@ -239,7 +241,15 @@ def local_subtensor_make_vector(node):
    If the index or slice is constant.
    """
    if not opt.check_chain(node, T.Subtensor, T.Join):
        return False

    joined_r = node.inputs[0]

    try:
        #check that join is being used to join scalars
        veclen = T.join.vec_length(joined_r)
    except:
        return False

    idxlist = node.op.idx_list
...@@ -652,6 +662,16 @@ def _math_optimizer():
math_optimizer = _math_optimizer()

compile.register_optimizer('math',
        gof.MergeOptMerge(
            gof.PureThenInplaceOptimizer(
                math_optimizer,
                inplace_optimizer)))

compile.register_mode('SANITY_CHECK', compile.Mode('c&py', 'math'))
compile.register_mode('FAST_RUN', compile.Mode('c|py', 'math'))
compile.register_mode('EXPENSIVE_OPTIMIZATIONS', compile.Mode('c|py', 'math'))

# @gof.local_optimizer
...