Commit 1857d61d authored by James Bergstra

Sandbox doc updates

Parent 8a2e4d1f
......@@ -114,6 +114,42 @@ For more details, including the interface for providing a C implementation of
perform(), refer to the documentation for :ref:`op`.
Checklist
---------
Use this list to make sure that you defined everything you need for your Op:
* Are there parameters that are not inputs but parametrize the behavior of your Op? (see parametrization section below)
* Yes?
* Define ``__init__`` with those parameters. They will be instance variables.
* Override ``__eq__``, ``__ne__`` and ``__hash__`` (optional)
* Consider making pre-made instances for common parameters. This will simplify usage.
* No? (usual case for simple Ops)
* Consider making a singleton of your Op (this can be as simple as ``my_op = MyOp()``). It will simplify usage. [*What is the benefit of using the singleton? How does it simplify usage? We __shouldn't__ use singletons when there __are__ parameters?*]
* All instances should compare equal (which is trivial if there is only one of them). [*How do we make sure this is true? Because this checklist should be a list of instructions. Do you describe later on?*]
* Always define *make_node* (see make_node section below).
* Always define *perform* (see perform section below).
* Do you need performance only C can offer?
* Define *c_code* and *c_code_cleanup* (see HowtoMakeCeeOps)
* Remember to use the 'c' or 'c|py' linker on graphs using your Op! [*This is described where?*]
* Is your Op differentiable?
* Define *grad* (see grad section below) [*If not, and you don't define *grad*, what will happen if you try to differentiate it?*]
* Does your Op modify any of its inputs?
* *IMPORTANT:* read the destroyers and viewers section.
* Does any output from the Op share any sort of state with an input?
* *IMPORTANT:* read the destroyers and viewers section.
* Does your Op have more than one output?
* Consider setting the default_output attribute to the index of that output. (It will make your Op usable in ``PatternOptimizers``, and make user code look like the Op has only that output.)
[*Consider changing the order of the checklist above and the sections below such that the stuff you ALWAYS have to do, which is the most basic stuff anyhow, goes towards the top.*]
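The parameter branch of the checklist above can be sketched in plain Python. ``ScaleOp`` below is a hypothetical example (not a real Theano Op, and it omits make_node/perform) showing the ``__init__``/``__eq__``/``__hash__`` pattern and pre-made instances for common parameters:

```python
class ScaleOp(object):
    """Hypothetical Op parametrized by a scale factor."""
    def __init__(self, factor):
        self.factor = factor  # instance variable: set once, never mutated

    # equal parameters => equal Ops, so graph optimizations can merge them
    def __eq__(self, other):
        return type(self) == type(other) and self.factor == other.factor

    def __ne__(self, other):
        return not (self == other)

    # the hash must agree with __eq__: equal Ops hash equally
    def __hash__(self):
        return hash((type(self), self.factor))

# pre-made instances for common parameters simplify usage
double = ScaleOp(2.0)
halve = ScaleOp(0.5)
```

With this pattern, two independently constructed ``ScaleOp(2.0)`` instances compare equal and hash equally, which is what the merge optimizations rely on.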
Defining mul
============
......@@ -348,10 +384,3 @@ your disposal to create these objects as efficiently as possible.
**Exercise**: Make a generic DoubleOp, where the number of
arguments can also be given as a parameter.
**Next:** `Implementing double in C`_
.. _Implementing double in C: ctype.html
TOTALLY CUTE HOMEPAGE! MAKE THIS A NICE HOMEPAGE :)
Documentation is broadly divided into two kinds: user documentation and
developer documentation.
- `Using Theano` covers how to *use* what is already in the Theano library to
build graphs and evaluate them.
- `Concepts` introduces how Theano works.
- `Extending Theano` explains how to add new
data and expression types, as well as optimizations to accompany them.
- `Module`
- `Hacking Theano` introduces you to what's under the hood: the compilation
process, the Env, C code generation.
Using Theano
============
- First of all, read the `tutorial`_. It is a cut-and-paste, tutorial-style intro to what Theano can do.
- Familiarize yourself with the :ref:`glossary`.
- Join `theano-users`_.
- Learn to use the typelist_, and the oplist_. These are the building blocks
of theano expression graphs.
- Browse through some of the `Howto`_ recipes on the wiki.
.. _Howto:
.. _theano-users: http://groups.google.com/group/theano-users?pli=1
.. _theano-dev: http://groups.google.com/group/theano-dev?pli=1
.. _tutorial: tutorial.html
.. _typelist: typelist.html
.. _oplist: oplist.html
Extending Theano
================
- Read about `How Theano Works <UserAdvanced.html>`__. This introduces the
major interface data structures: Op, Type, Result, Apply.
- Read about `Extending theano <extending.html>`__.
- How to make a new Op.
- How to make a new Optimization.
- How to make a new data Type.
Hacking Theano
==============
- `Get Started as a Developer <DevStartGuide.html>`__ by setting up mercurial, getting a few accounts,
setting up your environment, and getting some background in mercurial, python,
and numpy.
- Join `theano-dev`_ to participate in development discussion.
- Pick a task from the `task list`_, or suggest one on `theano-users`_.
Features/ideas are generally discussed on `theano-users`_. Technical
discussions of how to actually implement something should be on
`theano-dev`_.
- Browse `Theano's API <../api/>`__.
- Keep an eye on the `Mercurial Changelog <http://pylearn.org/hg/theano>`__.
- Send us your work as a patch to `theano-dev`_ or commit directly to the trunk.
.. _theano-dev: http://groups.google.com/group/theano-dev?pli=1
.. _task list: http://lgcm.iro.umontreal.ca/theano/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
......@@ -5,6 +5,10 @@
Installing Theano
=================
.. note::
If you are a member of LISA Labo, have a look at :ref:`lisa_labo` for
lab-specific installation instructions.
------------
Requirements
------------
......
......@@ -14,3 +14,4 @@ Structure
dev_start_guide
hg_primer
metadocumentation
lisa_labo
==============
README: theano
==============
.. contents::
Project Description
===================
<MOVED TO theano.txt>
License
-------
Theano is licensed under a BSD-like license. See the LICENSE file in the project root folder.
Installation
============
Software Requirements
---------------------
- linux or OS-X operating system
- python 2.5
- SciPy (specifically numpy, sparse, weave). We recommend scipy >= 0.7 if you are using sparse matrices, because scipy.sparse is buggy in 0.6 (scipy.csc_matrix dot has a bug with singleton dimensions; there may be more bugs). Numpy version >= 1.1 fixes a memory leak, and numpy >= 1.2 fixes more memory leaks.
- docutils, pygments (optional, to build documentation)
- mercurial (optional, to download the source)
- g++, python-dev (optional, to compile generated C code)
- nose (provides ``nosetests``), for testing
- Optional: `psyco <http://psyco.sourceforge.net/>`__ can make your python code much faster, if you are on a 32-bit x86 architecture. If you use compiled C code, this can be less important.
Downloading Theano
------------------
There are two ways to get the source: mercurial (required for library developers) and unix tar.
There are no stable releases yet.
*To get the source via mercurial,* you must have `mercurial <http://www.selenic.com/mercurial/wiki/>`__ installed.
Get the source and run the tests like this:
.. code-block:: bash
hg clone http://pylearn.org/hg/theano Theano
ln -s Theano/theano <someplace on your PYTHONPATH>/theano
cd Theano
nosetests #execute all the tests
All tests should pass. From time to time some tests are known to fail;
we are normally aware of these and working on fixes, but if you want to
be sure, contact us about them.
The PYTHONPATH environment variable should be modified so that python
can import theano. In bash, do this:
.. code-block:: bash
export PYTHONPATH=<path to theano>:$PYTHONPATH
In csh:
.. code-block:: csh
setenv PYTHONPATH <path to theano>:$PYTHONPATH
To update your library to the latest on pylearn.org, change directory (`cd`) to this `Theano` folder and type
.. code-block:: bash
hg pull -u
*To get the source via unix tar*, you can download the latest source directly as a gzip'd tar file:
`<http://pylearn.org/hg/theano/archive/tip.tar.gz>`__.
Configuring the environment
---------------------------
Two environment variables are used to control automatic code generation.
(It is possible to use theano in a way that avoids all automatic code generation, but the functions you make using ``theano.function`` will execute more slowly.)
- `THEANO_BLAS_LDFLAGS`:
a space-separated list of library names to link against for BLAS functions. Default: `-lblas`
- `THEANO_COMPILEDIR`:
a directory with read/write access permissions, where theano will store
autogenerated code and c modules. Default: `$HOME/.theano`. If this
directory does not exist, or does not have the correct permissions, then
theano will try to create it with the correct permissions. If that fails,
an exception will be raised and no C code will be compiled.
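The fallback behaviour described for `THEANO_COMPILEDIR` might look roughly like this sketch (an illustration only, not theano's actual code; ``resolve_compiledir`` is a hypothetical name):

```python
import os

def resolve_compiledir(environ):
    """Pick the compile directory, falling back to $HOME/.theano."""
    default = os.path.join(environ.get('HOME', '.'), '.theano')
    path = environ.get('THEANO_COMPILEDIR', default)
    if not os.path.isdir(path):
        # try to create it; if this fails, the OSError propagates,
        # analogous to theano raising when no usable directory exists
        os.makedirs(path)
    return path
```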
Setup on Linux
++++++++++++++
Setup on OS-X
+++++++++++++
- Install `MacPorts <http://www.macports.org/>`__
- `sudo port install gcc42 py25-zlib py25-numpy py25-scipy mercurial`.
Note that compiling gcc42 takes a significant amount of time (hours), so it's probably
not the best solution if you're in a rush! In my (Doomie) experience, scipy
failed to compile the first time I tried the command, but the second time
it compiled fine. The same thing happened with py25-zlib.
- Install some kind of BLAS library (TODO: how?)
- Set THEANO_BLAS_LDFLAGS to something which will link against said BLAS
library. (e.g., `THEANO_BLAS_LDFLAGS='-lcblas -latlas -lgfortran'`).
Setup on Windows
++++++++++++++++
No one has done this yet. WRITEME.
Tips for running at LISA
++++++++++++++++++++++++
Use the fast BLAS library that Fred installed, by setting
`THEANO_BLAS_LDFLAGS=-lgoto`.
Tips for running on a cluster
+++++++++++++++++++++++++++++
OUTDATED (this was written for mammouth1 and should be updated for mammouth2)
Use something like the following in your .bashrc:
.. code-block:: bash
#use the intel math-kernel library for BLAS routines
THEANO_BLAS_LDFLAGS=-lmkl
# use up to two threads in the MKL routines
OMP_NUM_THREADS=2
# IMPORTANT!
# Use the local-temporary directory as a cache.
# If several jobs start simultaneously and use a common
# cache, then the cache may be corrupted.
# Theano is not process-safe or thread-safe in this sense.
THEANO_COMPILEDIR=/ltmp/<username>_theano
You may also need to run the following from your shell:
.. code-block:: bash
module add python # for the current shell session
module initadd python # for this and future sessions
Lastly, if ``./filename.py`` doesn't work, try ``python filename.py``.
Running the Test Suite
======================
Test your installation by running the tests. Type at the shell:
.. code-block:: bash
cd theano
nosetests
All tests should pass.
python-nose must be installed. On red-hat or fedora core: ``sudo yum install python-nose.noarch``
Using Theano
============
Now that you've got theano installed and running, check out the `tutorial <doc/tutorial.html>`__ for how to use it.
Getting Help
============
If these installation instructions don't work, search the theano-users archive for similar cases. If you don't find a solution, write to theano-users and explain the situation.
.. _README: README.html
.. _Download: README.html#downloading-theano
.. _Documentation: doc/index.html
.. _Wiki: http://pylearn.org/theano
.. _task list: http://lgcm.iro.umontreal.ca/theano/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority
.. _gradient:
===========================
Computation of the Gradient
===========================
WRITEME
Describe what is happening in general when you compute the gradient
Give examples with varying shapes
......@@ -4,72 +4,6 @@
How to Make Ops
#################
[*Links within the page would be nice. Also, links to the epydocumentation for the major classes, e.g. Op and Result, would be useful at least once in the document.*]
:ref:`Graph`
What is an Op?
==============
An Op *instance* represents a particular function that can be applied to
inputs to produce a Result. Note that (unlike in the previous version of
Theano) an Op instance does not represent the application of a function,
only the function itself. This means that the same Op instance can be
used several times in the same computation graph, as part of different
nodes. [*I don't understand, how would that work? What are the semantics
of it?*]
An Op can provide the following special functionality which will be
detailed further down the page:
* Given a list of input Results, *make_node* produces an Apply instance representing the application of a function on those inputs.
* Given an Apply instance, a list of input values and a list of output storage, *perform* fills the storage with the results of the computation on the inputs.
* Given an Apply instance and names for the node, inputs and outputs, *c_code* and *c_code_cleanup* produce C code to compute the function.
* Given input Results and gradient Results, *grad* returns the symbolic expressions for computing gradients for each input.
To make an Op, extend the Op class and override the functions you need. The checklist section below should be of great help to make sure your Op's interface is complete.
Using an Op subclass
--------------------
This is not meant to give an exhaustive overview of how to use an Op (see IntroToOps for that).
.. code-block:: python
op = MyOp(<parameters>)
node = MyOp(<parameters>).make_node(<inputs>) # returns an Apply instance (contains pointers to op, inputs and outputs)
result = MyOp(<parameters>)(<inputs>) # returns as many Result instances as the op has outputs (each contains pointer to node) (this is what the end user manipulates)
value = op.perform(node, <values>, <storage>) # computes the function on actual values - see perform section
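As a rough illustration of the make_node/perform division of labour, here is a self-contained toy (hypothetical classes, not theano's actual API):

```python
class Apply(object):
    """Toy stand-in for theano's Apply node: op + inputs + outputs."""
    def __init__(self, op, inputs, outputs):
        self.op, self.inputs, self.outputs = op, inputs, outputs

class AddOp(object):
    """Hypothetical Op sketch: make_node builds a node, perform computes."""
    def make_node(self, *inputs):
        outputs = [None]              # placeholder for one output
        return Apply(self, list(inputs), outputs)

    def perform(self, node, input_values, output_storage):
        # fill the provided storage with the computed result
        output_storage[0][0] = sum(input_values)

add = AddOp()                         # a singleton instance
node = add.make_node('x', 'y')        # symbolic application: no arithmetic yet
storage = [[None]]
add.perform(node, [1.5, 2.5], storage)  # storage[0][0] is now 4.0
```

Note how the same ``add`` instance could build many nodes: the Op represents the function, each Apply represents one application of it.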
Checklist
---------
Use this list to make sure that you defined everything you need for your Op:
* Are there parameters that are not inputs but parametrize the behavior of your Op? (see parametrization section below)
* Yes?
* Define ``__init__`` with those parameters. They will be instance variables.
* Override ``__eq__``, ``__ne__`` and ``__hash__`` (optional)
* Consider making pre-made instances for common parameters. This will simplify usage.
* No? (usual case for simple Ops)
* Consider making a singleton of your Op (this can be as simple as ``my_op = MyOp()``). It will simplify usage. [*What is the benefit of using the singleton? How does it simplify usage? We __shouldn't__ use singletons when there __are__ parameters?*]
* All instances should compare equal (which is trivial if there is only one of them). [*How do we make sure this is true? Because this checklist should be a list of instructions. Do you describe later on?*]
* Always define *make_node* (see make_node section below).
* Always define *perform* (see perform section below).
* Do you need performance only C can offer?
* Define *c_code* and *c_code_cleanup* (see HowtoMakeCeeOps)
* Remember to use the 'c' or 'c|py' linker on graphs using your Op! [*This is described where?*]
* Is your Op differentiable?
* Define *grad* (see grad section below) [*If not, and you don't define *grad*, what will happen if you try to differentiate it?*]
* Does your Op modify any of its inputs?
* *IMPORTANT:* read the destroyers and viewers section.
* Does any output from the Op share any sort of state with an input?
* *IMPORTANT:* read the destroyers and viewers section.
* Does your Op have more than one output?
* Consider setting the default_output attribute to the index of that output. (It will make your Op usable in ``PatternOptimizers``, and make user code look like the Op has only that output.)
[*Consider changing the order of the checklist above and the sections below such that the stuff you ALWAYS have to do, which is the most basic stuff anyhow, goes towards the top.*]
Parametrization
===============
......@@ -103,13 +37,6 @@ In order for certain optimizations to apply (such as the merging of duplicate ca
Recall: the contract for ``__hash__`` is that ``a == b`` implies ``hash(a) == hash(b)``.
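To see why the contract matters: merging of duplicate applications can be implemented with a dict keyed on ``(op, inputs)``, which only finds duplicates if equal Ops also hash equally. A toy sketch (``ScaleOp`` and ``apply_merged`` are hypothetical names):

```python
class ScaleOp(object):
    """Hypothetical parametrized Op, used only to illustrate merging."""
    def __init__(self, factor):
        self.factor = factor
    def __eq__(self, other):
        return type(self) == type(other) and self.factor == other.factor
    def __hash__(self):
        return hash((type(self), self.factor))

cache = {}

def apply_merged(op, inputs):
    """Return one shared node per distinct (op, inputs) pair."""
    key = (op, inputs)
    if key not in cache:
        cache[key] = ('node', op, inputs)
    return cache[key]

n1 = apply_merged(ScaleOp(2.0), ('x',))
n2 = apply_merged(ScaleOp(2.0), ('x',))  # merged with n1: equal op, equal inputs
```

If ``__hash__`` disagreed with ``__eq__``, the two calls would miss each other in the dict and the duplicate computation would survive.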
Mutability
----------
In general, Theano's internal routines assume that the parameters of an op are immutable.
If in doubt, don't change them (especially once you are using them in a graph).
[*Does this mean that the output has to be deterministic? i.e. if I generate random numbers in the Op, do I have to reset the RNG to the initial state afterwards?*]
make_node
=========
......
.. _howtotest:
===========
How to test
===========
How to test an Op
=================
blah blah WRITEME
How to test an Optimizer
========================
yadda WRITEME yadda
......@@ -7,82 +7,3 @@ Sandbox, this documentation may or may not be out-of-date
*
Documentation is broadly divided into two kinds: user documentation and
developer documentation.
- `Using Theano` covers how to *use* what is already in the Theano library to
build graphs and evaluate them.
- `Concepts` introduces how Theano works.
- `Extending Theano` explains how to add new
data and expression types, as well as optimizations to accompany them.
- `Module`
- `Hacking Theano` introduces you to what's under the hood: the compilation
process, the Env, C code generation.
Using Theano
============
- First of all, read the `tutorial`_. It is a cut-and-paste, tutorial-style intro to what Theano can do.
- Familiarize yourself with the :ref:`glossary`.
- Join `theano-users`_.
- Learn to use the typelist_, and the oplist_. These are the building blocks
of theano expression graphs.
- Browse through some of the `Howto`_ recipes on the wiki.
.. _Howto:
.. _theano-users: http://groups.google.com/group/theano-users?pli=1
.. _theano-dev: http://groups.google.com/group/theano-dev?pli=1
.. _tutorial: tutorial.html
.. _typelist: typelist.html
.. _oplist: oplist.html
Extending Theano
================
- Read about `How Theano Works <UserAdvanced.html>`__. This introduces the
major interface data structures: Op, Type, Result, Apply.
- Read about `Extending theano <extending.html>`__.
- How to make a new Op.
- How to make a new Optimization.
- How to make a new data Type.
Hacking Theano
==============
- `Get Started as a Developer <DevStartGuide.html>`__ by setting up mercurial, getting a few accounts,
setting up your environment, and getting some background in mercurial, python,
and numpy.
- Join `theano-dev`_ to participate in development discussion.
- Pick a task from the `task list`_, or suggest one on `theano-users`_.
Features/ideas are generally discussed on `theano-users`_. Technical
discussions of how to actually implement something should be on
`theano-dev`_.
- Browse `Theano's API <../api/>`__.
- Keep an eye on the `Mercurial Changelog <http://pylearn.org/hg/theano>`__.
- Send us your work as a patch to `theano-dev`_ or commit directly to the trunk.
.. _theano-dev: http://groups.google.com/group/theano-dev?pli=1
.. _task list: http://lgcm.iro.umontreal.ca/theano/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _intro_to_ops:
===================
Introduction to Ops
===================
This page introduces :term:`Apply` and :term:`Op`. To start, let's consider the following program:
.. code-block:: python
import theano
from theano import tensor
a = tensor.constant(1.5)
b = tensor.fscalar()
c = a + b # Apply the Add Op to results a and b.
d = c + c # Apply the Add Op to the result c in two ways
f = theano.function([b], [d]) # Convert Op applications to callable objects.
assert 8.0 == f(2.5) # Bind 2.5 to 'b' and evaluate 'd' by running
# Add.perform() twice.
The python variables ``a,b,c,d`` all refer to instances of type
:term:`Result` (introduced in :ref:`intro_to_types`), whereas
instances of :term:`Apply` and :term:`Op` serve to connect them together.
:term:`Apply` instances permit :api:`function <theano.compile.function>`
to figure out how to compute outputs from inputs (in this case, ``d``
from ``b``). Comparing with python's normal types, an :term:`Apply`
instance is theano's version of a function call (or expression instance)
whereas :term:`Op` is theano's version of a function.
There are three fields which are fundamental to an :term:`Apply` instance:
* ``inputs``: a list of :term:`Result` instances that represent the arguments of the function.
* ``outputs``: a list of :term:`Result` instances that represent the return values of the function.
* ``op``: an Op instance that determines which function is being applied here.
Now that we've seen :term:`Result` and :term:`Apply` we can begin to
understand what :api:`function <theano.compile.function>` does. When a
:term:`Result` is the output of an :term:`Apply`, it stores a reference
to that :term:`Apply` as its ``owner``. Similarly, each :term:`Apply` stores a list of its
inputs. In this way, :term:`Result` and :term:`Apply` instances together
form a bi-partite directed acyclic graph: :term:`Results <Result>`
point to :term:`Applies <Apply>` via the ``.owner`` attribute and
:term:`Applies <Apply>` to :term:`Results <Result>` via the ``.inputs``
attribute. When we call :api:`function <theano.compile.function>` one
of the first things that happens is a search through this graph from the
:term:`Results <Result>` given as the function's outputs; this search
establishes how to compute the outputs from inputs, and finds all the
constants and values which contribute to the outputs.
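The graph search described above can be sketched with minimal stand-ins for ``Result`` and ``Apply`` (hypothetical classes mirroring the ``.owner``/``.inputs`` attributes, not theano's own):

```python
class Result(object):
    def __init__(self, owner=None, name=None):
        self.owner = owner        # the Apply that computes this Result, or None
        self.name = name

class Apply(object):
    def __init__(self, op, inputs):
        self.op = op
        self.inputs = inputs
        self.outputs = [Result(owner=self)]  # one output, pointing back at us

def ancestors(outputs):
    """Walk .owner/.inputs links from the outputs back to the graph inputs."""
    seen, stack = set(), list(outputs)
    while stack:
        r = stack.pop()
        if id(r) in seen:
            continue
        seen.add(id(r))
        if r.owner is not None:
            stack.extend(r.owner.inputs)
    return seen

# mirror the d = c + c example: d depends on c twice, c depends on a and b
b = Result(name='b')
c = Apply('add', [Result(name='a'), b]).outputs[0]
d = Apply('add', [c, c]).outputs[0]
```

The search visits ``d``, ``c``, ``a`` and ``b`` exactly once each, even though ``c`` is used twice, which is the bipartite-graph traversal that ``function`` performs.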
:term:`Op` instances, like :term:`Type` instances, tell :api:`function
<theano.compile.function>` what to do with the nodes it finds in this
graph search. An :term:`Op` instance has a ``perform`` method which
implements the computation that transforms the data associated with
``Apply.inputs`` to the data associated with ``Apply.outputs``.
What's Next?
============
* Read more about theano's :ref:`Graph`.
* Learn :ref:`HowtoMakeOps`.
.. _intro_to_types:
=====================
Introduction to Types
=====================
This page introduces ``theano.Result`` and ``theano.Type``.
class ``Result``
------------------
Consider the following program:
.. code-block:: python
import theano
from theano import tensor
a = tensor.constant(1.5) # declare a symbolic constant
b = tensor.fscalar() # declare a symbolic floating-point scalar
c = a + b # create a simple expression
f = theano.function([b], [c]) # convert the expression into a callable function
assert 4.0 == f(2.5) # bind 2.5 to 'b' and evaluate 'c'
The python variables ``a,b,c`` all refer to instances of type ``theano.Result``.
A ``Result`` is theano's version of a variable. There are three important kinds of ``Result``:
* ones that are the result of an expression (such as ``c``), which are plain ``Result`` instances
* constants, which are of subclass ``Constant``
* closures, which are of subclass ``Value``.
In our example, ``a`` refers to a ``Constant`` and ``b`` is a normal
``Result``. Although ``b`` is not the result of an expression in our
graph, it is necessary that ``b`` be the result of an expression outside
the graph; that's why ``b`` must be listed as one of the inputs of our
compiled function ``f``. We could have named ``a`` as an input to our
function too (even though it is declared as a constant) but as the example
shows, we don't have to because it already has a value associated with it.
The other kind of ``Result`` is the ``Value`` which implements
closures. It comes into play in the following variation on the program
above.
.. code-block:: python
import theano
from theano import tensor
a = tensor.value(1.5) # declare a symbolic value
b = tensor.fscalar() # declare a symbolic floating-point scalar
c = a # create a second name for a
c += b # c refers to the result of incrementing a by b
f = theano.function([b], [c]) # convert the expression into a callable function
assert 4.0 == f(2.5) # bind 2.5 to 'b' and evaluate 'c' (increments f's copy of a)
assert 6.5 == f(2.5) # bind 2.5 to 'b' and evaluate 'c' (increments f's copy of a)
g = theano.function([b], [c]) # make another function like f
assert 4.0 == g(2.5) # g got a fresh version of the closure, not the one modified by f
A ``Value`` is a ``Result`` that is not computed by any expression in
the graph; unlike ``b``, it does not need to be listed as an input to
our function, because it already carries a value. In this example,
``a`` is a ``Value`` instance. Any one of the expressions that use a
``Value`` in a given function can modify it, and the modified value
will persist between evaluations of that function. If two expressions
try to modify the same ``Value`` then ``theano.function`` will raise an
exception.

Incidentally, ``theano.function`` might choose to work in-place on
internal results at its discretion: once you tell it which input and
output results you care about, it basically has free rein over all the
others.
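The fresh-copy behaviour shown in the ``f``/``g`` example above can be mimicked in plain Python: if each "compilation" deep-copies the initial value, separately compiled functions do not share state. This is an analogy only, not theano's mechanism:

```python
import copy

def compile_increment(initial_a):
    """Each call plays the role of theano.function: it gets a fresh copy of a."""
    state = {'a': copy.deepcopy(initial_a)}
    def f(b):
        state['a'] += b           # the in-place update persists between calls to f
        return state['a']
    return f

f = compile_increment(1.5)
g = compile_increment(1.5)
```

As in the theano example, repeated calls to ``f`` keep incrementing ``f``'s private copy, while ``g`` starts from a fresh copy of the initial value.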
class ``Type``
----------------
`autodoc of theano.Type <http://lgcm.iro.umontreal.ca:8000/theano/chrome/common/epydoc/theano.gof.type.Type-class.html>`__
A ``Type`` instance hides behind each ``Result`` and indicates what
sort of value we can associate with that ``Result``. Many ``Result``
instances can use the same ``Type`` instance. In our example above
``theano.fscalar`` is a ``Type`` instance, and calling it generated
a ``Result`` of that type. The ``Type`` of a ``Result`` works like a
type declaration: it is a promise that at computation time, the actual
value (not symbolic anymore) will have a certain interface. To really
go into detail is beyond the scope of this user intro, but for example
if a ``Result`` has the type ``tensor.fvector`` then we'll compute a
1-dimensional numpy.ndarray of dtype('float32') for it (use
``tensor.dvector`` for float64). ``Type`` instances
are also responsible for exposing actual data to C code, and packaging it
back up for python when ``theano.function`` is asked to generate C code.
To learn more about that, read the introduction to CodeGeneration.
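The promise a ``Type`` makes can be sketched as an object that validates and converts the concrete value supplied at computation time (``FScalarType`` is a hypothetical illustration, not theano's class):

```python
class FScalarType(object):
    """Hypothetical Type: promises the runtime value behaves like a float."""
    def filter(self, data):
        # convert the supplied data to the promised interface,
        # raising if that is impossible
        return float(data)

# many Results could share this single Type instance
t = FScalarType()
```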
What's Next?
--------------
The companion to Result and Type is :ref:`intro_to_ops`, which develops a similar story for the expression objects themselves.
.. _logistic_regression_example:
State example
=============
In this example, we'll look at a complete logistic regression model, with
training by gradient descent.
BUT, YOU GOTTA RUN THIS CODE AND MAKE SURE IT STILL WORKS NICELY, HEY?
.. code-block:: python
    import numpy
    import theano
    from theano import tensor as T
    from theano import In   # note: the import location of In varies between versions

    def build_logistic_regression_model(n_in, n_out, l2_coef=30.0):
        # DECLARE SOME VARIABLES
        x = T.matrix()      # our points, one point per row
        y = T.matrix()      # store our labels as place codes (label 3 of 5 is vector [00100])
        w = T.matrix()      # the linear transform to apply to our input points
        b = T.vector()      # a vector of biases, which make our transform affine instead of linear
        stepsize = T.scalar('stepsize')  # a stepsize for gradient descent

        # REGRESSION MODEL AND COSTS TO MINIMIZE
        prediction = T.softmax(T.dot(x, w) + b)
        cross_entropy = -T.sum(y * T.log(prediction), axis=1)
        cost = T.sum(cross_entropy) + l2_coef * T.sum(w * w)

        # GET THE GRADIENTS NECESSARY TO FIT OUR PARAMETERS
        grad_w, grad_b = T.grad(cost, [w, b])

        # COMPILE THE TRAINING FUNCTION; w AND b ARE UPDATED IN PLACE
        update_fn = theano.function(
            inputs=[x, y, stepsize,
                    In(w,
                       name='w',
                       value=numpy.zeros((n_in, n_out)),
                       update=w - stepsize * grad_w,
                       mutable=True,
                       strict=True),
                    In(b,
                       name='b',
                       value=numpy.zeros(n_out),
                       update=b - stepsize * grad_b,
                       mutable=True,
                       strict=True)],
            outputs=cost,
            mode='EXPENSIVE_OPTIMIZATIONS')

        # A PREDICTION FUNCTION THAT SHARES ITS w AND b STORAGE WITH update_fn
        apply_fn = theano.function(
            inputs=[x, In(w, value=update_fn.storage[w]), In(b, value=update_fn.storage[b])],
            outputs=[prediction])

        return update_fn, apply_fn

    # USUALLY THIS WOULD BE IN A DIFFERENT FUNCTION/CLASS
    # FIT SOME DUMMY DATA: 100 points with 10 attributes and 3 potential labels
    update_fn, apply_fn = build_logistic_regression_model(n_in=10, n_out=3, l2_coef=30.0)
    x_data = numpy.random.randn(100, 10)
    y_data = numpy.random.randn(100, 3)
    y_data = numpy.asarray(y_data == numpy.max(y_data, axis=1).reshape(100, 1), dtype='int64')

    print "Model Training ..."
    for iteration in xrange(1000):
        print " iter", iteration, "cost", update_fn(x_data, y_data, stepsize=0.0001)

    print "Model Predictions"
    print apply_fn(x_data)
.. _tutorial:
=============
Tutorial
=============
.. contents::
*This documentation is still in-progress. 20080919*
Introduction
============
Great. You know `What theano is`_, and you've even `installed it`_.
But how do you use it?
.. _`What theano is`: http://lgcm.iro.umontreal.ca/theano/wiki/WhatIsTheano
.. _`installed it`: http://lgcm.iro.umontreal.ca/theano/wiki/InstallationNotes
If you have never used Theano before, we recommend you read over this tutorial start-to-finish. This will give you a sense of what you can do with Theano, and how.
Afterwards, we encourage you to read the documentation in accompanying links, which will allow you to understand the underlying concepts behind Theano better.
Scalar example
==============
In the following example, we will build a function `f(x) = x + 1.5`. We will then evaluate that function at a concrete input.
.. code-block:: python
import theano
import theano.tensor as tensor
# Declare a symbolic constant
c = tensor.constant(1.5)
# Declare a symbolic floating-point scalar
x = tensor.fscalar()
# The symbolic result y is computed by adding x to c
y = x + c
# f is a function we build to compute output y given input x.
# f(x) = y
# = x + c
# = x + 1.5
f = theano.function([x], [y])
# We now bind 2.5 to an internal copy of x and evaluate an internal y,
# which we return (actually f(2.5) returns a list because theano
# functions in general can return many things).
# We assert that 4.0 == f(2.5)[0] = 2.5 + 1.5
assert 4.0 == f(2.5)[0]
In the example above, `c`, `x`, and `y` are each a *symbolic*
result_. They are symbolic because they stand for variables and have
a type_, but do not actually store instantiated values. Not yet,
at least. (To give them values, we will have to `evaluate` them.
More on this below.)
.. _result: glossary.html#result
.. _type: glossary.html#type
Since we are using the addition operator (`x + c`) here on symbolic results, the
output `y` is also symbolic. The `+` corresponds to an *operation* in theano
terminology, or *op* for short.
We use these results and ops to construct a `symbolic graph`_. The graph
is symbolic because we declare what it computes, but do not actually
perform any computation. Some type-checking is done while we build
our graphs, so if you try to do something really crazy you'll see an
exception right away.
.. _symbolic graph: glossary.html#symbolicgraph
To actually use our graph for computation, we have to compile_ (or build_) it into
a function `f`. The compiled function is actually capable of performing
computation. So after we have built `f`, we use it to compute the value of `y` from
a concrete input `x`. Some argument checking is only possible at run-time, so if
you ask for impossible things (e.g. the logarithm of a negative number, or the sum of
matrices with different shapes) then you will get exceptions from the compiled
function. These exceptions can be tricky to understand, but we feel your pain
and we are working hard to make these errors easier to fix.
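The build-then-compile distinction can be illustrated with a tiny expression-graph sketch in plain Python (hypothetical ``Expr``/``function``, not theano's implementation): building the graph performs no arithmetic; only the compiled callable does.

```python
class Expr(object):
    """A symbolic node: constructing one performs no arithmetic."""
    def __init__(self, op=None, args=()):
        self.op, self.args = op, args
    def __add__(self, other):
        return Expr('add', (self, other))   # graph construction only

def evaluate(expr, env):
    if expr.op is None:                     # a leaf: look its value up
        return env[expr]
    return sum(evaluate(a, env) for a in expr.args)

def function(inputs, output):
    """'Compile' the graph into a callable, like theano.function."""
    def f(*values):
        return evaluate(output, dict(zip(inputs, values)))
    return f

x = Expr()
c = Expr()
y = x + c            # builds the graph; nothing is computed here
f = function([x, c], y)
```

Only the call ``f(2.5, 1.5)`` actually walks the graph and performs the addition, mirroring how `theano.function` separates graph construction from evaluation.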
*TODO: Is concrete the opposite of symbolic? Do we actually have a term for this?*
*TODO: Go over TerminologyGlossary and make sure we touch on / link to most basic concepts in the above.*
*It would be worth thinking through the order in which these terms should be introduced.
Can we inline the text?*
*Note: Theano has two types of scalar_.*
Matrix example
==============
In the following example, we will build a function to evaluate the dot product `f(x) = dot(x, w)`.
*TODO: Are there ways we can nicely format the matrix math?*
.. code-block:: python
import theano
import theano.tensor as tensor
# Define the symbolic results
x_sym = tensor.matrix()
w_sym = tensor.matrix()
y_sym = tensor.dot(x_sym, w_sym)
f = theano.function([x_sym, w_sym], [y_sym])
from numpy import asarray
# Now, choose concrete x and w values.
# x = [[1 2 3]
# [4 5 6]]
x = asarray([[1, 2, 3], [4, 5, 6]])
# w = [[ 1 2]
# [-1 -2]
# [ 3 3]]
w = asarray([[1, 2], [-1, -2], [3, 3]])
# f(x, w) = [[ 8. 7.]
# [ 17. 16.]]
# .all() checks the equality over all matrix entries.
assert (f(x, w) == asarray([[8, 7], [17, 16]])).all()
*TODO: Explain the matrix and other interesting things going on here.*
*TODO: Explain that we have a lot of numpy functionality reimplemented. Link to
numpy docs and say familiarity won't hurt. Also link to list of available ops.*
Broadcasting example
====================
Broadcasting is a subtle and important concept in numpy, which I don't
completely understand. Regardless, here is an example of how broadcasting
works.
*WRITEME: Extend to above example to add a vector.*
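Until that is written, here is a minimal numpy sketch of the broadcasting the text refers to: a vector added to a matrix behaves as if it were replicated along the rows.

```python
import numpy

x = numpy.asarray([[1, 2, 3],
                   [4, 5, 6]])
v = numpy.asarray([10, 20, 30])

# v (shape (3,)) is broadcast against x (shape (2, 3)):
# it is added to every row of x.
y = x + v
# y == [[11 22 33]
#       [14 25 36]]
```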
Gradient example
================
We are going to write some gradient-based learning code.
You may now wish to review some
`matrix conventions <http://pylearn.org/pylearn/wiki/MatrixConventions>`__.
(Hint: Each row is a training instance, each column is a feature dimension.)
*WRITEME: A simple logistic regression example.*
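Until the logistic-regression example is written, here is a minimal numpy sketch of the gradient-descent loop this section has in mind; the quadratic cost and its analytic gradient are stand-ins, not part of the planned example.

```python
import numpy

def cost(w):
    return ((w - 3.0) ** 2).sum()     # minimised at w == 3

def grad(w):
    return 2.0 * (w - 3.0)            # analytic gradient of the cost

w = numpy.zeros(4)
stepsize = 0.1
for _ in range(100):
    w = w - stepsize * grad(w)        # the same update rule used below in In(w, update=...)
```

After the loop, ``w`` has converged to the minimiser (all entries close to 3).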
State example
=============
In this example, we'll look at a complete logistic regression model, with
training by gradient descent.
.. code-block:: python
    import numpy
    import theano
    from theano import tensor as T
    from theano import In   # note: the import location of In varies between versions

    def build_logistic_regression_model(n_in, n_out, l2_coef=30.0):
        # DECLARE SOME VARIABLES
        x = T.matrix()      # our points, one point per row
        y = T.matrix()      # store our labels as place codes (label 3 of 5 is vector [00100])
        w = T.matrix()      # the linear transform to apply to our input points
        b = T.vector()      # a vector of biases, which make our transform affine instead of linear
        stepsize = T.scalar('stepsize')  # a stepsize for gradient descent

        # REGRESSION MODEL AND COSTS TO MINIMIZE
        prediction = T.softmax(T.dot(x, w) + b)
        cross_entropy = -T.sum(y * T.log(prediction), axis=1)
        cost = T.sum(cross_entropy) + l2_coef * T.sum(w * w)

        # GET THE GRADIENTS NECESSARY TO FIT OUR PARAMETERS
        grad_w, grad_b = T.grad(cost, [w, b])

        # COMPILE THE TRAINING FUNCTION; w AND b ARE UPDATED IN PLACE
        update_fn = theano.function(
            inputs=[x, y, stepsize,
                    In(w,
                       name='w',
                       value=numpy.zeros((n_in, n_out)),
                       update=w - stepsize * grad_w,
                       mutable=True,
                       strict=True),
                    In(b,
                       name='b',
                       value=numpy.zeros(n_out),
                       update=b - stepsize * grad_b,
                       mutable=True,
                       strict=True)],
            outputs=cost,
            mode='EXPENSIVE_OPTIMIZATIONS')

        # A PREDICTION FUNCTION THAT SHARES ITS w AND b STORAGE WITH update_fn
        apply_fn = theano.function(
            inputs=[x, In(w, value=update_fn.storage[w]), In(b, value=update_fn.storage[b])],
            outputs=[prediction])

        return update_fn, apply_fn

    # USUALLY THIS WOULD BE IN A DIFFERENT FUNCTION/CLASS
    # FIT SOME DUMMY DATA: 100 points with 10 attributes and 3 potential labels
    update_fn, apply_fn = build_logistic_regression_model(n_in=10, n_out=3, l2_coef=30.0)
    x_data = numpy.random.randn(100, 10)
    y_data = numpy.random.randn(100, 3)
    y_data = numpy.asarray(y_data == numpy.max(y_data, axis=1).reshape(100, 1), dtype='int64')

    print "Model Training ..."
    for iteration in xrange(1000):
        print " iter", iteration, "cost", update_fn(x_data, y_data, stepsize=0.0001)

    print "Model Predictions"
    print apply_fn(x_data)
Summary
=======
*TODO: Rewrite above examples to use doctest strings?*
*TODO: Go through above and link all terms, either to wiki documentation or to
epydoc documentation.*
*TODO: It would be useful to actually have example files like this in the source
code. The question is how to automatically extract the source files and inline
them into this documentation.*
.. _README: ../README.html
.. _Download: ../README.html#downloading-theano
.. _Documentation: index.html
.. _Wiki: http://pylearn.org/theano
.. _task list: http://lgcm.iro.umontreal.ca/theano/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority