Commit d5e4d890 authored by James Bergstra

logistic regression, comments

Parent c79e0c92
...@@ -4,19 +4,52 @@ README: theano ...@@ -4,19 +4,52 @@ README: theano
.. contents:: .. contents::
Quick-Start
===========
WRITEME.
Project Description Project Description
=================== ===================
Theano is a python library for manipulating and evaluating expressions, especially matrix-valued ones.
What does Theano do that Python and numpy do not?
- *execution speed optimizations*: Theano can use `g++` to compile parts of your expression graph into native machine code, which runs much faster than Python.
- *symbolic differentiation*: Theano can build symbolic graphs for computing gradients.
- *stability optimizations*: Theano can recognize numerically unstable expressions and compute them with more stable algorithms.
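To make the stability point concrete, here is a minimal plain-Python sketch (not Theano code) of the kind of rewrite such an optimization performs. The function names are illustrative, and whether Theano applies this particular rewrite is not stated in this README.

```python
import math


def softplus_naive(x):
    # Direct translation of log(1 + exp(x)); exp(x) overflows for large x.
    return math.log(1.0 + math.exp(x))


def softplus_stable(x):
    # Algebraically identical form: max(x, 0) + log(1 + exp(-|x|)).
    # The argument of exp is never positive, so it cannot overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


print(softplus_stable(1000.0))
```

Both functions agree for moderate inputs, but only the second one returns a value for an input such as 1000.0, where the naive form raises `OverflowError`.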
Here's a very simple example of how to use Theano. It doesn't show off many of Theano's features, but it illustrates concretely what Theano is.
.. code-block:: python
    import theano
    from theano import tensor
    a = tensor.fscalar()  # declare a symbolic floating-point scalar.
    b = tensor.fscalar()  # declare a symbolic floating-point scalar.
    c = a + b  # create a simple expression
    f = theano.function([a, b], [c])  # convert the expression into a callable object
                                      # that takes (a,b) values as input and computes a value for c
    assert 4.0 == f(1.5, 2.5)  # bind 1.5 to 'a', 2.5 to 'b', and evaluate 'c'
Theano is not a programming language in the usual sense: you write a Python program that builds expressions for Theano. It still resembles one, because to use Theano you have to
- declare variables (`a`, `b`) and give their types
- build expressions for how to put those variables together
- compile expression graphs to functions in order to use them for computation.
It is good to think of `theano.function` as the interface to a compiler which builds a callable object from a purely symbolic graph.
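The "compiler" idea can be illustrated with a toy model in plain Python. This is only a sketch of the concept: the `Var`, `Apply`, and `function` names below are hypothetical, and nothing here reflects how Theano is actually implemented.

```python
import operator


class Var(object):
    """A symbolic placeholder: it has a name but stores no value."""

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        return Apply(operator.add, self, other)

    def __mul__(self, other):
        return Apply(operator.mul, self, other)


class Apply(Var):
    """A graph node that records an operation and its symbolic inputs."""

    def __init__(self, op, *inputs):
        Var.__init__(self, op.__name__)
        self.op = op
        self.inputs = inputs


def evaluate(node, env):
    # Walk the graph recursively; leaves are looked up in the environment.
    if isinstance(node, Apply):
        return node.op(*[evaluate(i, env) for i in node.inputs])
    return env[node]


def function(inputs, output):
    """Turn a purely symbolic graph into a callable object."""
    def call(*values):
        return evaluate(output, dict(zip(inputs, values)))
    return call


a = Var('a')
b = Var('b')
c = a + b                # builds a graph node, computes nothing
f = function([a, b], c)  # "compile" the graph
print(f(1.5, 2.5))
```

Building `c` performs no arithmetic; values only appear when `f` is called, which is the separation between graph construction and evaluation that the three steps above describe.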
License License
------- -------
Theano is licensed under a BSD-like license. See the LICENSE file in the project root folder.
Installation Installation
...@@ -28,10 +61,57 @@ Installation ...@@ -28,10 +61,57 @@ Installation
Software Requirements Software Requirements
--------------------- ---------------------
- linux or OS-X operating system
- python 2.5
- SciPy (specifically numpy, sparse, weave). Numpy version >= 1.1 fixes memory leak.
- docutils, pygments (optional, to build documentation)
- mercurial (optional, to download the source)
- g++, python-dev (optional, to compile generated C code)
- `psyco <http://psyco.sourceforge.net/>`__ can make your python code much faster, if you are on a 32-bit x86 architecture. If you use compiled C code, this can be less important.
Downloading Theano Downloading Theano
------------------ ------------------
There are two ways to get the source: mercurial (required for library developers) and unix tar.
There are no stable releases yet.
*To get the source via mercurial,* you must have `mercurial <http://www.selenic.com/mercurial/wiki/>`__ installed.
Get the source and run the auto-tests like this:
.. code-block::
hg clone http://pylearn.org/hg/theano theano
cd theano
python autotest.py
To update your library to the latest on pylearn.org, change directory (`cd`) to this `theano` folder and type
.. code-block::
hg pull -u
*To get the source via unix tar*, you can download the latest source directly as a gzip'd tar file:
`<http://pylearn.org/hg/theano/archive/tip.tar.gz>`__.
Two environment variables are used to control automatic code generation.
(It is possible to use Theano in a way that avoids all automatic code generation, but the functions you make using `theano.function` will execute more slowly.)
- `THEANO_BLAS_LDFLAGS`:
a space-separated list of library names to link against for BLAS functions. Default: `-lblas`
- `THEANO_COMPILEDIR`:
a directory with read/write access permissions, where theano will store
autogenerated code and c modules. Default: `$HOME/.theano`. If this
directory does not exist, or does not have the correct permissions, then
theano will try to create it with the correct permissions. If that fails,
an exception will be raised and no C code will be compiled.
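The lookup rules for these two variables can be sketched in plain Python as follows. This is an illustration of the behaviour described above under stated assumptions, not Theano's own code; the helper names `resolve_blas_ldflags` and `resolve_compiledir` are hypothetical.

```python
import os


def resolve_blas_ldflags(environ=os.environ):
    # Space-separated linker flags, falling back to the documented default.
    return environ.get('THEANO_BLAS_LDFLAGS', '-lblas').split()


def resolve_compiledir(environ=os.environ):
    # Fall back to $HOME/.theano when the variable is not set.
    default = os.path.join(os.path.expanduser('~'), '.theano')
    path = environ.get('THEANO_COMPILEDIR', default)
    if not os.path.isdir(path):
        os.makedirs(path)  # raises OSError if the directory cannot be created
    if not os.access(path, os.R_OK | os.W_OK):
        raise OSError('no read/write access to %s' % path)
    return path


print(resolve_blas_ldflags({'THEANO_BLAS_LDFLAGS': '-lcblas -latlas'}))
```

Passing a plain dictionary instead of `os.environ` makes it easy to try the rules without touching the real environment or creating `$HOME/.theano`.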
Setup on Linux Setup on Linux
++++++++++++++ ++++++++++++++
...@@ -40,6 +120,21 @@ Setup on Linux ...@@ -40,6 +120,21 @@ Setup on Linux
Setup on OS-X Setup on OS-X
+++++++++++++ +++++++++++++
- Install `MacPorts <http://www.macports.org/>`__
- `sudo port install gcc42 py25-zlib py25-numpy py25-scipy mercurial`.
Note that compiling gcc42 takes a significant time (hours) so it's probably
not the best solution if you're in a rush! In my (Doomie) experience, scipy
failed to compile the first time I tried the command, but the second time
it compiled just fine. Same thing with py25-zlib.
- Install some kind of BLAS library (TODO: how?)
- Set THEANO_BLAS_LDFLAGS to something which will link against said BLAS
library. (e.g., `THEANO_BLAS_LDFLAGS='-lcblas -latlas -lgfortran'`).
Setup on Windows Setup on Windows
++++++++++++++++ ++++++++++++++++
...@@ -47,26 +142,55 @@ Setup on Windows ...@@ -47,26 +142,55 @@ Setup on Windows
No one has done this yet. WRITEME. No one has done this yet. WRITEME.
Tips for running at LISA
++++++++++++++++++++++++
Use the fast BLAS library that Fred installed, by setting
`THEANO_BLAS_LDFLAGS=-lgoto`.
Tips for running on a cluster Tips for running on a cluster
+++++++++++++++++++++++++++++ +++++++++++++++++++++++++++++
WRITEME. Use something like the following in your .bashrc:
.. code-block::
#use the intel math-kernel library for BLAS routines
THEANO_BLAS_LDFLAGS=-lmkl
# use up to two threads in the MKL routines
OMP_NUM_THREADS=2
# IMPORTANT!
# Use the local-temporary directory as a cache.
# If several jobs start simultaneously and use a common
# cache, then the cache may be corrupted.
# Theano is not process-safe or thread-safe in this sense.
THEANO_COMPILEDIR=/ltmp/<username>_theano
Running the Test Suite Running the Test Suite
---------------------- ======================
Test your installation by running the autotests. Type at the shell:
.. code-block::
WRITEME cd theano
python2.5 autotest.py
All tests should pass.
Using Theano Using Theano
------------ ============
WRITEME Now that you've got theano installed and running, check out the `n00b tutorial <n00b.html>`__ for how to use it.
Getting Help Getting Help
------------ ============
WRITEME If these installation instructions don't work, search the theano-users archive for similar cases. If you don't find a solution, write to theano-users and explain the situation.
...@@ -2,25 +2,37 @@ ...@@ -2,25 +2,37 @@
Theano Project Documentation Overview Theano Project Documentation Overview
===================================== =====================================
Documentation is divided broadly into two kinds: user documentation and
developer documentation.
`Using Theano` covers how to *use* what is already in the Theano library to
build graphs and evaluate them.
`Hacking Theano` introduces you to what's under the hood. If you want to extend Theano
to handle new data and expression types, this documentation is for you.
Using Theano Using Theano
============ ============
- First of all, read the `n00b guide`_. - First of all, read the `n00b guide`_. It is a cut-and-paste, tutorial-style intro to what Theano can do.
- Familiarize yourself with the `glossary of terminology`_.
- Join `theano-users`_. - Join `theano-users`_.
- Familiarize yourself with our `glossary of terminology`_. - Learn to use the typelist_, and the oplist_. These are the building blocks
of theano expression graphs.
- Consult some of the `Howto`_ recipes on the wiki. - Browse through some of the `Howto`_ recipes on the wiki.
.. _Howto: .. _Howto:
.. _theano-users: http://groups.google.com/group/theano-users?pli=1 .. _theano-users: http://groups.google.com/group/theano-users?pli=1
.. _theano-dev: http://groups.google.com/group/theano-dev?pli=1 .. _theano-dev: http://groups.google.com/group/theano-dev?pli=1
.. _n00b guide: n00b.html .. _n00b guide: n00b.html
.. _glossary of terminology: glossary.html .. _glossary of terminology: glossary.html
.. _typelist: typelist.html
.. _oplist: oplist.html
Extending Theano Hacking Theano
================ ==============
- `Get Started as a Developer <DevStartGuide.html>`__ by setting up mercurial, getting a few accounts, - `Get Started as a Developer <DevStartGuide.html>`__ by setting up mercurial, getting a few accounts,
setting up your environment, and getting some background in mercurial, python, setting up your environment, and getting some background in mercurial, python,
...@@ -28,14 +40,17 @@ Extending Theano ...@@ -28,14 +40,17 @@ Extending Theano
- Join `theano-dev`_ to participate in development discussion. - Join `theano-dev`_ to participate in development discussion.
- Keep an eye on the `Mercurial Changelog <http://pylearn.org/hg/theano>`__. - Pick a task from the `task list`_, or suggest one on `theano-users`_.
Features/ideas are generally discussed on `theano-users`_. Technical
- Pick a task from the `task list`_. discussions of how to actually implement something should be on
`theano-dev`_.
- Read about `How Theano Works <UserAdvanced.html>`__. - Read about `How Theano Works <UserAdvanced.html>`__.
- Browse `Theano's API <../api/>`__. - Browse `Theano's API <../api/>`__.
- Keep an eye on the `Mercurial Changelog <http://pylearn.org/hg/theano>`__.
- Send us your work as a patch to `theano-dev`_ or commit directly to the trunk. - Send us your work as a patch to `theano-dev`_ or commit directly to the trunk.
.. _theano-dev: http://groups.google.com/group/theano-dev?pli=1 .. _theano-dev: http://groups.google.com/group/theano-dev?pli=1
......
...@@ -53,8 +53,8 @@ are symbolic because they stand for variables and have a type_, but ...@@ -53,8 +53,8 @@ are symbolic because they stand for variables and have a type_, but
do not necessarily store actual values. Not yet, at least. (To give them do not necessarily store actual values. Not yet, at least. (To give them
values, we will have to `evaluate` them. More on this below.) values, we will have to `evaluate` them. More on this below.)
.. _result: WRITEME.html .. _result: glossary.html#result
.. _type: WRITEME.html .. _type: glossary.html#type
Since we are using the addition operator (`x + c`) here on symbolic results, the Since we are using the addition operator (`x + c`) here on symbolic results, the
output `y` is also symbolic. The `+` corresponds to an *operation* in theano output `y` is also symbolic. The `+` corresponds to an *operation* in theano
...@@ -65,10 +65,10 @@ symbolic because we declare what it computes, but do not actually perform any ...@@ -65,10 +65,10 @@ symbolic because we declare what it computes, but do not actually perform any
computation. Some type-checking is done while we build our graphs, so if you computation. Some type-checking is done while we build our graphs, so if you
try to do something really crazy you'll see an exception right away. try to do something really crazy you'll see an exception right away.
.. _symbolic graph: WRITEME.html .. _symbolic graph: glossary.html#symbolicgraph
To actually use our graph for computation, we have to compile (or build) it into To actually use our graph for computation, we have to compile (or build) it into
a function `f`. The compiled function is actually capable of performing a function `f`. The compiled function is actually capable of performing
computation. So after we have built f, we use it to compute the value of y from computation. So after we have built f, we use it to compute the value of y from
a `value input` x. Some argument checking is only possible at run-time, so if a `value input` x. Some argument checking is only possible at run-time, so if
you ask for impossible things (e.g. logarithm of a negative number, sum of you ask for impossible things (e.g. logarithm of a negative number, sum of
...@@ -126,7 +126,8 @@ In the following example, we will build a function to evaluate the dot product ` ...@@ -126,7 +126,8 @@ In the following example, we will build a function to evaluate the dot product `
*TODO: Explain the matrix and other interesting things going on here.* *TODO: Explain the matrix and other interesting things going on here.*
*TODO: Explain that we have a lot of numpy functionality reimplemented. Link to numpy docs and say familiarity won't hurt. Also link to list of available ops.* *TODO: Explain that we have a lot of numpy functionality reimplemented. Link to
numpy docs and say familiarity won't hurt. Also link to list of available ops.*
Broadcasting example Broadcasting example
==================== ====================
...@@ -150,7 +151,76 @@ You may now wish to review some ...@@ -150,7 +151,76 @@ You may now wish to review some
State example State example
============= =============
*WRITEME: A simple logistic regression example, with implicit weights.* In this example, we'll look at a complete logistic regression model, with
training by simple gradient descent.
.. code-block:: python
    import numpy
    import theano
    from theano import tensor as T
    from theano import In  # assumption: In is importable from the top-level package

    def build_logistic_regression_model(n_in, n_out, l2_coef=30.0):
        # DECLARE SOME VARIABLES
        x = T.matrix()  # our points, one point per row
        y = T.matrix()  # store our labels as place codes (label 3 of 5 is vector [00100])
        w = T.matrix()  # the linear transform to apply to our input points
        b = T.vector()  # a vector of biases, which make our transform affine instead of linear
        stepsize = T.scalar('stepsize')  # a stepsize for gradient descent

        # BUILD THE EXPRESSIONS FOR THE PREDICTION AND THE COST
        prediction = T.softmax(T.dot(x, w) + b)
        cost = T.sum(T.kl_multinomial(targ=y, pred=prediction)) + l2_coef * T.sum(T.sum(w*w))

        # GET THE GRADIENTS NECESSARY TO FIT OUR PARAMETERS
        grad_w, grad_b = T.grad(cost, [w, b])

        # COMPILE A FUNCTION THAT RETURNS THE COST AND UPDATES w AND b IN PLACE
        update_fn = theano.function(
            inputs = [x, y, stepsize,
                In(w,
                    name='w',
                    value=numpy.zeros((n_in, n_out)),
                    update=w - stepsize * grad_w,
                    mutable=True,
                    strict=True),
                In(b,
                    name='b',
                    value=numpy.zeros(n_out),
                    update=b - stepsize * grad_b,
                    mutable=True,
                    strict=True)
            ],
            outputs = cost,
            mode = 'EXPENSIVE_OPTIMIZATIONS')

        # COMPILE A FUNCTION THAT PREDICTS USING THE WEIGHTS STORED IN update_fn
        apply_fn = theano.function(
            inputs = [x, In(w, value=update_fn.storage[w]), In(b, value=update_fn.storage[b])],
            outputs = [prediction])
        return update_fn, apply_fn

    # USUALLY THIS WOULD BE IN A DIFFERENT FUNCTION/CLASS
    # FIT SOME DUMMY DATA: 100 points with 10 attributes and 3 potential labels
    update_fn, apply_fn = build_logistic_regression_model(n_in=10, n_out=3, l2_coef=30.0)
    x_data = numpy.random.randn(100, 10)
    y_data = numpy.random.randn(100, 3)
    # one-hot labels: mark the column holding each row's maximum
    y_data = numpy.asarray(y_data == y_data.max(axis=1).reshape(-1, 1), dtype='int64')
    print "Model Training ..."
    for iteration in xrange(1000):
        print " iter", iteration, "cost", update_fn(x_data, y_data, stepsize=0.0001)
    print "Model Predictions"
    print apply_fn(x_data)
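For comparison, the same model can be written by hand in numpy, which shows what the symbolic gradient and the `update` expressions have to compute. This is a sketch under two assumptions: that `T.kl_multinomial` with one-hot targets reduces to the cross-entropy `-sum(y * log(p))`, and that the helper names (`softmax`, `cost_and_grads`, `train`) are free to choose, since none of them come from Theano.

```python
import numpy


def softmax(z):
    # Subtract the row maximum first so that exp cannot overflow.
    z = z - z.max(axis=1, keepdims=True)
    e = numpy.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cost_and_grads(x, y, w, b, l2_coef):
    p = softmax(numpy.dot(x, w) + b)
    # Cross-entropy against one-hot targets plus an L2 penalty on w.
    cost = -numpy.sum(y * numpy.log(p)) + l2_coef * numpy.sum(w * w)
    delta = p - y  # gradient of the cross-entropy w.r.t. the pre-softmax scores
    grad_w = numpy.dot(x.T, delta) + 2.0 * l2_coef * w
    grad_b = delta.sum(axis=0)
    return cost, grad_w, grad_b


def train(x, y, l2_coef=30.0, stepsize=0.0001, n_iter=100):
    w = numpy.zeros((x.shape[1], y.shape[1]))
    b = numpy.zeros(y.shape[1])
    costs = []
    for _ in range(n_iter):
        cost, grad_w, grad_b = cost_and_grads(x, y, w, b, l2_coef)
        w -= stepsize * grad_w  # same rule as update=w - stepsize * grad_w
        b -= stepsize * grad_b  # same rule as update=b - stepsize * grad_b
        costs.append(cost)
    return w, b, costs
```

The hand-derived `grad_w` and `grad_b` are exactly the part that `T.grad(cost, [w, b])` is meant to spare you from writing.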
Summary Summary
......