Commit d9c7ae08 authored by Pascal Lamblin

merge

.. _ccodegen:
=================
C Code Generation
=================
WRITEME
.. _envfeaturelist:
====================
List of Env Features
====================
See :api:`gof.env.Env`.
WRITEME
.. _nodefinder:
NodeFinder
==========
See :api:`gof.toolbox.NodeFinder`.
WRITEME
.. _moduleinterface:
================
Module Interface
================
A Theano Module is like Theano's version of a file.
When you instantiate a ``Module()``, you are creating a blank file.
Into this file you can put both symbolic and non-symbolic objects.
Non-symbolic objects are like constants (technically literals) in the file.
Symbolic objects are like variables and functions.
The functions in a Module are called Methods.
The variables in a Module (and submodules) are global.
Module Methods have access to all these global variables.
To use a Module, you need to compile it.
This is done by calling `Module.make()`.
The result of compiling a Module is a ModuleInstance: the compiled
version of your Theano file.
In the ModuleInstance, your symbolic variables have become containers (containing None),
and your Methods have become callable functions.
You should initialize the symbolic variables by calling
``ModuleInstance.initialize()`` (although ``make()`` will call it for you
on the top-level ModuleInstance).
You can compile a Module several times, to create multiple ModuleInstances.
Each of these will have its own copy of all program literals.
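The relationship between a Module and its compiled ModuleInstances can be sketched in plain Python. This is only an illustrative toy, not Theano's real API (which lives in ``theano.compile.module``); every name below is hypothetical.

```python
# Illustrative sketch only: a toy analogue of the Module/ModuleInstance
# relationship described above. All names here are made up.

class ToyModule:
    """Symbolic 'file': holds variable names and method definitions."""
    def __init__(self):
        self.variables = []   # symbolic variables (just names here)
        self.methods = {}     # name -> function of the instance state

    def make(self):
        """Compile: each make() call yields an independent instance."""
        inst = ToyModuleInstance(self)
        inst.initialize()     # make() initializes the top level for you
        return inst

class ToyModuleInstance:
    """Compiled 'file': containers for the variables, callable methods."""
    def __init__(self, module):
        # each symbolic variable becomes a container, initially holding None
        self.containers = {name: None for name in module.variables}
        # each Method becomes a callable closed over this instance's state
        self.methods = {n: (lambda f=f: f(self.containers))
                        for n, f in module.methods.items()}

    def initialize(self):
        pass                  # fill containers with sensible values

m = ToyModule()
m.variables.append("x")
m.methods["double_x"] = lambda state: 2 * state["x"]

inst_a = m.make()
inst_b = m.make()             # a second, independent compilation
inst_a.containers["x"] = 3
inst_b.containers["x"] = 10
print(inst_a.methods["double_x"]())   # -> 6: each instance has its own state
```

Note how compiling twice gives two instances whose containers are completely independent, mirroring the "own copy of all program literals" behaviour described above.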
Module Graph
------------
Components can be grouped into a directed graph.
When we call `make`, this graph is replicated with ComponentInstances instead of
Components. Whereas Components represent symbolic things (i.e. Variables),
ComponentInstances represent non-symbolic ones (i.e. sparse matrices, ndarrays,
callable functions).
.. index::
single: Component
single: component; Component
.. _component:
---------
Component
---------
All of the elements of what is called the "module system" or "modules" are
components.
A Component subclass represents a symbolic Theano thing and implements the
``build`` function.
The ``build`` function is responsible for converting the symbolic thing into a
non-symbolic thing.
Compiling with make
-------------------
Conversion from a Component graph to a ComponentInstance graph is performed by `Component.make`.
This method traverses the Component graph in multiple passes.
In the first pass (the allocate pass), it creates storage for all Variables that are contained in the graph (see
`Component.allocate`). These are the module variables.
In the second pass (the build pass), it creates functions that (in general) operate on these module variables.
This pass also constructs all ComponentInstance-derived instances, such as
`ModuleInstance`s. The objects returned from this second pass are the return value of
`Component.make`.
The third pass (the initialize pass) is optional and is not necessarily recursive through the
graph.
The purpose of the third pass is to call the initialize method of the ComponentInstances built
during the second pass.
During this pass the ComponentInstance graph is complete. It is a good time to fill storage
allocated in the first pass with sensible values.
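The three passes can be sketched on a toy Component tree. This is a hedged illustration of the allocate/build/initialize structure described above, not Theano's real implementation; all names are invented for the example.

```python
# Toy sketch of the three make() passes: allocate storage, build
# instances, then initialize. Not Theano's real code.

class Component:
    def allocate(self, memo):
        raise NotImplementedError
    def build(self, memo):
        raise NotImplementedError
    def make(self):
        memo = {}                 # shared storage, keyed by component
        self.allocate(memo)       # pass 1: create storage for Variables
        inst = self.build(memo)   # pass 2: build ComponentInstances
        inst.initialize()         # pass 3 (optional): fill storage
        return inst

class Member(Component):
    """A symbolic module variable."""
    def __init__(self, default):
        self.default = default
    def allocate(self, memo):
        memo[id(self)] = [None]   # one-element list as a container
    def build(self, memo):
        return memo[id(self)]

class ScaleModule(Component):
    """A module with one Member and one method reading it."""
    def __init__(self):
        self.coef = Member(default=2.0)
    def allocate(self, memo):
        self.coef.allocate(memo)
    def build(self, memo):
        container = self.coef.build(memo)
        outer = self
        class Instance:
            def initialize(self):     # fill pass-1 storage with a value
                container[0] = outer.coef.default
            def scale(self, x):
                return container[0] * x
        return Instance()

inst = ScaleModule().make()
print(inst.scale(3.0))            # -> 6.0
```

The key point the sketch shows: storage is created before any function is built, so every function built in the second pass can close over the same containers, and the third pass only fills values into storage that already exists.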
.. index::
single: External
single: component; External
.. _external:
--------
External
--------
WRITEME
.. index::
single: Member
single: component; Member
.. _member:
------
Member
------
WRITEME
.. index::
single: Method
single: component; Method
.. _method:
------
Method
------
WRITEME
.. index::
single: Module
single: component; Module
.. _module:
------
Module
------
A Module instance can contain objects as attributes.
This makes it something like a class in the way that Method is
analogous to a function.
A Module is meant to contain Components.
Attributes which are not Components themselves must at least be transformable
into Components by :api:`compile.module.wrap`. If a Module contains something
that is not convertible into a Component, then it is not possible to compile
that Module with ``make``.
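The "wrap" idea can be sketched as a dispatch-by-type converter. The real converter is :api:`compile.module.wrap`; this toy version only shows the pattern, and all class names below are illustrative, not Theano's.

```python
# Hedged sketch of the wrap idea: non-Component attributes must be
# convertible into Components before make() can compile the Module.

class Component:
    pass

class External(Component):
    """Wraps an arbitrary Python object as a constant-like Component."""
    def __init__(self, value):
        self.value = value

def wrap(obj):
    if isinstance(obj, Component):
        return obj                # already a Component: nothing to do
    if isinstance(obj, (int, float, str)):
        return External(obj)      # literal-like values get wrapped
    # anything else cannot be compiled into the Module
    raise TypeError("cannot convert %r into a Component" % (obj,))

assert isinstance(wrap(External(3)), External)
assert wrap(4.5).value == 4.5     # plain numbers are wrapped
```

A Module whose attribute makes ``wrap`` raise corresponds to the situation described above: the Module cannot be compiled with ``make``.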
Old Text
--------
In the Module system, the analog of the file is the `Module`, the analog of the function is the
`Method`, and the analog of the variable is the `Member`. Module, Member, and Method all work
at the symbolic level. Once a graph of Modules, Members, and Methods is ready for use, it must
be compiled with a call to `make` which will return an isomorphic structure in which Modules
have become `ModuleInstances`, Members have become `Container`s, and Methods have become
`Function`s.
This structure contains numbers and functions, and is ready for computation.
.. _advtutorial:
=================
Advanced Tutorial
=================
Before tackling this tutorial, it is highly recommended to read the
:ref:`basictutorial`.
The advanced tutorial is meant to give the reader a greater
understanding of the building blocks of Theano. Through this tutorial
we are going to define one :ref:`type`, ``double``, and basic
arithmetic :ref:`operations <op>` on that Type. We will first define
them using a Python implementation and then we will add a C
implementation.
This tutorial should be of most use to users who want to extend Theano
with custom types and operations related to these types.
Even if that is not your goal, it is worth reading since it provides
grounding in fundamental Theano concepts.
.. toctree::
theano_vs_c
graphstructures
type
op
inplace
ctype
cop
optimization
tips
.. _basictutorial:
==============
Basic Tutorial
==============
Before doing anything in this tutorial, make sure that Theano is
installed on your system (see :ref:`install`).
Done? Alright!
Let's start an interactive session and import the package you just
installed:
>>> from theano import *
Many of the symbols you will need to use are in the ``tensor`` subpackage
of theano. Let's import that subpackage under a handy name. I like
``T``.
>>> import theano.tensor as T
Now we're ready for the tour:
.. toctree::
adding
examples
tools
@@ -39,7 +39,7 @@ templates_path = ['.templates']
 source_suffix = '.txt'
 # The master toctree document.
-master_doc = 'contents'
+master_doc = 'index'
 # General substitutions.
 project = 'Theano'
@@ -64,7 +64,7 @@ today_fmt = '%B %d, %Y'
 # List of directories, relative to source directories, that shouldn't be searched
 # for source files.
-exclude_dirs = ['images', 'scripts', 'trac']
+exclude_dirs = ['images', 'scripts', 'sandbox']
 # The reST default role (used for this markup: `text`) to use for all documents.
 #default_role = None
@@ -90,7 +90,8 @@ pygments_style = 'sphinx'
 # The style sheet to use for HTML and HTML Help pages. A file of that name
 # must exist either in Sphinx' static/ path, or in one of the custom paths
 # given in html_static_path.
-html_style = 'default.css'
+#html_style = 'default.css'
+html_theme = 'sphinxdoc'
 # The name for this set of Sphinx documents. If None, it defaults to
 # "<project> v<release> documentation".
@@ -101,7 +102,8 @@ html_style = 'default.css'
 # The name of an image file (within the static path) to place at the top of
 # the sidebar.
-html_logo = 'images/theano_logo-200x67.png'
+#html_logo = 'images/theano_logo-200x67.png'
+html_logo = 'images/theano_logo_allblue_200x46.png'
 # The name of an image file (within the static path) to use as favicon of the
 # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
.. _contents:
========
Contents
========
.. toctree::
:maxdepth: 2
introduction
LICENSE
install
basic_tutorial/index
advanced_tutorial/index
topics/index
indexes/index
glossary
links
internal/index
NEWS
.. examples/index
.. advanced/index
@@ -9,7 +9,7 @@ There are many kinds of bugs that might come up in a computer program.
 This page is structured as an FAQ. It should provide recipes to tackle common
 problems, and introduce some of the tools that we use to find problems in our
 Theano code, and even (it happens) in Theano's internals, such as
-:ref:`debugmode`.
+:ref:`using_debugmode`.
@@ -49,7 +49,7 @@ I wrote a new Op/Type, and weird stuff is happening...
 First, check the :ref:`op_contract` and the :ref:`type_contract`
 and make sure you're following the rules.
-Then try running your program in :ref:`debugmode`. DebugMode might catch
+Then try running your program in :ref:`using_debugmode`. DebugMode might catch
 something that you're not seeing.
@@ -65,8 +65,8 @@ I wrote a new optimization, and it changed my results even though I'm pretty sur
 ------------------------------------------------------------------------------------------------
 First, check the :ref:`op_contract` and make sure you're following the rules.
-Then try running your program in :ref:`debugmode`. DebugMode might catch
-something that you're not seeing.
+Then try running your program in :ref:`using_debugmode`. DebugMode might
+catch something that you're not seeing.
 The function I compiled is too slow, what's up?
@@ -77,7 +77,7 @@ First, make sure you're running in FAST_RUN mode, by passing
 operations have excruciatingly slow Python implementations and that
 can negatively effect the performance of FAST_COMPILE.
-Second, try the theano :ref:`profilemode`. This will tell you which
+Second, try the theano :ref:`using_profilemode`. This will tell you which
 Apply nodes, and which Ops are eating up your CPU cycles.
@@ -110,9 +110,7 @@ put logic inside of the print_eval function that would, for example, only
 print something out if a certain kind of Op was used, at a certain program
 position, or if a particular value shows up in one of the inputs or outputs.
-This can be a really powerful debugging tool. Read about more things you can
-do with :api:`link.WrapLinkerMany`.
-Note well the call to ``fn`` inside the call to ``print_eval``; without it,
-the graph wouldn't get computed at all!
+.. TODO: documentation for link.WrapLinkerMany
+This can be a really powerful debugging tool. Note the call to ``fn`` inside
+the call to ``print_eval``; without it, the graph wouldn't get computed at all!
.. _extending:
================
Extending Theano
================
This documentation is for users who want to extend Theano with new Types, new
Operations (Ops), and new graph optimizations.
Along the way, it also introduces many aspects of how Theano works, so it is
also useful if you are interested in looking under the hood of
Theano itself.
Before tackling this tutorial, it is highly recommended to read the
:ref:`basictutorial`.
The first few pages will walk you through the definition of a new :ref:`type`,
``double``, and basic arithmetic :ref:`operations <op>` on that Type. We
will start by defining them using a Python implementation and then we will add
a C implementation.
.. toctree::
pipeline
theano_vs_c
graphstructures
type
op
inplace
ctype
cop
optimization
tips
unittest
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://web.resource.org/cc/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="345.86591"
height="115.13724"
id="svg2"
sodipodi:version="0.32"
inkscape:version="0.45.1"
sodipodi:docbase="/home/olivier/hg/theano"
sodipodi:docname="theano_logo.svg"
inkscape:output_extension="org.inkscape.output.svg.inkscape"
version="1.0"
inkscape:export-filename="/home/olivier/hg/theano/theano_logo_big.png"
inkscape:export-xdpi="273.58655"
inkscape:export-ydpi="273.58655">
<defs
id="defs4" />
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
gridtolerance="10000"
guidetolerance="10"
objecttolerance="10"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="1.979899"
inkscape:cx="248.50886"
inkscape:cy="97.530852"
inkscape:document-units="px"
inkscape:current-layer="layer1"
inkscape:window-width="1680"
inkscape:window-height="1030"
inkscape:window-x="0"
inkscape:window-y="0"
showguides="true"
inkscape:guide-bbox="true" />
<metadata
id="metadata7">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-219.06115,-88.23416)">
<path
id="path5572"
d="M 245.99986,202.38198 C 235.76172,199.76305 230.3317,195.18454 224.56469,184.30815 C 220.37775,176.41173 219.14676,170.92373 219.06742,159.80009 C 219.02952,154.48681 219.14363,153.33451 219.96737,150.71192 L 220.91072,147.70853 L 222.03485,150.91475 C 223.32792,154.60284 224.5932,157.2101 225.42491,157.90035 C 225.91931,158.31066 226.04839,157.45384 226.31509,151.99127 C 226.48664,148.47733 226.74177,144.59829 226.88203,143.3712 C 227.13637,141.14611 227.14306,141.13711 229.37079,140.02194 C 233.6165,137.89661 241.51289,137.62549 255.7355,139.11671 C 262.25557,139.80033 276.27711,139.65881 278.302,138.88894 C 280.15154,138.18575 280.55926,136.52884 280.07117,131.69921 C 279.49474,125.99537 279.0548,124.08561 277.22091,119.32634 C 272.4649,106.98367 264.75123,100.69911 254.31572,100.6648 C 244.91721,100.6339 237.20308,106.18784 232.64521,116.26692 C 228.63554,125.13371 226.84755,134.63837 225.79128,152.70119 L 225.49476,157.77183 L 224.6018,156.08339 C 220.32764,148.00176 218.55416,134.3005 220.39244,123.56361 C 221.81624,115.24763 224.72248,108.02444 229.43922,101.07873 C 233.51167,95.08179 239.33503,91.22689 247.37024,89.20891 C 252.54529,87.90924 256.08615,87.90924 261.2612,89.20891 C 269.29641,91.22689 275.11977,95.08179 279.19222,101.07873 C 283.85913,107.95107 286.81123,115.24029 288.1872,123.28884 C 289.11587,128.72102 289.26704,136.96138 288.48572,139.5625 C 287.80095,141.84221 282.75423,149.25874 282.58446,148.23482 C 282.51467,147.81394 282.66002,147.09129 282.90745,146.62895 C 283.60255,145.33016 282.97412,144.79606 281.91813,145.78812 C 281.09814,146.55845 280.95497,146.57992 280.4772,146.00425 C 279.46931,144.78981 279.09827,146.0508 280.02317,147.54731 C 281.09294,149.27824 281.11194,149.86163 280.09855,149.86163 C 279.6655,149.86163 279.2114,150.02307 279.08945,150.2204 C 278.12451,151.78171 263.15706,152.14918 251.27333,150.90331 C 242.48708,149.98217 235.49959,150.17874 233.86598,151.393 C 232.52086,152.39282 230.73981,155.92513 
230.13832,158.78596 C 229.56685,161.50406 229.89814,169.75383 230.71167,173.06316 C 231.53272,176.40313 234.44347,181.26714 237.48117,184.37536 C 245.97324,193.06457 259.99042,193.16426 268.52866,184.59618 C 272.82158,180.28826 276.28725,173.36771 275.26986,171.13477 C 275.01206,170.56897 274.80113,169.46845 274.80113,168.68918 C 274.80113,167.27252 276.03299,164.34881 276.84003,163.85004 C 277.97809,163.14668 279.2633,160.34344 279.2633,158.56453 C 279.2633,156.50464 279.81574,155.1351 280.64665,155.1351 C 281.94053,155.1351 281.78744,149.84815 280.42796,147.58266 C 279.38328,145.84176 279.47773,145.48404 280.68309,146.61641 C 281.46075,147.34699 281.69721,147.42235 281.69721,146.93962 C 281.69721,146.59338 282.00521,146.05957 282.38164,145.75336 C 282.9932,145.2559 283.02559,145.28301 282.68588,146.00793 C 282.47678,146.45415 282.35906,148.62448 282.4243,150.8309 C 282.5319,154.47038 282.63024,154.91126 283.48431,155.58307 C 284.25335,156.18799 284.4647,156.82757 284.6386,159.07597 C 284.78839,161.01273 285.24037,162.64716 286.16384,164.59151 C 287.23183,166.84012 287.43789,167.69463 287.27043,169.18035 C 287.15459,170.2081 286.70684,171.3939 286.24597,171.89349 C 285.2295,172.99536 281.11174,180.12521 280.69642,181.50246 C 279.94371,183.99856 277.41503,189.23736 275.76462,191.71994 C 273.21329,195.55768 270.45935,197.86457 265.70147,200.14953 C 258.59319,203.56326 253.06615,204.18955 245.99986,202.38198 z "
style="fill:#000000;fill-opacity:1" />
<text
xml:space="preserve"
style="font-size:15.53327274px;font-style:normal;font-weight:normal;fill:#7799ee;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Bitstream Vera Sans"
x="285.01266"
y="186.09427"
id="text5574"
transform="scale(1.0402212,0.961334)"><tspan
sodipodi:role="line"
id="tspan5576"
x="285.01266"
y="186.09427"
style="font-size:93.19962311px;font-weight:normal;fill:#7799ee;fill-opacity:1;font-family:MgOpen Modata"
dx="0 -4.2857141 -6.4285722 -5 -5.7142901 -6.0714293"
dy="0 0 -1.3672954 0.35714287 1.0101526 -1.0101526">Theano</tspan></text>
</g>
</svg>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="345.86591"
height="115.13724"
id="svg2"
sodipodi:version="0.32"
inkscape:version="0.46"
sodipodi:docbase="/home/olivier/hg/theano"
sodipodi:docname="theano_logo_allblue.svg"
inkscape:output_extension="org.inkscape.output.svg.inkscape"
version="1.0"
inkscape:export-filename="theano_logo_allblue_200x46.png"
inkscape:export-xdpi="53.889999"
inkscape:export-ydpi="53.889999">
<defs
id="defs4">
<inkscape:perspective
sodipodi:type="inkscape:persp3d"
inkscape:vp_x="0 : 57.568619 : 1"
inkscape:vp_y="0 : 1000 : 0"
inkscape:vp_z="345.86591 : 57.568619 : 1"
inkscape:persp3d-origin="172.93295 : 38.379079 : 1"
id="perspective10" />
</defs>
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
gridtolerance="10000"
guidetolerance="10"
objecttolerance="10"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="1.4"
inkscape:cx="187.28443"
inkscape:cy="-18.63669"
inkscape:document-units="px"
inkscape:current-layer="layer1"
inkscape:window-width="1280"
inkscape:window-height="999"
inkscape:window-x="0"
inkscape:window-y="6"
showguides="true"
inkscape:guide-bbox="true"
showgrid="false" />
<metadata
id="metadata7">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-219.06115,-88.23416)">
<path
id="path5572"
d="M 361.71053,188.16928 C 354.76773,186.82023 350.70665,184.96687 346.79586,179.36429 C 343.95656,175.29673 343.50062,171.96473 343.44682,166.23478 C 343.42111,163.49783 343.49849,162.90428 344.0571,161.55334 L 344.69681,160.00626 L 345.45913,161.65784 C 346.336,163.55761 347.19403,164.90066 347.75802,165.25619 C 348.09328,165.46756 348.18083,165.02619 348.36169,162.21236 C 348.47801,160.40227 348.65102,158.40413 348.74614,157.77203 C 348.91863,156.62587 348.92314,156.62125 350.43384,156.04681 C 353.31299,154.95199 358.66778,154.81233 368.31255,155.58049 C 372.73402,155.93264 382.24935,156.1272 383.25884,155.25172 C 384.30569,154.34386 384.43282,153.93027 384.10183,151.44246 C 383.71094,148.50433 384.12609,147.83778 382.88247,145.3862 C 379.65727,139.02834 374.30749,136.3197 367.23085,136.30203 C 360.85744,136.28611 355.74516,138.6184 352.65434,143.81025 C 349.93524,148.37768 348.72275,153.27364 348.00646,162.57805 L 347.80539,165.18998 L 347.19983,164.32026 C 344.3014,160.1573 342.50418,152.88816 343.75077,147.35745 C 344.71627,143.07377 346.68707,139.35299 349.88566,135.77517 C 352.64732,132.68608 357.19089,130.91181 362.63982,129.87232 C 366.14917,129.20284 368.55034,129.20284 372.05972,129.87232 C 377.50862,130.91181 381.45763,132.89753 384.21929,135.98663 C 387.38405,139.52666 389.38596,143.28142 390.31905,147.42734 C 390.9488,150.22555 391.05131,154.47027 390.52149,155.81012 C 390.05713,156.98445 386.63479,160.80479 386.51965,160.27735 C 386.47234,160.06056 386.57091,159.6883 386.73869,159.45016 C 387.21007,158.78114 386.7839,158.506 386.0678,159.01703 C 385.51175,159.41383 385.41465,159.42491 385.09066,159.12836 C 384.40717,158.50279 384.15557,159.15234 384.78277,159.92321 C 385.50823,160.81483 385.52111,161.11534 384.8339,161.11534 C 384.54022,161.11534 384.23228,161.1985 384.14958,161.30014 C 383.49524,162.1044 373.34534,162.29368 365.28662,161.65193 C 359.32839,161.17745 354.58998,161.2787 353.48216,161.90416 C 352.57,162.41919 351.36222,164.23874 
350.95433,165.71238 C 350.56679,167.11253 350.79145,171.3621 351.34313,173.06677 C 351.89991,174.78722 353.87377,177.29276 355.93375,178.89384 C 361.69246,183.36976 371.19795,183.42111 376.98801,179.00757 C 379.89915,176.78852 382.24934,173.22363 381.55941,172.07343 C 381.38459,171.78199 381.24154,171.2151 381.24154,170.81367 C 381.24154,170.08392 382.07691,168.57788 382.62419,168.32095 C 383.39593,167.95867 384.26748,166.51466 384.26748,165.59831 C 384.26748,164.53724 384.64213,163.83178 385.20558,163.83178 C 386.08301,163.83178 385.97918,161.1084 385.05727,159.94142 C 384.34884,159.04465 384.41288,158.86039 385.23029,159.44369 C 385.75764,159.82002 385.91798,159.85884 385.91798,159.61018 C 385.91798,159.43183 386.12685,159.15685 386.38212,158.99912 C 386.79684,158.74289 386.81881,158.75684 386.58845,159.13026 C 386.44664,159.3601 386.36682,160.4781 386.41105,161.61465 C 386.48402,163.48937 386.5507,163.71649 387.12987,164.06252 C 387.65138,164.37413 387.79471,164.70362 387.91265,165.86178 C 388.01423,166.85943 388.3207,167.70135 388.94694,168.70291 C 389.67118,169.86121 389.81092,170.30137 389.69736,171.06666 C 389.61881,171.59609 389.31517,172.20689 389.00266,172.46425 C 388.31336,173.03183 385.52096,176.70451 385.23933,177.41397 C 384.72888,178.69974 383.01411,181.39832 381.89492,182.67712 C 380.16478,184.65398 378.29724,185.84229 375.07078,187.01932 C 370.25045,188.77779 366.50239,189.10039 361.71053,188.16928 z"
style="fill:#11557c;fill-opacity:1;stroke:#ffffff;stroke-width:0.73518676000000005;stroke-opacity:1"
sodipodi:nodetypes="csscccsssssssssscccsssssssssssssssssssssssssssssssssssssssc" />
<text
xml:space="preserve"
style="font-size:96px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:100%;writing-mode:lr-tb;text-anchor:start;fill:#11557c;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:DejaVu Sans Mono;-inkscape-font-specification:DejaVu Sans Mono"
x="221.28195"
y="185.3815"
id="text2382"
sodipodi:linespacing="100%"><tspan
sodipodi:role="line"
id="tspan2386"
x="221.28195"
y="185.3815">th ano</tspan></text>
</g>
</svg>
-===========
-Quick Start
-===========
+Welcome
+=======

-Theano is a Python library that allows you to define, optimize, and
-efficiently evaluate mathematical expressions involving multi-dimensional
-arrays.
+Theano is a Python library that allows you to define, optimize, and
+evaluate mathematical expressions involving multi-dimensional
+arrays efficiently. Theano features:

-The latest release is version `0.1
-<http://pylearn.org/theano/downloads/Theano-0.1.tar.gz>`_.
-
-You can download the latest `PDF documentation
-<http://pylearn.org/theano/theano.pdf>`_, rather than reading it online.
-
-You can go to the :ref:`Table of Contents <contents>`.
+* **tight integration with numpy**
+* **near-transparent use of a GPU** to accelerate intense calculations [JAN 2010]
+* **symbolic differentiation**
+* **speed and stability optimizations**: write ``log(1+exp(x))`` and get the right answer
+* **dynamic C code generation** for faster expression evaluation

-News
-----
+Download
+========

-* 2009-04-01: Theano 0.1 released. See the :ref:`release notes <NEWS>`.
+In April 2009 we declared the creation of a `0.1 release
+<http://pylearn.org/theano/downloads/Theano-0.1.tar.gz>`_.
+Development has continued non-stop since then.
+The current version is available via::

-Choose your own adventure...
-----------------------------
+    hg clone http://hg.assembla.com/theano Theano

-* You have no idea what Theano is and you read the :ref:`introduction
-  <introduction>`.
+The theano subfolder should be on your ``$PYTHONPATH``. For more information about
+installation and configuration, see :ref:`installing Theano <install>`.

-* Convinced by the Theano staff's rhetoric that it is worthy of your
-  time (you can hear them into the distance congratulating themselves
-  on the success of their dishonest ploy - a lone puppy's
-  heart-wrenching cry echoes behind you but is cut short by an
-  outworldly screech and what might be the dripping sound of a bloody
-  blade - you don't care, though, because Theano is wicked cool), you
-  :ref:`install Theano <install>` and :ref:`learn the basics
-  <basictutorial>`.
+Documentation
+=============

-* Blinded by your newfound knowledge, you want to play God and
-  :ref:`learn how to extend Theano <advtutorial>`. The Gods are
-  pleased that they get to devour your soul for this heresy, but you
-  ascertain that it is worth it.
+You can download the latest `PDF documentation
+<http://pylearn.org/theano/theano.pdf>`_, rather than reading it online.

-* Decidedly, your thirst for Theano-related information is
-  unquenchable! So there you are, reading about Theano's compilation
-  :ref:`pipeline <pipeline>`. Little do you know that article could
-  not have been made without the tears of a thousand innocent kittens.
+* If you have no idea what Theano is, read the :ref:`introduction <introduction>`.
+* :ref:`learn the basics <tutorial>`
+* :ref:`library reference <libdoc>`
+* :ref:`extending Theano <extending>` with new Types and Ops
+* :ref:`internal docs <internal>`

-* You descend into madness as you read detailed (or as-of-yet not
-  written, but our plans for world domination as well as the hourly
-  sacrifices that our God requires leave us little time for writing)
-  articles about :ref:`op`, :ref:`type`, :ref:`function`,
-  :ref:`module`, :ref:`compilation`, :ref:`optimization`, :ref:`env`
-  and as you consult the list of :ref:`envfeaturelist`. Sometimes you
-  check the :ref:`glossary` just in case you missed an important
-  concept.
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   NEWS
+   introduction
+   install
+   tutorial/index
+   library/index
+   extending/index
+   indexes/index
+   glossary
+   links
+   internal/index
+   examples/index
+   proposals/index
+   LICENSE

-* Requiring help in your dark endeavors, you register and post to the
-  `theano-users`_ mailing list and curious to know who is behind this
-  witchery and how their wicked plans come into fruition, you read the
-  `theano-dev`_ mailing list.
+Community
+=========

-* Mistaking your feverish sweat for genuine excitation, you
-  investigate `Theano's Trac <trac/>`__ for tickets_ that your feeble
-  mind might comprehend and your consumed brain is yearning to solve.
-
-* Bored a Friday evening, you browse `Theano's API <api/>`__. May
-  the Great Ones help you with that one.
-
-.. note::
-   May you be assured that this page was made solely under the
-   unsobering influence of boredom and that all of the unspeakable
-   things mentioned therein are merely artifacts of the author's
-   maddened delusions.
+* Register and post to `theano-users`_ if you want to talk to all Theano users.
+* Register and post to `theano-dev`_ if you want to talk to the developers.
+* We try to stay organized with `Theano's Trac <trac/>`__.
+* Come visit us in Montreal! Most of the developers are students in the
+  LISA_ group at the `University of Montreal`_.

 .. _theano-dev: http://groups.google.com/group/theano-dev
 .. _theano-users: http://groups.google.com/group/theano-users
 .. _tickets: http://pylearn.org/theano/trac/query?status=accepted&status=assigned&status=new&status=reopened&group=milestone&max=200&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=component&col=time&report=9&order=priority
+.. _LISA: http://www.iro.umontreal.ca/~lisa
+.. _University of Montreal: http://www.umontreal.ca
=======
Indexes
=======
The following indexes are generated automatically and cover most
:ref:`Ops <op>`, :ref:`Types <type>` and :ref:`Optimizers <optimization>`
in Theano.
.. toctree::
:maxdepth: 1
oplist
typelist
...@@ -16,7 +16,7 @@ Requirements ...@@ -16,7 +16,7 @@ Requirements
In order to use Theano, the following libraries and software will need In order to use Theano, the following libraries and software will need
to be installed: to be installed:
Linux or OS-X operating system Linux, OS-X or Windows operating system
We develop mainly on 64-bit Linux machines. 32-bit architectures are We develop mainly on 64-bit Linux machines. 32-bit architectures are
not well-tested. not well-tested.
...@@ -66,6 +66,7 @@ want. Unpack the release, and type: ...@@ -66,6 +66,7 @@ want. Unpack the release, and type:
python setup.py test python setup.py test
python setup.py install python setup.py install
.. _install_bleeding_edge:
Bleeding Edge Bleeding Edge
-------------- --------------
...@@ -73,7 +74,6 @@ Bleeding Edge ...@@ -73,7 +74,6 @@ Bleeding Edge
Feeling lucky and want to run bleeding-edge code? Feeling lucky and want to run bleeding-edge code?
Then check out the :ref:`dev_start_guide` guide. Then check out the :ref:`dev_start_guide` guide.
I bet you also run with scissors.
Configuring the environment Configuring the environment
--------------------------- ---------------------------
- ``THEANO_DEFAULT_MODE``:
  String value specifying the default mode to use when compiling Theano
  graphs. This can be one of the strings defined in
  :ref:`using_modes`.

  Possible values so far are:

  - ``'FAST_COMPILE'``
This advice has not been tested recently, so please inform us of your results.
Windows
-------
Running Theano under Windows is currently achieved by using the `MinGW
<http://www.mingw.org>`__ port of the GCC compiler.
It could probably also run with `Cygwin <http://www.cygwin.com/>`__,
but this has not been tested yet.
- From `the MinGW files <http://sourceforge.net/projects/mingw/files/>`__,
download the latest version of the ``Automated MinGW Installer`` and install
it (keeping default options).
- From `the MinGW files <http://sourceforge.net/projects/mingw/files/>`__,
download the latest ``MSYS Base System`` executable file and run it
(note that the latest version of MSYS Base System may not contain an
executable file, in which case it is easier to just use an
older version, e.g. MSYS-1.0.11.exe).
This will install MSYS (you can keep the default install options).
It will also run a post-install script where it will ask you about the
location of MinGW (typically ``c:/MinGW``).
- From `the MinGW files <http://sourceforge.net/projects/mingw/files/>`__,
download the current version of ``GCC Version 4`` (full package with
binaries, e.g.
gcc-full-4.4.0-mingw32-bin-2.tar.lzma). Unpack it (you may use
`7-Zip <http://www.7-zip.org>`__ to unpack files with the
.lzma extension), copying the content into the root directory
of your MinGW installation (if you obtain a .tar file, make
sure you expand it as well, either with `7-Zip <http://www.7-zip.org>`__
or through the ``tar`` command on the MSYS command line).
- If you are familiar with vi, you may find it useful to download and install
``MSYS vim`` (this is done in a similar way to GCC 4).
This is strictly optional and mostly helpful to edit configuration files
from within MSYS.
- Run MSYS (Start/Programs/MinGW/MSYS/MSYS) and check the installation
by verifying that the proper version of GCC is found:
.. code-block:: bash

    gcc --version
You may also decide to change the location of your home directory by
adding a line at the beginning of msys.bat, that would look like
``set HOME=C:\My\Home\For\MinGW`` (you can also set a global ``HOME``
environment variable within Windows, but this could affect more programs).
- If you do not have them already, install the latest versions of
`Python 2.x <http://www.python.org/download/windows>`__ and
corresponding `Numpy <http://sourceforge.net/projects/numpy/files/>`__
then `SciPy <http://sourceforge.net/projects/scipy/files/>`__
packages (simply use the executable installers).
- Ensure that the Python installation directory and its ``Scripts`` sub-directory
are in your system path. This may be done by
modifying the global ``PATH`` Windows environment variable, or by creating a ``.profile`` file in
your MinGW home, containing the line
``export PATH=$PATH:/c/Python26:/c/Python26/Scripts``
(for Python 2.6).
- Install a ``BLAS`` library. Note that although the following instructions
will give you a generic way to build your own library, there may exist
better (more optimized) versions of BLAS available for your system, but
these have not been tested for Windows at this time.
To build BLAS, download the latest version of `LAPACK <http://www.netlib.org/lapack/>`__
(typically lapack.tgz), then issue the following commands in MSYS
(for LAPACK 3.2.1):
.. code-block:: bash

    tar zxvf lapack.tgz
    cd lapack-3.2.1
    gfortran -shared -O3 -o libblas.dll BLAS/SRC/*.f
    mv libblas.dll /mingw/lib
- Install `Mercurial <http://mercurial.selenic.com/downloads/>`__
(you can use the regular Windows release, you do not need TortoiseHg).
- In order to run Theano's test-suite, you will need `nose
<http://somethingaboutorange.com/mrl/projects/nose>`__.
After unpacking its source code, you can build and install it from within
the code directory by:
.. code-block:: bash

    python setup.py install
- Install Theano using the above :ref:`install_bleeding_edge` installation instructions
(using ``easy_install`` will require additional packages and has not been
tested yet, while the latest official Theano release is also untested at this
time).
In particular, do not forget to make the Theano package accessible from
Python, e.g. by adding to your ``.profile`` a line like
``export PYTHONPATH=$PYTHONPATH:$HOME/Theano``.
- Please note that at this time, some tests (launched using ``nosetests``) are
still failing under Windows.
We are working on fixing them.
Generating the documentation
----------------------------
Developer Start Guide
=====================
- Learn some non-basic python to understand what's going on in some of the
  trickier files (like tensor.py).
- Roughly go through the numpy documentation.
- Learn to write reStructuredText_ for epydoc_.
As a developer, you should clone this repository like this:
.. code-block:: bash

    hg clone 'http://username:password@hg.assembla.com/theano' Theano
You can also clone the code anonymously:
.. code-block:: bash

    hg clone http://hg.assembla.com/theano Theano
Setting up your environment
===========================
For more detail, :ref:`see <metadocumentation_nightly_build>`.
.. TODO: fix these links
.. _non-basic python: http://pylearn.org/theano/wiki/NonbasicPython
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _introduction:
==================
Theano at a Glance
==================

Theano is a Python library that allows you to define, optimize, and evaluate
mathematical expressions involving multi-dimensional arrays. Using Theano it is
possible to attain speeds rivaling hand-crafted C implementations for problems
involving large amounts of data. It can also surpass C on a CPU by many orders
of magnitude by taking advantage of recent GPUs.
Theano melds some aspects of a computer algebra system (CAS) with
aspects of an optimizing compiler. It can even transform some or all
Getting started
===============
:ref:`install`
    Instructions to download and install Theano on your system.

:ref:`tutorial`
    Getting started with Theano's basic features. Go here if you are
    new!

:ref:`libdoc`
    Details of what Theano provides. It is recommended to go through
    the :ref:`tutorial` first though.

A PDF version of the online documentation may be found `here <theano.pdf>`_.
Contact us
==========
.. _debugmode:
=================
:mod:`debugmode`
=================
.. module:: debugmode
   :platform: Unix, Windows
   :synopsis: defines DebugMode
.. moduleauthor:: LISA
Guide
=====
The DebugMode evaluation mode includes a number of self-checks and assertions
that can help to diagnose several kinds of programmer errors that can lead to
incorrect output.
It is much slower to evaluate a function or method with DebugMode than
it would be in ``'FAST_RUN'`` or even ``'FAST_COMPILE'``. We recommend you use
DebugMode during development, but not when you launch 1000 processes on
a cluster.
DebugMode can be used as follows:
.. code-block:: python

    x = tensor.dvector('x')
    f = theano.function([x], 10*x, mode='DEBUG_MODE')
    f(5)
    f(0)
    f(7)
It can also be used by setting an environment variable ``THEANO_DEFAULT_MODE=DEBUG_MODE``.
It can also be used by passing a DebugMode instance as the mode, as in
>>> f = theano.function([x], 10*x, mode=DebugMode(check_c_code=False))
If any problem is detected, DebugMode will raise an exception according to
what went wrong, either at call time (``f(5)``) or compile time (
``f = theano.function(x, 10*x, mode='DEBUG_MODE')``). These exceptions
should *not* be ignored; talk to your local Theano guru or email the
users list if you cannot make the exception go away.
Some kinds of errors can only be detected for certain input value combinations.
In the example above, there is no way to guarantee that a future call to say,
``f(-1)`` won't cause a problem. DebugMode is not a silver bullet.
If you instantiate DebugMode using the constructor ``compile.DebugMode``
rather than the keyword ``DEBUG_MODE`` you can configure its behaviour via
constructor arguments.
Reference
==========
.. class:: DebugMode(Mode)
Evaluation Mode that detects internal Theano errors.
This mode catches several kinds of internal error:
- inconsistent c_code and perform implementations (see `BadCLinkerOutput`)
- a variable replacing another when their runtime values don't match. This is a symptom of
an incorrect optimization step, or faulty Op implementation (raises `BadOptimization`)
- stochastic optimization ordering (raises `StochasticOrder`)
- incomplete `destroy_map` specification (raises `BadDestroyMap`)
- an op that returns an illegal value not matching the output Variable Type (raises
InvalidValueError)
Each of these exceptions inherits from the more generic `DebugModeError`.
If there are no internal errors, this mode behaves like FAST_RUN or FAST_COMPILE, but takes
a little longer and uses more memory.
If there are internal errors, this mode will raise a `DebugModeError` exception.
.. attribute:: stability_patience = config.THEANO_DEBUGMODE_PATIENCE
When checking for the stability of optimization, recompile the graph this many times.
Default 10.
.. attribute:: check_c_code = config.THEANO_DEBUGMODE_CHECK_C
Should we evaluate (and check) the `c_code` implementations?
``True`` -> yes, ``False`` -> no.
Default yes.
.. attribute:: check_py_code = config.THEANO_DEBUGMODE_CHECK_PY
Should we evaluate (and check) the `perform` implementations?
``True`` -> yes, ``False`` -> no.
Default yes.
.. attribute:: check_isfinite = config.THEANO_DEBUGMODE_CHECK_FINITE
Should we check for (and complain about) ``NaN``/``Inf`` ndarray elements?
``True`` -> yes, ``False`` -> no.
Default yes.
.. attribute:: require_matching_strides = config.THEANO_DEBUGMODE_CHECK_STRIDES
Check for (and complain about) Ops whose python and C
outputs are ndarrays with different strides. (This can catch bugs, but
is generally overly strict.)
0 -> no check, 1 -> warn, 2 -> err.
Default warn.
.. method:: __init__(self, optimizer='fast_run', stability_patience=None, check_c_code=None, check_py_code=None, check_isfinite=None, require_matching_strides=None, linker=None)
Initialize member variables.
If any of these arguments (except ``optimizer``) is not None, it overrides the class default.
The ``linker`` argument is not used; it is present to allow ``Mode.requiring()`` and some other functions to work with DebugMode too.
The keyword version of DebugMode (which you get by using ``mode='DEBUG_MODE'``)
is quite strict, and can raise several different Exception types.
The following are DebugMode exceptions you might encounter:
.. class:: DebugModeError(Exception)
This is a generic error. All the other exceptions inherit from this one.
This error is typically not raised directly.
However, you can use ``except DebugModeError: ...`` to catch any of the more
specific types of Exception.
.. class:: BadCLinkerOutput(DebugModeError)
This exception means that the python (``perform``) and c (``c_code``) implementations of an Op
didn't compute the same thing as they were supposed to.
The problem might be a bug in either ``perform`` or ``c_code`` (or both).
.. class:: BadOptimization(DebugModeError)
This exception indicates that an Optimization replaced one variable (say V1)
with another one (say V2) but at runtime, the values for V1 and V2 were
different. This is something that optimizations are not supposed to do.
It can be tricky to identify the one-true-cause of an optimization error, but
this exception provides a lot of guidance. Most of the time, the
exception object will indicate which optimization was at fault.
The exception object also contains information such as a snapshot of the
before/after graph where the optimization introduced the error.
.. class:: BadDestroyMap(DebugModeError)
This happens when an Op's ``perform()`` or ``c_code()`` modifies an input that it wasn't
supposed to. If either the ``perform`` or ``c_code`` implementation of an Op
might modify any input, it has to advertise that fact via the ``destroy_map``
attribute.
For detailed documentation on the ``destroy_map`` attribute, see :ref:`inplace`.
.. class:: BadViewMap(DebugModeError)
This happens when an Op's perform() or c_code() creates an alias or alias-like
dependency between an input and an output... and it didn't warn the
optimization system via the ``view_map`` attribute.
For detailed documentation on the ``view_map`` attribute, see :ref:`views`.
.. class:: StochasticOrder(DebugModeError)
This happens when an optimization does not perform the same graph operations
in the same order when run several times in a row. This can happen if any
steps are ordered by ``id(object)`` somehow, such as via the default object
hash function. A stochastic optimization invalidates the pattern of work
whereby we debug in DEBUG_MODE and then run the full-size jobs in FAST_RUN.
.. class:: InvalidValueError(DebugModeError)
This happens when some Op's ``perform`` or ``c_code`` implementation computes
an output that is invalid with respect to the type of the corresponding output
variable. Like if it returned a complex-valued ndarray for a ``dscalar``
Type.
This can also be triggered when floating-point values such as NaN and Inf are
introduced into the computations. It indicates which Op created the first
NaN. These floating-point values can be allowed by passing the
``check_isfinite=False`` argument to DebugMode.
.. _usingfunction:
===========================================
:mod:`function` - defines theano.function
===========================================
.. module:: function
   :platform: Unix, Windows
   :synopsis: defines theano.function and related classes
.. moduleauthor:: LISA
Guide
=====
This module provides :func:`function`, commonly accessed as `theano.function`,
the interface for compiling graphs into callable objects.
You've already seen example usage in the basic tutorial... something like this:
>>> x = theano.tensor.dscalar()
>>> f = theano.function([x], 2*x)
>>> print f(4) # prints 8.0
The idea here is that we've compiled the symbolic graph (``2*x``) into a function that can be called on a number and will do some computations.
The behaviour of function can be controlled in several ways, such as
:class:`Param`, ``mode``, ``updates``, and ``givens``. These are covered
in the :ref:`tutorial examples <basictutexamples>` and :ref:`tutorial on modes <using_modes>`.
Reference
=========
.. class:: Out
A class for attaching information to function outputs
.. attribute:: variable
A variable in an expression graph to use as a compiled-function
output
.. attribute:: borrow
``True`` indicates that a reference to internal storage may be returned, and that the caller is aware that subsequent function evaluations might overwrite this memory.
.. method:: __init__(variable, borrow=False)
Initialize attributes from arguments.
.. class:: Param
A class for attaching information to function inputs.
.. attribute:: variable
A variable in an expression graph to use as a compiled-function parameter
.. attribute:: default
The default value to use at call-time (can also be a Container where
the function will find a value at call-time.)
.. attribute:: name
A string to identify an argument for this parameter in keyword arguments.
.. attribute:: mutable
``True`` means the compiled-function is allowed to modify this
argument. ``False`` means it is not allowed.
.. attribute:: strict
If ``False``, a function argument may be copied or cast to match the type
required by the parameter `variable`. If ``True``, a function argument
must exactly match the type required by `variable`.
.. method:: __init__(self, variable, default=None, name=None, mutable=False, strict=False)
Initialize object attributes.
.. function:: function(inputs, outputs, mode=None, updates=[], givens=[], accept_inplace=False, name=None)
Return a callable object that will calculate `outputs` from `inputs`.
:type inputs: list of either Variable or Param instances, but not shared
    variables.
:param inputs: the returned :class:`Function` instance will have
    parameters for these variables.
:type outputs: list of Variables or Out instances
:param outputs: expressions to compute.
:type mode: None, string or :class:`Mode` instance.
:param mode: compilation mode
:type updates: iterable over pairs (shared_variable, new_expression).
List, tuple or dict.
:param updates: expressions for new SharedVariable values
:type givens: iterable over pairs (Var1, Var2) of Variables.
List, tuple or dict. The Var1
and Var2 in each pair must have the same Type.
:param givens: specific substitutions to make in the
computation graph (Var2 replaces Var1).
:param name: an optional name for this function.
The profile mode will print the time spent in this function.
:rtype: Function instance
:returns: a callable object that will compute the outputs (given the inputs)
and update the implicit function arguments according to the `updates`.
Inputs can be given as variables or Param instances.
:class:`Param` instances also have a variable, but they attach some extra
information about how call-time arguments corresponding to that variable
should be used. Similarly, :class:`Out` instances can attach information
about how output variables should be returned.
The default is typically 'FAST_RUN' but this can be changed in
:doc:`theano.config <../config>` or via :envvar:`THEANO_DEFAULT_MODE`. The mode
argument controls the sort of optimizations that will be applied to the
graph, and the way the optimized graph will be evaluated.
After each function evaluation, the `updates` mechanism can replace the
value of any SharedVariable [implicit] inputs with new values computed
from the expressions in the `updates` list. An exception will be raised
if you give two update expressions for the same SharedVariable input (that
doesn't make sense).
Regarding givens: Be careful to make sure that these substitutions are
independent--behaviour when Var1 of one pair appears in the graph leading
to Var2 in another expression is undefined. Replacements specified with
givens are different from optimizations in that Var2 is not expected to be
equivalent to Var1.
.. _libdoc_compile:
==============================================================
:mod:`compile` -- transforming expression graphs to functions
==============================================================
.. module:: compile
   :platform: Unix, Windows
   :synopsis: transforming expression graphs to functions
.. moduleauthor:: LISA
.. toctree::
    :maxdepth: 1

    function
    io
    mode
    module
    debugmode
    profilemode
.. note::

    ***TODO*** Freshen up this old documentation

===========================================
:mod:`io` - defines theano.function [TODO]
===========================================

.. module:: io
   :platform: Unix, Windows
   :synopsis: defines In and Out
.. moduleauthor:: LISA

.. _function_inputs:

Inputs
======
``In`` instances let us attach properties to ``Variables`` to tell function more about how to use them.
.. class:: In(object)

.. method:: __init__(variable, name=None, value=None, update=None, mutable=False, strict=False, autoname=True, implicit=None)
    fn3 = theano.function([x], outputs=x+x)
    print fn3(numpy.asarray([[1,0],[0,1]]))
======================================
:mod:`mode` -- controlling compilation
======================================
.. module:: mode
   :platform: Unix, Windows
   :synopsis: controlling compilation
.. moduleauthor:: LISA
Guide
=====
The ``mode`` parameter to :func:`theano.function` controls how the
inputs-to-outputs graph is transformed into a callable object.
Theano defines the following modes by name:
- ``FAST_COMPILE``: Apply just a few optimizations, but use C op implementations where possible.
- ``FAST_RUN``: Apply all optimizations, and use C op implementations where possible.
- ``DEBUG_MODE``: Verify the correctness of all optimizations, and compare C and python
implementations. This mode can take much longer than the other modes,
but can identify many kinds of problems.
The default mode is typically 'FAST_RUN', but it can be controlled via the
environment variable 'THEANO_DEFAULT_MODE', which can in turn be overridden by
setting ``theano.compile.mode.default_mode`` directly, which can in turn be
overridden by passing the keyword argument to ``theano.function``.
For a finer level of control over which optimizations are applied, and whether
C or python implementations are used, read :api:`compile.mode.Mode`.
Reference
=========
.. attribute:: FAST_COMPILE
.. attribute:: FAST_RUN
.. attribute:: DEBUG_MODE
.. attribute:: PROFILE_MODE
.. class:: Mode(object)
Compilation is controlled by two attributes: the `optimizer` controls how
an expression graph will be transformed; the `linker` controls how the
optimized expression graph will be evaluated.
.. attribute:: optimizer
An :class:`optimizer` instance.
.. attribute:: linker
A :class:`linker` instance.
.. method:: including(*tags)
Return a new Mode instance like this one, but with an
optimizer modified by including the given tags.
.. method:: excluding(*tags)
Return a new Mode instance like this one, but with an
optimizer modified by excluding the given tags.
.. method:: requiring(*tags)
Return a new Mode instance like this one, but with an
optimizer modified by requiring the given tags.
=======================================
:mod:`module` -- a theano object system
=======================================

.. note::

    Module addresses similar needs to `shared`. New code is encouraged to
    use `shared` variables.

Now that we're familiar with the basics, we introduce Theano's more
advanced interface, Module. This interface allows you to define Theano
.. _profilemode:

================================================
:mod:`profilemode` -- profiling Theano functions
================================================

.. module:: profilemode
   :platform: Unix, Windows
   :synopsis: profiling Theano functions with ProfileMode
.. moduleauthor:: LISA

Guide
=====
Guide
=====
To profile a Theano graph, a special mode called ProfileMode must be passed as
an argument when compiling your graph. Using ProfileMode is a three-step
generates the following output:

"""
.. note::

    ***TODO***
    The following text was recovered from a recent version of the source
    file... hopefully things haven't gotten too out-of-sync!
The first shows an Apply-wise summary, the second an Op-wise summary, and the third a type-Op-wise summary.

The Apply-wise summary prints the timing information for the worst-offending
Apply nodes. This corresponds to the individual Op applications
within your graph which take the longest to execute (so if you use dot
twice, you will see two entries there).

In the Op-wise summary, the execution times of all Apply nodes
executing the same Op are grouped together, and the total execution
time per Op is shown (so if you use dot twice, you will see only one
entry there, corresponding to the sum of the time spent in each of
them). If two Ops have different hash values, they are reported separately.

The type-Op-wise summary groups the results by type of Op, so even if
two Ops have different hash values, they are merged.

There is a hack in the Op-wise summary; go see the source if you want to know more.
The summary has two components to it. In the first section, called the
Apply-wise summary, timing information is provided for the worst
offending Apply nodes. This corresponds to individual Op applications
Note that the ProfileMode also shows which Ops were running a C
implementation.
Developers wishing to optimize the performance of their graph should
focus on the worst offending Ops and Apply nodes -- either by optimizing an
implementation, providing a missing C implementation, or by writing a graph
optimization that eliminates the offending Op altogether.
You should strongly consider emailing one of our lists about your issue before
spending too much time on this.
Reference
=========
.. class:: ProfileMode(Mode)
.. method:: print_summary(n_apply_to_print=None, n_ops_to_print=None)

    Print three summaries to stdout that show where cpu time is spent during theano function executions (for all functions using this object instance).

    :param n_apply_to_print: the number of apply nodes to print.
        Default 15, but can be configured via ``ProfileMode.n_apply_to_print`` in :envvar:`THEANO_FLAGS`.
    :param n_ops_to_print: the number of ops to print.
        Default 20, but can be configured via ``ProfileMode.n_ops_to_print`` in :envvar:`THEANO_FLAGS`.
    :returns: None
.. method:: print_diff_summary(self, other, n_apply_to_print=None, n_ops_to_print=None)

    As `print_summary`, but print the difference between two ProfileMode instances.

    TODO: also print the Apply-wise summary, as it doesn't work for now.
    TODO: make comparison with GPU code possible.

    :param other: the other ProfileMode instance to compare against.
    :param n_apply_to_print: the number of apply nodes to print.
        Default 15, but can be configured via ``ProfileMode.n_apply_to_print`` in :envvar:`THEANO_FLAGS`.
    :param n_ops_to_print: the number of ops to print.
        Default 20, but can be configured via ``ProfileMode.n_ops_to_print`` in :envvar:`THEANO_FLAGS`.
    :returns: None
.. _libdoc_config:
=======================================
:mod:`config` -- library configuration
=======================================
.. module:: config
   :platform: Unix, Windows
   :synopsis: library configuration
.. moduleauthor:: LISA
.. envvar:: THEANO_FLAGS
***TODO***
.. attribute:: floatx
***TODO*** what attributes are in here?
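As a sketch of the intended usage, :envvar:`THEANO_FLAGS` holds comma-separated ``key=value`` pairs read when theano is imported. The specific flag shown is taken from the :mod:`profilemode` reference; ``myscript.py`` is a placeholder name:

```shell
# Limit the ProfileMode Apply-wise summary to 10 nodes for this run.
THEANO_FLAGS='ProfileMode.n_apply_to_print=10' python myscript.py
```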
.. _libdoc_floatX:

=======================================================================
:mod:`floatX` -- easy switching between float32 and float64
=======================================================================
.. module:: floatx
:platform: Unix, Windows
:synopsis: easy switching between float32 and float64
.. moduleauthor:: LISA
Guide
===== =====
Their is a special data type called floatX. It is not a real datatype. It is never present in the theano graph, but their exist constructor and function that will change the floatX to float32 or float64(default) in your graph. You can change the value of floatX when you start the execution of python by setting the environement variables THEANO_GPU=floatX=float{32,64}(case sensitive). You can have the value of floatX with:: Their is a special data type called floatX. It is not a real datatype. It is never present in the theano graph, but their exist constructor and function that will change the floatX to float32 or float64(default) in your graph. You can change the value of floatX when you start the execution of python by setting the environement variables THEANO_GPU=floatX=float{32,64}(case sensitive). You can have the value of floatX with::
HINT: linear algorithms are less affected by the difference in precision than
non-linear ones.
``numpy.asarray(x, dtype=config.floatX)`` will copy only if needed.
WARNING: ``theano.floatx.set_floatX()`` exists for our tests. Don't use it for
anything else: it makes code hard to read, and needing it is a sign that
something other than floatX would serve you better.
Reference
==========
.. function:: xscalar(name=None)
Alias for either :func:`dscalar` or :func:`fscalar`
.. function:: xvector(name=None)
Alias for either :func:`dvector` or :func:`fvector`
.. function:: xmatrix(name=None)
Alias for either :func:`dmatrix` or :func:`fmatrix`
.. function:: xrow(name=None)
Alias for either :func:`drow` or :func:`frow`
.. function:: xcol(name=None)
Alias for either :func:`dcol` or :func:`fcol`
.. function:: xtensor3(name=None)
Alias for either :func:`dtensor3` or :func:`ftensor3`
.. function:: xtensor4(name=None)
Alias for either :func:`dtensor4` or :func:`ftensor4`
.. function:: set_floatX(dtype=config.floatX)
Reset the :func:`xscalar`, ... :func:`xtensor4` aliases to return types
with given dtype.
.. _libdoc_gof:
================================================
:mod:`gof` -- theano internals [doc TODO]
================================================
.. _libdoc_gradient:
===========================================
:mod:`gradient` -- symbolic differentiation
===========================================
.. module:: gradient
:platform: Unix, Windows
:synopsis: low-level automatic differentiation
.. moduleauthor:: LISA
Symbolic gradient is usually computed from :func:`tensor.grad`, which offers a
more convenient syntax for the common case of wanting the gradient in some
expressions with respect to a scalar cost. The :func:`grad_sources_inputs`
function does the underlying work, and is more flexible, but is also more
awkward to use when :func:`tensor.grad` can do the job.
.. function:: grad_sources_inputs(sources, graph_inputs, warn_type=True)
A gradient source is a pair (``r``, ``g_r``), in which ``r`` is a `Variable`, and ``g_r`` is a
`Variable` that is a gradient wrt ``r``.
This function traverses the graph backward from the ``r`` sources,
calling ``op.grad(...)`` for all ops with some non-None gradient on an output.
The ``op.grad(...)`` functions are called like this:
.. code-block:: python
op.grad(op.inputs[:], [total_gradient(v) for v in op.outputs])
This call to ``op.grad`` should return a list or tuple: one symbolic gradient per input.
If ``op`` has a single input, then ``op.grad`` should return a list or tuple of length 1.
For each input wrt to which ``op`` is not differentiable, it should return ``None`` instead
of a `Variable` instance.
If a source ``r`` receives a gradient from another source ``r2``, then the effective
gradient on ``r`` is the sum of both gradients.
:type sources: list of pairs of Variable: (v, gradient-on-v) to
initialize the total_gradient dictionary
:param sources: gradients to back-propagate using chain rule
:param warn_type: True will trigger warnings via the logging module when
the gradient on an expression has a different type than the original
expression
:type warn_type: bool
:type graph_inputs: list of Variable
:param graph_inputs: variables considered to be constant
(do not backpropagate through them)
:rtype: dictionary whose keys and values are of type `Variable`
:returns: mapping from each Variable encountered in the backward traversal to its [total] gradient.
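The backward traversal and gradient summing can be illustrated with a toy numeric sketch. This is not Theano's implementation: the helper name, the graph encoding, and the two hard-coded op rules are invented for illustration of the chain rule that ``grad_sources_inputs`` applies symbolically.

```python
def grad_sources_inputs_toy(sources, nodes):
    """sources: list of (variable_name, gradient) pairs.
    nodes: list of (op, inputs, output, input_values) in forward order.
    Returns a dict mapping each variable name to its total gradient."""
    total = dict()
    for var, g in sources:
        total[var] = total.get(var, 0.0) + g
    # Walk the graph backward, applying each op's local gradient rule;
    # gradients arriving from different paths are summed.
    for op, inputs, output, values in reversed(nodes):
        g_out = total.get(output)
        if g_out is None:
            continue  # no gradient flows through this output
        if op == 'mul':    # d(x*y)/dx = y, d(x*y)/dy = x
            local = [values[1] * g_out, values[0] * g_out]
        elif op == 'add':  # d(x+y)/dx = 1, d(x+y)/dy = 1
            local = [g_out, g_out]
        for inp, g in zip(inputs, local):
            total[inp] = total.get(inp, 0.0) + g
    return total

# Graph: c = a * b; d = c + a, evaluated at a=2, b=3 (so c=6, d=8)
nodes = [('mul', ['a', 'b'], 'c', (2.0, 3.0)),
         ('add', ['c', 'a'], 'd', (6.0, 2.0))]
grads = grad_sources_inputs_toy([('d', 1.0)], nodes)
# 'a' receives gradient along both paths: dd/da = b + 1 = 4; dd/db = a = 2
```

Note how the gradient on ``a`` is the sum of the contribution through the multiplication and the direct contribution through the addition, exactly the summing rule described above.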
.. _libdoc:
=====================
Library Documentation
=====================
This documentation covers Theano module by module.
.. toctree::
:maxdepth: 1
tensor/index
gradient
config
floatX
printing
compile/index
sparse/index
scalar/index
gof/index
There are also some top-level imports that you might find more convenient:
.. module:: theano
:platform: Unix, Windows
:synopsis: Theano top-level import
.. moduleauthor:: LISA
.. function:: function(...)
Alias for :func:`function.function`
.. class:: Param
Alias for :class:`function.Param`
.. function:: dot(x, y)
Works like :func:`tensor.dot` for both sparse and dense matrix products
.. _libdoc_printing:
================================================================================
:mod:`printing` -- graph printing and symbolic print statement [doc TODO]
================================================================================
.. _libdoc_scalar:
==============================================================
:mod:`scalar` -- symbolic scalar types, ops [doc TODO]
==============================================================
.. _libdoc_sparse:
===========================================================
:mod:`sparse` -- symbolic sparse matrices [doc TODO]
===========================================================
.. currentmodule:: tensor
.. _libdoc_tensor_type:
TensorType
==========
.. class:: TensorType
.. _libdoc_tensor_variable:
TensorVariable
==============
.. _libdoc_tensor_constant:
TensorConstant
==============
.. _libdoc_tensor_creation:
Creation
========
Theano provides a list of predefined tensor types that can be used
to create tensor variables. The names of the predefined types follow
a simple recipe:
``<dtype><dimensionality>``
where ``<dtype>`` is one of (note that this is not a complete list of
possible dtypes, just those used by the predefined
types):
==== ========== =============== =================
code type domain bits
==== ========== =============== =================
b byte signed integer 8
w word signed integer 16
i integer signed integer 32
l long signed integer 64
f float floating point 32
d double floating point 64
c complex64 complex 64 (two float32)
z complex128 complex 128 (two float64)
==== ========== =============== =================
``<dimensionality>`` is one of:
======== ============= =================================================================
code shape :ref:`broadcastable <libdoc_tensor_broadcastable>` pattern
======== ============= =================================================================
scalar [] [True, True, True, True ]
vector [n] [True, True, True, False] (vectors are used like row vectors)
row [1, n] [True, True, True, False]
col [m, 1] [True, True, False, True ]
matrix [m, n] [True, True, False, False]
tensor3 [m, n, k] [True, False, False, False]
tensor4 [m, n, k, l] [False, False, False, False]
======== ============= =================================================================
So, if you want the type of a row of 32-bit floats, it is available
as :ref:`theano.tensor.frow <libdoc_tensor_type>`.
If you want a matrix of signed 32-bit integers, it is available as
:ref:`theano.tensor.imatrix <libdoc_tensor_type>`.
Each of the types described above can be constructed by two methods:
a singular version (e.g., :ref:`dmatrix <libdoc_tensor_creation>`)
and a plural version (:ref:`dmatrices <libdoc_tensor_creation>`).
When called, the singular version takes an optional single
argument, a string giving the name of the *Variable* we want to make, and it
returns a single Variable of that type. The plural version takes either
an integer or several strings. Given an integer, the method
returns that many unnamed Variables; given strings, it
creates one Variable per string, using the string as the Variable's
name. For example:
.. code-block:: python
from theano.tensor import *
x = dmatrix() # creates one Variable with no name
x = dmatrix('x') # creates one Variable with name 'x'
xyz = dmatrix('xyz') # creates one Variable with name 'xyz'
x, y, z = dmatrices(3) # creates three Variables with no names
x, y, z = dmatrices('x', 'y', 'z') # creates three Variables named 'x', 'y' and 'z'
Custom tensor types
-------------------
If you wish to use a type of tensor which is not already available
(for example, a 5D tensor) you can build an appropriate type using
:ref:`theano.tensor.TensorType <libdoc_tensor_type>`.
The first argument you pass is the `dtype` and the second is the
`broadcastable pattern`,
where `dtype` is one of the following (this is the complete list of supported dtypes):
================= =================== =================
dtype domain bits
================= =================== =================
``'int8'`` signed integer 8
``'int16'`` signed integer 16
``'int32'`` signed integer 32
``'int64'`` signed integer 64
``'uint8'`` unsigned integer 8
``'uint16'`` unsigned integer 16
``'uint32'`` unsigned integer 32
``'uint64'`` unsigned integer 64
``'float32'`` floating point 32
``'float64'`` floating point 64
``'complex64'`` complex 64 (two float32)
``'complex128'`` complex 128 (two float64)
================= =================== =================
The broadcastable pattern indicates both the number of dimensions and
whether a particular dimension must have length 1.
Here is a table mapping the :ref:`broadcastable <libdoc_tensor_broadcastable>` pattern to what kind of tensor it encodes:
===================== =================================
pattern interpretation
===================== =================================
[] scalar
[True] 1D scalar (vector of length 1)
[True, True] 2D scalar (1x1 matrix)
[False] vector
[False, False] matrix
[False] * n nD tensor
[True, False] row (1xN matrix)
[False, True] column (Mx1 matrix)
[False, True, False] A Mx1xP tensor (a)
[True, False, False] A 1xNxP tensor (b)
[False, False, False] A MxNxP tensor (pattern of a + b)
===================== =================================
For dimensions in which broadcasting is False, the length of this
dimension can be 1 or more. For dimensions in which broadcasting is True,
the length of this dimension must be 1.
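This length-1 constraint can be sketched with a small checker (a hypothetical helper, not part of Theano; Theano performs the equivalent check when filtering values against a TensorType):

```python
def shape_matches(shape, broadcastable):
    """Check a concrete shape against a broadcastable pattern: wherever
    the pattern is True, the dimension must have length exactly 1."""
    if len(shape) != len(broadcastable):
        return False
    return all(dim == 1 for dim, bc in zip(shape, broadcastable) if bc)

assert shape_matches((1, 5), [True, False])       # a valid row
assert not shape_matches((3, 5), [True, False])   # first dim must be 1
assert shape_matches((3, 5), [False, False])      # any matrix shape is fine
```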
When two tensors have a different number of dimensions, the broadcastable
pattern is *expanded to the left*, by padding with ``True``. For example,
a vector's pattern, ``[False]``, could be expanded to ``[True, False]``, and
would behave like a row (1xN matrix). In the same way, a matrix (``[False,
False]``) would behave like a 1xNxP tensor (``[True, False, False]``).
If we wanted to create a type representing a 3D array of unsigned
bytes, we would do:
.. code-block:: python
# 3D tensor of unsigned bytes
mytype = theano.tensor.TensorType('uint8', [False]*3)
# complex types (based on complex64)
my_cscalar = theano.tensor.TensorType('complex64', [])
my_cmatrix = theano.tensor.TensorType('complex64', [False, False])
Shared Variable
---------------
Yet another way of creating a special type of Theano variable is by using
:func:`shared` as in the example below:
.. code-block:: python
x = shared(value, name)
:func:`shared` takes two parameters, `value` and `name`, and creates a Theano
variable with the name `name` and initial value `value`. The type of this
variable is inferred from the type of `value`; for example, if `value` is a
numpy matrix of float32 values, the shared variable will be of type `fmatrix`.
Note that a shared variable is not like other Theano variables. For more
details on how to use shared variables, see :ref:`functionstateexample` (or,
for more details, :ref:`sharedvars`). TODO: make the last link point to a
detailed description of shared variables.
Autocasting
-----------
Theano automatically casts numpy ndarrays and Python floats/ints into
Theano constants.
TODO: What does (or compatible) mean? Talk about casting rules, refer .
TODO: link to floatX (?)
.. function:: as_tensor_variable(x, ...)
Shaping and Shuffling
=====================
.. function:: shape(x)
:param x: symbolic Tensor (or compatible)
Returns the symbolic shape vector of `x`
.. function:: reshape(x)
.. function:: dimshuffle(x)
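These Ops mirror the corresponding numpy operations. A numpy sketch of the idea (`dimshuffle` itself is Theano-specific; here it is approximated with ``newaxis`` and ``transpose``):

```python
import numpy as np

x = np.arange(6)
assert np.shape(x) == (6,)                 # what symbolic shape(x) evaluates to
assert x.reshape(2, 3).shape == (2, 3)     # reshape to a compatible new shape
# dimshuffle can insert broadcastable dimensions and permute the rest;
# in numpy, inserting an axis and transposing achieves the same layout
m = x.reshape(2, 3)
assert m[np.newaxis, :, :].shape == (1, 2, 3)
assert m.transpose(1, 0).shape == (3, 2)
```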
Reductions
==========
.. function:: max(x)
:param x: symbolic Tensor (or compatible)
Returns TODO
.. function:: min(x)
:param x: symbolic Tensor (or compatible)
Returns TODO
.. function:: sum(x)
:param x: symbolic Tensor (or compatible)
Returns TODO
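A numpy sketch of the intended semantics (these symbolic reductions mirror the numpy reductions of the same name):

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
# full reductions collapse the whole tensor to a scalar
assert x.max() == 4 and x.min() == 1 and x.sum() == 10
# an axis argument reduces along a single dimension instead
assert x.sum(axis=0).tolist() == [4, 6]
```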
Indexing
========
Basic indexing.
Advanced indexing.
.. _libdoc_tensor_elementwise:
Elementwise
===========
Casting
-------
Logic Functions
---------------
.. function:: lt(a, b)
Returns a variable representing the result of logical less than (a<b).
:Parameter: *a* - symbolic Tensor (or compatible)
:Parameter: *b* - symbolic Tensor (or compatible)
:Return type: symbolic Tensor
:Returns: a symbolic tensor representing the application of logical
elementwise less than.
.. code-block:: python
import theano.tensor as T
x,y = T.dmatrices('x','y')
z = T.lt(x,y)
.. function:: gt(a, b)
Returns a variable representing the result of logical greater than (a>b).
:Parameter: *a* - symbolic Tensor (or compatible)
:Parameter: *b* - symbolic Tensor (or compatible)
:Return type: symbolic Tensor
:Returns: a symbolic tensor representing the application of logical
elementwise greater than.
.. code-block:: python
import theano.tensor as T
x,y = T.dmatrices('x','y')
z = T.gt(x,y)
.. function:: le(a, b)
Returns a variable representing the result of logical less than or
equal (a<=b).
:Parameter: *a* - symbolic Tensor (or compatible)
:Parameter: *b* - symbolic Tensor (or compatible)
:Return type: symbolic Tensor
:Returns: a symbolic tensor representing the application of logical
elementwise less than or equal.
.. code-block:: python
import theano.tensor as T
x,y = T.dmatrices('x','y')
z = T.le(x,y)
.. function:: ge(a, b)
Returns a variable representing the result of logical greater or equal
than (a>=b).
:Parameter: *a* - symbolic Tensor (or compatible)
:Parameter: *b* - symbolic Tensor (or compatible)
:Return type: symbolic Tensor
:Returns: a symbolic tensor representing the application of logical
elementwise greater than or equal.
.. code-block:: python
import theano.tensor as T
x,y = T.dmatrices('x','y')
z = T.ge(x,y)
.. function:: eq(a, b)
Returns a variable representing the result of logical equality (a==b).
:Parameter: *a* - symbolic Tensor (or compatible)
:Parameter: *b* - symbolic Tensor (or compatible)
:Return type: symbolic Tensor
:Returns: a symbolic tensor representing the application of logical
elementwise equality.
.. code-block:: python
import theano.tensor as T
x,y = T.dmatrices('x','y')
z = T.eq(x,y)
.. function:: neq(a, b)
Returns a variable representing the result of logical inequality
(a!=b).
:Parameter: *a* - symbolic Tensor (or compatible)
:Parameter: *b* - symbolic Tensor (or compatible)
:Return type: symbolic Tensor
:Returns: a symbolic tensor representing the application of logical
elementwise inequality.
.. code-block:: python
import theano.tensor as T
x,y = T.dmatrices('x','y')
z = T.neq(x,y)
Mathematical
------------
.. _libdoc_tensor_broadcastable:
Broadcasting in Theano vs. Numpy
--------------------------------
Broadcasting is a mechanism which allows tensors with
different numbers of dimensions to be added or multiplied
together by (virtually) replicating the smaller tensor along
the dimensions that it is lacking.
In a nutshell, broadcasting is the mechanism by which a scalar
may be added to a matrix, a vector to a matrix or a scalar to
a vector.
.. figure:: bcast.png
Broadcasting a row matrix. T and F respectively stand for
True and False and indicate along which dimensions we allow
broadcasting.
If the second argument were a vector, its shape would be
``(2,)`` and its broadcastable pattern ``(F,)``. They would
be automatically expanded to the **left** to match the
dimensions of the matrix (adding ``1`` to the shape and ``T``
to the pattern), resulting in ``(1, 2)`` and ``(T, F)``.
It would then behave just like the example above.
Unlike numpy which does broadcasting dynamically, Theano needs
to know, for any operation which supports broadcasting, which
dimensions will need to be broadcasted. When applicable, this
information is given in the :ref:`type` of a *Variable*.
See also:
* :ref:`How broadcasting is used in Theano's tensor types <tensortypes>`
* `SciPy documentation about numpy's broadcasting <http://www.scipy.org/EricsBroadcastingDoc>`_
* `OnLamp article about numpy's broadcasting <http://www.onlamp.com/pub/a/python/2000/09/27/numerically.html>`_
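The figure's example can be reproduced numerically with numpy, whose dynamic broadcasting follows the same rules:

```python
import numpy as np

m = np.array([[1, 2], [3, 4], [5, 6]])  # shape (3, 2), pattern (F, F)
r = np.array([[1, 2]])                  # shape (1, 2), pattern (T, F)
v = np.array([1, 2])                    # shape (2,),   pattern (F,)

# the row is (virtually) replicated along its length-1 dimension
assert (m + r).tolist() == [[2, 4], [4, 6], [6, 8]]
# a vector's pattern is expanded to the left, so it behaves like the row
assert ((m + v) == (m + r)).all()
```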
Linear Algebra
==============
.. function:: dot(X, Y)
:param X: left term
:param Y: right term
:type X: symbolic matrix or vector
:type Y: symbolic matrix or vector
:rtype: symbolic matrix or vector
:return: the matrix-matrix, matrix-vector, or vector-vector (inner) product of `X` and `Y`.
.. function:: outer(X, Y)
:param X: left term
:param Y: right term
:type X: symbolic vector
:type Y: symbolic vector
:rtype: symbolic matrix
:return: vector-vector outer product
.. function:: tensordot(X, Y, axes=2)
This is a symbolic stand-in for ``numpy.tensordot``.
:param X: left term
:param Y: right term
:param axes: sum out these axes from X and Y.
:type X: symbolic tensor
:type Y: symbolic tensor
:rtype: symbolic tensor
:type axes: see numpy.tensordot
:return: tensor product
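A numpy sketch of the three products (the symbolic Ops follow the same shape rules):

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)
y = np.arange(12.0).reshape(3, 4)

assert np.dot(x, y).shape == (2, 4)                # matrix-matrix product
assert np.dot([1., 2., 3.], [4., 5., 6.]) == 32.0  # vector inner product
assert np.outer([1., 2.], [3., 4., 5.]).shape == (2, 3)
# with axes=1, tensordot sums over the last axis of x and the first of y,
# which for matrices coincides with dot
assert np.allclose(np.tensordot(x, y, axes=1), np.dot(x, y))
```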
Fourier Transforms
==================
[James has some code for this, but hasn't gotten it into the source tree yet.]
Gradient / Differentiation
==========================
.. function:: grad(cost, wrt, g_cost=None, consider_constant=[], warn_type=False)
Return symbolic gradients for one or more variables with respect to some
cost.
:type cost: 0-d tensor variable
:type wrt: tensor variable or list of tensor variables
:type g_cost: same as `cost`
:type consider_constant: list of variables
:type warn_type: bool
:param cost: a scalar with respect to which we are differentiating
:param wrt: term[s] for which we want gradients
:param g_cost: the gradient on the cost
:param consider_constant: variables whose gradients will be held at 0.
:param warn_type: True will trigger warnings via the logging module when
the gradient on an expression has a different type than the original
expression
:rtype: variable or list of variables (matching `wrt`)
:returns: gradients with respect to cost for each of the `wrt` terms
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://web.resource.org/cc/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="144.18471"
height="188.09711"
id="svg2"
sodipodi:version="0.32"
inkscape:version="0.45.1"
sodipodi:docbase="/u/breuleuo/hg/theano/doc"
sodipodi:docname="bcast.svg"
inkscape:output_extension="org.inkscape.output.svg.inkscape"
version="1.0"
inkscape:export-filename="/u/breuleuo/hg/theano/doc/bcast.png"
inkscape:export-xdpi="249.67973"
inkscape:export-ydpi="249.67973">
<defs
id="defs4">
<marker
inkscape:stockid="Arrow2Lend"
orient="auto"
refY="0"
refX="0"
id="Arrow2Lend"
style="overflow:visible">
<path
id="path3247"
style="font-size:12px;fill-rule:evenodd;stroke-width:0.625;stroke-linejoin:round"
d="M 8.7185878,4.0337352 L -2.2072895,0.016013256 L 8.7185884,-4.0017078 C 6.97309,-1.6296469 6.9831476,1.6157441 8.7185878,4.0337352 z "
transform="matrix(-1.1,0,0,-1.1,-1.1,0)" />
</marker>
<marker
inkscape:stockid="Arrow1Lend"
orient="auto"
refY="0"
refX="0"
id="Arrow1Lend"
style="overflow:visible">
<path
id="path3229"
d="M 0,0 L 5,-5 L -12.5,0 L 5,5 L 0,0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;marker-start:none"
transform="matrix(-0.8,0,0,-0.8,-10,0)" />
</marker>
</defs>
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
gridtolerance="10000"
guidetolerance="10"
objecttolerance="10"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="2.8"
inkscape:cx="55.423257"
inkscape:cy="90.829331"
inkscape:document-units="px"
inkscape:current-layer="layer1"
inkscape:window-width="1272"
inkscape:window-height="937"
inkscape:window-x="0"
inkscape:window-y="0" />
<metadata
id="metadata7">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-106.70114,-419.13306)">
<text
xml:space="preserve"
style="font-size:12px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
x="116.79369"
y="428.03931"
id="text2160"><tspan
sodipodi:role="line"
id="tspan2162"
x="116.79369"
y="428.03931"
style="font-family:Monospace">1 2</tspan><tspan
sodipodi:role="line"
x="116.79369"
y="443.03931"
id="tspan2164"
style="font-family:Monospace">3 4</tspan><tspan
sodipodi:role="line"
x="116.79369"
y="458.03931"
id="tspan2166"
style="font-family:Monospace">5 6</tspan></text>
<text
xml:space="preserve"
style="font-size:12px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
x="180.75143"
y="506.09698"
id="text2184"><tspan
sodipodi:role="line"
id="tspan2186"
x="180.75143"
y="506.09698"
style="font-family:Monospace">1 2</tspan><tspan
sodipodi:role="line"
x="180.75143"
y="521.09698"
id="tspan2188"
style="fill:#0000ff;font-family:Monospace">1 2</tspan><tspan
sodipodi:role="line"
x="180.75143"
y="536.09698"
id="tspan2190"
style="fill:#0000ff;font-family:Monospace">1 2</tspan></text>
<text
xml:space="preserve"
style="font-size:12px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
x="150.42657"
y="577.06024"
id="text2192"><tspan
sodipodi:role="line"
id="tspan2194"
x="150.42657"
y="577.06024"
style="font-family:Monospace">2 4</tspan><tspan
sodipodi:role="line"
x="150.42657"
y="592.06024"
id="tspan2196"
style="font-family:Monospace">4 6</tspan><tspan
sodipodi:role="line"
x="150.42657"
y="607.06024"
id="tspan2198"
style="font-family:Monospace">6 8</tspan></text>
<text
xml:space="preserve"
style="font-size:12px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
x="180.81337"
y="428.06268"
id="text2200"><tspan
sodipodi:role="line"
x="180.81337"
y="428.06268"
id="tspan2206"
style="font-family:Monospace">1 2</tspan><tspan
sodipodi:role="line"
x="180.81337"
y="443.06268"
style="font-family:Monospace"
id="tspan2208" /><tspan
sodipodi:role="line"
x="180.81337"
y="458.06268"
style="font-family:Monospace"
id="tspan2210" /></text>
<text
xml:space="preserve"
style="font-size:12px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
x="156.64333"
y="442.89511"
id="text2216"><tspan
sodipodi:role="line"
x="156.64333"
y="442.89511"
id="tspan2218"
style="font-family:Monospace">+</tspan><tspan
sodipodi:role="line"
x="156.64333"
y="457.89511"
style="font-family:Monospace"
id="tspan2220" /><tspan
sodipodi:role="line"
x="156.64333"
y="472.89511"
style="font-family:Monospace"
id="tspan2222" /></text>
<text
xml:space="preserve"
style="font-size:6px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
x="106.13571"
y="465.37097"
id="text2224"><tspan
sodipodi:role="line"
x="106.13571"
y="465.37097"
id="tspan2226"
style="font-size:6px;font-family:Monospace">shape: (3, 2)</tspan><tspan
sodipodi:role="line"
x="106.13571"
y="472.87097"
style="font-size:6px;font-family:Monospace"
id="tspan2240">bcast: (F, F)</tspan><tspan
sodipodi:role="line"
x="106.13571"
y="480.37097"
style="font-size:6px;font-family:Monospace"
id="tspan2228" /><tspan
sodipodi:role="line"
x="106.13571"
y="487.87097"
style="font-size:6px;font-family:Monospace"
id="tspan2230" /></text>
<text
xml:space="preserve"
style="font-size:6px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
x="168.05223"
y="465.34521"
id="text2232"><tspan
sodipodi:role="line"
x="168.05223"
y="465.34521"
id="tspan2234"
style="font-size:6px;font-family:Monospace">shape: (1, 2)</tspan><tspan
sodipodi:role="line"
x="168.05223"
y="472.84521"
style="font-size:6px;font-family:Monospace"
id="tspan2242">bcast: (<tspan
style="fill:#0000ff"
id="tspan2244">T</tspan>, F)</tspan><tspan
sodipodi:role="line"
x="168.05223"
y="480.34521"
style="font-size:6px;font-family:Monospace"
id="tspan2236" /><tspan
sodipodi:role="line"
x="168.05223"
y="487.84521"
style="font-size:6px;font-family:Monospace"
id="tspan2238" /></text>
<path
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.5;stroke-linecap:butt;stroke-linejoin:miter;marker-end:url(#Arrow2Lend);stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 161.11933,479.10061 L 161.37187,491.98006"
id="path2248" />
<text
id="text3469"
y="506.03931"
x="116.79369"
style="font-size:12px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
xml:space="preserve"><tspan
style="font-family:Monospace"
y="506.03931"
x="116.79369"
id="tspan3471"
sodipodi:role="line">1 2</tspan><tspan
style="font-family:Monospace"
id="tspan3473"
y="521.03931"
x="116.79369"
sodipodi:role="line">3 4</tspan><tspan
style="font-family:Monospace"
id="tspan3475"
y="536.03931"
x="116.79369"
sodipodi:role="line">5 6</tspan></text>
<text
id="text3485"
y="520.89514"
x="156.64333"
style="font-size:12px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
xml:space="preserve"><tspan
style="font-family:Monospace"
id="tspan3487"
y="520.89514"
x="156.64333"
sodipodi:role="line">+</tspan><tspan
id="tspan3489"
style="font-family:Monospace"
y="535.89514"
x="156.64333"
sodipodi:role="line" /><tspan
id="tspan3491"
style="font-family:Monospace"
y="550.89514"
x="156.64333"
sodipodi:role="line" /></text>
<text
id="text3493"
y="543.37097"
x="106.13571"
style="font-size:6px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
xml:space="preserve"><tspan
style="font-size:6px;font-family:Monospace"
id="tspan3495"
y="543.37097"
x="106.13571"
sodipodi:role="line">shape: (3, 2)</tspan><tspan
id="tspan3497"
style="font-size:6px;font-family:Monospace"
y="550.87097"
x="106.13571"
sodipodi:role="line">bcast: (F, F)</tspan><tspan
id="tspan3499"
style="font-size:6px;font-family:Monospace"
y="558.37097"
x="106.13571"
sodipodi:role="line" /><tspan
id="tspan3501"
style="font-size:6px;font-family:Monospace"
y="565.87097"
x="106.13571"
sodipodi:role="line" /></text>
<text
id="text3503"
y="543.34521"
x="168.05223"
style="font-size:6px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
xml:space="preserve"><tspan
style="font-size:6px;font-family:Monospace"
id="tspan3505"
y="543.34521"
x="168.05223"
sodipodi:role="line">shape: (<tspan
style="fill:#0000ff"
id="tspan3515">3</tspan>, 2)</tspan><tspan
id="tspan3507"
style="font-size:6px;font-family:Monospace"
y="550.84521"
x="168.05223"
sodipodi:role="line">bcast: (<tspan
id="tspan3509"
style="fill:#0000ff">T</tspan>, F)</tspan><tspan
id="tspan3511"
style="font-size:6px;font-family:Monospace"
y="558.34521"
x="168.05223"
sodipodi:role="line" /><tspan
id="tspan3513"
style="font-size:6px;font-family:Monospace"
y="565.84521"
x="168.05223"
sodipodi:role="line" /></text>
<path
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.81574231;stroke-linecap:butt;stroke-linejoin:miter;marker-end:url(#Arrow2Lend);stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 209.4424,497.10811 L 209.6746,534.39419"
id="path3517" />
<text
id="text3519"
y="517.36304"
x="211.73936"
style="font-size:6px;font-style:normal;font-weight:normal;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Aharoni CLM"
xml:space="preserve"><tspan
id="tspan3523"
style="font-size:6px;font-family:Monospace"
y="517.36304"
x="211.73936"
sodipodi:role="line">broadcasted</tspan><tspan
id="tspan3525"
style="font-size:6px;font-family:Monospace"
y="524.86304"
x="211.73936"
sodipodi:role="line" /><tspan
id="tspan3527"
style="font-size:6px;font-family:Monospace"
y="532.36304"
x="211.73936"
sodipodi:role="line" /></text>
<path
id="path3533"
d="M 161.11933,553.10061 L 161.37187,565.98006"
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.5;stroke-linecap:butt;stroke-linejoin:miter;marker-end:url(#Arrow2Lend);stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1" />
</g>
</svg>
.. _libdoc_tensor:
==================================================
:mod:`tensor` -- types and ops for symbolic numpy
==================================================
.. module:: tensor
:platform: Unix, Windows
:synopsis: symbolic types and operations for n-dimensional arrays.
.. moduleauthor:: LISA
Theano's strength is in expressing symbolic calculations involving tensors.
There are many types of symbolic expressions for tensors. For everyone's
sanity, they are grouped into the following sections:
.. toctree::
:maxdepth: 1
basic
shared_randomstreams
nnet
signal
.. _libdoc_tensor_nnet:
======================================================
:mod:`nnet` -- Ops for neural networks
======================================================
.. module:: nnet
:platform: Unix, Windows
:synopsis: Ops for neural networks
.. moduleauthor:: LISA
.. function:: sigmoid(x)
Returns the standard sigmoid nonlinearity applied to x
:Parameters: *x* - symbolic Tensor (or compatible)
:Return type: same as x
:Returns: element-wise sigmoid: :math:`sigmoid(x) = \frac{1}{1 + \exp(-x)}`.
Example:
.. code-block:: python
x,b = T.dvectors('x','b')
W = T.dmatrix('W')
y = T.nnet.sigmoid(T.dot(W,x) + b)
.. note:: The underlying code will return an exact 0 or 1 if an element of x is too small or too big.
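A plain-Python stand-in shows the saturation the note mentions: in double precision the result rounds to exactly 1.0 once ``exp(-x)`` drops below machine epsilon (this sketch is not Theano's implementation).

```python
import math

def sigmoid(x):
    # elementwise logistic function, 1 / (1 + exp(-x))
    return 1.0 / (1.0 + math.exp(-x))

assert sigmoid(0.0) == 0.5
# exp(-40) ~ 4e-18 is below double-precision epsilon, so 1 + exp(-40) == 1.0
assert sigmoid(40.0) == 1.0
```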
.. function:: softplus(x)
Returns the softplus nonlinearity applied to x
:Parameter: *x* - symbolic Tensor (or compatible)
:Return type: same as x
:Returns: elementwise softplus: :math:`softplus(x) = \log_e{\left(1 + \exp(x)\right)}`.
.. note:: The underlying code will return an exact 0 if an element of x is too small.
.. code-block:: python
x, b = T.dvectors('x', 'b')
W = T.dmatrix('W')
y = T.nnet.softplus(T.dot(W,x) + b)
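The same formula in plain numpy, together with the identity softplus(x) - softplus(-x) = x as a quick check (illustration only, not Theano code):

```python
import numpy as np

def softplus(x):
    # log1p(exp(x)) computes log(1 + exp(x)) with better accuracy near 0
    return np.log1p(np.exp(np.asarray(x, dtype='float64')))

print(softplus(0.0))  # log(2), about 0.6931
```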
.. function:: softmax(x)
Returns the softmax function of x:
:Parameter: *x* - symbolic **2D** Tensor (or compatible).
:Return type: same as x
:Returns: a symbolic 2D tensor whose ijth element is :math:`softmax_{ij}(x) = \frac{\exp{x_{ij}}}{\sum_k\exp(x_{ik})}`.
The softmax function will, when applied to a matrix, compute the softmax values row-wise.
.. code-block:: python
x, b = T.dvectors('x', 'b')
W = T.dmatrix('W')
y = T.nnet.softmax(T.dot(W,x) + b)
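The row-wise behaviour can be illustrated in plain numpy (illustrative only; subtracting the row maximum is the usual guard against overflow):

```python
import numpy as np

def softmax(x):
    # row-wise softmax of a 2D array
    x = np.asarray(x, dtype='float64')
    e = np.exp(x - x.max(axis=1, keepdims=True))  # overflow guard
    return e / e.sum(axis=1, keepdims=True)

m = softmax(np.array([[1.0, 2.0, 3.0],
                      [0.0, 0.0, 0.0]]))
print(m.sum(axis=1))  # each row sums to 1
```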
.. function:: binary_crossentropy(output,target)
Computes the binary cross-entropy between a target and an output:
:Parameters:
* *target* - symbolic Tensor (or compatible)
* *output* - symbolic Tensor (or compatible)
:Return type: same as target
:Returns: a symbolic tensor, where the following is applied elementwise :math:`crossentropy(t,o) = -(t\cdot log(o) + (1 - t) \cdot log(1 - o))`.
The following block implements a simple auto-associator with a sigmoid
nonlinearity and a reconstruction error which corresponds to the binary
cross-entropy (note that this assumes that x will contain values between 0 and
1):
.. code-block:: python
x, b, c = T.dvectors('x', 'b', 'c')
W = T.dmatrix('W')
V = T.dmatrix('V')
h = T.nnet.sigmoid(T.dot(W,x) + b)
x_recons = T.nnet.sigmoid(T.dot(V,h) + c)
recon_cost = T.nnet.binary_crossentropy(x_recons,x).mean()
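The elementwise formula itself is easy to verify in plain numpy (illustration of the math, not the Theano Op):

```python
import numpy as np

def binary_crossentropy(output, target):
    # elementwise: -(t * log(o) + (1 - t) * log(1 - o))
    output = np.asarray(output, dtype='float64')
    target = np.asarray(target, dtype='float64')
    return -(target * np.log(output) + (1.0 - target) * np.log(1.0 - output))

# predicting 0.5 for a target of 1 costs log(2) nats
print(binary_crossentropy([0.5], [1.0]))
```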
.. function:: categorical_crossentropy(coding_dist,true_dist)
Return the cross-entropy between an approximating distribution and a true distribution.
The cross entropy between two probability distributions measures the average number of bits
needed to identify an event from a set of possibilities, if a coding scheme is used based
on a given probability distribution q, rather than the "true" distribution p. Mathematically, this
function computes :math:`H(p,q) = - \sum_x p(x) \log(q(x))`, where
p=true_dist and q=coding_dist.
:Parameters:
* *coding_dist* - symbolic 2D Tensor (or compatible). Each row
represents a distribution.
* *true_dist* - symbolic 2D Tensor **OR** symbolic vector of ints. In
the case of an integer vector argument, each element represents the
position of the '1' in a 1-of-N encoding (aka "one-hot" encoding)
:Return type: tensor of rank one-less-than `coding_dist`
.. note:: An application of the scenario where *true_dist* has a 1-of-N representation
is in classification with softmax outputs. If `coding_dist` is the output of
the softmax and `true_dist` is a vector of correct labels, then the function
will compute ``y_i = - \log(coding_dist[i, one_of_n[i]])``, which corresponds
to computing the neg-log-probability of the correct class (which is typically
the training criterion in classification settings).
.. code-block:: python
y = T.nnet.softmax(T.dot(W,x) + b)
cost = T.nnet.categorical_crossentropy(y,o)
# o is either the above-mentioned 1-of-N vector or 2D tensor
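The equivalence stated in the note (with an integer label vector, the result is just the neg-log-probability of the correct class) can be checked in plain numpy. This is an illustration of the math, not Theano code:

```python
import numpy as np

def categorical_crossentropy(coding_dist, one_of_n):
    # -log of the probability assigned to the correct class, per row
    rows = np.arange(coding_dist.shape[0])
    return -np.log(coding_dist[rows, one_of_n])

q = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
# the same thing as the full H(p, q) with one-hot rows p
p = np.eye(3)[labels]
full = -(p * np.log(q)).sum(axis=1)
print(categorical_crossentropy(q, labels), full)
```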
.. _libdoc_tensor_shared_randomstreams:
======================================================
:mod:`shared_randomstreams` -- Friendly random numbers
======================================================
.. module:: shared_randomstreams
:platform: Unix, Windows
:synopsis: symbolic random variables
.. moduleauthor:: LISA
Guide
=====
Since Theano uses a functional design, producing pseudo-random numbers in a
graph is not quite as straightforward as it is in numpy. If you are using Theano's
shared variables, then a `RandomStreams` object is probably what you want. (If you are
using Module then this tutorial will be useful but not exactly what you want.
Have a look at the :api:`RandomFunction` Op.)
The way to think about putting randomness into Theano's computations is to
put random variables in your graph. Theano will allocate a numpy RandomState
Here, 'rv_u' represents a random stream of 2x2 matrices of draws from a uniform
distribution. Likewise, 'rv_n' represents a random stream of 2x2 matrices of
draws from a normal distribution. The distributions that are implemented are
defined in :class:`RandomStreams`.
Now let's use these things. If we call f(), we get random uniform numbers.
Since we are updating the internal state of the random number generator (via
>>> nearly_zeros = function([], rv_u + rv_u - 2 * rv_u, updates=[rv_u.update])
Seeding Streams
---------------
You can seed just one random variable by seeding or assigning to the
``.rng.value`` attribute of that variable:
>>> rv_u.rng.value.seed(89234)  # seeds the generator for rv_u
You can also seed *all* of the random variables allocated by a :class:`RandomStreams`
object by that object's ``seed`` method. This seed will be used to seed a
temporary random number generator, that will in turn generate seeds for each
of the random variables.
>>> srng.seed(902340)  # seeds rv_u and rv_n with different seeds each
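The two-level scheme (one meta seed producing a seed per stream) can be mimicked with plain numpy ``RandomState`` objects. The ``seed_streams`` helper below is made up for illustration; it is not part of Theano:

```python
import numpy as np

def seed_streams(meta_seed, n_streams):
    # the meta seed drives a temporary generator, which in turn
    # yields one seed per stream
    tmp = np.random.RandomState(meta_seed)
    return [np.random.RandomState(int(tmp.randint(2 ** 30)))
            for _ in range(n_streams)]

rng_u, rng_n = seed_streams(902340, 2)
```

Re-running ``seed_streams`` with the same meta seed reproduces every stream.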
Sharing Streams between Functions
---------------------------------
>>> v2 = f()  # v2 != v1
Reference
=========
.. class:: RandomStreams(object)
This is a symbolic stand-in for ``numpy.random.RandomState``. It has
methods such as `uniform` and `normal` that return symbolic random variables.
.. method:: updates()
:returns: a list of all the (state, new_state) update pairs from the
random variables it has returned. This can be a convenient shortcut
to enumerating all the random variables in a large graph in the
``updates`` parameter of ``function``.
.. method:: seed(meta_seed)
`meta_seed` will be used to seed a temporary random number generator,
that will in turn generate seeds for each of the random variables that
have been created by this object.
:returns: None
.. method:: binomial(self, size, n=1, p=0.5)
Symbolic stand-in for numpy.random.RandomState.binomial
:returns: :class:`RandomVariable` of float64 that will have `shape==size` at run-time.
.. method:: uniform(self, size, low=0.0, high=1.0)
Symbolic stand-in for numpy.random.RandomState.uniform
:returns: :class:`RandomVariable` of float64 that will have `shape==size` at run-time.
.. method:: normal(self, size, loc=0.0, std=1.0)
Symbolic stand-in for numpy.random.RandomState.normal
:returns: :class:`RandomVariable` of float64 that will have `shape==size` at run-time.
.. method:: random_integers(self, size, low=0, high=1)
Symbolic stand-in for numpy.random.RandomState.random_integers
:returns: :class:`RandomVariable` of int64 that will have `shape==size` at run-time.
.. class:: RandomVariable(object)
.. attribute:: rng
The shared variable whose ``.value`` is the numpy RandomState
generator feeding this random variable.
.. attribute:: update
A pair whose first element is a shared variable whose value is a numpy
RandomState, and whose second element is a symbolic expression for the
next value of that RandomState after drawing samples.
Including this pair in the ``updates`` list passed to ``function`` will cause the
function to update the random number generator feeding this variable.
.. _libdoc_tensor_raw_random:
=============================================
:mod:`raw_random` -- Low-level random numbers
=============================================
.. module:: raw_random
:platform: Unix, Windows
:synopsis: symbolic random variables
.. moduleauthor:: LISA
Raw random provides the random-number drawing functionality that underlies
the :class:`RandomStreams` interface.
Reference
=========
.. class:: RandomStateType(gof.Type)
A `Type` for variables that will take ``numpy.random.RandomState`` values.
.. class:: RandomFunction(gof.Op)
Op that draws random numbers from a numpy.RandomState object. This Op is
parametrized to draw numbers from many distributions.
.. function:: random_function(fn, dtype, *rfargs, **rfkwargs)
Returns a wrapper around RandomFunction which automatically infers the number
of dimensions of the output from the given shape. If the shape cannot be inferred,
the user can give an integer as first argument, which will be interpreted as the
number of dimensions.
If the distribution is not scalar (e.g., a multinomial), the output will have
more dimensions than what the shape argument suggests. The ``ndim_added`` keyword
argument allows one to specify how many dimensions to add (for a multinomial, 1).
The number of dimensions for the following shape arguments can be inferred:
* shape(x)
* make_lvector(x, y, z, ...)
* ndarrays, constants
.. function:: uniform(random_state, size, low=0.0, high=1.0)
Sample from a uniform distribution between low and high.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: binomial(random_state, size, n=1, p=0.5)
Sample n times with probability of success ``p`` for each trial, and
return the number of successes.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: normal(random_state, size, avg=0.0, std=1.0)
Sample from a normal distribution centered on avg with
the specified standard deviation (std).
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: random_integers(random_state, size, low=0, high=1)
Sample a random integer between low and high, both inclusive.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: permutation(random_state, size, n=1)
Returns permutations of the integers between 0 and n-1, as many times
as required by size. For instance, if size=(p,q), p*q permutations
will be generated, and the output shape will be (p,q,n), because each
permutation is of size n.
If the size argument is ambiguous on the number of dimensions, the first
argument may be a plain integer i, which should correspond to len(size).
Note that the output will then be of dimension i+1.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: multinomial(random_state, size, pvals=[0.5, 0.5])
Sample from a multinomial distribution defined by probabilities pvals,
as many times as required by size. For instance, if size=(p,q), p*q
samples will be drawn, and the output shape will be (p,q,len(pvals)).
If the size argument is ambiguous on the number of dimensions, the first
argument may be a plain integer i, which should correspond to len(size).
Note that the output will then be of dimension i+1.
:returns: :class:`RandomVariable`, NewRandomState
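numpy's own ``multinomial`` follows the same shape rule, which makes the extra trailing dimension easy to see (a plain-numpy illustration, not the Theano Op):

```python
import numpy as np

rng = np.random.RandomState(42)
pvals = [0.5, 0.5]
samples = rng.multinomial(1, pvals, size=(3, 4))  # p=3, q=4 -> 12 draws
print(samples.shape)  # (3, 4, 2): one extra dimension of length len(pvals)
```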
.. _libdoc_tensor_signal:
======================================================
:mod:`signal` -- Signal processing
======================================================
.. module:: signal
:platform: Unix, Windows
:synopsis: ops for signal processing
.. moduleauthor:: LISA
TODO: Give examples for how to use these things! They are pretty complicated.
.. function:: conv2D(*todo)
.. function:: downsample2D(*todo)
.. function:: fft(*todo)
.. _proposals:
==================================
Proposals for new/revised features
==================================
.. toctree::
:maxdepth: 1
pfunc
noupdates
=================
Automatic updates
=================
.. note::
    Proposed 2010-01-13
The Module version of RandomStreams could arrange for the automatic update of
certain inputs (such as the random number generators) at the time of make(), so
that certain *obvious* patterns would work:
>>> rs = RandomStreams()
>>> u = rs.uniform(...)
>>> f = theano.function([], u)
>>> assert not numpy.all(f() == f())
Unfortunately, with shared variables this does not work! Function needs to be
told which shared variables to update. The current workaround is to do this:
>>> theano.function([], u, updates=rs.updates())
or this:
>>> theano.function([], u, updates=[u.update])
But it is all too easy to forget to do either of these workarounds, and
accidentally run a program whose random numbers are the same in every call.
Proposal
========
Add an optional `default_update` attribute to Shared variables. This will be
consulted by function. If no update expression is given for this variable in
the updates list, then this default will be inserted. Note well: a value of None for the
default_update means to update with a value of None! To have no default update,
make sure that the default_update attribute is not defined.
Add an optional argument to function: `no_default_updates`. This argument defaults to
False, which results in the current semantics.
A True value here would mean "ignore all default_update expressions", and this
would be useful for disabling implicit behaviour.
A list of shared variables here would mean to ignore the
``default_update`` expressions of these specific variables.
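One way to picture the proposed lookup rule is a sketch in plain Python; every name here is illustrative, not actual Theano internals:

```python
def resolve_updates(shared_vars, explicit_updates, no_default_updates=False):
    # merge explicit updates with per-variable default_update attributes
    updates = dict(explicit_updates)
    for v in shared_vars:
        if v in updates:
            continue  # an explicit update always wins
        if no_default_updates is True:
            continue  # all defaults disabled
        if isinstance(no_default_updates, list) and v in no_default_updates:
            continue  # default disabled for this specific variable
        if hasattr(v, 'default_update'):
            updates[v] = v.default_update  # may legitimately be None
    return updates

class Shared(object):
    def __init__(self, name):
        self.name = name

rng = Shared('rng')
rng.default_update = 'new_rng_expr'
w = Shared('w')
print(resolve_updates([rng, w], {}))  # only rng gets its default update
```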
Alternatives
============
Consider a singleton 'NOUPDATE' object that can be used as a pseudo-expression
in the update list. This doesn't introduce a new keyword argument, which makes
it slightly more awkward to document in theano.function. Really though, I have
no strong feelings between this and the ``no_default_updates`` parameter.
=============================================
Proposal for pfunc Function Interface [DONE]
=============================================
.. note::
Besides cleanup code, all code has access to the %(fail)s template. For three code blocks, the generated C code will pretty much look like this:
.. code-block:: c

    int failure = 0;
    {
        <code1>
        {
            <code2>
            {
                <code3>
                label3:
                <cleanup3>
            }
            label2:
            <cleanup2>
        }
        label1:
        <cleanup1>
    }
    return failure;
And %(fail)s in the nth code block will take the value "{failure = n; goto label<n>;}". This means only the blocks executed up to the failure point are cleaned up and the return value indicates which block failed, which is handy for debugging.
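The substitution itself is ordinary Python %-formatting over the code templates. A minimal sketch, with made-up variable names:

```python
# a made-up code template in the style described above
code_template = """
{
    if (%(x)s < 0) %(fail)s
    %(z)s = sqrt(%(x)s);
}
"""

n = 2  # suppose this is the 2nd code block
filled = code_template % {
    'x': 'V3',
    'z': 'V5',
    # what the framework substitutes for %(fail)s in block n
    'fail': '{failure = %i; goto label%i;}' % (n, n),
}
print(filled)
```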
When compiling an Op, we want to sync the outputs so we can get the results from Python. In case of failure, we will not necessarily want to sync. Because of that, typical code will look like this:
.. code-block:: c

    int failure = 0;
    <declare input>
    <declare output>
    {
        <extract input>
        {
            <extract output>
            {
                <perform>
                label3:
                <clean up perform>
            }
            label2:
            if (!failure)
                <sync output>
            <clean up output>
        }
        label1:
        <clean up input>
    }
    return failure;
Furthermore, it is not necessary to extract the output because we mean to overwrite it anyway. In that case, <extract output> will be a no-op, but of course we may still need to clean up or sync what <perform> will put in the declared outputs.
Example ResultBase
==================
The following ResultBase represents a double (we only care about the C part).
.. code-block:: python

    class Double(ResultBase):
        <snip>
        def c_declare(self):
            return "double %(name)s;"
        def c_init(self):
            return "%(name)s = 0.0;"
        def c_extract(self):
            return "%(name)s = PyFloat_AsDouble(py_%(name)s);"
        def c_cleanup(self):
            return "" # nothing to do
        def c_sync(self):
            return "Py_XDECREF(py_%(name)s); py_%(name)s = PyFloat_FromDouble(%(name)s);"
Example Op
==========
The following Op represents addition of two nonnegative doubles (we only care about the C part).
.. code-block:: python

    class Add(Op):
        <snip>
        def c_var_names(self):
            return "[['x', 'y'], ['z']]"
        def c_validate_update(self):
            return "if (%(x)s < 0 || %(y)s < 0) %(fail)s" # fail if x or y is negative
        def c_validate_update_cleanup(self):
            return "" # nothing to do
        def c_code(self):
            return "%(z)s = %(x)s + %(y)s;"
        def c_code_cleanup(self):
            return "" # nothing to do
Generating a C function
=======================
For the example Op, the generated C function will typically look like this:
.. code-block:: c

    int add(PyObject* storage_x, PyObject* storage_y, PyObject* storage_z) {
        PyObject* py_x = PyList_GET_ITEM(storage_x, 0); Py_XINCREF(py_x); // automatic
        PyObject* py_y = PyList_GET_ITEM(storage_y, 0); Py_XINCREF(py_y); // automatic
        PyObject* py_z = Py_None; // we don't care what's currently in storage_z
        int failure = 0;
        double x; // x.c_declare
        double y; // y.c_declare
        double z; // z.c_declare
        {
            x = PyFloat_AsDouble(py_x); // x.c_extract
            {
                y = PyFloat_AsDouble(py_y); // y.c_extract
                {
                    // we don't need to use z.c_extract
                    {
                        if (x < 0 || y < 0) { // add.validate_update
                            // This is automatically inserted in place of %(fail)s
                            failure = 4;
                            goto label_add_validate_update_cleanup;
                        }
                        {
                            z = x + y; // add.c_code
                            label_add_code_cleanup: ;
                        }
                        label_add_validate_update_cleanup: ;
                    }
                    label_z_sync_or_cleanup:
                    if (!failure) {
                        Py_XDECREF(py_z); // z.c_sync
                        py_z = PyFloat_FromDouble(z); // z.c_sync, the result is now available from Python!
                        PyList_SET_ITEM(storage_z, 0, py_z); // always done after _.c_sync
                    }
                    Py_XDECREF(py_z); // always done after _.c_cleanup
                }
                label_y_cleanup:
                Py_XDECREF(py_y); // always done after _.c_cleanup
            }
            label_x_cleanup:
            Py_XDECREF(py_x); // always done after _.c_cleanup
        }
        return failure;
    }
Generating a C struct
=====================
To accelerate processing a tad, a struct can be generated instead of a function.
Here is a sketch of the struct equivalent of the previous function:
.. code-block:: c

    struct add {
        PyObject* storage_x;
        PyObject* storage_y;
        PyObject* storage_z;
        double z; // z.c_declare
        void init(PyObject* storage_x, PyObject* storage_y, PyObject* storage_z) {
            <set the struct members of the same names>
            <init the struct members corresponding to z>
        }
        void cleanup(void) {
            <cleanup z>
        }
        void run(void) {
            <same code as before minus z's cleanup>
        }
        add() { this->init(); }
        ~add() { this->cleanup(); }
    };
Advantages of using a struct:
* Can be run several times even if we provide the storage only once.
Question: does it make sense to apply the order to the loop, or is this broadcas
Here is the loop for {{{order == c}}}. Check for errors!
.. code-block:: c

    <initialize iterators>
    i1 = -1
    while (++i1 < dim1) {
        i2 = -1
        rank_N-1_accumulator = init
        while (++i2 < dim2) {
            ...
            iN = -1
            while (++iN < dimN) {
                <accumulate rank N input>
                <SET rank N output using broadcasted inputs>
                <NEXT rank N iterator>
            }
            ...
        }
        <SET rank 1 output using accumulated inputs>
        <NEXT rank 1 iterator>
    }
When {{{order == f}}}, the iterators ''ideally'' (but not necessarily) iterate in FORTRAN order, i.e. the while loops are on {{{dimN..dim1}}} instead of {{{dim1..dimN}}}.
.. _module:
######
Module
######
What is a Theano Module
=======================
A Theano ``Module`` is a structure which implements what could be called a
"Theano class". A ``Module`` can contain ``Members``, which act like
instance variables ("state"). It can also contain an arbitrary number
of ``Methods``, which are functions that share the same ``Members`` in
addition to their own inputs. Last but not least, ``Modules`` can be
nested (explanations and examples follow). ``Module`` is meant to:
#. ease the sharing of variables between several Theano functions,
#. streamline automatic naming, and
#. allow a hierarchy of "modules" whose states can interact.
.. _moduleinterface:
import ================
====== Module Interface
================
All examples suppose that you have done those import:
.. code-block:: python
#!/usr/bin/env python A Theano Module is like Theano's version of a file.
import theano When you instantiate a ``Module()``, you are creating a blank file.
import numpy as N Into this file you can put both symbolic and non-symbolic objects.
from theano import tensor as T Non-symbolic objects are like constants (technically literals) in the file.
from theano.tensor import nnet as NN Symbolic objects are like variables and functions.
from theano.compile import module as M The functions in a Module are called Methods.
The variables in a Module (and submodules) are global.
Module Methods have access to all these global variables.
Module To use a Module, you need to compile it.
====== This is done by calling `Module.make()`.
The result of compiling a Module is a ModuleInstance, this is the compiled
version of your Theano file.
In the ModuleInstance, your symbolic variables have become containers (containing None),
and your Methods have become callable functions.
You should initialize the symbolic variables by calling
``ModuleInstance.initialize()`` (although make() will call it for you,
on the top-level ModuleInstance.)
A ``Module`` can contain ``Members``, ``Methods`` and inner ``Modules``. Each type has a special meaning. You can compile a Module several times, to create multiple ModuleInstances.
Each of these will have its own copy of all program literals.
.. code-block:: python
module = M.Module()
``Member`` Module Graph
------------ ------------
Usage: Components can be grouped into a directed graph.
When we call `make`, this graph is replicated with ComponentInstances instead of
Components. Wheras Components are represent symbolic things (ie. Variables), ComponentInstances represent non-symbolic ones (ie. sparse matrices, ndarrays, callable functions).
.. code-block:: python
#module.state = variable .. index::
module.state = T.scalar() single: Component
single: component; Component
A ``Member`` represents a state variable (i.e., whose value remains after a ``Method`` is called). It will be named automatically after that field and it will be an implicit input of all ``Methods`` of the ``Module``. Its storage (i.e. where the value is stored) will be shared by all ``Methods`` of the ``Module``. .. _component:
---------
Component
---------
A ``Variable`` which is the variable of a previous computation (by opposition to being ``updated``) is not a ``Member``. Internally this is called an External. You should not need to care about this. All of the elements of what is called the "module system" or "modules" are
components.
For sharing state between modules, see ``Inner Module`` section. A component subclass is represents a symbolic theano thing, and implements the
``build`` function.
The ``build`` function is responsible for converting the symbolic thing into a
non-symbolic thing.
``Method``
------------
Usage: Compiling with make
-------------------
.. code-block:: python Conversion from a Component graph to a ComponentInstance graph is performed by `Component.make`.
This method traverses the Component graph in multiple passes.
module.method = M.Method(inputs, outputs, **updates) In the first pass (the allocate pass), it creates storage for all Variables that are contained in the graph (see
`Component.allocate`). These are the module variables.
Each key in the updates dictionary must be the name of an existing ``Member`` of the ``Module`` and the value associated to that key is the update expression for the state. When called on a ``ModuleInstance`` produced by the ``Module``, the method will calculate the outputs from the inputs and will update all the states as specified by the update expressions. See the basic example below. In the second pass (the build pass), it creates functions that (in general) operate on these module variables.
This pass also serves to construct all ComponentInstance-derived instances as well, such as
`ModuleInstance`s. The objects that are returned from this second pass are the return value of
`Component.make`.
Inner Module In the third pass (the initialize pass), is optional and not necessarily recursive through the
------------ graph.
The purpose of the third pass is to call the initialize method of the ComponentInstances built
during the second pass.
During this pass the ComponentInstance graph is complete. It is a good time to fill storage
allocated in phase 1 with sensible values.
.. index::
   single: External
   single: component; External

.. _external:

--------
External
--------

WRITEME

.. index::
   single: Member
   single: component; Member

.. _member:

------
Member
------

WRITEME

.. index::
   single: Method
   single: component; Method

.. _method:

------
Method
------

WRITEME

.. index::
   single: Module
   single: component; Module

.. _module:

------
Module
------

A Module instance can contain objects as attributes.
This makes it something like a class, in the way that Method is
analogous to a function.

A Module is meant to contain Components.
Attributes which are not Components themselves must at least be transformable
into Components by :api:`compile.module.wrap`. If a Module contains something
that is not convertible into a Component, then it is not possible to compile
that Module with ``make``.

Old Text
--------
In the Module system, the analog of the file is the `Module`, the analog of the function is the
`Method`, and the analog of the variable is the `Member`. Module, Member, and Method all work
at the symbolic level. Once a graph of Modules, Members, and Methods is ready for use, it must
be compiled with a call to `make` which will return an isomorphic structure in which Modules
have become `ModuleInstances`, Members have become `Container`s, and Methods have become
`Function`s.
This structure contains numbers and functions, and is ready for computation.
.. _module:
######
Module
######
What is a Theano Module
=======================
A Theano ``Module`` is a structure which implements what could be called a
"Theano class". A ``Module`` can contain ``Members``, which act like
instance variables ("state"). It can also contain an arbitrary number
of ``Methods``, which are functions that share the same ``Members`` in
addition to their own inputs. Last but not least, ``Modules`` can be
nested (explanations and examples follow). ``Module`` is meant to:
#. ease the sharing of variables between several Theano functions,
#. streamline automatic naming, and
#. allow a hierarchy of "modules" whose states can interact.
Imports
=======

All examples assume that you have done these imports:

.. code-block:: python

    import theano
    import numpy as N
    from theano import tensor as T
    from theano.tensor import nnet as NN
    from theano.compile import module as M
Module
======
A ``Module`` can contain ``Members``, ``Methods`` and inner ``Modules``. Each type has a special meaning.
.. code-block:: python

    module = M.Module()
``Member``
------------
Usage:
.. code-block:: python

    # module.state = variable
    module.state = T.scalar()

A ``Member`` represents a state variable (i.e., one whose value persists after a ``Method`` is called). It will be named automatically after that field and it will be an implicit input of all ``Methods`` of the ``Module``. Its storage (i.e., where the value is stored) will be shared by all ``Methods`` of the ``Module``.

A ``Variable`` which is the result of a previous computation (as opposed to being ``updated``) is not a ``Member``. Internally this is called an External. You should not need to care about this.
For sharing state between modules, see the ``Inner Module`` section.
``Method``
------------
Usage:
.. code-block:: python

    module.method = M.Method(inputs, outputs, **updates)

Each key in the updates dictionary must be the name of an existing ``Member`` of the ``Module``, and the value associated with that key is the update expression for that state. When called on a ``ModuleInstance`` produced by the ``Module``, the method will compute the outputs from the inputs and update all the states as specified by the update expressions. See the basic example below.
Inner Module
------------
To share a ``Member`` between modules, the modules must be linked through the inner module mechanism.
Usage:
.. code-block:: python

    module2.submodule = module
``ModuleInstance``
====================
A ``Module`` can produce a ``ModuleInstance`` with its ``make`` method. Think of the relationship between them as that between a class and an object in C++/Java. If an attribute was a ``Member``, it becomes read/write access to the actual data for that state. If it was a ``M.Method``, a function will be compiled with the proper signature and semantics.
Module Interface
================
.. code-block:: python

    def make(self, mode = {'FAST_COMPILE', 'FAST_RUN', ... }, **init)

``make`` compiles all ``Methods`` and allocates storage for all ``Members`` into a ``ModuleInstance`` object, which is returned. The ``init`` dictionary can be used to provide initial values for the members.
.. code-block:: python

    def resolve(self, symbol, filter = None)

Resolves a symbol in this module. The symbol can be a string or a ``Variable``. If the string contains dots (e.g. ``"x.y"``), the module will resolve the symbol hierarchically in its inner modules. The filter argument is None or a class; it can be used to restrict the search to, for example, ``Member`` or ``Method`` instances.
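The hierarchical lookup can be sketched in plain Python. This illustrates the dotted-name semantics only; it is not the real implementation, and ``Namespace`` is a hypothetical stand-in for a module with inner modules:

```python
class Namespace(object):
    """Stand-in for a Module with inner modules as attributes."""
    pass

def resolve(module, symbol, filter=None):
    # Walk "x.y" one attribute at a time into inner modules.
    obj = module
    for part in symbol.split('.'):
        obj = getattr(obj, part)
    # A filter class restricts what kind of object may be returned.
    if filter is not None and not isinstance(obj, filter):
        raise TypeError('%s does not resolve to a %s'
                        % (symbol, filter.__name__))
    return obj

# Usage: resolve a dotted name through one level of nesting.
outer = Namespace()
outer.x = Namespace()
outer.x.y = 3
```

With this setup, ``resolve(outer, 'x.y')`` walks into the inner namespace and returns ``3``, while passing ``filter=str`` would raise a ``TypeError``.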
.. code-block:: python

    def _instance_initialize(self, inst, **init)

The inst argument is a ``ModuleInstance``. For each key, value pair in init, ``setattr(inst, key, value)`` is called. This can easily be overridden by ``Module`` subclasses to initialize an instance in different ways. If you don't know what to put there, don't define it and a default version will execute. If you want to call the parent version, call ``M.default_initialize(inst, **init)``.
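The default behaviour described above amounts to the following small sketch (illustrative only; ``Instance`` is a hypothetical stand-in for a ``ModuleInstance``):

```python
class Instance(object):
    """Stand-in for a ModuleInstance."""
    pass

def default_initialize(inst, **init):
    # For each key/value pair in init, set the matching attribute
    # on the instance.
    for key, value in init.items():
        setattr(inst, key, value)

inst = Instance()
default_initialize(inst, c=0, stepsize=0.1)
```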
Basic example
=============
The problem here is to create two functions, ``inc`` and ``dec``, and a shared state ``c`` such that ``inc(n)`` increases ``c`` by ``n`` and ``dec(n)`` decreases ``c`` by ``n``. We also want a third function, ``plus10``, which returns 10 plus the current state, without changing it. Using the function interface, the feature can be implemented as follows:
.. code-block:: python

    n, c = T.scalars('nc')
    inc = theano.function([n, ((c, c + n), 0)], [])
    # we need to pass inc's container in order to share the state
    dec = theano.function([n, ((c, c - n), inc.container[c])], [])
    plus10 = theano.function([(c, inc.container[c])], c + 10)

    assert inc[c] == 0
    inc(2)
    assert inc[c] == 2 and dec[c] == inc[c]
    dec(3)
    assert inc[c] == -1 and dec[c] == inc[c]
    assert plus10() == 9
Now, using ``Module``:
.. code-block:: python

    m = M.Module()
    n = T.scalar('n')
    m.c = T.scalar()  # state variable
    m.inc = M.Method(n, [], updates = {m.c: m.c + n})  # m.c <= m.c + n
    m.dec = M.Method(n, [], updates = {m.c: m.c - n})  # m.c <= m.c - n
    # m.dec = M.Method(n, [], updates = {c: m.c - n})  # wrong: there is no global c
    # m.plus10 does not update the state
    m.plus10 = M.Method([], m.c + 10)  # m.c is always accessible since it is a member of this class

    inst = m.make(c = 0)  # here, we make an "instance" of the module with c initialized to 0
    assert inst.c == 0
    inst.inc(2)
    assert inst.c == 2
    inst.dec(3)
    assert inst.c == -1
    assert inst.plus10() == 9
Benefits of ``Module`` over ``function`` in this example:

* There is no need to manipulate the containers directly.
* The fact that ``inc`` and ``dec`` share a state is syntactically obvious.
* ``Method`` does not require the states to appear anywhere in the input list.
* The interface of the instance produced by ``m.make()`` is simple and coherent, extremely similar to that of a normal Python object. It is directly usable by any user.
Nesting example
===============
The problem now is to create two pairs of ``inc``/``dec`` functions and a function ``sum`` that adds the shared states of the first and second pairs.

Using ``function``:
.. code-block:: python

    def make_incdec_function():
        n, c = T.scalars('nc')
        inc = theano.function([n, ((c, c + n), 0)], [])
        # inc and dec share the same state
        dec = theano.function([n, ((c, c - n), inc.container[c])], [])
        return inc, dec

    inc1, dec1 = make_incdec_function()
    inc2, dec2 = make_incdec_function()
    a, b = T.scalars('ab')
    sum = theano.function([(a, inc1.container['c']), (b, inc2.container['c'])], a + b)

    inc1(2)
    dec1(4)
    inc2(6)
    assert inc1['c'] == -2 and inc2['c'] == 6
    assert sum() == 4  # -2 + 6
Using ``Module``:
.. code-block:: python

    def make_incdec_module():
        m = M.Module()
        n = T.scalar('n')
        m.c = T.scalar()  # state variable
        m.inc = M.Method(n, [], updates = {m.c: m.c + n})  # m.c <= m.c + n
        m.dec = M.Method(n, [], updates = {m.c: m.c - n})  # m.c <= m.c - n
        return m

    m = M.Module()
    m.incdec1 = make_incdec_module()
    m.incdec2 = make_incdec_module()
    m.sum = M.Method([], m.incdec1.c + m.incdec2.c)

    inst = m.make(incdec1 = dict(c=0), incdec2 = dict(c=0))
    inst.incdec1.inc(2)
    inst.incdec1.dec(4)
    inst.incdec2.inc(6)
    assert inst.incdec1.c == -2 and inst.incdec2.c == 6
    assert inst.sum() == 4  # -2 + 6
Here, we make a new ``Module`` and give it two inner ``Modules`` like
the one defined in the basic example. Each inner module has methods ``inc``
and ``dec`` as well as a state ``c``, and their states are directly accessible
from the outer module, which means that it can define methods using them. The
instance (``inst``) we make from the ``Module`` (``m``) reflects the hierarchy
that we created. Unlike the version using ``function``, there is no need to
manipulate any containers directly.
Advanced example
================
Complex models can be implemented by subclassing ``Module`` (though that is not mandatory). Here is a complete, extensible (and working) regression model implemented using this system:
.. literalinclude:: ../code/regression.py
Here is how we use the model:
.. code-block:: python

    data_x = N.random.randn(4, 10)
    data_y = [[int(x)] for x in N.random.randn(4) > 0]

    model = SoftmaxXERegression(regularize = False).make(input_size = 10,
                                                         target_size = 1,
                                                         stepsize = 0.1)
    for i in xrange(1000):
        xe = model.update(data_x, data_y)
        if i % 100 == 0:
            print i, xe

    # for inputs, targets in my_training_set():
    #     print "cost:", model.update(inputs, targets)

    print "final weights:", model.w
    print "final biases:", model.b
Extending ``Methods``
=======================
``Methods`` can be extended to update more parameters. For example, if we wanted to add a variable holding the sum of all costs encountered so far to ``SoftmaxXERegression``, we could proceed like this:
.. code-block:: python

    model_module = SoftmaxXERegression(regularize = False)
    model_module.sum = T.scalar()  # we add a module member to hold the sum

    # now update will also update sum!
    model_module.update.updates.update(sum = model_module.sum + model_module.cost)

    model = model_module.make(input_size = 4,
                              target_size = 2,
                              stepsize = 0.1,
                              sum = 0)  # we mustn't forget to initialize the sum

    test = model.update([[0,0,1,0]], [[0,1]]) + model.update([[0,1,0,0]], [[1,0]])
    assert model.sum == test
The inputs and outputs list of a ``Method`` can be doctored as well, but it is trickier, arguably less useful and not fully supported at the moment.
.. _numpy:
===============
NumPy refresher
===============
Give summary of type(x) vs x.type vs x.dtype
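A minimal NumPy-only sketch of the distinction (the Theano-specific ``x.type`` is only described in a comment here, since it requires a Theano Variable):

```python
import numpy as np

x = np.zeros((2, 3))

# type(x) is the Python class of the object itself.
assert type(x) is np.ndarray

# x.dtype is the element type stored in the array.
assert x.dtype == np.float64

# A Theano Variable additionally carries x.type, a Theano Type such as
# TensorType(float64, matrix), which bundles the dtype together with
# broadcastability information about the dimensions.
```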
.. _thinking_in_theano:
==================
Thinking in Theano
==================
Theano offers quite a bit of flexibility.
How should you write your algorithm to make the most of what Theano can do?
A few tips
----------
- Remember that your code builds a graph that Theano compiles, and you cannot
  literally put loops into that graph.
- Remember that Variables represent symbolic computations, not
  storage. It does not make sense to *reassign* to a Variable.
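The first tip can be illustrated without Theano: a Python loop runs while you *build* the expression, unrolling into a fixed chain of operations rather than becoming a loop node in the result. Here the "graph" is just nested Python closures, standing in for a symbolic expression:

```python
def build_poly(coeffs):
    # The for-loop below executes now, at construction time; the object we
    # return contains no loop, only a fixed chain of multiply-adds
    # (Horner's rule), analogous to a loop unrolled into a Theano graph.
    expr = lambda x: 0.0
    for c in coeffs:
        expr = (lambda prev, c: lambda x: prev(x) * x + c)(expr, c)
    return expr

p = build_poly([1.0, 0.0, -2.0])  # represents x**2 - 2
```

Evaluating ``p(3.0)`` runs the already-unrolled chain; no Python loop executes at "run time".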
Limitations
-----------
- Conditional control flow is possible but not efficient. In essence, both
  sides of an if (see ``switch``) will be evaluated.
- Loops are not supported, but soon will be.
  (A ``scan`` op is in the works, but not included yet.)
- Recursion is not supported within a graph.
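The both-sides evaluation can be seen in NumPy's ``where``, which behaves like ``switch`` in this respect (a NumPy analogy, not Theano code):

```python
import numpy as np

x = np.array([-2.0, -1.0, 1.0, 4.0])

# Both branch expressions (x * x and -x) are fully evaluated over the
# whole array before the elementwise selection happens; there is no
# short-circuiting, so each side always pays its full cost.
result = np.where(x > 0, x * x, -x)
```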
A few examples
--------------
**DO WE WANT SOMETHING HERE?**
These are intended to illustrate good ways of formulating an algorithm for
Theano.
For complete, usable implementations of these algorithms see WRITEME.
- Short-time Fourier Transform
- Contrastive Divergence for Restricted Boltzmann Machine
- Kalman Filter
- Logistic Regression
- Training a neural network with sigmoidal hidden layer by backpropagation
- Learning an Echo-State Network
@@ -75,15 +75,17 @@ if __name__ == '__main__':
    options['--all'] = not (bool(options['--epydoc']) ^ bool(options['--rst']))

    if 0:
        import gen_oplist
        print 'Generating oplist...'
        gen_oplist.print_file(open('%s/doc/indexes/oplist.txt' % throot, 'w'))
        print 'oplist done!'

    if 0:
        import gen_typelist
        print 'Generating typelist...'
        gen_typelist.print_file(open('%s/doc/indexes/typelist.txt' % throot, 'w'))
        print 'typelist done!'

def mkdir(path):
    try:
...
.. _debugmode:
===============
Using DebugMode
===============
The DebugMode evaluation mode (available via ``mode='DEBUG_MODE'``,
:api:`DebugMode`) includes a number of self-checks and assertions that
can help to diagnose several kinds of programmer errors that can lead
to incorrect output.
It is much slower to evaluate a function or method in DEBUG_MODE than
it would be in FAST_RUN or even FAST_COMPILE. We recommend you use
DebugMode during development, but not when you launch 1000 processes on
a cluster.
DebugMode is used as follows:
.. code-block:: python

    import theano
    x = theano.tensor.dvector('x')
    f = theano.function([x], 10 * x, mode='DEBUG_MODE')

    f([5])
    f([0])
    f([7])

If any problem is detected, DebugMode will raise an exception according to
what went wrong, either at call time (e.g. ``f([5])``) or compile time (e.g.
``f = theano.function([x], 10 * x, mode='DEBUG_MODE')``). These exceptions
should *not* be ignored; talk to your local Theano guru or email the
users list if you cannot make the exception go away.

Some kinds of errors can only be detected for certain input value combinations.
In the example above, there is no way to guarantee that a future call to, say,
``f([-1])`` won't cause a problem. DebugMode is not a silver bullet.
If you instantiate DebugMode using the constructor ``compile.DebugMode``
rather than the keyword ``DEBUG_MODE`` you can configure its behaviour via
constructor arguments. See :api:`DebugMode` for details.
The keyword version of DebugMode (which you get by using ``mode='DEBUG_MODE'``)
is quite strict, and can raise several different Exception types.
The following are DebugMode exceptions you might encounter:
DebugModeError
--------------
This is a generic error. All the other exceptions inherit from this one.
This error is typically not raised directly.
However, you can use ``except DebugModeError: ...`` to catch any of the more
specific types of Exception.
For detailed documentation see :api:`DebugModeError`.
BadCLinkerOutput
----------------
This exception means that the Python implementation (``perform``) and the C
implementation (``c_code``) of an Op did not compute the same thing, as they
were supposed to.
The problem might be a bug in either ``perform`` or ``c_code`` (or both).
For detailed documentation see :api:`BadCLinkerOutput`.
BadOptimization
---------------
This exception indicates that an Optimization replaced one variable (say V1)
with another one (say V2) but at runtime, the values for V1 and V2 were
different. This is something that optimizations are not supposed to do.
It can be tricky to identify the one-true-cause of an optimization error, but
this exception provides a lot of guidance. Most of the time, the
exception object will indicate which optimization was at fault.
The exception object also contains information such as a snapshot of the
before/after graph where the optimization introduced the error.
For detailed documentation see :api:`BadOptimization`.
BadDestroyMap
-------------
This happens when an Op's ``perform()`` or ``c_code()`` modifies an input that it wasn't
supposed to. If either the ``perform`` or ``c_code`` implementation of an Op
might modify any input, it has to advertise that fact via the ``destroy_map``
attribute.
For detailed documentation on the Exception, see :api:`BadDestroyMap`.
For detailed documentation on the ``destroy_map`` attribute, see :ref:`inplace`.
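The idea behind this check can be sketched with NumPy (a simplified illustration, not Theano's actual machinery; the function and Op names are hypothetical):

```python
import numpy as np

def modifies_inputs(op_fn, inputs):
    # DebugMode-style check, simplified: snapshot every input, run the op,
    # then compare. An Op that declares no destroy_map must leave all of
    # its inputs untouched.
    snapshots = [np.array(a, copy=True) for a in inputs]
    op_fn(*inputs)
    return any(not np.array_equal(s, a)
               for s, a in zip(snapshots, inputs))

def bad_negate(a):
    a *= -1          # modifies its input in place: needs a destroy_map entry
    return a

def good_negate(a):
    return -a        # allocates a fresh output, inputs untouched
```

Here ``modifies_inputs(bad_negate, [np.ones(3)])`` reports the violation, while ``good_negate`` passes.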
BadViewMap
----------
This happens when an Op's perform() or c_code() creates an alias or alias-like
dependency between an input and an output... and it didn't warn the
optimization system via the ``view_map`` attribute.
For detailed documentation on the Exception, see :api:`BadViewMap`.
For detailed documentation on the ``view_map`` attribute, see :ref:`views`.
StochasticOrder
---------------
This happens when an optimization does not perform the same graph operations
in the same order when run several times in a row. This can happen if any
steps are ordered by ``id(object)`` somehow, such as via the default object
hash function. A stochastic optimization invalidates the pattern of work
whereby we debug in DEBUG_MODE and then run the full-size jobs in FAST_RUN.
For detailed documentation see :api:`StochasticOrder`.
InvalidValueError
-----------------
This happens when some Op's ``perform`` or ``c_code`` implementation computes
an output that is invalid with respect to the type of the corresponding output
variable, for example if it returned a complex-valued ndarray for a ``dscalar``
Type.
This can also be triggered when floating-point values such as NaN and Inf are
introduced into the computations. It indicates which Op created the first
NaN. These floating-point values can be allowed by passing the
``check_isfinite=False`` argument to DebugMode.
For detailed documentation see :api:`InvalidValueError`.
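The kind of test DebugMode applies to each computed output can be sketched as follows (a simplified NumPy illustration; the real checks are more thorough):

```python
import numpy as np

def value_is_valid(value, expected_dtype, check_isfinite=True):
    # An output is invalid if its dtype does not match the declared Type,
    # or (when check_isfinite is enabled) if it contains NaN or Inf.
    arr = np.asarray(value)
    if arr.dtype != np.dtype(expected_dtype):
        return False
    if check_isfinite and not np.all(np.isfinite(arr)):
        return False
    return True
```

Passing ``check_isfinite=False`` mirrors DebugMode's option to allow NaN and Inf through.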
.. _topics:
======
Topics
======
.. toctree::
    :maxdepth: 2

    function
    floatX
    module
    pipeline
    unittest
    profilemode
    debugmode
    debug_faq
    randomstreams
@@ -8,8 +8,8 @@ Baby steps - Adding two numbers together
Adding two scalars
==================

So, to get us started with Theano and get a feel of what we're working with,
let's make a simple function: add two numbers together. Here is how you do
it:

>>> x = T.dscalar('x')

@@ -26,17 +26,31 @@ array(28.4)
Let's break this down into several steps. The first step is to define
two symbols (*Variables*) representing the quantities that you want
to add. Note that from now on, we will use the term
*Variable* to mean "symbol" (in other words,
``x``, ``y``, ``z`` are all *Variable* objects). The output of the function
``f`` is a ``numpy.ndarray`` with zero dimensions.

If you are following along and typing into an interpreter, you may have
noticed that there was a slight delay in executing the ``function``
instruction. Behind the scenes, ``f`` was being compiled into C code.

.. TODO: help

.. note::

    A *Variable* is the main data structure you work with when
    using Theano. The symbolic inputs that you operate on are
    *Variables* and what you get from applying various operations to
    these inputs are also *Variables*. For example, when I type

    >>> x = theano.tensor.ivector()
    >>> y = -x

    ``x`` and ``y`` are both Variables, i.e. instances of the
    ``theano.gof.graph.Variable`` class. The
    type of both ``x`` and ``y`` is ``theano.tensor.ivector``.

-------------------------------------------

@@ -47,11 +61,11 @@ instruction. Behind the scenes, ``f`` was being compiled into C code.
In Theano, all symbols must be typed. In particular, ``T.dscalar``
is the type we assign to "0-dimensional arrays (`scalar`) of doubles
(`d`)". It is a Theano :ref:`type`.

``dscalar`` is not a class. Therefore, neither ``x`` nor ``y``
are actually instances of ``dscalar``. They are instances of
:ref:`TensorVariable <libdoc_tensor_type>`. ``x`` and ``y``
are, however, assigned the theano Type ``dscalar`` in their ``type``
field, as you can see here:

@@ -64,14 +78,14 @@ TensorType(float64, scalar)
>>> x.type == T.dscalar
True

You can learn more about the structures in Theano in :ref:`graphstructures`.

By calling ``T.dscalar`` with a string argument, you create a
*Variable* representing a floating-point scalar quantity with the
given name. If you provide no argument, the symbol will be unnamed. Names
are not required, but they can help debugging.

-------------------------------------------

**Step 2**

@@ -80,8 +94,8 @@ The second step is to combine ``x`` and ``y`` into their sum ``z``:
>>> z = x + y

``z`` is yet another *Variable* which represents the addition of
``x`` and ``y``. You can use the :ref:`pp <libdoc_printing>`
function to pretty-print out the computation associated to ``z``.

>>> print pp(z)

@@ -96,7 +110,7 @@ and giving ``z`` as output:
>>> f = function([x, y], z)

The first argument to :ref:`function <libdoc_compile_function>` is a list of Variables
that will be provided as inputs to the function. The second argument
is a single Variable *or* a list of Variables. For either case, the second
argument is what we want to see as output when we apply the function.

@@ -133,18 +147,19 @@ array([[ 11., 22.],
It is possible to add scalars to matrices, vectors to matrices,
scalars to vectors, etc. The behavior of these operations is defined
by :ref:`broadcasting <libdoc_tensor_broadcastable>`.

The following types are available:

* **byte**: bscalar, bvector, bmatrix, brow, bcol, btensor3, btensor4
* **32-bit integers**: iscalar, ivector, imatrix, irow, icol, itensor3, itensor4
* **64-bit integers**: lscalar, lvector, lmatrix, lrow, lcol, ltensor3, ltensor4
* **float**: fscalar, fvector, fmatrix, frow, fcol, ftensor3, ftensor4
* **double**: dscalar, dvector, dmatrix, drow, dcol, dtensor3, dtensor4
* **complex**: cscalar, cvector, cmatrix, crow, ccol, ctensor3, ctensor4

The previous list is not exhaustive. A guide to all types compatible
with numpy arrays may be found :ref:`here <libdoc_tensor_creation>`.

.. note::
...
.. _debug_faq:
=========================================
Debugging Theano: FAQ and Troubleshooting
=========================================
There are many kinds of bugs that might come up in a computer program.
This page is structured as an FAQ. It should provide recipes to tackle common
problems, and introduce some of the tools that we use to find problems in our
Theano code, and even (it happens) in Theano's internals, such as
:ref:`using_debugmode`.
How do I print an intermediate value in a Function/Method?
----------------------------------------------------------
Theano provides a 'Print' Op to do this.
.. code-block:: python

    import numpy
    import theano

    x = theano.tensor.dvector('x')
    x_printed = theano.printing.Print('this is a very important value')(x)

    f = theano.function([x], x * 5)
    f_with_print = theano.function([x], x_printed * 5)

    # this runs the graph without any printing
    assert numpy.all(f([1, 2, 3]) == [5, 10, 15])

    # this runs the graph with the message and value printed
    assert numpy.all(f_with_print([1, 2, 3]) == [5, 10, 15])
Since Theano runs your program in a topological order, you won't have precise
control over the order in which multiple Print() Ops are evaluated. For a more
precise inspection of what's being computed where, when, and how, see the
:ref:`faq_wraplinker`.
The function I compiled is too slow, what's up?
-----------------------------------------------
First, make sure you're running in FAST_RUN mode, by passing
``mode='FAST_RUN'`` to ``theano.function`` or ``theano.make``. Some
operations have excruciatingly slow Python implementations and that
can negatively affect the performance of FAST_COMPILE.
Second, try the theano :ref:`using_profilemode`. This will tell you which
Apply nodes, and which Ops are eating up your CPU cycles.
.. _faq_wraplinker:
How do I step through a compiled function with the WrapLinker?
--------------------------------------------------------------
This is not exactly an FAQ, but the doc is here for now...
It's pretty easy to roll your own evaluation mode.
Check out this one:
.. code-block:: python
    class PrintEverythingMode(Mode):
        def __init__(self):
            def print_eval(i, node, fn):
                print i, node, [input[0] for input in fn.inputs],
                fn()
                print [output[0] for output in fn.outputs]
            wrap_linker = theano.gof.WrapLinkerMany([theano.gof.OpWiseCLinker()], [print_eval])
            super(PrintEverythingMode, self).__init__(wrap_linker, optimizer='fast_run')
When you use ``mode=PrintEverythingMode()`` as the mode for Function or Method,
then you should see [potentially a lot of] output. Every Apply node will be printed out,
along with its position in the graph, the arguments to the ``perform`` or
``c_code`` and the output it computed.
>>> x = T.dscalar('x')
>>> f = function([x], [5*x], mode=PrintEverythingMode())
>>> f(3)
>>> # print: 0 Elemwise{mul,no_inplace}(5, x) [array(5, dtype=int8), array(3.0)] [array(15.0)]
>>> # print: [array(15.0)]
Admittedly, this may be a huge amount of
output to read through if you are using big tensors... but you can choose to
put logic inside of the print_eval function that would, for example, only
print something out if a certain kind of Op was used, at a certain program
position, or if a particular value shows up in one of the inputs or outputs.
Use your imagination :)
.. TODO: documentation for link.WrapLinkerMany
This can be a really powerful debugging tool.
Note the call to ``fn`` inside the call to ``print_eval``; without it, the graph wouldn't get computed at all!
@@ -22,7 +22,7 @@ the logistic curve, which is given by:

A plot of the logistic function, with x on the x-axis and s(x) on the
y-axis.

You want to compute the function :ref:`elementwise <libdoc_tensor_elementwise>` on matrices of
doubles, which means that you want to apply this function to each
individual element of the matrix.

@@ -58,7 +58,7 @@ Computing more than one thing at the same time
==============================================

Theano supports functions with multiple outputs. For example, we can
compute the :ref:`elementwise <libdoc_tensor_elementwise>` difference, absolute difference, and
squared difference between two matrices ``a`` and ``b`` at the same time:

>>> a, b = T.dmatrices('a', 'b')
@@ -134,16 +134,17 @@ array([[ 0.25      ,  0.19661193],

The resulting function computes the gradient of its first argument
with respect to the second. In this way, Theano can be used for
`automatic differentiation <http://en.wikipedia.org/wiki/Automatic_differentiation>`_.

.. note::

   The second argument of ``T.grad`` can be a list, in which case the
   output is also a list. The order in both lists is important: element
   *i* of the output list is the gradient of the first argument of
   ``T.grad`` with respect to the *i*-th element of the list given as the second argument.
   The first argument of ``T.grad`` has to be a scalar (a tensor
   of size 1). For more information on the semantics of the arguments of
   ``T.grad`` and details about the implementation, see :ref:`this <libdoc_gradient>`.
Setting a default value for an argument

@@ -273,9 +274,10 @@ shared variable, but you do *not* want to use its value. In this case, you can u

for the purpose of one particular function.

>>> fn_of_state = state * 2 + inc
>>> foo = lscalar()  # the type (lscalar) must match the shared variable we
>>> # are replacing with the ``givens`` list
>>> skip_shared = function([inc, foo], fn_of_state,
...                        givens=[(state, foo)])
>>> skip_shared(1, 3)  # we're using 3 for the state, not state.value
array(7)
>>> state.value  # old state still there, but we didn't use it

@@ -288,31 +290,5 @@ substitution to be co-dependent, the order of substitution is not defined, so
the substitutions have to work in any order.
.. _tutorial:
========
Tutorial
========
Let's start an interactive session and import Theano.
>>> from theano import *
Many of the symbols you will need to use are in the ``tensor`` subpackage
of Theano. Let's import that subpackage under a handy name. I like
``T`` (and many tutorials use this convention).
>>> import theano.tensor as T
If that worked you're ready for the tutorial, otherwise check your
installation (see :ref:`install`).
.. toctree::
numpy
adding
examples
loading_and_saving
modes
remarks
debug_faq
.. _tutorial_loadsave:
==================
Loading and Saving
==================
Many Theano objects can be serialized. However, you will want to consider different mechanisms
depending on the amount of time you anticipate between saving and reloading. For short-term
(such as temp files and network transfers) pickling is possible. For longer-term (such as
saving models from an experiment) you should not rely on pickled theano objects; we recommend
loading and saving the underlying shared objects as you would in the course of any other python
program.
pickling -- Short-term serialization
=====================================
Pickling and unpickling of functions. Caveats... basically don't do this for long-term storage.
***TODO***
not-pickling -- Long-term serialization
=======================================
***TODO***
Give a short example of how to add ``__getstate__`` and ``__setstate__`` methods to a class. Point out
the use of ``protocol=-1`` for numpy ndarrays.
Point to the python docs for further reading.
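Until the section above is filled in, here is a minimal sketch of the idea (the class name and its attributes are hypothetical, not part of Theano): ``__getstate__`` keeps only the data worth serializing, ``__setstate__`` rebuilds transient state, and ``protocol=-1`` selects the highest pickle protocol, which stores numpy ndarrays in compact binary form.

```python
import pickle
import numpy

class Model(object):  # hypothetical class holding numpy arrays
    def __init__(self, weights):
        self.weights = numpy.asarray(weights)
        self._cache = {}  # transient state we do not want to serialize

    def __getstate__(self):
        # Only persist the data worth keeping; drop transient caches.
        return {'weights': self.weights}

    def __setstate__(self, state):
        self.weights = state['weights']
        self._cache = {}  # rebuild transient state on load

m = Model([1.0, 2.0, 3.0])
# protocol=-1 means "highest available protocol": binary, not text.
s = pickle.dumps(m, protocol=-1)
m2 = pickle.loads(s)
assert numpy.all(m2.weights == m.weights)
```

The same pattern applies to saving the shared objects of an experiment: persist the ndarrays, not the compiled functions.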
.. _using_modes:
===============================
Using different compiling modes
===============================
Mode
====
Every time :ref:`theano.function <libdoc_compile_function>` is called,
the symbolic relationships between the input and output Theano *variables*
are optimized and compiled. The way this compilation occurs
is controlled by the value of the ``mode`` parameter.
Theano defines the following modes by name:
- ``'FAST_COMPILE'``: Apply just a few graph optimizations, but use C implementations where possible.
- ``'FAST_RUN'``: Apply all optimizations, and use C implementations where possible.
- ``'DEBUG_MODE'``: Verify the correctness of all optimizations, and compare C and python
implementations. This mode can take much longer than the other modes,
but can identify many kinds of problems.
The default mode is typically ``FAST_RUN``, but it can be controlled via
the environment variable ``THEANO_DEFAULT_MODE``, which can in turn be
overridden by setting `theano.compile.mode.default_mode` directly,
which can in turn be overridden by passing the keyword argument to
:ref:`theano.function <libdoc_compile_function>`.
================= =============================================================== ===============================================================================
short name        Full constructor                                                What does it do?
================= =============================================================== ===============================================================================
(default)         ``compile.mode.Mode(linker='py', optimizer=None)``              Python implementations with zero graph modifications.
FAST_COMPILE      ``compile.mode.Mode(linker='c|py', optimizer='fast_compile')``  C implementations where available, quick and cheap graph transformations
FAST_RUN          ``compile.mode.Mode(linker='c|py', optimizer='fast_run')``      C implementations where available, all available graph transformations.
DEBUG_MODE        ``compile.debugmode.DebugMode()``                               Both implementations where available, all available graph transformations.
================= =============================================================== ===============================================================================
.. _using_debugmode:
Using DebugMode
===============
While normally you should use the ``FAST_RUN`` or ``FAST_COMPILE`` mode,
it is useful at first (especially when you are defining new kinds of
expressions or new optimizations) to run your code using the DebugMode
(available via ``mode='DEBUG_MODE'``). The DebugMode is designed to
do several self-checks and assertions that can help diagnose
possible programming errors that can lead to incorrect output. Note that
``DEBUG_MODE`` is much slower than ``FAST_RUN`` or ``FAST_COMPILE``, so
use it only during development (not when you launch 1,000 processes on a
cluster!).
DebugMode is used as follows:
.. code-block:: python
    import theano
    import theano.tensor as T

    x = T.dvector('x')
    f = theano.function([x], 10 * x, mode='DEBUG_MODE')

    f([5])
    f([0])
    f([7])
If any problem is detected, DebugMode will raise an exception according to
what went wrong, either at call time (``f(5)``) or compile time (
``f = theano.function([x], 10 * x, mode='DEBUG_MODE')``). These exceptions
should *not* be ignored; talk to your local Theano guru or email the
users list if you cannot make the exception go away.
Some kinds of errors can only be detected for certain input value combinations.
In the example above, there is no way to guarantee that a future call to, say,
``f([-1])`` won't cause a problem. DebugMode is not a silver bullet.
If you instantiate DebugMode using the constructor (see :class:`DebugMode`)
rather than the keyword ``DEBUG_MODE`` you can configure its behaviour via
constructor arguments. See :ref:`DebugMode <debugMode>` for details.
The keyword version of DebugMode (which you get by using ``mode='DEBUG_MODE'``)
is quite strict.
.. _using_profilemode:
ProfileMode
===========
Besides checking for errors, another important task is to profile your
code. For this, Theano uses a special mode called ProfileMode, which has
to be passed as an argument to :ref:`theano.function <libdoc_compile_function>`. Using the ProfileMode is a three-step process.
Creating a ProfileMode Instance
-------------------------------
First create a ProfileMode instance.
>>> import theano
>>> from theano import ProfileMode
>>> profmode = ProfileMode(optimizer='fast_run', linker=theano.gof.OpWiseCLinker())
The ProfileMode constructor takes as input an optimizer and a
linker. Which optimizer and linker to use will depend on the
application. For example, a user wanting to profile only the Python
implementations should use the gof.PerformLinker (or "py" for
short). On the other hand, a user wanting to profile their graph using C
implementations wherever possible should use the ``gof.OpWiseCLinker``
(or "c|py"). For testing the speed of your code we would recommend
using the 'fast_run' optimizer and ``gof.OpWiseCLinker`` linker.
Compiling your Graph with ProfileMode
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Once the ProfileMode instance is created, simply compile your graph as you
would normally, by specifying the mode parameter.
>>> # with functions
>>> f = theano.function([input1,input2],[output1], mode=profmode)
>>> # with modules
>>> m = theano.Module()
>>> minst = m.make(mode=profmode)
Retrieving Timing Information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Once your graph is compiled, simply run the program or operation you wish to
profile, then call ``profmode.print_summary()``. This will provide you with
the desired timing information, indicating where your graph is spending most
of its time.
This is best shown through an example.
Let's use the example of logistic
regression. (Code for this example is in the file
``benchmark/regression/regression.py``.)
Compiling the module with ProfileMode and calling ``profmode.print_summary()``
generates the following output:
.. code-block:: python
"""
ProfileMode.print_summary()
---------------------------
local_time 0.0749197006226 (Time spent running thunks)
Apply-wise summary: <fraction of local_time spent at this position> (<Apply position>, <Apply Op name>)
0.069 15 _dot22
0.064 1 _dot22
0.053 0 InplaceDimShuffle{x,0}
0.049 2 InplaceDimShuffle{1,0}
0.049 10 mul
0.049 6 Elemwise{ScalarSigmoid{output_types_preference=<theano.scalar.basic.transfer_type object at 0x171e650>}}[(0, 0)]
0.049 3 InplaceDimShuffle{x}
0.049 4 InplaceDimShuffle{x,x}
0.048 14 Sum{0}
0.047 7 sub
0.046 17 mul
0.045 9 sqr
0.045 8 Elemwise{sub}
0.045 16 Sum
0.044 18 mul
... (remaining 6 Apply instances account for 0.25 of the runtime)
Op-wise summary: <fraction of local_time spent on this kind of Op> <Op name>
0.139 * mul
0.134 * _dot22
0.092 * sub
0.085 * Elemwise{Sub{output_types_preference=<theano.scalar.basic.transfer_type object at 0x1779f10>}}[(0, 0)]
0.053 * InplaceDimShuffle{x,0}
0.049 * InplaceDimShuffle{1,0}
0.049 * Elemwise{ScalarSigmoid{output_types_preference=<theano.scalar.basic.transfer_type object at 0x171e650>}}[(0, 0)]
0.049 * InplaceDimShuffle{x}
0.049 * InplaceDimShuffle{x,x}
0.048 * Sum{0}
0.045 * sqr
0.045 * Sum
0.043 * Sum{1}
0.042 * Elemwise{Mul{output_types_preference=<theano.scalar.basic.transfer_type object at 0x17a0f50>}}[(0, 1)]
0.041 * Elemwise{Add{output_types_preference=<theano.scalar.basic.transfer_type object at 0x1736a50>}}[(0, 0)]
0.039 * Elemwise{Second{output_types_preference=<theano.scalar.basic.transfer_type object at 0x1736d90>}}[(0, 1)]
... (remaining 0 Ops account for 0.00 of the runtime)
(*) Op is running a c implementation
"""
The summary has two components to it. In the first section called the
Apply-wise summary, timing information is provided for the worst
offending Apply nodes. This corresponds to individual Op applications
within your graph which take the longest to execute (so if you use
``dot`` twice, you will see two entries there). In the second portion,
the Op-wise summary, the execution time of all Apply nodes executing
the same Op are grouped together and the total execution time per Op
is shown (so if you use ``dot`` twice, you will see only one entry
there corresponding to the sum of the time spent in each of them).
Note that the ProfileMode also shows which Ops were running a C
implementation.
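The relationship between the two summaries can be sketched in plain Python (with made-up per-Apply timings, not ProfileMode's actual internals): the Op-wise entries are obtained by grouping the Apply-wise times by Op name and summing.

```python
# Hypothetical per-Apply timings: (position, op_name, seconds),
# in the spirit of the Apply-wise summary shown above.
apply_times = [
    (15, '_dot22', 0.069),
    (1,  '_dot22', 0.064),
    (10, 'mul',    0.049),
    (17, 'mul',    0.046),
    (18, 'mul',    0.044),
]

# The Op-wise summary sums the time of every Apply node running the same Op.
op_times = {}
for pos, op_name, t in apply_times:
    op_times[op_name] = op_times.get(op_name, 0.0) + t

assert round(op_times['_dot22'], 3) == 0.133  # 0.069 + 0.064
assert round(op_times['mul'], 3) == 0.139     # 0.049 + 0.046 + 0.044
```

This is why ``dot`` used twice appears as two Apply-wise entries but a single Op-wise entry.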
@@ -8,10 +8,9 @@ NumPy refresher

Here are some quick guides to NumPy:

* `Numpy quick guide for Matlab users <http://www.scipy.org/NumPy_for_Matlab_Users>`__
* `Numpy User Guide <http://docs.scipy.org/doc/numpy/user/index.html>`__
* `More detailed Numpy tutorial <http://www.scipy.org/Tentative_NumPy_Tutorial>`__

.. TODO [DefineBroadcasting Broadcasting]
.. Broadcastable - Implicitly assume that all previous entries are true.

.. [TODO: More doc, e.g. see _test_tensor.py]
@@ -20,8 +19,10 @@ Matrix conventions for machine learning

Rows are horizontal and columns are vertical.
Every row is an example. Therefore, inputs[10,5] is a matrix of 10 examples
where each example has dimension 5. If this were the input of a
neural network, then the weights from the input to the first hidden
layer would form a matrix of size (5, #hid).
If I have an array:

@@ -43,3 +44,22 @@ To access the entry in the 3rd row (row #2) and the 1st column (column #0):

To remember this, keep in mind that we read left-to-right, top-to-bottom,
so each thing that is contiguous is a row. That is, there are 3 rows
and 2 columns.
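A minimal numpy illustration of this convention (the array values are made up for the example):

```python
import numpy

# 3 rows, 2 columns; each row could be one example with 2 features.
a = numpy.asarray([[1, 2],
                   [3, 4],
                   [5, 6]])

assert a.shape == (3, 2)
# the 3rd row (row #2) and the 1st column (column #0):
assert a[2, 0] == 5
```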
Broadcasting
============
Numpy does *broadcasting* of arrays of different shapes during
arithmetic operations. What this means in general is that the smaller
array (or scalar) is *broadcasted* across the larger array so that they have
compatible shapes. The example below shows an instance of
*broadcasting*:
>>> a = numpy.asarray([1.0, 2.0, 3.0])
>>> b = 2.0
>>> a * b
array([2., 4., 6.])
The smaller array ``b`` (actually a scalar here, which works like a 0-d array) in this case is *broadcasted* to the same size
as ``a`` during the multiplication. This trick is often useful in
simplifying how expressions are written. More details about *broadcasting*
can be found in the `numpy user guide <http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html>`__.
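Broadcasting also applies between arrays of different rank, for example adding a vector to every row of a matrix (a small sketch complementing the scalar example above):

```python
import numpy

m = numpy.asarray([[1.0, 2.0, 3.0],
                   [10.0, 20.0, 30.0]])
v = numpy.asarray([1.0, 1.0, 1.0])

# v's shape (3,) is broadcast against m's shape (2, 3):
# the vector is (virtually) replicated along the first axis.
r = m + v

assert numpy.all(r == numpy.asarray([[2.0, 3.0, 4.0],
                                     [11.0, 21.0, 31.0]]))
```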
.. _tutorial_general_remarks:
=====================
Some general Remarks
=====================
Theano offers quite a bit of flexibility, but has some limitations too.
How should you write your algorithm to make the most of what Theano can do?
Limitations
-----------
- Conditional control flow is possible but currently not efficient. The current implementation will evaluate both sides of an ``if`` construct (see :func:`tensor.switch`).
- While- or for-Loops within an expression graph are not supported, but soon will be.
A ``scan`` op is in ``theano.sandbox``, but not quite ready for mainstream yet.
- Neither ``goto`` nor recursion is supported or planned within expression graphs.
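The cost of the ``if`` limitation can be illustrated with numpy's ``numpy.where``, which behaves like ``tensor.switch``: both branch expressions are fully evaluated before the elementwise selection happens (a plain-numpy sketch, not Theano code):

```python
import numpy

x = numpy.asarray([-2.0, -1.0, 0.0, 1.0, 2.0])

# Both branches (x and -x) are computed over the whole array;
# where() then selects elementwise -- like tensor.switch, there is
# no short-circuiting of the untaken branch.
abs_x = numpy.where(x >= 0, x, -x)

assert numpy.all(abs_x == numpy.asarray([2.0, 1.0, 0.0, 1.0, 2.0]))
```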
A few tips
----------
* Remember that your code builds a graph that theano compiles, and you cannot
literally put loops into that graph.
* Remember that Variables are symbolic representations of computations, not
  storage. It does not make sense to *reassign* to a Variable.
@@ -3,7 +3,7 @@

Basic Tutorial Mini-Reference
=============================

.. _miniref_mode:

Mode
====
@@ -17,18 +17,18 @@ FAST_RUN          ``compile.mode.Mode(linker='c|py', optimizer='fast_run')``

DEBUG_MODE        ``compile.debugmode.DebugMode()``                               Both implementations where available, all available graph transformations.
================= =============================================================== ===============================================================================

.. _tensortypes:

Types
=====

.. _predefinedtypes:

Predefined types
----------------

Predefined types are
located in the :ref:`theano.tensor <libdoc_tensor>` package. The names of the types follow
a recipe:

``<dtype><dimensionality>``
@@ -48,26 +48,17 @@ d          double floating point         64

Dimensionality is one of:

====== ====== ========================================== =============================================
code   shape  Rows :term:`broadcastable <broadcasting>`? Columns :term:`broadcastable <broadcasting>`?
====== ====== ========================================== =============================================
scalar []     Yes                                        Yes
vector [n]    Yes                                        N/A (vectors are used like row vectors)
row    [1, n] Yes                                        No
col    [m, 1] No                                         Yes
matrix [m, n] No                                         No
====== ====== ========================================== =============================================
So, if you want a row of 32-bit floats, it is available
as :ref:`theano.tensor.frow <libdoc_tensor_type>`.
If you want a matrix of 32-bit integers it is available as
:ref:`theano.tensor.imatrix <libdoc_tensor_type>`.

Each of the types described above can be constructed by two methods:
a singular version (e.g., :ref:`dmatrix <libdoc_tensor_creation>`)
and a plural version (:ref:`dmatrices <libdoc_tensor_creation>`).
When called, the singular version takes a single
argument which is the name of the *Variable* we want to make and it
makes a single Variable of that type. The plural version can either take
an integer or several strings. If an integer is provided, the method
will return that many Variables and if strings are provided, it will

@@ -91,7 +82,7 @@ Custom tensor types

If you wish to use a type of tensor which is not already available here
(for example, a 3D tensor) you can build an appropriate type using
:ref:`theano.tensor.TensorType <libdoc_tensor_type>`.
The first argument you pass is the `dtype` and the second is the
`broadcastable pattern`.
@@ -116,10 +107,10 @@ complex128 complex        128 (two float64)

.. note::

   Even though :ref:`theano.tensor <libdoc_tensor>` does not define any type
   using ``complex`` dtypes (``complex64`` or ``complex128``),
   you can define them explicitly with
   :ref:`TensorType <libdoc_tensor_type>` (see example
   below). However, few operations are fully supported for complex
   types: as of version 0.1, only elementary operations (``+-*/``)
   have C implementations. Additionally, complex types have received

@@ -128,8 +119,7 @@ complex128 complex        128 (two float64)

The broadcastable pattern indicates both the number of dimensions and
whether a particular dimension must have length 1.
Here is a table mapping the :ref:`broadcastable <libdoc_tensor_broadcastable>` pattern to what kind of tensor it encodes:

===================== =================================
pattern               interpretation
@@ -170,8 +160,3 @@ bytes, we would do:

    my_cmatrix = theano.tensor.TensorType('complex64', [False, False])
Ops
===
There are a lot of operations available in the :api:`theano.tensor` package.
See :ref:`oplist`.
@@ -3,14 +3,14 @@

__docformat__ = "restructuredtext en"

import sys, traceback, logging
_logger = logging.getLogger('theano.compile.function')

from io import In
from function_module import orig_function
from pfunc import pfunc
from numpy import any  # numpy's any(), since the built-in does not exist in Python 2.4

def function(inputs, outputs=None, mode=None, updates=[], givens=[], accept_inplace=False, name=None):
"""
Return a callable object that will calculate `outputs` from `inputs`.
@@ -21,7 +21,7 @@ def function(inputs, outputs=None, mode=None, updates=[], givens=[], accept_inpl

:type outputs: list of Variables or Out instances
:param outputs: expressions to compute

:type mode: string or `Mode` instance.
:param mode: compilation mode

:type updates: iterable over pairs (shared_variable, new_expression). List, tuple or dict.

@@ -33,7 +33,9 @@ def function(inputs, outputs=None, mode=None, updates=[], givens=[], accept_inpl

:param givens: specific substitutions to make in the computation graph (Var2 replaces
Var1).

:param name: an optional name for this function. The profile mode will print the time spent in this function.

:rtype: Function instance
:returns: a callable object that will compute the outputs (given the inputs)
and update the implicit function arguments according to the `updates`.
@@ -45,7 +47,7 @@ def function(inputs, outputs=None, mode=None, updates=[], givens=[], accept_inpl

"""
# compute some features of the arguments:
uses_In = any([isinstance(i, In) for i in inputs])  # N.B. the square brackets are necessary
uses_tuple = any([isinstance(i, (list, tuple)) for i in inputs])  # N.B. the square brackets are necessary
uses_updates = (updates != [])
uses_givens = (givens != [])

@@ -56,11 +58,11 @@ def function(inputs, outputs=None, mode=None, updates=[], givens=[], accept_inpl

raise NotImplementedError("In() instances and tuple inputs trigger the old semantics, which disallow using updates and givens")
return orig_function(inputs, outputs,
                     mode=mode,
                     accept_inplace=accept_inplace, name=name)
else:
return pfunc(params=inputs,
             outputs=outputs,
             mode=mode,
             updates=updates,
             givens=givens,
             accept_inplace=accept_inplace, name=name)
@@ -423,6 +423,8 @@ class Function(object):

return cpy

def __call__(self, *args, **kwargs):
t0 = time.time()
# Reinitialize each container's 'provided' counter
for c in self.input_storage:
c.provided = 0

@@ -478,6 +480,11 @@ class Function(object):

if isinstance(value, gof.Container):
value = value.storage[0]
self[i] = value
dt_call = time.time() - t0
if hasattr(self.maker.mode, 'fct_call_time'):
self.maker.mode.fct_call_time[self.name] += dt_call
self.maker.mode.fct_call[self.name] += 1
if self.return_none:
return None
@@ -830,7 +837,7 @@ def check_equal(x, y):
def register_checker(checker):
    __checkers.insert(0, checker)
def orig_function(inputs, outputs, mode=None, accept_inplace=False, name=None):
    """
    Return a Function that will calculate the outputs from the inputs.
@@ -843,6 +850,8 @@ def orig_function(inputs, outputs, mode=None, accept_inplace=False):
    :param mode: a descriptive string or a Mode instance. (Default of None means to use
    `mode.default_mode` (See below for descriptive string list).
    :param name: an optional name for this function. If given, the profile mode will print the time spent in it.
    Currently, the library provides the following mode strings:
    - FAST_RUN (default) (optimize without too much time)
@@ -910,6 +919,13 @@ def orig_function(inputs, outputs, mode=None, accept_inplace=False):
    if hasattr(mode, 'compile_time'):
        mode.compile_time += t2 - t1
    fn.name = name
    if hasattr(mode, 'fct_call_time'):
        mode.fct_call_time.setdefault(name, 0)
    if hasattr(mode, 'fct_call'):
        mode.fct_call.setdefault(name, 0)
    return fn
...
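The hunks above thread an optional `name` through `function` and `orig_function`, register it in the mode's `fct_call_time`/`fct_call` dictionaries via `setdefault`, and have `Function.__call__` accumulate wall-clock time per name. A minimal stand-alone sketch of that bookkeeping pattern (plain Python, no Theano; `FakeProfileMode` and `make_timed` are illustrative names, not the library's API):

```python
import time

class FakeProfileMode(object):
    """Stand-in for ProfileMode: per-name call counters."""
    def __init__(self):
        self.fct_call_time = {}
        self.fct_call = {}

def make_timed(fn, mode, name=None):
    # mirror orig_function: register the counters up front
    mode.fct_call_time.setdefault(name, 0.0)
    mode.fct_call.setdefault(name, 0)
    def wrapper(*args, **kwargs):
        t0 = time.time()            # as in Function.__call__
        out = fn(*args, **kwargs)
        mode.fct_call_time[name] += time.time() - t0
        mode.fct_call[name] += 1
        return out
    return wrapper

mode = FakeProfileMode()
f = make_timed(lambda x: x * 2, mode, name='double')
f(1); f(2)
assert mode.fct_call['double'] == 2
```

Using `setdefault` at construction time guarantees every compiled function shows up in the summary, even if it is never called.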
@@ -10,7 +10,7 @@ class Param(object):
    def __init__(self, variable, default=None, name=None, mutable=False, strict=False,
                 implicit=None):
        """
        :param variable: A variable in an expression graph to use as a compiled-function parameter
        :param default: The default value to use at call-time (can also be a Container where
        the function will find a value at call-time.)
@@ -33,7 +33,7 @@ class Param(object):
        self.strict = strict
        self.implicit = implicit
def pfunc(params, outputs=None, mode=None, updates=[], givens=[], accept_inplace=False, name=None):
    """Function-constructor for graphs with shared variables.
    :type params: list of either Variable or Param instances.
@@ -55,6 +55,8 @@ def pfunc(params, outputs=None, mode=None, updates=[], givens=[], accept_inplace
    :param givens: specific substitutions to make in the computation graph (Var2 replaces
    Var1).
    :param name: an optional name for this function. If given, the profile mode will print the time spent in it.
    :rtype: theano.compile.Function
    :returns: a callable object that will compute the outputs (given the inputs)
    and update the implicit function arguments according to the `updates`.
@@ -205,7 +207,7 @@ def pfunc(params, outputs=None, mode=None, updates=[], givens=[], accept_inplace
            in_sv.update = new_val
            in_sv.mutable = True
    return orig_function(inputs, cloned_outputs, mode, accept_inplace=accept_inplace, name=name)
def _pfunc_param_to_in(param):
    if isinstance(param, Constant):
...
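The `Param` docstring above notes that `default` "can also be a Container where the function will find a value at call-time". A sketch of why that indirection matters (plain Python; `Container` here is a one-slot stand-in for `gof.Container`, and `ParamSketch.resolve` is an illustrative helper, not the library's API):

```python
class Container(object):
    """Minimal stand-in for gof.Container: one slot of storage."""
    def __init__(self, value=None):
        self.storage = [value]

class ParamSketch(object):
    def __init__(self, name, default=None):
        self.name = name
        self.default = default
    def resolve(self):
        # a Container default is dereferenced at call time, so writes
        # to the container between calls are seen by the next call
        if isinstance(self.default, Container):
            return self.default.storage[0]
        return self.default

c = Container(10)
p = ParamSketch('x', default=c)
assert p.resolve() == 10
c.storage[0] = 42        # later update is picked up at the next call
assert p.resolve() == 42
```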
@@ -17,21 +17,25 @@ class ProfileMode(Mode):
        op_time = {}
        op_cimpl = {}
        op_call = {}
        compile_time = 0  # time passed in theano.function()
        fct_call_time = {}  # time passed inside theano fct call including op time.
        fct_call = {}
        self.__setstate__((linker, optimizer, local_time,
                           apply_time, apply_call,
                           op_time, op_cimpl, op_call,
                           compile_time, fct_call_time, fct_call))
    def __getstate__(self):
        #print "__getstate__", self.provided_linker, self.provided_optimizer
        return (self.provided_linker, self.provided_optimizer, self.local_time,
                self.apply_time, self.apply_call,
                self.op_time, self.op_cimpl, self.op_call,
                self.compile_time, self.fct_call_time, self.fct_call)
    def __setstate__(self, (linker, optimizer, local_time,
                            apply_time, apply_call,
                            op_time, op_cimpl, op_call,
                            compile_time, fct_call_time, fct_call)):
        self.local_time = local_time
        self.apply_time = apply_time
@@ -40,6 +44,8 @@ class ProfileMode(Mode):
        self.op_cimpl = op_cimpl
        self.op_call = op_call
        self.compile_time = compile_time
        self.fct_call_time = fct_call_time
        self.fct_call = fct_call
    def blah(i, node, th):
        if hasattr(th, 'cthunk'):
@@ -93,6 +99,8 @@ class ProfileMode(Mode):
        local_time = self.local_time[0]
        compile_time = self.compile_time
        fct_call_time = self.fct_call_time
        fct_call = self.fct_call
        apply_time = self.apply_time
        apply_call = self.apply_call
        op_time = self.op_time
@@ -104,7 +112,7 @@ class ProfileMode(Mode):
            if hasattr(a, 'flops'):
                op_flops[a] = a.flops * op_call[a] / t / 1e6
        self.print_summary_("print_summary", local_time, compile_time, fct_call_time, fct_call,
                            apply_time, apply_call, op_time, op_call, op_cimpl,
                            op_flops, n_apply_to_print, n_ops_to_print)
@@ -152,6 +160,8 @@ class ProfileMode(Mode):
        local_time = self.local_time[0] - other.local_time[0]
        compile_time = self.compile_time - other.compile_time
        fct_call_time = diff_dict(self.fct_call_time, other.fct_call_time)
        fct_call = diff_dict(self.fct_call, other.fct_call)
        apply_time = diff_dict(self.apply_time, other.apply_time)
        apply_call = diff_dict(self.apply_call, other.apply_call)
        op_time = diff_dict(self.op_time, other.op_time)
@@ -160,13 +170,14 @@ class ProfileMode(Mode):
        op_flops = diff_dict_flops(self.op_time, other.op_time, self.op_call, other.op_call)
        self.print_summary_("print_diff_summary", local_time, compile_time, fct_call_time, fct_call,
                            apply_time, apply_call, op_time, op_call, op_cimpl,
                            op_flops, n_apply_to_print=n_apply_to_print,
                            n_ops_to_print=n_ops_to_print, print_apply=False)
    @staticmethod
    def print_summary_(fct_name, local_time, compile_time, fct_call_time, fct_call,
                       apply_time, apply_call, op_time, op_call, op_cimpl,
                       op_flops=None, n_apply_to_print=15, n_ops_to_print=20, print_apply=True):
        """
        do the actual printing of print_summary and print_diff_summary.
@@ -251,13 +262,24 @@ class ProfileMode(Mode):
              sum(t for f, t, a, ci, nb_call in sotimes[n_ops_to_print:]))
        print '(*) Op is running a c implementation'
        print
        total_time = time.time() - import_time
        total_fct_time = sum(fct_call_time.values())
        total_fct_call = sum(fct_call.values())
        other_time = total_time - local_time - compile_time
        print
        print 'Theano fct summary: <% total fct time> <total time> <time per call> <nb call> <fct name>'
        for key in fct_call.keys():
            print '   %4.1f%% %.3fs %.2es %d %s' % (fct_call_time[key] / total_fct_time * 100, fct_call_time[key],
                                                    fct_call_time[key] / fct_call[key], fct_call[key], key)
        print
        print 'Time since import %.3fs' % (total_time)
        print 'Compile time: %.3fs %.1f%%' % (compile_time, compile_time / total_time * 100)
        print 'Theano fct call %.3fs %.1f%%' % (total_fct_time, total_fct_time / total_time * 100)
        print '   Theano Op time (included in fct call, Time spent running thunks) %.3fs %.1f%%(of total) %.1f%%(of fct call)' % (local_time, local_time / total_time * 100, local_time / total_fct_time * 100)
        print 'Other time since import %.3fs %.1f%%' % (other_time, other_time / total_time * 100)
        print '%i Theano fct call, %.3fs per call' % (total_fct_call, total_fct_time / total_fct_call)
        if any([x[2].__name__.startswith("Gpu") for x in sotimes]):
            cpu = []
...
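`print_diff_summary` above relies on a `diff_dict` helper that is referenced but not shown in this diff, and the per-function summary line reports each function's share of the total call time. A sketch of both, under the assumption that `diff_dict` is a per-key subtraction with missing keys counted as 0 (the name `diff_dict_sketch` is illustrative):

```python
def diff_dict_sketch(d1, d2):
    # assumed behaviour of the diff_dict helper used by
    # print_diff_summary: per-key difference, missing keys are 0
    out = {}
    for k in set(d1) | set(d2):
        out[k] = d1.get(k, 0) - d2.get(k, 0)
    return out

a = {'f': 3.0, 'g': 1.0}
b = {'f': 1.0}
d = diff_dict_sketch(a, b)
assert d == {'f': 2.0, 'g': 1.0}

# the "<% total fct time>" column of the summary line is then just
# this function's time over the sum of all functions' times
total_fct_time = sum(a.values())
share = a['f'] / total_fct_time * 100
assert abs(share - 75.0) < 1e-9
```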
"""Provide xscalar, xvector, xmatrix, etc. pseudo-types """Provide xscalar, xvector, xmatrix, etc. pseudo-types
""" """
import theano.config as config import theano.config as config
from theano.scalar import float32, float64 from theano.scalar import float64, float32
from theano.tensor import (fscalar, fvector, fmatrix, frow, fcol, ftensor3, ftensor4, dscalar, from theano.tensor import (fscalar, fvector, fmatrix, frow, fcol, ftensor3, ftensor4, dscalar,
dvector, dmatrix, drow, dcol, dtensor3, dtensor4) dvector, dmatrix, drow, dcol, dtensor3, dtensor4)
......
@@ -21,26 +21,6 @@ _msg_badlen = 'op.grad(...) returned wrong number of gradients'
def grad_sources_inputs(sources, graph_inputs, warn_type=True):
    """
    A gradient source is a pair (``r``, ``g_r``), in which ``r`` is a `Variable`, and ``g_r`` is a
    `Variable` that is a gradient wrt ``r``.
    This function traverses the graph backward from the ``r`` sources,
    calling ``op.grad(...)`` for all ops with some non-None gradient on an output.
    The ``op.grad(...)`` functions are called like this:
    .. code-block:: python
        op.grad(op.inputs[:], [total_gradient(v for v in op.outputs)])
    This call to ``op.grad`` should return a list or tuple: one symbolic gradient per input.
    If ``op`` has a single input, then ``op.grad`` should return a list or tuple of length 1.
    For each input with respect to which ``op`` is not differentiable, it should return ``None`` instead
    of a `Variable` instance.
    If a source ``r`` receives a gradient from another source ``r2``, then the effective
    gradient on ``r`` is the sum of both gradients.
    :type sources: list of pairs of Variable: (v, gradient-on-v)
    :param sources: gradients to back-propagate using chain rule
    :type graph_inputs: list of Variable
...
@@ -27,6 +27,6 @@ def test_no_shared_var_graph():
    f = theano.function([a, b], [a + b], mode=mode_with_gpu)
    l = f.maker.env.toposort()
    assert len(l) == 4
    assert numpy.any(isinstance(x.op, cuda.GpuElemwise) for x in l)
    assert numpy.any(isinstance(x.op, cuda.GpuFromHost) for x in l)
    assert numpy.any(isinstance(x.op, cuda.HostFromGpu) for x in l)
"""Provide Scan and related functions """Provide Scan and related functions
Scanning a function over sequential input(s) producing sequential output(s). Scanning a function over sequential input(s) producing sequential output(s).
Scanning is a general form of recurrence, which can be used for looping. Scanning is a general form of recurrence, which can be used for looping.
The idea is that you 'scan' a function along some input sequence, producing an output at each The idea is that you 'scan' a function along some input sequence, producing
time-step that can be seen (but not modified) by the function at the next time-step. an output at each time-step that can be seen (but not modified) by the
(Technically, the function can see the previous K time-steps.) function at the next time-step. (Technically, the function can see the
previous K time-steps.)
So for example, ``sum()`` could be computed by scanning the ``z+x_i`` function over a list, So for example, ``sum()`` could be computed by scanning the ``z+x_i``
given an initial state of ``z=0``. function over a list, given an initial state of ``z=0``.
Special cases: Special cases:
- A ``reduce()`` operation can be performed by returning only the last output of a scan. - A ``reduce()`` operation can be performed by returning only the last
output of a scan.
- A ``map()`` operation can be performed by applying a function that ignores each previous - A ``map()`` operation can be performed by applying a function that
output. ignores each previous output.
Often a for loop can be expressed as a scan() operation, and scan is the closest that theano Often a for loop can be expressed as a scan() operation, and scan is the
comes to looping. closest that theano comes to looping.
This module provides scanning functionality with the `Scan` Op. This module provides scanning functionality with the `Scan` Op.
""" """
__docformat__ = 'restructedtext en'
import traceback
import numpy
import theano
import theano.compile
from theano.tensor import opt
from theano import gof
from theano.compile import optdb
# Logging function for sending warning or info
import logging
_logger = logging.getLogger('theano.scan')
def warning(*msg):
    _logger.warning('WARNING theano.scan: ' + ' '.join(msg))
def info(*msg):
    _logger.info('INFO theano.scan: ' + ' '.join(msg))
# Hashing a list; lists used by scan are lists of numbers, therefore a list
# can be hashed by hashing all elements in the list
def hash_list(list):
    hash_value = 0
    for v in list:
        hash_value ^= v
    return hash_value
# Hashing a dictionary; the dictionary used by scan has numbers as keys and
# as values either numbers or lists of numbers
def hash_dict(dictionary):
    hash_value = 0
    for k, v in dictionary.iteritems():
        # hash key
        hash_value ^= k
        if type(v) in (list, tuple):
            hash_value ^= hash_list(v)
        else:
            hash_value ^= v
    return hash_value
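The XOR-folding used by the `hash_list`/`hash_dict` helpers above makes the hash independent of iteration order, which is what you want when hashing taps dictionaries. A self-contained sketch (Python 3 `dict.items()` in place of the original's `iteritems()`; the `_sketch` names are illustrative):

```python
def hash_list_sketch(values):
    # XOR-fold, as in the hash_list helper above
    h = 0
    for v in values:
        h ^= v
    return h

def hash_dict_sketch(d):
    # keys are numbers; values are numbers or lists of numbers
    h = 0
    for k, v in d.items():
        h ^= k
        h ^= hash_list_sketch(v) if type(v) in (list, tuple) else v
    return h

# XOR-folding is commutative, so equal taps dictionaries hash equally
# regardless of element or iteration order
assert hash_list_sketch([1, 2, 3]) == hash_list_sketch([3, 2, 1])
assert hash_dict_sketch({0: [0], 1: -1}) == hash_dict_sketch({1: -1, 0: [0]})
```

Note that XOR-folding also means duplicate entries cancel out; that is acceptable here because taps dictionaries do not repeat keys.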
def scan(fn, sequences, non_sequences, seed_values, inplace_map={},
         sequences_taps={}, outputs_taps={},
         len=theano.tensor.zero(), force_gradient=False,
         truncate_gradient=-1, go_backwards=False, mode='FAST_RUN'):
    '''The function creates a more intuitive interface to the scan op.
    This function first creates a scan op object, and afterwards applies it
    to the input data. The scan operation iterates over X sequences producing
    Y outputs. The function that is applied recursively may consult several
    previous outputs from the past as well as past values and future values
    of the input. You can see it as having the inputs:
        X sequence inputs x_1, x_2, .. x_X
        Y seeds/initial values (u_1, u_2, .. u_Y) for the outputs
        W non-sequence inputs w_1, w_2, .. w_W
    Outputs:
        Y sequence outputs y_1, y_2, .. y_Y
    Each output y_j is computed one time step at a time according to the
    formula:
    .. code-block:: python
        (y_1[t], y_2[t], .. y_Y[t]) = f(
            x_1[t-K_1], .. x_1[t], x_1[t+1], .. x_1[t+L_1],  # x_1 past and future values
            x_2[t-K_2], .. x_2[t], x_2[t+1], .. x_2[t+L_2],  # x_2 past and future values
            ...,                                             # ...
            y_1[t-1], y_1[t-2], .. y_1[t-T_1],  # past values of y_1
            y_2[t-1], y_2[t-2], .. y_2[t-T_2],  # past values of y_2
            ...,
            w_1, w_2, .., w_W)  # 'timeless' inputs
    :param fn: a lambda expression or a function that, given a list of
        symbolic inputs, returns the update list and symbolic outputs list of
        the function that shall be applied recursively.
    :param sequences: list of sequences over which the scan op should iterate;
        a sequence's length should also cover its past and future taps; for
        example, if you use the past tap -3 and the future tap +4 for a
        sequence, its total length should be n+7, where the first 3 values of
        the sequence are those corresponding to -3 -2 -1 and the last 4 values
        correspond to n+1 n+2 n+3 and n+4
    :param non_sequences: list of inputs over which it shouldn't iterate
    :param seed_values: seeds (initial values) of the outputs; if past taps
        are used, these seeds should contain enough values to cover those past
        values; note that index 0 of a seed belongs to the largest past tap
    :param inplace_map: a dictionary telling which output should be
        computed in place of which input sequence; the input sequence has to
        be of the same shape as the output
    :param sequences_taps: a dictionary telling for each sequence what past
        and future taps it should use; past taps should be negative, future
        taps positive; by default 0 (the current value) is added to this
        dictionary if nothing is provided
    :param outputs_taps: a dictionary telling for each output what past
        taps it should use (negative values); by default -1 is added to this
        dictionary if nothing is provided
    :param len: a value (or theano scalar) describing for how many steps
        the scan should iterate; 0 means that it should iterate over the
        entire length of the input sequence(s)
    :param force_gradient: a flag telling the scan op that the gradient can
        be computed even though inplace or updates are used - use this at
        your own risk
    :param truncate_gradient: tells for how many steps scan should go
        back in time on the backward pass of backpropagation through time
    :param go_backwards: a flag indicating if scan should iterate back from
        the end of the sequence to the beginning (if it is true) or from 0 to
        the end
    :param mode: indicates the mode that should be used to compile the
        function that will be applied recursively
    '''
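The `sequences` parameter doc above requires the caller to pad each sequence so its taps never index out of range (past tap -3 and future tap +4 means total length n+7). The arithmetic can be sketched as follows (plain Python; `padded_length_sketch` is an illustrative helper, not part of the API):

```python
def padded_length_sketch(n_steps, taps):
    """How long a sequence must be to scan n_steps steps while
    honouring its past (negative) and future (positive) taps."""
    past = -min(taps + [0])    # e.g. tap -3 needs 3 leading values
    future = max(taps + [0])   # e.g. tap +4 needs 4 trailing values
    return n_steps + past + future

# past tap -3 and future tap +4 -> total length n + 7
assert padded_length_sketch(10, [-3, 0, 4]) == 17
# only the current value (tap 0) -> no padding needed
assert padded_length_sketch(5, [0]) == 5
```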
    # check if inputs are just single variables instead of lists
    if not (type(sequences) in (list, tuple)):
        seqs = [sequences]
    else:
        seqs = sequences
    if not (type(seed_values) in (list, tuple)):
        seeds = [seed_values]
    else:
        seeds = seed_values
    if not (type(non_sequences) in (list, tuple)):
        non_seqs = [non_sequences]
    else:
        non_seqs = non_sequences
    # compute number of sequences and number of seeds
    n_seqs = len(seqs)
    # see if there are outputs that do not feed anything back to the function
    # applied recursively
    outs_tapkeys = outputs_taps.keys()
    outs_tapkeys.sort()
    for k in outs_tapkeys:
        if outputs_taps[k] == []:
            # add empty lists where you have outputs that do not have past
            # values
            seeds = seeds[:k] + [[]] + seeds[k:]
    n_seeds = len(seeds)
    # update sequences_taps[idx] to contain 0 if it is not defined
    for i in xrange(n_seqs):
        if not sequences_taps.has_key(i):
            sequences_taps.update({i: [0]})
        # if input sequence is not actually used by the recursive function
        elif sequences_taps[i] == []:
            sequences_taps.__delitem__(i)
        elif not (type(sequences_taps[i]) in (list, tuple)):
            sequences_taps[i] = [sequences_taps[i]]
    # update outputs_taps[idx] to contain -1 if it is not defined
    for i in xrange(n_seeds):
        if not outputs_taps.has_key(i):
            outputs_taps.update({i: [-1]})
        # if output sequence is not actually used as input to the recursive
        # function
        elif outputs_taps[i] == []:
            outputs_taps.__delitem__(i)
        elif not (type(outputs_taps[i]) in (list, tuple)):
            outputs_taps[i] = [outputs_taps[i]]
    # create theano inputs for the recursive function
    args = []
    for (i, seq) in enumerate(seqs):
        if sequences_taps.has_key(i):
            for k in xrange(len(sequences_taps[i])):
                args += [seq[0].type()]
    for (i, seed) in enumerate(seeds):
        if outputs_taps.has_key(i):
            for k in xrange(len(outputs_taps[i])):
                args += [seed[0].type()]
    args += non_seqs
    next_outs, updates = fn(*args)
    # Create the Scan op object
    local_op = Scan((args, next_outs, updates), n_seqs, n_seeds, inplace_map,
                    sequences_taps, outputs_taps, force_gradient,
                    truncate_gradient, go_backwards, mode)
    # Call the object on the input sequences, seeds, and non sequences
    return local_op(*([theano.tensor.as_tensor(len)]
                      + seqs
                      + seeds
                      + non_seqs))
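The taps-normalization loops above follow one pattern for both dictionaries: fill in a default tap for missing indices, drop entries whose tap list is empty (the sequence/output is unused), and wrap scalar taps into lists. A Python 3 sketch of that pattern (`normalize_taps_sketch` is an illustrative name; the real loops mutate the caller's dictionaries in place):

```python
def normalize_taps_sketch(n, taps, default):
    """Normalize a taps dictionary over indices 0..n-1: apply the
    default for missing entries, delete empty entries, and wrap
    scalar taps into single-element lists."""
    taps = dict(taps)                 # work on a copy
    for i in range(n):                # xrange in the Python 2 original
        if i not in taps:
            taps[i] = list(default)   # e.g. [0] for sequences, [-1] for outputs
        elif taps[i] == []:
            del taps[i]               # index i is not actually used
        elif type(taps[i]) not in (list, tuple):
            taps[i] = [taps[i]]       # scalar tap -> one-element list
    return taps

# index 0 has a scalar tap, index 1 is missing, index 2 is unused
assert normalize_taps_sketch(3, {0: -2, 2: []}, [0]) == {0: [-2], 1: [0]}
```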
''' The class implementing the scan op
The actual class. I would not recommend using it directly unless you really
know what you are doing.
'''
class Scan(theano.Op):
    def __init__(self, (inputs, outputs, updates), n_seqs, n_seeds,
                 inplace_map={}, seqs_taps={}, outs_taps={},
                 force_gradient=False, truncate_gradient=-1,
                 go_backwards=False, mode='FAST_RUN', inplace=False):
        '''
        :param inputs: list of symbolic inputs of the function that will
            be applied recursively
        :param outputs: list of symbolic outputs for the function applied
            recursively
        :param updates: list of updates for the function applied recursively
        :param n_seqs: number of sequences in the input over which it needs
            to iterate
        :param n_seeds: number of outputs (same as the number of seeds)
        :param inplace_map: dictionary describing which output should be
            computed in place of which input
        :param seqs_taps: dictionary describing which past and future taps
            of the input sequences are used by the recursive function
        :param outs_taps: dictionary describing which past taps of the
            outputs the recursive function is using
        :param force_gradient: a flag indicating if the gradient is still
            computable even though inplace operations or updates are used
        :param truncate_gradient: if different from -1 it tells after how
            many steps to truncate the backward pass of BPTT
        '''
        # check inplace map
        for _out, _in in inplace_map.iteritems():
            if _out > n_seeds:
                raise ValueError(('Inplace map refers to an unexisting '
                                  'output %d') % _out)
            if _in > n_seqs:
                raise ValueError(('Inplace map refers to an unexisting '
                                  'input sequence %d') % _in)
            if (_in >= 0) and (min(seqs_taps[_in]) < 0):
                raise ValueError(('Input sequence %d uses past values that '
                                  'will be overwritten by inplace operation') % _in)
        # check sequences past taps
        for k, v in seqs_taps.iteritems():
            if k > n_seqs:
                raise ValueError(('Sequences past taps dictionary refers to '
                                  'an unexisting sequence %d') % k)
        # check outputs past taps
        for k, v in outs_taps.iteritems():
            if k > n_seeds:
                raise ValueError(('Outputs past taps dictionary refers to '
                                  'an unexisting output %d') % k)
            if max(v) > -1:
                raise ValueError(('Can not require future value %d of output '
                                  '%d') % (max(v), k))
    self.destroy_map = {}
    if inplace:
        for i in xrange(n_inplace):
            self.destroy_map.update({i: [i]})
    for (k, v) in taps.iteritems():
        if k < 0 or k > n_outs:
            raise ValueError('Taps dictionary contains wrong key!')
        for vi in v:
            # why is it illegal to specify vi < 2?
            # what is special about vi == 1?
            #
            # Would it be simpler to just leave v alone if it is non-empty
            # (checking that all vi are >= 1) and set v = [1] for all
            # missing output keys?
            if vi < 2:
                raise ValueError('Taps dictionary contains wrong values!')
    self.taps = taps
    self.n_ins = n_ins
    self.n_outs = n_outs
    self.n_inplace = n_inplace
    self.inplace = inplace
    self.n_inplace_ignore = n_inplace_ignore
    self.fn = fn
    self.grad_fn = grad_fn

    self.destroy_map = {}
    if inplace:
        self.destroy_map = inplace_map
    self.seqs_taps = seqs_taps
    self.outs_taps = outs_taps
    self.n_seqs = n_seqs
    self.n_seeds = n_seeds
    self.n_args = n_seqs + n_seeds + 1
    self.inplace_map = inplace_map
    self.inplace = inplace
    self.inputs = inputs
    self.outputs = outputs
    self.updates = updates
    self.force_gradient = force_gradient
    self.truncate_gradient = truncate_gradient
    self.go_backwards = go_backwards
    self.fn = theano.function(inputs, outputs,
                              updates=updates, mode=mode)
    g_y = [outputs[0].type()]
    g_args = theano.tensor.grad(outputs[0], inputs, g_cost=g_y[-1])
    # for all outputs compute gradients and then sum them up
    for y in outputs[1:]:
        g_y += [y.type()]
        g_args_y = theano.tensor.grad(y, inputs, g_cost=g_y[-1])
        for i in xrange(len(g_args)):
            g_args[i] += g_args_y[i]
    self.g_ins = g_y + inputs
    self.g_outs = g_args

def make_node(self, *inputs):
    """Create a node for the Scan operation

    :param inputs: list of inputs for the operation; there should be
    at least 'self.n_ins'+'self.n_outs' arguments; the first
    'self.n_inplace' are inputs that are replaced inplace, followed
    by other inputs up to 'self.n_ins'; the next 'self.n_outs' are
    outputs, followed by the other arguments that will be given to
    the function applied recursively
    """
    n_args = len(inputs)
    min_n_args = self.n_ins + self.n_outs
    if n_args < min_n_args:
        err = 'There should be at least ' + str(min_n_args) + ' arguments'
        raise ValueError(err)
    # Create list of output datatypes
    out_types = []
    for i in xrange(self.n_ins, self.n_ins + self.n_outs):
        out_types += [theano.tensor.Tensor(dtype=inputs[i].dtype,
            broadcastable=(False,) + inputs[i].broadcastable[1:])()]
    return theano.Apply(self, inputs, out_types)
def make_node(self,*inputs):
n_args = len(inputs)
if n_args < self.n_args :
        err = 'There should be at least ' + str(self.n_args) + ' arguments'
raise ValueError(err)
# Create list of output datatypes
out_types = []
for i in xrange(self.n_seqs+1, self.n_seqs+self.n_seeds+1):
out_types += [theano.tensor.Tensor(dtype=inputs[i].dtype,\
broadcastable=(False,)+inputs[i].broadcastable[1:])()]
return theano.Apply(self,inputs, out_types)
def __eq__(self, other):
    rval = type(self) == type(other)
    if rval:
        rval = (self.fn is other.fn) and \
            (self.grad_fn is other.grad_fn) and \
            (self.n_ins == other.n_ins) and \
            (self.n_outs == other.n_outs) and \
            (self.n_inplace == other.n_inplace) and \
            (self.n_inplace_ignore == other.n_inplace_ignore) and \
            (self.inplace == other.inplace) and \
            (self.taps == other.taps)
    return rval

def __eq__(self, other):
    rval = type(self) == type(other)
    if rval:
        rval = (self.inputs == other.inputs) and \
            (self.outputs == other.outputs) and \
            (self.updates == other.updates) and \
            (self.g_ins == other.g_ins) and \
            (self.g_outs == other.g_outs) and \
            (self.seqs_taps == other.seqs_taps) and \
            (self.outs_taps == other.outs_taps) and \
            (self.inplace_map == other.inplace_map) and \
            (self.n_seqs == other.n_seqs) and \
            (self.inplace == other.inplace) and \
            (self.go_backwards == other.go_backwards) and \
            (self.truncate_gradient == other.truncate_gradient) and \
            (self.force_gradient == other.force_gradient) and \
            (self.n_seeds == other.n_seeds) and \
            (self.n_args == other.n_args)
    return rval
def __hash__(self):
    # hash the taps dictionary
    taps_hash = 0
    for k, v in self.taps.iteritems():
        taps_hash ^= k
        for vi in v:
            taps_hash ^= vi
    return hash(type(self)) ^ \
        hash(self.fn) ^ \
        hash(self.grad_fn) ^ \
        hash(self.n_ins) ^ \
        hash(self.n_outs) ^ \
        hash(self.n_inplace) ^ \
        hash(self.n_inplace_ignore) ^ \
        hash(self.inplace) ^ \
        taps_hash

def __hash__(self):
    return hash(type(self)) ^ \
        hash(self.n_seqs) ^ \
        hash(self.n_seeds) ^ \
        hash(self.force_gradient) ^ \
        hash(self.inplace) ^ \
        hash(self.go_backwards) ^ \
        hash(self.truncate_gradient) ^ \
        hash(self.n_args) ^ \
        hash_list(self.outputs) ^ \
        hash_list(self.inputs) ^ \
        hash_list(self.g_ins) ^ \
        hash_list(self.g_outs) ^ \
        hash_dict(self.seqs_taps) ^ \
        hash_dict(self.outs_taps) ^ \
        hash_dict(self.inplace_map) ^ \
        hash_dict(self.updates)
def grad(self, inputs, g_outs):
    if self.grad_fn is None:
        print 'Warning! No gradient for the recursive function was given'
return [None for i in inputs]
else:
y = self(*inputs)
if not( type(y) in (list,tuple)):
y = [y]
for i in xrange(len(y)):
if g_outs[i] == None:
g_outs[i] = theano.tensor.zeros_like(y[i])
# Construct my gradient class:
gradScan = ScanGrad(self.grad_fn,
self.n_ins- self.n_inplace_ignore, self.n_outs,
self.taps)
args = g_outs + y + \
inputs[self.n_inplace_ignore:]
grads = gradScan(*args)
rval = [None for i in inputs[:self.n_inplace_ignore]]+grads
return rval
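The tap/initial-state indexing described in the docstring above can be sketched in pure numpy. This is a minimal illustration with a hypothetical step function and made-up values, not Theano code: one sequence `u`, one output `x` using taps `[1, 2]`, and an initial-state matrix `x0` whose row 0 holds x(-2) and row 1 holds x(-1), as the docstring prescribes.

```python
import numpy

# hypothetical step function: x(t) = u(t) + x(t-1) + x(t-2)
def step(u_t, x_tm1, x_tm2):
    return u_t + x_tm1 + x_tm2

u = numpy.array([1., 1., 1.])   # the input sequence
x0 = numpy.array([0., 1.])      # initial states: row 0 is x(-2), row 1 is x(-1)
taps = [1, 2]                   # the default tap 1 plus an extra tap 2
maxVal = max(taps)
x = numpy.empty(len(u))
for i in range(len(u)):
    past = []
    for tap in taps:
        if i - tap < 0:
            # too early: read from the initial-state matrix
            past += [x0[maxVal - tap + i]]
        else:
            # otherwise read the output already computed at time i - tap
            past += [x[i - tap]]
    x[i] = step(u[i], *past)
# x is now [2., 4., 7.]
```

The `x0[maxVal - tap + i]` index mirrors the docstring's ordering rule: with k delays, index 0 stands for time -k and index k-1 for time -1.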
def perform(self, node, args, outs):
    # find number of timesteps; note that a precondition is to have
    # at least one input to iterate over
    n_steps = len(args[0])
    # check if we deal with an inplace operation
    n_inplace = self.n_inplace
    n_inplace_ignore = self.n_inplace_ignore
    if not self.inplace:  # if it was not optimized to work inplace
        n_inplace = 0
    # check lengths of inputs
    for i in xrange(self.n_ins):
        if args[i].shape[0] != n_steps:
            raise ValueError('All inputs should have n_steps length!')
    # check lengths of initial states
    for i in xrange(self.n_ins, self.n_ins + self.n_outs):
        req_size = 1
        if self.taps.has_key(i - self.n_ins):
            req_size = max(self.taps[i - self.n_ins])
        if len(args[i].shape) == 0:
            raise ValueError('Wrong initial state!')
        if args[i].shape[0] < req_size:
            raise ValueError('Wrong initial state!')
    # allocate space for the outputs
    y = []
    # inplace outputs
    for i in xrange(n_inplace):
        y += [args[i]]
    # add outputs
    for i in xrange(self.n_ins + n_inplace, self.n_ins + self.n_outs):
        y_shape = (n_steps,) + args[i].shape[1:]
        y += [numpy.empty(y_shape, dtype=args[i].dtype)]
    # iterate
    for i in xrange(n_steps):
        fn_args = []
        # get a time slice of inputs
        for j in xrange(n_inplace_ignore, self.n_ins):
            fn_args += [args[j][i]]
        # get past values of outputs (t-1 + taps)
        for j in xrange(self.n_outs):
            # get list of taps
            ls_taps = [1]
            if self.taps.has_key(j):
                ls_taps += self.taps[j]
            maxVal = max(ls_taps)
            for tap_value in ls_taps:
                if i - tap_value < 0:
                    fn_args += [args[j + self.n_ins][maxVal - tap_value + i]]
                else:
                    fn_args += [y[j][i - tap_value]]
        # get the non-iterable parameters
        fn_args += list(args[(self.n_ins + self.n_outs):])
        # compute output
        something = self.fn(*fn_args)
        # update y and inplace outputs
        for j in xrange(self.n_outs):
            y[j][i] = something[j]
    # write to storage
    for i in xrange(self.n_outs):
        outs[i][0] = y[i]

def perform(self, node, args, outs):
    n_steps = 0
    if (self.n_seqs == 0) and (args[0] == 0):
        raise ValueError('Scan does not know over how many steps it '
                         'should iterate! No input sequence or number of '
                         'steps to iterate given!')
    if args[0] != 0:
        n_steps = args[0]
    for i in xrange(self.n_seqs):
        if self.seqs_taps.has_key(i):
            # compute actual length of the sequence (we need to see what
            # past taps this sequence has, and leave room for them)
            seq_len = args[i + 1].shape[0] + min(self.seqs_taps[i])
            if max(self.seqs_taps[i]) > 0:
                # using future values, so need to end the sequence earlier
                seq_len -= max(self.seqs_taps[i])
            if n_steps == 0:
                # length of the sequences, leaving room for the largest taps
                n_steps = seq_len
            if seq_len != n_steps:
                warning(('Input sequence %d has a shorter length than the '
                         'expected number of steps %d') % (i, n_steps))
                n_steps = min(seq_len, n_steps)
    # check if we deal with an inplace operation
    inplace_map = self.inplace_map
    if not self.inplace:  # if it was not optimized to work inplace
        inplace_map = {}
    # check lengths of seeds
    for i in xrange(self.n_seqs + 1,
                    self.n_seqs + self.n_seeds + 1):
        if self.outs_taps.has_key(i - self.n_seqs - 1):
            req_size = abs(min(self.outs_taps[i - self.n_seqs - 1])) - 1
            if args[i].shape[0] < req_size:
                warning(('Initial state for output %d has fewer values than '
                         'required by the maximal past value %d. Scan will '
                         'use 0s for missing values') % (i - self.n_seqs - 1,
                                                         req_size))
    self.n_steps = n_steps
    y = self.scan(self.fn, args[1:], self.n_seqs, self.n_seeds,
                  self.seqs_taps, self.outs_taps, n_steps,
                  self.go_backwards, inplace_map)
    # write to storage
    for i in xrange(self.n_seeds):
        outs[i][0] = y[i]
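The inplace bookkeeping above (the first `n_inplace` outputs reusing the input buffers) can be sketched in plain numpy. This is a made-up miniature, not the op itself: one inplace output running 3 steps of a doubling recurrence, writing directly into the dummy input's storage.

```python
import numpy

# made-up run: 3 steps of x(t) = 2 * x(t-1), with output 0 declared inplace
u = numpy.zeros(3)      # dummy input whose storage gets recycled
y = [u]                 # inplace: output 0 writes directly into u's buffer
x_prev = 2.0            # initial state
for t in range(3):
    x_prev = x_prev * 2.0
    y[0][t] = x_prev
# u now holds [4., 8., 16.], and y[0] is the very same buffer
```

This is why the op sets `destroy_map`: the caller must know the input array is trashed and reused as output workspace.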
def scan(self, fn, args, n_seqs, n_seeds, seqs_taps, outs_taps, n_steps,
         go_backwards, inplace_map):
    y = []
    for i in xrange(n_seeds):
        if inplace_map.has_key(i) and (inplace_map[i] >= 0):
            y += [args[inplace_map[i]]]
        else:
            y_shape = (n_steps,) + args[i + n_seqs].shape[1:]
            y += [numpy.empty(y_shape,
                              dtype=args[i + n_seqs].dtype)]
    # iterate
    if go_backwards:
        the_range = xrange(n_steps - 1, -1, -1)
    else:
        the_range = xrange(n_steps)
    seqs_mins = {}
    for j in xrange(n_seqs):
        if seqs_taps.has_key(j):
            seqs_mins.update({j: min(seqs_taps[j])})
    outs_mins = {}
    seed_size = {}
    for j in xrange(n_seeds):
        if outs_taps.has_key(j):
            outs_mins.update({j: min(outs_taps[j])})
            seed_size.update({j: args[n_seqs + j].shape[0]})
    for i in the_range:
        fn_args = []
        # sequences over which scan iterates
        for j in xrange(n_seqs):
            if seqs_taps.has_key(j):
                ls_taps = seqs_taps[j]
                min_tap = seqs_mins[j]
                for tap_value in ls_taps:
                    k = i - min_tap + tap_value
                    fn_args += [args[j][k]]
        # seeds or past values of outputs
        for j in xrange(n_seeds):
            if outs_taps.has_key(j):
                ls_taps = outs_taps[j]
                min_tap = outs_mins[j]
                seed_sz = seed_size[j]
                for tap_value in ls_taps:
                    if i + tap_value < 0:
                        k = i + seed_sz + tap_value
                        if k < 0:
                            # past value not provided; issue a warning and use 0s
                            fn_args += [numpy.zeros(args[n_seqs + j][0].shape)]
                            warning('Past value %d for output %d not given '
                                    'in seeds' % (tap_value, j))
                        else:
                            fn_args += [args[n_seqs + j][k]]
                    else:
                        fn_args += [y[j][i + tap_value]]
        # get the non-iterable arguments
        fn_args += list(args[(n_seqs + n_seeds):])
        # compute output
        something = fn(*fn_args)
        # update outputs
        for j in xrange(n_seeds):
            y[j][i] = something[j]
    return y
def grad(self, args, g_outs):
    if (not self.force_gradient) and \
            ((self.updates.keys() != []) or (self.inplace_map.keys() != [])):
        warning('Cannot compute gradients if inplace or updates '
                'are used. Use force_gradient if you know for sure '
                'that the gradient can be computed automatically.')
        return [None for i in args]
    else:
        # forward pass
        y = self(*args)
        if not (type(y) in (list, tuple)):
            y = [y]
        # backwards pass
        for i in xrange(len(y)):
            if g_outs[i] is None:
                g_outs[i] = theano.tensor.zeros_like(y[i])
        g_args = [self.n_steps] + g_outs + y
        # check if go_backwards is true
        if self.go_backwards:
            for seq in args[1:self.n_seqs + 1]:
                g_args += [seq[::-1]]
        else:
            g_args += args[1:self.n_seqs + 1]
        g_args += args[1 + self.n_seqs:]
        g_scan = ScanGrad((self.g_ins, self.g_outs), self.n_seqs,
                          self.n_seeds, self.seqs_taps, self.outs_taps,
                          self.truncate_gradient)
        return g_scan(*g_args)
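The `truncate_gradient` mechanism handed to ScanGrad above amounts to truncated backprop-through-time: the backward loop simply stops `truncate` steps back instead of going all the way to step 0. A minimal scalar sketch, under assumed dynamics x_t = w * x_{t-1} and loss L = x_T (a made-up example, not the op's code):

```python
# gradient of L = x_T for the recurrence x_t = w * x_{t-1},
# optionally truncated to the last `truncate` steps
def bptt_dw(w, x0, n_steps, truncate=-1):
    xs = [x0]
    for _ in range(n_steps):          # forward pass
        xs.append(w * xs[-1])
    g_x = 1.0                         # dL/dx_T
    g_w = 0.0
    lower = n_steps - truncate if truncate > 0 else 0
    for t in range(n_steps, lower, -1):   # backward pass, possibly truncated
        g_w += g_x * xs[t - 1]        # dx_t/dw = x_{t-1}
        g_x *= w                      # dx_t/dx_{t-1} = w
    return g_w
```

With w=2, x0=1 and 3 steps, the full gradient is d(w^3)/dw = 3w^2 = 12, while truncating to the last step keeps only the final term.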
@gof.local_optimizer([None])
def scan_make_inplace(node):
    op = node.op
    if isinstance(op, Scan) and (not op.inplace) and (op.n_inplace > 0):
        return Scan(op.fn, op.grad_fn, op.n_ins,
                    op.n_outs, op.n_inplace, op.n_inplace_ignore,
                    op.taps, inplace=True
                    ).make_node(*node.inputs).outputs
    return False

@gof.local_optimizer([None])
def scan_make_inplace(node):
    op = node.op
    if isinstance(op, Scan) and (not op.inplace) \
            and (op.inplace_map.keys() != []):
        return Scan((op.inputs, op.outputs, op.updates), op.n_seqs,
                    op.n_seeds, op.inplace_map, op.seqs_taps, op.outs_taps,
                    op.force_gradient, op.truncate_gradient,
                    op.go_backwards, inplace=True
                    ).make_node(*node.inputs).outputs
    return False

optdb.register('scan_make_inplace', opt.in2out(scan_make_inplace,
               ignore_newtrees=True), 75, 'fast_run', 'inplace')
class ScanGrad(theano.Op):
    """Gradient Op for Scan"""
    def __init__(self, grad_fn, n_ins, n_outs,
                 taps={}, inplace=False):
        self.grad_fn = grad_fn
        self.n_ins = n_ins    # number of inputs of Scan op not of Grad Scan !!
        self.n_outs = n_outs  # number of outs of Scan op not of Grad Scan !!
        self.inplace = inplace
        self.taps = taps
        self.destroy_map = {}
        if self.inplace:
            for i in xrange(self.n_outs):
                # claiming that output "-i" is destroying inputs is the way to
                # declare that no real output is aliased to any inputs. We just
                # trash the inputs by using them as workspace.
                self.destroy_map.update({-i: [i]})

    def __init__(self, (g_ins, g_outs), n_seqs, n_outs,
                 seqs_taps={}, outs_taps={}, truncate_gradient=-1):
        self.grad_fn = theano.function(g_ins, g_outs)
        self.inputs = g_ins
        self.outputs = g_outs
        self.n_seqs = n_seqs
        self.truncate_gradient = truncate_gradient
        self.n_outs = n_outs
        self.seqs_taps = seqs_taps
        self.outs_taps = outs_taps
        self.destroy_map = {}
    def __eq__(self, other):
        rval = type(self) == type(other)
        if rval:
            rval = (self.grad_fn is other.grad_fn) and \
                (self.n_ins == other.n_ins) and \
                (self.n_outs == other.n_outs) and \
                (self.inplace == other.inplace) and \
                (self.taps == other.taps)
        return rval

    def __eq__(self, other):
        rval = type(self) == type(other)
        if rval:
            rval = (self.inputs == other.inputs) and \
                (self.outputs == other.outputs) and \
                (self.n_seqs == other.n_seqs) and \
                (self.n_outs == other.n_outs) and \
                (self.truncate_gradient == other.truncate_gradient) and \
                (self.seqs_taps == other.seqs_taps) and \
                (self.outs_taps == other.outs_taps)
        return rval
    def __hash__(self):
        taps_hash = 0
        for k, v in self.taps.iteritems():
            taps_hash ^= k
            for vi in v:
                taps_hash ^= vi
        return hash(type(self)) ^ \
            hash(self.grad_fn) ^ \
            hash(self.n_ins) ^ \
            hash(self.n_outs) ^ \
            hash(self.inplace) ^ taps_hash

    def __hash__(self):
        return hash(type(self)) ^ \
            hash(self.n_seqs) ^ \
            hash(self.n_outs) ^ \
            hash(self.truncate_gradient) ^ \
            hash_list(self.inputs) ^ \
            hash_list(self.outputs) ^ \
            hash_dict(self.seqs_taps) ^ \
            hash_dict(self.outs_taps)
    def make_node(self, *args):
        # input of the gradient op :
        #    | g_outs | y      | ins   | outs   | other_args |
        #    | n_outs | n_outs | n_ins | n_outs | unknown    |
        # return
        #    | grad of ins | grad of outs | grad of other_args |
        #    | n_ins       | n_outs       | unknown            |
        return theano.Apply(self, list(args),
                            [i.type() for i in args[self.n_outs + self.n_outs:]])

    def make_node(self, *args):
        # input of the gradient op :
        #    | g_outs | y      | seqs   | outs   | non_seqs |
        #    | n_outs | n_outs | n_seqs | n_outs | unknown  |
        # return
        #    | grad of seqs | grad of outs | grad of non_seqs |
        #    | n_seqs       | n_outs       | unknown          |
        return theano.Apply(self, list(args),
                            [i.type() for i in args[1 + 2 * self.n_outs:]])
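The argument-layout comments above can be made concrete with a tiny slicing sketch. The counts and names here are made up (n_outs=1, n_seqs=2, two non-sequence weights); the point is only how one flat argument list is partitioned:

```python
# made-up sizes: n_outs = 1, n_seqs = 2, two non-sequence arguments
n_outs, n_seqs = 1, 2
args = ['g_out0', 'y0', 'seq0', 'seq1', 'seed0', 'w0', 'w1']
g_outs = args[:n_outs]              # gradients flowing in from above
y = args[n_outs:2 * n_outs]         # outputs of the forward scan
rest = args[2 * n_outs:]
seqs = rest[:n_seqs]                # the iterated sequences
seeds = rest[n_seqs:n_seqs + n_outs]    # initial states
non_seqs = rest[n_seqs + n_outs:]   # everything else (weights, etc.)
```

Note that the newer interface additionally prepends `n_steps` at position 0, which is why `perform` below starts its slicing at `2*self.n_outs + 1`.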
    def perform(self, node, args, storage):
        # get scan inputs
        inputs = args[self.n_outs + self.n_outs:]
        ins = inputs[:self.n_ins]
        initSt = inputs[self.n_ins:self.n_ins + self.n_outs]
        otherArgs = inputs[self.n_outs + self.n_ins:]
        # generate space for gradient
        # not done if inplace !?
        g_ins = [numpy.zeros_like(k) for k in ins]
        g_initSt = [numpy.zeros_like(k) for k in initSt]
        g_otherArgs = [numpy.zeros_like(k) for k in otherArgs]
        # get gradient from above
        g_outs = args[:self.n_outs]
        # we modify g_outs inplace ..
        if not self.inplace:
            g_outs = [gout.copy() for gout in g_outs]
        # get the output of the scan operation
        outs = args[self.n_outs:2 * self.n_outs]
        # check for Nones (non-differentiable)
        # for i, g_o in enumerate(g_outs):
        #     if numpy.all(g_o == 0.):
        #         g_outs[i] = numpy.zeros_like(outs[i])
        # go back through time to 0 (use a time window !?)
        for i in xrange(len(ins[0]) - 1, -1, -1):
            # time slice of inputs
            _ins = [arg[i] for arg in ins]
            # time slice of outputs + taps
            _outs = []
            for j in xrange(self.n_outs):
                ls_taps = [1]
                if self.taps.has_key(j):
                    ls_taps += self.taps[j]
                maxVal = max(ls_taps)
                for tap_value in ls_taps:
                    if i - tap_value < 0:
                        _outs += [initSt[j][maxVal - tap_value + i]]
                    else:
                        _outs += [outs[j][i - tap_value]]
            g_out = [arg[i] for arg in g_outs]
            grad_args = g_out + _ins + _outs + otherArgs
            grads = self.grad_fn(*grad_args)
            # get gradient for inputs
            for j in xrange(self.n_ins):
                g_ins[j][i] = grads[j]
            # get gradient for outputs
            pos = self.n_ins
            for j in xrange(self.n_outs):
                ls_taps = [1]
                if self.taps.has_key(j):
                    ls_taps += self.taps[j]
                maxVal = max(ls_taps)
                for tap_value in ls_taps:
                    if i - tap_value < 0:
                        g_initSt[j][maxVal - tap_value + i] += grads[pos]
                        pos += 1
                    else:
                        g_outs[j][i - tap_value] += grads[pos]
                        pos += 1
            for j in xrange(len(g_otherArgs)):
                g_otherArgs[j] += grads[j + pos]
        # return the gradient
        for i in xrange(len(g_ins)):
            storage[i][0] = g_ins[i]
        for i in xrange(len(g_initSt)):
            storage[i + self.n_ins][0] = g_initSt[i]
        for i in xrange(len(g_otherArgs)):
            storage[i + self.n_ins + self.n_outs][0] = g_otherArgs[i]

    def perform(self, node, args, storage):
        # get scan inputs
        n_steps = args[0]
        inputs = args[2 * self.n_outs + 1:]
        seqs = inputs[:self.n_seqs]
        seeds = inputs[self.n_seqs:self.n_seqs + self.n_outs]
        non_seqs = inputs[self.n_outs + self.n_seqs:]
        # generate space for gradient
        g_seqs = [numpy.zeros_like(k) for k in seqs]
        g_seeds = [numpy.zeros_like(k) for k in seeds]
        g_non_seqs = [numpy.zeros_like(k) for k in non_seqs]
        # get gradient from above
        g_outs = args[:self.n_outs]
        # get the output of the scan operation
        outs = args[self.n_outs:2 * self.n_outs]
        # go back through time to 0 or n_steps - truncate_gradient
        lower_limit = n_steps - self.truncate_gradient
        if lower_limit > n_steps - 1:
            the_range = xrange(n_steps - 1, -1, -1)
        elif lower_limit < -1:
            the_range = xrange(n_steps - 1, -1, -1)
        else:
            the_range = xrange(n_steps - 1, lower_limit, -1)
        seqs_mins = {}
        for j in xrange(self.n_seqs):
            if self.seqs_taps.has_key(j):
                seqs_mins.update({j: min(self.seqs_taps[j])})
        outs_mins = {}
        seed_size = {}
        for j in xrange(self.n_outs):
            if self.outs_taps.has_key(j):
                outs_mins.update({j: min(self.outs_taps[j])})
                seed_size.update({j: g_seeds[j].shape[0]})
        for i in the_range:
            # time slice of inputs
            _ins = []
            for j in xrange(self.n_seqs):
                if self.seqs_taps.has_key(j):
                    ls_taps = self.seqs_taps[j]
                    min_tap = seqs_mins[j]
                    for tap_value in ls_taps:
                        k = i - min_tap + tap_value
                        _ins += [seqs[j][k]]
            # time slice of outputs + taps
            _outs = []
            for j in xrange(self.n_outs):
                if self.outs_taps.has_key(j):
                    ls_taps = self.outs_taps[j]
                    min_tap = outs_mins[j]
                    seed_sz = seed_size[j]
                    for tap_value in ls_taps:
                        if i + tap_value < 0:
                            k = i + seed_sz + tap_value
                            if k < 0:
                                # past value not provided; issue a warning and use 0
                                _outs += [numpy.zeros(seeds[j][0].shape)]
                                warning('Past value %d for output %d not given'
                                        % (tap_value, j))
                            else:
                                _outs += [seeds[j][k]]
                        else:
                            _outs += [outs[j][i + tap_value]]
            g_out = [arg[i] for arg in g_outs]
            grad_args = g_out + _ins + _outs + non_seqs
            grads = self.grad_fn(*grad_args)
            # get gradient for inputs
            pos = 0
            for j in xrange(self.n_seqs):
                if self.seqs_taps.has_key(j):
                    ls_taps = self.seqs_taps[j]
                    min_tap = seqs_mins[j]
                    for tap_value in ls_taps:
                        k = i - min_tap + tap_value
                        g_seqs[j][k] += grads[pos]
                        pos += 1
            # get gradient for outputs
            for j in xrange(self.n_outs):
                if self.outs_taps.has_key(j):
                    ls_taps = self.outs_taps[j]
                    min_tap = outs_mins[j]
                    seed_sz = seed_size[j]
                    for tap_value in ls_taps:
                        if i + tap_value < 0:
                            k = i + seed_sz + tap_value
                            if k >= 0:
                                g_seeds[j][k] += grads[pos]
                            pos += 1
            for j in xrange(len(g_non_seqs)):
                g_non_seqs[j] += grads[j + pos]
        # return the gradient
        for i, v in enumerate(g_seqs + g_seeds + g_non_seqs):
            storage[i][0] = v
@gof.local_optimizer([None])
def grad_scan_make_inplace(node):
op = node.op
if isinstance(op, ScanGrad) and (not op.inplace):
return ScanGrad(op.grad_fn, op.n_ins, op.n_outs, op.taps,
inplace=True).make_node(*node.inputs).outputs
return False
optdb.register('grad_scan_make_inplace', opt.in2out(grad_scan_make_inplace,\
ignore_newtrees=True), 75, 'fast_run', 'inplace')
import numpy
from theano import gof, tensor
import unittest
class Solve(gof.Op):
    """
    Find the solution to the linear equation Ax=b,
    where A is a 2d matrix and b is a 1d or 2d matrix.
    It uses numpy.linalg.solve to find the solution.
    """
    def make_node(self, A, b):
        if not isinstance(A, gof.Variable) or not A.type == tensor.matrix().type:
            raise TypeError("We expected that A had a matrix type")
        if not isinstance(b, gof.Variable) or not b.type == tensor.matrix().type:
            raise TypeError("We expected that b had a matrix type")
        node = gof.Apply(op=self, inputs=[A, b], outputs=[tensor.matrix()])
        return node

    def perform(self, node, (A, b), (output, )):
        ret = numpy.linalg.solve(A, b)
        output[0] = ret
def grad(self, (theta, A, B), (gtheta,)):
raise NotImplementedError()
solve = Solve()
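The numeric core of `Solve.perform` is just `numpy.linalg.solve`. A minimal standalone sketch with made-up values, mirroring what the test below checks (that A x reproduces b):

```python
import numpy

# solve the 2x2 system  3x + y = 9,  x + 2y = 8
A = numpy.array([[3., 1.],
                 [1., 2.]])
b = numpy.array([9., 8.])
x = numpy.linalg.solve(A, b)    # here x is [2., 3.]
# A x reproduces b up to floating-point error
assert numpy.allclose(numpy.dot(A, x), b)
```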
class T_solve(unittest.TestCase):
def setUp(self):
self.rng = numpy.random.RandomState(utt.fetch_seed(666))
def test0(self):
A=self.rng.randn(5,5)
b=numpy.array(range(5),dtype=float)
x=numpy.linalg.solve(A,b)
Ax = numpy.dot(A,x)
are = tensor.numeric_grad.abs_rel_err(Ax, b)
self.failUnless(numpy.all(are < 1.0e-5), (are, Ax, b))
#print A,b
#print numpy.dot(A,x)
import numpy.random
from theano.tests import unittest_tools as utt

def verify_grad(op, pt, n_tests=2, rng=None, eps=None, tol=None,
                mode=None, cast_to_output_type=False):
    pt = [numpy.array(p) for p in pt]
class T_Scan(unittest.TestCase):
def setUp(self):
utt.seed_rng()
x_1 = theano.tensor.dscalar('x_1')
self.my_f = theano.function([x_1],[x_1]) #dummy function
# Naming convention :
# u_1,u_2,.. -> inputs, arrays to iterate over
# x_1,x_2,.. -> outputs at t-1 that are required in the recurrent
# computation
# iu_1,iu_2,.. -> inplace inputs, inputs that are being replaced by
# outputs during computation
# du_1,du_2,.. -> dummy inputs used to do inplace computation, they
# are not passed to my_f
# ix_1,ix_2,.. -> inplace outputs at t-1
# x_1_next,.. -> outputs at t
# ix_1_next,.. -> inplace outputs at time t
# w_1,w_2,.. -> weights, parameters over which scan does not iterate
# my_f -> compiled function that will be applied recurrently
# my_op -> operator class
# final_f -> compiled function that applies the Scan operation
# out_1,.. -> outputs of the Scan operation
###################################################################
def test_numberOfIterableInputs(self):
def t1():
my_op = Scan.compiled(self.my_f,-1,1)
def t2():
my_op = Scan.compiled(self.my_f,0,1)
self.failUnlessRaises(ValueError,t1)
self.failUnlessRaises(ValueError,t2)
###################################################################
def test_numberOfOutputs(self):
def t1():
my_op = Scan.compiled(self.my_f,1,-1)
def t2():
my_op = Scan.compiled(self.my_f,1,0)
self.failUnlessRaises(ValueError,t1)
self.failUnlessRaises(ValueError,t2)
#####################################################################
def test_numberOfInplaceOutputs(self):
def t1():
my_op =Scan.compiled(self.my_f,1,1,n_inplace = -1)
def t2():
my_op =Scan.compiled(self.my_f,1,1,n_inplace = 2)
def t3():
my_op =Scan.compiled(self.my_f,2,1,n_inplace=2)
def t4():
my_op =Scan.compiled(self.my_f,1,2,n_inplace=2)
def t5():
my_op =Scan.compiled(self.my_f,1,1,n_inplace=1,n_inplace_ignore=2)
self.failUnlessRaises(ValueError,t1)
self.failUnlessRaises(ValueError,t2)
self.failUnlessRaises(ValueError,t3)
self.failUnlessRaises(ValueError,t4)
self.failUnlessRaises(ValueError,t5)
#####################################################################
def test_taps(self):
def t1():
my_op = Scan.compiled(self.my_f,1,1, taps={2:[3]})
def t2():
my_op = Scan.compiled(self.my_f,1,2, taps={0:[0]})
def t3():
my_op = Scan.compiled(self.my_f,1,2, taps={0:[1]})
self.failUnlessRaises(ValueError,t1)
self.failUnlessRaises(ValueError,t2)
self.failUnlessRaises(ValueError,t3)
#####################################################################
def test_makeNode(self):
def t1():
######### Test inputs of different lengths
# define the function that is applied recurrently
u_1 = theano.tensor.dscalar('u_1')
u_2 = theano.tensor.dscalar('u_2')
x_1 = theano.tensor.dscalar('x_1')
x_1_next = u_1+u_2*x_1
my_f = theano.function([u_1,u_2,x_1],[x_1_next])
# define the function that applies the scan operation
my_op = Scan.compiled(my_f,2,1)
u_1 = theano.tensor.dvector('u_1')
u_2 = theano.tensor.dvector('u_2')
x_1 = theano.tensor.dvector('x_1')
x_1_next = my_op(u_1,u_2,x_1)
final_f = theano.function([u_1,u_2,x_1],[x_1_next])
# test the function final_f
u_1 = numpy.random.rand(3)
u_2 = numpy.random.rand(2)
x_1 = [numpy.random.rand()]
out = final_f(u_1,u_2,x_1)
def t2():
######### Test function does not return correct number of outputs
# define the function that is applied recurrently
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
x_1_next = u_1 * x_1
my_f = theano.function([u_1,x_1],[x_1_next])
# define the function that applies the scan operation
my_op = Scan.compiled(my_f,1,2)
u_1 = theano.tensor.dvector('u_1')
x_1 = theano.tensor.dvector('x_1')
x_2 = theano.tensor.dvector('x_2')
x_1_next,x_2_next = my_op(u_1,x_1,x_2)
final_f = theano.function([u_1,x_1,x_2],[x_1_next,x_2_next])
#generate data
u_1 = numpy.random.rand(3)
x_1 = [numpy.random.rand()]
x_2 = [numpy.random.rand()]
out_1,out_2 = final_f(u_1,x_1,x_2)
self.failUnlessRaises(ValueError,t1)
self.failUnlessRaises(TypeError,t2)
# Naming convention :
#   u_1,u_2,.. -> sequences
#   s_1,s_2,.. -> initial states
#   w_1,w_2,.. -> non-sequences
###################################
class T_Scan(unittest.TestCase):
    def setUp(self):
        utt.seed_rng()
    def test_one(self):
        pass
#####################################################################
def test_generator(self):
# compile my_f
u_1 = theano.tensor.dscalar('u_1') # dummy input,
# required if no inplace is used!
x_1 = theano.tensor.dscalar('x_1')
w_1 = theano.tensor.dscalar('w_1')
x_1_next = x_1*w_1
my_f = theano.function([u_1,x_1,w_1],[x_1_next])
# create operation
my_op = Scan.compiled(my_f,1,1)
u_1 = theano.tensor.dvector('u_1') # dummy input, there is no
#inplace, so output will not be put in place of this u_1!
x_1 = theano.tensor.dvector('x_1')
w_1 = theano.tensor.dscalar('w_1')
x_1_next = my_op(u_1,x_1,w_1)
final_f = theano.function([u_1,x_1,w_1],[x_1_next])
#generate data
x_1 = numpy.ndarray(3) # dummy input, just tells for how many time
# steps to run recursively
out_1 = final_f(x_1,[2],2)
self.failUnless(numpy.all(out_1 == numpy.asarray([4,8,16])))
#####################################################################
def test_generator_inplace_no_ignore(self):
# compile my_f
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
w_1 = theano.tensor.dscalar('w_1')
x_1_next = x_1*w_1
my_f = theano.function([u_1,x_1,w_1],[x_1_next])
# create operation
my_op = Scan.compiled(my_f,1,1,n_inplace=1)
iu_1 = theano.tensor.dvector('iu_1')
ix_1 = theano.tensor.dvector('ix_1')
w_1 = theano.tensor.dscalar('w_1')
ix_1_next= my_op(iu_1,ix_1,w_1)
final_f = theano.function([theano.In(iu_1, mutable=True),ix_1,w_1],
[ix_1_next], mode='FAST_RUN')
#generate data
iu_1 = numpy.ndarray(3)
out_1 = final_f(iu_1,[2],2)
# not concretely implemented yet ..
self.failUnless(numpy.all(out_1 == numpy.asarray([4,8,16])))
self.failUnless(numpy.all(out_1 == iu_1))
#####################################################################
def test_generator_inplace_no_ignore_2states(self):
# compile my_f
u_1 = theano.tensor.dscalar('u_1')
u_2 = theano.tensor.dscalar('u_2')
x_1 = theano.tensor.dscalar('x_1')
x_2 = theano.tensor.dscalar('x_2')
w_1 = theano.tensor.dscalar('w_1')
x_1_next = x_1*w_1
x_2_next = x_2*w_1
my_f = theano.function([u_1,u_2,x_1,x_2,w_1],[x_1_next,x_2_next])
# create operation
my_op = Scan.compiled(my_f,2,2,n_inplace=2)
iu_1 = theano.tensor.dvector('iu_1')
iu_2 = theano.tensor.dvector('iu_2')
ix_1 = theano.tensor.dvector('ix_1')
ix_2 = theano.tensor.dvector('ix_2')
w_1 = theano.tensor.dscalar('w_1')
ix_1_next,ix_2_next= my_op(iu_1,iu_2,ix_1,ix_2,w_1)
final_f = theano.function([theano.In(iu_1, mutable=True),
theano.In(iu_2, mutable=True),ix_1,ix_2,
w_1],[ix_1_next,ix_2_next], mode='FAST_RUN')
#generate data
iu_1 = numpy.ndarray(3)
iu_2 = numpy.ndarray(3)
out_1,out_2 = final_f(iu_1,iu_2,[2],[1],2)
# not concretely implemented yet ..
self.failUnless(numpy.all(out_1 == numpy.asarray([4,8,16])))
self.failUnless(numpy.all(out_1 == iu_1))
self.failUnless(numpy.all(out_2 == numpy.asarray([2,4,8])))
self.failUnless(numpy.all(out_2 == iu_2))
#######################################################################
def test_generator_inplace(self):
#compile my_f
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
x_2 = theano.tensor.dscalar('x_2')
x_1_next = u_1 + x_1
x_2_next = x_1 * x_2
my_f = theano.function([u_1,x_1,x_2],[x_1_next,x_2_next])
# create operation
my_op = Scan.compiled(my_f,2,2,n_inplace=2,n_inplace_ignore=1)
du_1 = theano.tensor.dvector('du_1')
iu_1 = theano.tensor.dvector('iu_1')
ix_1 = theano.tensor.dvector('ix_1')
ix_2 = theano.tensor.dvector('ix_2')
ix_1_next,ix_2_next = my_op(du_1,iu_1,ix_1,ix_2)
final_f=theano.function([theano.In(du_1, mutable = True),
theano.In(iu_1, mutable = True),
ix_1,ix_2],[ix_1_next,ix_2_next],mode='FAST_RUN')
# generate data
du_1 = numpy.asarray([0.,0.,0.])
iu_1 = numpy.asarray([1.,1.,1.])
ix_1 = [1]
ix_2 = [1]
out_1,out_2 = final_f(du_1,iu_1,ix_1,ix_2)
self.failUnless(numpy.all(out_1 == numpy.asarray([2,3,4])))
self.failUnless(numpy.all(out_2 == numpy.asarray([1,2,6])))
self.failUnless(numpy.all(out_1 == du_1))
self.failUnless(numpy.all(out_2 == iu_1))
#####################################################################
def test_iterateOnlyOverX(self):
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
x_1_next = u_1*x_1
my_f = theano.function([u_1,x_1],[x_1_next])
my_op = Scan.compiled(my_f,1,1)
u_1 = theano.tensor.dvector('u_1')
x_1 = theano.tensor.dvector('x_1')
x_1_next = my_op(u_1,x_1)
final_f = theano.function([x_1,u_1],[x_1_next])
u_1 = numpy.asarray([2,2,2])
out_1 = final_f([2],u_1)
self.failUnless(numpy.all(out_1==numpy.asarray([4,8,16])))
#####################################################################
def test_iterateOverSeveralInputs(self):
u_1 = theano.tensor.dscalar('u_1') # input 1
u_2 = theano.tensor.dscalar('u_2') # input 2
x_1 = theano.tensor.dscalar('x_1') # output
x_1_next = (u_1+u_2)*x_1
my_f = theano.function([u_1,u_2,x_1],[x_1_next])
my_op = Scan.compiled(my_f,2,1)
u_1 = theano.tensor.dvector('u_1')
u_2 = theano.tensor.dvector('u_2')
x_1 = theano.tensor.dvector('x_1')
x_1_next = my_op(u_1,u_2,x_1)
final_f = theano.function([u_1,u_2,x_1],[x_1_next])
u_1 = numpy.asarray([1,1,1])
u_2 = numpy.asarray([1,1,1])
x_1 = [2]
out_1 = final_f(u_1,u_2,x_1)
self.failUnless(numpy.all(out_1==numpy.asarray([4,8,16])))
#####################################################################
def test_iterateOverSeveralInputsSeveralInplace(self):
iu_1 = theano.tensor.dscalar('iu_1')
u_1 = theano.tensor.dscalar('u_1')
u_2 = theano.tensor.dscalar('u_2')
u_3 = theano.tensor.dscalar('u_3')
u_4 = theano.tensor.dscalar('u_4')
ix_1 = theano.tensor.dscalar('ix_1')
ix_2 = theano.tensor.dscalar('ix_2')
x_1 = theano.tensor.dscalar('x_1')
w_1 = theano.tensor.dscalar('w_1')
ix_1_next = u_3 + u_4
ix_2_next = ix_1 + ix_2
x_1_next = x_1 + u_3 + u_4 + ix_1 + ix_2
my_f = theano.function([iu_1,u_1,u_2,u_3,u_4,ix_1,ix_2,x_1,w_1],\
[ix_1_next,ix_2_next, x_1_next])
my_op = Scan.compiled(my_f,6,3, n_inplace=2,\
n_inplace_ignore=1)
du_1 = theano.tensor.dvector('du_1')
iu_1 = theano.tensor.dvector('iu_1')
u_1 = theano.tensor.dvector('u_1')
u_2 = theano.tensor.dvector('u_2')
u_3 = theano.tensor.dvector('u_3')
u_4 = theano.tensor.dvector('u_4')
x_1 = theano.tensor.dvector('x_1')
ix_1 = theano.tensor.dvector('ix_1')
ix_2 = theano.tensor.dvector('ix_2')
w_1 = theano.tensor.dscalar('w_1')
[ix_1_next,ix_2_next,x_1_next]= \
my_op(du_1,iu_1,u_1,u_2,u_3,u_4,x_1,ix_1,ix_2,w_1)
final_f=theano.function([theano.In(du_1, mutable = True),
theano.In(iu_1, mutable = True),
u_1,u_2,u_3,u_4,ix_1,ix_2,x_1,w_1],
[ix_1_next,ix_2_next,
x_1_next],mode='FAST_RUN')
#generate data
du_1 = numpy.asarray([0.,0.,0.])
iu_1 = numpy.asarray([0.,1.,2.])
u_1 = numpy.asarray([1.,2.,3.])
u_2 = numpy.asarray([1.,1.,1.])
u_3 = numpy.asarray([2.,2.,2.])
u_4 = numpy.asarray([3.,2.,1.])
x_1 = [1.]
ix_1 = [1.]
ix_2 = [1.]
w_1 = 2.
out_1,out_2,out_3 = final_f(du_1,iu_1,u_1,u_2,u_3,u_4,\
ix_1,ix_2,x_1,w_1)
self.failUnless(numpy.all(out_3 == numpy.asarray([8.,19.,33.])))
self.failUnless(numpy.all(out_1 == numpy.asarray([5.,4.,3.])))
self.failUnless(numpy.all(out_2 == numpy.asarray([2.,7.,11.])))
self.failUnless(numpy.all(out_1 == du_1))
self.failUnless(numpy.all(out_2 == iu_1))
#####################################################################
def test_computeInPlaceArguments(self):
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
w_1 = theano.tensor.dscalar('w_1')
x_1_next = u_1*w_1+x_1
my_f = theano.function([u_1,x_1,theano.In(w_1,update=w_1*2)],
[x_1_next])
my_op = Scan.compiled(my_f,1,1)
u_1 = theano.tensor.dvector('u_1')
x_1 = theano.tensor.dvector('x_1')
w_1 = theano.tensor.dscalar('w_1')
x_1_next = my_op(u_1,x_1,w_1)
final_f = theano.function([u_1,x_1,w_1], [x_1_next])
u_1 = [1.,1.,1.]
x_1 = [1.]
w_1 = 1.
out_1 = final_f(u_1,x_1,w_1)
self.failUnless(numpy.all(out_1 == numpy.asarray([2,4,8])))
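The expected output ``[2,4,8]`` follows from the ``update=w_1*2`` on the inner function: the weight doubles after every step of the scan. Assuming ``Scan`` applies the inner function once per element of ``u_1`` and carries the update across steps, the values can be reproduced with plain numpy (a reference sketch, not part of the test suite; ``scan_with_updated_weight`` is a hypothetical helper name):

```python
import numpy

def scan_with_updated_weight(u, x0, w0):
    """Reference recurrence for test_computeInPlaceArguments:
    x[t] = u[t]*w + x[t-1], with w doubling after each step
    (mirroring the In(w_1, update=w_1*2) on the inner function)."""
    xs = []
    x, w = x0, w0
    for u_t in u:
        x = u_t * w + x
        xs.append(x)
        w = w * 2  # the update=w_1*2 side effect
    return numpy.asarray(xs)

print(scan_with_updated_weight([1., 1., 1.], 1., 1.))  # [2. 4. 8.]
```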
#####################################################################
def test_timeTaps(self):
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
x_1_t2 = theano.tensor.dscalar('x_1_t2')
x_1_t4 = theano.tensor.dscalar('x_1_t4')
x_1_next = u_1+x_1+x_1_t2+x_1_t4
my_f = theano.function([u_1,x_1,x_1_t2,x_1_t4],[x_1_next])
my_op = Scan.compiled(my_f,1,1,taps={0:[2,4]})
u_1 = theano.tensor.dvector('u_1')
x_1 = theano.tensor.dvector('x_1')
x_1_next = my_op(u_1,x_1)
final_f = theano.function([u_1,x_1],[x_1_next])
u_1 = [1.,1.,1.,1.,1.]
x_1 = [1.,2.,3.,4.]
out_1 = final_f(u_1,x_1)
self.failUnless(numpy.all(out_1==numpy.asarray([9.,16.,29.,50.,89.])))
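The expected values ``[9,16,29,50,89]`` are consistent with reading ``taps={0:[2,4]}`` as giving the inner function the state at lags 2 and 4 in addition to the default lag 1, i.e. ``x[t] = u[t] + x[t-1] + x[t-2] + x[t-4]``. A plain-numpy sketch of that reading (the lag-1 interpretation is an assumption inferred from the expected output):

```python
import numpy

def scan_with_taps(u, x_hist):
    """Reference recurrence for test_timeTaps:
    x[t] = u[t] + x[t-1] + x[t-2] + x[t-4],
    with x_hist supplying the initial history (x_hist[-1] most recent)."""
    x = list(x_hist)
    out = []
    for u_t in u:
        x_t = u_t + x[-1] + x[-2] + x[-4]
        x.append(x_t)
        out.append(x_t)
    return numpy.asarray(out)

print(scan_with_taps([1.] * 5, [1., 2., 3., 4.]))  # [ 9. 16. 29. 50. 89.]
```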
#####################################################################
def test_constructFunction(self):
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
x_1_next = u_1 + x_1
my_op = Scan.symbolic(([u_1,x_1],x_1_next),1,1)
u_1 = theano.tensor.dvector('u_1')
x_1 = theano.tensor.dvector('x_1')
x_1_next = my_op(u_1,x_1)
final_f = theano.function([u_1,x_1],[x_1_next])
u_1 = [1.,1.,1.]
x_1 = [1.]
out_1 = final_f(u_1,x_1)
self.failUnless(numpy.all(out_1==numpy.asarray([2.,3.,4.])))
######################################################################
def test_gradOneInputOneOutput(self):
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
x_1_next = u_1*x_1
my_op = Scan.symbolic( ([u_1,x_1],x_1_next), 1,1)
u_1 = [1.,2.,3.]
x_1 = [1.]
verify_grad( my_op , [u_1,x_1] )
#######################################################################
def test_gradManyInputsManyOutputs(self):
u_1 = theano.tensor.dscalar('u_1')
u_2 = theano.tensor.dscalar('u_2')
x_1 = theano.tensor.dscalar('x_1')
x_2 = theano.tensor.dscalar('x_2')
x_1_next = x_1*u_1+x_2
x_2_next = x_2*u_2+x_1
my_op = Scan.symbolic( ([u_1,u_2,x_1,x_2],
[x_1_next,x_2_next]),
2,2)
u_1 = [1.,.2,3.]
u_2 = [1.5,1.25,.35]
x_1 = [.5]
x_2 = [.65]
verify_grad(my_op, [u_1,u_2,x_1,x_2])
######################################################################
def test_gradTimeTaps(self):
u_1 = theano.tensor.dscalar('u_1')
x_1 = theano.tensor.dscalar('x_1')
x_1_t_2 = theano.tensor.dscalar('x_1_t_2')
x_1_next = x_1_t_2*x_1*u_1
my_op = Scan.symbolic( ([u_1,x_1,x_1_t_2],
[x_1_next]),
1,1,taps={0:[2]})
u_1 = [1.,2.,3.,4.]
x_1 = [2.,3.]
verify_grad(my_op, [u_1,x_1])
#######################################################################
def test_gradManyInputsManyOutputsTimeTaps(self):
u_1 = theano.tensor.dscalar('u_1')
u_2 = theano.tensor.dscalar('u_2')
x_1 = theano.tensor.dscalar('x_1')
x_1_2 = theano.tensor.dscalar('x_1_2')
x_2 = theano.tensor.dscalar('x_2')
x_2_2 = theano.tensor.dscalar('x_2_2')
x_1_n = x_1*x_2_2 + u_1*x_1_2
x_2_n = x_2*x_1_2 + u_2*x_2_2
my_op = Scan.symbolic(([u_1,u_2,x_1,x_1_2,
x_2,x_2_2],[x_1_n,
x_2_n]),2,2,taps=
{0:[2],1:[2]})
u_1 = [1.,2.,3.,4.]
u_2 = [3.,2.,4.,1.]
x_1 = [0.1,0.2]
x_2 = [1.5,3.5]
verify_grad(my_op, [u_1,u_2,x_1,x_2])
if __name__ == '__main__':
    unittest.main()
...@@ -3285,6 +3285,8 @@ class TensorDot(Op):
        return "tensordot"
tensordot = TensorDot
#TODO: tensordot should be function as described in rst docs.
class Outer(Op):
    """ Compute vector-vector outer product
    """
......
...@@ -1353,32 +1353,6 @@ prepend_scalar_to_each_row = Prepend_scalar_to_each_row() ...@@ -1353,32 +1353,6 @@ prepend_scalar_to_each_row = Prepend_scalar_to_each_row()
prepend_0_to_each_row = Prepend_scalar_constant_to_each_row(0.) prepend_0_to_each_row = Prepend_scalar_constant_to_each_row(0.)
prepend_1_to_each_row = Prepend_scalar_constant_to_each_row(1.) prepend_1_to_each_row = Prepend_scalar_constant_to_each_row(1.)
class solve(gof.Op):
    """
    Find the solution x to the linear equation A x = b,
    where A is a 2d matrix and b is a 1d or 2d matrix.
    It uses numpy.linalg.solve to find the solution.
    """
    def make_node(self, A, b):
        if not isinstance(A, gof.Variable) or not A.type==tensor.matrix().type:
            raise TypeError("We expected that A had a matrix type")
        if not isinstance(b, gof.Variable) or not b.type==tensor.matrix().type:
            raise TypeError("We expected that b had a matrix type")
        node = gof.Apply(op=self, inputs=[A, b], outputs=[tensor.matrix()])
        return node
    def perform(self, node, (A, b), (output, )):
        output[0] = numpy.linalg.solve(A, b)
    def grad(self, (A, b), (g_out,)):
        raise NotImplementedError()
logsigm_to_softplus = gof.PatternSub(
    (tensor.log, (sigmoid, 'x')),
    (tensor.neg, (softplus, (tensor.neg, 'x'))),
......
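The `PatternSub` above rewrites `log(sigmoid(x))` into `-softplus(-x)`; the two expressions are mathematically identical, which a quick numpy check confirms (`sigmoid` and `softplus` here are local numpy stand-ins for Theano's ops, not the ops themselves):

```python
import numpy

def sigmoid(x):
    # logistic function: 1 / (1 + exp(-x))
    return 1.0 / (1.0 + numpy.exp(-x))

def softplus(x):
    # log(1 + exp(x)), via log1p for accuracy near zero
    return numpy.log1p(numpy.exp(x))

x = numpy.linspace(-5.0, 5.0, 11)
# the identity behind the rewrite: log(sigmoid(x)) == -softplus(-x)
assert numpy.allclose(numpy.log(sigmoid(x)), -softplus(-x))
```

The rewritten form is also the numerically preferable one: it avoids taking `log` of a value that underflows to 0 for very negative `x`.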
...@@ -333,7 +333,7 @@ Note that the output will then be of dimension i+1. ...@@ -333,7 +333,7 @@ Note that the output will then be of dimension i+1.
multinomial = random_function('multinomial', 'float64', 1, [0.5, 0.5], ndim_added=1)
multinomial.__doc__ = """
Usage: multinomial(random_state, size, pvals)
Sample from a multinomial distribution defined by probabilities pvals,
as many times as required by size. For instance, if size=(p,q), p*q
......
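The `ndim_added=1` behavior the docstring describes matches numpy's own multinomial sampler, where a draw of `size=(p,q)` yields `p*q` samples and appends one dimension of length `len(pvals)`. A sketch with `numpy.random.RandomState` directly (assuming one trial per draw, consistent with the `n`-less usage line above):

```python
import numpy

rng = numpy.random.RandomState(42)
# one trial (n=1) per draw over two outcomes; size=(2, 3) means 2*3 draws,
# so the result gains a trailing dimension of len(pvals) == 2
s = rng.multinomial(1, [0.5, 0.5], size=(2, 3))
print(s.shape)  # (2, 3, 2)
assert s.shape == (2, 3, 2)
assert (s.sum(axis=-1) == 1).all()  # each draw is a one-hot over the 2 outcomes
```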
...@@ -126,6 +126,8 @@ class RandomStreams(object):
    def binomial(self, *args, **kwargs):
        """Return a symbolic binomial sample

        *args and **kwargs will be passed to numpy.random.RandomState.binomial

        This is a shortcut for a call to `self.gen`
        """
        return self.gen(raw_random.binomial, *args, **kwargs)
...@@ -133,6 +135,8 @@ class RandomStreams(object):
    def uniform(self, *args, **kwargs):
        """Return a symbolic uniform sample

        *args and **kwargs will be passed to numpy.random.RandomState.uniform

        This is a shortcut for a call to `self.gen`
        """
        return self.gen(raw_random.uniform, *args, **kwargs)
...@@ -140,6 +144,8 @@ class RandomStreams(object):
    def normal(self, *args, **kwargs):
        """Return a symbolic normal sample

        *args and **kwargs will be passed to numpy.random.RandomState.normal

        This is a shortcut for a call to `self.gen`
        """
        return self.gen(raw_random.normal, *args, **kwargs)
...@@ -147,6 +153,8 @@ class RandomStreams(object):
    def random_integers(self, *args, **kwargs):
        """Return a symbolic random integer sample

        *args and **kwargs will be passed to numpy.random.RandomState.random_integers

        This is a shortcut for a call to `self.gen`
        """
        return self.gen(raw_random.random_integers, *args, **kwargs)
......
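The docstrings added above say the extra arguments are forwarded to `numpy.random.RandomState`; for reference, the corresponding numpy calls look like this (note that `random_integers` is deprecated in recent numpy in favour of `randint`, but it matches the method name used here):

```python
import numpy

rng = numpy.random.RandomState(123)
# the *args/**kwargs forwarded by RandomStreams map onto these signatures:
b = rng.binomial(n=1, p=0.5, size=(4,))            # 0/1 draws
u = rng.uniform(low=0.0, high=1.0, size=(4,))      # floats in [0, 1)
z = rng.normal(loc=0.0, scale=1.0, size=(4,))      # standard normals
i = rng.random_integers(low=0, high=9, size=(4,))  # ints in [0, 9], inclusive
assert b.shape == u.shape == z.shape == i.shape == (4,)
```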
...@@ -108,21 +108,6 @@ class T_prepend(unittest.TestCase):
        self.failUnless(my.shape == (3, 6))
        self.failUnless(numpy.all(my[:,0] == 5.0))
class T_solve(unittest.TestCase):
def setUp(self):
self.rng = numpy.random.RandomState(utt.fetch_seed(666))
def test0(self):
A=self.rng.randn(5,5)
b=numpy.array(range(5),dtype=float)
x=numpy.linalg.solve(A,b)
Ax = numpy.dot(A,x)
are = T.numeric_grad.abs_rel_err(Ax, b)
self.failUnless(numpy.all(are < 1.0e-5), (are, Ax, b))
#print A,b
#print numpy.dot(A,x)
class T_CrossentropyCategorical1Hot(unittest.TestCase):
    def setUp(self):
        utt.seed_rng()
......