Commit 6c4df656 authored by james@X40

merge

......@@ -114,7 +114,7 @@ Setup on OS-X
Note that compiling gcc42 takes a significant time (hours) so it's probably
not the best solution if you're in a rush! In my (Doomie) experience, scipy
failed to compile the first time I tried the command, but the second time
it compiled just fine. Same thing with py25-zlib.
it compiled fine. Same thing with py25-zlib.
- Install some kind of BLAS library (TODO: how?)
......
......@@ -305,9 +305,9 @@ This is done by setting the ``destroy_map`` field of the op. ``destroy_map`` mus
Viewers
-------
Similarly, an Op might not modify the inputs, but return an output which shares state with one or several of its inputs. For example, ``transpose`` can be done very efficiently by viewing the same data as the original with modified dimensions and strides. That is fine, but the compiler needs to be told.
Similarly, an Op might not modify the inputs, but return an output which shares state with one or several of its inputs. For example, ``transpose`` can be done efficiently by viewing the same data as the original with modified dimensions and strides. That is fine, but the compiler needs to be told.
This is done by setting the ``view_map`` field of the op. It works just like the ``destroy_map`` field: to an output index is associated the list of inputs that it shares state with. For example, ``transpose.view_map == {0: [0]}`` because its first output uses the same data as its first input. ``view_map`` is conservative: if there is any probability that an output will be the view of an input, that input must be in the view list of that output.
This is done by setting the ``view_map`` field of the op. It works like the ``destroy_map`` field: each output index is associated with the list of inputs that it shares state with. For example, ``transpose.view_map == {0: [0]}`` because its first output uses the same data as its first input. ``view_map`` is conservative: if there is any probability that an output will be the view of an input, that input must be in the view list of that output.
Important note: currently, an output can only be the view of one input. This is limiting, as an 'if' or 'switch' op would need to declare its output as a view of both its then and else branches, but for the time being the framework is not powerful enough to handle it. A future version should address this issue.
......@@ -316,7 +316,7 @@ Hidden outputs (as a form of op state)
For performance purposes, an ``op`` might want to have a hidden internal state.
Example: if we expect to call the op repeatedly on incrementally bigger inputs, we might want private output storage that's a lot bigger than needed and take incrementally bigger views on it, to save allocation overhead. In order to do this, we can simple have two outputs: one that we will return normally and will contain the answer and the other that will be the (larger) container. In this case, the advanced note in the 'reusing outputs' section applies. Furthermore, ``__call__`` should be overriden to only return the first output instead of both of them. Here is what the example's ``perform`` and ``__call__`` would look like:
Example: if we expect to call the op repeatedly on incrementally bigger inputs, we might want private output storage that's a lot bigger than needed and take incrementally bigger views on it, to save allocation overhead. In order to do this, we can have two outputs: one that we will return normally and will contain the answer and the other that will be the (larger) container. In this case, the advanced note in the 'reusing outputs' section applies. Furthermore, ``__call__`` should be overridden to only return the first output instead of both of them. Here is what the example's ``perform`` and ``__call__`` would look like:
.. code-block:: python
......
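The growing-buffer trick itself can be sketched without any Theano machinery (the class and names below are hypothetical, purely for illustration):

```python
import numpy

class GrowingBuffer:
    """Sketch of the 'hidden output' idea: keep an oversized private
    buffer and hand out incrementally bigger views of it, so repeated
    calls on growing inputs do not reallocate every time.
    (Illustrative only -- not the actual Theano Op API.)"""
    def __init__(self):
        self._store = numpy.empty(0)

    def view(self, n):
        if self._store.size < n:
            # overallocate: double the requested size
            self._store = numpy.empty(2 * n)
        return self._store[:n]

buf = GrowingBuffer()
v1 = buf.view(3)   # allocates a 6-element store
v2 = buf.view(5)   # still fits: no new allocation
assert v2.base is buf._store
```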
......@@ -27,6 +27,21 @@ However, if the link target is ambiguous, Sphinx will generate errors.
NB the ``:api:`` reference is special magic by Olivier, in
./scripts/docgen.py.
How to add TODO comments in Sphinx documentation
-------------------------------------------------
To include a TODO comment in Sphinx documentation, use an indented block as
follows::
   .. TODO: This is a comment.
   .. You have to put .. at the beginning of every line :(
   .. These lines should all be indented.

It will not appear in the generated output.
.. TODO: Check it out, this won't appear.
.. Nor will this.
How to write API documentation
---------------------------------------
......
......@@ -292,7 +292,7 @@ Complex models can be implemented by subclassing ``Module`` (though that is not
self.l2_coef = M.Member(T.scalar()) # we can add a hyper parameter if we need to
return self.l2_coef * T.sum(self.w * self.w)
Using the model is quite simple:
Here is how we use the model:
.. code-block:: python
......
......@@ -7,8 +7,13 @@ Sparse matrices
scipy.sparse
------------
Note that you want scipy >= 0.7.0. 0.6 has a very bug and inconsistent
implementation of sparse matrices.
Note that you want scipy >= 0.7.0.
.. warning::
In scipy 0.6, ``scipy.csc_matrix.dot`` has a bug with singleton
dimensions. There may be more bugs. It also has an inconsistent
implementation of sparse matrices.
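For reference, here is what the compressed sparse column layout looks like in practice (assuming scipy >= 0.7 is installed):

```python
import numpy
from scipy import sparse

m = sparse.csc_matrix(numpy.array([[1.0, 0.0],
                                   [0.0, 2.0],
                                   [3.0, 0.0]]))
# compressed sparse column: values and row indices, grouped per column
assert m.shape == (3, 2)
assert m.nnz == 3                       # three stored non-zeros
assert list(m.data) == [1.0, 3.0, 2.0]  # column-major order
assert list(m.indices) == [0, 2, 1]     # row index of each value
```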
We describe the details of the compressed sparse matrix types.
``scipy.sparse.csc_matrix``
......
......@@ -157,7 +157,7 @@ State example
=============
In this example, we'll look at a complete logistic regression model, with
training by simple gradient descent.
training by gradient descent.
.. code-block:: python
......
......@@ -31,7 +31,7 @@ not limited to:
* constant folding
* merging of similar subgraphs, to avoid calculating the same values more than once
* simple arithmetic simplification (``x*y/x -> y``)
* arithmetic simplification (``x*y/x -> y``)
* inserting efficient BLAS_ operations
* using inplace operations wherever it is safe to do so.
......@@ -47,7 +47,7 @@ Theano is released under a BSD license (:ref:`link <license>`)
Sneak peek
==========
Here is a simple example of how to use Theano. It doesn't show
Here is an example of how to use Theano. It doesn't show
off many of Theano's features, but it illustrates concretely what
Theano is.
......@@ -110,7 +110,7 @@ There exist another symbolic package in Python, namely sympy_. Theano
is different from sympy in the sense that while Theano allows symbolic
manipulation it puts more emphasis on the evaluation of these expressions
and being able to repeatedly evaluate them on many different inputs. Theano
is also better suited to handling very large tensors which have no
is also better suited to handling large tensors which have no
assumed structures.
If numpy_ is to be compared to MATLAB_ and sympy_ to Mathematica_,
......
......@@ -43,17 +43,20 @@ The following libraries and software are optional:
Easy install
------------
The following command will install the very latest revision of Theano
The following command will install the latest revision of Theano
on your system:
.. TODO: Does this install the latest package version, or the latest Mercurial
.. revision?
.. code-block:: bash
easy_install http://pylearn.org/hg/theano/archive/tip.tar.gz
TODO: make sure this works
.. TODO: make sure this works
TODO: change the command to install the latest *stable* version of
Theano, when we figure out where to put it.
.. TODO: change the command to install the latest *stable* version of
.. Theano, when we figure out where to put it.
--------------
......
......@@ -17,7 +17,7 @@ an input provided by the end user (using c_extract) or it might simply
have been calculated by another operation. For each of the outputs,
the variables associated to them will be declared and initialized.
The operation then simply has to compute what it needs to using the
The operation then has to compute what it needs to using the
input variables and place the results in the output variables.
......@@ -88,7 +88,7 @@ variables x_name, y_name and output_name are all of the primitive C
Implementing multiplication is as simple as multiplying the two input
doubles and setting the output double to what comes out of it. If you
had more than one output, you would simply set the variable(s) for
had more than one output, you would just set the variable(s) for
each output to what they should be.
.. warning::
......
......@@ -154,7 +154,7 @@ it, it's best to publish it somewhere.
""" % dict(name = name)
double.c_init = c_init
Still straightforward. This function simply has to initialize the
This function has to initialize the
double we declared previously to a suitable value. This is useful if
we want to avoid dealing with garbage values, especially if our data
type is a pointer. This is not going to be called for all Results with
......@@ -375,7 +375,7 @@ like this:
//c_cleanup for x
}
It's not very good looking, but it gives you an idea of how things
It's not pretty, but it gives you an idea of how things
work (note that the variable names won't be x, y, z, etc. - they will
get a unique mangled name). The ``fail`` code runs a goto to the
appropriate label in order to run all cleanup that needs to be
......
......@@ -138,11 +138,10 @@ type and it should make an Apply node with an output Result of type
mul.make_node = make_node
This is a pretty simple definition: the first two lines make sure that
both inputs are Results of the ``double`` type that we created in the
previous section. We would not want to multiply two arbitrary types,
it would not make much sense (and we'd be screwed when we implement
this in C!)
The first two lines make sure that both inputs are Results of the
``double`` type that we created in the previous section. We would not
want to multiply two arbitrary types; it would not make much sense
(and we'd be screwed when we implement this in C!)
The last line is the meat of the definition. There we create an Apply
node representing the application of ``mul`` to ``x`` and ``y``. Apply
......@@ -178,8 +177,8 @@ understand the role of all three arguments of ``perform``:
return, per our own definition.
- *output_storage*: This is a list of storage cells. There is one
storage cell for each output of the Op. A storage cell is quite
simply a one-element list (note: it is forbidden to change the
storage cell for each output of the Op. A storage cell is
a one-element list (note: it is forbidden to change the
length of the list(s) contained in output_storage). In this example,
output_storage will contain a single storage cell for the
multiplication's result.
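The storage-cell convention is easy to mimic in plain Python: a cell is a one-element list, and ``perform`` writes its result into index 0 (a schematic sketch, not the actual Op API):

```python
def perform_mul(inputs, output_storage):
    # inputs: list of input values
    # output_storage: list of one-element lists, one per output
    x, y = inputs
    z = output_storage[0]   # the storage cell for the single output
    z[0] = x * y            # write the result; never resize the cell

cell = [None]               # an empty storage cell, as Theano provides it
perform_mul([3.0, 4.0], [cell])
assert cell == [12.0]
```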
......@@ -204,18 +203,19 @@ Here, ``z`` is a list of one element. By default, ``z == [None]``.
:ref:`op` documentation.
.. warning::
The data you put in the output_storage must match the type of the
symbolic output (this is a situation where the ``node`` argument
can come in handy). In the previous example, if you put, say, an
``int`` in ``z[0]`` (even though we gave ``z`` the Theano type
The data you put in ``output_storage`` must match the type of the
symbolic output. This is a situation where the ``node`` argument
can come in handy. In this example, we gave ``z`` the Theano type
``double`` in ``make_node``, which means that a Python ``float``
must be put there) you might have nasty problems further down the
line since Theano often assumes Ops handle typing properly.
must be put there. You should not put, say, an ``int`` in ``z[0]``
because Theano assumes Ops handle typing properly.
Trying out our new Op
=====================
In the following code, we use our new Op:
>>> x, y = double('x'), double('y')
>>> z = mul(x, y)
>>> f = theano.function([x, y], z)
......@@ -224,7 +224,7 @@ Trying out our new Op
>>> f(5.6, 6.7)
37.519999999999996
Seems to work. Note that there is an implicit call to
Note that there is an implicit call to
``double.filter()`` on each argument, so if we give integers as inputs
they are magically cast to the right type. Now, what if we try this?
......@@ -237,7 +237,8 @@ Traceback (most recent call last):
AttributeError: 'int' object has no attribute 'type'
Well, ok. We'd like our Op to be a bit more flexible. This can be done
by fixing ``make_node`` a little bit:
by modifying ``make_node`` to accept Python ``int`` or ``float`` as
``x`` and/or ``y``:
.. code-block:: python
......@@ -252,8 +253,8 @@ by fixing ``make_node`` a little bit:
mul.make_node = make_node
Whenever we pass a Python int or float instead of a Result as ``x`` or
``y``, make_node will convert it to :ref:`constant` for us. Constant
is basically a :ref:`result` we statically know the value of.
``y``, make_node will convert it to :ref:`constant` for us. ``gof.Constant``
is a :ref:`result` whose value we know statically.
>>> x = double('x')
>>> z = mul(x, 2)
......@@ -263,18 +264,16 @@ is basically a :ref:`result` we statically know the value of.
>>> f(3.4)
6.7999999999999998
And now it works the way we want it to.
Now the code works the way we want it to.
Final version
=============
While I would call the above definitions appropriately pedagogical, it
is not necessarily the best way to do things, especially when you need
to define the other basic arithmetic operations ``add``, ``sub`` and
``div``. It appears that the code for ``make_node`` can be shared
between these Ops. Here is the final version of the four arithmetic
operators (well, pending revision of this tutorial, I guess):
The above example is pedagogical. When you define other basic arithmetic
operations ``add``, ``sub`` and ``div``, code for ``make_node`` can be
shared between these Ops. Here is a revised implementation of these four
arithmetic operators:
.. code-block:: python
......@@ -313,37 +312,27 @@ operators (well, pending revision of this tutorial, I guess):
div = BinaryDoubleOp(name = 'div',
fn = lambda x, y: x / y)
Can you see how the definition of ``mul`` here does exactly the same
thing as the definition we had earlier?
Instead of working directly on an instance of Op, we create a subclass
of Op that we can parametrize. First, all the operations we define are
binary, they all work on inputs with type ``double`` and they all
return a single Result of type ``double``. Therefore, ``make_node``
basically does the same thing for all these operations, except for the
fact that the Op reference passed as first argument to Apply must be
themselves. Therefore we can abstract out most of the logic and pass
self to Apply, which seems natural. We can also easily define
``perform`` as depending on a function or lambda expression passed in
the constructor.
This design therefore appears to be a flexible way to define our four
basic operations (and possibly many more!) without duplicating
code. The same way a Type subclass represents a set of structurally
similar types (see previous section), an Op subclass represents a set
of structurally similar operations: operations that have the same
input/output types, operations that only differ in one small detail,
etc. If you see common patterns in several Ops that you want to
define, it can be a good idea to abstract out what you can, as I did
here. Remember that an Op is just an object which satisfies the
contract described above on this page and that you should use all the
tools at your disposal to create these objects as efficiently as
possible.
While I could have made a generic DoubleOp where the number of
arguments can also be given as a parameter, I decided it was not
necessary here.
Instead of working directly on an instance of Op, we create a subclass of
Op that we can parametrize. All the operations we define are binary. They
all work on two inputs with type ``double``. They all return a single
Result of type ``double``. Therefore, ``make_node`` does the same thing
for all these operations, except for the Op reference ``self`` passed
as the first argument to Apply. We define ``perform`` using the function
``fn`` passed in the constructor.
This design is a flexible way to define basic operations without
duplicating code. The same way a Type subclass represents a set of
structurally similar types (see previous section), an Op subclass
represents a set of structurally similar operations: operations that
have the same input/output types, operations that only differ in one
small detail, etc. If you see common patterns in several Ops that you
want to define, it can be a good idea to abstract out what you can.
Remember that an Op is just an object which satisfies the contract
described above on this page and that you should use all the tools at
your disposal to create these objects as efficiently as possible.
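The design pattern itself is framework-independent. Here is a minimal sketch of the same parametrize-a-subclass idea without any Theano machinery (all names hypothetical):

```python
class BinaryOp:
    """Minimal stand-in for the BinaryDoubleOp pattern: one class,
    parametrized by a name and a function, instead of four nearly
    identical Op definitions. (Not the actual Theano Op API.)"""
    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def __call__(self, x, y):
        # stands in for make_node + perform
        return self.fn(x, y)

add = BinaryOp('add', lambda x, y: x + y)
sub = BinaryOp('sub', lambda x, y: x - y)
mul = BinaryOp('mul', lambda x, y: x * y)
div = BinaryOp('div', lambda x, y: x / y)

assert mul(3.0, 4.0) == 12.0
assert div(1.0, 2.0) == 0.5
```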
**Exercise**: Make a generic DoubleOp, where the number of
arguments can also be given as a parameter.
**Next:** `Implementing double in C`_
......
......@@ -11,7 +11,7 @@ Before tackling this tutorial, it is highly recommended to read the
The advanced tutorial is meant to give the reader a greater
understanding of the building blocks of Theano. Through this tutorial
we are going to define one :ref:`type`, ``double`` and basic
we are going to define one :ref:`type`, ``double``, and basic
arithmetic :ref:`operations <op>` on that Type. We will first define
them using a Python implementation and then we will add a C
implementation.
......
......@@ -166,7 +166,7 @@ first input (rank 0).
Purely destructive operations
=============================
While some operations will operate inplace on their inputs, some will
While some operations will operate inplace on their inputs, some might
simply destroy or corrupt them. For example, an Op could do temporary
calculations right in its inputs. If that is the case, Theano also
needs to be notified. The way to notify Theano is to assume that some
......
......@@ -176,7 +176,7 @@ optimization you wrote. For example, consider the following:
>>> e
[div(mul(add(y, z), x), add(y, z))]
Nothing happened here. The reason is simple: ``add(y, z) != add(y,
Nothing happened here. The reason is that ``add(y, z) != add(y,
z)``; this is the case for efficiency reasons. To fix this problem we
first need to merge the parts of the graph that represent the same
computation, using the ``merge_optimizer`` defined in
......
......@@ -14,7 +14,7 @@ WRITEME
Don't define new Ops unless you have to
=======================================
It is usually not very useful to define Ops that can be easily
It is usually not useful to define Ops that can be easily
implemented using other already existing Ops. For example, instead of
writing a "sum_square_difference" Op, you should probably just write a
simple function:
......
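In plain numpy terms, such a function is one line (a sketch of the idea; in Theano you would compose the equivalent ``theano.tensor`` operations):

```python
import numpy

def sum_square_difference(a, b):
    # composed from existing elementwise ops: subtract, square, sum
    return ((a - b) ** 2).sum()

a = numpy.array([1.0, 2.0, 3.0])
b = numpy.array([1.0, 0.0, 1.0])
assert sum_square_difference(a, b) == 8.0
```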
......@@ -30,6 +30,12 @@ add. Note that from now on, we will use the term :term:`Result` to
mean "symbol" (in other words, ``x``, ``y``, ``z`` are all Result
objects).
If you are following along and typing into an interpreter, you may have
noticed that there was a slight delay in executing the ``function``
instruction. Behind the scenes, ``f`` was being compiled into C code.
.. TODO: help
-------------------------------------------
**Step 1**
......@@ -119,16 +125,15 @@ The result is a numpy array. We can also use numpy arrays directly as
inputs:
>>> import numpy
>>> f(numpy.ones((3, 5)), numpy.ones((3, 5)))
array([[ 2., 2., 2., 2., 2.],
[ 2., 2., 2., 2., 2.],
[ 2., 2., 2., 2., 2.]])
>>> f(numpy.array([[1, 2], [3, 4]]), numpy.array([[10, 20], [30, 40]]))
array([[ 11., 22.],
[ 33., 44.]])
It is possible to add scalars to matrices, vectors to matrices,
scalars to vectors, etc. The behavior of these operations is defined
by :term:`broadcasting`.
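Broadcasting can be seen in action with plain numpy, which follows the same rules:

```python
import numpy

m = numpy.ones((2, 3))               # matrix
v = numpy.array([10., 20., 30.])     # vector, broadcast across rows
s = 5.0                              # scalar, broadcast everywhere

r = m + v + s
assert r.shape == (2, 3)
assert (r[0] == numpy.array([16., 26., 36.])).all()
assert (r[0] == r[1]).all()          # every row got the same vector
```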
The following types are readily available:
The following types are available:
* **byte**: bscalar, bvector, bmatrix
* **32-bit integers**: iscalar, ivector, imatrix
......@@ -136,16 +141,15 @@ The following types are readily available:
* **float**: fscalar, fvector, fmatrix
* **double**: dscalar, dvector, dmatrix
The previous list is not exhaustive. A guide to all types compatible
with numpy arrays may be found :ref:`here <predefinedtypes>`.
.. note::
Watch out for the distinction between 32 and 64 bit integers (i
prefix vs the l prefix) and between 32 and 64 bit floats (f prefix
vs the d prefix).
Try to mix and match them and see what happens. The previous list is
not exhaustive. A guide to all types compatible with numpy arrays may
be found :ref:`here <predefinedtypes>`.
**Next:** `More examples`_
......
......@@ -17,39 +17,63 @@ the logistic curve, which is given by:
s(x) = \frac{1}{1 + e^{-x}}
.. figure:: logistic.png
A plot of the logistic function, with x on the x-axis and s(x) on the
y-axis.
You want to compute the function :term:`elementwise` on matrices of
doubles.
doubles, which means that you want to apply this function to each
individual element of the matrix.
Well, what you do is this:
>>> x = T.dmatrix('x')
>>> s = 1 / (1 + T.exp(-x))
>>> logistic = function([x], s)
>>> logistic([[0, 1], [-1, -2]])
array([[ 0.5 , 0.73105858],
[ 0.26894142, 0.11920292]])
Alternatively:
The reason logistic is performed elementwise is because all of its
operations---division, addition, exponentiation, and negation---are
themselves elementwise operations.
>>> s = (T.tanh(x) + 1) / 2
>>> logistic = function([x], s)
It is also the case that:
.. math::
s(x) = \frac{1}{1 + e^{-x}} = \frac{1 + \tanh(x/2)}{2}
We can verify that this alternate form produces the same values:
>>> s2 = (1 + T.tanh(x / 2)) / 2
>>> logistic2 = function([x], s2)
>>> logistic2([[0, 1], [-1, -2]])
array([[ 0.5 , 0.73105858],
[ 0.26894142, 0.11920292]])
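The algebraic identity itself can be checked numerically with plain numpy, independently of Theano:

```python
import numpy

x = numpy.linspace(-6, 6, 101)
s = 1 / (1 + numpy.exp(-x))
s2 = (1 + numpy.tanh(x / 2)) / 2

# the two formulations agree to floating-point precision
assert numpy.allclose(s, s2)
```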
Computing more than one thing at the same time
==============================================
Theano supports functions with multiple outputs. For example, we can
compute the :term:`elementwise` absolute difference between two
matrices ``x`` and ``y`` and the squared difference at the same time:
compute the :term:`elementwise` difference, absolute difference, and
squared difference between two matrices ``x`` and ``y`` at the same time:
>>> x, y = T.dmatrices('xy')
>>> diff = x - y
>>> abs_diff = abs(x - y)
>>> abs_diff = abs(diff)
>>> diff_squared = diff**2
>>> f = function([x, y], [abs_diff, diff_squared])
>>> f = function([x, y], [diff, abs_diff, diff_squared])
When we use the function, it will return the three results (the printing
was reformatted for readability):
>>> f([[1, 1], [1, 1]], [[0, 1], [2, 3]])
[array([[ 1., 0.],
[-1., -2.]]),
array([[ 1., 0.],
[ 1., 2.]]),
array([[ 1., 0.],
[ 1., 4.]])]
......@@ -62,9 +86,12 @@ Computing gradients
===================
Now let's use Theano for a slightly more sophisticated task: create a
function which computes the derivative of some expression ``e`` with
function which computes the derivative of some expression ``y`` with
respect to its parameter ``x``. For instance, we can compute the
gradient of :math:`x^2` with respect to :math:`x`.
gradient of :math:`x^2` with respect to :math:`x`. Note that:
:math:`d(x^2)/dx = 2 \cdot x`.
Here is code to compute this gradient:
>>> x = T.dscalar('x')
>>> y = x**2
......@@ -76,17 +103,26 @@ array(8.0)
array(188.40000000000001)
We can also compute the gradient of complex expressions such as the
logistic function defined above:
logistic function defined above. It turns out that the derivative of the
logistic is: :math:`ds(x)/dx = s(x) \cdot (1 - s(x))`.
.. figure:: dlogistic.png
A plot of the gradient of the logistic function, with x on the x-axis
and :math:`ds(x)/dx` on the y-axis.
>>> x = T.dmatrix('x')
>>> s = 1 / (1 + T.exp(-x))
>>> gs = T.grad(s, x)
>>> glogistic = function([x], gs)
>>> dlogistic = function([x], gs)
>>> dlogistic([[0, 1], [-1, -2]])
array([[ 0.25 , 0.19661193],
[ 0.19661193, 0.10499359]])
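The claimed derivative can be cross-checked against a finite difference, using plain numpy (a sanity check independent of Theano):

```python
import numpy

def s(x):
    return 1 / (1 + numpy.exp(-x))

x = numpy.array([0.0, 1.0, -1.0, -2.0])
analytic = s(x) * (1 - s(x))          # ds/dx = s(x) * (1 - s(x))

eps = 1e-6
numeric = (s(x + eps) - s(x - eps)) / (2 * eps)

assert numpy.allclose(analytic, numeric)
assert numpy.allclose(analytic[0], 0.25)   # matches the value at x = 0
```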
The resulting function computes the gradient of its first argument
with respect to the second. It is pretty much equivalent in semantics
and in computational complexity as what you would obtain through an
`automatic differentiation`_ tool.
with respect to the second. In this way, Theano can be used for
`automatic differentiation`_.
.. note::
......@@ -125,7 +161,7 @@ Making a function with state
It is also possible to make a function with an internal state. For
example, let's say we want to make an accumulator: at the beginning,
the state is initialized to zero, then on each function call the state
the state is initialized to zero. Then, on each function call, the state
is incremented by the function's argument. We'll also make it so that
the increment has a default value of 1.
......@@ -136,12 +172,12 @@ First let's define the accumulator function:
>>> new_state = state + inc
>>> accumulator = function([(inc, 1), ((state, new_state), 0)], new_state)
The first argument is a pair. As we saw in the previous section this
simply means that inc is an input with a default value of 1. The
second argument has a new syntax which creates an internal state or
The first argument is a pair. As we saw in the previous section, this
means that ``inc`` is an input with a default value of 1. The
second argument has syntax that creates an internal state or
closure. The syntax is ``((state_result, new_state_result),
initial_value)``. What this means is that every time ``accumulator``
will be called, the value of the internal ``state`` will be replaced
is called, the value of the internal ``state`` will be replaced
by the value computed as ``new_state``. In this case, the state will
be replaced by the result of incrementing it by ``inc``.
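The behavior being described can be mimicked with an ordinary Python closure over a mutable state (a sketch of the semantics, not of Theano's implementation):

```python
def make_accumulator(initial=0):
    state = [initial]            # internal state, replaced on each call
    def accumulator(inc=1):      # inc has a default value of 1
        state[0] = state[0] + inc
        return state[0]
    return accumulator

acc = make_accumulator()
assert acc(5) == 5
assert acc() == 6                # default increment of 1
assert acc(300) == 306
```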
......@@ -152,7 +188,7 @@ however you like as long as the name does not conflict with the names
of other inputs.
Anyway, let's try it out! The state can be accessed using the square
brackets notation ``[]``. You may access the state either by putting
brackets notation ``[]``. You may access the state either by using
the :ref:`result` representing it or the name of that
:ref:`result`. In our example we can access the state either with the
``state`` object or the string 'state'.
......@@ -174,8 +210,8 @@ array(301.0)
>>> accumulator['state']
array(301.0)
It is of course possible to reset the state. This is done very
naturally by assigning to the state using the square brackets
It is possible to reset the state. This is done
by assigning to the state using the square brackets
notation:
>>> accumulator['state'] = 5
......
set terminal svg font "Bitstream Vera Sans,10" size 300,200
set output "logistic.svg"
set xrange [-6:6]
set xzeroaxis linetype -1
set yzeroaxis linetype -1
set xtics axis nomirror
set ytics axis nomirror 0,0.5,1
set key off
set grid
set border 1
set samples 400
plot 1/(1 + exp(-x)) with line linetype rgbcolor "blue" linewidth 2
set ytics axis nomirror 0,0.25
set output "dlogistic.svg"
plot 1/(1 + exp(-x)) * (1 - 1/(1 + exp(-x))) with line linetype rgbcolor "blue" linewidth 2
......@@ -3,11 +3,11 @@
Using Module
============
Now that we're familiar with the basics, we can see Theano's more
Now that we're familiar with the basics, we introduce Theano's more
advanced interface, Module. This interface allows you to define Theano
"objects" which can have many state variables and many methods sharing
these states. This is what you should use if you aim to use Theano to
define complex systems such as a neural network.
these states. This is what you should use to define complex systems such
as a neural network.
Remake of the "state" example
......@@ -61,7 +61,7 @@ defined in our Module.
The inc variable doesn't need to be declared as a Member because it
will only serve as an input to the method we will define. This is why
it is defined as an :ref:`external` variable. Do note that it is
inconsequential if you do declare it as a Member - it is very unlikely
inconsequential if you do declare it as a Member - it is unlikely
to cause you any problems.
.. note::
......
......@@ -52,7 +52,7 @@ object for each of fn and gn).
>>> m.nearly_zeros = Method([], rv_u + rv_u - 2 * rv_u)
This function will always return a 2x2 matrix of very small numbers, or possibly
This function will always return a 2x2 matrix of small numbers, or possibly
zeros. It illustrates that random variables are not re-drawn every time they
are used; they are only drawn once (per call).
......@@ -84,7 +84,7 @@ seed method of a RandomStreamsInstance.
Of course, a RandomStreamsInstance can contain several RandomState instances and
these will _not_ all be seeded to the same seed_value. They will all be seeded
deterministically and very-probably uniquely as a function of the seed_value.
deterministically and probably uniquely as a function of the seed_value.
Seeding the generator in this way makes it possible to repeat random streams.
......
......@@ -22,7 +22,7 @@ much longer than intended - maybe we should just link to it! --OB
Predefined types
----------------
Theano gives you many premade types to work with. These types are
Predefined types are
located in the ``theano.tensor`` package. The names of the types follow
a recipe:
......@@ -53,9 +53,9 @@ col [m, 1] No Yes
matrix [m, n] No No
====== ====== ========================================== =============================================
So for example if you want a row of 32-bit floats, it is available
under ``theano.tensor.frow`` and if you want a matrix of unsigned
32-bit integers it is available under ``theano.tensor.imatrix``.
So, if you want a row of 32-bit floats, it is available
as ``theano.tensor.frow``. If you want a matrix of 32-bit
integers it is available as ``theano.tensor.imatrix``.
Each of the types described above can be constructed by two methods:
a singular version (e.g., ``dmatrix``) and a plural version
......@@ -108,16 +108,18 @@ complex128 complex 128 (two float64)
.. note::
Even though ``theano.tensor`` does not define any type using
``complex`` dtypes (``complex64`` or ``complex128``), you can define
them explicitly with ``Tensor`` (see example below). However, few
operations are fully supported for complex types: as of version 0.1,
only elementary operations (``+-*/``) have C implementations.
Even though ``theano.tensor`` does not define any type
using ``complex`` dtypes (``complex64`` or ``complex128``),
you can define them explicitly with ``Tensor`` (see example
below). However, few operations are fully supported for complex
types: as of version 0.1, only elementary operations (``+-*/``)
have C implementations. Additionally, complex types have received
little testing.
The broadcastable pattern, on the other hand, indicates both the
number of dimensions and whether a particular dimension has length
1. Here is a handy table mapping the :term:`broadcastable
The broadcastable pattern indicates both the number of dimensions and
whether a particular dimension must have length 1.
Here is a table mapping the :term:`broadcastable
<broadcasting>` pattern to what kind of tensor it encodes:
===================== =================================
......@@ -136,14 +138,18 @@ pattern interpretation
[False, False, False] A MxNxP tensor (pattern of a + b)
===================== =================================
For dimensions in which broadcasting is False, the length of this
dimension can be 1 or more. For dimensions in which broadcasting is True,
the length of this dimension must be 1.
When two tensors have a different number of dimensions, the broadcastable
pattern is *expanded to the left*, by padding with ``True``. So, for example,
pattern is *expanded to the left*, by padding with ``True``. For example,
a vector's pattern, ``[False]``, could be expanded to ``[True, False]``, and
would behave like a row (1xN matrix). In the same way, a matrix (``[False,
False]``) would behave like a 1xNxP tensor (``[True, False, False]``).
So if we wanted to create a type representing a 3D array of unsigned
bytes, we would simply do:
If we wanted to create a type representing a 3D array of unsigned
bytes, we would do:
.. code-block:: python
......@@ -158,10 +164,8 @@ bytes, we would simply do:
Ops
===
There's a lot of operations readily available in the ``theano.tensor``
package. They do not require much explanation according to this
tutorial's author, so he will simply direct you to the :ref:`oplist`
:)
There are a lot of operations available in the ``theano.tensor`` package.
See :ref:`oplist`.
......
......@@ -24,7 +24,7 @@ difficult, we will give our Op a solid C implementation.
Implementing a new Op in Python
===============================
You are required to define two
methods: one to create the :ref:`apply` node every time your Op is
applied to some inputs, declaring the outputs in the process, and
another to operate on the inputs. There is also one optional method
......
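The two-method contract can be illustrated with a toy sketch (plain Python, not the actual Theano base classes; only the method names `make_node` and `perform` follow the text above):

```python
# Toy sketch: a real Op would inherit from Theano's Op class and
# build a proper Apply node; here we only mimic the structure.
class DoubleOp(object):
    def make_node(self, x):
        # Would normally declare an output with the same type as the
        # input; here we just record the input and reserve one output slot.
        return {'op': self, 'inputs': [x], 'outputs': [None]}

    def perform(self, node, inputs, output_storage):
        # Compute the result from the input values and place it in the
        # storage cell provided for the first output.
        (x,) = inputs
        output_storage[0][0] = 2 * x

op = DoubleOp()
node = op.make_node(3)
storage = [[None]]
op.perform(node, [3], storage)
print(storage[0][0])  # 6
```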
......@@ -115,6 +115,19 @@ class Function(object):
"""
pickle_aliased_memory_strategy = 'warn'
"""How to deal with pickling finding aliased storage.
Meaningful settings are: 'ignore', 'warn', 'raise'.

If the value is 'ignore', aliased storage is pickled without complaint.

If the value is 'warn', then a message will be printed to stderr if aliased storage is
detected during pickle.dump.

If the value is 'raise', then an AliasedMemoryError will be raised if aliased storage is
detected during pickle.dump.
"""
def __init__(self, fn, input_storage, output_storage, indices, outputs, defaults, unpack_single, maker):
"""
fn -> a function returned by some linker's make_thunk method
......@@ -334,9 +347,29 @@ def _pickle_Function(f):
else:
defaults.append(ins[0])
del ins[0]
inputs_data = [x.data for x in f.input_storage]
# Detect aliased storage: aliased relationships will not be
# preserved across the pickle operation.
if f.pickle_aliased_memory_strategy != 'ignore':
    all_data = defaults + inputs_data
    for i, d_i in enumerate(all_data):
        for j, d_j in enumerate(all_data):
            if ((i < j) and isinstance(d_i, numpy.ndarray)
                    and isinstance(d_j, numpy.ndarray)
                    and numpy.may_share_memory(d_i, d_j)):
                if f.pickle_aliased_memory_strategy == 'warn':
                    print >> sys.stderr, ('WARNING: '
                            'aliased relationship between Function arguments '
                            'will not be preserved by un-pickling operation')
                else:
                    raise AliasedMemoryError(d_i, d_j)
rval = (_constructor_Function, (f.maker, defaults, inputs_data))
return rval
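The aliasing problem this guards against can be demonstrated with plain NumPy and pickle, independently of Theano: two arrays that share memory before pickling come back as independent copies.

```python
import pickle
import numpy as np

x = np.zeros(3)
y = x[:]                # y is a view: x and y are aliased
x[0] = 1.0
assert y[0] == 1.0      # writes to x are visible through y
assert np.may_share_memory(x, y)

# Pickling serializes each array's data separately, so the aliasing
# relationship is not preserved by the round trip.
x2, y2 = pickle.loads(pickle.dumps((x, y)))
x2[0] = 2.0
assert y2[0] == 1.0     # y2 no longer tracks x2
assert not np.may_share_memory(x2, y2)
```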
class AliasedMemoryError(Exception): pass
def _constructor_Function(maker, defaults, data):
f = maker.create(defaults, trustme = True)
assert len(f.input_storage) == len(data)
......
......@@ -1143,13 +1143,11 @@ class Module(ComponentDict):
value=unpack_member_and_external(value)
if not hasattr(self,"local_attr"):
self.__dict__["local_attr"]={}
self.__dict__["local_attr"][attr] = value
def build(self, mode, memo):
for k,v in self.local_attr.iteritems():
self.__setattr__(k,v)
inst = super(Module, self).build(mode, memo)
if not isinstance(inst, ModuleInstance):
......@@ -1181,41 +1179,44 @@ class Module(ComponentDict):
for name, value in chain(init.iteritems(), kwinit.iteritems()):
inst[name] = value
def make_module_instance(self, *args, **kwargs):
    """
    Module's __setattr__ method hides all members under local_attr. This
    method iterates over those elements and wraps them so they can be used
    in a computation graph. The "wrapped" members are then set as object
    attributes accessible through the dotted notation syntax (<module_name>
    <dot> <member_name>). Submodules are handled recursively.
    """
    # Function to go through member lists and dictionaries recursively,
    # to look for submodules on which make_module_instance needs to be called
    def recurse(v):
        iter = enumerate(v) if isinstance(v, list) else v.iteritems()
        for sk, sv in iter:
            if isinstance(sv, (list, dict)):
                sv = recurse(sv)
            elif isinstance(sv, Module):
                sv = sv.make_module_instance(args, kwargs)
            v[sk] = sv
        return v

    for k, v in self.local_attr.iteritems():
        if isinstance(v, Module):
            v = v.make_module_instance(args, kwargs)
            self[k] = self.__wrapper__(v)
        elif isinstance(v, Method):
            self.__setitem__(k, v)
        else:
            # iterate through lists and dictionaries to wrap submodules
            if isinstance(v, (list, dict)):
                self[k] = self.__wrapper__(recurse(v))
            try:
                self[k] = self.__wrapper__(v)
            except:
                if isinstance(v, Component):
                    raise
                else:
                    self.__dict__[k] = v
    return self
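The recursive descent into member lists and dictionaries can be sketched as a standalone toy in modern Python (same container convention as above; the wrapper is an arbitrary function here):

```python
def wrap_nested(v, wrap):
    # Walk lists and dicts recursively, applying wrap to the leaves,
    # mirroring how make_module_instance descends into submodule containers.
    items = enumerate(v) if isinstance(v, list) else v.items()
    for k, sv in list(items):
        if isinstance(sv, (list, dict)):
            wrap_nested(sv, wrap)
        else:
            v[k] = wrap(sv)
    return v

data = {'a': [1, 2], 'b': {'c': 3}}
print(wrap_nested(data, lambda x: x * 10))  # {'a': [10, 20], 'b': {'c': 30}}
```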
def make(self, *args, **kwargs):
......@@ -1226,7 +1227,7 @@ class Module(ComponentDict):
arguments and the keyword arguments. If 'mode' is in the
keyword arguments it will be passed to build().
"""
self.make_module_instance(args,kwargs)
mode = kwargs.pop('mode', default_mode)
rval = self.make_no_init(mode)
......
......@@ -4,7 +4,9 @@
__docformat__ = "restructuredtext en"
import cPickle, numpy, unittest
from theano.compile.mode import default_mode
from theano.compile.module import *
from theano.compile.function_module import AliasedMemoryError
import theano.tensor as T
import sys
import theano
......@@ -570,7 +572,8 @@ def test_pickle():
M.f = Method([a], a + M.x + M.y)
M.g = Method([a], a * M.x * M.y)
mode = default_mode if default_mode != 'DEBUG_MODE' else 'FAST_RUN'
m = M.make(x=numpy.zeros((4,5)), y=numpy.ones((2,3)), mode=mode)
m_dup = cPickle.loads(cPickle.dumps(m))
......@@ -587,38 +590,56 @@ def test_pickle():
assert m_dup.y is m_dup.g.input_storage[2].data
def test_pickle_aliased_memory():
    M = Module()
    M.x = (T.dmatrix())
    M.y = (T.dmatrix())
    a = T.dmatrix()
    M.f = Method([a], a + M.x + M.y)
    M.g = Method([a], a * M.x * M.y)
    mode = default_mode if default_mode != 'DEBUG_MODE' else 'FAST_RUN'
    m = M.make(x=numpy.zeros((4,5)), y=numpy.ones((2,3)), mode=mode)
    m.y = m.x[:]
    #m's x and y memory is aliased....
    m.x[0,0] = 3.14
    assert m.y[0,0] == 3.14

    # the 'warn' strategy should print a warning to stderr during pickling
    import StringIO
    sio = StringIO.StringIO()
    old_stderr = sys.stderr
    sys.stderr = sio
    m.f.pickle_aliased_memory_strategy = 'warn'
    m.g.pickle_aliased_memory_strategy = 'warn'
    m_dup = cPickle.loads(cPickle.dumps(m))
    sys.stderr = old_stderr
    assert sio.getvalue().startswith('WARNING: aliased relat')

    # the 'raise' strategy should raise AliasedMemoryError during pickling
    try:
        m.f.pickle_aliased_memory_strategy = 'raise'
        m.g.pickle_aliased_memory_strategy = 'raise'
        m_dup = cPickle.loads(cPickle.dumps(m))
    except AliasedMemoryError, e:
        return
    assert 0 #should have failed to pickle
if __name__ == '__main__':
......