nnet ops docs

5971ef06 · Dumitru Erhan · 18fae10e · 5971ef06
--- a/doc/library/tensor/nnet.txt
+++ b/doc/library/tensor/nnet.txt
@@ -9,15 +9,105 @@
   :synopsis: Ops for neural networks
 .. moduleauthor:: LISA
+.. function:: sigmoid(x)
-TODO: Give examples for how to use these things! They are pretty complicated.
+   Returns the standard sigmoid nonlinearity applied to x
+    :Parameters: *x* - symbolic Tensor (or compatible)
+    :Return type: same as x
+    :Returns: element-wise sigmoid: :math:`sigmoid(x) = \frac{1}{1 + \exp(-x)}`. 
-.. function:: sigmoid(*todo)
+Example:
-.. function:: softplus(*todo)
+.. code-block:: python
-.. function:: softmax(*todo)
+    x,y,b = T.dvectors('x','y','b')
+    W = T.dmatrix('W')
+    y = T.nnet.sigmoid(T.dot(W,x) + b)
-.. function:: binary_crossentropy(*todo)
+.. note:: The underlying code will return an exact 0 or 1 if an element of x is too small or too big.
-.. function:: categorical_crossentropy(*todo)
+.. function:: softplus(x)
+   Returns the softplus nonlinearity applied to x
+    :Parameter: *x* - symbolic Tensor (or compatible)
+    :Return type: same as x
+    :Returns: elementwise softplus: :math:`softplus(x) = \log_e{\left(1 + \exp(x)\right)}`. 
+.. note:: The underlying code will return an exact 0 if an element of x is too small.
+.. code-block:: python
+    x,y,b = T.dvectors('x','y','b')
+    W = T.dmatrix('W')
+    y = T.nnet.softplus(T.dot(W,x) + b)
+.. function:: softmax(x)
+   Returns the softmax function of x:
+    :Parameter: *x* symbolic **2D** Tensor (or compatible). 
+    :Return type: same as x
+    :Returns: a symbolic 2D tensor whose ijth element is  :math:`softmax_{ij}(x) = \frac{\exp{x_{ij}}}{\sum_k\exp(x_{ik})}`. 
+The softmax function will, when applied to a matrix, compute the softmax values row-wise. 
+.. code-block:: python
+    x,y,b = T.dvectors('x','y','b')
+    W = T.dmatrix('W')
+    y = T.nnet.softmax(T.dot(W,x) + b)
+.. function:: binary_crossentropy(output,target)
+   Computes the binary cross-entropy between a target and an output:
+    :Parameters:
+       * *target* - symbolic Tensor (or compatible)
+       * *output* - symbolic Tensor (or compatible)
+    :Return type: same as target
+    :Returns: a symbolic tensor, where the following is applied elementwise :math:`crossentropy(t,o) = -(t\cdot log(o) + (1 - t) \cdot log(1 - o))`. 
+The following block implements a simple auto-associator with a sigmoid
+nonlinearity and a reconstruction error which corresponds to the binary
+cross-entropy (note that this assumes that x will contain values between 0 and
+1):
+.. code-block:: python
+    x,y,b = T.dvectors('x','y','b')
+    W = T.dmatrix('W') 
+    h = T.nnet.sigmoid(T.dot(W,x) + b)
+    x_recons = T.nnet.sigmoid(T.dot(V,h) + c)
+    recon_cost = T.nnet.binary_crossentropy(x_recons,x).mean()
+.. function:: categorical_crossentropy(coding_dist,true_dist)
+    Return the cross-entropy between an approximating distribution and a true distribution. 
+    The cross entropy between two probability distributions measures the average number of bits
+    needed to identify an event from a set of possibilities, if a coding scheme is used based
+    on a given probability distribution q, rather than the "true" distribution p. Mathematically, this 
+    function computes :math:`H(p,q) = - \sum_x p(x) \log(q(x))`, where
+    p=coding_dist and q=true_dist
+    :Parameters:
+       * *coding_dist* - symbolic 2D Tensor (or compatible). Each row
+         represents a distribution.
+       * *true_dist* - symbolic 2D Tensor **OR** symbolic vector of ints.  In
+         the case of an integer vector argument, each element represents the
+         position of the '1' in a 1-of-N encoding (aka "one-hot" encoding)
+    :Return type: tensor of rank one-less-than `coding_dist`
+.. note:: An application of the scenario where *true_dist* has a 1-of-N representation
+    is in classification with softmax outputs. If `coding_dist` is the output of
+    the softmax and `true_dist` is a vector of correct labels, then the function
+    will compute ``y_i = - \log(coding_dist[i, one_of_n[i]])``, which corresponds
+    to computing the neg-log-probability of the correct class (which is typically
+    the training criterion in classification settings).
+.. code-block:: python
+    y = T.nnet.softmax(T.dot(W,x) + b)
+    cost = T.nnet.categorical_crossentropy(y,o)
+    # o is either the above-mentioned 1-of-N vector or 2D tensor