Commit a015777b authored by Frédéric Bastien, committed by GitHub

Merge pull request #6326 from abergeron/dnn_rnn_doc

Add some documentation to RNNBlock.
@@ -149,6 +149,94 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
- Spatial Transformer:
- :func:`theano.gpuarray.dnn.dnn_spatialtf`.

cuDNN RNN Example
=================
This is a code example of using the cuDNN RNN functionality. We
present the code with some commentary in between to explain some
peculiarities. The terminology assumes that you are familiar with
the structure of RNNs.
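The snippets below assume imports along the following lines. This is
a sketch; the exact import paths (in particular for
`gpuarray_shared_constructor`) are assumptions based on the names
used later in this example.

.. code-block:: python

    import numpy as np

    import theano
    import theano.tensor as T
    # Assumed import paths for the cuDNN wrappers and the GPU shared
    # variable constructor used below.
    from theano.gpuarray import dnn
    from theano.gpuarray.type import gpuarray_shared_constructor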
.. code-block:: python

    dtype = 'float32'
    input_dim = 32
    hidden_dim = 16
    batch_size = 2
    depth = 3
    timesteps = 5
To clarify the rest of the code, we define some variables to hold the sizes.
.. code-block:: python

    X = T.tensor3('X')
    Y = T.tensor3('Y')
    h0 = T.tensor3('h0')
We also define some Theano variables to work with. Here `X` is the
input, `Y` is the output (as in expected output) and `h0` is the
initial state for the recurrent inputs.
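As a hedged illustration of the expected shapes (assuming the usual
cuDNN time-major layout, with one initial hidden state per layer),
concrete inputs could be built like this:

.. code-block:: python

    # Assumption: the input is (timesteps, batch, input_dim), the
    # output is (timesteps, batch, hidden_dim) and the initial state
    # is (depth, batch, hidden_dim).
    x_val = np.random.random((timesteps, batch_size, input_dim)).astype(dtype)
    y_val = np.random.random((timesteps, batch_size, hidden_dim)).astype(dtype)
    h0_val = np.zeros((depth, batch_size, hidden_dim), dtype=dtype)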
.. code-block:: python

    rnnb = dnn.RNNBlock(dtype, hidden_dim, depth, 'gru')
This defines an RNNBlock. This is a departure from the usual Theano
operations in that it is structured more like a layer than a single
operation. This is a constraint imposed by the underlying cuDNN API.
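The last argument selects the RNN type; the supported modes are the
ones listed later in this example ('rnn_relu', 'rnn_tanh', 'gru' and
'lstm'). For instance, an LSTM block would be built the same way:

.. code-block:: python

    # Same constructor, different mode; an LSTM additionally takes an
    # initial cell state when applied (see `apply` below).
    lstm_rnnb = dnn.RNNBlock(dtype, hidden_dim, depth, 'lstm')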
.. code-block:: python

    psize = rnnb.get_param_size([batch_size, input_dim])
    params_cudnn = gpuarray_shared_constructor(
        np.zeros((psize,), dtype=theano.config.floatX))
Here we allocate space for the trainable parameters of the RNN. The
first function tells us how many elements we will need to store the
parameters. This space is for all the parameters of all the layers
inside the RNN, and its layout is opaque.
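Zeros are used above for brevity; for actual training you would most
likely fill the buffer with small random values instead. A minimal
sketch (the uniform range is an arbitrary choice):

.. code-block:: python

    # Overwrite the zero-initialized opaque buffer with random values.
    params_cudnn.set_value(
        np.random.uniform(-0.08, 0.08, (psize,)).astype(dtype))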
.. code-block:: python

    layer = 0
    # (Variable name assumed; it holds the per-layer parameter views.)
    layer_params = rnnb.split_params(params_cudnn, layer,
                                     [batch_size, input_dim])
If you need to access the parameters individually, you can call
`split_params` on your shared variable to get all the parameters for
a single layer. The order and number of the returned items depend on
the type of RNN:
rnn_relu, rnn_tanh
    input, recurrent

gru
    input reset, input update, input newmem, recurrent reset,
    recurrent update, recurrent newmem

lstm
    input input gate, input forget gate, input newmem gate,
    input output gate, recurrent input gate, recurrent forget gate,
    recurrent newmem gate, recurrent output gate
Each of these elements is composed of a weight matrix and a bias
vector.
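Since the opaque layout is only reachable through these views, one way
to sanity-check a network is to loop over the layers and print each
view's shape. This is a sketch that assumes the returned items are GPU
array views exposing a `shape` attribute, in the per-type order listed
above (one weight matrix followed by its bias vector per element):

.. code-block:: python

    for l in range(depth):
        # Assumption: each item is a GPU view with a .shape attribute.
        for p in rnnb.split_params(params_cudnn, l,
                                   [batch_size, input_dim]):
            print(l, p.shape)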
.. code-block:: python

    y, hy = rnnb.apply(params_cudnn, X, h0)
This is more akin to an op in Theano in that it applies the RNN
operation to a set of symbolic inputs and returns symbolic outputs.
`y` is the output and `hy` is the final state of the recurrent inputs.
After this, the gradient works as usual, so you can treat the returned
symbolic outputs as normal Theano symbolic variables.
List of Implemented Operations
==============================
@@ -370,7 +370,7 @@ Consider again the logistic regression:
.. testoutput::
   :hide:
-   :options: + ELLIPSIS
+   :options: +ELLIPSIS

   Used the cpu
   target values for D
@@ -2508,7 +2508,7 @@ class GpuDnnRNNGradWeights(DnnBase):
class RNNBlock(object):
    """
-    An object that allow us to use CuDNN v5 RNN implementation.
+    An object that allows us to use the CuDNN RNN implementation.
    TODO: add an example of how to use this. For now, you can check the
    Theano tests test_dnn_rnn_gru() and test_dnn_rnn_lstm() in the file
    theano/gpuarray/tests/test_dnn.py.
@@ -2549,6 +2549,20 @@ class RNNBlock(object):
        self.dtype = dtype

    def get_param_size(self, input_size):
        """
        Get the size of the shared variable for the parameters of the RNN.

        This will return a size (in items) necessary to store all the
        parameters for the RNN. You should allocate a variable of
        that size to store those parameters. The order and layout of
        the parameters are opaque.

        Parameters
        ----------
        input_size: (int, int)
            Size of the input blocks

        """
        # _get_param_size reports a size in bytes; it is converted to a
        # number of items of `self.dtype` before being returned.
        bytesize = _get_param_size(self.desc, input_size, self.dtype,
                                   self.context_name)
        bytesize = int(bytesize)
@@ -2556,11 +2570,38 @@ class RNNBlock(object):
        return bytesize // np.dtype(self.dtype).itemsize
    def split_params(self, w, layer, input_size):
        """
        Split the opaque parameter block into components.

        Parameters
        ----------
        w: GpuArraySharedVariable
            opaque parameter block
        layer: int
            ID of the layer
        input_size: (int, int)
            Size of the input blocks

        """
        if not isinstance(w, GpuArraySharedVariable):
            raise TypeError("split_params only works on gpuarray shared variables")
        return _split_rnn_params(w, self.desc, layer, input_size, self.dtype, self.rnn_mode)
    def apply(self, w, x, hx, cx=None):
        """
        Apply the RNN to some data.

        Parameters
        ----------
        w:
            opaque parameter block
        x:
            input
        hx:
            initial hidden state
        cx:
            initial cell state (for LSTM)

        """
        # Don't return the reserve as an output
        return GpuDnnRNNOp(self.rnn_mode, self.direction_mode)(
            rnndesc_type.make_constant(self.desc),