Remove gpuarray documentation

32b79040 · Maxim Kochurov · Brandon T. Willard · 0e3182d1 · 0e3182d1 · 0e3182d1
--- a/doc/library/gpuarray/ctc.rst
+++ b/doc/library/gpuarray/ctc.rst
-.. _libdoc_gpuarray_ctc:
-
-================================================================================
-:mod:`aesara.gpuarray.ctc` -- Connectionist Temporal Classification (CTC) loss
-================================================================================
-
-
-.. warning::
-
-    This is not the recomanded user interface. Use :ref:`the CPU
-    interface <libdoc_tensor_nnet_ctc>`. It will get moved
-    automatically to the GPU.
-
-.. note::
-
-    Usage of connectionist temporal classification (CTC) loss Op, requires that
-    the `warp-ctc <https://github.com/baidu-research/warp-ctc>`_ library is
-    available. In case the warp-ctc library is not in your compiler's library path,
-    the :attr:`config.ctc__root` configuration option must be appropriately set to the
-    directory containing the warp-ctc library files.
-
-.. note::
-
-    Unfortunately, Windows platforms are not yet supported by the underlying
-    library.
-
-.. module:: aesara.gpuarray.ctc
-   :platform: Unix
-   :synopsis: Connectionist temporal classification (CTC) loss Op, using the warp-ctc library
-.. moduleauthor:: `João Victor Risso <https://github.com/joaovictortr>`_
-
-.. autofunction:: aesara.gpuarray.ctc.gpu_ctc
-.. autoclass:: aesara.gpuarray.ctc.GpuConnectionistTemporalClassification
--- a/doc/library/gpuarray/dnn.rst
+++ b/doc/library/gpuarray/dnn.rst
-.. _libdoc_gpuarray_dnn:
-
-===========================================
-:mod:`aesara.gpuarray.dnn` -- cuDNN
-===========================================
-
-.. moduleauthor:: LISA
-
-`cuDNN <https://developer.nvidia.com/cuDNN>`_ is an NVIDIA library
-with functionality used by deep neural networks. It provides optimized
-versions of some operations like the convolution. cuDNN is not
-currently installed with CUDA. You must download and install it
-yourself.
-
-To install it, decompress the downloaded file and make the ``*.h`` and
-``*.so*`` files available to the compilation environment.
-There are at least three possible ways of doing so:
-
- The easiest is to include them in your CUDA installation. Copy the
-  ``*.h`` files to ``CUDA_ROOT/include`` and the ``*.so*`` files to
-  ``CUDA_ROOT/lib64`` (by default, ``CUDA_ROOT`` is ``/usr/local/cuda``
-  on Linux).
- Alternatively, on Linux, you can set the environment variables
-  ``LD_LIBRARY_PATH``, ``LIBRARY_PATH`` and ``CPATH`` to the directory
-  extracted from the download. If needed, separate multiple directories
-  with ``:`` as in the ``PATH`` environment variable.
-
-  example::
-
-      export LD_LIBRARY_PATH=/home/user/path_to_CUDNN_folder/lib64:$LD_LIBRARY_PATH
-      export CPATH=/home/user/path_to_CUDNN_folder/include:$CPATH
-      export LIBRARY_PATH=/home/user/path_to_CUDNN_folder/lib64:$LD_LIBRARY_PATH
-
- And as a third way, also on Linux, you can copy the ``*.h`` files
-  to ``/usr/include`` and the ``*.so*`` files to ``/lib64``.
-
-By default, Aesara will detect if it can use cuDNN. If so, it will use
-it.  If not, Aesara optimizations will not introduce cuDNN ops. So
-Aesara will still work if the user did not introduce them manually.
-
-To get an error if Aesara can not use cuDNN, use this Aesara flag:
-``optimizer_including=cudnn``.
-
-.. note::
-
-   cuDNN v5.1 is supported in Aesara master version. So it dropped cuDNN v3 support.
-
-.. note::
-
-   Starting in cuDNN v3, multiple convolution implementations are offered and
-   it is possible to use heuristics to automatically choose a convolution
-   implementation well suited to the parameters of the convolution.
-
-   The Aesara flag ``dnn__conv__algo_fwd`` allows to specify the cuDNN
-   convolution implementation that Aesara should use for forward convolutions.
-   Possible values include :
-
-   * ``small`` (default) : use a convolution implementation with small memory
-     usage
-   * ``none`` : use a slower implementation with minimal memory usage
-   * ``large`` : use a sometimes faster implementation with large memory usage
-   * ``fft`` : use the Fast Fourier Transform implementation of convolution
-     (very high memory usage)
-   * ``guess_once`` : the first time a convolution is executed, the
-     implementation to use is chosen according to cuDNN's heuristics and reused
-     for every subsequent execution of the convolution.
-   * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
-     implementation selected every time the shapes of the inputs and kernels
-     don't match the shapes from the last execution.
-   * ``time_once`` : the first time a convolution is executed, every convolution
-     implementation offered by cuDNN is executed and timed. The fastest is
-     reused for every subsequent execution of the convolution.
-   * ``time_on_shape_change`` : like ``time_once`` but a new convolution
-     implementation selected every time the shapes of the inputs and kernels
-     don't match the shapes from the last execution.
-
-   The Aesara flag ``dnn.conv.algo_bwd`` allows to specify the cuDNN
-   convolution implementation that Aesara should use for gradient convolutions.
-   Possible values include :
-
-   * ``none`` (default) : use the default non-deterministic convolution
-     implementation
-   * ``deterministic`` : use a slower but deterministic implementation
-   * ``fft`` : use the Fast Fourier Transform implementation of convolution
-     (very high memory usage)
-   * ``guess_once`` : the first time a convolution is executed, the
-     implementation to use is chosen according to cuDNN's heuristics and reused
-     for every subsequent execution of the convolution.
-   * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
-     implementation selected every time the shapes of the inputs and kernels
-     don't match the shapes from the last execution.
-   * ``time_once`` : the first time a convolution is executed, every convolution
-     implementation offered by cuDNN is executed and timed. The fastest is
-     reused for every subsequent execution of the convolution.
-   * ``time_on_shape_change`` : like ``time_once`` but a new convolution
-     implementation selected every time the shapes of the inputs and kernels
-     don't match the shapes from the last execution.
-
-   ``guess_*`` and ``time_*`` flag values take into account the amount of
-   available memory when selecting an implementation. This means that slower
-   implementations might be selected if not enough memory is available for the
-   faster implementations.
-
-.. note::
-
-    Normally you should not call GPU Ops directly, but the CPU interface
-    currently does not allow all options supported by cuDNN ops. So it is
-    possible that you will need to call them manually.
-
-.. note::
-
-    The documentation of CUDNN tells that, for the 2 following operations, the
-    reproducibility is not guaranteed with the default implementation:
-    `cudnnConvolutionBackwardFilter` and `cudnnConvolutionBackwardData`.
-    Those correspond to the gradient wrt the weights and the gradient wrt the
-    input of the convolution. They are also used sometimes in the forward
-    pass, when they give a speed up.
-
-    The Aesara flag ``dnn.conv.algo_bwd`` can be use to force the use of a
-    slower but deterministic convolution implementation.
-
-.. note::
-
-    There is a problem we do not understand yet when cudnn paths are
-    used with symbolic links. So avoid using that.
-
-.. note::
-
-    cudnn.so* must be readable and executable by everybody.
-    cudnn.h must be readable by everybody.
-
-
- Convolution:
-    - :func:`aesara.gpuarray.dnn.dnn_conv`, :func:`aesara.gpuarray.dnn.dnn_conv3d`.
-    - :func:`aesara.gpuarray.dnn.dnn_gradweight`, :func:`aesara.gpuarray.dnn.dnn_gradweight3d`.
-    - :func:`aesara.gpuarray.dnn.dnn_gradinput`, :func:`aesara.gpuarray.dnn.dnn_gradinput3d`.
- Pooling:
-    - :func:`aesara.gpuarray.dnn.dnn_pool`.
- Batch Normalization:
-    - :func:`aesara.gpuarray.dnn.dnn_batch_normalization_train`
-    - :func:`aesara.gpuarray.dnn.dnn_batch_normalization_test`.
- RNN:
-    - :class:`aesara.gpuarray.dnn.RNNBlock`
- Softmax:
-    - You can manually use the op :class:`GpuDnnSoftmax
-      <aesara.gpuarray.dnn.GpuDnnSoftmax>` to use its extra feature.
- Spatial Transformer:
-    - :func:`aesara.gpuarray.dnn.dnn_spatialtf`.
-
-
-cuDNN RNN Example
-=================
-
-This is a code example of using the cuDNN RNN functionality.  We
-present the code with some commentary in between to explain some
-peculiarities.
-
-The terminology here assumes that you are familiar with RNN structure.
-
-.. code-block:: python
-
-    dtype = 'float32'
-    input_dim = 32
-    hidden_dim = 16
-    batch_size = 2
-    depth = 3
-    timesteps = 5
-
-To clarify the rest of the code we define some variables to hold sizes.
-
-.. code-block:: python
-
-    from aesara.tensor.type import tensor3
-
-    X = tensor3('X')
-    Y = tensor3('Y')
-    h0 = tensor3('h0')
-
-We also define some Aesara variables to work with.  Here `X` is input,
-`Y` is output (as in expected output) and `h0` is the initial state
-for the recurrent inputs.
-
-.. code-block:: python
-
-    rnnb = dnn.RNNBlock(dtype, hidden_dim, depth, 'gru')
-
-This defines an RNNBlock.  This is a departure from usual Aesara
-operations in that it has the structure of a layer more than a
-separate operation.  This is constrained by the underlying API.
-
-.. code-block:: python
-
-    psize = rnnb.get_param_size([batch_size, input_dim])
-    params_cudnn = gpuarray_shared_constructor(
-        np.zeros((psize,), dtype=aesara.config.floatX))
-
-Here we allocate space for the trainable parameters of the RNN.  The
-first function tells us how many elements we will need to store the
-parameters.  This space if for all the parameters of all the layers
-inside the RNN and the layout is opaque.
-
-.. code-block:: python
-
-   layer = 0
-    = rnnb.split_params(params_cudnn, layer,
-                                  [batch_size, input_dim])
-
-If you need to access the parameters individually, you can call
-split_params on your shared variable to get all the parameters for a
-single layer. The order and number of returned items depends on the
-type of RNN.
-
-rnn_relu, rnn_tanh
-  input, recurrent
-
-gru
-  input reset, input update, input newmem, recurrent reset, recurrent
-  update, recurrent newmem
-
-lstm
-  input input gate, input forget gate, input newmem gate, input output
-  gate, recurrent input gate, recurrent update gate, recurrent newmem
-  gate, recurrent output gate
-
-All of these elements are composed of a weights and bias (matrix and
-vector).
-
-.. code-block:: python
-
-    y, hy = rnnb.apply(params_cudnn, X, h0)
-
-This is more akin to an op in Aesara in that it will apply the RNN
-operation to a set of symbolic inputs and return symbolic outputs.
-`y` is the output, `hy` is the final state for the recurrent inputs.
-
-After this, the gradient works as usual so you can treat the returned
-symbolic outputs as normal Aesara symbolic variables.
-
-
-List of Implemented Operations
-==============================
-
-.. automodule:: aesara.gpuarray.dnn
-   :members:
--- a/doc/library/gpuarray/extra.rst
+++ b/doc/library/gpuarray/extra.rst
-.. _libdoc_gpuarray_extra:
-
-=================
-Utility functions
-=================
-
-Optimization
------------
-
-.. automodule:: aesara.gpuarray.opt_util
-   :members:
-
-Kernel generation
-----------------
-
-.. automodule:: aesara.gpuarray.kernel_codegen
-   :members:
-
-float16
-------
-
-.. automodule:: aesara.gpuarray.fp16_help
-   :members:
--- a/doc/library/gpuarray/fft.rst
+++ b/doc/library/gpuarray/fft.rst
-.. _libdoc_gpuarray_fft:
-
-=====================================================
-:mod:`aesara.gpuarray.fft` -- Fast Fourier Transforms
-=====================================================
-
-Performs Fast Fourier Transforms (FFT) on the GPU.
-
-FFT gradients are implemented as the opposite Fourier transform of the output gradients.
-
-.. note ::
-    You must install `scikit-cuda <http://scikit-cuda.readthedocs.io/en/latest>`_
-    to compute Fourier transforms on the GPU.
-
-
-.. warning ::
-    The real and imaginary parts of the Fourier domain arrays are stored as a pair of float32
-    arrays, emulating complex64. Since aesara has limited support for complex
-    number operations, care must be taken to manually implement operations such as gradients.
-
-.. automodule:: aesara.gpuarray.fft
-   :members: curfft, cuirfft
-
-For example, the code below performs the real input FFT of a box function, which is a sinc function.
-The absolute value is plotted, since the phase oscillates due to the box function being
-shifted to the middle of the array. The Aesara flag ``device=cuda{0,1...}`` must be used.
-
-.. testcode::
-
-    import numpy as np
-    import aesara
-    import aesara.tensor as at
-    from aesara.gpuarray import fft
-
-    x = at.matrix('x', dtype='float32')
-
-    rfft = fft.curfft(x, norm='ortho')
-    f_rfft = aesara.function([x], rfft)
-
-    N = 1024
-    box = np.zeros((1, N), dtype='float32')
-    box[:, N/2-10: N/2+10] = 1
-
-    out = f_rfft(box)
-    c_out = np.asarray(out[0, :, 0] + 1j*out[0, :, 1])
-    abs_out = abs(c_out)
-
-.. testoutput::
-   :hide:
-   :options: +SKIP
-
-   ...
-
-.. image:: plot_fft.png
--- a/doc/library/gpuarray/index.rst
+++ b/doc/library/gpuarray/index.rst
-
-.. _libdoc_gpuarray:
-
-=======================================================
-:mod:`gpuarray` -- The (new) GPU backend
-=======================================================
-
-.. module:: aesara.gpuarray
-   :platform: Unix, Windows
-   :synopsis: Code for GPU programming (new)
-.. moduleauthor:: MILA
-
-.. toctree::
-    :maxdepth: 1
-
-    op
-    dnn
-    fft
-    type
-    extra
-    ctc
-    linalg
--- a/doc/library/gpuarray/linalg.rst
+++ b/doc/library/gpuarray/linalg.rst
-.. _libdoc_gpuarray_linalg:
-
-=========================================================
-:mod:`aesara.gpuarray.linalg` -- Linear algebra operation
-=========================================================
-
-
-.. warning::
-
-   Some operation need Magma to be installed and the Aesara flags
-   :attr:`config.magma__enabled=True` to be activated. See also the
-   flags :attr:`config.magma__include_path` and
-   :attr:`config.magma__library_path`.
-
-Linalg Op
-=========
-
-.. automodule:: aesara.gpuarray.linalg
-    :members:
--- a/doc/library/gpuarray/op.rst
+++ b/doc/library/gpuarray/op.rst
-.. _libdoc_gpuarray_op:
-
-================================
-List of gpuarray Ops implemented
-================================
-
-.. moduleauthor:: LISA
-
-Normally you should not call directly those Ops! Aesara should
-automatically transform CPU ops to their GPU equivalent. So this list
-is just useful to let people know what is implemented on the GPU.
-
-Basic Op
-========
-
-.. automodule:: aesara.gpuarray.basic_ops
-    :members:
-
-Blas Op
-=======
-
-.. automodule:: aesara.gpuarray.blas
-    :members:
-
-Elemwise Op
-===========
-
-.. automodule:: aesara.gpuarray.elemwise
-    :members:
-
-Subtensor Op
-============
-
-.. automodule:: aesara.gpuarray.subtensor
-    :members:
-
-Nnet Op
-=======
-
-.. automodule:: aesara.gpuarray.nnet
-    :members:
-
-.. automodule:: aesara.gpuarray.neighbours
-    :members:
--- a/doc/library/gpuarray/plot_fft.png
+++ b/doc/library/gpuarray/plot_fft.png
--- a/doc/library/gpuarray/type.rst
+++ b/doc/library/gpuarray/type.rst
-.. _libdoc_gpuarray_type:
-
-===================================================
-:mod:`aesara.gpuarray.type` -- Type classes
-===================================================
-
-.. automodule:: aesara.gpuarray.type
-   :members: