Commit 4c685afb authored by Rémi Louf, committed by Brandon T. Willard

Remove mentions of `aesara.tensor.signal` and `aesara.tensor.nnet` in documentation

Parent cf4709d8
@@ -209,29 +209,6 @@ Here is an example showing how to use :func:`verify_grad` on an :class:`Op` inst
rng = np.random.default_rng(42)
aesara.gradient.verify_grad(at.Flatten(), [a_val], rng=rng)
Here is another example, showing how to verify the gradient w.r.t. a subset of
an :class:`Op`'s inputs. This is useful in particular when the gradient w.r.t. some of
the inputs cannot be computed by finite difference (e.g. for discrete inputs),
which would cause :func:`verify_grad` to crash.
.. testcode::
def test_crossentropy_softmax_grad():
op = at.nnet.crossentropy_softmax_argmax_1hot_with_bias
def op_with_fixed_y_idx(x, b):
# Input `y_idx` of this `Op` takes integer values, so we fix them
# to some constant array.
# Although this `Op` has multiple outputs, we can return only one.
# Here, we return the first output only.
return op(x, b, y_idx=np.asarray([0, 2]))[0]
x_val = np.asarray([[-1, 0, 1], [3, 2, 1]], dtype='float64')
b_val = np.asarray([1, 2, 3], dtype='float64')
rng = np.random.default_rng(42)
aesara.gradient.verify_grad(op_with_fixed_y_idx, [x_val, b_val], rng=rng)
.. note::
Although :func:`verify_grad` is defined in :mod:`aesara.gradient`, unittests
......
@@ -104,7 +104,7 @@
"\n",
"wy = th.shared(rng.normal(0, 1, (nhiddens, noutputs)))\n",
"by = th.shared(np.zeros(noutputs), borrow=True)\n",
-"y = at.nnet.softmax(at.dot(h, wy) + by)\n",
+"y = at.math.softmax(at.dot(h, wy) + by)\n",
"\n",
"predict = th.function([x], y)"
]
......
@@ -67,7 +67,7 @@ hidden layer and a softmax output layer.
wy = th.shared(rng.normal(0, 1, (nhiddens, noutputs)))
by = th.shared(np.zeros(noutputs), borrow=True)
-y = at.nnet.softmax(at.dot(h, wy) + by)
+y = at.math.softmax(at.dot(h, wy) + by)
predict = th.function([x], y)
......
.. _libdoc_neighbours:
===================================================================
:mod:`sandbox.neighbours` -- Neighbours Ops
===================================================================
.. module:: sandbox.neighbours
:platform: Unix, Windows
:synopsis: Neighbours Ops
.. moduleauthor:: LISA
:ref:`Moved <libdoc_tensor_nnet_neighbours>`
@@ -18,9 +18,7 @@ They are grouped into the following sections:
:maxdepth: 1
basic
nnet/index
random/index
signal/index
utils
elemwise
extra_ops
......
.. _libdoc_tensor_nnet_basic:
======================================================
:mod:`basic` -- Basic Ops for neural networks
======================================================
.. module:: aesara.tensor.nnet.basic
:platform: Unix, Windows
:synopsis: Ops for neural networks
.. moduleauthor:: LISA
- Sigmoid
- :func:`sigmoid`
- :func:`ultra_fast_sigmoid`
- :func:`hard_sigmoid`
- Others
- :func:`softplus`
- :func:`softmax`
- :func:`softsign`
- :func:`relu() <aesara.tensor.nnet.relu>`
- :func:`elu() <aesara.tensor.nnet.elu>`
- :func:`selu() <aesara.tensor.nnet.selu>`
- :func:`binary_crossentropy`
- :func:`sigmoid_binary_crossentropy`
- :func:`.categorical_crossentropy`
- :func:`h_softmax() <aesara.tensor.nnet.h_softmax>`
- :func:`confusion_matrix <aesara.tensor.nnet.confusion_matrix>`
.. function:: sigmoid(x)
Returns the standard sigmoid nonlinearity applied to x
:Parameters: *x* - symbolic Tensor (or compatible)
:Return type: same as x
:Returns: element-wise sigmoid: :math:`sigmoid(x) = \frac{1}{1 + \exp(-x)}`.
:note: see :func:`ultra_fast_sigmoid` or :func:`hard_sigmoid` for faster versions.
Speed comparison for 100M float64 elements on a Core2 Duo @ 3.16 GHz:
- hard_sigmoid: 1.0s
- ultra_fast_sigmoid: 1.3s
- sigmoid (with amdlibm): 2.3s
- sigmoid (without amdlibm): 3.7s
Precision: sigmoid(with or without amdlibm) > ultra_fast_sigmoid > hard_sigmoid.
.. image:: sigmoid_prec.png
Example:
.. testcode::
import aesara.tensor as at
x, y, b = at.dvectors('x', 'y', 'b')
W = at.dmatrix('W')
y = at.sigmoid(at.dot(W, x) + b)
.. note:: The underlying code will return an exact 0 or 1 if an
element of x is too small or too big.
.. function:: ultra_fast_sigmoid(x)
Returns an approximate standard :func:`sigmoid` nonlinearity applied to ``x``.
:Parameters: ``x`` - symbolic Tensor (or compatible)
:Return type: same as ``x``
:Returns: approximated element-wise sigmoid: :math:`sigmoid(x) = \frac{1}{1 + \exp(-x)}`.
:note: To automatically change all :func:`sigmoid`\ :class:`Op`\s to this version, use
the Aesara rewrite `local_ultra_fast_sigmoid`. This can be done
with the Aesara flag ``optimizer_including=local_ultra_fast_sigmoid``.
This rewrite is done late, so it should not affect stabilization rewrites.
.. note:: The underlying code will return 0.00247262315663 as the
minimum value and 0.997527376843 as the maximum value. So it
never returns 0 or 1.
.. note:: Using `ultra_fast_sigmoid` directly in the graph will
disable the stabilization rewrites associated with it, but
using the rewrite to insert it won't disable the
stability rewrites.
.. function:: hard_sigmoid(x)
Returns an approximate standard :func:`sigmoid` nonlinearity applied to ``x``.
:Parameters: ``x`` - symbolic Tensor (or compatible)
:Return type: same as ``x``
:Returns: approximated element-wise sigmoid: :math:`sigmoid(x) = \frac{1}{1 + \exp(-x)}`.
:note: To automatically change all :func:`sigmoid`\ :class:`Op`\s to this version, use
the Aesara rewrite `local_hard_sigmoid`. This can be done
with the Aesara flag ``optimizer_including=local_hard_sigmoid``.
This rewrite is done late, so it should not affect
stabilization rewrites.
.. note:: The underlying code will return an exact 0 or 1 if an
element of ``x`` is too small or too big.
.. note:: Using `hard_sigmoid` directly in the graph will
disable the stabilization rewrites associated with it, but
using the rewrite to insert it won't disable the
stability rewrites.
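As a rough illustration of the speed/precision trade-off, a hard sigmoid can be sketched in plain NumPy as a clipped line. This is not Aesara's implementation; the slope ``0.2`` and offset ``0.5`` follow the classic Theano formulation and are an assumption here.

```python
import numpy as np

def sigmoid(x):
    # Exact element-wise logistic sigmoid.
    return 1.0 / (1.0 + np.exp(-x))

def hard_sigmoid(x, slope=0.2, offset=0.5):
    # Piecewise-linear approximation: a straight line clipped to [0, 1].
    # slope/offset are assumed values (classic Theano defaults).
    return np.clip(slope * x + offset, 0.0, 1.0)

x = np.linspace(-6.0, 6.0, 101)
err = np.max(np.abs(hard_sigmoid(x) - sigmoid(x)))
```

Note how the clipped line saturates to exact 0 and 1 for large inputs, matching the note above, while the approximation error stays below roughly 0.08.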
.. function:: softplus(x)
Returns the softplus nonlinearity applied to x
:Parameter: *x* - symbolic Tensor (or compatible)
:Return type: same as x
:Returns: element-wise softplus: :math:`softplus(x) = \log_e{\left(1 + \exp(x)\right)}`.
.. note:: The underlying code will return an exact 0 if an element of x is too small.
.. testcode::
x, y, b = at.dvectors('x', 'y', 'b')
W = at.dmatrix('W')
y = at.nnet.softplus(at.dot(W,x) + b)
.. function:: softsign(x)
Returns the elemwise softsign activation function
:math:`\varphi(\mathbf{x}) = \frac{x}{1+|x|}`
.. function:: softmax(x)
Returns the softmax function of x:
:Parameter: *x* symbolic **2D** Tensor (or compatible).
:Return type: same as x
:Returns: a symbolic 2D tensor whose ijth element is :math:`softmax_{ij}(x) = \frac{\exp(x_{ij})}{\sum_k \exp(x_{ik})}`.
The softmax function will, when applied to a matrix, compute the softmax values row-wise.
:note: this also supports Hessian-free optimization. The softmax
op is numerically stable because it uses this code:
.. code-block:: python
e_x = exp(x - x.max(axis=1, keepdims=True))
out = e_x / e_x.sum(axis=1, keepdims=True)
Example of use:
.. testcode::
x, y, b = at.dvectors('x', 'y', 'b')
W = at.dmatrix('W')
y = at.nnet.softmax(at.dot(W,x) + b)
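The max-subtraction trick shown above can be checked in plain NumPy; this is a sketch of the stabilization, not the Aesara Op itself:

```python
import numpy as np

def softmax(x):
    # Subtracting the row-wise max leaves the result unchanged
    # mathematically, but keeps exp() from overflowing on large inputs.
    e_x = np.exp(x - x.max(axis=1, keepdims=True))
    return e_x / e_x.sum(axis=1, keepdims=True)

x = np.array([[1.0, 2.0, 3.0],
              [1000.0, 1000.0, 1000.0]])  # naive exp(1000.0) would overflow
p = softmax(x)
```

Each row sums to one, and the second row, which would overflow without the trick, comes out as a uniform distribution.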
.. autofunction:: aesara.tensor.nnet.relu
.. autofunction:: aesara.tensor.nnet.elu
.. autofunction:: aesara.tensor.nnet.selu
.. function:: binary_crossentropy(output,target)
Computes the binary cross-entropy between a target and an output:
:Parameters:
* *target* - symbolic Tensor (or compatible)
* *output* - symbolic Tensor (or compatible)
:Return type: same as target
:Returns: a symbolic tensor, where the following is applied element-wise: :math:`crossentropy(t,o) = -(t \cdot \log(o) + (1 - t) \cdot \log(1 - o))`.
The following block implements a simple auto-associator with a
sigmoid nonlinearity and a reconstruction error which corresponds
to the binary cross-entropy (note that this assumes that x will
contain values between 0 and 1):
.. testcode::
x, y, b, c = at.dvectors('x', 'y', 'b', 'c')
W = at.dmatrix('W')
V = at.dmatrix('V')
h = at.sigmoid(at.dot(W, x) + b)
x_recons = at.sigmoid(at.dot(V, h) + c)
recon_cost = at.nnet.binary_crossentropy(x_recons, x).mean()
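The element-wise formula can also be verified numerically; the ``binary_crossentropy`` helper below is an illustrative NumPy sketch, not Aesara's Op:

```python
import numpy as np

def binary_crossentropy(output, target):
    # Element-wise: -(t * log(o) + (1 - t) * log(1 - o)),
    # valid for outputs strictly inside (0, 1).
    return -(target * np.log(output) + (1 - target) * np.log(1 - output))

t = np.array([1.0, 0.0, 1.0])   # targets in [0, 1]
o = np.array([0.9, 0.1, 0.5])   # predicted probabilities
loss = binary_crossentropy(o, t)
```

Confident predictions (0.9 for a 1, 0.1 for a 0) incur a small loss, while the uncertain 0.5 costs :math:`\log 2`.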
.. function:: sigmoid_binary_crossentropy(output,target)
Computes the binary cross-entropy between a target and the sigmoid of an output:
:Parameters:
* *target* - symbolic Tensor (or compatible)
* *output* - symbolic Tensor (or compatible)
:Return type: same as target
:Returns: a symbolic tensor, where the following is applied element-wise: :math:`crossentropy(o,t) = -(t \cdot \log(sigmoid(o)) + (1 - t) \cdot \log(1 - sigmoid(o)))`.
It is equivalent to `binary_crossentropy(sigmoid(output), target)`,
but with more efficient and numerically stable computation, especially when
taking gradients.
The following block implements a simple auto-associator with a
sigmoid nonlinearity and a reconstruction error which corresponds
to the binary cross-entropy (note that this assumes that x will
contain values between 0 and 1):
.. testcode::
x, y, b, c = at.dvectors('x', 'y', 'b', 'c')
W = at.dmatrix('W')
V = at.dmatrix('V')
h = at.sigmoid(at.dot(W, x) + b)
x_precons = at.dot(V, h) + c
# final reconstructions are given by sigmoid(x_precons), but we leave
# them unnormalized as sigmoid_binary_crossentropy applies sigmoid
recon_cost = at.sigmoid_binary_crossentropy(x_precons, x).mean()
.. function:: categorical_crossentropy(coding_dist,true_dist)
Return the cross-entropy between an approximating distribution and a true distribution.
The cross entropy between two probability distributions measures the average number of bits
needed to identify an event from a set of possibilities, if a coding scheme is used based
on a given probability distribution q, rather than the "true" distribution p. Mathematically, this
function computes :math:`H(p,q) = - \sum_x p(x) \log(q(x))`, where
p=true_dist and q=coding_dist.
:Parameters:
* *coding_dist* - symbolic 2D Tensor (or compatible). Each row
represents a distribution.
* *true_dist* - symbolic 2D Tensor **OR** symbolic vector of ints. In
the case of an integer vector argument, each element represents the
position of the '1' in a 1-of-N encoding (aka "one-hot" encoding)
:Return type: tensor of rank one-less-than `coding_dist`
.. note:: An application of the scenario where *true_dist* has a
1-of-N representation is in classification with softmax
outputs. If `coding_dist` is the output of the softmax and
`true_dist` is a vector of correct labels, then the function
will compute ``y_i = - \log(coding_dist[i, one_of_n[i]])``,
which corresponds to computing the neg-log-probability of the
correct class (which is typically the training criterion in
classification settings).
.. testsetup::
import aesara
o = aesara.tensor.ivector()
.. testcode::
y = at.nnet.softmax(at.dot(W, x) + b)
cost = at.nnet.categorical_crossentropy(y, o)
# o is either the above-mentioned 1-of-N vector or 2D tensor
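For the integer-label case, the computation reduces to fancy indexing: pick out :math:`-\log q(\text{correct class})` per row. A NumPy sketch (an illustrative helper, not Aesara's Op):

```python
import numpy as np

def categorical_crossentropy(coding_dist, true_idx):
    # coding_dist: one probability distribution per row.
    # true_idx: integer class label per row (1-of-N encoding).
    rows = np.arange(coding_dist.shape[0])
    return -np.log(coding_dist[rows, true_idx])

q = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])   # e.g. softmax outputs
labels = np.array([0, 1])         # correct class per row
cost = categorical_crossentropy(q, labels)
```

The result has rank one less than ``coding_dist``, matching the return type stated above.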
.. autofunction:: aesara.tensor.nnet.h_softmax
.. _libdoc_tensor_nnet_batchnorm:
=======================================
:mod:`batchnorm` -- Batch Normalization
=======================================
.. module:: tensor.nnet.batchnorm
:platform: Unix, Windows
:synopsis: Batch Normalization
.. moduleauthor:: LISA
.. autofunction:: aesara.tensor.nnet.batchnorm.batch_normalization_train
.. autofunction:: aesara.tensor.nnet.batchnorm.batch_normalization_test
.. seealso:: cuDNN batch normalization: :class:`aesara.gpuarray.dnn.dnn_batch_normalization_train`, :class:`aesara.gpuarray.dnn.dnn_batch_normalization_test`.
.. autofunction:: aesara.tensor.nnet.batchnorm.batch_normalization
.. _libdoc_blocksparse:
===============================================================================
:mod:`blocksparse` -- Block sparse dot operations (gemv and outer)
===============================================================================
.. module:: tensor.nnet.blocksparse
:platform: Unix, Windows
:synopsis: Block sparse dot
.. moduleauthor:: LISA
.. automodule:: aesara.tensor.nnet.blocksparse
:members:
.. _libdoc_tensor_nnet_conv:
==========================================================
:mod:`conv` -- Ops for convolutional neural nets
==========================================================
.. note::
Two similar implementations exist for conv2d:
:func:`signal.conv2d <aesara.tensor.signal.conv.conv2d>` and
:func:`nnet.conv2d <aesara.tensor.nnet.conv2d>`.
The former implements a traditional
2D convolution, while the latter implements the convolutional layers
present in convolutional neural networks (where filters are 3D and pool
over several input channels).
.. module:: conv
:platform: Unix, Windows
:synopsis: ops for signal processing
.. moduleauthor:: LISA
The recommended user interfaces are:
- :func:`aesara.tensor.nnet.conv2d` for 2d convolution
- :func:`aesara.tensor.nnet.conv3d` for 3d convolution
With these interfaces, Aesara will automatically use the fastest
implementation in many cases. On the CPU, the implementation is
GEMM-based.
This auto-tuning has the inconvenience that the first call is much
slower, as it tries and times each available implementation. So if you
benchmark, it is important that you exclude the first call from your
timing.
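A minimal timing harness that discards the warm-up call might look like this; it is a generic sketch, not Aesara tooling:

```python
import time

def bench(fn, n_runs=10):
    # Warm-up call: absorbs the one-time auto-tuning / compilation cost,
    # so it is deliberately excluded from the timing below.
    fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    return (time.perf_counter() - start) / n_runs
```

Pass the compiled function (e.g. an ``aesara.function`` result called on fixed inputs) wrapped in a zero-argument callable.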
Implementation Details
======================
This section gives more implementation detail. Most of the time you do
not need to read it. Aesara will select it for you.
- Implemented operators for neural network 2D / image convolution:
- :func:`nnet.conv.conv2d <aesara.tensor.nnet.conv.conv2d>`.
old 2d convolution. DO NOT USE ANYMORE.
For each element in a batch, it first creates a
`Toeplitz <http://en.wikipedia.org/wiki/Toeplitz_matrix>`_ matrix in a CUDA kernel.
Then, it performs a ``gemm`` call to multiply this Toeplitz matrix and the filters
(hence the name: MM is for matrix multiplication).
It needs extra memory for the Toeplitz matrix, which is a 2D matrix of shape
``(no of channels * filter width * filter height, output width * output height)``.
- :func:`CorrMM <aesara.tensor.nnet.corr.CorrMM>`
This is a CPU-only 2d correlation implementation taken from
`caffe's cpp implementation <https://github.com/BVLC/caffe/blob/master/src/caffe/layers/conv_layer.cpp>`_.
It does not flip the kernel.
- Implemented operators for neural network 3D / video convolution:
- :func:`Corr3dMM <aesara.tensor.nnet.corr3d.Corr3dMM>`
This is a CPU-only 3d correlation implementation based on
the 2d version (:func:`CorrMM <aesara.tensor.nnet.corr.CorrMM>`).
It does not flip the kernel. As it provides a gradient, you can use it as a
replacement for nnet.conv3d. For convolutions done on CPU,
nnet.conv3d will be replaced by Corr3dMM.
- :func:`conv3d2d <aesara.tensor.nnet.conv3d2d.conv3d>`
Another conv3d implementation that uses the conv2d with data reshaping.
It is faster in some corner cases than conv3d. It flips the kernel.
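The Toeplitz construction described above (often called im2col) can be sketched in plain NumPy for a single-channel image; ``im2col`` here is an illustrative helper, not part of Aesara:

```python
import numpy as np

def im2col(img, fh, fw):
    # Unroll every (fh, fw) patch of a 2D image into one row, producing
    # the Toeplitz-like matrix that turns convolution into a single GEMM.
    H, W = img.shape
    oh, ow = H - fh + 1, W - fw + 1
    cols = np.empty((oh * ow, fh * fw))
    for i in range(oh):
        for j in range(ow):
            cols[i * ow + j] = img[i:i + fh, j:j + fw].ravel()
    return cols

img = np.arange(16.0).reshape(4, 4)
filt = np.ones((2, 2))
cols = im2col(img, 2, 2)                    # shape (9, 4)
out = (cols @ filt.ravel()).reshape(3, 3)   # "valid" cross-correlation as one GEMM
```

The extra memory for ``cols`` is exactly the cost noted above: ``(filter width * filter height, output width * output height)`` entries per channel.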
.. autofunction:: aesara.tensor.nnet.conv2d
.. autofunction:: aesara.tensor.nnet.conv2d_transpose
.. autofunction:: aesara.tensor.nnet.conv3d
.. autofunction:: aesara.tensor.nnet.conv3d2d.conv3d
.. autofunction:: aesara.tensor.nnet.conv.conv2d
.. automodule:: aesara.tensor.nnet.abstract_conv
:members:
.. _libdoc_tensor_nnet_ctc:
==================================================================================
:mod:`aesara.tensor.nnet.ctc` -- Connectionist Temporal Classification (CTC) loss
==================================================================================
.. note::
Usage of the connectionist temporal classification (CTC) loss Op requires that
the `warp-ctc <https://github.com/baidu-research/warp-ctc>`_ library is
available. If the warp-ctc library is not in your compiler's library path,
the ``config.ctc__root`` configuration option must be set to the
directory containing the warp-ctc library files.
.. note::
This interface is the preferred interface.
.. note::
Unfortunately, Windows platforms are not yet supported by the underlying
library.
.. module:: aesara.tensor.nnet.ctc
:platform: Unix
:synopsis: Connectionist temporal classification (CTC) loss Op, using the warp-ctc library
.. moduleauthor:: `João Victor Risso <https://github.com/joaovictortr>`_
.. autofunction:: aesara.tensor.nnet.ctc.ctc
.. autoclass:: aesara.tensor.nnet.ctc.ConnectionistTemporalClassification
.. _libdoc_tensor_nnet:
==================================================
:mod:`nnet` -- Ops related to neural networks
==================================================
.. module:: aesara.tensor.nnet
:platform: Unix, Windows
:synopsis: various ops relating to neural networks
.. moduleauthor:: LISA
Aesara was originally developed for machine learning applications, particularly
for the topic of deep learning. As such, our lab has developed many functions
and ops which are particular to neural networks and deep learning.
.. toctree::
:maxdepth: 1
conv
basic
neighbours
batchnorm
blocksparse
ctc
.. _libdoc_tensor_nnet_neighbours:
=======================================================================
:mod:`neighbours` -- Ops for working with images in convolutional nets
=======================================================================
.. module:: aesara.tensor.nnet.neighbours
:platform: Unix, Windows
:synopsis: Ops for working with images in conv nets
.. moduleauthor:: LISA
Functions
=========
.. autofunction:: aesara.tensor.nnet.neighbours.images2neibs
.. autofunction:: aesara.tensor.nnet.neighbours.neibs2images
See also
========
- :ref:`indexing`
- :ref:`lib_scan`
.. _libdoc_tensor_signal_conv:
======================================================
:mod:`conv` -- Convolution
======================================================
.. note::
Two similar implementations exist for conv2d:
:func:`signal.conv2d <aesara.tensor.signal.conv.conv2d>` and
:func:`nnet.conv2d <aesara.tensor.nnet.conv.conv2d>`.
The former implements a traditional
2D convolution, while the latter implements the convolutional layers
present in convolutional neural networks (where filters are 3D and pool
over several input channels).
.. module:: aesara.tensor.signal.conv
:platform: Unix, Windows
:synopsis: ops for performing convolutions
.. moduleauthor:: LISA
.. autofunction:: aesara.tensor.signal.conv.conv2d
.. function:: fft(*todo)
[James has some code for this, but hasn't gotten it into the source tree yet.]
.. _libdoc_tensor_signal_downsample:
======================================================
:mod:`downsample` -- Down-Sampling
======================================================
.. module:: downsample
:platform: Unix, Windows
:synopsis: ops for performing various forms of downsampling
.. moduleauthor:: LISA
.. note::
This module is deprecated. Use the functions in :mod:`aesara.tensor.signal.pool` instead.
.. _libdoc_tensor_signal:
=====================================================
:mod:`signal` -- Signal Processing
=====================================================
Signal Processing
-----------------
.. module:: signal
:platform: Unix, Windows
:synopsis: various ops for performing basic signal processing
(convolutions, subsampling, fft, etc.)
.. moduleauthor:: LISA
The signal subpackage contains ops which are useful for performing various
forms of signal processing.
.. toctree::
:maxdepth: 1
conv
pool
downsample
.. _libdoc_tensor_signal_pool:
======================================================
:mod:`pool` -- Down-Sampling
======================================================
.. module:: pool
:platform: Unix, Windows
:synopsis: ops for performing various forms of downsampling
.. moduleauthor:: LISA
.. seealso:: :func:`aesara.tensor.nnet.neighbours.images2neibs`
.. autofunction:: aesara.tensor.signal.pool.pool_2d
.. autofunction:: aesara.tensor.signal.pool.max_pool_2d_same_size
.. autofunction:: aesara.tensor.signal.pool.pool_3d
@@ -127,9 +127,6 @@ Could lower the memory usage, but raise computation time:
- :attr:`config.scan__allow_gc` = True
- :attr:`config.scan__allow_output_prealloc` = False
- Use :func:`batch_normalization()
  <aesara.tensor.nnet.batchnorm.batch_normalization>`. It uses less memory
  than building a corresponding Aesara graph.
- Disable one or more scan rewrites:
- ``optimizer_excluding=scan_pushout_seqs_ops``
- ``optimizer_excluding=scan_pushout_dot1``
......
Diff collapsed.
@@ -41,7 +41,6 @@ Advanced
.. toctree::
sparse
conv_arithmetic
Advanced configuration and debugging
------------------------------------
......