Commit 8e1612a7 authored by AdeB

Add doc to h_softmax

Parent 02482e46
@@ -20,3 +20,4 @@ and ops which are particular to neural networks and deep learning.
     nnet
     neighbours
     bn
+    blocksparse
@@ -21,6 +21,7 @@
 - :func:`relu() <theano.tensor.nnet.relu>`
 - :func:`binary_crossentropy`
 - :func:`.categorical_crossentropy`
+- :func:`h_softmax() <theano.tensor.nnet.h_softmax>`

 .. function:: sigmoid(x)
@@ -204,3 +205,6 @@
     y = T.nnet.softmax(T.dot(W, x) + b)
     cost = T.nnet.categorical_crossentropy(y, o)
     # o is either the above-mentioned 1-of-N vector or 2D tensor
+
+.. autofunction:: theano.tensor.nnet.h_softmax
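The snippet in the hunk above pairs a softmax output layer with `categorical_crossentropy`, where the target `o` may be either a vector of class indices or a 2D tensor of target distributions. As a hedged illustration (plain NumPy, not Theano's actual implementation; the function names merely mirror the documented ones), the computation amounts to:

```python
import numpy as np

# NumPy sketch of softmax + categorical cross-entropy as documented above.
# The target o is either a 1D vector of integer class indices or a 2D
# tensor of per-row target distributions (e.g. one-hot rows).
def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # shift for stability
    return e / e.sum(axis=1, keepdims=True)

def categorical_crossentropy(y, o):
    if o.ndim == 1:                       # vector of class indices
        return -np.log(y[np.arange(y.shape[0]), o])
    return -(o * np.log(y)).sum(axis=1)   # 2D tensor of distributions

y = softmax(np.array([[2.0, 1.0, 0.1],
                      [0.5, 2.5, 0.0]]))
idx_cost = categorical_crossentropy(y, np.array([0, 1]))
onehot_cost = categorical_crossentropy(y, np.eye(3)[[0, 1]])
```

Both target encodings give the same per-example cost, which is why the comment in the diff says `o` can take either form.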
@@ -46,7 +46,7 @@ from theano.sandbox.cuda.blas import (
     GpuDownsampleFactorMax, GpuDownsampleFactorMaxGrad,
     GpuDownsampleFactorMaxGradGrad)
-from theano.sandbox.blocksparse import SparseBlockGemv, SparseBlockOuter
+from theano.tensor.nnet.blocksparse import SparseBlockGemv, SparseBlockOuter
 from theano.sandbox.cuda.blocksparse import (
     GpuSparseBlockGemv,
     GpuSparseBlockOuter,
...
@@ -22,7 +22,7 @@ else:
     class BlockSparse_Gemv_and_Outer(
-            theano.sandbox.tests.test_blocksparse.BlockSparse_Gemv_and_Outer):
+            theano.tensor.nnet.tests.test_blocksparse.BlockSparse_Gemv_and_Outer):
         def setUp(self):
             utt.seed_rng()
             self.mode = mode_with_gpu.excluding('constant_folding')
...
@@ -2068,7 +2068,9 @@ def h_softmax(x, batch_size, n_outputs, n_classes, n_outputs_per_class,
     The outputs are grouped in the same order as they are initially defined.

-    Arguments:
+    .. versionadded:: 0.7.1
+
+    Parameters
     ----------
     x: tensor of shape (batch_size, number of features)
         the minibatch input of the two-layer hierarchical softmax.
@@ -2087,19 +2089,18 @@ def h_softmax(x, batch_size, n_outputs, n_classes, n_outputs_per_class,
     n_outputs_per_class: int
         the number of outputs per class. See note at the end.
-    W1: tensor of shape (number of features of the input x, number of classes)
+    W1: tensor of shape (number of features of the input x, n_classes)
         the weight matrix of the first softmax, which maps the input x to the
         probabilities of the classes.
-    b1: tensor of shape (number of classes,)
+    b1: tensor of shape (n_classes,)
         the bias vector of the first softmax layer.
-    W2: tensor of shape (number of classes, number of features of the input x,
-        number of outputs per class)
+    W2: tensor of shape (n_classes, number of features of the input x, n_outputs_per_class)
         the weight matrix of the second softmax, which maps the input x to
         the probabilities of the outputs.
-    b2: tensor of shape (number of classes, number of outputs per class)
+    b2: tensor of shape (n_classes, n_outputs_per_class)
         the bias vector of the second softmax layer.
     target: tensor of shape either (batch_size,) or (batch_size, 1)
@@ -2109,7 +2110,16 @@ def h_softmax(x, batch_size, n_outputs, n_classes, n_outputs_per_class,
         corresponding target. If target is None, then all the outputs are
         computed for each input.

-    Notes
+    Returns
+    -------
+    tensor of shape (batch_size, n_outputs) or (batch_size, 1)
+        Output of the two-layer hierarchical softmax for input x. If target
+        is not specified (None), then all the outputs are computed and the
+        returned tensor has shape (batch_size, n_outputs). Otherwise, when
+        target is specified, only the corresponding outputs are computed and
+        the returned tensor has shape (batch_size, 1).
+
+    Notes
     -----
     The product of n_outputs_per_class and n_classes has to be greater or equal
     to n_outputs. If it is strictly greater, then the irrelevant outputs will
...
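The docstring being edited above fully determines the shapes involved, so the two-layer computation can be sketched in plain NumPy. This is an illustrative reimplementation, not Theano's code: `h_softmax_np` and `softmax` are hypothetical helper names, and the parameter names (`W1`, `b1`, `W2`, `b2`, `target`) simply follow the documented ones.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def h_softmax_np(x, W1, b1, W2, b2, target=None):
    # First softmax: class probabilities, shape (batch_size, n_classes).
    class_probs = softmax(x @ W1 + b1)
    # Second softmax: per-class output probabilities,
    # shape (batch_size, n_classes, n_outputs_per_class).
    output_probs = softmax(np.einsum('bf,cfo->bco', x, W2) + b2)
    # Joint probability of (class, output-within-class), flattened so the
    # outputs are grouped in the order the classes are defined.
    probs = (class_probs[:, :, None] * output_probs).reshape(x.shape[0], -1)
    if target is None:
        return probs                                   # (batch_size, n_outputs)
    target = np.asarray(target).ravel()
    return probs[np.arange(x.shape[0]), target][:, None]  # (batch_size, 1)

rng = np.random.default_rng(0)
batch, feats, n_classes, n_per_class = 4, 5, 3, 2
x = rng.normal(size=(batch, feats))
W1 = rng.normal(size=(feats, n_classes));            b1 = np.zeros(n_classes)
W2 = rng.normal(size=(n_classes, feats, n_per_class)); b2 = np.zeros((n_classes, n_per_class))
all_probs = h_softmax_np(x, W1, b1, W2, b2)                     # (4, 6)
one_prob = h_softmax_np(x, W1, b1, W2, b2, target=[0, 1, 2, 3]) # (4, 1)
```

Here `n_outputs_per_class * n_classes` equals `n_outputs` exactly; as the Notes section says, it only has to be greater or equal, in which case the trailing irrelevant outputs would be ignored.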