Commit 993bd3cd authored by affanv14

add tutorial doc for separable convolutions

Parent 99cb1b4e
@@ -957,6 +957,7 @@ the final output. A few examples of works using grouped convolutions are `Krizhe
A special case of grouped convolutions is when :math:`n` equals the number of input
channels. This is called depth-wise (or channel-wise) convolution.
Depth-wise convolutions also form part of separable convolutions.
An example using grouped convolutions would be:
@@ -977,6 +978,49 @@ An example using grouped convolutions would be:
"Aggregated Residual Transformations for Deep Neural Networks".
arXiv preprint arXiv:1611.05431 (2016).
Separable Convolutions
----------------------
Separable convolutions consist of two consecutive convolution operations.
The first is a depth-wise convolution, which performs a convolution separately for
each channel of the input. The output of this operation is then given as input
to a point-wise convolution, which mixes the channels to give the final output.
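The two stages can be sketched in plain NumPy. This is an illustrative sketch, not the Theano API: it assumes 'valid' padding, unit stride, no dilation, and uses cross-correlation as most frameworks do; all names below are hypothetical.

```python
import numpy as np

def separable_conv2d_sketch(x, dw, pw):
    # x:  (c1, i1, i2) one input image with c1 channels
    # dw: (c2, k1, k2) depth-wise filters; c2 must be a multiple of c1,
    #     i.e. c2 // c1 filters per input channel
    # pw: (c3, c2) point-wise (1x1) filter weights
    c1, i1, i2 = x.shape
    c2, k1, k2 = dw.shape
    mult = c2 // c1                       # filters per input channel
    o1, o2 = i1 - k1 + 1, i2 - k2 + 1     # 'valid' output size
    inter = np.zeros((c2, o1, o2))
    # depth-wise stage: each input channel is convolved separately
    # with its own kernels
    for f in range(c2):
        ch = f // mult                    # the input channel this filter reads
        for y in range(o1):
            for z in range(o2):
                inter[f, y, z] = np.sum(x[ch, y:y+k1, z:z+k2] * dw[f])
    # point-wise stage: 1x1 convolution mixing the c2 intermediate channels
    return np.einsum('fc,cyz->fyz', pw, inter)

out = separable_conv2d_sketch(np.random.rand(3, 8, 8),
                              np.random.rand(6, 3, 3),
                              np.random.rand(4, 6))
print(out.shape)  # (4, 6, 6)
```

Note that the depth-wise stage never mixes channels; all cross-channel interaction happens in the cheap 1x1 point-wise stage, which is where the parameter savings of separable convolutions come from.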
As we can see from this diagram, modified from `Vanhoucke(2014)`_ [#]_, the depth-wise
convolution is performed with :math:`c2` single-channel depth-wise filters: each of
the :math:`n` input channels is convolved separately with its own kernels,
contributing :math:`c2 / n` channels to the intermediate output, where :math:`n` is
the number of input channels, for a total of :math:`c2` intermediate channels.
The point-wise convolution then applies :math:`c3` 1x1 filters that mix the channels
of the intermediate output to give the final output.
.. image:: conv_arithmetic_figures/sep2D.jpg
   :align: center
A separable convolution is used as follows:
.. code-block:: python

    output = theano.tensor.nnet.separable_conv2d(
        input, depthwise_filters, pointwise_filters, num_channels=c1,
        input_shape=(b, c1, i1, i2), depthwise_filter_shape=(c2, 1, k1, k2),
        pointwise_filter_shape=(c3, c2, 1, 1), border_mode=(p1, p2),
        subsample=(s1, s2), filter_dilation=(d1, d2))

    # output.shape[0] == b
    # output.shape[1] == c3
    # output.shape[2] == (i1 + 2 * p1 - k1 - (k1 - 1) * (d1 - 1)) // s1 + 1
    # output.shape[3] == (i2 + 2 * p2 - k2 - (k2 - 1) * (d2 - 1)) // s2 + 1
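The spatial output size above follows the usual convolution arithmetic: a kernel of size :math:`k` dilated by :math:`d` covers :math:`k + (k - 1)(d - 1)` input positions. A quick sanity check of the formula, as a small helper with hypothetical example values:

```python
def conv_out_size(i, k, p, s, d):
    # A dilated kernel of size k covers k + (k - 1) * (d - 1) input positions,
    # so the output size is floor((i + 2p - effective_k) / s) + 1.
    return (i + 2 * p - k - (k - 1) * (d - 1)) // s + 1

# e.g. a 32x32 input, 3x3 kernel, padding 1, stride 2, no dilation
print(conv_out_size(32, 3, 1, 2, 1))  # 16
```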
.. _Vanhoucke(2014):
   http://scholar.google.co.in/scholar_url?url=http://vincent.vanhoucke.com/
   publications/vanhoucke-iclr14.pdf&hl=en&sa=X&scisig=AAGBfm0x0bgnudAqSVgZ
   ALfu8uPjYOIWwQ&nossl=1&oi=scholarr&ved=0ahUKEwjLreLjr_DVAhULwI8KHWmHAM8QgAMIJigAMAA
.. [#] Vincent Vanhoucke. "Learning Visual Representations at Scale",
   International Conference on Learning Representations (2014).
Quick reference
===============