Commit f27c11e2, authored by Francesco Visin

Final conv arithmetic tutorial review

Parent d75cf06c
@@ -80,9 +80,8 @@ Here is an example of a discrete convolution:
 .. figure:: conv_arithmetic_figures/numerical_no_padding_no_strides.gif
     :figclass: align-center
-The light blue grid is called the *input feature map*. (An example of this is
-what was referred to earlier as *channels* for images and sound clips.) A
-*kernel* (shaded area) of value
+The light blue grid is called the *input feature map*. A *kernel* (shaded area)
+of value
 .. math::
@@ -94,11 +93,14 @@ what was referred to earlier as *channels* for images and sound clips.) A
 slides across the input feature map. At each location, the product between each
 element of the kernel and the input element it overlaps is computed and the
-results are summed up to obtain the output in the current location. The
-procedure can be repeated using different kernels to form as many output feature
-maps as desired. The final outputs of this procedure are called *output feature
-maps*. To keep the drawing simple, a single input feature map is represented,
-but it is not uncommon to have multiple feature maps stacked one onto another.
+results are summed up to obtain the output in the current location. The final
+output of this procedure is a matrix called *output feature map* (in green).
+
+This procedure can be repeated using different kernels to form as many output
+feature maps (a.k.a. *output channels*) as desired. Note also that to keep the
+drawing simple a single input feature map is being represented, but it is not
+uncommon to have multiple feature maps stacked one onto another (an example of
+this is what was referred to earlier as *channels* for images and sound clips).
 .. note::
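The sliding product-and-sum procedure described in the hunk above can be sketched in NumPy as follows. This is a minimal illustration, not the tutorial's or Theano's implementation: the helper name `conv2d_single` and the input and kernel values are assumptions made here, and, following the text, the kernel is applied as-is (no flipping), with no padding and unit strides.

```python
import numpy as np


def conv2d_single(feature_map, kernel):
    """Hypothetical helper: slide `kernel` over `feature_map`.

    No padding, unit strides. At each location the kernel is multiplied
    elementwise with the input patch it overlaps and the products are summed.
    """
    i1, i2 = feature_map.shape
    k1, k2 = kernel.shape
    out = np.zeros((i1 - k1 + 1, i2 - k2 + 1), dtype=feature_map.dtype)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            window = feature_map[r:r + k1, c:c + k2]
            out[r, c] = np.sum(window * kernel)
    return out


# Illustrative values only (not the ones drawn in the tutorial's figure).
feature_map = np.arange(16).reshape(4, 4)
kernel = np.ones((3, 3), dtype=feature_map.dtype)
output_feature_map = conv2d_single(feature_map, kernel)
print(output_feature_map)
```

Running the helper again with a different kernel would produce a second output feature map, which is the "repeat with different kernels" step the paragraph mentions.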
@@ -109,15 +111,18 @@ but it is not uncommon to have multiple feature maps stacked one onto another.
     used in this tutorial.
 If there are multiple input and output feature maps, the collection of kernels
-form a 4D array (``num_kernels, num_input_channels, filter_rows,
+form a 4D array (``output_channels, input_channels, filter_rows,
 filter_columns``). For each output channel, each input channel is convolved with
-a distinct kernel and the resulting set of feature maps is summed elementwise
-to produce the corresponding output feature map.
+a distinct part of the kernel and the resulting set of feature maps is summed
+elementwise to produce the corresponding output feature map. The result of this
+procedure is a set of output feature maps, one for each output channel, that is
+the output of the convolution.
-The convolution depicted above is an instance of a 2-D convolution, but it can
-be generalized to N-D convolutions. For instance, in a 3-D convolution, the
-kernel would be a *cuboid* and would slide across the height, width and depth
-of the input feature map.
+The convolution depicted above is an instance of a 2-D convolution, but can be
+generalized to N-D convolutions. For instance, in a 3-D convolution, the kernel
+would be a *cuboid* and would slide across the height, width and depth of the
+input feature map.
 The collection of kernels defining a discrete convolution has a shape
 corresponding to some permutation of :math:`(n, m, k_1, \ldots, k_N)`, where
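The multi-channel case described in the hunk above (a 4D kernel array of shape ``(output_channels, input_channels, filter_rows, filter_columns)``, with per-input-channel results summed elementwise) might look like this in NumPy. The function name `conv2d_multi`, the array sizes and the all-ones values are assumptions chosen here for illustration; padding and strides are left out.

```python
import numpy as np


def conv2d_multi(inputs, kernels):
    """Hypothetical helper for the multi-channel case.

    inputs:  (input_channels, rows, columns)
    kernels: (output_channels, input_channels, filter_rows, filter_columns)
    Returns an array of shape (output_channels, out_rows, out_columns).
    """
    n_out, n_in, k1, k2 = kernels.shape
    in_channels, i1, i2 = inputs.shape
    assert in_channels == n_in, "kernel and input channel counts must match"
    out = np.zeros((n_out, i1 - k1 + 1, i2 - k2 + 1))
    for o in range(n_out):            # one output feature map per output channel
        for c in range(n_in):         # each input channel has its own kernel slice
            for r in range(out.shape[1]):
                for q in range(out.shape[2]):
                    window = inputs[c, r:r + k1, q:q + k2]
                    # Accumulating here is the elementwise sum over input channels.
                    out[o, r, q] += np.sum(window * kernels[o, c])
    return out


# Illustrative sizes: 2 input channels, 3 output channels, 3x3 filters.
inputs = np.ones((2, 4, 4))
kernels = np.ones((3, 2, 3, 3))
outputs = conv2d_multi(inputs, kernels)
print(outputs.shape)
```

With all-ones inputs and kernels, every output entry is the number of input channels times the number of kernel elements, which makes the summation over input channels easy to check by hand.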
@@ -256,7 +261,7 @@ relationship:
         input, filters, input_shape=(b, c2, i1, i2), filter_shape=(c1, c2, k1, k2),
         border_mode=(p1, p2), subsample=(1, 1))
     # output.shape[2] == (i1 - k1) + 2 * p1 + 1
-    # output.shape[3] == (i2 - k2) + 2 * p1 + 1
+    # output.shape[3] == (i2 - k2) + 2 * p2 + 1
 Here is an example for :math:`i = 5`, :math:`k = 4` and :math:`p = 2`:
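The shape comment corrected in the hunk above, ``(i - k) + 2 * p + 1``, can be checked numerically without Theano. The sketch below uses the values quoted in the hunk (``i = 5``, ``k = 4``, ``p = 2``); the function name is an assumption made here, and the cross-check simply counts kernel positions on a zero-padded NumPy array.

```python
import numpy as np


def padded_unit_stride_output_size(i, k, p):
    """Output size along one axis for unit strides and zero padding p."""
    return (i - k) + 2 * p + 1


i, k, p = 5, 4, 2
o = padded_unit_stride_output_size(i, k, p)

# Cross-check: pad an i x i input with p zeros on every side and count how
# many positions a k-wide kernel can occupy along one axis.
padded = np.pad(np.zeros((i, i)), p, mode="constant")
positions = padded.shape[0] - k + 1

print(o, positions)
```

Because each axis is padded independently, the second axis must use ``p2`` rather than ``p1``, which is the fix this hunk makes.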
@@ -646,6 +651,8 @@ It is indeed the case, as shown in here for :math:`i = 5`, :math:`k = 4` and
 Formally, the following relationship applies for zero padded convolutions:
+.. _Relationship8:
 .. admonition:: Relationship 8
     A convolution described by :math:`s = 1`, :math:`k` and :math:`p` has an
@@ -773,6 +780,8 @@ For the moment, it will be assumed that the convolution is non-padded (:math:`p
 = 0`) and that its input size :math:`i` is such that :math:`i - k` is a multiple
 of :math:`s`. In that case, the following relationship holds:
+.. _Relationship11:
 .. admonition:: Relationship 11
     A convolution described by :math:`p = 0`, :math:`k` and :math:`s` and whose
@@ -801,7 +810,8 @@ Zero padding, non-unit strides, transposed
 When the convolution's input size :math:`i` is such that :math:`i + 2p - k` is a
 multiple of :math:`s`, the analysis can extended to the zero padded case by
-combining Relationship 8 and Relationship 11:
+combining :ref:`Relationship 8 <Relationship8>` and
+:ref:`Relationship 11 <Relationship11>`:
 .. admonition:: Relationship 12
@@ -859,7 +869,7 @@ between the :math:`s` different cases that all lead to the same :math:`i'`:
     o_prime2 = s2 * (output.shape[3] - 1) + a2 + k2 - 2 * p2
     input = theano.tensor.nnet.conv2d_grad_wrt_inputs(
         output, filters, input_shape=(b, c1, o_prime1, o_prime2),
-        filter_shape=(c1, c2, k, k), border_mode=(p1, p2),
+        filter_shape=(c1, c2, k1, k2), border_mode=(p1, p2),
         subsample=(s1, s2))
 Here is an example for :math:`i = 6`, :math:`k = 3`, :math:`s = 2` and :math:`p
......
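The ``o_prime`` expression in the last hunk, ``s * (o - 1) + a + k - 2 * p``, can be sanity-checked in plain Python. The sketch below assumes that ``a`` is the remainder ``(i + 2p - k) mod s``, which is not shown in this diff, and that the forward convolution output size is ``floor((i + 2p - k) / s) + 1``. The function names and the value ``p = 1`` are picked here for illustration only.

```python
def conv_output_size(i, k, s, p):
    """Forward convolution output size along one axis (assumed formula)."""
    return (i + 2 * p - k) // s + 1


def transposed_output_size(o, k, s, p, a):
    """Size recovered by the transposed convolution, as in the hunk's o_prime."""
    return s * (o - 1) + a + k - 2 * p


# Illustrative values; p = 1 is an assumption, not taken from the diff.
i, k, s, p = 6, 3, 2, 1
o = conv_output_size(i, k, s, p)
a = (i + 2 * p - k) % s  # assumed definition of `a`
recovered = transposed_output_size(o, k, s, p, a)
print(o, a, recovered)

# Round trip over a range of input sizes: the original size comes back.
round_trip_ok = all(
    transposed_output_size(
        conv_output_size(n, k, s, p), k, s, p, (n + 2 * p - k) % s
    ) == n
    for n in range(k, 20)
)
```

The round trip only works when each axis uses its own kernel size, which is why the hunk replaces ``(c1, c2, k, k)`` with ``(c1, c2, k1, k2)`` in ``filter_shape``.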