update tensordot docstring and add to docs

3311764c · Jeremiah Lowin · 70b958b1
--- a/doc/library/tensor/basic.txt
+++ b/doc/library/tensor/basic.txt
@@ -1202,19 +1202,94 @@ Linear Algebra

    :return: vector-vector outer product

-.. function:: tensordot(X, Y, axes=2)
-
-    This is a symbolic standing for ``numpy.tensordot``.
-
-    :param X: left term
-    :param Y: right term
-    :param axes: sum out these axes from X and Y.
-    :type X: symbolic tensor
-    :type Y: symbolic tensor
+.. function:: tensordot(a, b, axes=2)
+
+    Given two tensors a and b, tensordot computes a generalized dot product over
+    the provided axes. Theano's implementation reduces all expressions to
+    matrix or vector dot products and is based on code from Tijmen Tieleman's
+    gnumpy (http://www.cs.toronto.edu/~tijmen/gnumpy.html).
+
+    :param a: the first tensor variable
+    :type a: symbolic tensor
+
+    :param b: the second tensor variable
+    :type b: symbolic tensor
+
+    :param axes: an integer or array. If an integer, the number of axes
+                 to sum over. If an array, it must have two array
+                 elements containing the axes to sum over in each tensor.
+
+                 Note that the default value of 2 is not guaranteed to work
+                 for all values of a and b, and an error will be raised if
+                 that is the case. The reason for keeping the default is to
+                 maintain the same signature as numpy's tensordot function
+                 (and np.tensordot raises analogous errors for non-compatible
+                 inputs).
+
+                 If an integer i, it is converted to an array containing
+                 the last i dimensions of the first tensor and the first
+                 i dimensions of the second tensor:
+                     axes = [range(a.ndim - i, a.ndim), range(i)]
+
+                 If an array, its two elements must contain compatible axes
+                 of the two tensors. For example, [[1, 2], [2, 0]] means sum
+                 over the 2nd and 3rd axes of a and the 3rd and 1st axes of b.
+                 (Remember axes are zero-indexed!) The 2nd axis of a and the
+                 3rd axis of b must have the same shape; the same is true for
+                 the 3rd axis of a and the 1st axis of b.
+    :type axes: int or array-like of length 2
+
+    :returns: a tensor with shape equal to the concatenation of a's shape
+              (less any dimensions that were summed over) and b's shape
+              (less any dimensions that were summed over).
    :rtype: symbolic tensor
-    :type axes: see numpy.tensordot

-    :return: tensor product
+    It may be helpful to consider an example to see what tensordot does.
+    Theano's implementation is identical to NumPy's. Here a has shape (2, 3, 4)
+    and b has shape (5, 6, 4, 3). The axes to sum over are [[1, 2], [3, 2]] --
+    note that a.shape[1] == b.shape[3] and a.shape[2] == b.shape[2]; these axes
+    are compatible. The resulting tensor will have shape (2, 5, 6) -- the
+    dimensions that are not being summed:
+
+        a = np.random.random((2,3,4))
+        b = np.random.random((5,6,4,3))
+
+        #tensordot
+        c = np.tensordot(a, b, [[1,2],[3,2]])
+
+        #loop replicating tensordot
+        a0, a1, a2 = a.shape
+        b0, b1, _, _ = b.shape
+        cloop = np.zeros((a0,b0,b1))
+
+        #loop over non-summed indices -- these exist
+        #in the tensor product.
+        for i in range(a0):
+            for j in range(b0):
+                for k in range(b1):
+                    #loop over summed indices -- these don't exist
+                    #in the tensor product.
+                    for l in range(a1):
+                        for m in range(a2):
+                            cloop[i,j,k] += a[i,l,m] * b[j,k,m,l]
+
+        np.allclose(c, cloop) #True
+
+    This specific implementation avoids a loop by transposing a and b such that
+    the summed axes of a are last and the summed axes of b are first. The
+    resulting arrays are reshaped to 2 dimensions (or left as vectors, if
+    appropriate) and a matrix or vector dot product is taken. The result is
+    reshaped back to the required output dimensions.
+
+    In the extreme case axes=0, no axes are summed over. The resulting tensor
+    will have shape equal to the concatenation of the shapes of a and b:
+
+        c = np.tensordot(a, b, 0)
+        print(a.shape) #(2,3,4)
+        print(b.shape) #(5,6,4,3)
+        print(c.shape) #(2,3,4,5,6,4,3)
+
+    See the documentation of numpy.tensordot for more examples.

 .. function:: batched_dot(X, Y)


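Note on the transpose-and-reshape strategy described in the documentation above: the following is a minimal NumPy sketch of that idea, not Theano's actual code. The helper name `tensordot_via_dot` is hypothetical, and the sketch assumes `axes` is a pair of lists of non-negative integers.

```python
import numpy as np

def tensordot_via_dot(a, b, axes):
    """Hypothetical helper: express tensordot as a single matrix product."""
    a_axes, b_axes = [list(ax) for ax in axes]
    # Axes that survive into the result, in their original order.
    a_keep = [d for d in range(a.ndim) if d not in a_axes]
    b_keep = [d for d in range(b.ndim) if d not in b_axes]
    # Move the summed axes of a to the end and those of b to the front.
    a_t = a.transpose(a_keep + a_axes)
    b_t = b.transpose(b_axes + b_keep)
    out_shape = [a.shape[d] for d in a_keep] + [b.shape[d] for d in b_keep]
    # Total number of elements summed over per output entry.
    n_sum = int(np.prod([a.shape[d] for d in a_axes]))
    # Collapse both operands to 2-D, multiply, then restore the output shape.
    a_2d = a_t.reshape(-1, n_sum)
    b_2d = b_t.reshape(n_sum, -1)
    return np.dot(a_2d, b_2d).reshape(out_shape)

a = np.random.random((2, 3, 4))
b = np.random.random((5, 6, 4, 3))
c = tensordot_via_dot(a, b, [[1, 2], [3, 2]])
reference = np.tensordot(a, b, [[1, 2], [3, 2]])
```

On these inputs `c` should agree with `reference` up to floating-point rounding. The sketch always reshapes to 2-D and does not special-case vectors, which the documented implementation is said to do.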
--- a/theano/tensor/basic.py
+++ b/theano/tensor/basic.py
@@ -7112,13 +7112,15 @@ def dot(a, b):
 def tensordot(a, b, axes = 2):
    """
    Given two tensors a and b, tensordot computes a generalized dot product over
-    the provided axes. This implementation reduces all expressions to matrix or
-    vector dot products and is based on code from Tijmen Tieleman's gnumpy
-    (http://www.cs.toronto.edu/~tijmen/gnumpy.html).
+    the provided axes. Theano's implementation reduces all expressions to
+    matrix or vector dot products and is based on code from Tijmen Tieleman's
+    gnumpy (http://www.cs.toronto.edu/~tijmen/gnumpy.html).

    :param a: the first tensor variable
+    :type a: symbolic tensor

    :param b: the second tensor variable
+    :type b: symbolic tensor

    :param axes: an integer or array. If an integer, the number of axes
                 to sum over. If an array, it must have two array
@@ -7142,10 +7144,12 @@ def tensordot(a, b, axes = 2):
                 (Remember axes are zero-indexed!) The 2nd axis of a and the
                 3rd axis of b must have the same shape; the same is true for
                 the 3rd axis of a and the 1st axis of b.
+    :type axes: int or array-like of length 2

    :returns: a tensor with shape equal to the concatenation of a's shape
              (less any dimensions that were summed over) and b's shape
              (less any dimensions that were summed over).
+    :rtype: symbolic tensor

    It may be helpful to consider an example to see what tensordot does.
    Theano's implementation is identical to NumPy's. Here a has shape (2, 3, 4)
@@ -7192,7 +7196,7 @@ def tensordot(a, b, axes = 2):
        print(b.shape) #(5,6,4,3)
        print(c.shape) #(2,3,4,5,6,4,3)

-    See the documentation of np.tensordot for more examples.
+    See the documentation of numpy.tensordot for more examples.
    """
    a, b = as_tensor_variable(a), as_tensor_variable(b)
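Note on the integer form of `axes`: the docstring says an integer i selects the last i axes of the first tensor and the first i axes of the second. The sketch below illustrates this with NumPy, which the docstring describes as behaving identically. The shapes and variable names are chosen only for illustration.

```python
import numpy as np

a = np.random.random((2, 3, 4))
b = np.random.random((3, 4, 5))

i = 2
# Last i axes of a paired with the first i axes of b.
explicit_axes = [list(range(a.ndim - i, a.ndim)), list(range(i))]

c_int = np.tensordot(a, b, i)
c_explicit = np.tensordot(a, b, explicit_axes)
same = np.allclose(c_int, c_explicit)

# The default axes=2 fails when the paired axes have mismatched sizes:
# here the last two axes of a are (3, 4) but the first two of b_bad are (5, 6).
b_bad = np.random.random((5, 6, 4, 3))
try:
    np.tensordot(a, b_bad)
    default_failed = False
except ValueError:
    default_failed = True
```

The second half shows why the default of 2 is not guaranteed to work for every pair of inputs, as the docstring warns.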