Update developer documentation on Type to include new partial shape information

150add2c · Brandon T. Willard · Brandon T. Willard · e6f9689e · 150add2c · 150add2c
--- a/doc/extending/graphstructures.rst
+++ b/doc/extending/graphstructures.rst
@@ -203,36 +203,39 @@ structures, code going like ``def f(x): ...`` would produce an :class:`Op` for
 :class:`Type`
 -------------
-A :class:`Type` in Aesara represents a set of constraints on potential
+A :class:`Type` in Aesara provides static information (or constraints) about
-data objects. These constraints allow Aesara to tailor C code to handle
+data objects in a graph. The information provided by :class:`Type`\s allows
-them and to statically optimize the computation graph. For instance,
+Aesara to perform optimizations and produce more efficient compiled code.
-the :ref:`irow <libdoc_tensor_creation>` type in the :mod:`aesara.tensor` package
-gives the following constraints on the data the :class:`Variable`\s of type ``irow``
+Every symbolic :class:`Variable` in an Aesara graph has an associated
-may contain:
+:class:`Type` instance, and :class:`Type`\s also serve as a means of
+constructing :class:`Variable` instances.  In other words, :class:`Type`\s and
-#. Must be an instance of :class:`numpy.ndarray`: ``isinstance(x, numpy.ndarray)``
+:class:`Variable`\s go hand-in-hand.
-#. Must be an array of 32-bit integers: ``str(x.dtype) == 'int32'``
-#. Must have a shape of 1xN: ``len(x.shape) == 2 and x.shape[0] == 1``
+For example, :ref:`aesara.tensor.irow <libdoc_tensor_creation>` is an instance of a
+:class:`Type` and it can be used to construct variables as follows:
-Knowing these restrictions, Aesara may generate C code for addition, etc.
-that declares the right data types and that contains the right number
+>>> from aesara.tensor import irow
-of loops over the dimensions.
+>>> irow()
+<TensorType(int32, (1, None))>
-Note that an Aesara :class:`Type` is not equivalent to a Python type or
-class. Indeed, in Aesara, :ref:`irow <libdoc_tensor_creation>` and :ref:`dmatrix
+As the string print-out shows, `irow` specifies the following information about
-<libdoc_tensor_creation>` both use :class:`numpy.ndarray` as the underlying type
+the :class:`Variable`\s it constructs:
-for doing computations and storing data, yet they are different Aesara
-:class:`Type`\s. Indeed, the constraints set by `dmatrix` are:
+#. They represent tensors that are backed by :class:`numpy.ndarray`\s.
+   This comes from the fact that `irow` is an instance of :class:`TensorType`,
-#. Must be an instance of :class:`numpy.ndarray`: ``isinstance(x, numpy.ndarray)``
+   which is the base :class:`Type` for symbolic :class:`numpy.ndarray`\s.
-#. Must be an array of 64-bit floating point numbers: ``str(x.dtype) == 'float64'``
+#. They represent arrays of 32-bit integers (i.e. from the ``int32``).
-#. Must have a shape of ``MxN``, no restriction on ``M`` or ``N``: ``len(x.shape) == 2``
+#. They represent arrays with shapes of :math:`1 \times N`, or, in code, ``(1,
+   None)``, where ``None`` represents any shape value.
-These restrictions are different from those of ``irow`` which are listed above.
+Note that Aesara :class:`Type`\s are not necessarily equivalent to Python types or
-There are cases in which a :class:`Type` can fully correspond to a Python type,
+classes. Aesara's :class:`TensorType`'s, like `irow`, use :class:`numpy.ndarray`
-such as the `double`\ :class:`Type`, which corresponds to
+as the underlying Python type for performing computations and storing data, but
-Python's ``float``.
+:class:`numpy.ndarray`\s model a much wider class of arrays than most :class:`TensorType`\s.
+In other words, Aesara :class:`Type`'s try to be more specific.
+For more information see :ref:`aesara_type`.
 .. index::
   single: Variable

--- a/doc/extending/type.rst
+++ b/doc/extending/type.rst
@@ -4,6 +4,167 @@
 :class:`Type`\s
 ===============
+The :class:`Type` class is used to provide "static" information about the types of
+:class:`Variable`\s in an Aesara graph.  This information is used for graph optimizations
+and compilation to languages with typing that's stricter than Python's.
+The types handled by Aesara naturally overlap a lot with NumPy, but
+they also differ from it in some very important ways.  In the following, we use
+:class:`TensorType` to illustrate some important concepts and functionality
+regarding :class:`Type`, because it's the most common and feature rich subclass of
+:class:`Type`.  Just be aware that all the same high-level concepts apply
+to any other graph objects modeled by a :class:`Type` subclass.
+The :class:`TensorType`
+-----------------------
+Aesara has a :class:`Type` subclass for tensors/arrays called :class:`TensorType`.
+It broadly represents a type for tensors, but, more specifically, all of its
+computations are performed using instances of the :class:`numpy.ndarray` class, so it
+effectively represents the same objects as :class:`numpy.ndarray`.
+The expression ``TensorType(dtype, shape)()`` will construct a symbolic
+:class:`TensorVariable` instance (a subclass of :class:`Variable`), like
+``numpy.ndarray(shape, dtype)`` will construct a :class:`numpy.ndarray`
+instance.
+Notice the extra parenthesis in the :class:`TensorType` example.  Those are
+necessary because ``TensorType(dtype, shape)`` only constructs an instance of a
+:class:`TensorType`, then, with that instance, a :class:`Variable` instance can
+be constructed using :meth:`TensorType.__call__`.  Just remember that
+:class:`Type` objects are *not* Python types/classes; they're instances of the
+Python class :class:`Type`.  The purpose is effectively the same, though:
+:class:`Type`\s provide high-level typing information and construct instances of
+the high-level types they model.  While Python types/classes do this for the Python
+VM, Aesara :class:`Type`\s do this for its effective "VM".
+In relation to NumPy, the important difference is that Aesara works at the
+symbolic level, and, because of that, there are no concrete array instances with
+which it can call the :attr:`dtype` and :attr:`shape`
+methods and get information about the data type or shape of a symbolic
+variable.  Aesara needs static class/type-level information to serve that purpose,
+and it can't use the :class:`numpy.ndarray` class itself, because that doesn't have
+fixed data types or shapes.
+In analogy with NumPy, we could imagine that the expression :class:`TensorType`
+is a :class:`numpy.ndarray` class constructor like the following:
+.. code::
+   def NdarrayType(dtype, shape):
+       class fixed_dtype_shape_ndarray(_numpy.ndarray):
+           dtype = dtype
+           shape = shape
+           def __call__(self):
+               return super().__call__(dtype, shape)
+       return fixed_dtype_shape_ndarray
+This hypothetical :class:`NdarrayType` would construct :class:`numpy.ndarray` subclasses that
+produces instances with fixed data types and shapes.  Also, the subclasses
+created by this class constructor, would provide data type
+and shape information about the instances they produce without ever needing to
+construct an actual instance (e.g. one can simply inspect the class-level
+``shape`` and ``dtype`` members for that information).
+:class:`TensorType`\s provide a way to carry around the same array information
+at the type level, but they also perform comparisons and conversions
+between and to different types.
+For instance, :class:`TensorType`\s allow for _partial_ shape information.  In other words, the
+shape values for some--or all--dimensions may be unspecified.  The only fixed requirement is that
+the _number_ of dimensions be fixed/given (i.e. the length of the shape
+``tuple``).  To encode partial shape information, :class:`TensorType` allows its
+``shape`` arguments to include ``None``\s.
+To illustrate, ``TensorType("float64", (2, None))`` could represent an array of
+shape ``(2, 0)``, ``(2, 1)``, etc.  This dynamic opens up some questions
+regarding the comparison of :class:`TensorType`\s.
+For example, let's say we have two :class:`Variable`\s with the following
+:class:`TensorType`\s:
+>>> from aesara.tensor.type import TensorType
+>>> v1 = TensorType("float64", (2, None))()
+>>> v1.type
+TensorType(float64, (2, None))
+>>> v2 = TensorType("float64", (2, 1))()
+>>> v2.type
+TensorType(float64, (2, 1))
+If we ever wanted to replace ``v1`` in an Aesara graph with ``v2``, we would first
+need to check that they're "compatible".  This could be done by noticing that
+their shapes match everywhere except on the second dimension, where ``v1`` has the shape
+value ``None`` and ``v2`` has a ``1``.  Since ``None`` indicates "any" shape
+value, the two are "compatible" in some sense.
+The "compatibility" we're describing here is really that ``v1``'s :class:`Type`
+represents a larger set of arrays, and ``v2``'s represents a much more
+specific subset, but both belong to the same set of array types.
+:class:`Type` provides a generic interface for these kinds of comparisons with its
+:meth:`Type.in_same_class` and :meth:`Type.is_super` methods.  These type-comparison methods
+are in turn used by the :class:`Variable` conversion methods to "narrow" the
+type information received at different stages of graph construction and
+rewriting.
+For example:
+>>> v1.type.in_same_class(v2.type)
+False
+This result is due to the definition of "type class" used by
+:class:`TensorType`.  Its definition is based on the broadcastable dimensions
+(i.e. ``1``\s) in the available static shape information.  See the
+docstring for :meth:`TensorType.in_same_class` for more information.
+>>> v1.type.is_super(v2.type)
+True
+This result is due to the fact that ``v1.type`` models a superset of the types
+that ``v2.type`` models, since ``v2.type`` is a type for arrays with the
+specific shape ``(2, 1)`` and ``v1.type`` is a type for _all_ arrays with shape
+``(2, N)`` for any ``N``--of which ``v2.type``'s type is only a single instance.
+This relation is used to "filter" :class:`Variable`\s through specific :class:`Type`\s
+in order to generate a new :class:`Variable` that's compatible with both.  This "filtering"
+is an important step in the node replacement process during graph rewriting, for instance.
+>>> v1.type.filter_variable(v2)
+<TensorType(float64, (2, 1))>
+"Filtering" returned a variable of the same :class:`Type` as ``v2``, because ``v2``'s :class:`Type` is
+more specific/informative than ``v1``'s--and both are compatible.
+>>> v3 = v2.type.filter_variable(v1)
+>>> v3
+Rebroadcast{(0, False),(1, True)}.0
+>>> import aesara
+>>> aesara.dprint(v3, print_type=True)
+Rebroadcast{(0, False),(1, True)} [id A] <TensorType(float64, (None, 1))> ''
+ |<TensorType(float64, (2, None))> [id B] <TensorType(float64, (2, None))>
+Performing this in the opposite direction returned the output of a
+:class:`Rebroadcast`\ :class:`Op`.  This :class:`Rebroadcast` uses ``v1`` as an
+input and serves to produce a new :class:`Variable` that has a :class:`Type` compatible with
+both ``v1`` and ``v2``.
+.. note::
+   The :class:`Type` for ``v3`` should really have a static shape of ``(2, 1)``
+   (i.e. ``v2``'s shape), but the static shape information feature is still
+   under development.
+It's important to keep these special type comparisons in mind when developing custom
+:class:`Op`\s and graph rewrites in Aesara, because simple naive comparisons
+like ``v1.type == v2.type`` may unnecessarily restrict logic and prevent
+more refined type information from propagating throughout a graph.  They may not
+cause errors, but they could prevent Aesara from performing at its best.
 .. _type_contract:
@@ -20,28 +181,17 @@ i.e.  the same default argument names and values. If you wish to add
 extra arguments to any of these methods, these extra arguments must have
 default values.
-.. class:: Type
+.. autoclass:: aesara.graph.type.Type
+    :noindex:
-    .. method:: filter(value, strict=False, allow_downcast=None)
+    .. automethod:: in_same_class
+      :noindex:
-      This casts a value to match the :class:`Type` and returns the
+    .. automethod:: is_super
-      cast value. If ``value`` is incompatible with the :class:`Type`,
+      :noindex:
-      the method must raise an exception. If ``strict`` is True, ``filter`` must return a
-      reference to ``value`` (i.e. casting prohibited).
-      If ``strict`` is False, then casting may happen, but downcasting should
-      only be used in two situations:
-      * if ``allow_downcast`` is True
-      * if ``allow_downcast`` is ``None`` and the default behavior for this
-        type allows downcasting for the given ``value`` (this behavior is
-        type-dependent, you may decide what your own type does by default)
-      We need to define ``filter`` with three arguments. The second argument
-      must be called ``strict`` (Aesara often calls it by keyword) and must
-      have a default value of ``False``. The third argument must be called
-      ``allow_downcast`` and must have a default value of ``None``.
    .. method:: filter_inplace(value, storage, strict=False, allow_downcast=None)
+      :noindex:
      If filter_inplace is defined, it will be called instead of
      filter() This is to allow reusing the old allocated memory. As
@@ -52,6 +202,7 @@ default values.
      CudaNdarray, ...
    .. method:: is_valid_value(value)
+      :noindex:
      Returns True iff the value is compatible with the :class:`Type`. If
      ``filter(value, strict = True)`` does not raise an exception, the
@@ -61,12 +212,14 @@ default values.
      an exception.
    .. method:: values_eq(a, b)
+      :noindex:
      Returns True iff ``a`` and ``b`` are equal.
      *Default:* ``a == b``
    .. method:: values_eq_approx(a, b)
+      :noindex:
      Returns True iff ``a`` and ``b`` are approximately equal, for a
      definition of "approximately" which varies from :class:`Type` to :class:`Type`.
@@ -74,6 +227,7 @@ default values.
      *Default:* ``values_eq(a, b)``
    .. method:: make_variable(name=None)
+      :noindex:
      Makes a :term:`Variable` of this :class:`Type` with the specified name, if
      ``name`` is not ``None``. If ``name`` is ``None``, then the `Variable` does
@@ -85,18 +239,21 @@ default values.
      other words, ``self``).
    .. method:: __call__(name=None)
+      :noindex:
      Syntactic shortcut to ``make_variable``.
      *Default:* ``make_variable``
    .. method:: __eq__(other)
+      :noindex:
      Used to compare :class:`Type` instances themselves
      *Default:* ``object.__eq__``
    .. method:: __hash__()
+      :noindex:
      :class:`Type`\s should not be mutable, so it should be OK to define a hash
      function.  Typically this function should hash all of the terms
@@ -104,7 +261,27 @@ default values.
      *Default:* ``id(self)``
+    .. automethod:: clone
+       :noindex:
+.. autoclass:: aesara.tensor.type.TensorType
+    :noindex:
+    .. method:: may_share_memory(a, b)
+      :noindex:
+        Optional to run, but mandatory for `DebugMode`. Return ``True`` if the Python
+        objects `a` and `b` could share memory. Return ``False``
+        otherwise. It is used to debug when :class:`Op`\s did not declare memory
+        aliasing between variables. Can be a static method.
+        It is highly recommended to use and is mandatory for :class:`Type` in Aesara
+        as our buildbot runs in `DebugMode`.
    .. method:: get_shape_info(obj)
+      :noindex:
      Optional. Only needed to profile the memory of this :class:`Type` of object.
@@ -127,6 +304,7 @@ default values.
    .. method:: get_size(shape_info)
+      :noindex:
        Number of bytes taken by the object represented by shape_info.
@@ -136,33 +314,6 @@ default values.
        :return: the number of bytes taken by the object described by
            ``shape_info``.
-    .. method:: clone(dtype=None, shape=None)
-       Optional, for TensorType-alikes.
-       Return a copy of the type with a possibly changed value for
-       dtype and broadcastable (if they aren't `None`).
-       :param dtype: New dtype for the copy.
-       :param broadcastable: New broadcastable tuple for the copy.
-    .. method:: may_share_memory(a, b)
-        Optional to run, but mandatory for `DebugMode`. Return ``True`` if the Python
-        objects `a` and `b` could share memory. Return ``False``
-        otherwise. It is used to debug when :class:`Op`\s did not declare memory
-        aliasing between variables. Can be a static method.
-        It is highly recommended to use and is mandatory for :class:`Type` in Aesara
-        as our buildbot runs in `DebugMode`.
-For each method, the *default* is what `Type` defines
-for you. So, if you create an instance of `Type` or an
-instance of a subclass of `Type`, you
-must define ``filter``. You might want to override ``values_eq_approx``,
-as well as ``values_eq``. The other defaults generally need not be
-overridden.
-For more details you can go see the documentation for :ref:`type`.
 Additional definitions
@@ -187,33 +338,32 @@ with it as argument.
 An example
 ==========
-We are going to base :class:`Type` ``double`` on Python's ``float``. We
+We are going to base :class:`Type` ``DoubleType`` on Python's ``float``. We
-must define ``filter`` and shall override ``values_eq_approx``.
+must define ``filter`` and ``values_eq_approx``.
 **filter**
 .. testcode::
-    # Note that we shadow Python's function ``filter`` with this
+    # note that we shadow python's function ``filter`` with this
    # definition.
-    def filter(x, strict=False, allow_downcast=None):
+    def filter(x, strict=false, allow_downcast=none):
        if strict:
            if isinstance(x, float):
                return x
            else:
-                raise TypeError('Expected a float!')
+                raise typeerror('expected a float!')
        elif allow_downcast:
            return float(x)
-        else:   # Covers both the False and None cases.
+        else:   # covers both the false and none cases.
            x_float = float(x)
            if x_float == x:
                return x_float
            else:
-                 raise TypeError('The double type cannot accurately represent '
+                  raise TypeError('The double type cannot accurately represent '
-                                 'value %s (of type %s): you must explicitly '
+                                  f'value {x} (of type {type(x)}): you must explicitly '
-                                 'allow downcasting if you want to do this.'
+                                  'allow downcasting if you want to do this.')
-                                 % (x, type(x)))
 If ``strict`` is True we need to return ``x``. If ``strict`` is True and ``x`` is not a
 ``float`` (for example, ``x`` could easily be an ``int``) then it is
@@ -258,26 +408,13 @@ chose to be 1e-4.
 What we want is an object that respects the aforementioned
 contract. Recall that :class:`Type` defines default implementations for all
-required methods of the interface, except ``filter``. One way to make
+required methods of the interface, except ``filter``.
-the :class:`Type` is to instantiate a plain :class:`Type` and set the needed fields:
-.. testcode::
-   from aesara.graph.type import Type
-   double = Type()
-   double.filter = filter
-   double.values_eq_approx = values_eq_approx
-Another way to make this :class:`Type` is to make a subclass of `Type`
-and define ``filter`` and ``values_eq_approx`` in the subclass:
 .. code-block:: python
   from aesara.graph.type import Type
-   class Double(Type):
+   class DoubleType(Type):
       def filter(self, x, strict=False, allow_downcast=None):
           # See code above.
@@ -287,59 +424,28 @@ and define ``filter`` and ``values_eq_approx`` in the subclass:
           # See code above.
           ...
-   double = Double()
+   double = DoubleType()
-``double`` is then an instance of :class:`Type`\ `Double`, which in turn is a
+``double`` is then an instance of :class:`Type`\ :class:`DoubleType`, which in turn is a
 subclass of `Type`.
-There is a small issue with defining ``double`` this way. All
+There is a small issue with our :class:`DoubleType`: all
-instances of `Double` are technically the same :class:`Type`. However, different
+instances of `DoubleType` are technically the same :class:`Type`; however, different
-`Double`\ :class:`Type` instances do not compare the same:
+`DoubleType`\ :class:`Type` instances do not compare the same:
-.. testsetup::
-   from aesara.graph.type import Type
-   class Double(Type):
-       def filter(self, x, strict=False, allow_downcast=None):
-           if strict:
-               if isinstance(x, float):
-                   return x
-               else:
-                   raise TypeError('Expected a float!')
-           elif allow_downcast:
-               return float(x)
-           else:   # Covers both the False and None cases.
-               x_float = float(x)
-               if x_float == x:
-                   return x_float
-               else:
-                    raise TypeError('The double type cannot accurately represent '
-                                    'value %s (of type %s): you must explicitly '
-                                    'allow downcasting if you want to do this.'
-                                    % (x, type(x)))
-       def values_eq_approx(self, x, y, tolerance=1e-4):
-           return abs(x - y) / (abs(x) + abs(y)) < tolerance
-       def __str__(self):
-           return "double"
-   double = Double()
+>>> double1 = DoubleType()
+>>> double2 = DoubleType()
->>> double1 = Double()
->>> double2 = Double()
 >>> double1 == double2
 False
 Aesara compares :class:`Type`\s using ``==`` to see if they are the same.
 This happens in :class:`DebugMode`.  Also, :class:`Op`\s can (and should) ensure that their inputs
-have the expected :class:`Type` by checking something like ``if x.type == lvector``.
+have the expected :class:`Type` by checking something like
+``x.type.is_super(lvector)`` or ``x.type.in_same_class(lvector)``.
 There are several ways to make sure that equality testing works properly:
- #. Define ``Double.__eq__`` so that instances of type Double
+ #. Define :meth:`DoubleType.__eq__` so that instances of type :class:`DoubleType`
    are equal. For example:
    .. testcode::
@@ -347,8 +453,8 @@ There are several ways to make sure that equality testing works properly:
        def __eq__(self, other):
            return type(self) == type(other)
- #. Override :meth:`Double.__new__` to always return the same instance.
+ #. Override :meth:`DoubleType.__new__` to always return the same instance.
- #. Hide the Double class and only advertise a single instance of it.
+ #. Hide the :class:`DoubleType` class and only advertise a single instance of it.
 Here we will prefer the final option, because it is the simplest.
 :class:`Op`\s in the Aesara code often define the :meth:`__eq__` method though.
@@ -361,7 +467,7 @@ Initially, confusion is common on what an instance of :class:`Type` is versus
 a subclass of :class:`Type` or an instance of :class:`Variable`. Some of this confusion is
 syntactic. A :class:`Type` is any object which has fields corresponding to the
 functions defined above. The :class:`Type` class provides sensible defaults for
-all of them except ``filter``, so when defining new :class:`Type`\s it is natural
+all of them except :meth:`Type.filter`, so when defining new :class:`Type`\s it is natural
 to subclass :class:`Type`. Therefore, we often end up with :class:`Type` subclasses and
 it is can be confusing what these represent semantically. Here is an
 attempt to clear up the confusion:
@@ -380,7 +486,7 @@ attempt to clear up the confusion:
 * A **subclass of :class:`Type`** is a way of implementing
  a set of :class:`Type` instances that share
-  structural similarities. In the ``double`` example that we are doing,
+  structural similarities. In the :class:`DoubleType` example that we are doing,
  there is actually only one :class:`Type` in that set, therefore the subclass
  does not represent anything that one of its instances does not. In this
  case it is a singleton, a set with one element. However, the
@@ -390,43 +496,3 @@ attempt to clear up the confusion:
  parametrized by their data type or number of dimensions. We could say
  that subclassing :class:`Type` builds a hierarchy of :class:`Type`\s which is based upon
  structural similarity rather than compatibility.
-Final version
-=============
-.. testcode::
-   from aesara.graph.type import Type
-   class Double(Type):
-       def filter(self, x, strict=False, allow_downcast=None):
-           if strict:
-               if isinstance(x, float):
-                   return x
-               else:
-                   raise TypeError('Expected a float!')
-           elif allow_downcast:
-               return float(x)
-           else:   # Covers both the False and None cases.
-               x_float = float(x)
-               if x_float == x:
-                   return x_float
-               else:
-                    raise TypeError('The double type cannot accurately represent '
-                                    'value %s (of type %s): you must explicitly '
-                                    'allow downcasting if you want to do this.'
-                                    % (x, type(x)))
-       def values_eq_approx(self, x, y, tolerance=1e-4):
-           return abs(x - y) / (abs(x) + abs(y)) < tolerance
-       def __str__(self):
-           return "double"
-   double = Double()
-We add one utility function, ``__str__``. That way, when we print
-``double``, it will print out something intelligible.
--- a/doc/tutorial/shape_info.rst
+++ b/doc/tutorial/shape_info.rst
@@ -4,25 +4,21 @@
 How Shape Information is Handled by Aesara
 ==========================================
-It is not possible to strictly enforce the shape of an Aesara variable when
+Currently, information regarding shape is used in the following ways by Aesara:
-building a graph since the particular value provided at run-time for a parameter of a
-Aesara function may condition the shape of the Aesara variables in its graph.
-Currently, information regarding shape is used in two ways in Aesara:
- To generate faster C code for the 2d convolution on the CPU and the GPU,
-  when the exact output shape is known in advance.
 - To remove computations in the graph when we only want to know the
  shape, but not the actual value of a variable. This is done with the
-  `Op.infer_shape` method.
+  :meth:`Op.infer_shape` method.
+- To generate faster compiled code (e.g. for a 2D convolution).
-  Example:
+Example:
 >>> import aesara
 >>> x = aesara.tensor.matrix('x')
 >>> f = aesara.function([x], (x ** 2).shape)
->>> aesara.printing.debugprint(f) # doctest: +NORMALIZE_WHITESPACE
+>>> aesara.dprint(f)
 MakeVector{dtype='int64'} [id A] ''   2
 |Shape_i{0} [id B] ''   1
 | |x [id C]
@@ -30,15 +26,46 @@ MakeVector{dtype='int64'} [id A] ''   2
   |x [id C]
-The output of this compiled function does not contain any multiplication
+The output of this compiled function does not contain any multiplication or
-or power. Aesara has removed them to compute directly the shape of the
+power computations; Aesara has removed them to compute the shape of the output
-output.
+directly.
+Aesara propagates information about shapes within a graph using specialized
+:class:`Op`\s and static :class:`Type` information (see :ref:`aesara_type`).
+Specifying Exact Shape
+======================
+Currently, specifying a shape is not as easy and flexible as we wish and we plan some
+upgrade.  Here is the current state of what can be done:
+- You can pass the shape info directly to the ``ConvOp`` created
+  when calling ``conv2d``. You simply set the parameters ``image_shape``
+  and ``filter_shape`` inside the call. They must be tuples of 4
+  elements. For example:
+.. code-block:: python
+    aesara.tensor.nnet.conv2d(..., image_shape=(7, 3, 5, 5), filter_shape=(2, 3, 4, 4))
+- You can use the ``SpecifyShape`` op to add shape information anywhere in the
+  graph. This allows to perform some optimizations. In the following example,
+  this makes it possible to precompute the Aesara function to a constant.
+>>> import aesara
+>>> x = aesara.tensor.matrix()
+>>> x_specify_shape = aesara.tensor.specify_shape(x, (2, 2))
+>>> f = aesara.function([x], (x_specify_shape ** 2).shape)
+>>> aesara.printing.debugprint(f) # doctest: +NORMALIZE_WHITESPACE
+DeepCopyOp [id A] ''   0
+ |TensorConstant{(2,) of 2} [id B]
-Shape Inference Problem
+Problems with Shape inference
-=======================
+=============================
-Aesara propagates information about shape in the graph. Sometimes this
+Sometimes this can lead to errors.  Consider this example:
-can lead to errors. Consider this example:
 >>> import numpy
 >>> import aesara
@@ -75,7 +102,7 @@ Traceback (most recent call last):
 ValueError: ...
 As you can see, when asking only for the shape of some computation (``join`` in the
-example), an inferred shape is computed directly, without executing
+example above), an inferred shape is computed directly, without executing
 the computation itself (there is no ``join`` in the first output or debugprint).
 This makes the computation of the shape faster, but it can also hide errors. In
@@ -93,40 +120,3 @@ optimization, using the Aesara flag
 same effect by running in the modes ``FAST_COMPILE`` (it will not apply this
 optimization, nor most other optimizations) or ``DebugMode`` (it will test
 before and after all optimizations (much slower)).
-Specifying Exact Shape
-======================
-Currently, specifying a shape is not as easy and flexible as we wish and we plan some
-upgrade.  Here is the current state of what can be done:
- You can pass the shape info directly to the ``ConvOp`` created
-  when calling ``conv2d``. You simply set the parameters ``image_shape``
-  and ``filter_shape`` inside the call. They must be tuples of 4
-  elements. For example:
-.. code-block:: python
-    aesara.tensor.nnet.conv2d(..., image_shape=(7, 3, 5, 5), filter_shape=(2, 3, 4, 4))
- You can use the ``SpecifyShape`` op to add shape information anywhere in the
-  graph. This allows to perform some optimizations. In the following example,
-  this makes it possible to precompute the Aesara function to a constant.
->>> import aesara
->>> x = aesara.tensor.matrix()
->>> x_specify_shape = aesara.tensor.specify_shape(x, (2, 2))
->>> f = aesara.function([x], (x_specify_shape ** 2).shape)
->>> aesara.printing.debugprint(f) # doctest: +NORMALIZE_WHITESPACE
-DeepCopyOp [id A] ''   0
- |TensorConstant{(2,) of 2} [id B]
-Future Plans
-============
-  The parameter "constant shape" will be added to ``aesara.shared()``. This is probably
-  the most frequent occurrence with ``shared`` variables. It will make the code
-  simpler and will make it possible to check that the shape does not change when
-  updating the ``shared`` variable.