Enable sphinx-lint pre-commit hook
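
For reviewers: most of the changes below remove trailing whitespace or replace hard tabs in the docs. The sketch that follows is only an illustration of how such lines can be detected. It is not sphinx-lint's implementation, and `find_whitespace_issues` and the sample text are hypothetical names introduced here.

```python
from __future__ import annotations


def find_whitespace_issues(text: str) -> list[tuple[int, str]]:
    """Return (line_number, problem) pairs for trailing whitespace and hard tabs.

    Hypothetical helper for illustration only; the real hook performs its
    own, more extensive, checks.
    """
    issues = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # A line that changes when right-stripped ends in whitespace.
        if line != line.rstrip():
            issues.append((lineno, "trailing whitespace"))
        # Hard tabs render inconsistently in reStructuredText indentation.
        if "\t" in line:
            issues.append((lineno, "hard tab"))
    return issues


# Sample text assembled from the kinds of lines touched in this diff.
sample = (
    "Shared RNGs are created by default \n"
    "\tSee :ref:`unsafe_rewrites`\n"
    "clean line\n"
)

for lineno, problem in find_whitespace_issues(sample):
    print(f"line {lineno}: {problem}")
```

Once the hook is installed, `pre-commit run sphinx-lint --all-files` should run the real checks across the repository.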

3e55a209 · Virgile Andreani · Thomas Wiecki · a8735971
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -21,6 +21,11 @@ repos:
              pytensor/tensor/variable\.py|
          )$
      - id: check-merge-conflict
+  - repo: https://github.com/sphinx-contrib/sphinx-lint
+    rev: v1.0.0
+    hooks:
+    - id: sphinx-lint
+      args: ["."]
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.5
    hooks:

--- a/doc/extending/creating_a_c_op.rst
+++ b/doc/extending/creating_a_c_op.rst
@@ -152,7 +152,7 @@ This distance between consecutive elements of an array over a given dimension,
 is called the stride of that dimension.
-Accessing NumPy :class`ndarray`\s' data and properties
+Accessing NumPy :class:`ndarray`'s data and properties
 ------------------------------------------------------
 The following macros serve to access various attributes of NumPy :class:`ndarray`\s.

--- a/doc/extending/creating_a_numba_jax_op.rst
+++ b/doc/extending/creating_a_numba_jax_op.rst
@@ -4,7 +4,7 @@ Adding JAX, Numba and Pytorch support for `Op`\s
 PyTensor is able to convert its graphs into JAX, Numba and Pytorch compiled functions. In order to do
 this, each :class:`Op` in an PyTensor graph must have an equivalent JAX/Numba/Pytorch implementation function.
-This tutorial will explain how JAX, Numba and Pytorch implementations are created for an :class:`Op`. 
+This tutorial will explain how JAX, Numba and Pytorch implementations are created for an :class:`Op`.
 Step 1: Identify the PyTensor :class:`Op` you'd like to implement
 ------------------------------------------------------------------------
@@ -60,7 +60,7 @@ could also have any data type (e.g. floats, ints), so our implementation
 must be able to handle all the possible data types.
 It also tells us that there's only one return value, that it has a data type
-determined by :meth:`x.type()` i.e., the data type of the original tensor.
+determined by :meth:`x.type`, i.e., the data type of the original tensor.
 This implies that the result is necessarily a matrix.
 Some class may have a more complex behavior. For example, the :class:`CumOp`\ :class:`Op`
@@ -116,7 +116,7 @@ Here's an example for :class:`DimShuffle`:
 .. tab-set::
-        .. tab-item:: JAX     
+        .. tab-item:: JAX
            .. code:: python
@@ -134,7 +134,7 @@ Here's an example for :class:`DimShuffle`:
                        res = jnp.copy(res)
                    return res
        .. tab-item:: Numba
            .. code:: python
@@ -465,7 +465,7 @@ Step 4: Write tests
    .. tab-item:: JAX
        Test that your registered `Op` is working correctly by adding tests to the
-        appropriate test suites in PyTensor (e.g. in ``tests.link.jax``). 
+        appropriate test suites in PyTensor (e.g. in ``tests.link.jax``).
        The tests should ensure that your implementation can
        handle the appropriate types of inputs and produce outputs equivalent to `Op.perform`.
        Check the existing tests for the general outline of these kinds of tests. In
@@ -478,7 +478,7 @@ Step 4: Write tests
        Here's a small example of a test for :class:`CumOp` above:
        .. code:: python
            import numpy as np
            import pytensor.tensor as pt
            from pytensor.configdefaults import config
@@ -514,22 +514,22 @@ Step 4: Write tests
        .. code:: python
            import pytest
            def test_jax_CumOp():
                """Test JAX conversion of the `CumOp` `Op`."""
                a = pt.matrix("a")
                a.tag.test_value = np.arange(9, dtype=config.floatX).reshape((3, 3))
                with pytest.raises(NotImplementedError):
                    out = pt.cumprod(a, axis=1)
                    fgraph = FunctionGraph([a], [out])
                    compare_jax_and_py(fgraph, [get_test_value(i) for i in fgraph.inputs])
    .. tab-item:: Numba
        Test that your registered `Op` is working correctly by adding tests to the
-        appropriate test suites in PyTensor (e.g. in ``tests.link.numba``). 
+        appropriate test suites in PyTensor (e.g. in ``tests.link.numba``).
        The tests should ensure that your implementation can
        handle the appropriate types of inputs and produce outputs equivalent to `Op.perform`.
        Check the existing tests for the general outline of these kinds of tests. In
@@ -542,7 +542,7 @@ Step 4: Write tests
        Here's a small example of a test for :class:`CumOp` above:
        .. code:: python
            from tests.link.numba.test_basic import compare_numba_and_py
            from pytensor.graph import FunctionGraph
            from pytensor.compile.sharedvalue import SharedVariable
@@ -561,11 +561,11 @@ Step 4: Write tests
                        if not isinstance(i, SharedVariable | Constant)
                    ],
                )
    .. tab-item:: Pytorch
        Test that your registered `Op` is working correctly by adding tests to the
        appropriate test suites in PyTensor (``tests.link.pytorch``). The tests should ensure that your implementation can
        handle the appropriate types of inputs and produce outputs equivalent to `Op.perform`.
@@ -579,7 +579,7 @@ Step 4: Write tests
        Here's a small example of a test for :class:`CumOp` above:
        .. code:: python
            import numpy as np
            import pytest
            import pytensor.tensor as pt
@@ -592,7 +592,7 @@ Step 4: Write tests
                ["float64", "int64"],
            )
            @pytest.mark.parametrize(
-                "axis", 
+                "axis",
                [None, 1, (0,)],
            )
            def test_pytorch_CumOp(axis, dtype):
@@ -650,4 +650,4 @@ as reported in issue `#654 <https://github.com/pymc-devs/pytensor/issues/654>`_.
 All jitted functions now must have constant shape, which means a graph like the
 one of :class:`Eye` can never be translated to JAX, since it's fundamentally a
 function with dynamic shapes. In other words, only PyTensor graphs with static shapes
 can be translated to JAX at the moment.
\ No newline at end of file
--- a/doc/extending/type.rst
+++ b/doc/extending/type.rst
@@ -333,7 +333,7 @@ returns eitehr a new transferred variable (which can be the same as
 the input if no transfer is necessary) or returns None if the transfer
 can't be done.
-Then register that function by calling :func:`register_transfer()`
+Then register that function by calling :func:`register_transfer`
 with it as argument.
 An example

--- a/doc/library/compile/io.rst
+++ b/doc/library/compile/io.rst
@@ -36,7 +36,7 @@ The ``inputs`` argument to ``pytensor.function`` is a list, containing the ``Var
      ``self.<name>``. The default value is ``None``.
      ``value``: literal or ``Container``. The initial/default value for this
-        input. If update is`` None``, this input acts just like
+        input. If update is ``None``, this input acts just like
        an argument with a default value in Python. If update is not ``None``,
        changes to this
        value will "stick around", whether due to an update or a user's

--- a/doc/library/config.rst
+++ b/doc/library/config.rst
@@ -226,7 +226,7 @@ import ``pytensor`` and print the config variable, as in:
    in the future.
    The ``'numpy+floatX'`` setting attempts to mimic NumPy casting rules,
-    although it prefers to use ``float32` `numbers instead of ``float64`` when
+    although it prefers to use ``float32`` numbers instead of ``float64`` when
    ``config.floatX`` is set to ``'float32'`` and the associated data is not
    explicitly typed as ``float64`` (e.g. regular Python floats).  Note that
    ``'numpy+floatX'`` is not currently behaving exactly as planned (it is a

--- a/doc/library/tensor/basic.rst
+++ b/doc/library/tensor/basic.rst
@@ -908,8 +908,8 @@ Reductions
    :Parameter: *x* -  symbolic Tensor (or compatible)
    :Parameter: *axis* - axis or axes along which to compute the maximum
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: maximum of *x* along *axis*
    axis can be:
@@ -922,8 +922,8 @@ Reductions
    :Parameter: *x* - symbolic Tensor (or compatible)
    :Parameter: *axis* - axis along which to compute the index of the maximum
    :Parameter: *keepdims* - (boolean) If this is set to True, the axis which is reduced is
-		left in the result as a dimension with size one. With this option, the result
+        left in the result as a dimension with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: the index of the maximum value along a given axis
    if ``axis == None``, `argmax` over the flattened tensor (like NumPy)
@@ -933,8 +933,8 @@ Reductions
    :Parameter: *x* - symbolic Tensor (or compatible)
    :Parameter: *axis* - axis along which to compute the maximum and its index
    :Parameter: *keepdims* - (boolean) If this is set to True, the axis which is reduced is
-		left in the result as a dimension with size one. With this option, the result
+        left in the result as a dimension with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: the maximum value along a given axis and its index.
    if ``axis == None``, `max_and_argmax` over the flattened tensor (like NumPy)
@@ -944,8 +944,8 @@ Reductions
    :Parameter: *x* -  symbolic Tensor (or compatible)
    :Parameter: *axis* - axis or axes along which to compute the minimum
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: minimum of *x* along *axis*
    `axis` can be:
@@ -958,8 +958,8 @@ Reductions
    :Parameter: *x* - symbolic Tensor (or compatible)
    :Parameter: *axis* - axis along which to compute the index of the minimum
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: the index of the minimum value along a given axis
    if ``axis == None``, `argmin` over the flattened tensor (like NumPy)
@@ -980,8 +980,8 @@ Reductions
        This default dtype does _not_ depend on the value of "acc_dtype".
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Parameter: *acc_dtype* -  The dtype of the internal accumulator.
        If None (default), we use the dtype in the list below,
@@ -1015,8 +1015,8 @@ Reductions
        This default dtype does _not_ depend on the value of "acc_dtype".
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Parameter: *acc_dtype* -  The dtype of the internal accumulator.
        If None (default), we use the dtype in the list below,
@@ -1031,16 +1031,16 @@ Reductions
         as we need to handle 3 different cases: without zeros in the
         input reduced group, with 1 zero or with more zeros.
-	 This could slow you down, but more importantly, we currently
+    This could slow you down, but more importantly, we currently
-	 don't support the second derivative of the 3 cases. So you
+    don't support the second derivative of the 3 cases. So you
-	 cannot take the second derivative of the default prod().
+    cannot take the second derivative of the default prod().
-	 To remove the handling of the special cases of 0 and so get
+    To remove the handling of the special cases of 0 and so get
-	 some small speed up and allow second derivative set
+    some small speed up and allow second derivative set
-	 ``no_zeros_in_inputs`` to ``True``. It defaults to ``False``.
+    ``no_zeros_in_inputs`` to ``True``. It defaults to ``False``.
-	 **It is the user responsibility to make sure there are no zeros
+    **It is the user's responsibility to make sure there are no zeros
-	 in the inputs. If there are, the grad will be wrong.**
+    in the inputs. If there are, the grad will be wrong.**
    :Returns: product of every term in *x* along *axis*
@@ -1058,13 +1058,13 @@ Reductions
        done in float64 (acc_dtype would be float64 by default),
        but that result will be casted back in float32.
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Parameter: *acc_dtype* -  The dtype of the internal accumulator of the
        inner summation. This will not necessarily be the dtype of the
        output (in particular if it is a discrete (int/uint) dtype, the
        output will be in a float type).  If None, then we use the same
-        rules as :func:`sum()`.
+        rules as :func:`sum`.
    :Returns: mean value of *x* along *axis*
    `axis` can be:
@@ -1077,8 +1077,8 @@ Reductions
    :Parameter: *x* -  symbolic Tensor (or compatible)
    :Parameter: *axis* - axis or axes along which to compute the variance
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: variance of *x* along *axis*
    `axis` can be:
@@ -1091,8 +1091,8 @@ Reductions
    :Parameter: *x* -  symbolic Tensor (or compatible)
    :Parameter: *axis* - axis or axes along which to compute the standard deviation
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: variance of *x* along *axis*
    `axis` can be:
@@ -1105,8 +1105,8 @@ Reductions
    :Parameter: *x* -  symbolic Tensor (or compatible)
    :Parameter: *axis* - axis or axes along which to apply 'bitwise and'
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: bitwise and of *x* along *axis*
    `axis` can be:
@@ -1119,8 +1119,8 @@ Reductions
    :Parameter: *x* -  symbolic Tensor (or compatible)
    :Parameter: *axis* - axis or axes along which to apply bitwise or
    :Parameter: *keepdims* - (boolean) If this is set to True, the axes which are reduced are
-		left in the result as dimensions with size one. With this option, the result
+        left in the result as dimensions with size one. With this option, the result
-		will broadcast correctly against the original tensor.
+        will broadcast correctly against the original tensor.
    :Returns: bitwise or of *x* along *axis*
    `axis` can be:
@@ -1745,7 +1745,7 @@ Linear Algebra
              when indexed, so that each returned argument has the same shape.
              The dimensions and number of the output arrays are equal to the
              number of indexing dimensions. If the step length is not a complex
-	      number, then the stop is not inclusive.
+              number, then the stop is not inclusive.
    Example:

--- a/doc/library/tensor/conv.rst
+++ b/doc/library/tensor/conv.rst
@@ -8,4 +8,4 @@
 .. moduleauthor:: LISA, PyMC Developers, PyTensor Developers
 .. automodule:: pytensor.tensor.conv
    :members:
\ No newline at end of file
--- a/doc/optimizations.rst
+++ b/doc/optimizations.rst
@@ -262,8 +262,8 @@ Optimization                                              o4             o3  o2
    local_remove_all_assert
        This is an unsafe optimization.
        For the fastest possible PyTensor, this optimization can be enabled by
-	setting ``optimizer_including=local_remove_all_assert`` which will
+        setting ``optimizer_including=local_remove_all_assert`` which will
-	remove all assertions in the graph for checking user inputs are valid.
+        remove all assertions in the graph for checking user inputs are valid.
        Use this optimization if you are sure everything is valid in your graph.
-	See :ref:`unsafe_rewrites`
+        See :ref:`unsafe_rewrites`
--- a/doc/tutorial/adding.rst
+++ b/doc/tutorial/adding.rst
@@ -7,12 +7,12 @@ Baby Steps - Algebra
 Understanding Tensors
 ===========================
-Before diving into PyTensor, it's essential to understand the fundamental 
+Before diving into PyTensor, it's essential to understand the fundamental
-data structure it operates on: the *tensor*. A *tensor* is a multi-dimensional 
+data structure it operates on: the *tensor*. A *tensor* is a multi-dimensional
 array that serves as the foundation for symbolic computations.
-tensors can represent anything from a single number (scalar) to 
+Tensors can represent anything from a single number (scalar) to
-complex multi-dimensional arrays. Each tensor has a type that dictates its 
+complex multi-dimensional arrays. Each tensor has a type that dictates its
 dimensionality and the kind of data it holds.
 For example, the following code creates a symbolic scalar and a symbolic matrix:
@@ -20,11 +20,11 @@ For example, the following code creates a symbolic scalar and a symbolic matrix:
 >>> x = pt.scalar('x')
 >>> y = pt.matrix('y')
-Here, `scalar` refers to a tensor with zero dimensions, while `matrix` refers 
+Here, `scalar` refers to a tensor with zero dimensions, while `matrix` refers
-to a tensor with two dimensions. The same principles apply to tensors of other 
+to a tensor with two dimensions. The same principles apply to tensors of other
 dimensions.
-For more information about tensors and their associated operations can be 
+More information about tensors and their associated operations can be
 found here: :ref:`tensor <libdoc_tensor>`.

--- a/doc/tutorial/prng.rst
+++ b/doc/tutorial/prng.rst
@@ -51,10 +51,10 @@ In the long-run this deterministic mapping function should produce draws that ar
 For illustration we implement a very bad mapping function from a bit generator to uniform draws.
 .. code:: python
    def bad_uniform_rng(rng, size):
        bit_generator = rng.bit_generator
        uniform_draws = np.empty(size)
        for i in range(size):
            bit_generator.advance(1)
@@ -175,9 +175,9 @@ Shared variables are global variables that don't need (and can't) be passed as e
 >>> rng = pytensor.shared(np.random.default_rng(123))
 >>> next_rng, x = pt.random.uniform(rng=rng).owner.outputs
->>> 
+>>>
 >>> f = pytensor.function([], [next_rng, x])
->>> 
+>>>
 >>> next_rng_val, x = f()
 >>> print(x)
 0.6823518632481435
@@ -200,9 +200,9 @@ In this case it makes sense to simply replace the original value by the next_rng
 >>> rng = pytensor.shared(np.random.default_rng(123))
 >>> next_rng, x = pt.random.uniform(rng=rng).owner.outputs
->>> 
+>>>
 >>> f = pytensor.function([], x, updates={rng: next_rng})
->>> 
+>>>
 >>> f(), f(), f()
 (array(0.68235186), array(0.05382102), array(0.22035987))
@@ -210,10 +210,10 @@ Another way of doing that is setting a default_update in the shared RNG variable
 >>> rng = pytensor.shared(np.random.default_rng(123))
 >>> next_rng, x = pt.random.uniform(rng=rng).owner.outputs
->>> 
+>>>
 >>> rng.default_update = next_rng
 >>> f = pytensor.function([], x)
->>> 
+>>>
 >>> f(), f(), f()
 (array(0.68235186), array(0.05382102), array(0.22035987))
@@ -232,12 +232,12 @@ the SciPy-like API of `pytensor.tensor.random`. Full documentation can be found
 >>> print(f(), f(), f())
 0.19365083425294516 0.7541389670292019 0.2762903411491048
-Shared RNGs are created by default 
+Shared RNGs are created by default
 ----------------------------------
 If no rng is provided to a RandomVariable Op, a shared RandomGenerator is created automatically.
-This can give the appearance that PyTensor functions of random variables don't have any variable inputs, 
+This can give the appearance that PyTensor functions of random variables don't have any variable inputs,
 but this is not true.
 They are simply shared variables.
@@ -252,10 +252,10 @@ Shared RNG variables can be "reseeded" by setting them to the original RNG
 >>> rng = pytensor.shared(np.random.default_rng(123))
 >>> next_rng, x = pt.random.normal(rng=rng).owner.outputs
->>> 
+>>>
 >>> rng.default_update = next_rng
 >>> f = pytensor.function([], x)
->>> 
+>>>
 >>> print(f(), f())
 >>> rng.set_value(np.random.default_rng(123))
 >>> print(f(), f())
@@ -267,7 +267,7 @@ RandomStreams provide a helper method to achieve the same
 >>> rng = pt.random.RandomStream(seed=123)
 >>> x = srng.normal()
 >>> f = pytensor.function([], x)
->>> 
+>>>
 >>> print(f(), f())
 >>> srng.seed(123)
 >>> print(f(), f())
@@ -373,7 +373,7 @@ uniform_rv{"(),()->()"}.1 [id A] d={0: [0]} 0
 >>> rng = pytensor.shared(np.random.default_rng(), name="rng")
 >>> next_rng, x = pt.random.uniform(rng=rng).owner.outputs
->>> 
+>>>
 >>> inplace_f = pytensor.function([], [x], updates={rng: next_rng})
 >>> pytensor.dprint(inplace_f, print_destroy_map=True) # doctest: +SKIP
 uniform_rv{"(),()->()"}.1 [id A] d={0: [0]} 0
@@ -392,15 +392,15 @@ It's common practice to use separate RNG variables for each RandomVariable in Py
 >>> rng_x = pytensor.shared(np.random.default_rng(123), name="rng_x")
 >>> rng_y = pytensor.shared(np.random.default_rng(456), name="rng_y")
->>> 
+>>>
 >>> next_rng_x, x = pt.random.normal(loc=0, scale=10, rng=rng_x).owner.outputs
 >>> next_rng_y, y = pt.random.normal(loc=x, scale=0.1, rng=rng_y).owner.outputs
->>> 
+>>>
 >>> next_rng_x.name = "next_rng_x"
 >>> next_rng_y.name = "next_rng_y"
 >>> rng_x.default_update = next_rng_x
 >>> rng_y.default_update = next_rng_y
->>> 
+>>>
 >>> f = pytensor.function([], [x, y])
 >>> pytensor.dprint(f, print_type=True) # doctest: +SKIP
 normal_rv{"(),()->()"}.1 [id A] <Scalar(float64, shape=())> 0
@@ -430,7 +430,7 @@ This is what RandomStream does as well
 >>> srng = pt.random.RandomStream(seed=123)
 >>> x = srng.normal(loc=0, scale=10)
 >>> y = srng.normal(loc=x, scale=0.1)
->>> 
+>>>
 >>> f = pytensor.function([], [x, y])
 >>> pytensor.dprint(f, print_type=True) # doctest: +SKIP
 normal_rv{"(),()->()"}.1 [id A] <Scalar(float64, shape=())> 0
@@ -462,7 +462,7 @@ We could have used a single rng.
 >>> next_rng_x.name = "next_rng_x"
 >>> next_rng_y, y = pt.random.normal(loc=100, scale=1, rng=next_rng_x).owner.outputs
 >>> next_rng_y.name = "next_rng_y"
->>> 
+>>>
 >>> f = pytensor.function([], [x, y], updates={rng: next_rng_y})
 >>> pytensor.dprint(f, print_type=True) # doctest: +SKIP
 normal_rv{"(),()->()"}.1 [id A] <Scalar(float64, shape=())> 0
@@ -508,10 +508,10 @@ Scan works very similar to a function (that is called repeatedly inside an outer
 This means that random variables will always return the same output unless updates are specified.
 >>> rng = pytensor.shared(np.random.default_rng(123), name="rng")
->>> 
+>>>
 >>> def constant_step(rng):
 >>>     return pt.random.normal(rng=rng)
->>> 
+>>>
 >>> draws, updates = pytensor.scan(
 >>>     fn=constant_step,
 >>>     outputs_info=[None],
@@ -519,7 +519,7 @@ This means that random variables will always return the same output unless updat
 >>>     n_steps=5,
 >>>     strict=True,
 >>> )
->>> 
+>>>
 >>> f = pytensor.function([], draws, updates=updates)
 >>> f(), f()
 (array([-0.98912135, -0.98912135, -0.98912135, -0.98912135, -0.98912135]),
@@ -528,12 +528,12 @@ This means that random variables will always return the same output unless updat
 Scan accepts an update dictionary as an output to tell how shared variables should be updated after every iteration.
 >>> rng = pytensor.shared(np.random.default_rng(123))
->>> 
+>>>
 >>> def random_step(rng):
 >>>     next_rng, x = pt.random.normal(rng=rng).owner.outputs
 >>>     scan_update = {rng: next_rng}
 >>>     return x, scan_update
->>> 
+>>>
 >>> draws, updates = pytensor.scan(
 >>>     fn=random_step,
 >>>     outputs_info=[None],
@@ -541,7 +541,7 @@ Scan accepts an update dictionary as an output to tell how shared variables shou
 >>>     n_steps=5,
 >>>     strict=True
 >>> )
->>> 
+>>>
 >>> f = pytensor.function([], draws)
 >>> f(), f()
 (array([-0.98912135, -0.36778665,  1.28792526,  0.19397442,  0.9202309 ]),
@@ -563,7 +563,7 @@ Like function, scan also respects shared variables default updates
 >>>     next_rng, x = pt.random.normal(rng=rng).owner.outputs
 >>>     rng.default_update = next_rng
 >>>     return x
->>> 
+>>>
 >>> draws, updates = pytensor.scan(
 >>>     fn=random_step,
 >>>     outputs_info=[None],
@@ -589,10 +589,10 @@ As expected, Scan only looks at default updates for shared variables created ins
 >>> rng = pytensor.shared(np.random.default_rng(123), name="rng")
 >>> next_rng, x = pt.random.normal(rng=rng).owner.outputs
 >>> rng.default_update = next_rng
->>>     
+>>>
->>> def random_step(rng, x):    
+>>> def random_step(rng, x):
 >>>     return x
->>> 
+>>>
 >>> draws, updates = pytensor.scan(
 >>>     fn=random_step,
 >>>     outputs_info=[None],
@@ -611,11 +611,11 @@ As expected, Scan only looks at default updates for shared variables created ins
 RNGs in Scan are only supported via shared variables in non-sequences at the moment
 >>> rng = pt.random.type.RandomGeneratorType()("rng")
->>> 
+>>>
 >>> def random_step(rng):
 >>>     next_rng, x = pt.random.normal(rng=rng).owner.outputs
 >>>     return next_rng, x
->>> 
+>>>
 >>> try:
 >>>     (next_rngs, draws), updates = pytensor.scan(
 >>>         fn=random_step,
@@ -635,21 +635,21 @@ OpFromGraph
 In contrast to Scan, non-shared RNG variables can be used directly in OpFromGraph
 >>> from pytensor.compile.builders import OpFromGraph
->>> 
+>>>
 >>> rng = pt.random.type.RandomGeneratorType()("rng")
->>> 
+>>>
 >>> def lognormal(rng):
 >>>     next_rng, x = pt.random.normal(rng=rng).owner.outputs
 >>>     return [next_rng, pt.exp(x)]
->>> 
+>>>
 >>> lognormal_ofg = OpFromGraph([rng], lognormal(rng))
 >>> rng_x = pytensor.shared(np.random.default_rng(1), name="rng_x")
 >>> rng_y = pytensor.shared(np.random.default_rng(2), name="rng_y")
->>> 
+>>>
 >>> next_rng_x, x = lognormal_ofg(rng_x)
->>> next_rng_y, y = lognormal_ofg(rng_y) 
+>>> next_rng_y, y = lognormal_ofg(rng_y)
->>> 
+>>>
 >>> f = pytensor.function([], [x, y], updates={rng_x: next_rng_x, rng_y: next_rng_y})
 >>> f(), f(), f()
@@ -749,4 +749,4 @@ PyTensor could provide shared JAX-like RNGs and allow RandomVariables to accept
 but that would break the spirit of one graph `->` multiple backends.
 Alternatively, PyTensor could try to use a more general type for RNGs that can be used across different backends,
 either directly or after some conversion operation (if such operations can be implemented in the different backends).
\ No newline at end of file
--- a/doc/tutorial/symbolic_graphs.rst
+++ b/doc/tutorial/symbolic_graphs.rst
 :orphan:
-This page has been moved. Please refer to: :ref:`graphstructures`. 
+This page has been moved. Please refer to: :ref:`graphstructures`.