Commit 0f8d876e, author: Ricardo Vieira, committer: Ricardo Vieira

Make BLAS flags check lazy and more actionable

This replaces the old warning, which no longer applies, with a more informative and actionable one. The old warning was meant for Ops that might use the alternative blas_headers, which rely on the NumPy C-API. However, regular PyTensor users have not hit this path for a while. The only Op that would use C code with these alternative headers is the GEMM Op, which is not introduced by the current rewrites. Instead, Dot22 or Dot22Scalar are introduced, and they refuse to generate C code altogether if the BLAS flags are missing.
Parent 5fb56bab
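As a rough illustration of what "lazy" means in the commit title, here is a minimal sketch in which an expensive default is computed only on first access and then cached. This is not PyTensor's actual config machinery: `LazyFlag` and `slow_probe` are hypothetical names invented for this example, and the probe is a stand-in for the real BLAS search.

```python
class LazyFlag:
    """Hypothetical config entry whose default is computed on first access."""

    def __init__(self, compute_default):
        self._compute_default = compute_default
        self._value = None
        self._resolved = False

    def set(self, value):
        # A user-supplied value means the expensive probe never runs.
        self._value = value
        self._resolved = True

    def get(self):
        if not self._resolved:
            self._value = self._compute_default()
            self._resolved = True
        return self._value


calls = []


def slow_probe():
    # Stand-in for an expensive search for a linkable BLAS library (assumption).
    calls.append("probe")
    return ""


blas_ldflags = LazyFlag(slow_probe)
created_without_probe = calls == []  # creating the entry does not run the probe

first = blas_ldflags.get()   # runs the probe once
second = blas_ldflags.get()  # served from the cached value
print(created_without_probe, len(calls))
```

The same shape explains why setting the flag explicitly is worthwhile: calling `set()` before the first `get()` skips the probe altogether.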
...@@ -145,44 +145,64 @@ How do I configure/test my BLAS library ...@@ -145,44 +145,64 @@ How do I configure/test my BLAS library
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are many ways to configure BLAS for PyTensor. This is done with the PyTensor There are many ways to configure BLAS for PyTensor. This is done with the PyTensor
flags ``blas__ldflags`` (:ref:`libdoc_config`). The default is to use the BLAS flags ``blas__ldflags`` (:ref:`libdoc_config`). If not specified, PyTensor will
installation information in NumPy, accessible via attempt to find a local BLAS library to link against, prioritizing specialized implementations.
``numpy.__config__.show()``. You can tell pytensor to use a different The details can be found in :func:`pytensor.link.c.cmodule.default_blas_ldflags`.
version of BLAS, in case you did not compile NumPy with a fast BLAS or if NumPy
was compiled with a static library of BLAS (the latter is not supported in
PyTensor).
The short way to configure the PyTensor flags ``blas__ldflags`` is by setting the Users can manually set the PyTensor flags ``blas__ldflags`` to link against a
environment variable :envvar:`PYTENSOR_FLAGS` to ``blas__ldflags=XXX`` (in bash specific version. This is useful even if the default version is the desired one,
``export PYTENSOR_FLAGS=blas__ldflags=XXX``) as it will avoid the costly work of trying to find the best BLAS library at runtime.
The ``${HOME}/.pytensorrc`` file is the simplest way to set a relatively The PyTensor flags can be set in a few ways:
permanent option like this one. Add a ``[blas]`` section with an ``ldflags``
entry like this: 1. In the ``${HOME}/.pytensorrc`` file.
.. code-block:: cfg .. code-block:: cfg
# other stuff can go here # other stuff can go here
[blas] [blas]
ldflags = -lf77blas -latlas -lgfortran #put your flags here ldflags = -llapack -lblas -lcblas # put your flags here
# other stuff can go here # other stuff can go here
For more information on the formatting of ``~/.pytensorrc`` and the 2. In BASH before running your script:
configuration options that you can put there, see :ref:`libdoc_config`.
.. code-block:: bash
export PYTENSOR_FLAGS="blas__ldflags='-llapack -lblas -lcblas'"
3. In an Ipython/Jupyter notebook before importing PyTensor:
.. code-block:: python
%set_env PYTENSOR_FLAGS=blas__ldflags='-llapack -lblas -lcblas'
4. In `pytensor.config` directly:
.. code-block:: python
import pytensor
pytensor.config.blas__ldflags = '-llapack -lblas -lcblas'
(For more information on the formatting of ``~/.pytensorrc`` and the
configuration options that you can put there, see :ref:`libdoc_config`.)
You can find the default BLAS library that PyTensor is linking against by
checking ``pytensor.config.blas__ldflags``
or running :func:`pytensor.link.c.cmodule.default_blas_ldflags`.
Here are some different way to configure BLAS: Here are some different way to configure BLAS:
0) Do nothing and use the default config, which is to link against the same 0) Do nothing and use the default config.
BLAS against which NumPy was built. This does not work in the case NumPy was This will usually work great for installation via conda/mamba/pixi (conda-forge channel).
compiled with a static library (e.g. ATLAS is compiled by default only as a It will usually fail to link altogether for installation via pip.
static library).
1) Disable the usage of BLAS and fall back on NumPy for dot products. To do 1) Disable the usage of BLAS and fall back on NumPy for dot products. To do
this, set the value of ``blas__ldflags`` as the empty string (ex: ``export this, set the value of ``blas__ldflags`` as the empty string.
PYTENSOR_FLAGS=blas__ldflags=``). Depending on the kind of matrix operations your Depending on the kind of matrix operations your PyTensor code performs,
PyTensor code performs, this might slow some things down (vs. linking with BLAS this might slow some things down (vs. linking with BLAS directly).
directly).
2) You can install the default (reference) version of BLAS if the NumPy version 2) You can install the default (reference) version of BLAS if the NumPy version
(against which PyTensor links) does not work. If you have root or sudo access in (against which PyTensor links) does not work. If you have root or sudo access in
...@@ -208,10 +228,29 @@ correctly (for example, for MKL this might be ``-lmkl -lguide -lpthread`` or ...@@ -208,10 +228,29 @@ correctly (for example, for MKL this might be ``-lmkl -lguide -lpthread`` or
``-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -liomp5 -lmkl_mc ``-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -liomp5 -lmkl_mc
-lpthread``). -lpthread``).
5) Use another backend such as Numba or JAX that perform their own BLAS optimizations,
by setting the configuration mode to ``"NUMBA"`` or ``"JAX"`` and making sure those packages are installed.
This configuration mode can be set in all the ways that the BLAS flags can be set, described above.
Alternatively, you can pass ``mode='NUMBA'`` when compiling individual PyTensor functions without changing the default,
or use the ``config.change_flags`` context manager.
.. code-block:: python
from pytensor import function, config
from pytensor.tensor import matrix
x = matrix('x')
y = x @ x.T
f = function([x], y, mode='NUMBA')
with config.change_flags(mode='NUMBA'):
# compiling function that benefits from BLAS using NUMBA
f = function([x], y)
.. note:: .. note::
Make sure your BLAS Make sure your BLAS libraries are available as dynamically-loadable libraries.
libraries are available as dynamically-loadable libraries.
ATLAS is often installed only as a static library. PyTensor is not able to ATLAS is often installed only as a static library. PyTensor is not able to
use this static library. Your ATLAS installation might need to be modified use this static library. Your ATLAS installation might need to be modified
to provide dynamically loadable libraries. (On Linux this to provide dynamically loadable libraries. (On Linux this
...@@ -267,7 +306,7 @@ configuration information. Then, it will print the running time of the same ...@@ -267,7 +306,7 @@ configuration information. Then, it will print the running time of the same
benchmarks for your installation. Try to find a CPU similar to yours in benchmarks for your installation. Try to find a CPU similar to yours in
the table, and check that the single-threaded timings are roughly the same. the table, and check that the single-threaded timings are roughly the same.
PyTensor should link to a parallel version of Blas and use all cores PyTensor should link to a parallel version of BLAS and use all cores
when possible. By default it should use all cores. Set the environment when possible. By default it should use all cores. Set the environment
variable "OMP_NUM_THREADS=N" to specify to use N threads. variable "OMP_NUM_THREADS=N" to specify to use N threads.
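The config-file and environment-variable forms described in the documentation change above can also be generated from Python. The following is a minimal sketch using only the standard library. It writes to a temporary file rather than the real ``~/.pytensorrc``, the flag string is the example value from the docs rather than a recommendation, and the file name `pytensorrc_example` is made up for this illustration.

```python
import configparser
import os
import tempfile

# Example flags taken from the docs above; adjust them to your system.
ldflags = "-llapack -lblas -lcblas"

# 1. Config-file form: a [blas] section with an ldflags entry.
cfg = configparser.ConfigParser()
cfg["blas"] = {"ldflags": ldflags}
rc_path = os.path.join(tempfile.mkdtemp(), "pytensorrc_example")  # not the real rc file
with open(rc_path, "w") as fh:
    cfg.write(fh)

# 2. Environment form: the variable has to be in place before PyTensor is imported.
#    A copy of the environment is used here so the current process is left untouched.
env = dict(os.environ)
env["PYTENSOR_FLAGS"] = f"blas__ldflags='{ldflags}'"

# Read the file back to confirm that the entry round-trips.
check = configparser.ConfigParser()
check.read(rc_path)
print(check["blas"]["ldflags"])
print(env["PYTENSOR_FLAGS"])
```

The `env` dictionary could then be passed to a child process, for example through the `env` argument of `subprocess.run`.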
......
...@@ -1985,7 +1985,7 @@ class Compiler: ...@@ -1985,7 +1985,7 @@ class Compiler:
) )
def try_blas_flag(flags): def try_blas_flag(flags) -> str:
test_code = textwrap.dedent( test_code = textwrap.dedent(
"""\ """\
extern "C" double ddot_(int*, double*, int*, double*, int*); extern "C" double ddot_(int*, double*, int*, double*, int*);
...@@ -2734,12 +2734,30 @@ sure you have the right version you *will* get wrong results. ...@@ -2734,12 +2734,30 @@ sure you have the right version you *will* get wrong results.
) )
def default_blas_ldflags(): def default_blas_ldflags() -> str:
"""Read local NumPy and MKL build settings and construct `ld` flags from them. """Look for an available BLAS implementation in the system.
This function tries to compile a simple C code that uses the BLAS
if the required files are found in the system.
It sequentially tries to link to the following implementations, until one is found:
1. Intel MKL with Intel OpenMP threading
2. Intel MKL with GNU OpenMP threading
3. Lapack + BLAS
4. BLAS alone
5. OpenBLAS
Returns Returns
------- -------
str blas flags: str
Blas flags needed to link to the BLAS implementation found in the system.
If no BLAS implementation is found, an empty string is returned.
Notes
-----
This function is triggered when `pytensor.config.blas__ldflags` is not given a user
default, and it is first accessed at runtime. It can be rather slow, so it is advised
to cache the results of this function in PYTENSORRC configuration file or
PyTensor environment flags.
""" """
...@@ -2788,7 +2806,7 @@ def default_blas_ldflags(): ...@@ -2788,7 +2806,7 @@ def default_blas_ldflags():
def check_libs( def check_libs(
all_libs, required_libs, extra_compile_flags=None, cxx_library_dirs=None all_libs, required_libs, extra_compile_flags=None, cxx_library_dirs=None
): ) -> str:
if cxx_library_dirs is None: if cxx_library_dirs is None:
cxx_library_dirs = [] cxx_library_dirs = []
if extra_compile_flags is None: if extra_compile_flags is None:
...@@ -2938,6 +2956,14 @@ def default_blas_ldflags(): ...@@ -2938,6 +2956,14 @@ def default_blas_ldflags():
except Exception as e: except Exception as e:
_logger.debug(e) _logger.debug(e)
_logger.debug("Failed to identify blas ldflags. Will leave them empty.") _logger.debug("Failed to identify blas ldflags. Will leave them empty.")
warnings.warn(
"PyTensor could not link to a BLAS installation. Operations that might benefit from BLAS will be severely degraded.\n"
"This usually happens when PyTensor is installed via pip. We recommend it be installed via conda/mamba/pixi instead.\n"
"Alternatively, you can use an experimental backend such as Numba or JAX that perform their own BLAS optimizations, "
"by setting `pytensor.config.mode = 'NUMBA'` or passing `mode='NUMBA'` when compiling a PyTensor function.\n"
"For more options and details see https://pytensor.readthedocs.io/en/latest/troubleshooting.html#how-do-i-configure-test-my-blas-library",
UserWarning,
)
return "" return ""
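The overall control flow of the change above (try candidates in a fixed order, return the first that links, otherwise warn and return an empty string) can be sketched as follows. This is an illustration under stated assumptions, not the code in `default_blas_ldflags`: `find_blas_flags`, `CANDIDATES`, and the flag strings are hypothetical, and the `can_link` callback stands in for compiling a small test program.

```python
import warnings

# Illustrative candidates in priority order; the flag strings are assumptions.
CANDIDATES = [
    ("Lapack + BLAS", "-llapack -lblas"),
    ("BLAS alone", "-lblas"),
    ("OpenBLAS", "-lopenblas"),
]


def find_blas_flags(candidates, can_link):
    """Return the first flag string accepted by `can_link`, or "" if none is."""
    for _name, flags in candidates:
        if can_link(flags):
            return flags
    # Nothing linked: emit an actionable warning instead of failing silently.
    warnings.warn(
        "Could not link to a BLAS installation; set blas__ldflags manually "
        "or switch to another backend.",
        UserWarning,
    )
    return ""


# Pretend that only OpenBLAS is installed on this machine.
available = {"-lopenblas"}
found = find_blas_flags(CANDIDATES, lambda flags: flags in available)
print(found)
```

Because the result is a plain string, it can be stored in the config file or environment once so that the search does not have to be repeated at runtime.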
......
...@@ -742,6 +742,11 @@ def blas_header_text(): ...@@ -742,6 +742,11 @@ def blas_header_text():
blas_code = "" blas_code = ""
if not config.blas__ldflags: if not config.blas__ldflags:
# This code can only be reached by compiling a function with a manually specified GEMM Op.
# Normal PyTensor usage will end up with Dot22 or Dot22Scalar instead,
# which opt out of C-code completely if the blas flags are missing
_logger.warning("Using NumPy C-API based implementation for BLAS functions.")
# Include the Numpy version implementation of [sd]gemm_. # Include the Numpy version implementation of [sd]gemm_.
current_filedir = Path(__file__).parent current_filedir = Path(__file__).parent
blas_common_filepath = current_filedir / "c_code/alt_blas_common.h" blas_common_filepath = current_filedir / "c_code/alt_blas_common.h"
...@@ -1003,10 +1008,6 @@ def blas_header_text(): ...@@ -1003,10 +1008,6 @@ def blas_header_text():
return header + blas_code return header + blas_code
if not config.blas__ldflags:
_logger.warning("Using NumPy C-API based implementation for BLAS functions.")
def mkl_threads_text(): def mkl_threads_text():
"""C header for MKL threads interface""" """C header for MKL threads interface"""
header = """ header = """
......