Commit 5071cf53 authored by James Bergstra, committed by Frederic

comments in blas.py

Parent f85856b2
@@ -6,18 +6,26 @@ Learn more about BLAS here:
 The standard BLAS libraries implement what is called "legacy BLAS" in that
 document.
-This documentation section describes Theano's BLAS optimization
-pipeline.
+This documentation describes Theano's BLAS optimization pipeline.
 Where there is a discrepancy between how things do work and how they *should*
-work, both aspects should be documented. It helps keep a broader agenda in view
-even while fixing little bugs etc. from day to day.
+work, both aspects should be documented.
+There are four kinds of BLAS Ops in Theano:
+- Python implementations (this file)
+- SciPy-based (blas_scipy)
+- C-based (blas_c)
+- CUDA-based (theano.sandbox.cuda.blas)
+:note: Unfortunately (because it's confusing) this file currently contains Ops
+    that contain both Python and C versions. I think it would be better to
+    move the C implementations to blas_c so that this file is pure Python.
+    -JB
 Ops
 ===
+There are two BLAS calls wrapped in this module: GEMM and GEMV.
 GEMM: Dot22, Dot22Scalar, GemmRelated, Gemm
 -------------------------------------------
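For reference, the GEMM semantics these Ops wrap can be sketched in plain NumPy. The `gemm` helper below is illustrative only (it is not Theano's Gemm Op, and unlike a real BLAS call it returns a fresh array rather than updating Z in place):

```python
import numpy as np

def gemm(a, X, Y, b, Z):
    """Reference semantics of BLAS GEMM: Z <- a * X @ Y + b * Z,
    where X, Y, Z are matrices and a, b are scalars."""
    return a * (X @ Y) + b * Z

X = np.arange(6.0).reshape(2, 3)
Y = np.arange(12.0).reshape(3, 4)
Z = np.ones((2, 4))
out = gemm(0.5, X, Y, 2.0, Z)
```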
@@ -45,16 +53,17 @@ GEMV: Gemv
 The BLAS GEMV operation implements Z <- a X Y + b Z,
 where X is a matrix, Y, and Z are vectors, and a and b are scalars.
+Gemv implements the GEMV call in all its generality.
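The GEMV semantics described above can likewise be sketched in NumPy; the `gemv` helper is illustrative, not the Gemv Op's actual interface:

```python
import numpy as np

def gemv(a, X, Y, b, Z):
    """Reference semantics of BLAS GEMV: Z <- a * X @ Y + b * Z,
    where X is a matrix and Y, Z are vectors."""
    return a * (X @ Y) + b * Z

X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([1.0, -1.0])
Z = np.array([10.0, 20.0])
out = gemv(2.0, X, Y, 0.5, Z)  # 2*[-1, -1] + 0.5*[10, 20] -> [3.0, 8.0]
```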
+GER: Ger
+--------
+The BLAS GER operation implements Z <- a X' Y + Z,
+where X and Y are vectors, and matrix Z gets a rank-1 update.
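The GER rank-1 update amounts to adding a scaled outer product of the two vectors to the matrix; a minimal NumPy sketch (the `ger` helper is illustrative, and returns a new array rather than updating Z in place):

```python
import numpy as np

def ger(a, X, Y, Z):
    """Reference semantics of BLAS GER: Z <- a * outer(X, Y) + Z,
    a rank-1 update of the matrix Z by the vectors X and Y."""
    return a * np.outer(X, Y) + Z

X = np.array([1.0, 2.0])
Y = np.array([3.0, 4.0, 5.0])
Z = np.zeros((2, 3))
out = ger(2.0, X, Y, Z)  # -> [[6, 8, 10], [12, 16, 20]]
```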
 Other Notable BLAS-related Ops
 ------------------------------
-GpuOuter is currently a wrapper around GER. GER is a useful special case of
-GEMM, and in the future it would be good to have a GER Op. With a GER Op here,
-the GpuOuter could be turned into a GpuGER.
 SYRK is another useful special case of GEMM. Particularly SYRK preserves
 symmetry in the matrix that it updates. See how the linear-algebra module uses
 symmetry hints before implementing this Op, so that this Op is compatible with
@@ -64,14 +73,19 @@ that system.
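The symmetry-preservation property mentioned for SYRK can be checked in a small NumPy sketch: since A @ A.T is symmetric for any A, updating a symmetric C with it keeps C symmetric (the `syrk` helper is illustrative; real SYRK only writes one triangle of C):

```python
import numpy as np

def syrk(a, A, b, C):
    """Reference semantics of BLAS SYRK: C <- a * A @ A.T + b * C.
    A @ A.T is symmetric for any A, so a symmetric C stays symmetric."""
    return a * (A @ A.T) + b * C

A = np.arange(6.0).reshape(3, 2)
C = np.eye(3)                 # symmetric input
out = syrk(1.0, A, 0.5, C)    # symmetric output
```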
 Optimizations
 =============
-The current optimization pipeline is not exactly clear to me. Instead I will
-describe how it should work.
-The high level pipeline is:
+The optimization pipeline works something like this:
 1. identify dot22 from dot
 2. identify gemm from dot22
 3. identify dot22scalar from dot22 that are not gemm
 4. specialize gemm to gemv where applicable
+5. specialize gemm to ger where applicable
+6. specialize dot22 -> gemv or ger where applicable
+:note: GEMM is the most canonical BLAS signature that we deal with so far, it
+    would be good to turn most things into GEMM (dot, inner, outer, dot22,
+    dot22scalar), and then to specialize from gemm to the various other L2 and
+    L3 operations.
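A rough NumPy illustration of why the rewrites in this pipeline are valid: each specialization computes the same values, just through a more specific BLAS pattern. All names here are illustrative, not the optimizer's API:

```python
import numpy as np

X = np.arange(6.0).reshape(2, 3)
Y = np.arange(12.0).reshape(3, 4)
Z = np.ones((2, 4))
a, b = 0.5, 2.0

# Steps 1-2: a graph computing b*Z + a*dot(X, Y) matches the GEMM pattern.
dot_form = b * Z + a * np.dot(X, Y)
gemm_form = a * (X @ Y) + b * Z          # what a Gemm-style call computes
assert np.allclose(dot_form, gemm_form)

# Step 4: when Y and Z are vectors, the same GEMM pattern is really a GEMV.
y = np.arange(3.0)
z = np.ones(2)
gemm_like = a * (X @ y) + b * z
gemv_form = a * X.dot(y) + b * z         # what a Gemv-style call computes
assert np.allclose(gemm_like, gemv_form)

# Steps 5-6: a column-by-row dot is an outer product, i.e. a GER update.
u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0, 5.0])
dot_rank1 = a * np.dot(u[:, None], v[None, :])
ger_form = a * np.outer(u, v)            # what a Ger-style call computes
assert np.allclose(dot_rank1, ger_form)
```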
 Identify Dot22
 --------------