Rework the interface to GpuKernelBase to accept a list of kernel objects.
Each item will be precompiled separately and embedded into the c_code
of the Op. This allows ops that need multiple kernels or that will
choose between alternatives at runtime to use this interface. It also
groups all kernel-related parameters under one object.
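
As a rough illustration of the idea (not the actual GpuKernelBase API), the
sketch below groups each kernel's parameters in one object and lets an op
return several of them, choosing between alternatives at runtime. The names
KernelSpec, ExampleOp, gpu_kernels and pick_kernel are hypothetical and are
introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KernelSpec:
    """Hypothetical container for everything that describes one kernel."""
    name: str                      # kernel entry point
    code: str                      # kernel source (placeholder text here)
    params: List[str] = field(default_factory=list)  # argument type names


class ExampleOp:
    """Hypothetical op that exposes two alternative kernels."""

    def gpu_kernels(self) -> List[KernelSpec]:
        # Each item would be precompiled separately by the base class.
        return [
            KernelSpec("add_f32", "/* float32 source */", ["float32", "float32", "uint32"]),
            KernelSpec("add_f64", "/* float64 source */", ["float64", "float64", "uint32"]),
        ]

    def pick_kernel(self, dtype: str) -> KernelSpec:
        # Runtime choice between the alternatives, keyed on the input dtype.
        for k in self.gpu_kernels():
            if k.params[0] == dtype:
                return k
        raise ValueError("no kernel for dtype %s" % dtype)
```

Calling ExampleOp().pick_kernel("float64") selects the second kernel in the
list; an unsupported dtype raises ValueError.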
This change also saves the source of the kernel code so that
compilation from source can be re-attempted in case the binary is
rejected for some reason (some implementations do not support
reloading from a pre-compiled kernel).
There may still be more changes to how things work under the hood
(most notably a blacklist of bad runtime/drivers that crash when
attempting to create a kernel from a binary), but the visible
interface should not change anymore, so now is the time to start using
it more.