Commit db1e2d56, authored by Ian Goodfellow

added doc about runtime of the op

Parent 7d48ea8a
@@ -512,6 +512,12 @@ class GpuCAReduce(GpuOp):
many code cases are supported for scalar_op being anything other than
scal.Add instances yet.
Important note: if you implement new cases for this op, be sure to
benchmark them and make sure that the local_gpu_careduce op should
really replace CAReduce with GpuCAReduce for these cases. GPUs are
not especially well-suited to reduction operations so it is quite
possible that the GPU might be slower for some cases.
"""
def __init__(self, reduce_mask, scalar_op):
...
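
The benchmarking step that the new docstring note asks for could be sketched roughly as below. This is only an illustrative timing harness built on the standard library, not code from this commit or from Theano: `best_time`, `compare`, `cpu_reduce`, and `candidate_reduce` are hypothetical names, and the two callables are plain-Python stand-ins for a compiled CAReduce function and a compiled GpuCAReduce function. In a real comparison the callables should presumably include any host/device transfers that would occur in normal use.

```python
import timeit


def best_time(fn, repeat=5, number=20):
    """Return the best observed per-call wall time of fn, in seconds."""
    return min(timeit.repeat(fn, repeat=repeat, number=number)) / number


def compare(baseline_fn, candidate_fn, min_speedup=1.0, repeat=5, number=20):
    """Time both callables and report whether the candidate is fast enough.

    Returns (use_candidate, baseline_seconds, candidate_seconds), where
    use_candidate is True when baseline / candidate >= min_speedup.
    """
    baseline = best_time(baseline_fn, repeat=repeat, number=number)
    candidate = best_time(candidate_fn, repeat=repeat, number=number)
    # Guard against a zero reading from a coarse timer.
    speedup = baseline / max(candidate, 1e-12)
    return speedup >= min_speedup, baseline, candidate


# Hypothetical stand-ins: replace with the compiled CPU and GPU functions
# for the reduce_mask / scalar_op case being added.
data = [float(i) for i in range(10000)]


def cpu_reduce():
    return sum(data)


def candidate_reduce():
    total = 0.0
    for x in data:
        total += x
    return total


if __name__ == "__main__":
    use_it, base_s, cand_s = compare(cpu_reduce, candidate_reduce)
    print("baseline %.6fs, candidate %.6fs, replace: %s" % (base_s, cand_s, use_it))
```

Taking the minimum over several repeats reduces noise from other processes; running the comparison across a range of input shapes would show whether the replacement is worthwhile only above some size.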