-
由 Arnaud Bergeron 提交于
This should help older GPUs run at all and newer GPUs fit more blocks on one SM. With this change the code is cc 2.0+ compatible. But it will only be fast on cc 3.0+ cards (due to atomicAdd).
5e69ec44
This should help older GPUs run at all and newer GPUs fit more blocks on one SM. With this change the code is cc 2.0+ compatible. But it will only be fast on cc 3.0+ cards (due to atomicAdd).