1. 12 Jun, 2017 1 commit
    • cuda fix · fc36eefb
      Committed by xiaoqie
      All tests in test_nnet.py pass with CUDA.
      With OpenCL, only the fp32 tests in test_nnet.py pass; GpuFromHost doesn't work with fp16 or fp64.
      A larger work-item size doesn't improve performance.
      Add 2 local_barrier() calls. Strangely, the AMD card doesn't need them, but they are necessary for NVIDIA cards.
  2. 07 Jun, 2017 1 commit
  3. 06 Jun, 2017 5 commits
  4. 29 May, 2017 1 commit
  5. 25 May, 2017 1 commit
  6. 23 May, 2017 3 commits
  7. 20 May, 2017 2 commits
  8. 19 May, 2017 3 commits
  9. 18 May, 2017 4 commits
  10. 17 May, 2017 12 commits
  11. 16 May, 2017 5 commits
  12. 15 May, 2017 2 commits