Commit fc599cf6 authored by Frederic Bastien

added run time number for blas.

Parent 02e9e377
@@ -88,14 +88,16 @@ if __name__ == "__main__":
print """
Some results that you can compare against. They were 10 executions of gemm in float64 with matrices of shape 2000x2000 on FC9.
-Cpu tested: Xeon E5345, Xeon E5430, Xeon E5450, Core 2 E8500, Core i7 930(hyper-threads enabled)
+Cpu tested: Xeon E5345, Xeon E5430, Xeon E5450(3Ghz), Xeon X5560(2.8Ghz, hyper-threads enabled?)
+            Core 2 E8500, Core i7 930(2.8Ghz, hyper-threads enabled)
Lib tested:
* numpy with ATLAS from distribution(FC9) package (1 thread)
* manually compiled numpy and ATLAS with 2 threads
* goto with 1, 2, 4 and 8 threads.
-                  Xeon   Xeon   Xeon   Core2  i7
-lib/nb threads    E5345  E5430  E5450  E8500  930
+                  Xeon   Xeon   Xeon   Core2  i7     Xeon
+lib/nb threads    E5345  E5430  E5450  E8500  930    X5560
 numpy_FC9_atlas/1 39.2s  35.0s  30.7s  29.6s  21.5s
 goto/1            18.7s  16.1s  14.2s  13.7s  16.1s
@@ -103,6 +105,10 @@ if __name__ == "__main__":
 goto/2             9.5s   8.1s   7.1s   7.3s   8.1s
 goto/4             4.9s   4.4s   3.7s    -     4.1s
 goto/8             2.7s   2.4s   2.0s    -     4.1s
+mkl 10.2.2.025/1                                     13.7s
+mkl 10.2.2.025/2                                      7.6s
+mkl 10.2.2.025/4                                      4.0s
+mkl 10.2.2.025/8                                      2.0s
Test time in float32 with cuda 3.0.14
(cuda version 3.2RC and up are supposed to have faster gemm on the GTX4?? card)
......
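
One way to read the timings in the table is as thread-scaling and throughput figures. The sketch below is not part of the commit: the helper names `speedup` and `gflops` are hypothetical, the times are copied from the goto column for the Xeon E5450 and from the mkl rows, and the throughput estimate assumes roughly 2*n**3 floating-point operations per gemm call.

```python
# Total seconds for 10 gemm executions on 2000x2000 float64 matrices,
# copied from the table in the diff (threads -> seconds).
GOTO_E5450 = {1: 14.2, 2: 7.1, 4: 3.7, 8: 2.0}
MKL = {1: 13.7, 2: 7.6, 4: 4.0, 8: 2.0}


def speedup(times):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread time."""
    base = times[1]
    return {n: (base / t, base / t / n) for n, t in sorted(times.items())}


def gflops(total_seconds, n=2000, executions=10):
    """Approximate GFLOP/s, assuming about 2*n**3 flops per gemm call."""
    return executions * 2 * n ** 3 / total_seconds / 1e9


if __name__ == "__main__":
    for name, times in (("goto, Xeon E5450", GOTO_E5450),
                        ("mkl 10.2.2.025", MKL)):
        for n, (s, e) in speedup(times).items():
            print("%-17s %d thread(s): speedup %.2fx, efficiency %.0f%%, "
                  "about %.1f GFLOP/s" % (name, n, s, 100 * e, gflops(times[n])))
```

Efficiency here is speedup divided by thread count, so values well below 100% suggest that the extra threads are not all doing useful work for this problem size.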