Commits · 74e83a72d3b87663b8bbf5487f385ef12cbc339f · testgroup / pytensor

14 Aug, 2014 2 commits
- Merge pull request #2035 from lamblin/doc_sparse_block_dot · 74e83a72
  Committed by abergeron on Aug 13, 2014
```
Extend documentation (plus small fix)
```
  74e83a72
- Extend documentation (plus small fix) · 0a59f26f
  Committed by Pascal Lamblin on Aug 13, 2014
  
  0a59f26f
13 Aug, 2014 38 commits
- Merge pull request #1929 from abergeron/sparse_blockdot · 22c4a29b
  Committed by Pascal Lamblin on Aug 12, 2014
```
Add a blocksparse multiplication implementation
```
  22c4a29b
- Merge pull request #2029 from abergeron/fix_gpuadvsub1 · 081f64c8
  Committed by Pascal Lamblin on Aug 12, 2014
```
Remove the restriction on indexing a broadcastable dimension.
```
  081f64c8
- Merge pull request #2030 from Tanjay94/As_Op · 1e51644a
  Committed by abergeron on Aug 12, 2014
```
@as_op() documentation
```
  1e51644a
- Fix spelling of CUBLAS_STATUS_INVALID_VALUE. · 94a43515
  Committed by Arnaud Bergeron on Aug 12, 2014
  
  94a43515
- Fix the checks on the GPU versions of the ops (again). · a711ef41
  Committed by Arnaud Bergeron on Aug 12, 2014
  
  a711ef41
- Remove the old sparse_grad interface. · 395c211e
  Committed by Arnaud Bergeron on Aug 12, 2014
  
  395c211e
- Nicer error message for people that abuse the sizes of the blocks. · 84161c84
  Committed by Arnaud Bergeron on Aug 12, 2014
  
  84161c84
- Fix a corruption bug in fill_lists. · 824c7c27
  Committed by Arnaud Bergeron on Aug 06, 2014
  
  824c7c27
- Fix error message for sgemv to say "Sgemv" rather than "Sgemm" · a5010fe7
  Committed by Arnaud Bergeron on Aug 06, 2014
  
  a5010fe7
- Limit the total size of blocks to 512 and the size of the grids to 65535. · 5e69ec44
  Committed by Arnaud Bergeron on Aug 05, 2014
```
This should help older GPUs run at all and newer GPUs fit more blocks
on one SM.

With this change the code is cc 2.0+ compatible.  But it will only be
fast on cc 3.0+ cards (due to atomicAdd).
```
  5e69ec44
- Remove the need for an intermediate buffer with a custom SgemvBatched kernel. · b4b6a31e
  Committed by Arnaud Bergeron on Aug 01, 2014
```
Also some small kernel speedups elsewhere.
```
  b4b6a31e
- Fix the stupid scheduling for better performance (should be much faster). · 496cb1c7
  Committed by Arnaud Bergeron on Aug 01, 2014
```
Also address some other issues that came up in code review.
```
  496cb1c7
- Make a custom ger kernel that uses atomicAdd to do the addition · 5e9c7bce
  Committed by Arnaud Bergeron on Jul 31, 2014
```
Remove the beta parameter since it's always 1 anyway.
```
  5e9c7bce
- Enable debugging of kernels. · 52cd5ee4
  Committed by Arnaud Bergeron on Jul 31, 2014
  
  52cd5ee4
- Add support for dimensions of size 1 in all cases. · d1f762aa
  Committed by Arnaud Bergeron on Jul 30, 2014
  
  d1f762aa
- Remove the python version of these ops as it is laughably slow and forces · 8e23c533
  Committed by Arnaud Bergeron on Jul 30, 2014
```
a dependency on scikits.cuda and pycuda.
```
  8e23c533
- Update docs to reflect batches and add some fallback code to add batches of 1 to… · d3088260
  Committed by Arnaud Bergeron on Jul 22, 2014
```
Update docs to reflect batches and add some fallback code to add batches of 1 to non-batched version.
```
  d3088260
- Add batch support to blocksparse. · 42f4cb3e
  Committed by Arnaud Bergeron on Jul 22, 2014
  
  42f4cb3e
- Use the right spelling for config.unittests.rseed. · 47d59687
  Committed by Arnaud Bergeron on Jul 22, 2014
  
  47d59687
- Now the opt actually computes the right value and there is a test. · a7329037
  Committed by Arnaud Bergeron on Jul 21, 2014
  
  a7329037
- Add optimizations to make the gradient update inplace. There are no tests yet. · f1515639
  Committed by Arnaud Bergeron on Jul 21, 2014
  
  f1515639
- Add infer_shape to the ops. · 2e51a436
  Committed by Arnaud Bergeron on Jul 21, 2014
  
  2e51a436
- Add C code using gemmBatched to SparseBlockDotOuterSS (the gradient). · 57865538
  Committed by Arnaud Bergeron on Jul 17, 2014
  
  57865538
- Small forgotten speedup in SparseBlockDotGemvSS. · 98a15fa1
  Committed by Arnaud Bergeron on Jul 17, 2014
  
  98a15fa1
- Use gemm_batched from python code in the gradient. · 9841c0db
  Committed by Arnaud Bergeron on Jul 17, 2014
  
  9841c0db
- C code version of the python loop. · 437b1a5f
  Committed by Arnaud Bergeron on Jul 17, 2014
  
  437b1a5f
- Remove leftover opt. · 69895eea
  Committed by Arnaud Bergeron on Jul 16, 2014
  
  69895eea
- C code that uses SgemmBatched and a kernel to initialize the list of stuff. · ed244b6b
  Committed by Arnaud Bergeron on Jul 16, 2014
  
  ed244b6b
- Fix memory leak in C code for blocksparse. · c774e32e
  Committed by Arnaud Bergeron on Jul 14, 2014
  
  c774e32e
- Use gemm_batched in the python code. · 8f9c2a12
  Committed by Arnaud Bergeron on Jul 14, 2014
  
  8f9c2a12
- Fix errors in C code and add a cache version. It passes the tests and works. · 29db8ffb
  Committed by Arnaud Bergeron on Jul 14, 2014
  
  29db8ffb
- Add C code to SparseBlockGemvSS · 1519d758
  Committed by Arnaud Bergeron on Jul 14, 2014
  
  1519d758
- Add support for the fortran order in gemv (and a test for it). · 0bc12fe9
  Committed by Arnaud Bergeron on Jul 07, 2014
  
  0bc12fe9
- Fix shape error in tests (which also means that we had wrong behavior). · d12f4aea
  Committed by Arnaud Bergeron on Jul 02, 2014
  
  d12f4aea
- And make the opt test work. · 7f15e04a
  Committed by Arnaud Bergeron on Jun 25, 2014
  
  7f15e04a
- Add test to check that the inplace opts are working. · 6c77f4a6
  Committed by Arnaud Bergeron on Jun 25, 2014
  
  6c77f4a6
- Don't try to use opt when cuda is not available. · 93879ae4
  Committed by Arnaud Bergeron on Jun 25, 2014
  
  93879ae4
- Add inplace optimizations. · 1c3afdf6
  Committed by Arnaud Bergeron on Jun 23, 2014
  
  1c3afdf6