Commits · 8da30f6e619dafed2c8fa89ce554d5e79dfe4437 · testgroup / pytensor

18 Sep, 2015 5 commits

In debugprint, don't print the profile headers more than once and only add it if… · 8da30f6e
Committed by Frederic on Sep 17, 2015
```
In debugprint, don't print the profile headers more than once and only add it if the profile isn't empty.
```
8da30f6e

Committed on Sep 17, 2015

The merge optimization can be very slow when applied with the destroy handler.
In one case, this PR reduced the optimization time from 190s to 12s:

Original:
SeqOptimizer  OPT_FAST_RUN  time 190.461s for 2161/548 nodes before/after optimization
   1.280s for fgraph.validate()
   132.634s for callback
   time      - (name, class, index) - validate time
   170.167204s - ('merge3', 'MergeOptimizer', 38) - 0.000s
     MergeOptimizer
       nb fail= 7728 merged=    0 constant=    0
       time replace=170.17 validate=0.00 callback=128.92
       callbacks_time
(<theano.gof.toolbox.PreserveNames object at 0x7f46b8864310>, 0.04287838935852051)
(<theano.gof.toolbox.ReplaceValidate object at 0x7f46b33f43d0>, 0.06792092323303223)
(<theano.tensor.opt.ShapeFeature object at 0x7f46b8864890>, 0.44670534133911133)
(<theano.gof.destroyhandler.DestroyHandler object at 0x7f46b4a44810>, 1.9552459716796875)
(<theano.gof.opt.MergeFeature object at 0x7f46b8864650>, 126.08237051963806)
   10.041219s - ('canonicalize', 'EquilibriumOptimizer', 4) - 0.051s

New one:

 SeqOptimizer  OPT_FAST_RUN  time 12.338s for 2161/548 nodes before/after optimization
   1.219s for fgraph.validate()
   3.536s for callback
   time      - (name, class, index) - validate time
   4.779337s - ('gpu_opt', 'SeqOptimizer', 12) - 0.012s
     SeqOptimizer      gpu_opt  time 4.779s for 633/504 nodes before/after optimization
       0.012s for fgraph.validate()
       0.817s for callback
       4.748835s - ('gpu_local_optimizations', 'EquilibriumOptimizer', 1) - 0.012s
...
   0.002217s - ('merge3', 'MergeOptimizer', 38) - 0.000s
     MergeOptimizer
       nb fail=    0 merged=    0 constant=    0
       time replace=0.00 validate=0.00 callback=0.00

19ccd72c

Merge pull request #3406 from nouiz/rtd · 460e393a
Committed by Frédéric Bastien on Sep 17, 2015
```
Add requirement specific to rtd
```
460e393a
Add requirement specific to rtd · 5ed0ddd6
Committed by Frederic on Sep 17, 2015

5ed0ddd6
Merge pull request #3401 from Kaixhin/master · b24338c8
Committed by abergeron on Sep 17, 2015
```
Fix Sphinx markup in docs
```
b24338c8

17 Sep, 2015 7 commits
- Fix Sphinx markup in docs · 1d40fd3b
  Committed by Kaixhin on Sep 17, 2015
  
  1d40fd3b
- Merge pull request #3399 from Kaixhin/master · a5814d54
  Committed by Frédéric Bastien on Sep 17, 2015
```
Add Docker images to the installation docs
```
  a5814d54
- Add Docker images to the installation docs · 2384c83f
  Committed by Kaixhin on Sep 16, 2015
  
  2384c83f
- Merge pull request #3397 from vesis84/master · c20412f4
  Committed by Frédéric Bastien on Sep 16, 2015
```
automatic gpu selection no longer works in CUDA-TK 7.0
```
  c20412f4
- Merge pull request #3355 from abergeron/hgemm · 175045d9
  Committed by Frédéric Bastien on Sep 16, 2015
```
Enables float16 gemm on gpuarray when the cuda version supports it
```
  175045d9
- don't call 'cuda_ndarray.select_a_gpu()' when using pycuda gpu selection, · 8c5fef8a
  Committed by vesis84 on Sep 16, 2015
  
  8c5fef8a
- Revert "force-disable of GPU initialization with pycuda" · 3af33c63
  Committed by vesis84 on Sep 16, 2015
```
This reverts commit 1814576c.
```
  3af33c63
16 Sep, 2015 17 commits
- Fix the warning log. · 560bafe5
  Committed by Arnaud Bergeron on Sep 16, 2015
  
  560bafe5
- Merge pull request #3380 from nouiz/mixed2 · d219054e
  Committed by abergeron on Sep 16, 2015
```
Mixed2
```
  d219054e
- force-disable of GPU initialization with pycuda · 1814576c
  Committed by vesis84 on Sep 16, 2015
```
- pycuda initialization was also affected by the bug from CUDA 7.0
```
  1814576c
- bugfix for automatic GPU selection when 'exclusive mode' is used · eed9d97d
  Committed by vesis84 on Sep 16, 2015
```
- fixing an issue introduced with CUDA-TK 7.0, when automatic selection
  of free gpu by os / library no longer works. Library always selects '0',
  which leads to crash in case of 'exclusive mode'.
- this fix is inspired by Dan Povey's fix for Kaldi:
  https://github.com/kaldi-asr/kaldi/commit/6548565445167e00125848f91d7da5f3f949b2a2
- it does a loop over gpus until a free gpu is taken.
```
  eed9d97d
- Merge pull request #3092 from t13m/opt_local_fill_sink · 88eac16c
  Committed by Frédéric Bastien on Sep 16, 2015
```
Optimize local_fill_sink
```
  88eac16c
- Merge pull request #3368 from kashif/issue-3347 · 0ce5393a
  Committed by Frédéric Bastien on Sep 16, 2015
```
fix shape mismatch in GpuDnnPoolGrad
```
  0ce5393a
- Merge pull request #3227 from abergeron/gpua_advsub1 · 975e0d2b
  Committed by Frédéric Bastien on Sep 15, 2015
```
Implement GpuAdvancedSubtensor1 for gpuarray
```
  975e0d2b
- remove debugprint line in tensor/tests/test_opt.py · c1e3c978
  Committed by Ziye Fan on Sep 16, 2015
  
  c1e3c978
- Merge pull request #3392 from nouiz/aalmah-elemwise_opt · c13853ad
  Committed by Frédéric Bastien on Sep 15, 2015
```
elemwise opt
```
  c13853ad
- need to run with floatX=float32 · 4d5484e8
  Committed by Kashif Rasul on Sep 15, 2015
  
  4d5484e8
- Add tests for alpha and output merge and do some more fixes. · 342d5018
  Committed by Arnaud Bergeron on Sep 15, 2015
  
  342d5018
- Some new fixes to address compilation problems. · e3474eda
  Committed by Arnaud Bergeron on Sep 15, 2015
  
  e3474eda
- Address comments from review. · cd849526
  Committed by Arnaud Bergeron on Sep 15, 2015
  
  cd849526
- Flake8 fixes. · e6914181
  Committed by Arnaud Bergeron on Sep 02, 2015
  
  e6914181
- Add support for float16 mm product with libgpuarray. · 7e03d1a9
  Committed by Arnaud Bergeron on Sep 02, 2015
  
  7e03d1a9
- fix test in python3 · 2e4f475a
  Committed by Frederic on Sep 15, 2015
  
  2e4f475a
- Merge pull request #3382 from carriepl/scan_inplace_opt · bae54705
  Committed by abergeron on Sep 15, 2015
```
Scan inplace opt
```
  bae54705
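
The automatic GPU selection fix listed above (eed9d97d) describes looping over GPUs until a free one is taken. The idea can be sketched roughly as follows. This is only an illustration, not the code from Theano or Kaldi: `select_free_gpu` and the `try_acquire` callable are hypothetical names, with `try_acquire` standing in for whatever call actually claims a device.

```python
from typing import Callable, Optional


def select_free_gpu(num_devices: int,
                    try_acquire: Callable[[int], bool]) -> Optional[int]:
    """Return the index of the first device that can be claimed, else None.

    `try_acquire` is a hypothetical stand-in for the call that creates a
    context on device `index`. It is assumed to return False, or raise
    RuntimeError, when the device is busy (for example, held by another
    process in exclusive mode).
    """
    for index in range(num_devices):
        try:
            if try_acquire(index):
                return index  # first free device wins
        except RuntimeError:
            # Device is busy or unusable; try the next one.
            continue
    return None  # every device was busy


# Simulated usage: devices 0 and 1 are busy, so device 2 is chosen.
busy = {0, 1}
chosen = select_free_gpu(4, lambda i: i not in busy)
print(chosen)
```

Passing the acquisition step in as a callable keeps the loop independent of any particular GPU library, which also makes it easy to exercise without hardware.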
15 Sep, 2015 11 commits
- Fix test in float32 · 7a79acaf
  Committed by Frederic on Sep 15, 2015
  
  7a79acaf
- flake8 · 4f43e93c
  Committed by Frederic on Sep 15, 2015
  
  4f43e93c
- Don't introduce useless mul · d8b2a977
  Committed by Frederic on Sep 15, 2015
  
  d8b2a977
- Don't introduce useless prod op · 7b7b6618
  Committed by Frederic on Sep 15, 2015
  
  7b7b6618
- Make an opt to not add useless op that will get removed. · 219d33bf
  Committed by Frederic on Sep 15, 2015
  
  219d33bf
- flake8 · 3ee52a9a
  Committed by Frederic on Sep 15, 2015
  
  3ee52a9a
- Remove line for debug · 19f37131
  Committed by Ziye Fan on Aug 14, 2015
  
  19f37131
- Fix travis failed test case · 10d36f41
  Committed by Ziye Fan on Aug 14, 2015
```
 - it is caused by modification of theano.config.experimental.local_alloc_elemwise_assert in T_local_switch_sink
```
  10d36f41
- I can't reproduce the travis failed test in my · 62d03384
  Committed by Ziye Fan on Aug 13, 2015
```
computer, add one line in the test case to try
to know what happened when it fails
```
  62d03384
- check if the model is a useless fill · a07899c8
  Committed by Ziye Fan on Aug 12, 2015
  
  a07899c8
- register local_mul_switch_sink in specialize optdb · ee07a356
  Committed by Ziye Fan on Aug 07, 2015
  
  ee07a356