- 18 Sep, 2015 5 commits
-
-
Committed by Frederic
In debugprint, don't print the profile headers more than once, and only add them if the profile isn't empty.
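The behavior described in this commit message can be pictured with a minimal sketch. This is not Theano's `debugprint` code; the `render` function, the `PROFILE_HEADER` string, and the list-of-strings data layout are all hypothetical names chosen for illustration.

```python
from typing import List, Sequence

# Hypothetical header text; the real debugprint header is not shown in the log.
PROFILE_HEADER = "Timing Info (hypothetical header)"


def render(graphs: Sequence[Sequence[str]],
           profiles: Sequence[Sequence[str]]) -> List[str]:
    """Emit graph lines and profile lines, printing the header at most once
    and only when at least one profile actually has content."""
    out: List[str] = []
    header_printed = False
    for nodes, prof in zip(graphs, profiles):
        if prof and not header_printed:
            out.append(PROFILE_HEADER)  # first non-empty profile: add header
            header_printed = True
        out.extend(nodes)
        out.extend(prof)
    return out
```

With two profiled graphs the header appears once; with only empty profiles it does not appear at all.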
-
Committed by Frederic
The merge optimization, when applied with the destroy handler, can be very slow. I have one case that was sped up from 190s to 12s by this PR:

Original:
SeqOptimizer OPT_FAST_RUN time 190.461s for 2161/548 nodes before/after optimization
  1.280s for fgraph.validate()
  132.634s for callback
  time - (name, class, index) - validate time
  170.167204s - ('merge3', 'MergeOptimizer', 38) - 0.000s
    MergeOptimizer
      nb fail= 7728 merged= 0 constant= 0
      time replace=170.17 validate=0.00 callback=128.92
      callbacks_time
        (<theano.gof.toolbox.PreserveNames object at 0x7f46b8864310>, 0.04287838935852051)
        (<theano.gof.toolbox.ReplaceValidate object at 0x7f46b33f43d0>, 0.06792092323303223)
        (<theano.tensor.opt.ShapeFeature object at 0x7f46b8864890>, 0.44670534133911133)
        (<theano.gof.destroyhandler.DestroyHandler object at 0x7f46b4a44810>, 1.9552459716796875)
        (<theano.gof.opt.MergeFeature object at 0x7f46b8864650>, 126.08237051963806)
  10.041219s - ('canonicalize', 'EquilibriumOptimizer', 4) - 0.051s

New one:
SeqOptimizer OPT_FAST_RUN time 12.338s for 2161/548 nodes before/after optimization
  1.219s for fgraph.validate()
  3.536s for callback
  time - (name, class, index) - validate time
  4.779337s - ('gpu_opt', 'SeqOptimizer', 12) - 0.012s
    SeqOptimizer gpu_opt time 4.779s for 633/504 nodes before/after optimization
      0.012s for fgraph.validate()
      0.817s for callback
      4.748835s - ('gpu_local_optimizations', 'EquilibriumOptimizer', 1) - 0.012s
  ...
  0.002217s - ('merge3', 'MergeOptimizer', 38) - 0.000s
    MergeOptimizer
      nb fail= 0 merged= 0 constant= 0
      time replace=0.00 validate=0.00 callback=0.00
-
Committed by Frédéric Bastien
Add requirement specific to rtd
-
Committed by Frederic
-
Committed by abergeron
Fix Sphinx markup in docs
-
- 17 Sep, 2015 7 commits
-
-
Committed by Kaixhin
-
Committed by Frédéric Bastien
Add Docker images to the installation docs
-
Committed by Kaixhin
-
Committed by Frédéric Bastien
automatic gpu selection no longer works in CUDA-TK 7.0
-
Committed by Frédéric Bastien
Enables float16 gemm on gpuarray when the cuda version supports it
-
Committed by vesis84
-
- 16 Sep, 2015 17 commits
-
-
Committed by Arnaud Bergeron
-
Committed by abergeron
Mixed2
-
Committed by vesis84
- pycuda initialization was also affected by the bug from CUDA 7.0
-
Committed by vesis84
- fixing an issue introduced with CUDA-TK 7.0, where automatic selection of a free gpu by the os / library no longer works. The library always selects '0', which leads to a crash in 'exclusive mode'.
- this fix is inspired by Dan Povey's fix for Kaldi: https://github.com/kaldi-asr/kaldi/commit/6548565445167e00125848f91d7da5f3f949b2a2
- it loops over the gpus until a free gpu is taken.
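The loop described in this commit message can be sketched roughly as follows. This is an illustration of the idea only, not the code from the commit: `select_free_gpu`, `try_init`, and `fake_init` are hypothetical names, and the assumption that a busy device raises `RuntimeError` is made for the sake of the example.

```python
from typing import Callable, Iterable, Optional


def select_free_gpu(device_ids: Iterable[int],
                    try_init: Callable[[int], None]) -> Optional[int]:
    """Return the first device id whose initialization succeeds, else None."""
    for dev in device_ids:
        try:
            # Hypothetical initializer: assumed to raise RuntimeError when
            # the device is already taken (e.g. in exclusive mode).
            try_init(dev)
        except RuntimeError:
            continue  # device busy, try the next one
        return dev
    return None


def fake_init(dev: int) -> None:
    """Stand-in for a real context creation call: devices 0 and 1 are busy."""
    if dev in (0, 1):
        raise RuntimeError(f"device {dev} is busy (exclusive mode)")
```

For example, `select_free_gpu(range(4), fake_init)` skips the two busy devices and returns 2, instead of always trying device 0 and crashing.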
-
Committed by Frédéric Bastien
Optimize local_fill_sink
-
Committed by Frédéric Bastien
fix shape mismatch in GpuDnnPoolGrad
-
Committed by Frédéric Bastien
Implement GpuAdvancedSubtensor1 for gpuarray
-
Committed by Ziye Fan
-
Committed by Frédéric Bastien
elemwise opt
-
Committed by Kashif Rasul
-
Committed by Arnaud Bergeron
-
Committed by Arnaud Bergeron
-
Committed by Arnaud Bergeron
-
Committed by Arnaud Bergeron
-
Committed by Arnaud Bergeron
-
Committed by Frederic
-
Committed by abergeron
Scan inplace opt
-
- 15 Sep, 2015 11 commits
-
-
Committed by Frederic
-
Committed by Frederic
-
Committed by Frederic
-
Committed by Frederic
-
Committed by Frederic
-
Committed by Frederic
-
Committed by Ziye Fan
-
Committed by Ziye Fan
- it is caused by modification of theano.config.experimental.local_alloc_elemwise_assert in T_local_switch_sink
-
Committed by Ziye Fan
computer, add one line in the test case to try to know what happened when it fails
-
Committed by Ziye Fan
-
Committed by Ziye Fan
-