Commits · bb25e99789b8d9ec0d69cf3bfc586a8c8d114b9d · testgroup / pytensor

26 Oct, 2016 1 commit

New update. · bb25e997

Authored on Oct 26, 2016

Commented-out code about empty ldflags has been removed
and replaced with regular comments.

alt_gemm_template.c has been changed to follow the
latest recommendations.

bb25e997

25 Oct, 2016 1 commit

New update. · 70896325

Authored on Oct 24, 2016

* Removed the remaining checks for empty blas.ldflags in blas.py
and test_blas.py.

* Correction in alt_gemm_template.c to prevent some test
failures in test_blas.py (occurring after the above modification):
some tests pass an empty matrix as C (with M*N == 0).
This case is now skipped (gemm is not computed and C is
not modified).

* Correction in blas_headers.py to prevent some old string-formatting
errors when the strings are C code containing "%" symbols.
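
The "%" problem mentioned in the last item can be reproduced in a few lines. This is a minimal sketch, assuming the C source is built with Python's %-formatting; the template text and the names `C_TEMPLATE`, `render` and `fails_to_render` are hypothetical and are not taken from blas_headers.py.

```python
# Hypothetical template: C code that mixes a literal percent sign (modulo)
# with a Python placeholder. Names are illustrative only.
C_TEMPLATE = """
int wrap_index(int i, int n) {
    return i %% n;  /* doubled percent: emitted as C modulo */
}
typedef %(dtype)s real_t;
"""

# The same template without escaping, kept only to show the failure mode.
BROKEN_TEMPLATE = C_TEMPLATE.replace("%%", "%")


def render(template, dtype):
    """Fill the dtype placeholder using Python's %-formatting."""
    return template % {"dtype": dtype}


def fails_to_render(template):
    """Return True when %-formatting rejects the template."""
    try:
        render(template, "float")
    except (TypeError, ValueError):
        return True
    return False


code = render(C_TEMPLATE, "float")
```

Doubling every literal percent sign is one way to avoid the error; substituting placeholders with `str.replace` or `string.Template` would avoid the escaping altogether.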

test_blas.py now passes. I still have to re-run all other
tests tonight.
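
The empty-C guard described in this commit can be sketched in plain Python. This is an illustrative reimplementation on flat column-major lists, not the code in alt_gemm_template.c; the function name `gemm_guarded` and its boolean return value are assumptions made for the example.

```python
def gemm_guarded(m, n, k, alpha, a, b, beta, c):
    """Compute C = alpha*A*B + beta*C in place on column-major flat lists.

    A is m x k, B is k x n, C is m x n. Returns False and leaves C
    untouched when C has no elements (m * n == 0).
    """
    if m * n == 0:
        return False  # nothing to compute, C is not modified
    for j in range(n):
        for i in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i + p * m] * b[p + j * k]
            c[i + j * m] = alpha * acc + beta * c[i + j * m]
    return True


empty_c = []
skipped = not gemm_guarded(0, 3, 2, 1.0, [], [0.0] * 6, 0.0, empty_c)

# 2 x 2 check: A = [[1, 2], [3, 4]] times the identity gives A back.
c = [0.0, 0.0, 0.0, 0.0]
gemm_guarded(2, 2, 2, 1.0, [1.0, 3.0, 2.0, 4.0], [1.0, 0.0, 0.0, 1.0], 0.0, c)
```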

70896325

22 Oct, 2016 1 commit

New update! I have integrated the required changes and · 24f96fa8

Authored on Oct 21, 2016

re-run the tests. All tests pass as before, and
some tests are now faster. The biggest gain on my computer is
for theano/tensor/nnet/tests/test_corr3d.py, which goes from
687 seconds before to 259 seconds now. For the other tests, the
gain is between 3 and 20 seconds.

There is now neither copying nor memory allocation
(apart from the NumPy wrapper structures) when BETA == 0.

I rewrote the OP(matrix) function so that it no longer returns
newly allocated data. Instead it just creates a
PyArrayObject wrapper around the matrix pointer with the right
format: F-contiguous (nrow * ncol) by default, or
C-contiguous (ncol * nrow) if the matrix needs to be transposed.
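
The idea of reading the same memory either as F-contiguous (nrow * ncol) or as C-contiguous (ncol * nrow) can be illustrated without NumPy. The sketch below is a plain-Python stand-in for the PyArrayObject wrapper; the class name `MatrixView` and its `at` method are hypothetical, and strides are counted in elements rather than bytes.

```python
class MatrixView:
    """2-D view over a flat buffer; no element is ever copied."""

    def __init__(self, buf, nrow, ncol, transposed=False):
        self.buf = buf  # shared with the caller, not duplicated
        if transposed:
            # Same memory read as C-contiguous (ncol x nrow): this is A^T.
            self.shape = (ncol, nrow)
            self.strides = (nrow, 1)
        else:
            # F-contiguous (nrow x ncol): element (i, j) is at i + j*nrow.
            self.shape = (nrow, ncol)
            self.strides = (1, nrow)

    def at(self, i, j):
        """Return element (i, j) of the view."""
        return self.buf[i * self.strides[0] + j * self.strides[1]]


# A = [[1, 2, 3], [4, 5, 6]] stored column by column.
data = [1, 4, 2, 5, 3, 6]
plain = MatrixView(data, 2, 3)
flipped = MatrixView(data, 2, 3, transposed=True)
```

Here `plain.at(0, 1)` and `flipped.at(1, 0)` read the same buffer slot, which is why the transpose costs no allocation.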

I also rewrote the matrix sum function so that it takes a
scalar multiplier for each matrix passed to it.
The function now computes: B = alpha*A + beta*B
with alpha and beta as the scalars (both set to 1 if we just want
B = A + B). Thus, there is now only one iteration over A and B,
in which A and B are each read once and B is modified once.
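
The single-pass update B = alpha*A + beta*B can be written as follows. This is a sketch on flat Python lists, assuming both matrices share the same memory layout; the function name `scaled_sum_inplace` is hypothetical and does not come from the C template.

```python
def scaled_sum_inplace(alpha, a, beta, b):
    """Overwrite b with alpha*a + beta*b in one pass over both buffers."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same number of elements")
    for i in range(len(b)):
        # Each element of a and b is read once; b is written once.
        b[i] = alpha * a[i] + beta * b[i]
    return b


b_scaled = [10.0, 20.0]
scaled_sum_inplace(2.0, [1.0, 2.0], 0.5, b_scaled)

# alpha = beta = 1 reduces to the plain sum B = A + B.
b_plain = [3.0, 4.0]
scaled_sum_inplace(1.0, [1.0, 2.0], 1.0, b_plain)
```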

24f96fa8

20 Oct, 2016 12 commits

Flake8 issue corrected. · a4cc7dbd
Authored by notoraptor on Oct 19, 2016

a4cc7dbd

It seems everything works well now! · a4eb981d

Authored on Oct 19, 2016

Tests passed (see details below) (with blas.ldflags empty and ldflags skipping removed from files):
tensor/nnet/tests/test_corr.py
tensor/nnet/tests/test_corr3d.py
tensor/tests/test_blas.py
tensor/tests/test_blas_scipy.py
tensor/tests/test_blas_c.py (28 tests skipped)
tensor/nnet/tests/test_abstract_conv.py:TestCorrConv2d
tensor/nnet/tests/test_abstract_conv.py:TestCorrConv3d
tensor/nnet/tests/test_abstract_conv.py:TestAbstractConvNoOptim
tensor/nnet/tests/test_abstract_conv.py:TestCpuConv2d (252 tests skipped)
tensor/nnet/tests/test_abstract_conv.py:TestCpuConv3d (60 tests skipped)
tensor/nnet/tests/test_abstract_conv.py:TestBilinearUpsampling


__
$ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=7,blas.ldflags= nosetests --verbose theano/tensor/nnet/tests/test_corr.py
Tests that basic correlations work for odd and even ... ok
Checks dtype upcast for CorrMM methods. ... ok
Tests correlation where filter dilation != (1,1) ... ok
Tests basic correlation in full mode and case where filter ... ok
test_img_kernel_same_shape (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
test_infer_shape_forward (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
test_infer_shape_gradI (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
test_infer_shape_gradW (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
Tests scenario where filter_shape[1] != input_shape[1] ... ok
test_non_contiguous (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
Tests correlation where the {image,filter}_shape is a Constant tensor. ... ok
Tests correlation where subsampling != (1,1) ... ok
Make sure errors are raised when image and kernel are not 4D tensors ... ok

----------------------------------------------------------------------
Ran 13 tests in 167.377s

OK

$ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests --verbose theano/tensor/nnet/tests/test_corr3d.py
Tests that basic correlations work for odd and even ... ok
Checks dtype upcast for Corr3dMM methods. ... ok
Tests correlation where filter dilation != (1,1,1) ... ok
Tests basic correlation in full mode and case where filter ... ok
test_img_kernel_same_shape (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
test_infer_shape_forward (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
test_infer_shape_gradI (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
test_infer_shape_gradW (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
Tests scenario where filter_shape[1] != input_shape[1] ... ok
test_non_contiguous (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
Tests correlation where the {image,filter}_shape is a Constant tensor. ... ok
Tests correlation where subsampling != (1,1,1) ... ok
Make sure errors are raised when image and kernel are not 5D tensors ... ok

----------------------------------------------------------------------
Ran 13 tests in 687.905s

OK
$ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/tests/test_blas.py
..............................................................................................................
----------------------------------------------------------------------
Ran 110 tests in 69.618s

OK
$ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/tests/test_blas_scipy.py
..............
----------------------------------------------------------------------
Ran 14 tests in 16.113s

OK
$ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/tests/test_blas_c.py
...........S.S.SSSSSSSSSSSSSSSSSSSSSSSS...SS
----------------------------------------------------------------------
Ran 44 tests in 14.716s

OK (SKIP=28)
$ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCorrConv2d
..................................................................................................................................................................................................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 402 tests in 589.767s

OK
$ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCorrConv3d
..................................................................................................
----------------------------------------------------------------------
Ran 98 tests in 302.220s

OK
$ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestAbstractConvNoOptim
....................
----------------------------------------------------------------------
Ran 20 tests in 93.374s

OK
$ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCpuConv2d
.......................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS..
----------------------------------------------------------------------
Ran 402 tests in 137.067s

OK (SKIP=252)
# 252 SKIPs for the same reason: SKIP: No dilation implementation for basic cpu ConvOp.
# (test_abstract_conv.py, line 494)
$ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCpuConv3d
.................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSS...................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSS..
----------------------------------------------------------------------
Ran 98 tests in 44.181s

OK (SKIP=60)
#  60 SKIPs for the same reason: SKIP: No dilation implementation for basic cpu Conv3D.
# (test_abstract_conv.py, line 688)
$ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestBilinearUpsampling
.....
----------------------------------------------------------------------
Ran 5 tests in 29.046s

OK

a4eb981d

Git-Rebasing and updating. · bc911254
Authored by notoraptor on Oct 18, 2016

bc911254
Some light flake8 corrections · d5fb8413
Authored by notoraptor on Oct 18, 2016

d5fb8413

Added another optimization for [sd]gemm_, · 61980334

Authored on Oct 17, 2016

just by skipping alpha*matrix multiplication when alpha == 1.0.
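
The alpha == 1.0 shortcut amounts to skipping one full pass over the product. A minimal sketch, assuming the scaling is a separate step applied to a flat buffer; the function name `scale_inplace` and the returned multiplication count are illustrative only.

```python
def scale_inplace(alpha, values):
    """Scale values by alpha in place; return how many multiplications ran."""
    if alpha == 1.0:
        return 0  # multiplying by one changes nothing, so skip the pass
    for i in range(len(values)):
        values[i] *= alpha
    return len(values)


untouched = [1.0, 2.0, 3.0]
skipped_count = scale_inplace(1.0, untouched)

doubled = [1.0, 2.0, 3.0]
worked_count = scale_inplace(2.0, doubled)
```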

All tests succeed (with blas.ldflags empty) for:
* test_abstract_conv.py in theano/tensor/nnet/tests/
* test_blas.py and test_blas_scipy.py in theano/tensor/tests/

I have modified theano/tensor/tests/test_blas_c.py
to skip all tests that involve either gemv or ger functions.
* Before the modifications, this file executed 44 tests and 34 were skipped.
* After  the modifications, this file executes 44 tests and 29 are skipped.

# $ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests --verbose theano/tensor/tests/test_blas_c.py

PS: I also tried to run theano/tensor/nnet/tests/test_corr.py
after removing the ldflags check, but many of the tests fail
(Theano outputs do not match the reference outputs).
For now I have left this file as it is and I will continue investigating tomorrow.

61980334

Required corrections and modifications are done. · 9ca9474b

Authored on Oct 13, 2016

Reminder: the code is tested with:

$ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCorrConv2d

NB:
1) dgemm_ is never called in these tests. Only sgemm_ is called.
2) LDA, LDB and LDC are always in the set {M, N, K} during these tests.

9ca9474b

flake8 errors fixed · 6b3afd89
Authored by notoraptor on Oct 12, 2016

6b3afd89
Added some simplifications. · 39a2b2d2
Authored by notoraptor on Oct 11, 2016

39a2b2d2

I added an implementation of C-functions "sgemm_" and "dgemm_" that call Numpy… · 0cd7aa7b

Authored on Oct 11, 2016

I added an implementation of the C functions "sgemm_" and "dgemm_" that calls NumPy C-API functions to perform the matrix product when BLAS is explicitly disabled (with the Theano flag "blas.ldflags" set to empty).

This can be tested with:
THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCorrConv2d
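
For reference, the operation a gemm routine performs is C = alpha*op(A)*op(B) + beta*C, where op(X) is X or its transpose. The sketch below spells that out in plain Python on column-major flat lists with leading dimensions. It is not the NumPy C-API implementation from this commit: the function name `gemm_reference` is hypothetical and the argument list is simplified (values instead of the pointers used by the Fortran-style interface).

```python
def gemm_reference(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """C = alpha*op(A)*op(B) + beta*C on column-major flat lists.

    op(X) is X when the flag is "N" and the transpose of X when it is "T".
    op(A) is m x k, op(B) is k x n and C is m x n.
    """

    def elem(buf, ld, trans, i, j):
        # Column-major: X[i, j] is buf[i + j*ld]; transposing swaps i and j.
        return buf[j + i * ld] if trans == "T" else buf[i + j * ld]

    for j in range(n):
        for i in range(m):
            acc = 0.0
            for p in range(k):
                acc += elem(a, lda, transa, i, p) * elem(b, ldb, transb, p, j)
            idx = i + j * ldc
            # When beta is 0, C is overwritten without being read.
            c[idx] = alpha * acc if beta == 0.0 else alpha * acc + beta * c[idx]
    return c


# A = [[1, 2, 3], [4, 5, 6]] and B = [[1, 0], [0, 1], [1, 1]], column-major.
a_cols = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
b_cols = [1.0, 0.0, 1.0, 0.0, 1.0, 1.0]

c_fresh = [float("nan")] * 4  # never read because beta == 0
gemm_reference("N", "N", 2, 2, 3, 1.0, a_cols, 2, b_cols, 3, 0.0, c_fresh, 2)

# Same product with A supplied as its 3 x 2 transpose and transa = "T".
at_cols = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
c_trans = [0.0] * 4
gemm_reference("T", "N", 2, 2, 3, 1.0, at_cols, 3, b_cols, 3, 0.0, c_trans, 2)

c_accum = [1.0, 1.0, 1.0, 1.0]
gemm_reference("N", "N", 2, 2, 3, 2.0, a_cols, 2, b_cols, 3, 3.0, c_accum, 2)
```

A triple loop like this is only useful as a correctness reference when checking an optimized path.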

0cd7aa7b

Merge pull request #5122 from gvtulder/f-faster-conv3d-tests · 96be471e
Authored by Frédéric Bastien on Oct 20, 2016
```
Somewhat smaller/faster 3d convolution tests
```
96be471e
Make test_abstract_conv 3D tests a bit smaller/faster. · cb0e84e0
Authored by Gijs van Tulder on Oct 20, 2016

cb0e84e0
Make test_corr3d tests smaller/faster. · 10caddc2
Authored by Gijs van Tulder on Oct 20, 2016

10caddc2

19 Oct, 2016 2 commits
- Merge pull request #5086 from nouiz/mixed · 67f31fb4
  Authored by Pascal Lamblin on Oct 19, 2016
```
A few small update.
```
  67f31fb4
- Merge pull request #4996 from Thrandis/ccw · 418a5f1b
  Authored by Frédéric Bastien on Oct 19, 2016
```
Scan with Checkpoints (part 2)
```
  418a5f1b
18 Oct, 2016 6 commits
- Merge pull request #5108 from nouiz/fast_compile_fix · a29c31d3
  Authored by Pascal Lamblin on Oct 18, 2016
```
Fix a change to gpuarray conv from gh-5069. dnn conv wasn't working a…
```
  a29c31d3
- Merge pull request #5105 from nouiz/32bit · 1e279256
  Authored by Pascal Lamblin on Oct 18, 2016
```
32bit
```
  1e279256
- Merge pull request #4915 from abergeron/dnn_rnn2 · 3007bf79
  Authored by Pascal Lamblin on Oct 17, 2016
```
Cudnn RNN bindings.
```
  3007bf79
- Merge pull request #5103 from nouiz/floatX_error · d72325e4
  Authored by Pascal Lamblin on Oct 17, 2016
```
Make device=gpu with floatX=float16 raise an error (cause many user q…
```
  d72325e4
- Fix a change to gpuarray conv from gh-5069. dnn conv wasn't working anymore in fast_compile. · 5e2e87e0
  Authored by Frederic Bastien on Oct 17, 2016
  
  5e2e87e0
- Fix the last test in python 32 bit. Keep constant shape as constant and not casted constant. · a94d5406
  Authored by Frederic Bastien on Oct 17, 2016
  
  a94d5406
17 Oct, 2016 5 commits
- Fix pooling op compilation with python 32bit and newer gcc. · eefbb4fd
  Authored by Frederic Bastien on Oct 17, 2016
  
  eefbb4fd
- Skip key error when doing the key cleanup. · cbc60f01
  Authored by Frederic Bastien on Oct 17, 2016
  
  cbc60f01
- Fix compilation on newer gcc and python 32bit · 770f9457
  Authored by Frederic Bastien on Oct 17, 2016
  
  770f9457
- Make device=gpu with floatX=float16 raise an error (cause many user questions… · 301f7a2f
  Authored by Frederic Bastien on Oct 17, 2016
```
Make device=gpu with floatX=float16 raise an error (cause many user questions and this should never be used). if floatX=float64, raise an error.
```
  301f7a2f
- Merge pull request #5096 from nouiz/fast_compile · 173826a7
  Authored by Frédéric Bastien on Oct 17, 2016
```
fix tests in mode=FAST_COMPILE
```
  173826a7
16 Oct, 2016 1 commit
- Merge pull request #5069 from gvtulder/f-gpuarray-corr_gemm · 0acf8988
  Authored by Pascal Lamblin on Oct 15, 2016
```
GpuCorrMM and GpuCorr3dMM in new backend
```
  0acf8988
15 Oct, 2016 11 commits
- Fix flake8 error that slipped in the rebase. · c27959ba
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  c27959ba
- Fix the grad_c test. · 8cd94e47
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  8cd94e47
- Fix the connectivity test. · 3d85d7cf
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  3d85d7cf
- Flake8 · b0982c25
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  b0982c25
- Fix test. · 242145f6
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  242145f6
- Add tests for gradient connectivity to cy only. · f722afba
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  f722afba
- Add tests for various patterns of connectivity of y and hy. · fd2f23f3
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  fd2f23f3
- Remove the dropout arguments since they aren't supported. · 8062f2a9
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  8062f2a9
- Replace dy with zeros if disconnected instead of dhy[-1]. · 9a0208da
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  9a0208da
- After a re-read of the docs, adjust the shape comments. · dd9de351
  Authored by Arnaud Bergeron on Oct 14, 2016
  
  dd9de351
- Flake8 fix. · 5fc707eb
  Authored by Arnaud Bergeron on Sep 29, 2016
  
  5fc707eb