1. 26 Oct, 2016 1 commit
    • New update. · bb25e997
      notoraptor committed
      Some commented-out code about empty ldflags has been removed
      and replaced with regular comments.
      
      Some changes were made in alt_gemm_template.c following the
      latest recommendations.
      bb25e997
  2. 25 Oct, 2016 1 commit
    • New update. · 70896325
      notoraptor committed
      * Removed the remaining checks for empty blas.ldflags in blas.py
      and test_blas.py.
      
      * Correction in alt_gemm_template.c to prevent some test
      failures in test_blas.py (occurring after the above modification):
      in some tests, a zero-size matrix is passed as C
      (with M*N == 0). In that case gemm now simply returns
      without computing anything and leaves C unmodified.
      
      * Correction in blas_headers.py to prevent some old
      string-formatting errors when the strings are C code containing "%" symbols.
      
      Now test_blas.py runs perfectly. I have to re-run all other
      tests tonight.
      70896325
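The string-formatting problem mentioned for blas_headers.py can be illustrated with a minimal Python sketch. The templates and the `dtype` placeholder below are hypothetical and not taken from Theano. They only show that a literal "%" in C code has to be doubled as "%%" before the text goes through Python's % operator.

```python
# Hypothetical C snippets used as Python format templates (not Theano's code).
BROKEN_TEMPLATE = "%(dtype)s r = a % b;"   # the C modulo "%" is not escaped
SAFE_TEMPLATE = "%(dtype)s r = a %% b;"    # "%%" renders as a literal "%"


def render(template, dtype):
    """Substitute the dtype placeholder using printf-style formatting."""
    return template % {"dtype": dtype}


def render_fails(template, dtype):
    """Return True if formatting the template raises a formatting error."""
    try:
        render(template, dtype)
    except (ValueError, TypeError):
        return True
    return False
```

With these definitions, `render(SAFE_TEMPLATE, "float")` yields valid C text, while the unescaped template is rejected by the formatter.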
  3. 22 Oct, 2016 1 commit
    • New update! I have integrated the required changes and · 24f96fa8
      notoraptor committed
      re-run the tests. All tests pass as before, and
      some tests are now faster. The biggest gain on my computer is
      for theano/tensor/nnet/tests/test_corr3d.py, which goes from
      687 seconds before to 259 seconds now. For the other tests, the gain is
      between 3 and 20 seconds.
      
      There is now no copy and no memory allocation
      (apart from NumPy wrapping structures) when BETA == 0.
      
      I rewrote the OP(matrix) function so that it no longer returns
      newly allocated data. Instead it just creates a
      PyArrayObject wrapper around the matrix pointer with the right
      format: F-contiguous (nrow * ncol) by default, or
      C-contiguous (ncol * nrow) if the matrix needs to be transposed.
      
      I also rewrote the matrix sum function so that it takes
      scalars by which each passed matrix is multiplied before the addition.
      The function now computes: B = alpha*A + beta*B
      with alpha and beta as the scalars (both set to 1 if we just want
      B = A + B). Thus, there is now only one iteration over A and B,
      in which A and B are each read once, and B modified once.
      24f96fa8
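The single-pass update B = alpha*A + beta*B described in the commit above can be sketched in plain Python. This is only an illustration under stated assumptions: flat lists stand in for the matrix buffers, and the function name `scaled_sum_inplace` is hypothetical. The actual change lives in alt_gemm_template.c and works on NumPy arrays through the C-API.

```python
def scaled_sum_inplace(alpha, a, beta, b):
    """Compute b[i] = alpha * a[i] + beta * b[i] in place, in a single pass.

    `a` and `b` are flat lists standing in for matrix buffers of equal size.
    Each element of `a` and `b` is read once, and each element of `b` is
    written once.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same number of elements")
    for i, a_i in enumerate(a):
        b[i] = alpha * a_i + beta * b[i]
    return b
```

Calling it with alpha = beta = 1 gives the plain sum B = A + B, which matches the special case mentioned in the commit message.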
  4. 20 Oct, 2016 12 commits
    • Flake8 issue corrected. · a4cc7dbd
      notoraptor committed
      a4cc7dbd
    • It seems everything works well now! · a4eb981d
      notoraptor committed
      Tests passed (see details below), with blas.ldflags empty and the ldflags-based skipping removed from the files:
      tensor/nnet/tests/test_corr.py
      tensor/nnet/tests/test_corr3d.py
      tensor/tests/test_blas.py
      tensor/tests/test_blas_scipy.py
      tensor/tests/test_blas_c.py (28 tests skipped)
      tensor/nnet/tests/test_abstract_conv.py:TestCorrConv2d
      tensor/nnet/tests/test_abstract_conv.py:TestCorrConv3d
      tensor/nnet/tests/test_abstract_conv.py:TestAbstractConvNoOptim
      tensor/nnet/tests/test_abstract_conv.py:TestCpuConv2d (252 tests skipped)
      tensor/nnet/tests/test_abstract_conv.py:TestCpuConv3d (60 tests skipped)
      tensor/nnet/tests/test_abstract_conv.py:TestBilinearUpsampling
      
      
      __
      $ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=7,blas.ldflags= nosetests --verbose theano/tensor/nnet/tests/test_corr.py
      Tests that basic correlations work for odd and even ... ok
      Checks dtype upcast for CorrMM methods. ... ok
      Tests correlation where filter dilation != (1,1) ... ok
      Tests basic correlation in full mode and case where filter ... ok
      test_img_kernel_same_shape (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
      test_infer_shape_forward (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
      test_infer_shape_gradI (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
      test_infer_shape_gradW (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
      Tests scenario where filter_shape[1] != input_shape[1] ... ok
      test_non_contiguous (theano.tensor.nnet.tests.test_corr.TestCorr2D) ... ok
      Tests correlation where the {image,filter}_shape is a Constant tensor. ... ok
      Tests correlation where subsampling != (1,1) ... ok
      Make sure errors are raised when image and kernel are not 4D tensors ... ok
      
      ----------------------------------------------------------------------
      Ran 13 tests in 167.377s
      
      OK
      
      $ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests --verbose theano/tensor/nnet/tests/test_corr3d.py
      Tests that basic correlations work for odd and even ... ok
      Checks dtype upcast for Corr3dMM methods. ... ok
      Tests correlation where filter dilation != (1,1,1) ... ok
      Tests basic correlation in full mode and case where filter ... ok
      test_img_kernel_same_shape (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
      test_infer_shape_forward (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
      test_infer_shape_gradI (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
      test_infer_shape_gradW (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
      Tests scenario where filter_shape[1] != input_shape[1] ... ok
      test_non_contiguous (theano.tensor.nnet.tests.test_corr3d.TestCorr3D) ... ok
      Tests correlation where the {image,filter}_shape is a Constant tensor. ... ok
      Tests correlation where subsampling != (1,1,1) ... ok
      Make sure errors are raised when image and kernel are not 5D tensors ... ok
      
      ----------------------------------------------------------------------
      Ran 13 tests in 687.905s
      
      OK
      $ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/tests/test_blas.py
      ..............................................................................................................
      ----------------------------------------------------------------------
      Ran 110 tests in 69.618s
      
      OK
      $ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/tests/test_blas_scipy.py
      ..............
      ----------------------------------------------------------------------
      Ran 14 tests in 16.113s
      
      OK
      $ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/tests/test_blas_c.py
      ...........S.S.SSSSSSSSSSSSSSSSSSSSSSSS...SS
      ----------------------------------------------------------------------
      Ran 44 tests in 14.716s
      
      OK (SKIP=28)
      $ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCorrConv2d
      ..................................................................................................................................................................................................................................................................................................................................................................................................................
      ----------------------------------------------------------------------
      Ran 402 tests in 589.767s
      
      OK
      $ theano-cache purge && THEANO_FLAGS=optdb.max_use_ratio=10,blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCorrConv3d
      ..................................................................................................
      ----------------------------------------------------------------------
      Ran 98 tests in 302.220s
      
      OK
      $ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestAbstractConvNoOptim
      ....................
      ----------------------------------------------------------------------
      Ran 20 tests in 93.374s
      
      OK
      $ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCpuConv2d
      .......................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.........................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS..
      ----------------------------------------------------------------------
      Ran 402 tests in 137.067s
      
      OK (SKIP=252)
      # 252 SKIPs for the same reason: SKIP: No dilation implementation for basic cpu ConvOp.
      # (test_abstract_conv.py, line 494)
      $ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCpuConv3d
      .................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSS...................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSS..
      ----------------------------------------------------------------------
      Ran 98 tests in 44.181s
      
      OK (SKIP=60)
      #  60 SKIPs for the same reason: SKIP: No dilation implementation for basic cpu Conv3D.
      # (test_abstract_conv.py, line 688)
      $ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestBilinearUpsampling
      .....
      ----------------------------------------------------------------------
      Ran 5 tests in 29.046s
      
      OK
      a4eb981d
    • Git-Rebasing and updating. · bc911254
      notoraptor committed
      bc911254
    • Some light flake8 corrections · d5fb8413
      notoraptor committed
      d5fb8413
    • Added another optimization for [sd]gemm_, · 61980334
      notoraptor committed
      just by skipping the alpha*matrix multiplication when alpha == 1.0.
      
      All tests succeed (with blas.ldflags empty) for:
      * test_abstract_conv.py in theano/tensor/nnet/tests/
      * test_blas.py and test_blas_scipy.py in theano/tensor/tests/
      
      I have modified theano/tensor/tests/test_blas_c.py
      to skip all tests that involve either the gemv or the ger functions.
      * Before the modifications, this file executed 44 tests and 34 were skipped.
      * After  the modifications, this file executes 44 tests and 29 are skipped.
      
      # $ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests --verbose theano/tensor/tests/test_blas_c.py
      
      PS: I also tried to run
      theano/tensor/nnet/tests/test_corr.py after removing the ldflags checks,
      but many of the tests fail (Theano outputs do not match the reference outputs).
      So for now I have left this file as it is, and I will continue investigating tomorrow.
      61980334
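The alpha == 1.0 shortcut from the commit above can be sketched with a small reference gemm in pure Python. This is a hedged illustration, not the code in alt_gemm_template.c, which calls NumPy C-API functions. The function name `gemm_colmajor`, the flat column-major lists and the absence of transposition are all assumptions made for brevity. The zero-size and beta == 0 shortcuts follow what other commit messages in this log describe.

```python
def gemm_colmajor(m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """Compute C = alpha * A @ B + beta * C on column-major flat lists.

    A is m x k, B is k x n, C is m x n; lda, ldb and ldc are the leading
    dimensions. Transposed operands are not handled in this sketch.
    """
    if m * n == 0:
        return c  # empty C: nothing is computed, C is left unmodified
    for j in range(n):
        for i in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i + p * lda] * b[p + j * ldb]
            if alpha != 1.0:
                acc *= alpha  # scaling is skipped when alpha == 1.0
            if beta == 0.0:
                c[i + j * ldc] = acc  # the previous content of C is never read
            else:
                c[i + j * ldc] = acc + beta * c[i + j * ldc]
    return c
```

Because C is not read when beta == 0, the output buffer may hold arbitrary values (even NaN) before the call without affecting the result.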
    • Required corrections and modifications are done. · 9ca9474b
      notoraptor committed
      Reminder: the code is tested with:
      
      $ theano-cache purge && THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCorrConv2d
      
      NB:
      1) dgemm_ is never called in these tests. Only sgemm_ is called.
      2) LDA, LDB and LDC are always in the set {M, N, K} during these tests.
      9ca9474b
    • flake8 errors fixed · 6b3afd89
      notoraptor committed
      6b3afd89
    • Added some simplifications. · 39a2b2d2
      notoraptor committed
      39a2b2d2
    • I added an implementation of C-functions "sgemm_" and "dgemm_" that call Numpy… · 0cd7aa7b
      notoraptor committed
      I added an implementation of the C functions "sgemm_" and "dgemm_" that call NumPy C-API functions to perform the matrix product when BLAS is explicitly disabled (with the Theano flag "blas.ldflags" set to empty).
      
      This can be tested with:
      THEANO_FLAGS=blas.ldflags= nosetests theano/tensor/nnet/tests/test_abstract_conv.py:TestCorrConv2d
      0cd7aa7b
    • Merge pull request #5122 from gvtulder/f-faster-conv3d-tests · 96be471e
      Frédéric Bastien committed
      Somewhat smaller/faster 3d convolution tests
      96be471e
    • Gijs van Tulder's avatar
      cb0e84e0
    • Make test_corr3d tests smaller/faster. · 10caddc2
      Gijs van Tulder committed
      10caddc2
  5. 19 Oct, 2016 2 commits
  6. 18 Oct, 2016 6 commits
  7. 17 Oct, 2016 5 commits
  8. 16 Oct, 2016 1 commit
  9. 15 Oct, 2016 11 commits