theano/tensor · 24f96fa8fad9b7f25b6b593da1c72d287a237f58 · testgroup / pytensor

Commit 24f96fa8, committed on October 21, 2016

New update! I have integrated the required changes and
re-run the tests. All tests pass as before, and some tests
are now faster. The biggest gain on my computer is for
theano/tensor/nnet/tests/test_corr3d.py, which goes from
687 seconds before to 259 seconds now. For other tests, the
gain is between 3 and 20 seconds.

There is now neither a copy nor a memory allocation
(apart from NumPy wrapper structures) when BETA == 0.
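
The BETA == 0 case can be illustrated with a small Python sketch. It assumes BETA is the usual GEMM scalar in C = alpha*A*B + beta*C (the commit does not spell this out), and `gemm_inplace` is a hypothetical name, not a function from this repository. When beta is zero, the old contents of C are never read, so the product can be written straight into C's buffer.

```python
import numpy as np

def gemm_inplace(alpha, A, B, beta, C):
    """Hypothetical sketch: C <- alpha * (A @ B) + beta * C, written into C."""
    if beta == 0:
        # The old contents of C are irrelevant: write the product directly
        # into C's existing buffer. No copy of C is made.
        np.dot(A, B, out=C)  # out must be C-contiguous with the right dtype
        C *= alpha
    else:
        # General case: the old C contributes to the result.
        # Note that this branch does allocate a temporary for the product.
        C *= beta
        C += alpha * np.dot(A, B)
    return C
```

Because the beta == 0 branch never reads C, it also behaves sensibly when C holds uninitialized or NaN values.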

I rewrote the OP(matrix) function so that it no longer returns
newly allocated data. Instead, it just creates a
PyArrayObject wrapper around the matrix pointer with the right
format: F-contiguous (nrow * ncol) by default, or
C-contiguous (ncol * nrow) if the matrix needs to be transposed.
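
The wrapping idea above can be sketched in Python with NumPy views. This is only an illustration of the layout trick, not the C code from this commit: `wrap_matrix` is a hypothetical helper, and it assumes the matrix arrives as a flat column-major buffer of nrow * ncol elements.

```python
import numpy as np

def wrap_matrix(buffer, nrow, ncol, transposed=False):
    """Hypothetical helper: view a flat column-major buffer as a 2-D array.

    No data is copied; only a new array header is created around the
    same memory.
    """
    if transposed:
        # Reading the same memory in C order with swapped dimensions
        # yields the transpose of the column-major matrix.
        return buffer.reshape((ncol, nrow), order="C")
    # Default: interpret the buffer as an F-contiguous (nrow, ncol) matrix.
    return buffer.reshape((nrow, ncol), order="F")
```

For example, with `data = np.arange(6.0)`, `wrap_matrix(data, 2, 3)` and `wrap_matrix(data, 2, 3, transposed=True)` both share memory with `data`, and the second is the transpose of the first.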

I also rewrote the matrix sum function so that it takes a
scalar to multiply each passed matrix by before the addition.
The function now computes: B = alpha*A + beta*B
with alpha and beta as the scalars (both set to 1 if we just want
B = A + B). Thus, there is now only one iteration over A and B,
in which A and B are each read once and B is modified once.
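
A minimal Python sketch of this single-pass scaled sum follows. The name `scaled_sum_inplace` is hypothetical and the explicit loop only mirrors the access pattern described above (each element of A and B read once, each element of B written once); the actual implementation in this commit is in C.

```python
import numpy as np

def scaled_sum_inplace(A, B, alpha=1.0, beta=1.0):
    """Hypothetical sketch: B <- alpha * A + beta * B in a single pass."""
    if A.shape != B.shape:
        raise ValueError("A and B must have the same shape")
    for idx in np.ndindex(*A.shape):
        # One read of A, one read of B, one write to B per element.
        B[idx] = alpha * A[idx] + beta * B[idx]
    return B
```

With the default scalars (alpha = beta = 1) this reduces to B = A + B.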


Name
nnet
signal
tests
__init__.py
alt_gemm_common.c
alt_gemm_template.c
basic.py
blas.py
blas_c.py
blas_headers.py
blas_scipy.py
elemwise.py
elemwise_cgen.py
extra_ops.py
fft.py
fourier.py
inplace.py
io.py
nlinalg.py
opt.py
opt_uncanonicalize.py
raw_random.py
shared_randomstreams.py
sharedvar.py
slinalg.py
sort.py
subtensor.py
type.py
type_other.py
utils.py
var.py
xlogx.py