    notoraptor committed · 24f96fa8
    New update! I have integrated the required changes and
    re-ran the tests. All tests pass as before, and some
    tests are now faster. The biggest gain on my computer is
    for theano/tensor/nnet/tests/test_corr3d.py, which goes from
    687 seconds before to 259 seconds now. For the other tests,
    the gain is between 3 and 20 seconds.
    
    There is now neither a copy nor a memory allocation
    (apart from NumPy wrapping structures) when BETA == 0.
    
    I rewrote the OP(matrix) function so that it no longer returns
    newly allocated data. Instead, it just creates a
    PyArrayObject wrapper around the matrix pointer with the right
    format: F-contiguous (nrow * ncol) by default, or
    C-contiguous (ncol * nrow) if the matrix needs to be transposed.
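
As an illustration of why a wrapper is enough, here is a minimal sketch in pure Python (not the NumPy C API the commit itself uses). The names `buf`, `view` and `f_index` and the sample values are assumptions made up for this example. It shows that a column-major buffer of shape (nrow, ncol), reinterpreted as a row-major buffer of shape (ncol, nrow), reads as the transposed matrix without copying any element.

```python
from array import array


def f_index(i, j, nrow):
    """Offset of element (i, j) in a column-major (F-contiguous) buffer."""
    return i + j * nrow


nrow, ncol = 2, 3
# Column-major storage of the 2 x 3 matrix [[1, 2, 3], [4, 5, 6]].
buf = array("d", [1, 4, 2, 5, 3, 6])

# F-contiguous reading, shape (nrow, ncol): the matrix itself.
matrix = [[buf[f_index(i, j, nrow)] for j in range(ncol)] for i in range(nrow)]

# C-contiguous view over the same memory, shape (ncol, nrow): the transpose.
# memoryview.cast() needs a byte format on one side, hence the two casts.
view = memoryview(buf).cast("B").cast("d", shape=[ncol, nrow])
transposed = [[view[i, j] for j in range(nrow)] for i in range(ncol)]

print(matrix)
print(transposed)
```

Because `view` shares memory with `buf`, a write to `buf` is visible through `view`, which is the property that lets the transposed case avoid an allocation.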
    
    I also rewrote the matrix sum function so that it takes
    scalars that multiply each passed matrix before the addition.
    The function now computes: B = alpha*A + beta*B
    with alpha and beta as the scalars (both set to 1 if we just want
    B = A + B). Thus, there is now only one iteration over A and B,
    in which A and B are each read once, and B is modified once.
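
The single-pass update described above can be sketched as follows. This is a pure-Python illustration over flat sequences, not the C code from the commit; the function name `scaled_sum_inplace` and the sample values are hypothetical.

```python
def scaled_sum_inplace(alpha, a, beta, b):
    """Compute b = alpha*a + beta*b element-wise, in place, in one pass.

    Each element of a and b is read once and each element of b is
    written once; no temporary buffer is allocated.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same number of elements")
    for i in range(len(b)):
        b[i] = alpha * a[i] + beta * b[i]
    return b


a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]

# alpha = beta = 1 gives the plain sum B = A + B.
scaled_sum_inplace(1.0, a, 1.0, b)
print(b)

# General case: B = 2*A + 0.5*B.
c = [10.0, 20.0, 30.0]
scaled_sum_inplace(2.0, a, 0.5, c)
print(c)
```

Folding the scaling into the addition avoids separate passes for `alpha*A` and `beta*B`, which would each need their own loop and possibly a temporary.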
Name
..
compat
compile
d3viz
gof
gpuarray
misc
sandbox
scalar
scan_module
sparse
tensor
tests
typed_list
__init__.py
configdefaults.py
configparser.py
gradient.py
ifelse.py
printing.py
raise_op.py
updates.py
version.py