This did come up in an AdaGrad implementation at some point, and it will be much faster on a GPU.