Commit 24ef1606
Authored Jan 19, 2010 by Olivier Delalleau

Merged

Parents: d7286bf8, 43cf804a

Showing 15 changed files with 874 additions and 302 deletions.
Changed files:

doc/conf.py (+1, -1)
doc/index.txt (+2, -2)
doc/install.txt (+66, -28)
doc/internal/dev_start_guide.txt (+7, -12)
doc/introduction.txt (+36, -40)
doc/library/tensor/basic.txt (+2, -0)
doc/links.txt (+2, -2)
doc/tutorial/symbolic_graphs.txt (+5, -4)
theano/sandbox/conv.py (+304, -3)
theano/sandbox/scan.py (+0, -11)
theano/sandbox/test_conv.py (+85, -35)
theano/sandbox/test_scan.py (+97, -32)
theano/tensor/nnet.py (+55, -33)
theano/tensor/opt.py (+14, -4)
theano/tensor/tests/test_nnet.py (+198, -95)
doc/conf.py
@@ -168,7 +168,7 @@ latex_font_size = '11pt'
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, document class [howto/manual]).
  latex_documents = [
-   ('contents', 'theano.tex', 'theano Documentation',
+   ('index', 'theano.tex', 'theano Documentation',
     'LISA lab, University of Montreal', 'manual'),
  ]
doc/index.txt
@@ -37,7 +37,7 @@ Roughly in order of what you'll want to check out:
* :ref:`extending` -- Learn to add a Type, Op, or graph optimization.
* :ref:`internal` -- How to maintaining Theano, LISA-specific tips, and more...
- You can download the latest `PDF documentation <http://pylearn.org/theano/theano.pdf>`_, rather than reading it online.
+ You can download the latest `PDF documentation <http://deeplearning.net/theanodoc/theano.pdf>`_, rather than reading it online.
Community
=========
...
...
@@ -46,7 +46,7 @@ Community
* Register and post to `theano-dev`_ if you want to talk to the developers.
- * We try to stay organized with `Theano's Trac <trac/>`__
+ * We try to stay organized with `Theano's Trac <http://trac-hg.assembla.com/theano/report/1>`__
* Come visit us in Montreal! Most of the developers are students in the LISA_ group at the `University of Montreal`_.
...
...
doc/install.txt
@@ -20,7 +20,7 @@ to be installed:
We develop mainly on 64-bit Linux machines. 32-bit architectures are
not well-tested.
- python >= 2.5
+ python >= 2.5 (2.4 should be supported as well)
`numpy <http://numpy.scipy.org/>`_ >= 1.2
Earlier versions have memory leaks.
...
...
@@ -30,6 +30,8 @@ to be installed:
is buggy in 0.6. (scipy.csc_matrix dot has a bug with singleton
dimensions. There may be more bugs.)
A BLAS installation (with Level 3 functionality)
The following libraries and software are optional:
g++, python-dev
...
...
@@ -42,41 +44,49 @@ The following libraries and software are optional:
`mercurial <http://www.selenic.com/mercurial/>`_
To download bleeding-edge version of Theano.
+ .. _install_bleeding_edge:
+ Getting the code
+ -----------------
- Easy install
- ------------
+ If you are a developer of Theano, then check out the :ref:`dev_start_guide` guide.
- The following command will install the latest release of Theano on your system:
+ The following are general instructions that will set you up with the bleeding-edge
+ version of Theano. First, get the code using `mercurial <http://www.selenic.com/mercurial/wiki/>`__:
  .. code-block:: bash
-     easy_install Theano
+     hg clone http://hg.assembla.com/theano Theano
- Manual install
- --------------
+ Configuring PYTHONPATH
+ ---------------------------
+ The subdirectory Theano/theano has to be located in a path
+ mentioned in your PYTHONPATH. In order to do that, you can either
+ create a symbolic link to Theano/theano in a directory already
+ mentioned in your PYTHONPATH environment variable, or modify the
+ PYTHONPATH so that it mentions Theano.
- To install the latest release of Theano from source, visit the `downloads
- <http://pylearn.org/theano/downloads/>`_ page and download the release you
- want. Unpack the release, and type:
+ To create a symbolic link:
  .. code-block:: bash
-     python setup.py build
-     python setup.py test
-     python setup.py install
+     ln -s Theano/theano <someplace on your PYTHONPATH>/theano
- .. _install_bleeding_edge:
+ To modify the environment variable PYTHONPATH in bash, you may do this:
- Bleeding Edge
- --------------
+ .. code-block:: bash
- Feeling lucky and want to run bleeding-edge code?
- Then check out the :ref:`dev_start_guide` guide.
+     export PYTHONPATH=<path to Theano's parent dir>/Theano:$PYTHONPATH
+ In csh:
- Configuring the environment
- ---------------------------
+ .. code-block:: csh
+     setenv PYTHONPATH <path to Theano's parent dir>/Theano:$PYTHONPATH
+ Configuring Theano's environmental variables
+ ---------------------------------------------
Two environment variables are used to control automatic code
generation. It is possible to use Theano in a way which avoids all
...
...
@@ -118,6 +128,33 @@ automatic code generation, but that way is much, much slower.
Omitting this variable defaults the mode to ``'FAST_RUN'``.
Testing your installation
---------------------------
Once you have completed these steps, you should run the theano test suite like this:
.. code-block:: bash
cd Theano
nosetests #execute all the tests
All tests should pass. If some test fails on your machine, you are
encouraged to tell us what went wrong on the ``theano-users@googlegroups.com``
mailing list.
Updating
-------------
To update your library to the latest revision, change directory (``cd``)
to your ``Theano`` folder and execute the following command:
.. code-block:: bash
hg pull -u
You should update frequently, bugs are fixed on a very regular basis.
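The PYTHONPATH requirement described in this file (the ``Theano`` checkout directory, not its parent, must be on the path) can be checked from Python before running the tests. The following is only a sketch under stated assumptions: it targets a modern Python 3 interpreter rather than the Python 2.5 named in these instructions, ``locate_package`` and ``demo`` are hypothetical helper names, and a throwaway directory stands in for a real checkout.

```python
import os
import tempfile
from importlib.machinery import PathFinder


def locate_package(name, search_dirs):
    """Return the directory `name` would be imported from, or None.

    `search_dirs` plays the role of the PYTHONPATH entries (hypothetical helper).
    """
    spec = PathFinder.find_spec(name, list(search_dirs))
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


def demo():
    """Build a fake Theano/theano layout and look it up from two path entries."""
    with tempfile.TemporaryDirectory() as parent:
        checkout = os.path.join(parent, "Theano")
        package = os.path.join(checkout, "theano")
        os.makedirs(package)
        with open(os.path.join(package, "__init__.py"), "w"):
            pass
        # Putting the checkout directory on the path makes the package visible.
        found = locate_package("theano", [checkout])
        # Putting only the parent directory on the path does not.
        missed = locate_package("theano", [parent])
        return (found is not None and os.path.samefile(found, package), missed)


print(demo())
```

Nothing is imported here; the lookup only reports where an import would resolve, so the check is safe to run on a machine without Theano installed.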
Mac
---
...
...
@@ -126,20 +163,21 @@ Mac
  - .. code-block:: bash
-       $ sudo port install gcc42 py25-zlib py25-numpy py25-scipy mercurial
+       $ sudo port install gcc44 py25-zlib py25-numpy py25-scipy mercurial
-   Note that compiling gcc42 takes a significant time (hours) so it is probably
+   Note that compiling gcc takes a significant time (hours) so it is probably
    not the best solution if you are in a rush! It may happen that SciPy
    fails to compile the first time and still compiles just fine on a second
    try. Same thing with py25-zlib.
- - Install some kind of BLAS library (TODO: how?)
+ - scipy depends on ATLAS (a BLAS library), which will be installed by MacPorts.
  - Set ``THEANO_BLAS_LDFLAGS`` to something which will link against said BLAS
    library. E.g., ``THEANO_BLAS_LDFLAGS='-lcblas -latlas -lgfortran'``.
- This advice has not been tested recently, so please inform us of your results.
+ These installation instructions have not tested recently, please infom us of your results!
+ We would be especially interested in dependencies that we missed listing, as well as tests
+ that fail on your platform (use the ``theano-users@googlegroups.com`` mailing list).
Windows
...
...
@@ -247,9 +285,9 @@ Generating the documentation
----------------------------
  You can read the latest HTML documentation `here
- <http://pylearn.org/theano/contents.html>`__.
+ <http://deeplearning.net/theanodoc>`__.
  You can download the latest PDF documentation `here
- <http://pylearn.org/theano/theano.pdf>`__.
+ <http://deeplearning.net/theanodoc/theano.pdf>`__.
We recommend you look at the documentation on the website, since it
will be more current than the documentation included with the package.
...
...
doc/internal/dev_start_guide.txt
@@ -21,11 +21,10 @@ Developer Start Guide
Accounts
========
- To obtain developer access: send an email to an admin with an username and
- temporary password. Pending approval, this will give you access to both the
- repository and Trac. You should then change your password in the
- `<http://pylearn.org/theano/prefs preferences>` tab - do *NOT* use a good
- password! We are using plain text http which is not secure.
+ To obtain developer access: register with `Assembla
+ <http://www.assembla.com/>`_ and add yourself as a watcher on the `Theano space
+ <http://www.assembla.com/spaces/theano>`_. Then send an email to an admin asking
+ to be promoted to a member of the project.
Theano code
...
...
@@ -34,10 +33,9 @@ Theano code
*To get the source via mercurial,* you must have `mercurial
<http://www.selenic.com/mercurial/wiki/>`__ installed.
- The code that makes up Theano is in a single repository available in
- `<http://pylearn.org/hg/Theano>`__.
- As a developer, you should clone this repository like this:
+ The code that makes up Theano is in a `single repository
+ <http://www.assembla.com/spaces/theano/trac_mercurial_tool>`__. As a developer,
+ you should clone this repository like this:
.. code-block:: bash
...
...
@@ -121,9 +119,6 @@ to your ``Theano`` folder and execute the following command:
hg pull -u
- You may also download the latest source directly as a gzip'd tar file:
- `<http://pylearn.org/hg/Theano/archive/tip.tar.gz>`__.
Nightly test
============
...
...
doc/introduction.txt
@@ -5,43 +5,40 @@
Theano at a Glance
==================
- Theano is a Python library that allows you to define, optimize, and evaluate
- mathematical expressions involving multi-dimensional arrays. Using Theano it is
+ Theano is a Python library that lets you to define, optimize, and evaluate
+ mathematical expressions, especially ones with multi-dimensional arrays
+ (numpy.ndarray). Using Theano it is
  possible to attain speeds rivaling hand-crafted C implementations for problems
  involving large amounts of data. It can also surpass C on a CPU by many orders
  of magnitude by taking advantage of recent GPUs.
- Theano melds some aspects of a computer algebra system (CAS) with
- aspects of an optimizing compiler. It can even transform some or all
- of the mathematical expression into C code and compile it into native
- machine instructions. This combination of CAS with optimizing
- compilation is particularly useful for tasks in which complicated
- mathematical expressions are evaluated repeatedly and evaluation speed
- is critical.
- Theano supports a range of numerical types in multiple dimensions and
- a number of well-tested operations. It also allows you to compute the
- gradient of an expression with respect to another. Symbolic
- expressions may be compiled into functions, which work on the same
- data structures as numpy_, allowing for easy interoperability.
+ Theano combines aspects of a computer algebra system (CAS) with aspects of an
+ optimizing compiler. It can also generate customized C code for many
+ mathematical operations. This combination of CAS with optimizing compilation
+ is particularly useful for tasks in which complicated mathematical expressions
+ are evaluated repeatedly and evaluation speed is critical. For situations
+ where many different expressions are each evaluated once Theano can minimize
+ the amount of compilation/analysis overhead, but still provide symbolic
+ features such as automatic differentiation.
  Theano's compiler applies many optimizations of varying complexity to
  these symbolic expressions. These optimizations include, but are not
  limited to:
  * use of GPU for computations
  * constant folding
- * merging of similar subgraphs, to avoid calculating the same values
-   more than once
- * arithmetic simplification (``x*y/x -> y``)
- * inserting efficient BLAS_ operations
- * using inplace operations wherever it is safe to do so.
- Theano defines several optimizations which improve the numerical
- stability of computations.
- Theano was written at the LISA_ lab to support the development of
- efficient machine learning algorithms while minimizing human time. We
- use it especially in gradient-based learning techniques. Theano is
+ * merging of similar subgraphs, to avoid redundant calculation
+ * arithmetic simplification (e.g. ``x*y/x -> y``, ``--x -> x``)
+ * inserting efficient BLAS_ operations (e.g. ``GEMM``) in a variety of
+   contexts
+ * using memory aliasing to avoid calculation
+ * using inplace operations wherever it does not interfere with aliasing
+ * loop fusion for elementwise sub-expressions
+ * improvements to numerical stability (e.g. :math:`\log(1+\exp(x))` and :math:`\log(\sum_i \exp(x[i]))`)
+ * for a complete list, see :ref:`_optimizations`
+ Theano was written at the LISA_ lab to support rapid development of
+ efficient machine learning algorithms. Theano is
named after the `Greek mathematician`_, who may have been Pythagoras'
wife. Theano is released under a BSD license (:ref:`link <license>`).
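The two expressions named under numerical stability in this hunk, ``log(1+exp(x))`` and ``log(sum_i exp(x[i]))``, overflow when evaluated literally for large inputs. The sketch below shows one standard way to rewrite them by hand. It is not taken from Theano's optimizer, and the function names are hypothetical; it only illustrates the kind of rewrite such an optimization has to perform.

```python
import math


def softplus_naive(x):
    """log(1 + exp(x)) evaluated literally; exp(x) overflows for large x."""
    return math.log(1.0 + math.exp(x))


def softplus_stable(x):
    """Same value via the identity log(1 + exp(x)) = max(x, 0) + log1p(exp(-|x|))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def logsumexp_stable(xs):
    """log(sum_i exp(x_i)), shifting by the maximum so no exponent is positive."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


if __name__ == "__main__":
    print(softplus_stable(1000.0))
    print(logsumexp_stable([1000.0, 1000.0]))
    try:
        softplus_naive(1000.0)
    except OverflowError as err:
        print("naive form failed:", err)
```

Both stable forms only ever exponentiate non-positive numbers, which is why they stay finite where the literal form fails.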
...
...
@@ -92,30 +89,28 @@ machine instructions.
What does it do that they don't?
================================
- Theano is a python library and optimizing compiler for manipulating
+ Theano is a Python library and optimizing compiler for manipulating
  and evaluating expressions, especially matrix-valued
  ones. Manipulation of matrices is typically done using the numpy
  package, so what does Theano do that Python and numpy do not?
- - *execution speed optimizations*: Theano can use `g++` to compile
-   parts your expression graph into native machine code, which runs
-   much faster than python.
+ - *execution speed optimizations*: Theano can use `g++` or `nvcc` to compile
+   parts your expression graph into CPU or GPU instructions, which run
+   much faster than pure Python.
  - *symbolic differentiation*: Theano can automatic build symbolic graphs
    for computing gradients.
- - *stability optimizations*: Theano can recognize numerically unstable
+ - *stability optimizations*: Theano can recognize [some] numerically unstable
    expressions and compute them with more stable algorithms.
- There exist another symbolic package in Python, namely sympy_. Theano
- is different from sympy in the sense that while Theano allows symbolic
- manipulation it puts more emphasis on the evaluation of these expressions
- and being able to repeatedly evaluate them on many different inputs. Theano
- is also better suited to handling large tensors which have no
- assumed structures.
+ The closest Python package to Theano is sympy_.
+ Theano focuses more on tensor expressions than Sympy, and has more machinery
+ for compilation. Sympy has more sophisticated algebra rules and can
+ handle a wider variety of mathematical operations (such as series, limits, and integrals).
  If numpy_ is to be compared to MATLAB_ and sympy_ to Mathematica_,
- Theano is a sort of hybrid of the two which tries to make the best of
+ Theano is a sort of hybrid of the two which tries to combine the best of
  both worlds.
...
...
@@ -134,7 +129,8 @@ Getting started
the :ref:`tutorial` first though.
- A PDF version of the online documentation may be found `here <theano.pdf>`_.
+ A PDF version of the online documentation may be found `here
+ <http://deeplearning.net/theanodoc/theano.pdf>`_.
Contact us
...
...
doc/library/tensor/basic.txt
@@ -331,6 +331,8 @@ Indexing
Basic indexing.
Mirrors numpy's `basic indexing <http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html>`_. Read that page first.
Advanced indexing.
.. _libdoc_tensor_elementwise:
...
...
doc/links.txt
@@ -40,10 +40,10 @@ This is a sort of memo for developers and would-be developers.
.. _mercurial: http://www.selenic.com/mercurial/wiki/
.. _nosetests: http://somethingaboutorange.com/mrl/projects/nose/
.. _numpy: http://numpy.scipy.org/
- .. _python: http://www.python.or
+ .. _python: http://www.python.org
  .. _scipy: http://scipy.org/
- .. _autodiff: http://autodiff.org
+ .. _autodiff: http://www.autodiff.org
.. _boost.python: http://www.boost.org/doc/libs/1_38_0/libs/python/doc/index.html
.. _cython: http://www.cython.org/
.. _liboil: http://liboil.freedesktop.org/wiki/
...
...
doc/tutorial/symbolic_graphs.txt
@@ -41,9 +41,10 @@ details about these building blocks see :ref:`variable`, :ref:`op`,
.. figure:: apply.png
:align: center
Arrows represent references to the Python objects pointed at. The blue
box is an :ref:`apply` node. Red boxes are :ref:`variable` nodes. Green
circles are :ref:`Ops <op>`. Purple boxes are :ref:`Types <type>`.
Arrows represent references to the Python objects pointed at. The blue
box is an :ref:`apply` node. Red boxes are :ref:`variable` nodes. Green
circles are :ref:`Ops <op>`. Purple boxes are :ref:`Types <type>`.
The graph can be traversed starting from outputs (the result of some
...
...
@@ -104,7 +105,7 @@ how to compute the gradient of the node's outputs with respect to its
inputs. Note that if an :ref:`op` does not provide this information,
it is assumed that the gradient does not defined.
Using the
- `chain rule <http://en.wikipedia.org/wiki/Chain_rile>`_
+ `chain rule <http://en.wikipedia.org/wiki/Chain_rule>`_
these gradients can be composed in order to obtain the expression of the
gradient of the graph's output with respect to the graph's inputs .
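The composition of per-node gradients described in this passage can be illustrated with a very small numeric sketch. It is not Theano's implementation: Theano builds symbolic gradient graphs, whereas this toy version stores numeric local derivatives, and the names ``Var``, ``square``, ``sine``, ``mul`` and ``grad`` are all hypothetical.

```python
import math


class Var:
    """A value plus, for each input, the local derivative of this node w.r.t. it."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = tuple(parents)  # pairs (parent_var, d_self / d_parent)


def square(x):
    return Var(x.value ** 2, [(x, 2.0 * x.value)])


def sine(x):
    return Var(math.sin(x.value), [(x, math.cos(x.value))])


def mul(a, b):
    return Var(a.value * b.value, [(a, b.value), (b, a.value)])


def grad(output, wrt):
    """d output / d wrt: sum over inputs of (local derivative) * (input's gradient)."""
    if output is wrt:
        return 1.0
    # A leaf other than `wrt` has no parents, so the sum is 0.
    return sum(local * grad(parent, wrt) for parent, local in output.parents)


if __name__ == "__main__":
    x = Var(1.5)
    y = mul(sine(square(x)), x)  # y = sin(x**2) * x
    print(grad(y, x))
```

Each node only knows the derivative of its own output with respect to its own inputs; the recursion in ``grad`` is what applies the chain rule across the whole graph.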
...
...
theano/sandbox/conv.py
@@ -29,9 +29,10 @@ class ConvOp(Op):
#TODO: make the stacksize its own parameter, and make imshp a pair
-     def __init__(self, imshp, kshp, nkern, bsize, dx, dy, output_mode='valid', unroll_batch=4, unroll_kern=4,
+     def __init__(self, imshp=None, kshp=None, nkern=None, bsize=None, dx=None, dy=None, output_mode='valid', unroll_batch=0, unroll_kern=0, unroll_patch=False,
              imshp_logical=None, kshp_logical=None, kshp_logical_top_aligned=True,
...
...
@@ -47,6 +48,7 @@ class ConvOp(Op):
dx - patch stride rows
dy - patch stride cols
out_mode - 'valid', 'full'
unroll_patch - c code generation option
unroll_batch - c code generation option
unroll_kern - c code generation option
verbose - passed to GpuConv
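The relationship between image shape, kernel shape, the strides ``dx``/``dy`` and the output mode is spelled out in a commented-out formula inside the generated C code of this commit (``ceil((im + ker - 1) / stride)`` for ``'full'``, ``ceil((im - ker + 1) / stride)`` for ``'valid'``). A small sketch of that formula, assuming 2D shapes given as ``(rows, cols)``; ``conv_out_shape`` is a hypothetical helper, not a function from the commit:

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)


def conv_out_shape(im_shape, ker_shape, strides, mode):
    """Output (rows, cols) of a strided 2D convolution.

    Follows the formula kept as a comment in the C template:
    'full' uses im + ker - 1, 'valid' uses im - ker + 1, each divided by
    the stride and rounded up.
    """
    if mode == "full":
        unstrided = [i + k - 1 for i, k in zip(im_shape, ker_shape)]
    elif mode == "valid":
        unstrided = [i - k + 1 for i, k in zip(im_shape, ker_shape)]
    else:
        raise ValueError("invalid mode, only full and valid are supported")
    return tuple(ceil_div(n, s) for n, s in zip(unstrided, strides))


if __name__ == "__main__":
    print(conv_out_shape((48, 48), (5, 5), (1, 1), "valid"))
    print(conv_out_shape((48, 48), (5, 5), (2, 2), "full"))
```

In the commit itself the output shape is computed once in Python and substituted into the template as ``self_outshp0`` and ``self_outshp1``, which is why the C version of the formula survives only as a comment.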
...
...
@@ -60,6 +62,7 @@ class ConvOp(Op):
gradient on the filters.
unroll_patch. If True will use a version that is faster then without not unroll by unroll the patch loop.
unroll_batch. If >0 will use a version that will unroll the batch loop by the value of the option. By default don't use this version of the code.
unroll_nkern. idem as unroll_batch but unroll the kernel loop.
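The idea behind ``unroll_patch`` (computing several adjacent output positions in one pass over the kernel, falling back to 2 or 1 near the edge) can be sketched in one dimension. This is a simplified illustration only: it uses plain correlation with stride 1 in valid mode, whereas the C code in this commit handles flipped kernels, strides and full mode, and both function names are hypothetical.

```python
def correlate_valid(signal, kernel):
    """Reference version: one output position per pass over the kernel."""
    n_out = len(signal) - len(kernel) + 1
    return [sum(w * signal[i + k] for k, w in enumerate(kernel))
            for i in range(n_out)]


def correlate_valid_unrolled(signal, kernel):
    """Same result, but up to 4 adjacent outputs share one pass over the kernel."""
    n_out = len(signal) - len(kernel) + 1
    out = [0] * max(n_out, 0)
    i = 0
    while i < n_out:
        # Pick the widest group of outputs that still fits: 4, then 2, then 1.
        if i + 4 <= n_out:
            width = 4
        elif i + 2 <= n_out:
            width = 2
        else:
            width = 1
        sums = [0] * width
        for k, w in enumerate(kernel):
            # In generated C this inner loop would be written out by hand
            # (sum, sum2, sum3, sum4), so each kernel value is loaded once.
            for u in range(width):
                sums[u] += w * signal[i + k + u]
        out[i:i + width] = sums
        i += width
    return out


if __name__ == "__main__":
    data = list(range(1, 12))
    print(correlate_valid_unrolled(data, [1, -2, 3]) == correlate_valid(data, [1, -2, 3]))
```

The benefit in compiled code comes from reusing each kernel value for several outputs; in Python the sketch shows only the control flow, not a speedup.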
...
...
@@ -95,6 +98,7 @@ class ConvOp(Op):
          self.unroll_batch=unroll_batch
          self.unroll_kern=unroll_kern
+         self.unroll_patch=unroll_patch
          if self.unroll_batch>0 and self.bsize % self.unroll_batch!=0:
              if self.bsize<=self.unroll_batch:
...
...
@@ -407,6 +411,7 @@ using namespace std;
        d["self_imshp0"]=self.imshp[0]
        d["self_imshp1"]=self.imshp[1]
        d["self_imshp2"]=self.imshp[2]
        d["mode"]=self.out_mode.upper()
        d["self_kshp0"]=self.kshp[0]
        d["self_kshp1"]=self.kshp[1]
        d["self_kshp_logical_r"] = self.kshp_logical[0]
...
...
@@ -439,8 +444,12 @@ using namespace std;
#print self.out_mode, d["self_imshp_logical_stride_r"]
        if self.imshp != self.imshp_logical or self.kshp != self.kshp_logical:
            # print "return imshp!=imshp_logical or self.kshp != self.kshp_logical shape version"
            return _conv_op_code_a % d
        if self.unroll_patch:
            # print "return unroll patch version",self.dx,self.dy
            return _conv_op_code_unroll_patch%d
        if self.unroll_batch>0 or self.unroll_kern>0:
            if self.unroll_batch<=0: self.unroll_batch=1
            if self.unroll_kern<=0: self.unroll_kern=1
...
...
@@ -1212,3 +1221,295 @@ Py_XDECREF(img2d);
Py_XDECREF(filtersflipped);
"""
    return ret
_conv_op_code_unroll_patch = """
const int mode=%(mode)s;
int typenum=0, typenum_f=0;
PyArrayObject *ain1=NULL, *ain2=NULL, *filtersflipped_arr=NULL, *img2d_arr=NULL;
const %(type)s fill_value = 0;
int type_im=PyArray_TYPE(%(img2d)s);
int type_ker=PyArray_TYPE(%(filtersflipped)s);
npy_intp dim_zz[2]={%(self_outshp0)s,%(self_outshp1)s};
npy_intp dim_im[2]={%(self_imshp1)s,%(self_imshp2)s};
npy_intp dim_ker[2]={%(self_kshp0)s,%(self_kshp1)s};
PyArray_Dims img2d_shape;
npy_intp img2d_dim[4]={1,1,0,0};
img2d_shape.ptr=img2d_dim;
img2d_shape.len=4;
PyArray_Dims kerns_shape;
npy_intp kerns_dim[4]={1,1,0,0};
kerns_shape.ptr=kerns_dim;
kerns_shape.len=4;
PyObject *img2d=NULL, *contig, *filtersflipped=NULL;
if(%(img2d)s->nd==2){
  img2d_dim[3]=%(img2d)s->dimensions[1];
  img2d_dim[2]=%(img2d)s->dimensions[0];
}else if(%(img2d)s->nd==3){
  img2d_dim[3]=%(img2d)s->dimensions[2];
  img2d_dim[2]=%(img2d)s->dimensions[1];
  img2d_dim[0]=%(img2d)s->dimensions[0];
}else if(%(img2d)s->nd==4){
  img2d_dim[3]=%(img2d)s->dimensions[3];
  img2d_dim[2]=%(img2d)s->dimensions[2];
  img2d_dim[1]=%(img2d)s->dimensions[1];
  img2d_dim[0]=%(img2d)s->dimensions[0];
}else {
  PyErr_SetString(PyExc_ValueError, "img don't have a good shape");
  %(fail)s;
}
if(%(filtersflipped)s->nd==3){
  kerns_dim[3]=%(filtersflipped)s->dimensions[2];
  kerns_dim[2]=%(filtersflipped)s->dimensions[1];
  kerns_dim[0]=%(filtersflipped)s->dimensions[0];
}else if(%(filtersflipped)s->nd==4){
  kerns_dim[3]=%(filtersflipped)s->dimensions[3];
  kerns_dim[2]=%(filtersflipped)s->dimensions[2];
  kerns_dim[1]=%(filtersflipped)s->dimensions[1];
  kerns_dim[0]=%(filtersflipped)s->dimensions[0];
}else{
  std:stringstream temp;
  temp << "nddim="<<%(filtersflipped)s->nd;
  std::string param = temp.str();
  PyErr_SetString(PyExc_ValueError,
    ("kernel don't have a good shape. " + param).c_str());
  %(fail)s;
}
img2d = PyArray_Newshape(%(img2d)s,&img2d_shape, PyArray_CORDER);
img2d_arr = (PyArrayObject*)img2d;
if ((img2d_arr->strides[3] != sizeof(%(type)s))
     || (img2d_arr->strides[2] != img2d_arr->dimensions[3]*sizeof(%(type)s))){
  contig = (PyObject*)(PyArray_GETCONTIGUOUS((PyArrayObject*)img2d));
  Py_DECREF(img2d);
  img2d = contig;
  if (!PyArray_ISCONTIGUOUS(img2d)){
    PyErr_SetString(PyExc_ValueError, "img2d isn't contiguous");
    %(fail)s;
  }
}
img2d_arr = (PyArrayObject*)img2d;
filtersflipped = PyArray_Newshape(%(filtersflipped)s,&kerns_shape, PyArray_CORDER);
filtersflipped_arr = (PyArrayObject*)filtersflipped;
if ((filtersflipped_arr->strides[3] != sizeof(%(type)s))
     || (filtersflipped_arr->strides[2] != filtersflipped_arr->dimensions[3]*sizeof(%(type)s))){
  contig = (PyObject*)(PyArray_GETCONTIGUOUS((PyArrayObject*)filtersflipped));
  Py_DECREF(filtersflipped);
  filtersflipped = contig;
  if (!PyArray_ISCONTIGUOUS(filtersflipped)){
    PyErr_SetString(PyExc_ValueError, "filtersflipped isn't contiguous");
    %(fail)s;
  }
}
filtersflipped_arr = (PyArrayObject*)filtersflipped;
if(mode != VALID && mode != FULL){
  PyErr_SetString(PyExc_ValueError, "invalid mode, only full and valid are supported");
  %(fail)s;
}
typenum = PyArray_ObjectType((PyObject*)%(img2d)s, 0);
typenum_f = PyArray_ObjectType((PyObject*)%(filtersflipped)s, 0);
if (typenum < 0) {PyErr_SetString(PyExc_ValueError, "Invalid type"); %(fail)s;}
if (typenum != typenum_f) {PyErr_SetString(PyExc_ValueError, "Input types must match"); %(fail)s;}
if (!img2d) %(fail)s;
if (!filtersflipped) %(fail)s;
if ((!%(z)s)
  || *PyArray_DIMS(%(z)s)!=4
  ||(%(z)s->dimensions[0] != %(self_bsize)s)
  ||(%(z)s->dimensions[1] != %(self_nkern)s)
  ||(%(z)s->dimensions[2] != dim_zz[0])
  || (%(z)s->dimensions[3] != dim_zz[1])
  )
{
  if (%(z)s) Py_DECREF(%(z)s);
  npy_intp dims[4] = {0,0,0,0};
  if(!dims) %(fail)s;
  dims[0]=%(self_bsize)s;
  dims[1]=%(self_nkern)s;
  dims[2]=dim_zz[0];
  dims[3]=dim_zz[1];
  %(z)s = (PyArrayObject*) PyArray_ZEROS(4, dims, typenum,0);
}else{
  //PyArray_FILLWBYTE((PyObject*)%(z)s,0);
}
int Os[2];
Os[0]=%(self_outshp0)s;
Os[1]=%(self_outshp1)s;
//I keep the formula to calculte Os in case we need it in the futur.
//if (mode == FULL) {Os[0] = (int)ceil((dim_im[0]+dim_ker[0]-1)/float(%(self_dx)s)); Os[1] = ceil((dim_im[1]+dim_ker[1]-1)/float(%(self_dy)s));}
//else {Os[0] = (int)ceil((dim_im[0]-dim_ker[0]+1)/float(%(self_dx)s)); Os[1] = (int)ceil((dim_im[1]-dim_ker[1]+1)/float(%(self_dy)s));}
for(int b=0;b<%(self_bsize)s;b++){
  for(int n_kern=0;n_kern<%(self_nkern)s;n_kern++){
    //assertions
    if (%(z)s->strides[0] != %(z)s->dimensions[1] *%(z)s->dimensions[2] *%(z)s->dimensions[3] * sizeof(%(type)s)) %(fail)s;
    if (%(z)s->strides[1] != %(z)s->dimensions[2] * %(z)s->dimensions[3] * sizeof(%(type)s)) %(fail)s;
    if (%(z)s->strides[2] != %(z)s->dimensions[3] * sizeof(%(type)s)) %(fail)s;
    if (%(z)s->strides[3] != sizeof(%(type)s)) %(fail)s;
    %(type)s * __restrict__ out=(%(type)s *)(PyArray_GETPTR2(%(z)s,b,n_kern));
    for (int i = 0; i < dim_zz[0]*dim_zz[1]; ++i) out[i] = 0;
    for(int stack_size=0;stack_size<%(self_imshp0)s;stack_size++){
      const %(type)s * __restrict__ in=(%(type)s *)(PyArray_GETPTR2(img2d,b,stack_size));
      const %(type)s * __restrict__ hvals=(%(type)s *)(PyArray_GETPTR2(filtersflipped,n_kern,stack_size));
      int new_m;
      for (int iter_m=0; iter_m < Os[0]; iter_m++) {
        // Reposition index into input image based on requested output size
        int pos_m = iter_m*%(self_dx)s;//The position of the patch in the image
        if (mode == FULL) new_m = pos_m ;
        else new_m = (pos_m+dim_ker[0]-1);
        for (int iter_n=0; iter_n < Os[1]; iter_n++) { // loop over columns
          int pos_n=iter_n*%(self_dy)s;
          %(type)s sum=0;
          %(type)s sum2=0;
          %(type)s sum3=0;
          %(type)s sum4=0;
          int nb_sum=0;
          // Sum over kernel, if index into image is out of bounds
          // fill with the value
          for (int j=0; j < dim_ker[0]; j++) {
            int ind0 = (new_m-j);
            if(mode==FULL){
              const %(type)s * idx_hvals=&hvals[j*dim_ker[1]];
              if(ind0 < 0 || ind0 >= dim_im[0]){
                if(fill_value!=0)
                  for (int k=0; k < dim_ker[1]; k++) {
                    sum+= idx_hvals[k] * fill_value;
                  }
              }else{
                //do the part where kernel is to the right of the img
                //TODO: implement unroll patch for fill_value!=0
                int k=0,max_k=max((int)(pos_n-dim_im[1])+1,0);
                if(fill_value!=0){
                  for(k=0;k<max_k;k++){
                    sum+= idx_hvals[k]*fill_value;
                  }
                }else {k=max_k;}
                //do the part where the kernel is on the img
                max_k=min(pos_n+1,(int)dim_ker[1]);
                const %(type)s * idx_in=&in[ind0*dim_im[1]];
                if(iter_n + 4*%(self_dy)s < Os[1]
                   && iter_n>dim_ker[1]-1+3
                   && iter_n<dim_im[1]-dim_ker[1]+1-3){
                  nb_sum=4;
                  //cout<<4<<endl;
                  for (int ind1=pos_n-k; k<max_k; k++,ind1--) {
                    sum+=idx_hvals[k]*idx_in[ind1];
                    sum2+=idx_hvals[k]*idx_in[ind1+%(self_dy)s];
                    sum3+=idx_hvals[k]*idx_in[ind1+2*%(self_dy)s];
                    sum4+=idx_hvals[k]*idx_in[ind1+3*%(self_dy)s];
                  }
                }else if(iter_n + 2*%(self_dy)s < Os[1]
                         && iter_n>dim_ker[1]-1
                         && iter_n<dim_im[1]-dim_ker[1]+1){
                  //cout<<2<<endl;
                  nb_sum=2;
                  // if(iter_n==dim_ker[1]-1){//k-1<min(pos_n+%(self_dy)s,(int)dim_ker[1])){
                  //   sum2+=idx_hvals[k-1]*idx_in[pos_n-k-%(self_dy)s];
                  // }
                  for (int ind1=pos_n-k; k<max_k; k++,ind1--) {
                    sum+=idx_hvals[k]*idx_in[ind1];
                    sum2+=idx_hvals[k]*idx_in[ind1+%(self_dy)s];
                  }
                  // sum2+=idx_hvals[k]*idx_in[pos_n-k+%(self_dy)s];
                  // sum+=idx_hvals[k]*idx_in[pos_n-k];
                  // k++;
                }else{
                  //cout<<1<<endl;
                  nb_sum=1;
                  /*
                  %(type)s sum_=0;
                  if((k-max_k) & 0x1 != 0){
                    sum+= idx_hvals[k] * idx_in[pos_n-k];
                  }
                  for (int ind1=pos_n-k; k<max_k; k+=2,ind1-=2) {
                    sum+= idx_hvals[k] * idx_in[ind1];
                    sum_+= idx_hvals[k+1] * idx_in[ind1-1];
                  }
                  sum+=sum_;
                  */
                  for (int ind1=pos_n-k; k<max_k; k++,ind1--) {
                    sum+=idx_hvals[k]*idx_in[ind1];
                  }
                }
                //do the part to the left of the img
                if(fill_value!=0)
                  for(;k<dim_ker[1];k++) sum+= idx_hvals[k]*fill_value;
              }
            }else{//valid mode
              const %(type)s* idx_in=&in[ind0*dim_im[1]];
              const %(type)s* idx_hvals=&hvals[j*dim_ker[1]];
              if(iter_n + 4*%(self_dy)s < Os[1]){
                nb_sum=4;
                for (int k=dim_ker[1]-1,im_idx=pos_n; k >=0; k--,im_idx++) {
                  sum+=idx_hvals[k]*idx_in[im_idx];
                  sum2+=idx_hvals[k]*idx_in[im_idx+%(self_dy)s];
                  sum3+=idx_hvals[k]*idx_in[im_idx+2*%(self_dy)s];
                  sum4+=idx_hvals[k]*idx_in[im_idx+3*%(self_dy)s];
                }
              }else if(iter_n + 2*%(self_dy)s < Os[1]){
                nb_sum=2;
                for (int k=dim_ker[1]-1,im_idx=pos_n; k >=0; k--,im_idx++) {
                  sum+=idx_hvals[k]*idx_in[im_idx];
                  sum2+=idx_hvals[k]*idx_in[im_idx+%(self_dy)s];
                }
              }else{
                nb_sum=1;
                for (int k=dim_ker[1]-1,im_idx=pos_n; k >=0; k--,im_idx++) {
                  sum+=idx_hvals[k]*idx_in[im_idx];
                }
              }
            }//else valid mode
          }//for j
          switch(nb_sum){
          case 4: out[iter_m*dim_zz[1]+iter_n+3] %(affectation)s sum4;
          case 3: out[iter_m*dim_zz[1]+iter_n+2] %(affectation)s sum3;
          case 2: out[iter_m*dim_zz[1]+iter_n+1] %(affectation)s sum2;
          case 1: out[iter_m*dim_zz[1]+iter_n] %(affectation)s sum;
          }
          iter_n+=nb_sum-1;
          /*
          out[iter_m*dim_zz[1]+iter_n] %(affectation)s sum;
          if(nb_sum>=2){
            iter_n++;
            out[iter_m*dim_zz[1]+iter_n] %(affectation)s sum2;
          }
          if(nb_sum>=3){
            iter_n++;
            out[iter_m*dim_zz[1]+iter_n] %(affectation)s sum3;
          }
          if(nb_sum>=4){
            iter_n++;
            out[iter_m*dim_zz[1]+iter_n] %(affectation)s sum4;
          }
          */
        }//for iter_n
      }//for iter_m
    }//for stack_size
    if (0 && (mode==FULL)){
      for (int i = 0; i < dim_zz[0]*dim_zz[1]; ++i)
        std::cout << " " << out[i];
      std::cout << "\\n";
    }
  }//for n_kern
}//for b
Py_XDECREF(img2d);
Py_XDECREF(filtersflipped);
"""
theano/sandbox/scan.py
@@ -62,17 +62,6 @@ def scan(fn, sequences, initial_states, non_sequences, inplace_map={},
# compute number of sequences and number of seqs
      n_seqs = len(seqs)
-     # see if there are outputs that do not feed anything back to the function
-     # applied recursively
-     #outs_tapkeys = outputs_taps.keys()
-     #outs_tapkeys.sort()
-     #for k in outs_tapkeys:
-     # if outputs_taps[k] == []:
-     # # add empty lists where you have outputs that do not have past
-     # # values
-     # init_outs = init_outs[:k] + [[]] + init_outs[k:]
      n_outs = len(init_outs)
...
...
theano/sandbox/test_conv.py
@@ -41,7 +41,7 @@ def flip(kern, kshp):
  global_rng = N.random.RandomState(3423489)
  dmatrix4=T.TensorType('float64', (False, False, False, False))
- def exec_multilayer_conv_nnet(conv_mode, ss, bsize, imshp, kshps, nkerns, unroll_batch=0, unroll_kern=0, img=T.dmatrix(), validate=True, conv_op_py=False, do_convolve2=False, do_print=True, repeat=1):
+ def exec_multilayer_conv_nnet(conv_mode, ss, bsize, imshp, kshps, nkerns, unroll_batch=0, unroll_kern=0, img=T.dmatrix(), validate=True, conv_op_py=False, do_convolve2=False, do_print=True, repeat=1, unroll_patch=0):
      # build actual input images
      imgval = global_rng.rand(bsize, imshp[0], imshp[1], imshp[2])
...
...
@@ -121,7 +121,7 @@ def exec_multilayer_conv_nnet(conv_mode, ss, bsize, imshp, kshps, nkerns, unroll
          hidval1=outval.copy()
          # ConvOp
-         conv_op = ConvOp(imshp, kshp, nkern, bsize, ss[0],ss[1], conv_mode, unroll_batch=unroll_batch, unroll_kern=unroll_kern)(inputs4, kerns4)
+         conv_op = ConvOp(imshp, kshp, nkern, bsize, ss[0],ss[1], conv_mode, unroll_batch=unroll_batch, unroll_kern=unroll_kern, unroll_patch=unroll_patch)(inputs4, kerns4)
          l1shp=N.hstack((nkern, getFilterOutShp(imshp, kshp, ss, conv_mode)))
          propup2 = function([inputs4, kerns4], conv_op)
...
...
@@ -328,7 +328,7 @@ class TestConvOp(unittest.TestCase):
          ssizess = [[(1,1),(1,2)],[(1,1),(2,2)]]
          convmodes = ['valid','full']
          do_convolve2=True
-         unroll = [(0,0),(1,1),(2,2),(3,2)]#(batch,kern)
+         unroll = [(0,0,False),(0,0,True),(1,1,False),(2,2,False),(3,2,False)]#(batch,kern,patch)
          do_speed_test = False
# TODO: this version show a bug that was fixed
...
...
@@ -338,6 +338,11 @@ class TestConvOp(unittest.TestCase):
# nkerns = [2,2] # per output pixel
# ssizes = [(1,1),(2,2)]#2,2)]
# bsizes = [1,1] # batch size
# imshp_starts = [(1,10,10),(1,5,6)]
# kshpss = ([[2,3],[3,2]],[[2,2],[2,2]])
# nkernss = [[1,1],[1,1]] # per output pixel
        N.set_printoptions(threshold=N.nan)
# symbolic stuff
...
...
@@ -356,8 +361,8 @@ class TestConvOp(unittest.TestCase):
          unroll_batch = [1,2,4,5,10,20]
          unroll_kern = [1,2,4,5,10,20]
-         unroll_batch = [1,2,5]
-         unroll_kern = [1,2,5]
+         unroll_batch = [1,4,5]
+         unroll_kern = [1,4,5]
          bsize = 20 # batch size
          imshp_start = (1,48,48)#un square shape to test more corner case.
...
@@ -374,46 +379,86 @@ class TestConvOp(unittest.TestCase):
        timing = N.zeros((len(unroll_batch),len(unroll_kern),3))
        t_b_k=[]
        #calculate the timing with unrolling
        t_=[[ 7.60572791, 3.95069814, 3.74271464],[ 4.05631089, 2.90384555, 2.93613672],[ 3.90551591, 2.92595196, 3.00102282]]
        best=[]
        worst=[]
        best=[0.52690219879150391, 2.4266397953033447]
        worst=[0.92042708396911621, 6.8822150230407715]
        t_=[]
        for unroll_b, n_b in zip(unroll_batch,range(len(unroll_batch))):
            for unroll_k, n_k in zip(unroll_kern,range(len(unroll_kern))):
                t_b_k.append(str(unroll_b)+"/"+str(unroll_k))
                tctot, tpytot, ntot=[],[],[]
                for conv_mode, n_mode in zip(convmodes,range(len(convmodes))):
                    for ss, n_ss in zip(ssizes,range(len(ssizes))):
                        tctot_, tpytot_, ntot_ = exec_multilayer_conv_nnet(conv_mode, ss, bsize, imshp_start, kshps, nkerns, unroll_batch=unroll_b, unroll_kern=unroll_k, validate=validate)
                        tctot+=[tctot_]
                        tpytot+=[tpytot_]
                        ntot+=[ntot_]
                timing[n_b,n_k]=[sum(tctot), sum(tpytot), sum(ntot)]
                if not t_:
                    tctot, tpytot, ntot=[],[],[]
                    for conv_mode, n_mode in zip(convmodes,range(len(convmodes))):
                        for ss, n_ss in zip(ssizes,range(len(ssizes))):
                            tctot_, tpytot_, ntot_ = exec_multilayer_conv_nnet(conv_mode, ss, bsize, imshp_start, kshps, nkerns, unroll_batch=unroll_b, unroll_kern=unroll_k, validate=validate)
                            tctot+=[tctot_]
                            tpytot+=[tpytot_]
                            ntot+=[ntot_]
                    if unroll_b==4 and unroll_k==4:
                        print "unroll 4/4",tctot
                        best=tctot
                    if unroll_b==1 and unroll_k==1:
                        print "unroll 1/1",tctot
                        worst=tctot
                    timing[n_b,n_k]=[sum(tctot), sum(tpytot), sum(ntot)]
        if not t_:
            t=timing[:,:,0]#We select only the c timing.
        else:
            t=t_
        t=N.asarray(t)
        #calculate the old timing
        tctot,tpytot,ntot=0,0,0
        for conv_mode, n_mode in zip(convmodes,range(len(convmodes))):
            for ss, n_ss in zip(ssizes,range(len(ssizes))):
                tctot_, tpytot_, ntot_ = exec_multilayer_conv_nnet(conv_mode, ss, bsize, imshp_start, kshps, nkerns, unroll_batch=0, unroll_kern=0, validate=
validate
)
tctot
+=
tctot_
tpytot
+=
tpytot_
ntot
+=
ntot_
print
"old code timing
%.3
fs"
%
tctot
# print timing
t
=
timing
[:,:,
0
]
#We select only the c timing.
tctot_
=
[
0.52555489540100098
,
6.6634182929992676
]
# tctot_=[]
tctot
,
tpytot
,
ntot
=
[],[],[]
if
not
tctot_
:
for
conv_mode
,
n_mode
in
zip
(
convmodes
,
range
(
len
(
convmodes
))):
for
ss
,
n_ss
in
zip
(
ssizes
,
range
(
len
(
ssizes
))):
tctot_
,
tpytot_
,
ntot_
=
exec_multilayer_conv_nnet
(
conv_mode
,
ss
,
bsize
,
imshp_start
,
kshps
,
nkerns
,
unroll_batch
=
0
,
unroll_kern
=
0
,
validate
=
validate
)
tctot
+=
[
tctot_
]
tpytot
+=
[
tpytot_
]
ntot
+=
[
ntot_
]
else
:
tctot
=
N
.
asarray
(
tctot_
)
print
"old code timing
%.3
fs"
%
sum
(
tctot
),
tctot
best
=
N
.
asarray
(
best
)
worst
=
N
.
asarray
(
worst
)
print
"timing for unrolled version"
print
t_b_k
print
t
print
"max
%.3
fs"
%
t
.
max
(),
"max param(batch unloop size/kernel unloop size)"
,
t_b_k
[
t
.
argmax
()]
print
"min
%.3
fs"
%
t
.
min
(),
"min param(batch unloop size/kernel unloop size)"
,
t_b_k
[
t
.
argmin
()]
print
"speedup vs (1/1)
%.3
fx, vs old
%.3
fx"
%
(
t
.
max
()
/
t
.
min
(),
tctot
/
t
.
min
())
print
"speedup vs (1/1)
%.3
fx, vs old
%.3
fx"
%
(
t
.
max
()
/
t
.
min
(),
sum
(
tctot
)
/
t
.
min
())
print
worst
/
best
,
tctot
/
best
tctot_patch
=
[]
for
conv_mode
,
n_mode
in
zip
(
convmodes
,
range
(
len
(
convmodes
))):
for
ss
,
n_ss
in
zip
(
ssizes
,
range
(
len
(
ssizes
))):
tctot_
,
tpytot_
,
ntot_
=
exec_multilayer_conv_nnet
(
conv_mode
,
ss
,
bsize
,
imshp_start
,
kshps
,
nkerns
,
unroll_batch
=
0
,
unroll_kern
=
0
,
validate
=
validate
,
unroll_patch
=
2
)
tctot_patch
+=
[
tctot_
]
t_patch
=
sum
(
tctot_patch
)
print
"unroll_patch time"
,
tctot_patch
print
"speedup vs (1/1)
%.3
fx, vs old
%.3
fx"
%
(
t
.
max
()
/
t_patch
,
sum
(
tctot
)
/
t_patch
)
print
best
/
tctot_patch
,
worst
/
tctot_patch
print
best
print
worst
print
tctot
print
tctot_patch
return
for
i
in
range
(
len
(
kshpss
)):
for
conv_mode
,
n_mode
in
zip
(
convmodes
,
range
(
len
(
convmodes
))):
for
ss
,
n_ss
in
zip
(
ssizess
[
i
],
range
(
len
(
ssizess
[
i
]))):
for
un_b
,
un_k
in
unroll
:
for
un_b
,
un_k
,
un_p
in
unroll
:
tctot_
,
tpytot_
,
ntot_
=
exec_multilayer_conv_nnet
(
conv_mode
,
ss
,
bsizes
[
i
],
imshp_starts
[
i
],
kshpss
[
i
],
nkernss
[
i
],
img
=
img
,
unroll_batch
=
un_b
,
unroll_kern
=
un_k
,
unroll_patch
=
un_p
,
validate
=
True
)
tctot
+=
[
tctot_
]
tpytot
+=
[
tpytot_
]
...
...
@@ -428,6 +473,11 @@ class TestConvOp(unittest.TestCase):
d
=
N
.
asarray
(
ntot
)
/
tpytot
print
'speed up py theano(ConvOp) vs convolve2d:
%.3
fx'
%
d
.
mean
(),
d
def
init_data
(
self
,
shape
):
return
N
.
ones
(
shape
)
return
N
.
random
.
random
(
shape
)
def
test_ConvOpGrad
(
self
):
"""
test the gradient in float and double
...
...
@@ -442,9 +492,9 @@ class TestConvOp(unittest.TestCase):
kshps
=
[(
2
,
3
)]
imshps
=
[(
2
,
3
,
4
)]
modes
=
[
'valid'
,
'full'
]
unroll
=
[(
0
,
0
),(
1
,
1
),(
2
,
3
)]
unroll
=
[(
0
,
0
,
True
),(
1
,
1
,
False
),(
2
,
3
,
False
),(
1
,
1
,
False
),(
0
,
0
,
False
)]
#(batch,kern,patch)
ssizes
=
[(
1
,
1
),(
2
,
2
)]
for
typ
in
types
:
imgs
=
T
.
TensorType
(
typ
,
(
False
,
False
,
False
,
False
),
'imgs'
)
kerns
=
T
.
TensorType
(
typ
,
(
False
,
False
,
False
,
False
),
'kerns'
)
...
...
@@ -457,12 +507,12 @@ class TestConvOp(unittest.TestCase):
imgvals
=
N
.
array
(
N
.
random
.
random
(
N
.
hstack
((
bsize
,
imshp
))),
dtype
=
imgs
.
dtype
)
for
kshp
in
kshps
:
t
=
numpy
.
array
([
imshp
[
1
]
-
kshp
[
0
],
imshp
[
2
]
-
kshp
[
1
]])
kernvals
=
N
.
array
(
N
.
random
.
rand
(
nkern
,
visdim
,
kshp
[
0
],
kshp
[
1
]
),
dtype
=
kerns
.
dtype
)
kernvals
=
N
.
array
(
self
.
init_data
(
(
nkern
,
visdim
,
kshp
[
0
],
kshp
[
1
])
),
dtype
=
kerns
.
dtype
)
# 'full' mode should support kernels bigger than the input
if
mode
==
'valid'
and
(
t
<
0
)
.
any
():
continue
for
un_b
,
un_k
in
unroll
:
for
un_b
,
un_k
,
un_p
in
unroll
:
for
ss
in
ssizes
:
print
'test_ConvOpGrad'
print
'mode type:'
,
mode
,
typ
...
...
@@ -476,14 +526,14 @@ class TestConvOp(unittest.TestCase):
def
test_i
(
imgs
):
convop
=
ConvOp
(
imshp
,
kshp
,
nkern
,
bsize
,
ss
[
0
],
ss
[
1
],
output_mode
=
mode
,
unroll_batch
=
un_b
,
unroll_kern
=
un_k
)
output_mode
=
mode
,
unroll_batch
=
un_b
,
unroll_kern
=
un_k
,
unroll_patch
=
un_p
)
return
convop
(
imgs
,
kernvals
)
def
test_k
(
kerns
):
convop
=
ConvOp
(
imshp
,
kshp
,
nkern
,
bsize
,
ss
[
0
],
ss
[
1
],
output_mode
=
mode
,
unroll_batch
=
un_b
,
unroll_kern
=
un_k
)
output_mode
=
mode
,
unroll_batch
=
un_b
,
unroll_kern
=
un_k
,
unroll_patch
=
un_p
)
return
convop
(
imgvals
,
kerns
)
print
mode
,
imshp
,
kshp
,
un_b
,
un_k
,
ss
#TODO the tolerance needed to pass is very high for float32(0.17). Is this acceptable? Expected?
tol
=
None
if
typ
==
"float32"
:
...
...
theano/sandbox/test_scan.py
View file @
24ef1606
from
scan
import
Scan
import
unittest
import
theano
import
theano.sandbox.scan
import
random
import
numpy.random
...
...
@@ -74,6 +75,14 @@ def verify_grad(op, pt, n_tests=2, rng=None, eps = None, tol = None,
def
compareArrays
(
a
,
b
):
if
type
(
a
)
in
(
list
,
tuple
):
a
=
numpy
.
array
(
a
)
if
type
(
b
)
in
(
list
,
tuple
):
b
=
numpy
.
array
(
b
)
return
numpy
.
all
(
abs
(
a
-
b
)
<
1e-5
)
...
...
@@ -85,7 +94,7 @@ class T_Scan(unittest.TestCase):
# generator network, only one output , type scalar ; no sequence or
# non sequence arguments
def
test_1
():
def
test_1
(
self
):
def
f_pow2
(
x_tm1
):
return
(
2
*
x_tm1
,
{})
...
...
@@ -94,11 +103,12 @@ class T_Scan(unittest.TestCase):
Y
=
theano
.
sandbox
.
scan
.
scan
(
f_pow2
,
[],
s
,
[],
n_steps
=
n_steps
)
f1
=
theano
.
function
([
s
,
n_steps
],
Y
)
assert
(
numpy
.
any
(
f1
([
1
],
3
)
==
[
2
,
4
,
8
])
)
assert
(
compareArrays
(
f1
([
1
],
3
),
[
2
,
4
,
8
]))
# simple rnn, one input, one state, weights for each; input/state are
# vectors, weights are scalars
def
test_2
():
def
test_2
(
self
):
def
f_rnn
(
u_t
,
x_tm1
,
W_in
,
W
):
return
(
u_t
*
W_in
+
x_tm1
*
W
,
{})
...
...
@@ -109,14 +119,15 @@ class T_Scan(unittest.TestCase):
Y
=
theano
.
sandbox
.
scan
.
scan
(
f_rnn
,
u
,
x0
,[
W_in
,
W
])
f2
=
theano
.
function
([
u
,
x0
,
W_in
,
W
],
Y
)
assert
(
numpy
.
any
(
f2
([
1
,
2
,
3
,
4
],[
1
],
.
1
,
1
)
==
\
numpy
.
array
([
1.1
,
1.3
,
1.6
,
2.
])))
f2
=
theano
.
function
([
u
,
x0
,
W_in
,
W
],
Y
)
v_u
=
numpy
.
array
([
1.
,
2.
,
3.
,
4.
])
v_x0
=
numpy
.
array
([
1
])
v_out
=
numpy
.
array
([
1.1
,
1.3
,
1.6
,
2.
])
assert
(
compareArrays
(
f2
(
v_u
,
v_x0
,
.
1
,
1
),
v_out
)
)
# simple rnn, one input, one state, weights for each; input/state are
# vectors, weights are scalars; using shared variables
def
test_3
():
def
test_3
(
self
):
u
=
theano
.
tensor
.
dvector
()
x0
=
theano
.
tensor
.
dvector
()
...
...
@@ -128,14 +139,16 @@ class T_Scan(unittest.TestCase):
Y
=
theano
.
sandbox
.
scan
.
scan
(
f_rnn_shared
,
u
,
x0
,[])
f3
=
theano
.
function
([
u
,
x0
],
Y
)
assert
(
numpy
.
any
(
f3
([
1
,
2
,
3
,
4
],[
1
])
==
numpy
.
array
([
1.1
,
1.3
,
1.6
,
2.
])))
f3
=
theano
.
function
([
u
,
x0
],
Y
)
v_u
=
numpy
.
array
([
1.
,
2.
,
3.
,
4.
])
v_x0
=
numpy
.
array
([
1.
])
v_out
=
numpy
.
array
([
1.1
,
1.3
,
1.6
,
2.
])
assert
(
compareArrays
(
f3
(
v_u
,
v_x0
),
v_out
))
# some rnn with multiple outputs and multiple inputs; other dimension
# instead of scalars/vectors
def
test_4
():
def
test_4
(
self
):
W_in2
=
theano
.
shared
(
numpy
.
array
([
1.
,
2.
]),
name
=
'win2'
)
W
=
theano
.
shared
(
numpy
.
array
([[
2.
,
1.
],[
1.
,
1.
]]),
name
=
'w'
)
...
...
@@ -152,20 +165,22 @@ class T_Scan(unittest.TestCase):
Y
=
theano
.
sandbox
.
scan
.
scan
(
f_rnn_cmpl
,[
u1
,
u2
],[
x0
,
y0
],
W_in1
)
f4
=
theano
.
function
([
u1
,
u2
,
x0
,
y0
,
W_in1
],
Y
)
(
x
,
y
)
=
f4
(
numpy
.
array
([[
1
,
2
],[
1
,
2
],[
1
,
2
]]),
\
numpy
.
array
([
1
,
2
,
3
]),
\
numpy
.
array
([[
0
,
0
]]),
\
numpy
.
array
([
1
]),
\
numpy
.
array
([[
1
,
1
],[
1
,
1
]]))
assert
(
numpy
.
all
(
x
==
numpy
.
array
([[
4.
,
5.
],[
18.
,
16.
],[
58.
,
43.
]])))
assert
(
numpy
.
all
(
y
==
numpy
.
array
([
0.
,
7.
,
25.
])))
f4
=
theano
.
function
([
u1
,
u2
,
x0
,
y0
,
W_in1
],
Y
)
v_u1
=
numpy
.
array
([[
1.
,
2.
],[
1.
,
2.
],[
1.
,
2.
]])
v_u2
=
numpy
.
array
([
1.
,
2.
,
3.
])
v_x0
=
numpy
.
array
([[
0.
,
0.
]])
v_y0
=
numpy
.
array
([
1
])
v_Win1
=
numpy
.
array
([[
1.
,
1.
],[
1.
,
1.
]])
v_x
=
numpy
.
array
([[
4.
,
5.
],[
18.
,
16.
],[
58.
,
43.
]])
v_y
=
numpy
.
array
([
0.
,
7.
,
25.
])
(
x
,
y
)
=
f4
(
v_u1
,
v_u2
,
v_x0
,
v_y0
,
v_Win1
)
assert
(
compareArrays
(
x
,
v_x
))
assert
(
compareArrays
(
y
,
v_y
))
# basic ESN using updates
def
test_5
():
def
test_5
(
self
):
W_in
=
theano
.
shared
(
numpy
.
array
([
1.
,
1.
]),
name
=
'win'
)
W
=
theano
.
shared
(
numpy
.
array
([[
.
1
,
0.
],[
.
0
,
.
1
]]),
name
=
'w'
)
W_out
=
theano
.
shared
(
numpy
.
array
([
.
5
,
1.
]),
name
=
'wout'
)
...
...
@@ -180,12 +195,15 @@ class T_Scan(unittest.TestCase):
Y
=
theano
.
sandbox
.
scan
.
scan
(
f_ESN
,
u
,
y0
,[],
outputs_taps
=
{
0
:[]})
f5
=
theano
.
function
([
u
,
y0
],
Y
)
assert
(
f5
(
numpy
.
array
([
1
,
2
,
3
]),
numpy
.
array
([
0
]))
==
\
numpy
.
array
([
0.
,
1.4
,
3.15
]))
f5
=
theano
.
function
([
u
,
y0
],
Y
)
v_u
=
numpy
.
array
([
1.
,
2.
,
3.
])
v_y0
=
numpy
.
array
([
0.
])
v_out
=
numpy
.
array
([
0.
,
1.5
,
3.15
])
out
=
f5
(
v_u
,
v_y0
)
assert
(
compareArrays
(
v_out
,
out
))
# basic ESN using updates ; moving backwards
def
test_6
():
def
test_6
(
self
):
W_in
=
theano
.
shared
(
numpy
.
array
([
1.
,
1.
]),
name
=
'win'
)
W
=
theano
.
shared
(
numpy
.
array
([[
.
1
,
0.
],[
.
0
,
.
1
]]),
name
=
'w'
)
W_out
=
theano
.
shared
(
numpy
.
array
([
.
5
,
1.
]),
name
=
'wout'
)
...
...
@@ -201,9 +219,55 @@ class T_Scan(unittest.TestCase):
Y
=
theano
.
sandbox
.
scan
.
scan
(
f_ESN
,
u
,
y0
,[],
outputs_taps
=
{
0
:[]},
\
go_backwards
=
True
)
f6
=
theano
.
function
([
u
,
y0
],
Y
)
assert
(
f6
(
numpy
.
array
([
1
,
2
,
3
]),
numpy
.
array
([
0
]))
==
\
numpy
.
array
([
0.
,
4.5
,
3.45
]))
f6
=
theano
.
function
([
u
,
y0
],
Y
)
v_u
=
numpy
.
array
([
1.
,
2.
,
3.
])
v_y0
=
numpy
.
array
([
0
])
v_out
=
numpy
.
array
([
0.
,
4.5
,
3.45
])
out
=
f6
(
v_u
,
v_y0
)
assert
(
compareArrays
(
out
,
v_out
))
# simple rnn, one input, one state, weights for each; input/state are
# vectors, weights are scalars; using shared variables and past
# taps (sequences and outputs)
def
test_7
(
self
):
u
=
theano
.
tensor
.
dvector
()
x0
=
theano
.
tensor
.
dvector
()
W_in
=
theano
.
shared
(
.
1
,
name
=
'w_in'
)
W
=
theano
.
shared
(
1.
,
name
=
'w'
)
def
f_rnn_shared
(
u_tm2
,
x_tm1
,
x_tm2
):
return
(
u_tm2
*
W_in
+
x_tm1
*
W
+
x_tm2
,
{})
Y
=
theano
.
sandbox
.
scan
.
scan
(
f_rnn_shared
,
u
,
x0
,
[],
\
sequences_taps
=
{
0
:[
-
2
]},
outputs_taps
=
{
0
:[
-
1
,
-
2
]})
f7
=
theano
.
function
([
u
,
x0
],
Y
)
#print f7([1,2,3,4],[1,2])
# simple rnn, one input, one state, weights for each; input/state are
# vectors, weights are scalars; using shared variables and past
# taps (sequences and outputs) and future taps for sequences
def
test_8
(
self
):
u
=
theano
.
tensor
.
dvector
()
x0
=
theano
.
tensor
.
dvector
()
W_in
=
theano
.
shared
(
.
1
,
name
=
'w_in'
)
W
=
theano
.
shared
(
1.
,
name
=
'w'
)
def
f_rnn_shared
(
u_tm2
,
u_tp2
,
x_tm1
,
x_tm2
):
return
((
u_tm2
+
u_tp2
)
*
W_in
+
x_tm1
*
W
+
x_tm2
,
{})
Y
=
theano
.
sandbox
.
scan
.
scan
(
f_rnn_shared
,
u
,
x0
,
[],
\
sequences_taps
=
{
0
:[
-
2
,
2
]},
outputs_taps
=
{
0
:[
-
1
,
-
2
]})
f8
=
theano
.
function
([
u
,
x0
],
Y
)
#print f8([1,2,3,4,5,6],[1,2])
'''
...
...
@@ -214,7 +278,8 @@ class T_Scan(unittest.TestCase):
- test gradient (go_backwards)
- test gradient (multiple outputs / some uncomputable )
- test gradient (truncate_gradient)
- test gradient (force_gradient)
- test gradient (force_gradient)
- test gradient (taps past/future)
- test inplace map
'''
...
...
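The recurrence exercised by `test_2` and `test_3` above can be checked by hand. The following is a minimal plain-Python sketch, not the `theano.sandbox.scan` implementation: `scan_rnn` and `compare_arrays` are hypothetical names introduced here for illustration. It reproduces the expected values `[1.1, 1.3, 1.6, 2.]` and shows the kind of tolerance-based comparison that `compareArrays` performs, assuming list inputs rather than NumPy arrays.

```python
def scan_rnn(u, x0, w_in, w):
    """Plain-Python reference for x_t = u_t * w_in + x_{t-1} * w."""
    outputs = []
    x_prev = x0
    for u_t in u:
        x_prev = u_t * w_in + x_prev * w
        outputs.append(x_prev)
    return outputs


def compare_arrays(a, b, tol=1e-5):
    """True only when every pair of elements differs by less than tol."""
    return len(a) == len(b) and all(abs(x - y) < tol for x, y in zip(a, b))


# Same inputs as test_2: u = [1, 2, 3, 4], x0 = 1, W_in = 0.1, W = 1.
result = scan_rnn([1.0, 2.0, 3.0, 4.0], 1.0, 0.1, 1.0)
expected = [1.1, 1.3, 1.6, 2.0]

matches_expected = compare_arrays(result, expected)
# A sequence that agrees only in its first element must be rejected.
matches_partial = compare_arrays(result, [1.1, 0.0, 0.0, 0.0])
```

Requiring all elements to agree within a tolerance is stricter than an "any element equals" check, which passes as soon as one element matches, and it is less brittle than exact floating-point equality.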
theano/tensor/nnet.py
View file @
24ef1606
...
...
@@ -1020,13 +1020,18 @@ def local_advanced_indexing_crossentropy_onehot_grad(node):
# / softmax(x)
# which arises from the gradient of log(softmax(x))[arange(y.shape[0]), y]
#
# TODO: explain variants of case 1.
# TODO: explain other variants of case 2.
# In some cases, in case 2., instead of "-1. like (AdvancedSubtensor...)",
# we can have "-1. like ([-1] * AdvancedSubtensor...)". This case will be
# recognized too, but other variants, even with the same shape, might not
# (yet).
# The base cases are realized when the gradient of the
# cost wrt the output is equal to 1. When this gradient
# has another (scalar) value, it typically appears in the
# second argument of AdvancedIncSubtensor. In that case, we
# try to extract it, and feed it as the output gradient of
# crossentropy_softmax_1hot_with_bias_dx.
#
# N.B. Regarding clients -- This substitution is important for numerical stability, so we
# perform the substitution even when intermediate values have multiple clients.
...
...
@@ -1052,43 +1057,60 @@ def local_advanced_indexing_crossentropy_onehot_grad(node):
else
:
return
# Check that incr has the form -1./sm[arange(len(y)), y]
# In the base case (output gradient = 1), incr is -1./sm[arange(len(y)), y]
# Here, we are looking for the AdvancedSubtensor term (sm[arange(len(y)), y]),
# the remainder of the expression will be used to compute outgrad_factor
# outgrad_factor will be constructed in 3 steps as follows:
# outgrad_factor = +/- 1 (initial sign)
# outgrad_factor *= numerator
# outgrad_factor /= denominator
adv_subtensor
=
None
outgrad_factor
=
1.
# If there's a 'minus' sign before the whole expression, put it in
# outgrad_factor and iterate
if
incr
.
owner
and
incr
.
owner
.
op
==
tensor
.
neg
:
outgrad_factor
=
-
1.
incr
=
incr
.
owner
.
inputs
[
0
]
if
incr
.
owner
and
incr
.
owner
.
op
==
tensor
.
true_div
:
num
,
denom
=
incr
.
owner
.
inputs
if
not
(
hasattr
(
num
,
'data'
)
and
numpy
.
all
(
num
.
data
==
-
1
)):
# set outgrad_factor according to the numerator,
# it may be divided later
if
hasattr
(
num
,
'data'
)
and
numpy
.
all
(
num
.
data
==
-
1
):
# Base case, num is -1
outgrad_factor
*=
1.
elif
numpy
.
all
(
num
.
broadcastable
):
# Otherwise, it should be a scalar
outgrad_factor
*=
-
num
else
:
return
#else: OK
if
not
denom
.
owner
:
return
adv_subtensor
=
None
if
isinstance
(
denom
.
owner
.
op
,
tensor
.
AdvancedSubtensor
):
# Base case
adv_subtensor
=
denom
mult_factor
=
1
outgrad_factor
/=
1.
elif
denom
.
owner
.
op
==
tensor
.
mul
:
# Try to find the AdvancedSubtensor node mentioned above
# For now, we support only the case where the other inputs
# of the "mul" node are of integer type, so we are sure it
# does not affect the gradient computation.
# Try to find the AdvancedSubtensor node mentioned above,
# and a scalar that is equal to the output gradient
for
i
,
input
in
enumerate
(
denom
.
owner
.
inputs
):
if
input
.
owner
and
isinstance
(
input
.
owner
.
op
,
tensor
.
AdvancedSubtensor
):
adv_subtensor
=
input
other_inputs
=
[
in_
for
(
j
,
in_
)
in
enumerate
(
denom
.
owner
.
inputs
)
if
j
!=
i
]
if
len
(
other_inputs
)
==
1
:
mult_factor
=
other_inputs
[
0
]
rest
=
other_inputs
[
0
]
else
:
mult_factor
=
tensor
.
mul
(
*
[
other_inputs
])
rest
=
tensor
.
mul
(
*
[
other_inputs
])
# Check that
mult_factor is of integer type
if
mult_factor
.
dtype
.
startswith
(
'int'
)
\
or
mult_factor
.
dtype
.
startswith
(
'uint'
):
#OK
# Check that
rest is a scalar
if
numpy
.
all
(
rest
.
broadcastable
):
adv_subtensor
=
input
outgrad_factor
/=
rest
break
else
:
# That subtensor was not right
adv_subtensor
=
None
else
:
return
...
...
@@ -1101,6 +1123,8 @@ def local_advanced_indexing_crossentropy_onehot_grad(node):
if
not
(
maybe_sm
is
sm
and
maybe_rows
is
rows
and
maybe_labels
is
labels
):
return
#else: OK
else
:
return
else
:
return
...
...
@@ -1147,7 +1171,7 @@ def local_advanced_indexing_crossentropy_onehot_grad(node):
if
incr
.
owner
and
incr
.
owner
.
op
==
tensor
.
fill
:
model
,
value
=
incr
.
owner
.
inputs
adv_subtensor
=
None
mult_factor
=
1
outgrad_factor
=
None
if
model
.
owner
and
isinstance
(
model
.
owner
.
op
,
tensor
.
AdvancedSubtensor
):
adv_subtensor
=
model
else
:
...
...
@@ -1169,17 +1193,16 @@ def local_advanced_indexing_crossentropy_onehot_grad(node):
if
not
(
maybe_log_sm
is
log_sm
and
maybe_rows
is
rows
and
maybe_labels
is
labels
):
return
#else: OK
else
:
return
# In the base case, value is the constant '-1'
if
hasattr
(
value
,
'data'
)
and
numpy
.
all
(
value
.
data
==
-
1
):
mult_factor
=
1
# In the case of -1/denom, if denom is of integer type
elif
value
.
owner
and
value
.
owner
.
op
==
tensor
.
true_div
:
val_num
,
val_denom
=
value
.
owner
.
inputs
if
hasattr
(
val_num
,
'data'
)
and
numpy
.
all
(
val_num
.
data
==
-
1
):
if
val_denom
.
dtype
.
startswith
(
'int'
)
\
or
val_denom
.
dtype
.
startswith
(
'uint'
):
mult_factor
=
val_denom
outgrad_factor
=
1.
# Otherwise, it should be a scalar, and the output gradient
# would be -value
elif
numpy
.
all
(
value
.
broadcastable
):
outgrad_factor
=
-
value
else
:
return
...
...
@@ -1204,11 +1227,10 @@ def local_advanced_indexing_crossentropy_onehot_grad(node):
# Dimension check before substitution
if
labels
.
ndim
==
1
and
x_var
.
ndim
==
2
:
if
mult
_factor
is
not
None
:
out_grad
=
tensor
.
fill
(
x_var
[:,
0
],
1.
/
mult
_factor
)
if
outgrad
_factor
is
not
None
:
out_grad
=
tensor
.
fill
(
x_var
[:,
0
],
outgrad
_factor
)
return
[
crossentropy_softmax_1hot_with_bias_dx
(
out_grad
,
sm
,
labels
)]
else
:
print
'mult_factor is None?'
return
else
:
return
...
...
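The comments in the hunks above describe extracting a scalar `outgrad_factor` and passing it as the output gradient. The underlying math can be sketched in plain Python, under the assumption that the cost is `a * sum(-log(softmax(x)[i, y[i]]))` for a scalar `a`. The helper names below (`softmax_row`, `scaled_xent`, `scaled_xent_grad`, `numeric_grad`) are hypothetical and are not Theano functions. The sketch only checks numerically that the scalar factor on the cost reappears as a scalar factor on the gradient with respect to `x`. It does not reproduce the graph rewrite itself.

```python
import math


def softmax_row(row):
    """Numerically stable softmax of one list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_xent(x, y, a):
    """cost = a * sum_i -log(softmax(x)[i, y[i]])."""
    return a * sum(-math.log(softmax_row(row)[label])
                   for row, label in zip(x, y))


def scaled_xent_grad(x, y, a):
    """Analytic gradient wrt x: a * (softmax(x) - onehot(y))."""
    grad = []
    for row, label in zip(x, y):
        sm = softmax_row(row)
        grad.append([a * (p - (1.0 if j == label else 0.0))
                     for j, p in enumerate(sm)])
    return grad


def numeric_grad(x, y, a, eps=1e-6):
    """Central finite differences, used only as a cross-check."""
    grad = []
    for i, row in enumerate(x):
        grad_row = []
        for j in range(len(row)):
            plus = [list(r) for r in x]
            minus = [list(r) for r in x]
            plus[i][j] += eps
            minus[i][j] -= eps
            diff = scaled_xent(plus, y, a) - scaled_xent(minus, y, a)
            grad_row.append(diff / (2 * eps))
        grad.append(grad_row)
    return grad


# Illustrative values only; a = 0.1 mirrors the scale passed in test_scale_cost.
x = [[0.2, -1.0, 0.5], [1.5, 0.3, -0.7]]
y = [2, 0]
a = 0.1
analytic = scaled_xent_grad(x, y, a)
numeric = numeric_grad(x, y, a)
max_err = max(abs(p - q)
              for row_a, row_n in zip(analytic, numeric)
              for p, q in zip(row_a, row_n))
```

With `a = 1` this reduces to the base case described in the comments, where the gradient of the cost with respect to the output equals 1.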
theano/tensor/opt.py
View file @
24ef1606
...
...
@@ -346,7 +346,7 @@ def local_IncSubtensor_serialize(node):
#
# add(x, incsubtensor(b, c), incsubtensor(b, d))
# -> incsubtensor(incsubtensor(add(x,b), c), d)
# -> incsubtensor(incsubtensor(add(x,b
,b
), c), d)
"""
def
movable
(
i
):
...
...
@@ -354,7 +354,8 @@ def local_IncSubtensor_serialize(node):
return
i
.
owner
\
and
isinstance
(
i
.
owner
.
op
,
T
.
IncSubtensor
)
\
and
i
.
type
==
o_type
\
and
len
(
i
.
clients
)
==
1
and
len
(
i
.
clients
)
==
1
\
and
not
i
.
owner
.
op
.
set_instead_of_inc
if
node
.
op
==
T
.
add
:
o_type
=
node
.
outputs
[
0
]
.
type
...
...
@@ -383,7 +384,8 @@ def local_IncSubtensor_serialize(node):
@gof.local_optimizer
([
None
])
def
local_inplace_setsubtensor
(
node
):
if
isinstance
(
node
.
op
,
T
.
IncSubtensor
)
and
not
node
.
op
.
inplace
:
new_op
=
T
.
IncSubtensor
(
node
.
op
.
idx_list
,
inplace
=
True
)
new_op
=
T
.
IncSubtensor
(
node
.
op
.
idx_list
,
inplace
=
True
,
\
set_instead_of_inc
=
node
.
op
.
set_instead_of_inc
)
new_node
=
new_op
(
*
node
.
inputs
)
return
[
new_node
]
return
False
...
...
@@ -932,8 +934,11 @@ def local_neg_neg(node):
@register_specialize
@gof.local_optimizer
([
T
.
neg
])
def
local_neg_div_neg
(
node
):
"""- (-a / b) -> a / b
Also performs - (c / b) -> ((-c) / b) when c is a scalar constant.
"""
if
node
.
op
==
T
.
neg
:
"""- (-a / b) -> a / b"""
if
node
.
inputs
[
0
]
.
owner
and
node
.
inputs
[
0
]
.
owner
.
op
==
T
.
true_div
:
frac
=
node
.
inputs
[
0
]
num
,
denom
=
frac
.
owner
.
inputs
...
...
@@ -942,6 +947,11 @@ def local_neg_div_neg(node):
# No other clients of the original division
new_num
=
num
.
owner
.
inputs
[
0
]
return
[
T
.
true_div
(
new_num
,
denom
)]
elif
numpy
.
all
(
num
.
broadcastable
)
and
isinstance
(
num
,
gof
.
Constant
):
if
len
(
frac
.
clients
)
==
1
:
new_num
=
-
num
.
data
return
[
T
.
true_div
(
new_num
,
denom
)]
@gof.local_optimizer
([
T
.
mul
])
def
local_mul_zero
(
node
):
...
...
theano/tensor/tests/test_nnet.py
View file @
24ef1606
...
...
@@ -223,6 +223,204 @@ class T_CrossentropyCategorical1Hot(unittest.TestCase):
assert
not
has_softmax
assert
not
has_softmaxdx
def
test_get_rid_of_advanced_indexing_version_of_xent
(
self
):
verbose
=
0
# TODO: add the optimization in FAST_COMPILE?
# In the meantime, run it as 'FAST_RUN' instead
mode
=
theano
.
compile
.
mode
.
get_default_mode
()
if
mode
==
'FAST_COMPILE'
:
mode
=
'FAST_RUN'
rng
=
numpy
.
random
.
RandomState
(
utt
.
fetch_seed
())
x_val
=
rng
.
randn
(
3
,
5
)
b_val
=
rng
.
randn
(
5
)
y_val
=
numpy
.
asarray
([
2
,
4
,
1
])
x
=
T
.
dmatrix
(
'x'
)
b
=
T
.
dvector
(
'b'
)
y
=
T
.
lvector
(
'y'
)
def
print_graph
(
func
):
for
i
,
node
in
enumerate
(
func
.
maker
.
env
.
toposort
()):
print
i
,
node
# Last node should be the output
print
i
,
pprint
(
node
.
outputs
[
0
])
print
## Basic case
expressions
=
[
T
.
sum
(
-
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
sum
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
sum
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
T
.
sum
(
-
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])
]
for
expr
in
expressions
:
# Verify the optimizer worked on the expressions
f
=
theano
.
function
([
x
,
y
],
expr
,
mode
=
mode
)
if
verbose
:
print_graph
(
f
)
assert
len
(
f
.
maker
.
env
.
toposort
())
==
4
f
(
x_val
,
y_val
)
# Also verify the gradient wrt x
g
=
theano
.
function
([
x
,
y
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
if
verbose
:
print_graph
(
g
)
assert
len
(
g
.
maker
.
env
.
toposort
())
==
4
g
(
x_val
,
y_val
)
## Test that a biased softmax is optimized correctly
bias_expressions
=
[
T
.
sum
(
-
T
.
log
(
softmax
(
x
+
b
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
sum
(
T
.
log
(
softmax
(
b
+
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
sum
(
T
.
log
(
softmax
(
x
+
b
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
T
.
sum
(
-
T
.
log
(
softmax
(
b
+
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])]
for
expr
in
bias_expressions
:
f
=
theano
.
function
([
x
,
b
,
y
],
expr
,
mode
=
mode
)
if
verbose
:
print_graph
(
f
)
assert
len
(
f
.
maker
.
env
.
toposort
())
==
2
# [big_op, sum]
f
(
x_val
,
b_val
,
y_val
)
g
=
theano
.
function
([
x
,
b
,
y
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
if
verbose
:
print_graph
(
g
)
assert
len
(
g
.
maker
.
env
.
toposort
())
==
4
g
(
x_val
,
b_val
,
y_val
)
## Test that using "mean" instead of sum works, too
mean_expressions
=
[
T
.
mean
(
-
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
mean
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
mean
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
T
.
mean
(
-
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])]
for
expr
in
mean_expressions
:
f
=
theano
.
function
([
x
,
y
],
expr
,
mode
=
mode
)
if
verbose
:
print_graph
(
f
)
assert
len
(
f
.
maker
.
env
.
toposort
())
==
7
f
(
x_val
,
y_val
)
g
=
theano
.
function
([
x
,
y
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
if
verbose
:
print_graph
(
g
)
assert
len
(
g
.
maker
.
env
.
toposort
())
==
8
g
(
x_val
,
y_val
)
mean_bias_expressions
=
[
T
.
mean
(
-
T
.
log
(
softmax
(
x
+
b
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
mean
(
T
.
log
(
softmax
(
b
+
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
mean
(
T
.
log
(
softmax
(
x
+
b
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
T
.
mean
(
-
T
.
log
(
softmax
(
b
+
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])]
for
expr
in
mean_bias_expressions
:
f
=
theano
.
function
([
x
,
b
,
y
],
expr
,
mode
=
mode
)
if
verbose
:
print_graph
(
f
)
assert
len
(
f
.
maker
.
env
.
toposort
())
==
5
g
=
theano
.
function
([
x
,
b
,
y
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
if
verbose
:
print_graph
(
g
)
assert
len
(
g
.
maker
.
env
.
toposort
())
==
8
g
(
x_val
,
b_val
,
y_val
)
def
test_scale_cost
(
self
):
# TODO: add the optimization in FAST_COMPILE?
# In the meantime, run it as 'FAST_RUN' instead
mode
=
theano
.
compile
.
mode
.
get_default_mode
()
if
mode
==
'FAST_COMPILE'
:
mode
=
'FAST_RUN'
rng
=
numpy
.
random
.
RandomState
(
utt
.
fetch_seed
())
x_val
=
rng
.
randn
(
3
,
5
)
b_val
=
rng
.
randn
(
5
)
y_val
=
numpy
.
asarray
([
2
,
4
,
1
])
x
=
T
.
dmatrix
(
'x'
)
b
=
T
.
dvector
(
'b'
)
y
=
T
.
lvector
(
'y'
)
a
=
T
.
dscalar
(
'a'
)
def
print_graph
(
func
):
for
i
,
node
in
enumerate
(
func
.
maker
.
env
.
toposort
()):
print
i
,
node
# Last node should be the output
print
i
,
pprint
(
node
.
outputs
[
0
])
def
validate_fn_graph
(
func
):
# The graph of the function should not have softmax anymore
has_cx1hot
=
False
has_softmax
=
False
for
node
in
func
.
maker
.
env
.
toposort
():
if
node
.
op
==
crossentropy_softmax_argmax_1hot_with_bias
:
has_cx1hot
=
True
if
node
.
op
==
softmax
:
has_softmax
=
True
assert
has_cx1hot
assert
not
has_softmax
def
validate_grad_graph
(
func
):
# The graph of the gradient should not have softmaxgrad anymore
has_cx1hotdx
=
False
has_softmax
=
False
has_softmaxdx
=
False
for
node
in
func
.
maker
.
env
.
toposort
():
if
node
.
op
==
crossentropy_softmax_1hot_with_bias_dx
:
has_cx1hotdx
=
True
if
node
.
op
==
softmax
:
has_softmax
=
True
if
node
.
op
==
softmax_grad
:
has_softmaxdx
=
True
assert
has_cx1hotdx
assert
has_softmax
assert
not
has_softmaxdx
## Cases to test
expressions
=
[
a
*
T
.
sum
(
-
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
a
*
T
.
sum
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
a
*
(
-
T
.
sum
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
]))),
a
*
T
.
sum
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
a
*
T
.
sum
(
-
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
-
a
*
T
.
sum
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
a
*
(
-
T
.
sum
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
a
*
T
.
sum
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
a
*
T
.
mean
(
-
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
a
*
T
.
mean
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
a
*
(
-
T
.
mean
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
]))),
a
*
T
.
mean
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
a
*
T
.
mean
(
-
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
-
a
*
T
.
mean
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
a
*
(
-
T
.
mean
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
a
*
T
.
mean
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
]
for
expr
in
expressions
:
# Verify the optimizer worked on the expressions
f
=
theano
.
function
([
x
,
y
,
a
],
expr
,
mode
=
mode
)
assert
5
<=
len
(
f
.
maker
.
env
.
toposort
())
<=
10
validate_fn_graph
(
f
)
f
(
x_val
,
y_val
,
0.1
)
# Verify the gradient wrt x
g
=
theano
.
function
([
x
,
y
,
a
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
assert
5
<=
len
(
g
.
maker
.
env
.
toposort
())
<=
12
validate_grad_graph
(
g
)
g
(
x_val
,
y_val
,
0.1
)
# Verify the gradient when providing output gradient
h
=
theano
.
function
([
x
,
y
,
a
],
T
.
grad
(
expr
,
x
,
g_cost
=
a
*
x
.
sum
()),
mode
=
mode
)
assert
8
<=
len
(
h
.
maker
.
env
.
toposort
())
<=
17
validate_grad_graph
(
h
)
h
(
x_val
,
y_val
,
0.1
)
def
test_argmax_pushdown
():
x
=
tensor
.
dmatrix
()
...
...
@@ -306,101 +504,6 @@ def test_asymptotic_32():
assert
gxval
[
0
,
1
]
==
0.25
def
test_get_rid_of_advanced_indexing_version_of_xent
():
verbose
=
0
if
0
:
mode
=
'DEBUG_MODE'
else
:
mode
=
'FAST_RUN'
rng
=
numpy
.
random
.
RandomState
(
utt
.
fetch_seed
())
x_val
=
rng
.
randn
(
3
,
5
)
b_val
=
rng
.
randn
(
5
)
y_val
=
numpy
.
asarray
([
2
,
4
,
1
])
x
=
T
.
dmatrix
(
'x'
)
b
=
T
.
dvector
(
'b'
)
y
=
T
.
lvector
(
'y'
)
def
print_graph
(
func
):
for
i
,
node
in
enumerate
(
func
.
maker
.
env
.
toposort
()):
print
i
,
node
# Last node should be the output
print
i
,
pprint
(
node
.
outputs
[
0
])
## Basic case
expressions
=
[
T
.
sum
(
-
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
sum
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
sum
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
T
.
sum
(
-
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])]
for
expr
in
expressions
:
# Verify the optimizer worked on the expressions
f
=
theano
.
function
([
x
,
y
],
expr
,
mode
=
mode
)
if
verbose
:
print_graph
(
f
)
assert
len
(
f
.
maker
.
env
.
toposort
())
==
4
f
(
x_val
,
y_val
)
# Also verify the gradient wrt x
g
=
theano
.
function
([
x
,
y
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
if
verbose
:
print_graph
(
g
)
assert
len
(
g
.
maker
.
env
.
toposort
())
==
4
g
(
x_val
,
y_val
)
## Test that a biased softmax is optimized correctly
bias_expressions
=
[
T
.
sum
(
-
T
.
log
(
softmax
(
x
+
b
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
sum
(
T
.
log
(
softmax
(
b
+
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
sum
(
T
.
log
(
softmax
(
x
+
b
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
T
.
sum
(
-
T
.
log
(
softmax
(
b
+
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])]
for
expr
in
bias_expressions
:
f
=
theano
.
function
([
x
,
b
,
y
],
expr
,
mode
=
mode
)
if
verbose
:
print_graph
(
f
)
assert
len
(
f
.
maker
.
env
.
toposort
())
==
2
# [big_op, sum]
f
(
x_val
,
b_val
,
y_val
)
g
=
theano
.
function
([
x
,
b
,
y
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
if
verbose
:
print_graph
(
g
)
assert
len
(
g
.
maker
.
env
.
toposort
())
==
4
g
(
x_val
,
b_val
,
y_val
)
## Test that using "mean" instead of sum works, too
mean_expressions
=
[
T
.
mean
(
-
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
mean
(
T
.
log
(
softmax
(
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
mean
(
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
T
.
mean
(
-
T
.
log
(
softmax
(
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])]
for
expr
in
mean_expressions
:
f
=
theano
.
function
([
x
,
y
],
expr
,
mode
=
mode
)
if
verbose
:
print_graph
(
f
)
assert
len
(
f
.
maker
.
env
.
toposort
())
==
7
f
(
x_val
,
y_val
)
g
=
theano
.
function
([
x
,
y
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
if
verbose
:
print_graph
(
g
)
assert
len
(
g
.
maker
.
env
.
toposort
())
==
8
g
(
x_val
,
y_val
)
mean_bias_expressions
=
[
T
.
mean
(
-
T
.
log
(
softmax
(
x
+
b
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
mean
(
T
.
log
(
softmax
(
b
+
x
)[
T
.
arange
(
y
.
shape
[
0
]),
y
])),
-
T
.
mean
(
T
.
log
(
softmax
(
x
+
b
))[
T
.
arange
(
y
.
shape
[
0
]),
y
]),
T
.
mean
(
-
T
.
log
(
softmax
(
b
+
x
))[
T
.
arange
(
y
.
shape
[
0
]),
y
])]
for
expr
in
mean_bias_expressions
:
f
=
theano
.
function
([
x
,
b
,
y
],
expr
,
mode
=
mode
)
if
verbose
:
print_graph
(
f
)
assert
len
(
f
.
maker
.
env
.
toposort
())
==
5
g
=
theano
.
function
([
x
,
b
,
y
],
T
.
grad
(
expr
,
x
),
mode
=
mode
)
if
verbose
:
print_graph
(
g
)
assert
len
(
g
.
maker
.
env
.
toposort
())
==
8
g
(
x_val
,
b_val
,
y_val
)
# hint - call the argmax push-down optimization first too
...
...