testgroup / pytensor · Commits

Commit ece4c2e4
Authored Nov 29, 2016 by khaotik
Committed by khaotik on Jan 27, 2017

better readability / draft for OfG R_op

Parent: 8d9fa9e5

Showing 1 changed file with 164 additions and 157 deletions:

theano/compile/builders.py  (+164, -157)
```diff
@@ -13,11 +13,112 @@ from theano.gof.graph import io_connection_pattern
 class OpFromGraph(gof.Op):
     """
+    class for Ops with user-defined inner graph
+
+    This creates an `Op` from inputs and outputs lists of variables.
+    The signature is similar to theano.function() and the resulting
+    `Op`'s perform will do the same operation as::
+
+        orig_function(inputs, outputs, **kwargs)
+
+    Currently does not support the 'updates' or 'givens' arguments.
+
+    Parameters
+    ----------
+    inputs : list of variables
+    outputs : list of variables
+    inline : bool, optional
+        If True, the Op's original graph is used during compilation;
+        otherwise a pre-compiled function is used internally.
+    grad_overrides : None | function | list of (None | function), optional
+        Used to override the default gradient routine.
+        Overriding function(s) must take two lists of variables as
+        inputs: the original inputs and the upstream gradients.
+        For the different forms of `grad_overrides`:
+
+        - `None`: use the default gradient routine.
+        - function: must return a list of Variable.
+        - list: each function must return a single Variable. The order
+          of the list must correspond to the inputs.
+
+    TODO:
+        - examples for a multi-layer mlp. where?
+        - __hash__, __eq__: otherwise Ops won't merge; try
+          gof.opt.is_same_graph_with_merge(op1.local_outputs,
+          op2.local_outputs)
+        - c_code() to remove the double overhead?
+        - grad(): make it support DisconnectedType and the new interface
+        - implement R_op()
+        - check how it works with updates.
+        - add a test with a constant as input or inside the inner graph.
+        - Add support for the GPU? Probably just needs an opt to remove
+          the transfer.
+        - Add support to pickle this Op.
+        - Add support/tests with a random generator.
+        - Recursion detection to prevent an Op "forkbomb": either set a
+          depth limit or check manually.
+
+    Notes
+    -----
+    - We support shared variables in the inner graph. This is automatic
+      and invisible to the user. They can be inputs to the node or live
+      inside the inner graph.
+    - We support unused inputs. This is needed for the grad.
+    - `inline=True` yields better runtime optimization at the cost of
+      compilation time. Like the "inline" keyword in C, this is merely
+      a suggestion to the compiler and is not guaranteed. Currently it
+      only works in "fast_compile" or "fast_run" mode.
+
+    Examples
+    --------
+    Example 1:
+
+    .. code-block:: python
+
+        from theano import function, op_from_graph, tensor
+        x, y, z = tensor.scalars('xyz')
+        e = x + y * z
+        op = op_from_graph([x, y, z], [e])
+        # op behaves like a normal theano op
+        e2 = op(x, y, z) + op(z, y, x)
+        fn = function([x, y, z], [e2])
+
+    Example 2, with a shared variable:
+
+    .. code-block:: python
+
+        import numpy as np
+        import theano
+        from theano import config, function, op_from_graph, tensor
+        x, y, z = tensor.scalars('xyz')
+        s = theano.shared(np.random.rand(2, 2).astype(config.floatX))
+        e = x + y * z + s
+        op = op_from_graph([x, y, z], [e])
+        # op behaves like a normal theano op
+        e2 = op(x, y, z) + op(z, y, x)
+        fn = function([x, y, z], [e2])
+
+    Example 3, overriding a gradient:
+
+    .. code-block:: python
+
+        from theano import function, op_from_graph, tensor, grad
+        x, y, z = tensor.scalars('xyz')
+        e = x + y * z
+        def rescale_dy(inps, grads):
+            x, y, z = inps
+            g = grads
+            return z * 2
+        op = op_from_graph(
+            [x, y, z], [e], grad_overrides=[None, rescale_dy, None])
+        e2 = op(x, y, z)
+        dx, dy, dz = grad(e2, [x, y, z])
+        fn = function([x, y, z], [dx, dy, dz])
+        # the gradient wrt y is now doubled
+        fn(2., 3., 4.)  # [1., 8., 3.]
     """
     # NOTE: if you make a subclass of this, make sure to add a test for
     # it under theano/compile/tests/test_builders.py
```
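The `grad_overrides` contract above admits three forms. A minimal pure-Python sketch of how the list form can be validated and padded to one entry per input, mirroring the bounds check and `None`-padding in this commit (`normalize_overrides` is a hypothetical helper for illustration, not part of Theano):

```python
def normalize_overrides(grad_overrides, n_inputs):
    """Normalize a grad_overrides spec to the docstring's contract.

    None -> all defaults; a single callable overrides everything at
    once; a list gives per-input overrides, padded on the right with
    None up to one entry per input.
    """
    if grad_overrides is None:
        return [None] * n_inputs
    if callable(grad_overrides):
        # a single function computes all input gradients at once
        return grad_overrides
    if len(grad_overrides) > n_inputs:
        raise ValueError('Can override %d gradients at most, got %d'
                         % (n_inputs, len(grad_overrides)))
    return list(grad_overrides) + [None] * (n_inputs - len(grad_overrides))


# per-input form: only the second input's gradient is overridden
spec = normalize_overrides([None, lambda inps, grads: None], 3)
```

The padding means a caller overriding only the first few inputs need not spell out trailing `None`s.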
```diff
-    def __init__(self, inputs, outputs, inline=False,
-                 grad_overrides=None, **kwargs):
+    def __init__(self, inputs, outputs, inline=False,
+                 grad_overrides=None, rop_overrides=None, **kwargs):
         if not isinstance(outputs, list):
             raise TypeError('outputs must be list', outputs)
         for i in inputs + outputs:
@@ -52,18 +153,11 @@ class OpFromGraph(gof.Op):
```
```diff
         self.kwargs = kwargs
         self.input_types = [inp.type for inp in inputs]
         self.output_types = [out.type for out in outputs]
-        # grad_op: a functor of the form:
-        #
-        # def grad_op(inputs: list, ups_grads: list):
-        #     return dns_grads: list
-        #
-        # This is used to cache the gradient for the subgraph.
-        # For __init__, just set it to grad_overrides;
-        # grad_op should be built on the first call to grad(),
-        # after which grad_op_is_cached should be True.
-        self.grad_op = grad_overrides
-        self.grad_op_is_cached = False
+        self.set_grad_overrides(grad_overrides)
+        # TODO
+        if rop_overrides is not None:
+            raise NotImplementedError('Overriding Rop is not implemented yet.')

     def __eq__(self, other):
         # TODO: recognize a copy
```
```diff
@@ -73,46 +167,67 @@ class OpFromGraph(gof.Op):
         # TODO: use internal variables in hash
         return hash(type(self))
```
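The TODO above notes that without meaningful `__hash__`/`__eq__`, identical Ops won't merge. A toy illustration (not Theano code) of why: graph merging dedupes nodes through hash-based containers, so two ops that hash and compare equal collapse to one entry:

```python
class TypeHashedOp:
    """Toy op that, like OpFromGraph here, compares and hashes by type."""

    def __eq__(self, other):
        return type(self) is type(other)

    def __hash__(self):
        return hash(type(self))


a, b = TypeHashedOp(), TypeHashedOp()
merged = {a, b}  # a hash-based dedup pass keeps a single node
```

Hashing only by type is coarse: two instances with different inner graphs would also compare equal, which is why the TODO suggests folding the internal variables into the hash.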
```diff
-    def grad(self, inputs, output_grads):
-        if self.grad_op_is_cached:
-            return self.grad_op(inputs, output_grads)
-        if self.grad_op is None:
-            self.grad_op = []
-        # we need to convert a list into a single functor
-        if isinstance(self.grad_op, list):
-            grad_op_l = self.grad_op
-            if len(grad_op_l) > len(self.local_inputs):
-                raise ValueError(
-                    'Can override %d gradients at most, got %d' % (
-                        len(self.local_inputs), len(grad_op_l)))
-            if len(grad_op_l) < len(self.local_inputs):
-                grad_op_l += [None] * (
-                    len(self.local_inputs) - len(grad_op_l))
-            wrt = [self.local_inputs[i] for i, go in
-                   enumerate(grad_op_l) if not go]
-            # compute non-overriding downstream gradients from upstream grads
-            # it's normal some input may be disconnected, thus the 'ignore'
-            ups_grads_d = dict(izip(self.local_outputs, output_grads))
-            gdefaults = iter(theano.gradient.grad(
-                cost=None,
-                known_grads=ups_grads_d,
-                wrt=wrt,
-                disconnected_inputs='ignore'))
-            # combine overriding gradients
-            dns_grads_l = [
-                go(self.local_inputs, output_grads) if go
-                else next(gdefaults) for go in grad_op_l]
-            grad_ofg = type(self)(
-                inputs=self.local_inputs + output_grads,
-                outputs=dns_grads_l,
-                inline=self.is_inline, on_unused_input='ignore')
-
-            def grad_op(inps, grds):
-                return grad_ofg(*(list(inps) + list(grds)))
-            self.grad_op = grad_op
-        self.grad_op_is_cached = True
-        return self.grad_op(inputs, output_grads)
+    # TODO impl me
+    # def R_op(self, inputs, eval_points):
+    #     pass
+
+    def _recompute_grad_op(self):
+        output_grads = [out_t() for out_t in self.output_types]
+        if self._grad_op is None:
+            self._grad_op = []
+        # we need to convert a list/function into an OfG instance
+        if isinstance(self._grad_op, list):
+            goverrides_l = self._grad_op
+            if len(goverrides_l) > len(self.local_inputs):
+                raise ValueError(
+                    'Can override %d gradients at most, got %d' % (
+                        len(self.local_inputs), len(goverrides_l)))
+            if len(goverrides_l) < len(self.local_inputs):
+                goverrides_l += [None] * (
+                    len(self.local_inputs) - len(goverrides_l))
+            wrt_l = [lin for lin, gov in
+                     izip(self.local_inputs, goverrides_l) if not gov]
+            # compute non-overriding downstream grads from upstream grads
+            # it's normal some input may be disconnected, thus the 'ignore'
+            nat_dns_grads = iter(theano.gradient.grad(
+                cost=None,
+                known_grads=dict(izip(self.local_outputs, output_grads)),
+                wrt=wrt_l,
+                disconnected_inputs='ignore') if wrt_l else [])
+            # combine overriding gradients
+            all_grads_l = [
+                gov(self.local_inputs, output_grads) if gov
+                else next(nat_dns_grads) for gov in goverrides_l]
+        else:
+            all_grads_l = self._grad_op(self.local_inputs, output_grads)
+        self._grad_op = type(self)(
+            inputs=self.local_inputs + output_grads,
+            outputs=all_grads_l,
+            inline=self.is_inline, on_unused_input='ignore')
+        self._grad_op_is_cached = True
+
+    def get_grad_op(self):
+        """
+        getter method for self._grad_op
+        """
+        if not self._grad_op_is_cached:
+            self._recompute_grad_op()
+        return self._grad_op
+
+    def set_grad_overrides(self, grad_overrides):
+        """
+        Set gradient overrides; see help(theano.OpFromGraph) for the
+        syntax. This completely removes any previously set gradient
+        overrides.
+        """
+        self._grad_op = grad_overrides
+        self._grad_op_is_cached = False
+
+    def grad(self, inputs, output_grads):
+        if not self._grad_op_is_cached:
+            self._recompute_grad_op()
+        return self._grad_op(*(list(inputs) + list(output_grads)))
```
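`_recompute_grad_op` computes default gradients only for the non-overridden inputs, then interleaves them with the overrides by drawing from an iterator, as in the list comprehension above. A standalone sketch of that interleaving, with strings standing in for symbolic gradient variables:

```python
# per-input overrides; None means "use the default gradient routine"
overrides = [None, lambda inps, grads: 'dy_override', None]
inputs, out_grads = ['x', 'y', 'z'], ['g']

# defaults are only computed for the non-overridden inputs (x and z)...
defaults = iter(['dx_default', 'dz_default'])

# ...and consumed in order, wherever no override is present
all_grads = [ov(inputs, out_grads) if ov else next(defaults)
             for ov in overrides]
# all_grads pairs each input with its override or its default, in order
```

The iterator trick keeps the overridden slots from consuming a default gradient, so the list of defaults can stay exactly as long as the list of non-overridden inputs.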
```diff
     def make_node(self, *inputs):
         for input, type in zip(inputs, self.input_types):
@@ -164,6 +279,7 @@ class OpFromGraph(gof.Op):
```
```diff
         self.fn = orig_function(self.local_inputs,
                                 self.local_outputs,
                                 **self.kwargs)
+        self.fn.trust_input = True

     def perform(self, node, inputs, outputs):
         variables = self.fn(*inputs)
```
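`perform` follows Theano's calling convention: `outputs` is a list of single-element storage cells, and each result is written into `cell[0]`. A plain-Python sketch of that convention, with a stand-in for the compiled `self.fn`:

```python
def fn(x, y):
    # stand-in for the compiled inner function; returns a list of outputs
    return [x + y, x * y]


def perform(inputs, outputs):
    # mirrors OpFromGraph.perform: call the inner function, then write
    # each result into its pre-allocated one-element storage cell
    variables = fn(*inputs)
    assert len(variables) == len(outputs)
    for cell, variable in zip(outputs, variables):
        cell[0] = variable


storage = [[None], [None]]  # one cell per output
perform((2, 3), storage)
# storage now holds [[5], [6]]
```

Writing through the cells (rather than returning values) is what lets the runtime reuse output buffers between calls.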
```diff
@@ -178,7 +294,7 @@ class OpFromGraph(gof.Op):
 def inline_ofg_expansion(node):
     """
     This optimization expands internal graph of OpFromGraph.
+    Only performed if node.op.is_inline == True
     Doing so can improve optimization at the cost of compilation speed.
     """
     op = node.op
```
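`inline_ofg_expansion` swaps an inline-flagged node for a copy of its inner graph, so the surrounding graph optimizers can see through the op boundary. A toy sketch of that substitution on nested-tuple expressions (an illustration of the idea, not the actual optimizer):

```python
# inner graph of a toy "op": out = a + b * c, as a nested-tuple tree
inner = ('add', 'a', ('mul', 'b', 'c'))


def inline(expr, bindings):
    """Expand an inner graph by substituting the op's formal inputs
    (leaf names) with the actual argument expressions."""
    if isinstance(expr, str):
        return bindings.get(expr, expr)
    head, *args = expr
    return (head,) + tuple(inline(a, bindings) for a in args)


# applying the op to (x, y, z) and inlining exposes the whole graph
expanded = inline(inner, {'a': 'x', 'b': 'y', 'c': 'z'})
```

After expansion the `add`/`mul` nodes are ordinary graph nodes again, which is exactly what makes further optimization possible at the cost of a larger graph to compile.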
```diff
@@ -201,112 +317,3 @@ optdb.register(
 ops_with_inner_function[OpFromGraph] = 'fn'

 # API for OpFromGraph
 def op_from_graph(inputs, outputs, inline=False, grad_overrides=None,
                   **kwargs):
-    """
-    This creates an `Op` from inputs and outputs lists of variables.
-    ...
-    """
     return OpFromGraph(inputs, outputs, inline=inline,
                        grad_overrides=grad_overrides, **kwargs)
```
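The expected output in docstring Example 3 can be checked by hand without running Theano: for e = x + y*z we have de/dx = 1 and de/dz = y, while the override replaces de/dy with z*2. A quick plain-Python sanity check of those values:

```python
def manual_grads(x, y, z):
    # gradients of e = x + y * z, with the y-gradient overridden
    dx = 1.0      # de/dx
    dy = 2.0 * z  # overridden: rescale_dy returns z * 2 (default is z)
    dz = y        # de/dz
    return [dx, dy, dz]


grads = manual_grads(2., 3., 4.)
# at (2, 3, 4): dx = 1, dy = 2 * 4 = 8, dz = 3 -- matching the docstring
```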