Commit 06632882 authored by Brandon T. Willard, committed by Thomas Wiecki

Move math-related optimizations to theano.tensor.math_opt

Parent 980451b5
===================================================================
:mod:`tensor.math_opt` -- Tensor Optimizations for Math Operations
===================================================================
.. module:: tensor.math_opt
   :platform: Unix, Windows
   :synopsis: Tensor Optimizations for Math Operations
.. moduleauthor:: LISA, PyMC Developers

.. automodule:: theano.tensor.math_opt
   :members:
@@ -5,12 +5,12 @@ Optimizations
==============
Theano applies many kinds of graph optimizations, with different objectives:
* simplifying and standardizing the form of the expression graph (e.g. :term:`merge`, :term:`add canonicalization`),
* reducing the maximum memory footprint (e.g. :term:`inplace_elemwise`),
* increasing execution speed (e.g. :term:`constant folding`).
The optimizations are listed in roughly chronological order. The table below
gives a quick summary of the optimizations included in the default modes.
The descriptions are brief and point to further reading.
If you would like to add an additional optimization, refer to
@@ -33,17 +33,17 @@ For an even faster run-time, we could disable assertions (which could be time co
python -c "import theano; theano.compile.optdb.query(theano.compile.predefined_optimizers['<OPT_ID>']).print_summary()"
where <OPT_ID> can be one of o1 (:ref:`† <o1=>`), o2, o3, o4 (:ref:`* <o4=>`),
Stabilization or unsafe.
========================================================= ============== === === ================= ============= ======
Optimization                                              o4             o3  o2  o1                Stabilization unsafe
                                                          :ref:`* <o4=>`          :ref:`† <o1=>`
========================================================= ============== === === ================= ============= ======
:term:`merge`                                             x              x   x   x                 x
:term:`constant folding<constant folding>`                x              x   x   x                 x
:term:`GPU transfer`                                      x              x   x   x                 x
:term:`shape promotion<shape promotion>`                  x              x   x
:term:`fill cut<fill cut>`                                x              x   x
:term:`inc_subtensor srlz.<inc_subtensor serialization>`  x              x   x
@@ -59,12 +59,12 @@ Optimization o4 o3 o2
:term:`add specialize <add specialization>`               x              x   x
:term:`mul specialize <mul specialization>`               x              x   x
:term:`pow specialize <pow specialization>`               x              x   x
:term:`inplace_setsubtensor`                              x
:term:`gemm`                                              x              x   x
:term:`inplace_elemwise`                                  x
:term:`inplace_random`                                    x
:term:`elemwise fusion`                                   x              x   x   x
:term:`local_log_softmax`                                 x              x   x   x
:term:`local_remove_all_assert`                                                                                  x
========================================================= ============== === === ================= ============= ======
@@ -104,7 +104,7 @@ Optimization o4 o3 o2
See :func:`opt.local_shape_lift_*`
fill cut
    `Fill(a,b)` means to make a tensor of the shape of `a` full of the value `b`.
    Often when fills are used with elementwise operations (e.g. f) they are
    unnecessary:
@@ -113,18 +113,18 @@ Optimization o4 o3 o2
    See :func:`opt.local_fill_sink`
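The fill-cut idea can be sketched with plain Python lists standing in for tensors. The `fill` and `f` below are toy stand-ins, not Theano's actual Ops:

```python
def fill(a, b):
    """Make a 'tensor' shaped like `a`, filled with the value `b`."""
    return [b] * len(a)

def f(x, y):
    """A toy elementwise binary operation (here: addition)."""
    return [xi + yi for xi, yi in zip(x, y)]

a = [1.0, 2.0, 3.0]
b = 10.0

# f(a, fill(a, b)) already visits every element of `a`, so materializing
# the filled tensor first is wasted work; broadcasting `b` is equivalent.
with_fill = f(a, fill(a, b))
without_fill = [ai + b for ai in a]
assert with_fill == without_fill
```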
inc_subtensor serialization
    Incrementing a small subregion of a large tensor can be done quickly
    using an inplace operation, but if two increments are being done on
    the same large tensor, then only one of them can be done inplace.
    This optimization reorders such graphs so that all increments can be
    done inplace.
    ``inc_subtensor(a,b,idx) + inc_subtensor(a,c,idx) -> inc_subtensor(inc_subtensor(a,b,idx),c,idx)``
    See :func:`local_IncSubtensor_serialize`
reshape_chain
    This optimizes graphs like ``reshape(reshape(x, shape1), shape2)`` -> ``reshape(x, shape2)``
    See :func:`local_reshape_chain`
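A minimal sketch of why the chain collapses, using nested Python lists and a toy row-major `reshape` (all names illustrative, not Theano's implementation):

```python
def flatten(x):
    """Row-major flattening of a possibly nested list."""
    return [v for row in x for v in row] if isinstance(x[0], list) else list(x)

def reshape(x, shape):
    """Toy 2-D reshape: regroup the row-major flattening of `x`."""
    rows, cols = shape
    flat = flatten(x)
    assert rows * cols == len(flat)
    return [flat[i * cols:(i + 1) * cols] for i in range(rows)]

x = [[1, 2, 3], [4, 5, 6]]  # shape (2, 3)
# reshape only regroups the same flattening, so the intermediate
# shape1 has no effect on the final result.
assert reshape(reshape(x, (3, 2)), (1, 6)) == reshape(x, (1, 6))
```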
@@ -140,22 +140,22 @@ Optimization o4 o3 o2
    form:

    .. math::

        (a+b+c+...) - (z + x + y + ....)

    See :class:`AlgebraicCanonizer`, :attr:`local_add_canonizer`
mul canonicalization
    Rearrange expressions of multiplication and division to a canonical
    form:

    .. math::

        \frac{a * b * c * ...}{z * x * y * ....}

    See :class:`AlgebraicCanonizer`, :attr:`local_mul_canonizer`
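The canonical form for `*`/`/` expressions can be sketched as a pair of factor lists. The tuple-based expression encoding below is purely illustrative, not the canonizer's real graph representation:

```python
def canonize(expr):
    """Flatten nested ('mul', ...) / ('div', a, b) trees into a pair
    (numerator_factors, denominator_factors)."""
    if isinstance(expr, tuple) and expr[0] == "mul":
        num, den = [], []
        for sub in expr[1:]:
            n, d = canonize(sub)
            num += n
            den += d
        return num, den
    if isinstance(expr, tuple) and expr[0] == "div":
        n1, d1 = canonize(expr[1])
        n2, d2 = canonize(expr[2])
        # dividing by c/d contributes d to the numerator, c to the denominator
        return n1 + d2, d1 + n2
    return [expr], []  # a leaf variable

# (a * b * c) / (z * x * y) in canonical form:
expr = ("div", ("mul", "a", "b", "c"), ("mul", "z", "x", "y"))
assert canonize(expr) == (["a", "b", "c"], ["z", "x", "y"])
```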
dot22
    This simple optimization replaces dot(matrix, matrix) with a special
    `dot22` op that only works for matrix multiplication. This op is
    implemented with a call to GEMM, and sometimes replaced entirely by
@@ -163,63 +163,63 @@ Optimization o4 o3 o2
    See :func:`local_dot_to_dot22`
sparse_dot
    Theano has a sparse matrix multiplication algorithm that is faster in
    many cases than scipy's (for dense matrix output). This optimization
    swaps scipy's algorithm for ours.
    See :func:`local_structured_dot`
sum_scalar_mul
    This optimizes graphs like ``sum(scalar * tensor)`` -> ``scalar * sum(tensor)``
    See :func:`local_sum_mul_by_scalar`
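The identity behind this rewrite is easy to check numerically with plain Python, a list standing in for the tensor:

```python
x = [1.5, 2.0, -0.5, 4.0]
scalar = 3.0

lhs = sum(scalar * xi for xi in x)  # sum(scalar * tensor): n multiplies
rhs = scalar * sum(x)               # scalar * sum(tensor): 1 multiply
assert abs(lhs - rhs) < 1e-12
```

The optimized form does a single multiplication instead of one per element.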
neg_neg
    Composition of two negatives can be cancelled out.
    See :func:`local_neg_neg`
neg_div_neg
    Matching negatives in the numerator and denominator can both be removed.
    See :func:`local_neg_div_neg`
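Both negation identities can be checked on ordinary floats (a toy `neg`, not Theano's Op):

```python
def neg(x):
    return -x

x, y = 3.5, -2.0
assert neg(neg(x)) == x          # neg_neg: --x -> x
assert neg(x) / neg(y) == x / y  # neg_div_neg: (-x)/(-y) -> x/y
```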
add specialization
    This optimization simplifies expressions involving the addition of
    zero.
    See :func:`local_add_specialize`
mul specialization
    Several special cases of mul() exist, and this optimization tries to
    recognize them. Some examples include:
    * ``mul(x,x)`` -> ``x**2``
    * ``mul(x,0)`` -> ``zeros_like(x)``
    * ``mul(x, -1)`` -> ``neg(x)``
    See :func:`local_mul_specialize`
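The listed cases can be sketched as a tiny pattern rewriter. The tuple node encoding is illustrative only, not Theano's graph representation:

```python
def specialize_mul(x, y):
    """Toy rewriter for the mul() special cases listed above."""
    if x == y:
        return ("pow", x, 2)      # mul(x, x) -> x**2
    if y == 0:
        return ("zeros_like", x)  # mul(x, 0) -> zeros_like(x)
    if y == -1:
        return ("neg", x)         # mul(x, -1) -> neg(x)
    return ("mul", x, y)          # no special case applies

assert specialize_mul("x", "x") == ("pow", "x", 2)
assert specialize_mul("x", 0) == ("zeros_like", "x")
assert specialize_mul("x", -1) == ("neg", "x")
```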
pow specialization
    Several special cases of pow() exist, and this optimization tries to
    recognize them. Some examples include:
    * ``pow(x,2)`` -> ``x**2``
    * ``pow(x,0)`` -> ``ones_like(x)``
    * ``pow(x, -0.5)`` -> ``inv(sqrt(x))``
    See :func:`local_pow_specialize`
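Numerically, each special case agrees with the general power, which is why the substitution is safe; a toy sketch:

```python
import math

def specialize_pow(x, exponent):
    """Numeric versions of the pow() special cases above (toy sketch)."""
    if exponent == 2:
        return x * x               # pow(x, 2) -> x**2
    if exponent == 0:
        return 1.0                 # pow(x, 0) -> ones_like(x)
    if exponent == -0.5:
        return 1.0 / math.sqrt(x)  # pow(x, -0.5) -> inv(sqrt(x))
    return x ** exponent           # general (slower) fallback

for x in (0.25, 2.0, 9.0):
    assert abs(specialize_pow(x, 2) - x ** 2) < 1e-12
    assert specialize_pow(x, 0) == 1.0
    assert abs(specialize_pow(x, -0.5) - x ** -0.5) < 1e-12
```

The specialized forms avoid the general transcendental `pow` routine.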
inplace_setsubtensor
    In order to be a pure Op, setsubtensor must copy its entire input, and
    modify just the subtensor in question (possibly a single element). It
    is much more efficient to modify that element inplace.
    See :func:`local_inplace_setsubtensor`
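The contrast between the pure and the inplace variant can be shown with Python lists (toy stand-ins, not Theano's Ops):

```python
def set_subtensor(a, idx, value):
    """Pure version: copy the whole input, then modify the subregion."""
    out = list(a)
    out[idx] = value
    return out

def inplace_set_subtensor(a, idx, value):
    """Inplace version: no copy, the input is modified destructively."""
    a[idx] = value
    return a

a = [0, 1, 2, 3]
pure = set_subtensor(a, 2, 9)
assert pure == [0, 1, 9, 3] and a == [0, 1, 2, 3]  # input untouched
out = inplace_set_subtensor(a, 2, 9)
assert out is a and a == [0, 1, 9, 3]              # storage reused
```

The copy in the pure version is O(size of `a`) even when only one element changes, which is what the optimization eliminates once it is safe to do so.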
gemm
    Numerical libraries such as MKL and ATLAS implement the BLAS-level-3
    interface, and provide a function `GEMM` that implements
    :math:`Z \leftarrow \alpha A \cdot B + \beta Z`, for matrices `A`, `B`
    and `Z`, and scalars :math:`\alpha, \beta`.
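A reference implementation of the GEMM update on lists of lists, for clarity only (real GEMMs are heavily blocked and vectorized):

```python
def gemm(Z, alpha, A, B, beta):
    """Reference GEMM: Z <- alpha * (A @ B) + beta * Z."""
    n, k, m = len(A), len(A[0]), len(B[0])
    return [
        [beta * Z[i][j] + alpha * sum(A[i][p] * B[p][j] for p in range(k))
         for j in range(m)]
        for i in range(n)
    ]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
Z = [[1.0, 1.0], [1.0, 1.0]]
# A @ B = [[19, 22], [43, 50]], so 2*(A @ B) + 0.5*Z:
assert gemm(Z, 2.0, A, B, 0.5) == [[38.5, 44.5], [86.5, 100.5]]
```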
@@ -237,14 +237,14 @@ Optimization o4 o3 o2
    See :func:`insert_inplace_optimizer`
inplace_random
    Typically when a graph uses random numbers, the RandomState is stored
    in a shared variable, used once per call, and updated after each function
    call. In this common case, it makes sense to update the random number generator in-place.
    See :func:`random_make_inplace`
elemwise fusion
    This optimization compresses subgraphs of computationally cheap
    elementwise operations into a single Op that does the whole job in a
    single pass over the inputs (like loop fusion). This is a win when
@@ -260,7 +260,7 @@ Optimization o4 o3 o2
    a graph copying data from GPU to CPU in order to evaluate an
    expression that could have been evaluated on the GPU, we substitute
    the GPU version of that Op for the CPU version. Likewise if we are
    copying the output of an Op with a GPU implementation to the GPU,
    then we substitute the GPU version for the CPU version. In this way, if all goes well,
    this procedure will result in a graph with the following form:
@@ -286,6 +286,5 @@ Optimization o4 o3 o2
    setting ``optimizer_including=local_remove_all_assert``, which will
    remove all assertions in the graph that check whether user inputs are valid.
    Use this optimization only if you are sure everything in your graph is valid.
    See :ref:`unsafe_optimization`
@@ -47,11 +47,8 @@ from theano.tensor.basic_opt import (
MakeVector,
ShapeFeature,
assert_op,
local_add_specialize,
local_canonicalize_alloc,
local_dimshuffle_lift,
local_greedy_distributor,
local_lift_transpose_through_dot,
local_merge_alloc,
local_reshape_to_dimshuffle,
local_useless_alloc,
@@ -59,7 +56,6 @@ from theano.tensor.basic_opt import (
local_useless_elemwise,
local_useless_reshape,
make_vector,
mul_canonizer,
register_specialize,
)
from theano.tensor.blas import Dot22, Gemv
@@ -109,6 +105,12 @@ from theano.tensor.math import round as tt_round
from theano.tensor.math import sgn, sin, sinh, sqr, sqrt, sub
from theano.tensor.math import sum as tt_sum
from theano.tensor.math import tan, tanh, true_div, xor
from theano.tensor.math_opt import (
local_add_specialize,
local_greedy_distributor,
local_lift_transpose_through_dot,
mul_canonizer,
)
from theano.tensor.nnet.sigm import softplus
from theano.tensor.shape import Reshape, Shape_i, SpecifyShape, reshape, specify_shape
from theano.tensor.subtensor import (
@@ -465,7 +467,7 @@ class TestCanonize:
print(pprint(g.outputs[0]))
def test_elemwise_multiple_inputs_optimisation(self):
# verify that the AlgebraicCanonizer merge sequential Elemwise({mul,add}) part 1
#
# This part are that case that is done, but don't include case
# that are not implemented but are supposed to be.
@@ -574,8 +576,8 @@ class TestCanonize:
] # [10:11]
# print cases
# We must be sure that the AlgebraicCanonizer is working, but that we don't have other
# optimisation that could hide bug in the AlgebraicCanonizer as local_elemwise_fusion
mode = get_default_mode()
opt = Query(["canonicalize"])
opt = opt.excluding("local_elemwise_fusion")
@@ -595,11 +597,11 @@ class TestCanonize:
assert out_dtype == out.dtype
@pytest.mark.skip(
reason="Current implementation of AlgebraicCanonizer does not "
"implement all cases. Skip the corresponding test."
)
def test_elemwise_multiple_inputs_optimisation2(self):
# verify that the AlgebraicCanonizer merge sequential Elemwise({mul,add}) part 2.
# This part are that case that should have been done, but that are not implemented.
# Test with and without DimShuffle
@@ -709,8 +711,8 @@ class TestCanonize:
] # [10:11]
# print cases
# We must be sure that the AlgebraicCanonizer is working, but that we don't have other
# optimisation that could hide bug in the AlgebraicCanonizer as local_elemwise_fusion
mode = get_default_mode()
mode._optimizer = Query(["canonicalize"])
mode._optimizer = mode._optimizer.excluding("local_elemwise_fusion")
@@ -728,7 +730,7 @@ class TestCanonize:
@pytest.mark.slow
def test_multiple_case(self):
# test those case take from the comment in AlgebraicCanonizer
# x / x -> 1
# (x * y) / x -> y
# x / y / x -> 1 / y
@@ -756,8 +758,8 @@ class TestCanonize:
dwv = _asarray(np.random.rand(*shp), dtype="float64")
dvv = _asarray(np.random.rand(shp[0]), dtype="float64").reshape(1, shp[0])
# We must be sure that the AlgebraicCanonizer is working, but that we don't have other
# optimisation that could hide bug in the AlgebraicCanonizer as local_elemwise_fusion
mode = get_default_mode()
opt = Query(["canonicalize"])
@@ -1109,7 +1111,7 @@ class TestCanonize:
assert f.maker.fgraph.toposort()[0].op == sgn
@pytest.mark.skip(
reason="Current implementation of AlgebraicCanonizer does not "
"implement all cases. Skip the corresponding test."
)
def test_multiple_case_that_fail(self):
@@ -1123,8 +1125,8 @@ class TestCanonize:
dyv = _asarray(np.random.rand(*shp), dtype="float32")
dzv = _asarray(np.random.rand(*shp), dtype="float32")
# fvv = _asarray(np.random.rand(shp[0]), dtype='float32').reshape(1, shp[0])
# We must be sure that the AlgebraicCanonizer is working, but that we don't have other
# optimisation that could hide bug in the AlgebraicCanonizer as local_elemwise_fusion
mode = get_default_mode()
opt = Query(["canonicalize"])
...
@@ -86,7 +86,7 @@ from theano.scan.utils import (
scan_args,
scan_can_remove_outs,
)
from theano.tensor import basic_opt, math_opt
from theano.tensor.basic import Alloc, AllocEmpty, get_scalar_constant_value
from theano.tensor.elemwise import DimShuffle, Elemwise
from theano.tensor.exceptions import NotScalarConstantError
@@ -118,8 +118,8 @@ __copyright__ = "(c) 2010, Universite de Montreal"
_logger = logging.getLogger("theano.scan.opt")
list_opt_slice = [
math_opt.local_abs_merge,
math_opt.local_mul_switch_sink,
basic_opt.local_upcast_elemwise_constant_inputs,
basic_opt.local_useless_switch,
basic_opt.constant_folding,
...
This source diff could not be displayed because it is too large. You can view the blob instead.
@@ -31,7 +31,7 @@ from theano.scalar import UnaryScalarOp
# Work-around for Python 3.6 issue that prevents `import theano.tensor as tt`
from theano.tensor import basic as tt
from theano.tensor import extra_ops, math_opt
from theano.tensor.basic import ARange, as_tensor_variable
from theano.tensor.basic_opt import (
register_canonicalize,
@@ -985,7 +985,7 @@ def softmax_simplifier(numerators, denominators):
return numerators, denominators
math_opt.local_mul_canonizer.add_simplifier(softmax_simplifier, "softmax_simplifier")
class CrossentropySoftmaxArgmax1HotWithBias(COp):
...
@@ -1085,7 +1085,7 @@ def local_1msigmoid(fgraph, node):
register_local_1msigmoid = False
# This is False because the Stabilize pattern above
# is looking for 1-sigm. Also AlgebraicCanonizer turns neg into *(-1) and so
# this optimization might set off an unwanted chain of things.
# OTH - this transformation can be seen as pushing normal arithmetic either below or above the
# sigmoidal nonlinearity... so if the canonicalized form had anything to say about that then it
...