Commit 68811fad authored by James Bergstra

merge

......@@ -87,11 +87,6 @@ Glossary of terminology
Part of a function :term:`Mode` -- an object responsible for 'running'
the compiled function. Among other things, the linker determines whether computations are carried out with C or Python code.
Merge
A simple optimization in which redundant :term:`Apply` nodes are
combined. For example, in ``function([x,y], [(x+y)*2, (x+y)*3])`` the merge
optimization will ensure that ``x`` and ``y`` are only added once.
Mode
An object providing an :term:`optimizer` and a :term:`linker` that is
passed to :term:`theano.function`. It parametrizes how an expression
......
......@@ -33,7 +33,8 @@ Roughly in order of what you'll want to check out:
* :ref:`introduction` -- What is Theano?
* :ref:`tutorial` -- Learn the basics.
* :ref:`libdoc` -- All Theano's functionality, module by module.
* :ref:`libdoc` -- Theano's functionality, module by module.
* :ref:`optimizations` -- Guide to Theano's graph optimizations.
* :ref:`extending` -- Learn to add a Type, Op, or graph optimization.
* :ref:`internal` -- How to maintain Theano, LISA-specific tips, and more...
* `API <api/>`_ -- The automatically-generated API
......@@ -60,6 +61,7 @@ Community
install
tutorial/index
library/index
optimizations
extending/index
glossary
links
......
......@@ -35,7 +35,7 @@ limited to:
* using inplace operations wherever it does not interfere with aliasing
* loop fusion for elementwise sub-expressions
* improvements to numerical stability (e.g. :math:`\log(1+\exp(x))` and :math:`\log(\sum_i \exp(x[i]))`)
* for a complete list, see :ref:`_optimizations`
* for a complete list, see :ref:`optimizations`
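The numerical-stability rewrites above can be checked directly with NumPy (illustration only, independent of Theano): the naive form of :math:`\log(1+\exp(x))` overflows for large ``x``, while the rewritten form stays finite.

```python
import numpy as np

x = 1000.0
# naive formulation overflows: exp(1000) is inf in float64
with np.errstate(over='ignore'):
    naive = np.log(1.0 + np.exp(x))
# log(1+exp(x)) == logaddexp(0, x), which is computed stably
stable = np.logaddexp(0.0, x)
print(naive, stable)  # inf 1000.0
```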
Theano was written at the LISA_ lab to support rapid development of
efficient machine learning algorithms. Theano is
......
......@@ -5,7 +5,8 @@
Library Documentation
=====================
This documentation covers Theano module-wise.
This documentation covers Theano module-wise. This is suited to finding the
Types and Ops that you can use to build and compile expression graphs.
.. toctree::
:maxdepth: 1
......
......@@ -18,6 +18,7 @@ sanity, they are grouped into the following sections:
:maxdepth: 1
basic
raw_random
shared_randomstreams
nnet
signal
......
.. _libdoc_tensor_raw_random:
=============================================
:mod:`raw_random` -- Low-level random numbers
=============================================
.. module:: raw_random
:platform: Unix, Windows
:synopsis: symbolic random variables
.. moduleauthor:: LISA
Raw random provides the random-number drawing functionality that underlies
the friendlier :class:`RandomStreams` interface.
Reference
=========
.. class:: RandomStateType(gof.Type)
A `Type` for variables that will take ``numpy.random.RandomState`` values.
.. function:: random_state_type(name=None)
Return a new Variable whose ``.type`` is ``random_state_variable``.
.. class:: RandomFunction(gof.Op)
Op that draws random numbers from a numpy.RandomState object. This Op is
parametrized to draw numbers from many possible distributions.
.. function:: uniform(random_state, size=(), low=0.0, high=1.0)
Sample from a uniform distribution between low and high.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: binomial(random_state, size=(), n=1, p=0.5)
Sample n times with probability of success p for each trial,
return the number of successes.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: normal(random_state, size=(), avg=0.0, std=1.0)
Sample from a normal distribution centered on avg with
the specified standard deviation (std).
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: random_integers(random_state, size=(), low=0, high=1)
Sample a random integer between low and high, both inclusive.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: permutation(random_state, size=(), n=1)
Returns permutations of the integers between 0 and n-1, as many times
as required by size. For instance, if size=(p,q), p*q permutations
will be generated, and the output shape will be (p,q,n), because each
permutation is of size n.
If the size argument is ambiguous on the number of dimensions, the first
argument may be a plain integer i, which should correspond to len(size).
Note that the output will then be of dimension i+1.
:returns: :class:`RandomVariable`, NewRandomState
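The shape rule above can be illustrated with plain NumPy (a sketch of the semantics only, not the Theano Op): with ``size=(p, q)`` and permutations of length ``n``, the result has shape ``(p, q, n)``.

```python
import numpy as np

rng = np.random.RandomState(42)
p, q, n = 2, 3, 5
# draw p*q independent permutations of range(n), one per cell of `size`
out = np.array([[rng.permutation(n) for _ in range(q)] for _ in range(p)])
print(out.shape)  # (2, 3, 5)
```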
.. function:: multinomial(random_state, size=(), pvals=[0.5, 0.5])
Sample from a multinomial distribution defined by probabilities pvals,
as many times as required by size. For instance, if size=(p,q), p*q
samples will be drawn, and the output shape will be (p,q,len(pvals)).
If the size argument is ambiguous on the number of dimensions, the first
argument may be a plain integer i, which should correspond to len(size).
Note that the output will then be of dimension i+1.
:returns: :class:`RandomVariable`, NewRandomState
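NumPy's own multinomial sampler follows the same convention, which makes the shape rule easy to check (illustration only, independent of Theano):

```python
import numpy as np

rng = np.random.RandomState(0)
pvals = [0.5, 0.5]
# size=(p, q) draws p*q samples; each sample is a count vector of length len(pvals)
draws = rng.multinomial(1, pvals, size=(2, 3))
print(draws.shape)  # (2, 3, 2)
```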
.. class:: RandomStreamsBase(object)
.. method:: binomial(self, size=(), n=1, prob=0.5, ndim=None):
Sample n times with probability of success prob for each trial, return the number of
successes.
If the size argument is ambiguous on the number of dimensions, the first argument may be a
plain integer to supplement the missing information.
.. method:: uniform(self, size=(), low=0.0, high=1.0, ndim=None):
Sample a tensor of the given size whose elements are drawn from a uniform distribution between low and high.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
.. method:: normal(self, size=(), avg=0.0, std=1.0, ndim=None):
Sample from a normal distribution centered on avg with
the specified standard deviation (std).
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
.. method:: random_integers(self, size=(), low=0, high=1, ndim=None):
Sample a random integer between low and high, both inclusive.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
.. method:: permutation(self, size=(), n=1, ndim=None):
Returns permutations of the integers between 0 and n-1, as many times
as required by size. For instance, if size=(p,q), p*q permutations
will be generated, and the output shape will be (p,q,n), because each
permutation is of size n.
Theano tries to infer the number of dimensions from the length of the size argument, but you
may always specify it with the `ndim` parameter.
.. note::
The output will then be of dimension ndim+1.
.. method:: multinomial(self, size=(), n=1, pvals=[0.5, 0.5], ndim=None):
Sample n times from a multinomial distribution defined by probabilities pvals,
as many times as required by size. For instance, if size=(p,q), p*q
samples will be drawn, and the output shape will be (p,q,len(pvals)).
Theano tries to infer the number of dimensions from the length of the size argument, but you
may always specify it with the `ndim` parameter.
.. note::
The output will then be of dimension ndim+1.
.. method:: shuffle_row_elements(self, input):
Return a variable with every row (rightmost index) shuffled.
This uses permutation random variable internally, available via the ``.permutation``
attribute of the return value.
......@@ -101,10 +101,11 @@ For example:
Reference
=========
.. class:: RandomStreams(object)
.. class:: RandomStreams(raw_random.RandomStreamsBase)
This is a symbolic stand-in for ``numpy.random.RandomState``. It has
methods such as `uniform` and `normal` that return symbolic random variables.
This is a symbolic stand-in for ``numpy.random.RandomState``.
Random variables of various distributions are instantiated by calls to
parent class :class:`raw_random.RandomStreamsBase`.
.. method:: updates()
......@@ -118,34 +119,22 @@ Reference
`meta_seed` will be used to seed a temporary random number generator,
that will in turn generate seeds for each of the random variables that
has been created by this object.
has been created by this object (via `gen`).
:returns: None
.. method:: binomial(self, size, n=1, p=0.5)
.. method:: gen(op, *args, **kwargs)
Symbolic stand-in for numpy.random.RandomState.binomial
Return the random variable from `op(*args, **kwargs)`, but
also install special attributes (``.rng`` and ``.update``, see
:class:`RandomVariable`) into it.
:returns: :class:`RandomVariable` of float64 that will have `shape==size` at run-time.
This function also adds the returned variable to an internal list so
that it can be seeded later by a call to `seed`.
.. method:: uniform(self, size, low=0.0, high=1.0)
Symbolic stand-in for numpy.random.RandomState.uniform
:returns: :class:`RandomVariable` of float64 that will have `shape==size` at run-time.
.. method:: normal(self, size, loc=0.0, std=1.0)
Symbolic stand-in for numpy.random.RandomState.normal
:returns: :class:`RandomVariable` of float64 that will have `shape==size` at run-time.
.. method:: random_integers(self, size, low=0, high=1)
Symbolic stand-in for numpy.random.RandomState.random_integers
:returns: :class:`RandomVariable` of int64 that will have `shape==size` at run-time.
.. method:: uniform, normal, binomial, multinomial, random_integers, ...
See :class:`raw_random.RandomStreamsBase`.
.. class:: RandomVariable(object)
......@@ -163,114 +152,3 @@ Reference
Including this pair in the ``updates`` list passed to ``function`` will cause the
function to update the random number generator feeding this variable.
.. _libdoc_tensor_raw_random:
=============================================
:mod:`raw_random` -- Low-level random numbers
=============================================
.. module:: raw_random
:platform: Unix, Windows
:synopsis: symbolic random variables
.. moduleauthor:: LISA
Raw random provides the random-number drawing functionality that underlies
the :class:`RandomStreams` interface.
Reference
=========
.. class:: RandomStateType(gof.Type)
A `Type` for variables that will take ``numpy.random.RandomState`` values.
.. class:: RandomFunction(gof.Op)
Op that draws random numbers from a numpy.RandomState object. This Op is
parametrized to draw numbers from many distributions.
.. function:: random_function(fn, dtype, *rfargs, **rfkwargs)
Returns a wrapper around RandomFunction which automatically infers the number
of dimensions of the output from the given shape. If the shape cannot be inferred,
the user can give an integer as first argument, which will be interpreted as the
number of dimensions.
If the distribution is not scalar (e.g., a multinomial), the output will have
more dimensions than the shape argument suggests. The ``ndim_added`` keyword
argument allows you to specify how many dimensions to add (for a multinomial, 1).
The number of dimensions for the following shape arguments can be inferred:
* shape(x)
* make_lvector(x, y, z, ...)
* ndarrays, constants
.. function:: uniform(random_state, size, low=0.0, high=1.0)
Sample from a uniform distribution between low and high.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: binomial(random_state, size, n=1, p=0.5)
Sample n times with probability of success p for each trial,
return the number of successes.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: normal(random_state, size, avg=0.0, std=1.0)
Sample from a normal distribution centered on avg with
the specified standard deviation (std).
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: random_integers(random_state, size, low=0, high=1)
Sample a random integer between low and high, both inclusive.
If the size argument is ambiguous on the number of
dimensions, the first argument may be a plain integer
to supplement the missing information.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: permutation(random_state, size, n=1)
Returns permutations of the integers between 0 and n-1, as many times
as required by size. For instance, if size=(p,q), p*q permutations
will be generated, and the output shape will be (p,q,n), because each
permutation is of size n.
If the size argument is ambiguous on the number of dimensions, the first
argument may be a plain integer i, which should correspond to len(size).
Note that the output will then be of dimension i+1.
:returns: :class:`RandomVariable`, NewRandomState
.. function:: multinomial(random_state, size, pvals=[0.5, 0.5])
Sample from a multinomial distribution defined by probabilities pvals,
as many times as required by size. For instance, if size=(p,q), p*q
samples will be drawn, and the output shape will be (p,q,len(pvals)).
If the size argument is ambiguous on the number of dimensions, the first
argument may be a plain integer i, which should correspond to len(size).
Note that the output will then be of dimension i+1.
:returns: :class:`RandomVariable`, NewRandomState
.. _optimizations:
==============
Optimizations
==============
Theano applies many kinds of graph optimizations, with different objectives:
* simplifying and standardizing the form of the expression graph
(e.g. :term:`merge`, :term:`add canonicalization<add canonicalization>`),
* reducing the maximum memory footprint (e.g. :term:`inplace_elemwise`),
* increasing execution speed (e.g. :term:`constant folding`).
The optimizations are listed in roughly chronological order. The table below
gives a quick summary of the optimizations included in the default modes.
The descriptions are brief and point to further reading.
If you would like to add an additional optimization, refer to
:ref:`optimization` in the guide to extending Theano.
.. #COMMENT
Since the print_summary method has been added to several OpDBs and
optimizers, it is possible to compute an accurate and up-to-date
optimization list by typing
python -c 'import theano; theano.compile.FAST_RUN.optimizer.print_summary()'
python -c 'import theano; theano.compile.FAST_COMPILE.optimizer.print_summary()'
etc.
========================================================= ========= ============
Optimization FAST_RUN FAST_COMPILE
========================================================= ========= ============
:term:`merge` x x
:term:`constant folding<constant folding>` x
:term:`shape promotion<shape promotion>` x
:term:`fill promotion <fill promotion>` x
:term:`fill cut<fill cut>` x
:term:`inc_subtensor srlz.<inc_subtensor serialization>` x
:term:`reshape_chain` x
:term:`const. elimination<constant elimination>` x
:term:`add canonical. <add canonicalization>` x
:term:`mul canonical. <mul canonicalization>` x
:term:`dot22` x
:term:`sparse_dot` x
:term:`sum_scalar_mul` x
:term:`neg_neg` x
:term:`neg_div_neg` x
:term:`add specialize <add specialization>` x
:term:`mul specialize <mul specialization>` x
:term:`pow specialize <pow specialization>` x
:term:`inplace_setsubtensor` x
:term:`gemm` x
:term:`inplace_elemwise` x
:term:`inplace_random` x
:term:`elemwise fusion`
:term:`GPU transfer`
========================================================= ========= ============
.. glossary::
merge
A simple optimization in which redundant :term:`Apply` nodes are
combined. For example, in ``function([x,y], [(x+y)*2, (x+y)*3])`` the merge
optimization will ensure that ``x`` and ``y`` are only added once.
This optimization is very useful because it frees users to write
highly redundant mathematical code. Theano will make sure to compute
just what is necessary.
See :class:`MergeOptimizer`.
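The effect of merging can be sketched outside Theano with a toy common-subexpression pass (hypothetical node representation, not Theano's graph objects): identical ``(op, inputs)`` applications share a single node, so ``x+y`` is built only once.

```python
# Toy CSE: identical (op, inputs) applications share one node.
cache = {}
count = 0

def apply_op(op, *inputs):
    global count
    key = (op, inputs)
    if key not in cache:
        count += 1          # a genuinely new Apply node
        cache[key] = (op, inputs)
    return cache[key]

# build (x+y)*2 and (x+y)*3: 'add' is applied only once
s1 = apply_op('add', 'x', 'y')
s2 = apply_op('add', 'x', 'y')
t1 = apply_op('mul', s1, 2)
t2 = apply_op('mul', s2, 3)
print(count)  # 3 nodes: one shared add, two muls
```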
constant folding
When all the inputs to an expression are constant, then the expression
can be pre-computed at compile-time.
See ***TODO***
shape promotion
See ***TODO***
fill promotion
See ***TODO***
fill cut
See ***TODO***
inc_subtensor serialization
***TODO***
reshape_chain
This optimizes graphs like ``reshape(reshape(x, shape1), shape2)`` -> ``reshape(x, shape2)``
See also ***TODO***
constant elimination
***TODO***
add canonicalization
***TODO***
mul canonicalization
***TODO***
dot22
This simple optimization replaces dot(matrix, matrix) with a special
`dot22` op that only works for matrix multiplication. This op is
implemented with a call to GEMM, and sometimes replaced entirely by
the :term:`gemm` optimization.
See also, ***TODO***.
sparse_dot
***TODO***
sum_scalar_mul
This optimizes graphs like ``sum(scalar * tensor)`` -> ``scalar * sum(tensor)``
See ***TODO***
neg_neg
Composition of two negatives can be cancelled out.
See ***TODO***
neg_div_neg
Matching negatives in the numerator and denominator cancel and can be removed.
See ***TODO***
add specialization
This optimization simplifies expressions involving the addition of
zero.
See ***TODO***
mul specialization
Several special cases of mul() exist, and this optimization tries to
recognize them. Some examples include:
* ``mul(x,x)`` -> ``x**2``
* ``mul(x,0)`` -> ``zeros_like(x)``
* ``mul(x, -1)`` -> ``neg(x)``
See ***TODO***
pow specialization
Several special cases of pow() exist, and this optimization tries to
recognize them. Some examples include:
* ``pow(x,2)`` -> ``x**2``
* ``pow(x,0)`` -> ``ones_like(x)``
* ``pow(x, -0.5)`` -> ``inv(sqrt(x))``
See also ***TODO***
inplace_setsubtensor
In order to be a pure Op, setsubtensor must copy its entire input, and
modify just the subtensor in question (possibly a single element). It
is much more efficient to modify that element inplace.
See ***TODO***
gemm
Numerical libraries such as MKL and ATLAS implement the BLAS-level-3
interface, and provide a function `GEMM` that implements
:math:`Z \leftarrow \alpha A \cdot B + \beta Z`, for matrices `A`, `B`
and `Z`, and scalars :math:`\alpha, \beta`.
This optimization tries to rearrange a variety of linear algebra
expressions into one or more instances of this motif, and replace them
each with a single `Gemm` Op.
See ***TODO***
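The GEMM motif itself is easy to state in NumPy (illustrative only, not the `Gemm` Op): one fused update in place of separate multiply, scale, and add steps.

```python
import numpy as np

rng = np.random.RandomState(0)
A, B = rng.rand(4, 5), rng.rand(5, 3)
Z = rng.rand(4, 3)
alpha, beta = 2.0, 0.5

# the BLAS level-3 update: Z <- alpha * A.dot(B) + beta * Z
expected = alpha * A.dot(B) + beta * Z
Z = alpha * np.dot(A, B) + beta * Z
```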
inplace_elemwise
When one of the inputs to an elementwise expression has the same type
and shape as the output, and is no longer needed for computation after
the elemwise expression is evaluated, then we can reuse the storage of
the input to store the output.
See ***TODO***
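NumPy's ``out=`` argument shows the same storage-reuse idea: when an input buffer is no longer needed, the elementwise result can be written into it instead of a fresh allocation.

```python
import numpy as np

x = np.arange(5, dtype='float64')
y = np.ones(5)
# reuse x's buffer to hold the elementwise result; no new array is allocated
result = np.add(x, y, out=x)
print(result is x)  # True: same storage, updated in place
```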
inplace_random
Typically when a graph uses random numbers, the RandomState is stored
in a shared variable, used once per call, and updated after each function
call. In this common case, it makes sense to update the random number generator in-place.
See ***TODO***
elemwise fusion
See ***TODO***
GPU transfer
The current strategy for choosing which expressions to evaluate on the
CPU and which to evaluate on the GPU is a greedy one. There are a
number of Ops ***TODO*** with GPU implementations and whenever we find
a graph copying data from GPU to CPU in order to evaluate an
expression that could have been evaluated on the GPU, we substitute
the GPU version of that Op for the CPU version. Likewise if we are
copying the output of an Op with a GPU implementation to the GPU,
then we substitute the GPU version for the CPU version. In this way, if all goes well,
this procedure will result in a graph with the following form:
1. copy non-shared inputs to GPU
2. carry out most/all computations on the GPU
3. copy output back to CPU
When using a GPU, :func:`shared()` will default to GPU storage for
'float32' ndarray arguments, and these shared variables act as seeds
for the greedy algorithm.
See ***TODO***
......@@ -10,4 +10,5 @@ Proposals for new/revised features
pfunc
noupdates
opt_patterns2
======================
Optimization Patterns
======================
.. note::
Proposed 2010 01 20
Motivation
==========
Theano optimizations are organized at a high level,
but canonicalization and specialization (C&S) are a mess. It is difficult to know how a graph will
be optimized, or to know in which order optimizations will be performed.
C&S is also slow because of the guess-and-check nature of node optimization within equilibrium
optimizers (VERIFY THIS BY PROFILING).
C&S functions are also very difficult and tedious to write because of
symmetries in the graph, and because of the lack of standard Op names
(e.g. ``T.add``, ``T.and_``, and ``T._shape``). Gemm and the advanced_indexing -> xent
optimization are particularly tricky examples.
Defining a sort of regexp-like approach for describing graph substitutions would ideally be
less error-prone, less tedious, more efficient to evaluate, easier to document, and all-round
better.
Proposal
========
In a nutshell: revisit the PatternSub and make it more powerful.
Olivier B. (original author of PatternSub) mentioned that one of the problems was the annoyance
of working through DimShuffle.
Olivier B. also suggests writing scalar-related patterns in terms of scalars, and then inferring Tensor-related patterns.
......@@ -73,6 +73,8 @@ class Optimizer(object):
"""
pass
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s id=%i" %(' '*level, self.__class__.__name__, id(self))
class FromFunctionOptimizer(Optimizer):
"""WRITEME"""
......@@ -81,6 +83,11 @@ class FromFunctionOptimizer(Optimizer):
def add_requirements(self, env):
env.extend(toolbox.ReplaceValidate())
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s id=%i" %(' '*level,
str(self.apply),
id(self))
def optimizer(f):
"""decorator for FromFunctionOptimizer"""
return FromFunctionOptimizer(f)
......@@ -137,6 +144,12 @@ class SeqOptimizer(Optimizer, list):
def __repr__(self):
return list.__repr__(self)
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s (%i)" %(' '*level, self.__class__.__name__, id(self))
for opt in self:
opt.print_summary(stream, level=level+2)
class _metadict:
......@@ -354,6 +367,8 @@ class LocalOptimizer(object):
This is the place to do it."""
env.extend(toolbox.ReplaceValidate())
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s id=%i" %(' '*level, self.__class__.__name__, id(self))
class FromFunctionLocalOptimizer(LocalOptimizer):
"""WRITEME"""
......@@ -364,6 +379,10 @@ class FromFunctionLocalOptimizer(LocalOptimizer):
return self._tracks
def __str__(self):
return getattr(self, 'name', '<FromFunctionLocalOptimizer instance>')
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s id=%i" %(' '*level,
str(self.transform),
id(self))
def local_optimizer(*tracks):
def decorator(f):
......@@ -388,6 +407,11 @@ class LocalOptGroup(LocalOptimizer):
if repl:
return repl
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s id=%i" %(' '*level, self.__class__.__name__, id(self))
for lopt in self.opts:
lopt.print_summary(stream, level=level+2)
class _LocalOpKeyOptGroup(LocalOptGroup):
"""WRITEME"""
......@@ -466,6 +490,12 @@ class OpRemove(LocalOptimizer):
def __str__(self):
return "%s(x) -> x" % (self.op)
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s(%s) id=%i" %(' '*level,
self.__class__.__name__,
str(self.op),
id(self))
class PatternSub(LocalOptimizer):
"""WRITEME
......@@ -618,6 +648,12 @@ class PatternSub(LocalOptimizer):
def __repr__(self):
return str(self)
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s(%s, %s) id=%i" %(' '*level,
self.__class__.__name__,
str(self.in_pattern),
str(self.out_pattern),
id(self))
##################
......@@ -772,6 +808,11 @@ class NavigatorOptimizer(Optimizer):
if self.local_opt:
self.local_opt.add_requirements(env)
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s (%i)" %(' '*level, self.__class__.__name__, id(self))
self.local_opt.print_summary(stream, level=level+2)
class TopoOptimizer(NavigatorOptimizer):
"""WRITEME"""
......@@ -807,6 +848,7 @@ class TopoOptimizer(NavigatorOptimizer):
self.detach_updater(env, u)
class OpKeyOptimizer(NavigatorOptimizer):
"""WRITEME"""
......@@ -919,6 +961,10 @@ class EquilibriumOptimizer(NavigatorOptimizer):
if max_use_abort:
print >> sys.stderr, "WARNING: EquilibriumOptimizer max'ed out"
def print_summary(self, stream=sys.stdout, level=0):
print >> stream, "%s%s id=%i" %(' '*level, self.__class__.__name__, id(self))
for lopt in self.local_optimizers:
lopt.print_summary(stream, level=level+2)
#################
......
......@@ -95,6 +95,11 @@ class DB(object):
for variable in variables:
return variable
def print_summary(self, stream=sys.stdout):
print >> stream, "%s (id %i)"%(self.__class__.__name__, id(self))
print >> stream, " names", self._names
print >> stream, " db", self.__db__
class Query(object):
......
Diff collapsed.
......@@ -1437,6 +1437,10 @@ def log2(a):
def log10(a):
"""base 10 logarithm of a"""
@_scal_elemwise
def log1p(a):
"""log(1+a)"""
@_scal_elemwise
def sgn(a):
"""sign of a"""
......@@ -3466,7 +3470,10 @@ class numeric_grad:
raise ValueError('argument element %i has wrong shape %s' %(i,str((a.shape,
b.shape))))
errs.append(numpy.max(numeric_grad.abs_rel_err(a,b)))
return numpy.max(errs), numpy.argmax(errs)
if numpy.all(numpy.isfinite(errs)):
return numpy.max(errs), numpy.argmax(errs)
else:
return float('inf'), 0
def verify_grad(op, pt, n_tests=2, rng=None, eps=None, tol=None, mode=None, cast_to_output_type=False):
......
......@@ -100,6 +100,10 @@ def inv_inplace(a):
def log_inplace(a):
"""base e logarithm of a (inplace on a)"""
@_scal_inplace
def log1p_inplace(a):
"""log(1+a)"""
@_scal_inplace
def log2_inplace(a):
"""base 2 logarithm of a (inplace on a)"""
......
......@@ -43,7 +43,11 @@ class ScalarSigmoid(scalar.UnaryScalarOp):
else:
raise NotImplementedError('only floatingpoint is implemented')
def c_code_cache_version(self):
return (2,)
v = super(ScalarSigmoid, self).c_code_cache_version()
if v:
return (2,) + v
else:
return v
scalar_sigmoid = ScalarSigmoid(scalar.upgrade_to_float, name='scalar_sigmoid')
sigmoid = elemwise.Elemwise(scalar_sigmoid, name='sigmoid')
......@@ -74,7 +78,11 @@ class ScalarSoftplus(scalar.UnaryScalarOp):
else:
raise NotImplementedError('only floatingpoint is implemented')
def c_code_cache_version(self):
return (2,)
v = super(ScalarSoftplus, self).c_code_cache_version()
if v:
return (2,) + v
else:
return v
scalar_softplus = ScalarSoftplus(scalar.upgrade_to_float, name='scalar_softplus')
softplus = elemwise.Elemwise(scalar_softplus, name='softplus')
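Both hunks above apply the same pattern: prepend a local version to the superclass's cache version, and propagate the empty tuple, which means "do not cache the compiled C code". A standalone sketch of the rule (hypothetical helper, not a Theano API):

```python
def composed_cache_version(local, parent):
    """Combine a local C-code version tuple with the parent class's.

    An empty parent version means "do not cache", and that must
    propagate rather than be overwritten by the local version.
    """
    if parent:
        return local + parent
    return parent

print(composed_cache_version((2,), (1,)))  # (2, 1)
print(composed_cache_version((2,), ()))    # ()
```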
......
......@@ -44,23 +44,32 @@ def _fill_chain(new_out, orig_inputs):
new_out = T.fill(i, new_out)
return [new_out]
def get_constant_value(v):
def get_constant_value(v, fill=False):
"""return the constant value underlying variable `v`
If v is the output of dimshuffles, this function digs through them.
If v is the output of dimshuffles, fills, this function digs through them.
If `v` is not some view of constant data, then raise a TypeError.
if fill is True, then it returns (v, [...]) where the second term is a list of variables
that were used in the fill expressions
:note: There may be another function similar to this one in the code, but I'm not sure where it
is.
"""
if not isinstance(v, gof.Variable):
return v # why would this happen?
if isinstance(v, gof.Constant):
if fill:
return v.data, []
return v.data
if v.owner and isinstance(v.owner.op, T.DimShuffle):
return get_constant_value(v.owner.inputs[0])
return get_constant_value(v.owner.inputs[0], fill=fill)
if fill:
if v.owner and v.owner.op == T.fill:
shape, val = v.owner.inputs
# fill(a, b) returns an array with the shape of 'a', filled with 'b'
rval, rshapes = get_constant_value(val, fill=fill)
return rval, rshapes + [shape]
raise TypeError(v)
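The fill-aware recursion can be modeled with a toy expression representation (hypothetical tuple-based nodes, not Theano's graph objects): constants sit at the bottom of a chain of ``dimshuffle`` and ``fill`` wrappers, and the fills' shape sources are collected on the way down.

```python
def get_constant(v, fill=False):
    """Toy model of get_constant_value(v, fill=True)."""
    if not isinstance(v, tuple):          # a bare constant
        return (v, []) if fill else v
    op = v[0]
    if op == 'dimshuffle':                # transparent wrapper: recurse
        return get_constant(v[1], fill=fill)
    if fill and op == 'fill':             # ('fill', shape_source, value)
        shape_of, val = v[1], v[2]
        rval, shapes = get_constant(val, fill=True)
        return rval, shapes + [shape_of]
    raise TypeError(v)

expr = ('dimshuffle', ('fill', 'a', ('dimshuffle', 1.0)))
print(get_constant(expr, fill=True))  # (1.0, ['a'])
```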
@gof.optimizer
......@@ -1122,6 +1131,30 @@ register_specialize(local_add_specialize)
mul_canonizer = in2out(gof.LocalOptGroup(local_mul_canonizer, local_fill_cut, local_fill_sink))
@register_specialize
@gof.local_optimizer([T.log])
def local_log1p(node):
# log(1+exp(x)) -> log1p(x)
if node.op == T.log:
log_arg, = node.inputs
if log_arg.owner and log_arg.owner.op == T.add:
add_inputs = log_arg.owner.inputs
consts = [0]
fills = []
nonconsts = []
for add_in in add_inputs:
try:
v, f = get_constant_value(add_in, fill=True)
consts.append(v)
fills.extend(f)
except TypeError:
nonconsts.append(add_in)
if nonconsts:
if numpy.allclose(numpy.sum(consts), 1):
if len(nonconsts)==1:
return _fill_chain(T.log1p(nonconsts[0]), fills)
else:
return _fill_chain(T.log1p(T.add(*nonconsts)), fills)
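The core of this rewrite is classifying the addends: if the constant terms sum to 1, ``log(1 + rest)`` can become ``log1p(rest)``. A pure-Python sketch of that classification step (illustration only, with strings standing in for symbolic terms):

```python
import numpy as np

def matches_log1p(add_inputs):
    """Return the non-constant terms if the constant terms sum to 1, else None."""
    consts, nonconsts = [0.0], []
    for term in add_inputs:
        if isinstance(term, (int, float)):
            consts.append(term)
        else:
            nonconsts.append(term)
    if nonconsts and np.allclose(np.sum(consts), 1):
        return nonconsts
    return None

print(matches_log1p([0.5, 'x', 0.5]))  # ['x']  -> rewrite to log1p(x)
print(matches_log1p([2.0, 'x']))       # None   -> constants do not sum to 1
```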
def add_calculate(num, denum, aslist = False, out_type=None):
......
......@@ -6,7 +6,7 @@ import numpy
from theano.compile import module, In, Component
from theano.gof import Container
from theano.tensor import raw_random, permute_row_elements
from theano.tensor import raw_random
class RandomStreamsInstance(object):
"""RandomStreamsInstance"""
......@@ -86,7 +86,7 @@ class RandomStreamsInstance(object):
return
raise KeyError(item)
class RandomStreams(Component):
class RandomStreams(Component, raw_random.RandomStreamsBase):
"""Module component with similar interface to numpy.random (numpy.random.RandomState)"""
random_state_variables = []
......@@ -147,52 +147,3 @@ class RandomStreams(Component):
self.random_state_variables.append((random_state_variable, new_r))
return out
def binomial(self, *args, **kwargs):
"""Return a symbolic binomial sample
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.binomial, *args, **kwargs)
def uniform(self, *args, **kwargs):
"""Return a symbolic uniform sample
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.uniform, *args, **kwargs)
def normal(self, *args, **kwargs):
"""Return a symbolic normal sample
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.normal, *args, **kwargs)
def random_integers(self, *args, **kwargs):
"""Return a symbolic random integer sample
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.random_integers, *args, **kwargs)
def permutation(self, *args, **kwargs):
"""Return a symbolic permutation of integers
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.permutation, *args, **kwargs)
def multinomial(self, *args, **kwargs):
"""Return a symbolic multinomial sample
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.multinomial, *args, **kwargs)
def shuffle_row_elements(self, input):
"""Return a variable with every row (rightmost index) shuffled"""
perm = self.permutation(input.ndim-1, input.shape[:-1], input.shape[-1])
shuffled = permute_row_elements(input, perm)
return shuffled
......@@ -22,7 +22,7 @@ def randomstate_constructor(value, name=None, strict=False):
name=name,
strict=strict)
class RandomStreams(object):
class RandomStreams(raw_random.RandomStreamsBase):
"""Module component with similar interface to numpy.random (numpy.random.RandomState)"""
state_updates = []
......@@ -100,7 +100,6 @@ class RandomStreams(object):
"""
item.value = val
def gen(self, op, *args, **kwargs):
"""Create a new random stream in this container.
......@@ -123,39 +122,3 @@ class RandomStreams(object):
self.state_updates.append(out.update)
return out
def binomial(self, *args, **kwargs):
"""Return a symbolic binomial sample
*args and **kwargs will be passed to numpy.random.RandomState.binomial
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.binomial, *args, **kwargs)
def uniform(self, *args, **kwargs):
"""Return a symbolic uniform sample
*args and **kwargs will be passed to numpy.random.RandomState.uniform
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.uniform, *args, **kwargs)
def normal(self, *args, **kwargs):
"""Return a symbolic normal sample
*args and **kwargs will be passed to numpy.random.RandomState.normal
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.normal, *args, **kwargs)
def random_integers(self, *args, **kwargs):
"""Return a symbolic random integer sample
*args and **kwargs will be passed to numpy.random.RandomState.random_integers
This is a shortcut for a call to `self.gen`
"""
return self.gen(raw_random.random_integers, *args, **kwargs)
@@ -444,6 +444,17 @@ Log10InplaceTester = makeBroadcastTester(op = inplace.log10_inplace,
grad = _grad_broadcast_unary_positive,
inplace = True)
Log1pTester = makeBroadcastTester(op = log1p,
expected = numpy.log1p,
good = _good_broadcast_unary_positive,
grad = _grad_broadcast_unary_positive)
Log1pInplaceTester = makeBroadcastTester(op = inplace.log1p_inplace,
expected = numpy.log1p,
good = _good_broadcast_unary_positive,
grad = _grad_broadcast_unary_positive,
inplace = True)
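The log1p testers above check elementwise agreement with `numpy.log1p`; the reason a dedicated op is worth testing is numerical stability for small arguments. A short NumPy sketch of the effect, outside Theano:

```python
import numpy as np

x = 1e-12
naive = np.log(1.0 + x)   # 1.0 + 1e-12 already rounds in float64
stable = np.log1p(x)      # computes log(1 + x) without forming 1 + x

# For tiny x, log1p(x) ~ x - x**2/2, so `stable` sits essentially on x,
# while the naive form carries the rounding error of the sum 1.0 + x.
assert abs(stable - x) < 1e-20
assert abs(naive - x) > abs(stable - x)
```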
SqrtTester = makeBroadcastTester(op = sqrt,
expected = numpy.sqrt,
good = _good_broadcast_unary_positive,
@@ -1088,9 +1099,7 @@ class test_bitwise(unittest.TestCase):
self.failUnless(numpy.all(v == (~l)), (l, r, v))
class T_add(unittest.TestCase):
def setUp(self):
utt.seed_rng()
@@ -1117,8 +1126,11 @@ class T_add(unittest.TestCase):
def test_grad_col(self):
utt.verify_grad(add, [numpy.random.rand(3, 5), numpy.random.rand(3, 1)])
class T_exp(unittest.TestCase):
class T_ceil(unittest.TestCase):
def test_complex(self):
self.assertRaises(TypeError, ceil, zvector())
class T_exp(unittest.TestCase):
def test_grad_0(self):
utt.verify_grad(exp, [
numpy.asarray([[ 1.5089518 , 1.48439076, -4.7820262 ],
@@ -1128,6 +1140,19 @@ class T_exp(unittest.TestCase):
numpy.asarray([[ 1.5089518 , 1.48439076, -4.7820262 ],
[ 2.04832468, 0.50791564, -1.58892269]])])
def test_int(self):
x = ivector()
f = function([x], exp(x))
exp_3 = f([3])
assert exp_3.dtype == 'float64'
def test_complex(self):
x = zvector()
assert exp(x).dtype == 'complex128'
f = function([x], exp(x))
exp_3 = f([3+2j])
assert numpy.allclose(exp_3, numpy.exp(3+2j))
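The dtype assertions in `test_int` and `test_complex` mirror NumPy's own upcasting rules for `exp`; a quick NumPy-only sketch (illustrative, not part of the test suite):

```python
import numpy as np

# Integer inputs are upcast: there is no integer loop for exp, and int32
# only safe-casts to float64.
assert np.exp(np.array([3], dtype='int32')).dtype == np.dtype('float64')
# Complex inputs stay complex, and the value matches the scalar result.
assert np.exp(np.array([3 + 2j])).dtype == np.dtype('complex128')
assert np.allclose(np.exp(np.array([3 + 2j])), np.exp(3 + 2j))
```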
class T_divimpl(unittest.TestCase):
def test_impls(self):
i = iscalar()
@@ -109,12 +109,18 @@ class T_RandomStreams(unittest.TestCase):
out = m.random.uniform((2,2))
m.fn = Method([], out)
made = m.make()
made.random.initialize(seed=789)
#as a distraction, install various seeds
made.random.initialize(seed=789)
made.random.seed(888)
rng = numpy.random.RandomState(823874)
made.random[out.rng] = numpy.random.RandomState(823874)
# then replace the rng of the stream we care about via setitem
realseed = 823874
rng = numpy.random.RandomState(realseed)
made.random[out.rng] = numpy.random.RandomState(realseed)
print made.fn()
print rng.uniform(size=(2,2))
fn_val0 = made.fn()
fn_val1 = made.fn()
@@ -153,7 +159,7 @@ class T_RandomStreams(unittest.TestCase):
# ndim specified, consistent with shape, OK
m2 = Module()
m2.random = RandomStreams(234)
m2.fn = Method([], m2.random.uniform(2, (2,2)))
m2.fn = Method([], m2.random.uniform((2,2), ndim=2))
made2 = m2.make()
made2.random.initialize()
@@ -164,7 +170,7 @@ class T_RandomStreams(unittest.TestCase):
# ndim specified, inconsistent with shape, should raise ValueError
m3 = Module()
m3.random = RandomStreams(234)
m3.fn = Method([], m3.random.uniform(1, (2,2)))
m3.fn = Method([], m3.random.uniform((2,2), ndim=1))
made3 = m3.make()
made3.random.initialize()
self.assertRaises(ValueError, made3.fn)
@@ -5,6 +5,7 @@ import numpy as N
from theano.tests import unittest_tools
from theano.tensor.raw_random import *
from theano.tensor import raw_random
from theano import tensor
@@ -12,7 +13,7 @@ from theano import compile, gof
class T_random_function(unittest.TestCase):
def test_basic_usage(self):
rf = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector, -2.0, 2.0)
rf = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector)
assert not rf.inplace
assert getattr(rf, 'destroy_map', {}) == {}
@@ -32,23 +33,21 @@ class T_random_function(unittest.TestCase):
assert numpy.all(f_0 == f_1)
def test_inplace_norun(self):
rf = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector, -2.0, 2.0,
inplace=True)
rf = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector, inplace=True)
assert rf.inplace
assert getattr(rf, 'destroy_map', {}) != {}
def test_args(self):
"""Test that arguments to RandomFunction are honored"""
rf2 = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector, -2.0, 2.0)
rf4 = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector, -4.0, 4.0,
inplace=True)
rf2 = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector)
rf4 = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector, inplace=True)
rng_R = random_state_type()
# use make_node to override some of the self.args
post_r2, out2 = rf2(rng_R, (4,))
post_r2_4, out2_4 = rf2(rng_R, (4,), -4.0)
post_r2, out2 = rf2(rng_R, (4,), -2, 2)
post_r2_4, out2_4 = rf2(rng_R, (4,), -4.0, 2)
post_r2_4_4, out2_4_4 = rf2(rng_R, (4,), -4.0, 4.0)
post_r4, out4 = rf4(rng_R, (4,))
post_r4, out4 = rf4(rng_R, (4,), -4, 4)
f = compile.function(
[compile.In(rng_R, value=numpy.random.RandomState(55), update=post_r4, mutable=True)],
@@ -65,7 +64,7 @@ class T_random_function(unittest.TestCase):
def test_inplace_optimization(self):
"""Test that FAST_RUN includes the random_make_inplace optimization"""
#inplace = False
rf2 = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector, -2.0, 2.0)
rf2 = RandomFunction(numpy.random.RandomState.uniform, tensor.dvector)
rng_R = random_state_type()
# use make_node to override some of the self.args
@@ -92,19 +91,18 @@ class T_random_function(unittest.TestCase):
def test_random_function_ndim(self):
"""Test that random_function helper function accepts ndim as first argument"""
rf2 = random_function(numpy.random.RandomState.uniform, 'float64', -2.0, 2.0)
rng_R = random_state_type()
# ndim is an optional argument indicating the length of the 'shape'
# ndim not specified, OK
post_out4, out4 = rf2(rng_R, (4,))
post_out4, out4 = uniform(rng_R, (4,))
# ndim specified, consistent with shape, OK
post_out1_4, out1_4 = rf2(rng_R, 1, (4,))
post_out2_4_4, out2_4_4= rf2(rng_R, 2, (4, 4))
post_out1_4, out1_4 = uniform(rng_R, (4,), ndim=1)
post_out2_4_4, out2_4_4= uniform(rng_R, (4, 4), ndim=2)
# ndim specified, but not compatible with shape
post_out2_4, out2_4 = rf2(rng_R, 2, (4,))
post_out2_4, out2_4 = uniform(rng_R, (4,), ndim=2)
f_ok = compile.function(
[compile.In(rng_R, value=numpy.random.RandomState(55), update=post_out2_4_4, mutable=True)],
@@ -132,18 +130,31 @@ class T_random_function(unittest.TestCase):
# Specifying a different ndim_added will change the Op's output ndim,
# so numpy.uniform will produce a result of incorrect shape,
# and a ValueError should be raised.
uni_1 = random_function(numpy.random.RandomState.uniform, 'float64', -2.0, 2.0, ndim_added=1)
uni_0 = random_function(numpy.random.RandomState.uniform, 'float64', -2.0, 2.0, ndim_added=0)
uni_m1 = random_function(numpy.random.RandomState.uniform, 'float64', -2.0, 2.0, ndim_added=-1)
def ndim_added_deco(ndim_added):
def randomfunction(random_state, size=(), low=0.0, high=0.0, ndim=None):
ndim, size = raw_random._infer_ndim(ndim, size)
op = RandomFunction('uniform',
tensor.TensorType(dtype = 'float64', broadcastable =
(False,)*(ndim+ndim_added)),
ndim_added=ndim_added)
return op(random_state, size, low, high)
return randomfunction
uni_1 = ndim_added_deco(1)
uni_0 = ndim_added_deco(0)
uni_m1 = ndim_added_deco(-1)
#uni_1 = random_function(numpy.random.RandomState.uniform, 'float64', -2.0, 2.0, ndim_added=1)
#uni_0 = random_function(numpy.random.RandomState.uniform, 'float64', -2.0, 2.0, ndim_added=0)
#uni_m1 = random_function(numpy.random.RandomState.uniform, 'float64', -2.0, 2.0, ndim_added=-1)
rng_R = random_state_type()
p_uni11, uni11 = uni_1(rng_R, 1, (4,))
p_uni12, uni12 = uni_1(rng_R, 2, (3,4))
p_uni01, uni01 = uni_0(rng_R, 1, (4,))
p_uni02, uni02 = uni_0(rng_R, 2, (3,4))
p_unim11, unim11 = uni_m1(rng_R, 1, (4,))
p_unim12, unim12 = uni_m1(rng_R, 2, (3,4))
p_uni11, uni11 = uni_1(rng_R, size=(4,))
p_uni12, uni12 = uni_1(rng_R, size=(3,4))
p_uni01, uni01 = uni_0(rng_R, size=(4,))
p_uni02, uni02 = uni_0(rng_R, size=(3,4))
p_unim11, unim11 = uni_m1(rng_R, size=(4,))
p_unim12, unim12 = uni_m1(rng_R, size=(3,4))
self.assertEqual(uni11.ndim, 2)
self.assertEqual(uni12.ndim, 3)
@@ -320,7 +331,8 @@ class T_random_function(unittest.TestCase):
def test_permutation(self):
"""Test that raw_random.permutation generates the same results as numpy."""
rng_R = random_state_type()
post_r, out = permutation(rng_R, (9,), 6)
post_r, out = permutation(rng_R, size=(9,), n=6)
print 'OUT NDIM', out.ndim
f = compile.function(
[compile.In(rng_R, value=numpy.random.RandomState(55), update=post_r, mutable=True)],
[out], accept_inplace=True)
@@ -365,6 +377,24 @@ class T_random_function(unittest.TestCase):
self.assertTrue(val0.shape == (7,3,5))
self.assertTrue(val1.shape == (7,3,5))
def test_symbolic_shape(self):
rng_R = random_state_type()
shape = tensor.lvector()
post_r, out = uniform(rng_R, shape, ndim=2)
f = compile.function([rng_R, shape], out)
rng_state0 = numpy.random.RandomState(55)
assert f(rng_state0, [2,3]).shape == (2,3)
assert f(rng_state0, [4,8]).shape == (4,8)
self.assertRaises(ValueError, f, rng_state0, [4])
self.assertRaises(ValueError, f, rng_state0, [4,3,4,5])
if __name__ == '__main__':
from theano.tests import main
main("test_raw_random")
@@ -11,8 +11,9 @@ from theano import function
from theano import tensor
from theano import compile, gof
from theano.tests import unittest_tools
class T_RandomStreams(unittest.TestCase):
class T_SharedRandomStreams(unittest.TestCase):
def test_tutorial(self):
srng = RandomStreams(seed=234)
@@ -109,6 +110,96 @@ class T_RandomStreams(unittest.TestCase):
assert numpy.all(fn_val0 == numpy_val0)
assert numpy.all(fn_val1 == numpy_val1)
def test_permutation(self):
"""Test that RandomStreams.uniform generates the same results as numpy"""
# Check over two calls to see if the random state is correctly updated.
random = RandomStreams(234)
fn = function([], random.permutation((20,), 10), updates=random.updates())
fn_val0 = fn()
fn_val1 = fn()
rng_seed = numpy.random.RandomState(234).randint(2**30)
rng = numpy.random.RandomState(int(rng_seed)) #int() is for 32bit
# rng.permutation outputs one vector at a time, so we iterate.
numpy_val0 = numpy.asarray([rng.permutation(10) for i in range(20)])
numpy_val1 = numpy.asarray([rng.permutation(10) for i in range(20)])
assert numpy.all(fn_val0 == numpy_val0)
assert numpy.all(fn_val1 == numpy_val1)
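The two-level seeding used in these comparisons can be sketched in plain NumPy (names here are hypothetical; the pattern follows the `rng_seed` lines above): the container seed feeds one RandomState, and a `randint` draw from it seeds the per-stream generator.

```python
import numpy as np

container_rng = np.random.RandomState(234)
stream_seed = int(container_rng.randint(2 ** 30))  # int() keeps 32-bit platforms happy
stream_rng = np.random.RandomState(stream_seed)

sample = stream_rng.uniform(size=(2, 2))
assert sample.shape == (2, 2)
assert (0.0 <= sample).all() and (sample < 1.0).all()
```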
def test_multinomial(self):
"""Test that RandomStreams.multinomial generates the same results as numpy"""
# Check over two calls to see if the random state is correctly updated.
random = RandomStreams(234)
fn = function([], random.multinomial((4,4), 1, [0.1]*10), updates=random.updates())
fn_val0 = fn()
fn_val1 = fn()
rng_seed = numpy.random.RandomState(234).randint(2**30)
rng = numpy.random.RandomState(int(rng_seed)) #int() is for 32bit
numpy_val0 = rng.multinomial(1, [0.1]*10, size=(4,4))
numpy_val1 = rng.multinomial(1, [0.1]*10, size=(4,4))
assert numpy.all(fn_val0 == numpy_val0)
assert numpy.all(fn_val1 == numpy_val1)
def test_shuffle_row_elements(self):
"""Test that RandomStreams.shuffle_row_elements generates the right results"""
# Check over two calls to see if the random state is correctly updated.
# On matrices, for each row, the elements of that row should be shuffled.
# Note that this differs from numpy.random.shuffle, which shuffles a matrix
# along its first axis only, moving whole rows without permuting the
# elements within each row.
random = RandomStreams(234)
m_input = tensor.dmatrix()
f = function([m_input], random.shuffle_row_elements(m_input), updates=random.updates())
val_rng = numpy.random.RandomState(unittest_tools.fetch_seed())
in_mval = val_rng.uniform(-2, 2, size=(20,5))
fn_mval0 = f(in_mval)
fn_mval1 = f(in_mval)
print in_mval[0]
print fn_mval0[0]
print fn_mval1[0]
assert not numpy.all(in_mval == fn_mval0)
assert not numpy.all(in_mval == fn_mval1)
assert not numpy.all(fn_mval0 == fn_mval1)
rng_seed = numpy.random.RandomState(234).randint(2**30)
rng = numpy.random.RandomState(int(rng_seed))
numpy_mval0 = in_mval.copy()
numpy_mval1 = in_mval.copy()
for row in numpy_mval0:
rng.shuffle(row)
for row in numpy_mval1:
rng.shuffle(row)
assert numpy.all(numpy_mval0 == fn_mval0)
assert numpy.all(numpy_mval1 == fn_mval1)
# On vectors, the behaviour is the same as numpy.random.shuffle,
# except that it does not work in place, but returns a shuffled vector.
random1 = RandomStreams(234)
v_input = tensor.dvector()
f1 = function([v_input], random1.shuffle_row_elements(v_input))
in_vval = val_rng.uniform(-3, 3, size=(12,))
fn_vval = f1(in_vval)
numpy_vval = in_vval.copy()
vrng = numpy.random.RandomState(int(rng_seed))
vrng.shuffle(numpy_vval)
print in_vval
print fn_vval
print numpy_vval
assert numpy.all(numpy_vval == fn_vval)
# Trying to shuffle a vector with function that should shuffle
# matrices, or vice versa, raises a TypeError
self.assertRaises(TypeError, f1, in_mval)
self.assertRaises(TypeError, f, in_vval)
if __name__ == '__main__':
from theano.tests import main