Commit 958efe87 authored by Chinnadhurai Sankar

merged file

......@@ -122,8 +122,6 @@ An op has to implement some methods defined in the interface of
the method :func:`make_node` or :attr:`itypes`, :attr:`otypes` and one of the
implementation methods, either :func:`perform`, :meth:`Op.c_code`
or :func:`make_thunk`.
method :func:`make_node` and one of the implementation methods, either
:func:`perform`, :meth:`Op.c_code` or :func:`make_thunk`.
The :func:`make_node` method creates an Apply node representing the application
of the op on the inputs provided. This method is responsible for three things:
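As a rough illustration of the split described above, here is a minimal pure-Python sketch (not the real ``theano.gof.Op`` API; the class names and signatures are simplified assumptions): :func:`make_node` builds the graph node, while :func:`perform` supplies the numeric implementation and writes into preallocated output storage.

```python
# Hypothetical sketch of the make_node / perform division of labour.
class Apply:
    """A graph node recording which op is applied to which inputs."""
    def __init__(self, op, inputs, outputs):
        self.op, self.inputs, self.outputs = op, inputs, outputs


class DoubleOp:
    def make_node(self, x):
        # Graph construction: validate inputs and allocate output slots.
        return Apply(self, [x], [None])

    def perform(self, node, inputs, output_storage):
        # Numeric implementation: write the result into output_storage.
        output_storage[0][0] = 2 * inputs[0]


node = DoubleOp().make_node(3)
out = [[None]]
node.op.perform(node, [3], out)
# out[0][0] is now 6
```

In real Theano, :func:`make_node` additionally checks input types and wraps outputs in typed Variables; this sketch only shows the control flow.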
......
......@@ -117,21 +117,24 @@ A contributor made rpm package for Mandriva_ 2010.2 of Theano 0.3.1.
.. _linux_basic:
Docker images
~~~~~~~~~~~~~
Docker
~~~~~~
Builds of Theano are available as `Docker <https://www.docker.com/whatisdocker>`_ images:
`Theano Docker (CPU) <https://hub.docker.com/r/kaixhin/theano/>`_ or `Theano Docker (CUDA) <https://hub.docker.com/r/kaixhin/cuda-theano/>`_.
These are updated on a weekly basis with bleeding-edge builds of Theano. Examples of running bash in a Docker container
are as follows:
Builds of Theano are available as `Docker <https://www.docker.com>`_
images: `Theano Docker (CPU) <https://hub.docker.com/r/kaixhin/theano/>`_ or
`Theano Docker (CUDA) <https://hub.docker.com/r/kaixhin/cuda-theano/>`_. These
are updated on a weekly basis with bleeding-edge builds of Theano.
Examples of running bash in a Docker container are as follows:
.. code-block:: bash
sudo docker run -it kaixhin/theano
sudo docker run -it --device /dev/nvidiactl --device /dev/nvidia-uvm --device /dev/nvidia0 kaixhin/cuda-theano:7.0
sudo nvidia-docker run -it kaixhin/cuda-theano:7.0
For a guide to Docker, see the `official docs <https://docs.docker.com/userguide/>`_. For more details on how to use the
Theano Docker images, including requirements for CUDA support, consult the `source project <https://github.com/Kaixhin/dockerfiles>`_.
For a guide to Docker, see the `official docs <https://docs.docker.com>`_.
CUDA support requires `NVIDIA Docker <https://github.com/NVIDIA/nvidia-docker>`_.
For more details on how to use the Theano Docker images,
consult the `source project <https://github.com/Kaixhin/dockerfiles>`_.
Basic user install instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -87,7 +87,9 @@ Environment Variables
with later (right-most) files taking priority over earlier files in the
case that multiple files specify values for a common configuration option.
For example, to override system-wide settings with personal ones,
set ``THEANORC=/etc/theanorc:~/.theanorc``.
set ``THEANORC=/etc/theanorc:~/.theanorc``. To load configuration files in
the current working directory, append ``.theanorc`` to the list of configuration
files, e.g. ``THEANORC=~/.theanorc:.theanorc``.
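The override order described above can be sketched in plain Python (this is a hypothetical illustration, not Theano's actual loader): ``configparser.ConfigParser.read()`` applies files in order, so values from later (right-most) files win, and keys unset in later files fall through to earlier ones.

```python
import configparser
import os
import tempfile

# Hypothetical sketch of colon-separated THEANORC resolution.
def load_theanorc(theanorc):
    paths = [os.path.expanduser(p) for p in theanorc.split(":")]
    cfg = configparser.ConfigParser()
    cfg.read(paths)  # missing files are silently skipped
    return cfg

with tempfile.TemporaryDirectory() as d:
    system = os.path.join(d, "etc_theanorc")
    personal = os.path.join(d, "home_theanorc")
    with open(system, "w") as f:
        f.write("[global]\ndevice = cpu\nfloatX = float64\n")
    with open(personal, "w") as f:
        f.write("[global]\ndevice = gpu\n")
    cfg = load_theanorc(system + ":" + personal)
    result = (cfg["global"]["device"], cfg["global"]["floatX"])
# result == ("gpu", "float64"): the personal file overrides "device",
# while "floatX" falls through from the system-wide file.
```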
Config Attributes
=====================
......
......@@ -40,7 +40,9 @@ If adjusting hyperparameters doesn't work for you, you can still get help from
Theano's NanGuardMode. Change the mode of your Theano function to NanGuardMode
and run it again. NanGuardMode will monitor all input/output variables in
each node, and raise an error if NaNs are detected. For how to use the
``NanGuardMode``, please refer to :ref:`nanguardmode`.
``NanGuardMode``, please refer to :ref:`nanguardmode`. Using ``optimizer_including=alloc_empty_to_zeros``
with ``NanGuardMode`` can help detect NaNs; for more information, please refer
to :ref:`AllocEmpty`.
DebugMode can also help. Run your code in DebugMode with flag
``mode=DebugMode,DebugMode.check_py=False``. This will give you a clue about which
......@@ -76,3 +78,13 @@ Cuda Specific Option
The Theano flag ``nvcc.fastmath=True`` can generate NaNs. Don't set
this flag while debugging NaNs.
.. _AllocEmpty:
NaN Introduced by AllocEmpty
-----------------------------------------------
AllocEmpty is used by many operations, such as scan, to allocate memory without properly clearing it, because the allocated memory will subsequently be overwritten. However, this can sometimes introduce NaNs, depending on the operation and on what was previously stored in the memory it works on. For instance, trying to zero out memory using a multiplication before applying an operation can produce NaN if NaN is already present in that memory, since `0 * NaN => NaN`.
Using ``optimizer_including=alloc_empty_to_zeros`` replaces `AllocEmpty` by `Alloc{0}`, which is helpful to diagnose where NaNs come from. Please note that when running in `NanGuardMode`, this optimizer is not included by default, so it can be helpful to use them both together.
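The `0 * NaN => NaN` pitfall mentioned above is plain IEEE-754 float behaviour and can be checked directly in Python, with no Theano required:

```python
import math

# Stale NaN left behind in reused, uninitialized (AllocEmpty-style) memory.
leftover = float("nan")

# Attempting to "clear" the memory with a multiplication does not work:
# 0 * NaN propagates the NaN instead of producing 0.
zeroed = 0.0 * leftover

print(math.isnan(zeroed))  # True: the NaN survives the multiplication
```

This is why replacing `AllocEmpty` with `Alloc{0}` (which writes real zeros) changes the outcome, and why it is useful when hunting for the origin of NaNs.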
......@@ -25,7 +25,7 @@ from theano.gof import (graph, utils, link, ops_with_inner_function)
from theano.gof.link import raise_with_op
from theano.compile.function_module import (
FunctionMaker, Function, infer_reuse_pattern,
SymbolicInputKit, SymbolicOutput, Supervisor, std_fgraph)
SymbolicOutput, Supervisor, std_fgraph)
from theano.compile.mode import Mode, register_mode
from theano.compile.ops import OutputGuard
......@@ -2517,27 +2517,9 @@ class _Maker(FunctionMaker): # inheritance buys a few helper functions
# default.storage to input_storage.
if indices is not None:
raise TypeError("Cannot take a Container instance as "
"default for a SymbolicInputKit.")
"default for a SymbolicInput.")
input_storage.append(default.storage)
default = None
elif isinstance(input, SymbolicInputKit):
# If the input is a SymbolicInputKit, it represents more than
# one storage unit. The indices and subinputs lists represent
# which of the kit's inputs are active in this graph, so we
# make as many storage units as needed
if isinstance(default, (list, tuple)) \
and all(isinstance(x, gof.Container) for x in default):
if len(default) == len(indices):
input_storage += [x.storage for x in default]
elif len(default) > len(indices):
input_storage += [default[i].storage for i in indices]
else:
raise ValueError(
'Not enough storage for SymbolicInputKit',
input, indices, default)
default = _NODEFAULT
else:
input_storage += [[None] for i in indices]
else:
# Normal case: one new, independent storage unit
input_storage.append([None])
......@@ -2550,16 +2532,7 @@ class _Maker(FunctionMaker): # inheritance buys a few helper functions
# storage after each function call
# - value is the value that will be put in the storage initially
# Even though a SymbolicInputKit represents more than one input,
# we still only have one entry for the defaults list.
if isinstance(input, SymbolicInputKit):
if default is _NODEFAULT:
_defaults.append((False, False, None))
elif default is None:
_defaults.append((True, True, None))
else:
_defaults.append((False, False, default))
elif input.update is not None:
if input.update is not None:
# If the input has an update, then (logically) it is
# not required since it is just a parameter and of
# course we don't want to refeed the default back into
......
......@@ -16,12 +16,11 @@ import numpy
import theano
from theano import config, gof
from functools import partial
from theano.compat import izip
from theano.gof import graph
import theano.compile.mode
from theano.compile.io import (
In, SymbolicInput, SymbolicInputKit, SymbolicOutput)
In, SymbolicInput, SymbolicOutput)
from theano.compile.ops import deep_copy_op, view_op
from theano.gof.graph import is_same_graph
from theano.gof.op import ops_with_inner_function
......@@ -286,7 +285,7 @@ class Function(object):
indices = None
"""
List of (SymbolicInput|SymbolicInputKit, indices, [SymbolicInput,...]),
List of (SymbolicInput, indices, [SymbolicInput,...]),
one tuple for each input.
The first tuple element is the SymbolicInput object for the corresponding
......@@ -396,7 +395,6 @@ class Function(object):
# self.input_storage inplace.
for i, ((input, indices, sinputs), (required, refeed, value)) in \
enumerate(zip(self.indices, defaults)):
# this is true iff input is not a SymbolicInputKit
if indices is None:
# containers is being used as a stack. Here we pop off
# the next one.
......@@ -432,41 +430,6 @@ class Function(object):
named_inputs.append(input.name)
inv_finder[c] = input
containers[:1] = []
else:
# TODO The following code may need to do something to handle
# implicit inputs.
# The input is a SymbolicInputKit, so we take as many
# containers as the Kit provides inputs
cs = containers[:len(indices)]
# distribute does the initialization of the containers
input.distribute(value, indices, cs)
f = partial(distribute, indices, cs)
# Like before, we set a finder entry for the kit. Note that
# we are not mapping to a container but to a function which
# can reinitialize all the containers
finder[i] = f
finder[input] = f
if input.name not in finder:
finder[input.name] = f
else:
finder[input.name] = DUPLICATE
# For each input in the kit and its corresponding
# container, we put an entry in finder. This allows
# the user to micro-manage elements of the kit if need
# be. All containers inherit the required field and
# have their own "provided" counter
for c, sin in zip(cs, sinputs):
finder[sin.variable] = c
finder[sin.name] = c
if sin.name not in finder:
finder[sin.name] = c
else:
finder[sin.name] = DUPLICATE
inv_finder[c] = input
c.required = required
c.provided = 0
containers[:len(indices)] = []
self.finder = finder
self.inv_finder = inv_finder
......@@ -1033,16 +996,8 @@ def _pickle_Function(f):
for (input, indices, inputs), (required, refeed, default) in \
zip(f.indices, f.defaults):
if isinstance(input, SymbolicInputKit):
li = len(indices)
if not default:
input_storage.append(ins[:li])
else:
input_storage.append(default)
ins[:li] = []
else:
input_storage.append(ins[0])
del ins[0]
input_storage.append(ins[0])
del ins[0]
inputs_data = [x.data for x in f.input_storage]
......@@ -1210,7 +1165,7 @@ class FunctionMaker(object):
@staticmethod
def wrap_in(input):
if isinstance(input, (SymbolicInput, SymbolicInputKit)):
if isinstance(input, (SymbolicInput)):
return input
elif isinstance(input, gof.Variable):
# r -> SymbolicInput(variable=r)
......@@ -1234,9 +1189,10 @@ class FunctionMaker(object):
# instances in inputs. For SymbolicInput, this returns None
# as the list of indices and a list with just the
# SymbolicInput.
if isinstance(sinput, SymbolicInputKit):
return sinput.complete(rinputs)
elif isinstance(sinput, SymbolicInput):
# if isinstance(sinput, SymbolicInputKit):
# return sinput.complete(rinputs)
# elif isinstance(sinput, SymbolicInput):
if isinstance(sinput, SymbolicInput):
return [None, [sinput]]
@staticmethod
......@@ -1858,7 +1814,7 @@ def convert_function_input(input):
`In`(r, name=name, value=val, update=up, autoname=True)
"""
if isinstance(input, (SymbolicInput, SymbolicInputKit)):
if isinstance(input, SymbolicInput):
return input
elif isinstance(input, gof.Constant):
raise TypeError('A Constant instance is not a legal function input',
......@@ -1887,7 +1843,7 @@ def convert_function_input(input):
else:
raise TypeError("Invalid input syntax: %s (check "
"documentation or use an In instance)" % orig)
elif isinstance(input[0], (SymbolicInput, SymbolicInputKit)):
elif isinstance(input[0], SymbolicInput):
if len(input) == 1:
return input[0]
elif len(input) == 2:
......
......@@ -97,69 +97,6 @@ class SymbolicInput(object):
return str(self)
# TODO: FB: I think this isn't used, confirm this and remove.
class SymbolicInputKit(object):
"""
Represents a group ("kit") of SymbolicInputs. If fed into function or
FunctionMaker, only the inputs which are needed to compile the function
properly will be taken.
A SymbolicInputKit provides the distribute function in order to set or
initialize several inputs from a single value. Specialized Kits should
override it.
"""
def __init__(self, name):
if not isinstance(name, string_types):
raise TypeError('name must be a string (got: %s)' % name)
self.name = name
self.sinputs = []
self.variables = []
def add_input(self, sinput):
"""
Add a SymbolicInput to this SymbolicInputKit.
It will be given the next available index.
"""
self.sinputs.append(sinput)
self.variables.append(sinput.variable)
def distribute(self, value, indices, containers):
"""
Given a list of indices corresponding to SymbolicInputs in this kit
as well as a corresponding list of containers, initialize all the
containers using the provided value.
"""
raise NotImplementedError
def complete(self, inputs):
"""
Given inputs (a list of Variable instances), checks through all the
SymbolicInputs in the kit and return a sorted list of indices and a list
of their corresponding SymbolicInputs such that each of them represents
some variable in the inputs list.
Not all the provided inputs will have a corresponding SymbolicInput in
the kit.
"""
ret = []
for input in inputs:
try:
i = self.variables.index(input)
ret.append((i, self.sinputs[i]))
except ValueError:
pass
ret.sort()
if not ret:
return [[], []]
return list(zip(*ret))
class In(SymbolicInput):
"""
Represents a symbolic input for use with function or FunctionMaker.
......
......@@ -88,6 +88,7 @@ def _atexit_print_fn():
merge = cum.optimizer_profile[0].merge_profile(
cum.optimizer_profile[1],
ps.optimizer_profile[1])
assert len(merge) == len(cum.optimizer_profile[1])
cum.optimizer_profile = (cum.optimizer_profile[0], merge)
except Exception as e:
print("Got an exception while merging profile")
......
......@@ -5,6 +5,7 @@ Generate and compile C modules for Python.
from __future__ import absolute_import, print_function, division
import atexit
import textwrap
import six.moves.cPickle as pickle
import logging
import os
......@@ -24,6 +25,7 @@ import numpy.distutils # TODO: TensorType should handle this
import theano
from theano.compat import PY3, decode, decode_iter
from six import b, BytesIO, StringIO, string_types, iteritems
from six.moves import xrange
from theano.gof.utils import flatten
from theano.configparser import config
from theano.gof.utils import hash_from_code
......@@ -125,9 +127,7 @@ class ExtFunction(object):
class DynamicModule(object):
"""
WRITEME
"""
def __init__(self, name=None):
assert name is None, (
"The 'name' parameter of DynamicModule"
......@@ -1807,6 +1807,34 @@ class Compiler(object):
output=output, compiler=compiler)
def try_march_flag(flags):
"""
Try to compile and run a simple C snippet using current flags.
Return: compilation success (True/False), execution success (True/False)
"""
test_code = textwrap.dedent("""\
#include <cmath>
using namespace std;
int main(int argc, char** argv)
{
float Nx = -1.3787706641;
float Sx = 25.0;
double r = Nx + sqrt(Sx);
if (abs(r - 3.621229) > 0.01)
{
return -1;
}
return 0;
}
""")
cflags = flags + ['-L' + d for d in theano.gof.cmodule.std_lib_dirs()]
compilation_result, execution_result = GCC_compiler.try_compile_tmp(
test_code, tmp_prefix='try_march_',
flags=cflags, try_run=True)
return compilation_result, execution_result
class GCC_compiler(Compiler):
# The equivalent flags of --march=native used by g++.
march_flags = None
......@@ -2027,6 +2055,54 @@ class GCC_compiler(Compiler):
_logger.info("g++ -march=native equivalent flags: %s",
GCC_compiler.march_flags)
# Find working march flag:
# -- if current GCC_compiler.march_flags works, we're done.
    # -- else replace -march and -mtune with ['corei7-avx', 'corei7', 'core2']
# and retry with all other flags and arguments intact.
# -- else remove all other flags and only try with -march = default + flags_to_try.
# -- if none of that worked, set GCC_compiler.march_flags = [] (for x86).
default_compilation_result, default_execution_result = try_march_flag(GCC_compiler.march_flags)
if not default_compilation_result or not default_execution_result:
march_success = False
march_ind = None
mtune_ind = None
default_detected_flag = []
march_flags_to_try = ['corei7-avx', 'corei7', 'core2']
for m_ in xrange(len(GCC_compiler.march_flags)):
march_flag = GCC_compiler.march_flags[m_]
if 'march' in march_flag:
march_ind = m_
default_detected_flag = [march_flag]
elif 'mtune' in march_flag:
mtune_ind = m_
for march_flag in march_flags_to_try:
if march_ind is not None:
GCC_compiler.march_flags[march_ind] = '-march=' + march_flag
if mtune_ind is not None:
GCC_compiler.march_flags[mtune_ind] = '-mtune=' + march_flag
compilation_result, execution_result = try_march_flag(GCC_compiler.march_flags)
if compilation_result and execution_result:
march_success = True
break
if not march_success:
# perhaps one of the other flags was problematic; try default flag in isolation again:
march_flags_to_try = default_detected_flag + march_flags_to_try
for march_flag in march_flags_to_try:
compilation_result, execution_result = try_march_flag(['-march=' + march_flag])
if compilation_result and execution_result:
march_success = True
GCC_compiler.march_flags = ['-march=' + march_flag]
break
if not march_success:
GCC_compiler.march_flags = []
# Add the detected -march=native equivalent flags
if march_flags and GCC_compiler.march_flags:
cxxflags.extend(GCC_compiler.march_flags)
......@@ -2099,7 +2175,6 @@ class GCC_compiler(Compiler):
include_dirs=None, lib_dirs=None, libs=None,
preargs=None, py_module=True, hide_symbols=True):
"""
Parameters
----------
module_name : str
......
......@@ -315,17 +315,17 @@ class SeqOptimizer(Optimizer, list):
" time - (name, class, index, nodes before, nodes after) - validate time",
file=stream)
ll = []
for opt in opts:
for (opt, nb_n) in zip(opts, nb_nodes):
if hasattr(opt, "__name__"):
name = opt.__name__
else:
name = opt.name
idx = opts.index(opt)
ll.append((name, opt.__class__.__name__,
idx))
lll = sorted(zip(prof, ll, nb_nodes), key=lambda a: a[0])
idx) + nb_n)
lll = sorted(zip(prof, ll), key=lambda a: a[0])
for (t, opt, nb_n) in lll[::-1]:
for (t, opt) in lll[::-1]:
i = opt[2]
if sub_validate_time:
val_time = sub_validate_time[i + 1] - sub_validate_time[i]
......@@ -345,8 +345,8 @@ class SeqOptimizer(Optimizer, list):
        Merge 2 profiles returned by this class' apply() fct.
"""
new_t = []
new_l = []
new_t = [] # the time for the optimization
new_l = [] # the optimization
new_sub_profile = []
# merge common(same object) opt
for l in set(prof1[0]).intersection(set(prof2[0])):
......@@ -399,6 +399,12 @@ class SeqOptimizer(Optimizer, list):
new_sub_profile.append(p[6][idx])
new_opt = SeqOptimizer(*new_l)
new_nb_nodes = []
for p1, p2 in zip(prof1[8], prof2[8]):
new_nb_nodes.append((p1[0] + p2[0], p1[1] + p2[1]))
new_nb_nodes.extend(prof1[8][len(new_nb_nodes):])
new_nb_nodes.extend(prof2[8][len(new_nb_nodes):])
new_callbacks_times = merge_dict(prof1[9], prof2[9])
# We need to assert based on the name as we merge also based on
# the name.
......@@ -410,6 +416,7 @@ class SeqOptimizer(Optimizer, list):
return (new_opt, new_t, prof1[2] + prof2[2],
prof1[3] + prof2[3],
-1, -1, new_sub_profile, [],
new_nb_nodes,
new_callbacks_times)
......@@ -2313,10 +2320,18 @@ class EquilibriumOptimizer(NavigatorOptimizer):
"%f with the theano flag 'optdb.max_use_ratio'." %
config.optdb.max_use_ratio)
fgraph.remove_feature(change_tracker)
assert len(loop_process_count) == len(loop_timing)
assert len(loop_process_count) == len(global_opt_timing)
assert len(loop_process_count) == len(nb_nodes)
assert len(loop_process_count) == len(io_toposort_timing)
assert len(loop_process_count) == len(global_sub_profs)
assert len(loop_process_count) == len(final_sub_profs)
assert len(loop_process_count) == len(cleanup_sub_profs)
return (self, loop_timing, loop_process_count,
(start_nb_nodes, end_nb_nodes, max_nb_nodes),
global_opt_timing, nb_nodes, time_opts, io_toposort_timing,
node_created, global_sub_profs, final_sub_profs, cleanup_sub_profs)
node_created, global_sub_profs, final_sub_profs,
cleanup_sub_profs)
def print_summary(self, stream=sys.stdout, level=0, depth=-1):
name = getattr(self, 'name', None)
......
......@@ -75,10 +75,11 @@ class GpuElemwise(HideC, Elemwise):
pass
try:
support_code = self.scalar_op.c_support_code()
if (support_code.strip() != "#define THEANO_MACRO_MOD(x,y) (x % y)" and
support_code.strip() != ""):
if "struct" in support_code:
# The macro is fine, the C++ struct is not.
raise SupportCodeError(support_code)
                    raise SupportCodeError(
                        "structs are not supported in GpuElemwise "
                        "support_code: " + support_code)
except MethodNotDefined:
pass
......
......@@ -673,6 +673,15 @@ def local_gpua_advanced_incsubtensor(node, context_name):
set_instead_of_inc=set_instead_of_inc)
@register_inplace()
@local_optimizer([GpuAdvancedIncSubtensor1, GpuAdvancedIncSubtensor1_dev20])
def local_advincsub1_gpua_inplace(node):
if isinstance(node.op, (GpuAdvancedIncSubtensor1,
GpuAdvancedIncSubtensor1_dev20)):
if not node.op.inplace:
return [node.op.clone_inplace()(*node.inputs)]
@register_opt('fast_compile')
@op_lifter([tensor.CAReduce, tensor.Sum, tensor.elemwise.Prod])
def local_gpua_careduce(node, context_name):
......@@ -881,6 +890,10 @@ def local_gpua_softmaxwithbias(node, context_name):
@register_opt('fast_compile')
@op_lifter([theano.tensor.opt.Assert])
def local_assert(node, context_name):
# Check if input nodes are already on the GPU
if isinstance(node.inputs[0].type, GpuArrayType):
return
return [host_from_gpu(node.op(as_gpuarray_variable(node.inputs[0],
context_name),
*node.inputs[1:]))]
......@@ -946,7 +959,7 @@ def local_lift_abstractconv2d(node, context_name):
return [node.op(*inps)]
# Register this here so that it goes after the abstract lifting
register_opt()(conv_groupopt)
register_opt('fast_compile')(conv_groupopt)
@register_opt("low_memory")
......
......@@ -608,11 +608,6 @@ class GpuAdvancedIncSubtensor1(Op):
}
step[0] = 0;
num_indices = PyArray_SIZE(%(ind)s);
if ((num_indices - 1) > LONG_MAX) {
PyErr_Format(PyExc_AssertionError,
"num_indices %%lld exceeds LONG_MAX + 1", (long long)num_indices);
%(fail)s
}
if (!%(inplace)s) {
%(out)s = theano_try_copy(%(out)s, %(x)s);
if (%(out)s == NULL)
......@@ -622,42 +617,49 @@ class GpuAdvancedIncSubtensor1(Op):
%(out)s = %(x)s;
Py_INCREF(%(out)s);
}
broadcast_y = PyGpuArray_DIM(%(y)s, 0) == 1;
for (j = 0; j < num_indices; j++) {
start[0] = *(dtype_%(ind)s *)PyArray_GETPTR1(%(ind)s, j);
if (start[0] < 0)
start[0] += PyGpuArray_DIM(%(out)s, 0);
if (start[0] < 0 || start[0] >= PyGpuArray_DIM(%(out)s, 0)) {
PyErr_SetString(PyExc_IndexError, "index out of bounds");
%(fail)s;
if (num_indices != 0) {
if ((num_indices - 1) > LONG_MAX) {
PyErr_Format(PyExc_AssertionError,
"num_indices %%lld exceeds LONG_MAX + 1", (long long)num_indices);
%(fail)s
}
row_x = pygpu_index(%(out)s, start, (ssize_t *)PyGpuArray_DIMS(%(out)s), step);
if (row_x == NULL)
%(fail)s;
if (broadcast_y)
start[0] = 0;
else
start[0] = j;
broadcast_y = PyGpuArray_DIM(%(y)s, 0) == 1;
for (j = 0; j < num_indices; j++) {
start[0] = *(dtype_%(ind)s *)PyArray_GETPTR1(%(ind)s, j);
if (start[0] < 0)
start[0] += PyGpuArray_DIM(%(out)s, 0);
if (start[0] < 0 || start[0] >= PyGpuArray_DIM(%(out)s, 0)) {
PyErr_SetString(PyExc_IndexError, "index out of bounds");
%(fail)s;
}
row_x = pygpu_index(%(out)s, start, (ssize_t *)PyGpuArray_DIMS(%(out)s), step);
if (row_x == NULL)
%(fail)s;
if (broadcast_y)
start[0] = 0;
else
start[0] = j;
row_y = pygpu_index(%(y)s, start, (ssize_t *)PyGpuArray_DIMS(%(y)s), step);
if (row_y == NULL) {
Py_DECREF(row_x);
%(fail)s;
}
row_y = pygpu_index(%(y)s, start, (ssize_t *)PyGpuArray_DIMS(%(y)s), step);
if (row_y == NULL) {
if (%(set_instead_of_inc)s) {
ret = GpuArray_setarray(&row_x->ga, &row_y->ga);
} else {
void *args[2];
args[0] = (void *)&row_x->ga;
args[1] = (void *)&row_y->ga;
ret = GpuElemwise_call(iadd, args, GE_BROADCAST);
}
Py_DECREF(row_x);
%(fail)s;
Py_DECREF(row_y);
if (ret != GA_NO_ERROR)
PyErr_SetString(PyExc_RuntimeError, "Failed to set/inc elements");
}
if (%(set_instead_of_inc)s) {
ret = GpuArray_setarray(&row_x->ga, &row_y->ga);
} else {
void *args[2];
args[0] = (void *)&row_x->ga;
args[1] = (void *)&row_y->ga;
ret = GpuElemwise_call(iadd, args, GE_BROADCAST);
}
Py_DECREF(row_x);
Py_DECREF(row_y);
if (ret != GA_NO_ERROR)
PyErr_SetString(PyExc_RuntimeError, "Failed to set/inc elements");
}
""" % dict(x=inputs[0], y=inputs[1], ind=inputs[2], out=outputs[0],
fail=sub['fail'], inplace=int(self.inplace),
......@@ -665,7 +667,7 @@ class GpuAdvancedIncSubtensor1(Op):
set_instead_of_inc=int(self.set_instead_of_inc))
def c_code_cache_version(self):
return (0,)
return (1,)
class GpuAdvancedIncSubtensor1_dev20(GpuKernelBase, HideC,
......
......@@ -590,6 +590,8 @@ def get_scalar_constant_value(orig_v, elemwise=True,
----------
elemwise : bool
        If False, we won't try to go into elemwise, so this call is faster.
        But we still investigate the `Second` Elemwise case (as it is a
        substitute for Alloc).
only_process_constants : bool
If True, we only attempt to obtain the value of `orig_v` if it's
directly constant and don't try to dig through dimshuffles, fills,
......@@ -650,14 +652,18 @@ def get_scalar_constant_value(orig_v, elemwise=True,
ret = [[None]]
v.owner.op.perform(v.owner, const, ret)
return ret[0][0].copy()
elif elemwise and isinstance(v.owner.op, Elemwise):
        # In fast_compile, we don't enable local_fill_to_alloc, so
        # we need to treat Second as Alloc; that is why `elemwise`
        # does not disable the check for Second.
elif isinstance(v.owner.op, Elemwise):
if isinstance(v.owner.op.scalar_op, scal.Second):
# We don't need both input to be constant for second
shp, val = v.owner.inputs
v = val
continue
elif isinstance(v.owner.op.scalar_op,
get_scalar_constant_value_elemwises):
elif elemwise and isinstance(
v.owner.op.scalar_op,
get_scalar_constant_value_elemwises):
const = [get_scalar_constant_value(i)
for i in v.owner.inputs]
ret = [[None]]
......
File mode changed from 100644 to 100755
......@@ -905,9 +905,11 @@ def softmax_simplifier(numerators, denominators):
matching_denom = denominator
break
if matching_denom:
softmax = softmax_op(x)
copy_stack_trace(numerator, softmax)
numerators.remove(numerator)
denominators.remove(matching_denom)
numerators.append(softmax_op(x))
numerators.append(softmax)
return numerators, denominators
opt.local_mul_canonizer.add_simplifier(softmax_simplifier, 'softmax_simplifier')
......
......@@ -612,6 +612,7 @@ def local_exp_over_1_plus_exp(node):
else:
# case: 1/(1+exp(x))
sigmoids.append(sigmoid(-t))
copy_stack_trace(node.outputs[0], sigmoids[-1])
if not sigmoids: # we didn't find any. abort
return
......@@ -625,12 +626,17 @@ def local_exp_over_1_plus_exp(node):
if num_neg ^ denom_neg:
new_num = -new_num
copy_stack_trace(num, new_num)
if len(denom_rest) == 0:
return [new_num]
elif len(denom_rest) == 1:
return [new_num / denom_rest[0]]
out = new_num / denom_rest[0]
else:
return [new_num / tensor.mul(*denom_rest)]
out = new_num / tensor.mul(*denom_rest)
copy_stack_trace(node.outputs[0], out)
return [out]
def parse_mul_tree(root):
......@@ -923,6 +929,7 @@ def local_sigm_times_exp(node):
exp(x) * sigm(-x) -> sigm(x)
exp(-x) * sigm(x) -> sigm(-x)
todo: add stack traces to the intermediate variables
"""
# Bail early if it is not a multiplication.
if node.op != tensor.mul:
......
......@@ -7,6 +7,7 @@ from nose.tools import assert_raises
import theano
from theano import tensor
from theano.gof.opt import check_stack_trace
from theano.tests import unittest_tools as utt
from theano.tensor.nnet import corr, abstract_conv as conv
from theano.tensor.nnet.abstract_conv import get_conv_output_shape
......@@ -98,7 +99,7 @@ class BaseTestConv2d(unittest.TestCase):
def run_fwd(self, inputs_shape, filters_shape, ref=conv_corr,
subsample=(1, 1), verify_grad=True, mode=None,
border_mode='valid', filter_flip=True, provide_shape=False,
target_op=None):
target_op=None, check_trace=False):
inputs_val = numpy.random.random(inputs_shape).astype('float32')
filters_val = numpy.random.random(filters_shape).astype('float32')
......@@ -133,8 +134,9 @@ class BaseTestConv2d(unittest.TestCase):
if target_op is not None:
assert any([isinstance(n.op, target_op) for n
in f.maker.fgraph.toposort()])
if check_trace:
self.assertTrue(check_stack_trace(f, ops_to_check=target_op))
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
res_ref = numpy.array(f_ref())
res = numpy.array(f())
utt.assert_allclose(res_ref, res)
......@@ -148,7 +150,7 @@ class BaseTestConv2d(unittest.TestCase):
def run_gradweight(self, inputs_shape, filters_shape, output_shape,
ref=conv_corr_gw, subsample=(1, 1), filter_flip=True,
verify_grad=True, mode=None, border_mode='valid',
provide_shape=False, target_op=None):
provide_shape=False, target_op=None, check_trace=False):
inputs_val = numpy.random.random(inputs_shape).astype('float32')
output_val = numpy.random.random(output_shape).astype('float32')
......@@ -177,12 +179,13 @@ class BaseTestConv2d(unittest.TestCase):
subsample=subsample,
conv_mode=conv_mode)
f = theano.function([], c, mode=mode)
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
f_ref = theano.function([], c_ref, mode='FAST_RUN')
if target_op is not None:
assert any([isinstance(n.op, target_op) for n
in f.maker.fgraph.toposort()])
if check_trace:
self.assertTrue(check_stack_trace(f, ops_to_check=target_op))
res_ref = numpy.array(f_ref())
res = numpy.array(f())
......@@ -201,7 +204,7 @@ class BaseTestConv2d(unittest.TestCase):
def run_gradinput(self, inputs_shape, filters_shape, output_shape,
ref=conv_corr_gi, subsample=(1, 1), filter_flip=True,
verify_grad=True, mode=None, border_mode='valid',
provide_shape=False, target_op=None):
provide_shape=False, target_op=None, check_trace=False):
output_val = numpy.random.random(output_shape).astype('float32')
filters_val = numpy.random.random(filters_shape).astype('float32')
......@@ -227,12 +230,13 @@ class BaseTestConv2d(unittest.TestCase):
border_mode=border_mode, subsample=subsample,
conv_mode=conv_mode)
f = theano.function([], c, mode=mode)
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
f_ref = theano.function([], c_ref, mode='FAST_RUN')
if target_op is not None:
assert any([isinstance(n.op, target_op) for n
in f.maker.fgraph.toposort()])
if check_trace:
self.assertTrue(check_stack_trace(f, ops_to_check=target_op))
res_ref = numpy.array(f_ref())
res = numpy.array(f())
......@@ -291,15 +295,18 @@ class TestCorrConv2d(BaseTestConv2d):
raise SkipTest("Need blas to test conv2d")
self.run_fwd(inputs_shape=i, filters_shape=f, subsample=s,
verify_grad=True, provide_shape=provide_shape,
border_mode=b, filter_flip=flip, target_op=CorrMM)
border_mode=b, filter_flip=flip, target_op=CorrMM,
check_trace=True)
self.run_gradweight(inputs_shape=i, filters_shape=f,
output_shape=o, subsample=s, verify_grad=True,
provide_shape=provide_shape, border_mode=b,
filter_flip=flip, target_op=CorrMM_gradWeights)
filter_flip=flip, target_op=CorrMM_gradWeights,
check_trace=True)
self.run_gradinput(inputs_shape=i, filters_shape=f,
output_shape=o, subsample=s, verify_grad=True,
provide_shape=provide_shape, border_mode=b,
filter_flip=flip, target_op=CorrMM_gradInputs)
filter_flip=flip, target_op=CorrMM_gradInputs,
check_trace=True)
class TestCpuConv2d(BaseTestConv2d):
......@@ -343,7 +350,8 @@ class TestCpuConv2d(BaseTestConv2d):
self.run_fwd(inputs_shape=i, filters_shape=f, subsample=s,
verify_grad=(gradweight_OK and gradinput_OK),
mode=mode, provide_shape=provide_shape,
border_mode=b, filter_flip=flip, target_op=ConvOp)
border_mode=b, filter_flip=flip, target_op=ConvOp,
check_trace=True)
else:
self.assertRaises(AssertionError,
self.run_fwd,
......@@ -354,7 +362,8 @@ class TestCpuConv2d(BaseTestConv2d):
mode=mode,
provide_shape=provide_shape,
border_mode=b,
filter_flip=flip)
filter_flip=flip,
check_trace=True)
if gradweight_OK:
if not theano.config.blas.ldflags:
......@@ -364,7 +373,8 @@ class TestCpuConv2d(BaseTestConv2d):
verify_grad=False, mode=mode,
provide_shape=provide_shape, border_mode=b,
filter_flip=flip,
target_op=(ConvOp, ConvGrad3D))
target_op=(ConvOp, ConvGrad3D),
check_trace=True)
else:
self.assertRaises(AssertionError,
self.run_gradweight,
......@@ -376,7 +386,8 @@ class TestCpuConv2d(BaseTestConv2d):
mode=mode,
provide_shape=provide_shape,
border_mode=b,
filter_flip=flip)
filter_flip=flip,
check_trace=True)
if gradinput_OK:
if not theano.config.blas.ldflags:
......@@ -386,7 +397,8 @@ class TestCpuConv2d(BaseTestConv2d):
verify_grad=False, mode=mode,
provide_shape=provide_shape, border_mode=b,
filter_flip=flip,
target_op=(ConvOp, ConvTransp3D))
target_op=(ConvOp, ConvTransp3D),
check_trace=True)
else:
self.assertRaises(AssertionError,
self.run_gradinput,
@@ -398,7 +410,8 @@ class TestCpuConv2d(BaseTestConv2d):
mode=mode,
provide_shape=provide_shape,
border_mode=b,
filter_flip=flip)
filter_flip=flip,
check_trace=True)
def test_constant_shapes():
......
@@ -10,6 +10,7 @@ except ImportError:
from six.moves import xrange
import theano
from theano.gof.opt import check_stack_trace
from theano.tensor.nnet.conv3d2d import *
import theano.tests.unittest_tools as utt
@@ -73,10 +74,11 @@ def pyconv3d(signals, filters):
r_i += o_i[Tf2:o_i_sh0-Tf2, Hf2:-Hf2, Wf2:-Wf2]
return rval
def check_diagonal_subtensor_view_traces(fn):
for apply_node in fn.maker.fgraph.apply_nodes:
if isinstance(apply_node.op, (DiagonalSubtensor, IncDiagonalSubtensor)):
assert hasattr(apply_node.outputs[0].tag, 'trace')
assert check_stack_trace(
fn, ops_to_check=(DiagonalSubtensor, IncDiagonalSubtensor))
def test_conv3d(mode=mode_without_gpu, shared=theano.tensor._shared):
if ndimage is None:
@@ -150,7 +152,6 @@ def test_conv3d(mode=mode_without_gpu, shared=theano.tensor._shared):
newconv3d = theano.function([], [],
updates={s_output: out},
mode=mode)
check_diagonal_subtensor_view_traces(newconv3d)
t0 = time.time()
newconv3d()
@@ -162,7 +163,6 @@ def test_conv3d(mode=mode_without_gpu, shared=theano.tensor._shared):
(s_signals, gsignals)],
mode=mode,
name='grad')
check_diagonal_subtensor_view_traces(gnewconv3d)
t0 = time.time()
gnewconv3d()
......
@@ -10,6 +10,7 @@ from theano import config
from theano import tensor as T
from theano import tensor
from theano import gof
from theano.gof.opt import check_stack_trace
from theano.tests import unittest_tools as utt
from theano import printing
from theano.tensor.nnet import (categorical_crossentropy,
@@ -150,8 +151,7 @@ class T_SoftmaxWithBias(utt.InferShapeTester):
b = theano.shared(numpy.float32(numpy.random.randn()))
sm = T.nnet.softmax(a + b)
f = theano.function([], sm)
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
print('f.maker.fgraph.outputs[0]: {0}'.format(f.maker.fgraph.outputs[0], ))
assert check_stack_trace(f, ops_to_check='last')
def test_infer_shape(self):
admat = matrix()
@@ -256,9 +256,10 @@ class T_LogSoftmax(utt.InferShapeTester):
sm = tensor.nnet.softmax(x)
logsm = tensor.log(sm)
f = theano.function([x], logsm)
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
assert isinstance(f.maker.fgraph.outputs[0].owner.op,
theano.tensor.nnet.nnet.LogSoftmax)
assert check_stack_trace(
f, ops_to_check=theano.tensor.nnet.nnet.LogSoftmax)
def test_local_softmax_grad_optimization_and_big_input(self):
"""Test the Logsoftmax's grad substitution.
@@ -272,7 +273,8 @@ class T_LogSoftmax(utt.InferShapeTester):
m.check_isfinite = False
# some inputs that are large to make the gradient explode in the non
# optimized case
a = numpy.exp(10 * numpy.random.rand(5, 10).astype(theano.config.floatX))
a = numpy.exp(
10 * numpy.random.rand(5, 10).astype(theano.config.floatX))
def myfunc(x):
sm = tensor.nnet.softmax(x)
@@ -280,8 +282,9 @@ class T_LogSoftmax(utt.InferShapeTester):
return logsm
# We set step to 0.1 because for big values we need a big epsilon
utt.verify_grad(myfunc, [a], eps=0.1, mode=m)
f = theano.function([], myfunc(a))
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
sa = theano.shared(a)
f = theano.function([], myfunc(sa))
self.assertTrue(check_stack_trace(f, ops_to_check='all'))
class T_SoftmaxGrad(utt.InferShapeTester):
@@ -659,7 +662,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
fgraph = gof.FunctionGraph(
[x, one_of_n],
[g_x])
self.assertTrue(hasattr(fgraph.outputs[0].tag, 'trace'))
assert check_stack_trace(
fgraph, ops_to_check=[crossentropy_softmax_1hot_with_bias_dx,
softmax_op])
# print 'BEFORE'
# for node in fgraph.toposort():
@@ -755,7 +760,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
for expr in expressions:
# Verify the optimizer worked on the expressions
f = theano.function([x, y], expr, mode=mode)
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
# todo: only the first output of the op has a stack trace
# assert check_stack_trace(
# f, ops_to_check=crossentropy_softmax_argmax_1hot_with_bias)
if verbose:
theano.printing.debugprint(f)
try:
@@ -771,7 +778,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
# Also verify the gradient wrt x
g = theano.function([x, y], T.grad(expr, x), mode=mode)
self.assertTrue(hasattr(g.maker.fgraph.outputs[0].tag, 'trace'))
assert check_stack_trace(
g, ops_to_check=[crossentropy_softmax_1hot_with_bias_dx,
softmax_op])
if verbose:
theano.printing.debugprint(g)
try:
@@ -794,7 +803,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
for expr in bias_expressions:
f = theano.function([x, b, y], expr, mode=mode)
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
# todo: only the first output of the op has a stack trace
# assert check_stack_trace(
# f, ops_to_check=crossentropy_softmax_argmax_1hot_with_bias)
if verbose:
theano.printing.debugprint(f)
try:
@@ -806,7 +817,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
theano.printing.debugprint(f)
raise
g = theano.function([x, b, y], T.grad(expr, x), mode=mode)
self.assertTrue(hasattr(g.maker.fgraph.outputs[0].tag, 'trace'))
assert check_stack_trace(
g, ops_to_check=[crossentropy_softmax_1hot_with_bias_dx,
softmax_with_bias])
if verbose:
theano.printing.debugprint(g)
try:
@@ -829,7 +842,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
for expr in mean_expressions:
f = theano.function([x, y], expr, mode=mode)
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
# todo: only the first output of the op has a stack trace
# assert check_stack_trace(
# f, ops_to_check=[crossentropy_softmax_argmax_1hot_with_bias])
if verbose:
theano.printing.debugprint(f)
try:
@@ -844,7 +859,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
raise
g = theano.function([x, y], T.grad(expr, x), mode=mode)
self.assertTrue(hasattr(g.maker.fgraph.outputs[0].tag, 'trace'))
assert check_stack_trace(
g, ops_to_check=[crossentropy_softmax_1hot_with_bias_dx,
softmax_op])
if verbose:
theano.printing.debugprint(g)
try:
@@ -868,7 +885,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
for expr in mean_bias_expressions:
f = theano.function([x, b, y], expr, mode=mode)
self.assertTrue(hasattr(f.maker.fgraph.outputs[0].tag, 'trace'))
# todo: only the first output of the op has a stack trace
# assert check_stack_trace(
# f, ops_to_check=crossentropy_softmax_argmax_1hot_with_bias)
if verbose:
theano.printing.debugprint(f)
try:
@@ -881,7 +900,9 @@ class T_CrossentropyCategorical1Hot(utt.InferShapeTester):
theano.printing.debugprint(f)
raise
g = theano.function([x, b, y], T.grad(expr, x), mode=mode)
self.assertTrue(hasattr(g.maker.fgraph.outputs[0].tag, 'trace'))
assert check_stack_trace(
g, ops_to_check=[crossentropy_softmax_1hot_with_bias_dx,
softmax_with_bias])
if verbose:
theano.printing.debugprint(g)
try:
@@ -1287,6 +1308,8 @@ def test_argmax_pushdown():
assert len(fgraph.toposort()) == 2 # an output_guard is second
assert fgraph.toposort()[0].op == tensor.basic._max_and_argmax
assert str(fgraph.toposort()[1].op) == 'OutputGuard'
assert check_stack_trace(
fgraph, ops_to_check=tensor.basic._max_and_argmax)
x = tensor.matrix()
# test that the max_and_argmax is not pushed down if the max is used
out = tensor.max_and_argmax(
@@ -1295,7 +1318,6 @@ def test_argmax_pushdown():
fgraph = gof.FunctionGraph(
[x],
[out])
assert hasattr(fgraph.outputs[0].tag, 'trace')
backup = config.warn.argmax_pushdown_bug
config.warn.argmax_pushdown_bug = False
@@ -1324,8 +1346,6 @@ def test_argmax_pushdown_bias():
fgraph = gof.FunctionGraph(
[x, b],
[out])
f = theano.function([x, b], out)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
theano.compile.mode.optdb.query(
theano.compile.mode.OPT_FAST_RUN).optimize(fgraph)
@@ -1333,11 +1353,12 @@ def test_argmax_pushdown_bias():
# print 'AFTER'
# for node in fgraph.toposort():
# print node.op
types_to_check = (tensor.DimShuffle, tensor.Elemwise, tensor.MaxAndArgmax)
assert len(fgraph.toposort()) == 4
assert isinstance(fgraph.toposort()[0].op, tensor.DimShuffle)
assert isinstance(fgraph.toposort()[1].op, tensor.Elemwise)
assert isinstance(fgraph.toposort()[2].op, tensor.MaxAndArgmax)
for i, type in enumerate(types_to_check):
assert isinstance(fgraph.toposort()[i].op, type)
assert str(fgraph.toposort()[3].op) == 'OutputGuard'
assert check_stack_trace(fgraph, ops_to_check=types_to_check)
x = tensor.matrix()
b = tensor.vector()
@@ -1345,8 +1366,6 @@ def test_argmax_pushdown_bias():
fgraph = gof.FunctionGraph(
[x, b],
[out])
f = theano.function([x, b], out)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
backup = config.warn.argmax_pushdown_bug
config.warn.argmax_pushdown_bug = False
@@ -1364,6 +1383,8 @@ def test_argmax_pushdown_bias():
assert isinstance(fgraph.toposort()[1].op, tensor.CAReduce)
assert isinstance(fgraph.toposort()[1].op.scalar_op, theano.scalar.Maximum)
assert str(fgraph.toposort()[2].op) == 'OutputGuard'
assert check_stack_trace(
fgraph, ops_to_check=(SoftmaxWithBias, tensor.CAReduce))
def test_asymptotic_32():
@@ -1437,7 +1458,7 @@ class Test_softmax_opt:
# test that function contains softmax and no div.
f = theano.function([c], p_y, mode=self.mode)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert check_stack_trace(f, ops_to_check=softmax_op)
f_ops = [n.op for n in f.maker.fgraph.toposort()]
# print '--- f ='
@@ -1454,7 +1475,7 @@ class Test_softmax_opt:
# test that function contains softmax and no div.
f = theano.function([c], p_y, mode=self.mode)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert check_stack_trace(f, ops_to_check=softmax_op)
f_ops = [n.op for n in f.maker.fgraph.toposort()]
# print '--- f ='
@@ -1474,7 +1495,6 @@ class Test_softmax_opt:
config.warn.sum_div_dimshuffle_bug = False
try:
g = theano.function([c, w], T.grad((p_y * w).sum(), c))
hasattr(g.maker.fgraph.outputs[0].tag, 'trace')
finally:
config.warn.sum_div_dimshuffle_bug = backup
g_ops = [n.op for n in g.maker.fgraph.toposort()]
@@ -1502,7 +1522,6 @@ class Test_softmax_opt:
config.warn.sum_div_dimshuffle_bug = False
try:
g = theano.function([c], T.grad(p_y.sum(), c))
hasattr(g.maker.fgraph.outputs[0].tag, 'trace')
finally:
config.warn.sum_div_dimshuffle_bug = backup
# printing.debugprint(g)
@@ -1515,7 +1534,6 @@ class Test_softmax_opt:
# test that function contains softmax and no div.
f = theano.function([c], p_y)
hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
# printing.debugprint(f)
# test that function contains softmax and no div.
@@ -1523,7 +1541,6 @@ class Test_softmax_opt:
config.warn.sum_div_dimshuffle_bug = False
try:
g = theano.function([c], T.grad(p_y.sum(), c))
hasattr(g.maker.fgraph.outputs[0].tag, 'trace')
finally:
config.warn.sum_div_dimshuffle_bug = backup
# printing.debugprint(g)
@@ -1563,7 +1580,7 @@ def test_stabilize_log_softmax():
z = theano.tensor.log(y)
f = theano.function([x], z, mode=mode)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert check_stack_trace(f, ops_to_check='all')
# check that the softmax has been optimized out
for node in f.maker.fgraph.toposort():
......
from __future__ import absolute_import, print_function, division
import theano
from theano import tensor
from theano.tensor.nnet.blocksparse import sparse_block_dot
from theano.gof.opt import check_stack_trace
from theano.tensor.nnet.blocksparse import (
sparse_block_dot, sparse_block_gemv_inplace, sparse_block_outer_inplace,
sparse_block_gemv, sparse_block_outer)
def test_blocksparse_inplace_gemv_opt():
@@ -14,12 +17,13 @@ def test_blocksparse_inplace_gemv_opt():
o = sparse_block_dot(W, h, iIdx, b, oIdx)
f = theano.function([W, h, iIdx, b, oIdx], o)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
if theano.config.mode == "FAST_COMPILE":
assert not f.maker.fgraph.toposort()[-1].op.inplace
assert check_stack_trace(f, ops_to_check=[sparse_block_gemv])
else:
assert f.maker.fgraph.toposort()[-1].op.inplace
assert check_stack_trace(f, ops_to_check=[sparse_block_gemv_inplace])
def test_blocksparse_inplace_outer_opt():
@@ -31,13 +35,12 @@ def test_blocksparse_inplace_outer_opt():
o = sparse_block_dot(W, h, iIdx, b, oIdx)
theano.printing.debugprint(tensor.grad(o.sum(), wrt=W))
f = theano.function([W, h, iIdx, b, oIdx],
[o, tensor.grad(o.sum(), wrt=W)])
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
if theano.config.mode == "FAST_COMPILE":
assert not f.maker.fgraph.toposort()[-1].op.inplace
assert check_stack_trace(f, ops_to_check=sparse_block_outer)
else:
assert f.maker.fgraph.toposort()[-1].op.inplace
assert check_stack_trace(f, ops_to_check=sparse_block_outer_inplace)
@@ -8,6 +8,7 @@ import theano.tensor.inplace
from theano.tensor import basic as tensor
from theano import tensor as T
from theano import config
from theano.gof.opt import check_stack_trace
from theano.tests import unittest_tools as utt
from theano.tensor.nnet import (sigmoid, sigmoid_inplace,
softplus, ultra_fast_sigmoid, hard_sigmoid)
@@ -126,40 +127,37 @@ class T_sigmoid_opts(unittest.TestCase):
# tests inv_1_plus_exp
f = theano.function([x], T.fill(x, 1.0) / (1 + T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
# todo: solve issue #4589 first
# assert check_stack_trace(f, ops_to_check=sigmoid)
assert [node.op for node in f.maker.fgraph.toposort()] == [sigmoid]
f(data)
f = theano.function([x], T.fill(x, 1.0) / (2 + T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid]
f(data)
f = theano.function([x], T.fill(x, 1.0) / (1 - T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid]
f(data)
f = theano.function([x], T.fill(x, 1.1) / (1 + T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid]
f(data)
# tests inv_1_plus_exp with neg
f = theano.function([x], T.fill(x, -1.0) / (1 + T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
# todo: solve issue #4589 first
# assert check_stack_trace(
# f, ops_to_check=[sigmoid, theano.tensor.inplace.neg_inplace])
assert [node.op for node in f.maker.fgraph.toposort()] == [sigmoid,
theano.tensor.inplace.neg_inplace]
f(data)
f = theano.function([x], T.fill(x, -1.0) / (1 - T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid,
theano.tensor.inplace.neg_inplace]
f(data)
f = theano.function([x], T.fill(x, -1.0) / (2 + T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid,
theano.tensor.inplace.neg_inplace]
f(data)
f = theano.function([x], T.fill(x, -1.1) / (1 + T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid,
theano.tensor.inplace.neg_inplace]
f(data)
@@ -170,37 +168,33 @@ class T_sigmoid_opts(unittest.TestCase):
# = - (sigm(x) * sigm(x))
f = theano.function([x], (T.fill(x, -1.0) * T.exp(x)) /
((1 + T.exp(x)) * (1 + T.exp(-x))), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
# todo: solve issue #4589 first
# assert check_stack_trace(f, ops_to_check=[sigmoid, T.mul])
assert [node.op for node in f.maker.fgraph.toposort()] == [sigmoid,
T.mul]
f(data)
f = theano.function([x], (T.fill(x, -1.1) * T.exp(x)) /
((1 + T.exp(x)) * (1 + T.exp(-x))), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid,
T.mul, theano.tensor.inplace.neg_inplace]
f(data)
f = theano.function([x], (T.fill(x, -1.0) * T.exp(x)) /
((2 + T.exp(x)) * (1 + T.exp(-x))), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid,
T.mul, theano.tensor.inplace.neg_inplace]
f(data)
f = theano.function([x], (T.fill(x, -1.0) * T.exp(x)) /
((1 + T.exp(x)) * (2 + T.exp(-x))), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid,
T.mul, theano.tensor.inplace.neg_inplace]
f(data)
f = theano.function([x], (T.fill(x, -1.0) * T.exp(x)) /
((1 + T.exp(x)) * (1 + T.exp(x))), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid,
T.mul, theano.tensor.inplace.neg_inplace]
f(data)
f = theano.function([x], (T.fill(x, -1.0) * T.exp(x)) /
((1 + T.exp(x)) * (2 + T.exp(-x))), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert [node.op for node in f.maker.fgraph.toposort()] != [sigmoid,
T.mul, theano.tensor.inplace.neg_inplace]
f(data)
@@ -218,13 +212,13 @@ class T_sigmoid_opts(unittest.TestCase):
# tests exp_over_1_plus_exp
f = theano.function([x], 1 - T.exp(x) / (1 + T.exp(x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert check_stack_trace(f, ops_to_check=[tensor.neg, sigmoid_inplace])
assert [node.op for node in f.maker.fgraph.toposort()] == [
tensor.neg, sigmoid_inplace]
# tests inv_1_plus_exp
f = theano.function([x], 1 - T.fill(x, 1.0) / (1 + T.exp(-x)), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert check_stack_trace(f, ops_to_check=[tensor.neg, sigmoid_inplace])
assert [node.op for node in f.maker.fgraph.toposort()] == [tensor.neg,
sigmoid_inplace]
@@ -241,25 +235,26 @@ class T_sigmoid_opts(unittest.TestCase):
x, y = tensor.vectors('x', 'y')
f = theano.function([x], sigmoid(-x) * tensor.exp(x), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
match(f, [sigmoid])
assert check_stack_trace(f, ops_to_check=sigmoid)
f = theano.function([x], sigmoid(x) * tensor.exp(-x), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
match(f, [tensor.neg, sigmoid])
assert check_stack_trace(f, ops_to_check=sigmoid)
f = theano.function([x], -(-(-(sigmoid(x)))) * tensor.exp(-x), mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
match(f, [tensor.neg, sigmoid, tensor.neg])
# assert check_stack_trace(f, ops_to_check=sigmoid)
f = theano.function(
[x, y],
(sigmoid(x) * sigmoid(-y) * -tensor.exp(-x) *
tensor.exp(x * y) * tensor.exp(y)),
mode=m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
match(f, [sigmoid, tensor.mul, tensor.neg, tensor.exp, sigmoid,
tensor.mul])
# assert check_stack_trace(f, ops_to_check=[sigmoid, tensor.mul,
# tensor.exp])
def test_perform_sigm_times_exp(self):
"""
@@ -318,7 +313,6 @@ class T_sigmoid_opts(unittest.TestCase):
mode = self.get_mode()
if not isinstance(mode, theano.compile.DebugMode):
f = theano.function([x, lr], ux, mode=mode)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
ux_v = f([[50]], 0.1)
assert not numpy.isnan(ux_v)
@@ -328,14 +322,14 @@ class T_sigmoid_opts(unittest.TestCase):
mode = self.get_mode('local_ultra_fast_sigmoid')
f = theano.function([x], s, mode=mode)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert check_stack_trace(f, ops_to_check=sigmoid)
topo = f.maker.fgraph.toposort()
assert len(topo) == 1
assert topo[0].op == sigmoid
mode = self.get_mode().including('local_ultra_fast_sigmoid')
f = theano.function([x], s, mode=mode)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert check_stack_trace(f, ops_to_check=ultra_fast_sigmoid)
topo = f.maker.fgraph.toposort()
assert topo[0].op == ultra_fast_sigmoid
assert len(topo) == 1
@@ -347,18 +341,21 @@ class T_sigmoid_opts(unittest.TestCase):
mode = self.get_mode('local_hard_sigmoid')
f = theano.function([x], s, mode=mode)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
assert check_stack_trace(f, ops_to_check=sigmoid)
topo = f.maker.fgraph.toposort()
assert topo[0].op == sigmoid
assert len(topo) == 1
mode = self.get_mode().including('local_hard_sigmoid')
f = theano.function([x], s, mode=mode)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
topo = f.maker.fgraph.toposort()
assert not any([n.op == sigmoid for n in topo])
ux_v = f([[-50, -10, -4, -1, 0, 1, 4, 10, 50]])
mode2 = mode.excluding('fusion').excluding('inplace')
f2 = theano.function([x], s, mode=mode2)
self.assertTrue(check_stack_trace(f2, ops_to_check=theano.tensor.clip))
class T_softplus_opts(unittest.TestCase):
def setUp(self):
@@ -376,7 +373,11 @@ class T_softplus_opts(unittest.TestCase):
out = T.log(sigmoid(x))
f = theano.function([x], out, mode=self.m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
# Fix ticket #4581 first
# assert check_stack_trace(
# f, ops_to_check=(theano.scalar.Neg,
# theano.tensor.nnet.sigm.ScalarSoftplus))
topo = f.maker.fgraph.toposort()
assert len(topo) == 3
assert isinstance(topo[0].op.scalar_op, theano.scalar.Neg)
@@ -395,12 +396,14 @@ class T_softplus_opts(unittest.TestCase):
assert isinstance(topo[0].op.scalar_op,
theano.tensor.nnet.sigm.ScalarSoftplus)
assert isinstance(topo[1].op.scalar_op, theano.scalar.Neg)
# assert check_stack_trace(f, ops_to_check='all')
f(numpy.random.rand(54, 11).astype(config.floatX))
# Same test with a flatten
out = T.log(1 - T.flatten(sigmoid(x)))
f = theano.function([x], out, mode=self.m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
# assert check_stack_trace(f, ops_to_check='all')
topo = f.maker.fgraph.toposort()
assert len(topo) == 3
assert tensor.is_flat(topo[0].outputs[0])
@@ -429,7 +432,9 @@ class T_softplus_opts(unittest.TestCase):
out = T.log(1 + T.exp(x))
f = theano.function([x], out, mode=self.m)
assert hasattr(f.maker.fgraph.outputs[0].tag, 'trace')
# Fix ticket #4581 first
# assert check_stack_trace(f, ops_to_check='all')
topo = f.maker.fgraph.toposort()
assert len(topo) == 1
assert isinstance(topo[0].op.scalar_op,
......
@@ -1031,6 +1031,8 @@ class ShapeFeature(object):
# don't make the optimizer merge a zillion ones together
# by always returning the same object to represent 1
return self.lscalar_one
if type(s_i) is float and int(s_i) == s_i:
s_i = int(s_i)
if (type(s_i) in integer_types or
isinstance(s_i, numpy.integer) or
(isinstance(s_i, numpy.ndarray) and s_i.ndim == 0)):
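The two added lines accept integral floats (e.g. a shape entry of `2.0`) before the integer-type check runs. A stand-alone sketch of that normalization rule (the function name is illustrative):

```python
def normalize_shape_entry(s_i):
    # mirror the added check: an integral float is coerced to int,
    # so 2.0 becomes 2, while 2.5 is left untouched for later handling
    if type(s_i) is float and int(s_i) == s_i:
        s_i = int(s_i)
    return s_i
```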
@@ -3753,8 +3755,20 @@ def local_useless_switch(node):
if out.type.broadcastable != node.outputs[0].type.broadcastable:
# We need to copy data to the new dimensions during execution
out = T.alloc(out, *[node.outputs[0].shape[i] for i
in xrange(out.ndim)])
# We should not depend on node.outputs, as that would
# make the new node depend on the old one that will
# get optimized again, creating a cycle.
shps = []
for idx, (b1, b2) in enumerate(zip(out.type.broadcastable,
node.outputs[0].type.broadcastable)):
if b1 == b2:
shps.append(out.shape[idx])
elif not node.inputs[1].type.broadcastable[idx]:
shps.append(node.inputs[1].shape[idx])
else:
shps.append(node.inputs[2].shape[idx])
out = T.alloc(out, *shps)
else:
out = out
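To see why the per-dimension selection above avoids the cycle, here is a self-contained sketch of the same logic, with plain tuples standing in for broadcastable patterns and symbolic shapes (all names are hypothetical stand-ins, not Theano's API):

```python
def pick_output_shape(out_bcast, target_bcast, ift_bcast,
                      out_shape, ift_shape, iff_shape):
    # Choose each output dimension without referencing the old output
    # node, so the rewritten graph does not depend on the node being
    # replaced.
    shps = []
    for idx, (b1, b2) in enumerate(zip(out_bcast, target_bcast)):
        if b1 == b2:
            # broadcast patterns agree: the selected branch's own
            # length is already correct along this dimension
            shps.append(out_shape[idx])
        elif not ift_bcast[idx]:
            # take the length from the non-broadcastable "then" input
            shps.append(ift_shape[idx])
        else:
            # otherwise the "else" input must supply it
            shps.append(iff_shape[idx])
    return shps
```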
......
@@ -14,7 +14,7 @@ from six.moves import xrange
import six.moves.builtins as builtins
import theano
from theano import gof, Op, tensor, Variable, Apply
from theano import gof, OpenMPOp, tensor, Variable, Apply
def max_pool_2d_same_size(input, patch_size):
@@ -114,7 +114,7 @@ def pool_2d(input, ds, ignore_border=None, st=None, padding=(0, 0),
return tensor.reshape(output, outshp, ndim=input.ndim)
class Pool(Op):
class Pool(OpenMPOp):
"""
For N-dimensional tensors, consider that the last two dimensions span
images. This Op downsamples these images by taking the max, sum or average
@@ -236,7 +236,8 @@ class Pool(Op):
return rval
def __init__(self, ds, ignore_border=False, st=None, padding=(0, 0),
mode='max'):
mode='max', openmp=None):
super(Pool, self).__init__(openmp=openmp)
self.ds = tuple(ds)
if not all([isinstance(d, integer_types) for d in ds]):
raise ValueError(
@@ -350,7 +351,9 @@ class Pool(Op):
x, gz)]
def c_headers(self):
return ['<algorithm>']
headers = ['<algorithm>']
headers += super(Pool, self).c_headers()
return headers
def c_code(self, node, name, inp, out, sub):
if self.mode not in ('max', 'sum', 'average_exc_pad', 'average_inc_pad'):
@@ -362,6 +365,10 @@ class Pool(Op):
ds0, ds1 = self.ds
st0, st1 = self.st
pd0, pd1 = self.padding
if self.openmp:
omp_parallel = '#pragma omp parallel for private(r_st, r_end, c_st, c_end, collector) schedule(static)'
else:
omp_parallel = ''
ccode = """
int typenum = PyArray_ObjectType((PyObject*)%(x)s, 0);
int z_r, z_c; // shape of the output
@@ -443,13 +450,15 @@ class Pool(Op):
%(z)s = (PyArrayObject*) PyArray_ZEROS(4, dims, typenum,0);
}
// used for indexing a pool region inside the input
int r_st, r_end, c_st, c_end;
dtype_%(x)s collector; // temp var for the value in a region
if (z_r && z_c)
{
for(int b=0; b<PyArray_DIMS(%(x)s)[0]; b++){
for(int k=0; k<PyArray_DIMS(%(x)s)[1]; k++){
for(int i=0; i< z_r; i++){
int r_st, r_end, c_st, c_end;
%(omp_parallel)s
for(int t = 0; t < PyArray_DIMS(%(x)s)[0] * PyArray_DIMS(%(x)s)[1]; t++){
int b = t %% PyArray_DIMS(%(x)s)[0];
int k = t / PyArray_DIMS(%(x)s)[0];
for(int i=0; i < z_r; i++){
r_st = i * %(st0)s;
r_end = r_st + %(ds0)s;
// skip the padding
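The hunk above collapses the two outer loops over the batch and channel dimensions into a single loop over `t`, so that one `#pragma omp parallel for` can parallelize across both. The decomposition `b = t % B`, `k = t / B` visits exactly the same index pairs as the nested loops, which this small sketch verifies (Python's `//` plays the role of C integer division):

```python
def nested_pairs(B, K):
    # the original double loop over batches (B) and channels (K)
    return sorted((b, k) for b in range(B) for k in range(K))

def collapsed_pairs(B, K):
    # the flattened loop used for OpenMP: one index t in [0, B*K),
    # decomposed into b = t % B and k = t // B
    return sorted((t % B, t // B) for t in range(B * K))
```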
@@ -526,15 +535,14 @@ class Pool(Op):
}
}
}
}
"""
return ccode % locals()
def c_code_cache_version(self):
return (0, 6, 8, 4)
return (0, 6, 8, 4, self.openmp)
class PoolGrad(Op):
class PoolGrad(OpenMPOp):
__props__ = ('ds', 'ignore_border', 'st', 'padding', 'mode')
@staticmethod
@@ -617,7 +625,7 @@ class PoolGrad(Op):
rval = list(imgshape[:-2]) + [nr, nc]
return rval
def __init__(self, ds, ignore_border, st=None, padding=(0, 0), mode='max'):
def __init__(self, ds, ignore_border, st=None, padding=(0, 0), mode='max', openmp=None):
self.ds = tuple(ds)
self.ignore_border = ignore_border
if st is None:
@@ -629,14 +637,15 @@ class PoolGrad(Op):
"Pool mode parameter only support 'max', 'sum',"
" 'average_inc_pad' and 'average_exc_pad'. Got %s" % mode)
self.mode = mode
super(PoolGrad, self).__init__(openmp=openmp)
def infer_shape(self, node, in_shapes):
return [in_shapes[0]]
class MaxPoolGrad(PoolGrad):
def __init__(self, ds, ignore_border, st=None, padding=(0, 0)):
PoolGrad.__init__(self, ds, ignore_border, st, padding, mode='max')
def __init__(self, ds, ignore_border, st=None, padding=(0, 0), openmp=None):
PoolGrad.__init__(self, ds, ignore_border, st, padding, 'max', openmp)
def make_node(self, x, maxout, gz):
# make_node should only be called by the grad function of
@@ -708,6 +717,10 @@ class MaxPoolGrad(PoolGrad):
ds0, ds1 = self.ds
st0, st1 = self.st
pd0, pd1 = self.padding
if self.openmp:
omp_parallel = '#pragma omp parallel for private(r_st, r_end, c_st, c_end, maximum) schedule(static)'
else:
omp_parallel = ''
return """
// sanity checks
int x_typenum = PyArray_ObjectType((PyObject*)%(x)s, 0);
@@ -757,13 +770,15 @@ class MaxPoolGrad(PoolGrad):
else {
PyArray_FILLWBYTE(%(gx)s, 0);
}
int r_st, r_end, c_st, c_end; // used to index into the input img x
dtype_%(z)s maximum; // temp var for maximum value in a region
if (z_r && z_c)
{
for(int b=0; b<PyArray_DIMS(%(x)s)[0]; b++){
for(int k=0; k<PyArray_DIMS(%(x)s)[1]; k++){
for(int i=0; i< z_r; i++){
int r_st, r_end, c_st, c_end;
%(omp_parallel)s
for(int t = 0; t < PyArray_DIMS(%(x)s)[0] * PyArray_DIMS(%(x)s)[1]; t++){
int b = t %% PyArray_DIMS(%(x)s)[0];
int k = t / PyArray_DIMS(%(x)s)[0];
for(int i=0; i < z_r; i++){
r_st = i * %(st0)s;
r_end = r_st + %(ds0)s;
// skip the padding
......@@ -803,11 +818,10 @@ class MaxPoolGrad(PoolGrad):
}
}
}
}
""" % locals()
def c_code_cache_version(self):
return (0, 7)
return (0, 7, self.openmp)
class AveragePoolGrad(PoolGrad):
@@ -895,10 +909,10 @@ class AveragePoolGrad(PoolGrad):
st=self.st, padding=self.padding, mode=self.mode)(ggx)]
class DownsampleFactorMaxGradGrad(Op):
class DownsampleFactorMaxGradGrad(OpenMPOp):
__props__ = ('ds', 'ignore_border', 'st', 'padding', 'mode')
def __init__(self, ds, ignore_border, st=None, padding=(0, 0), mode='max'):
def __init__(self, ds, ignore_border, st=None, padding=(0, 0), mode='max', openmp=None):
self.ds = tuple(ds)
if not all([isinstance(d, integer_types) for d in ds]):
raise ValueError(
@@ -917,6 +931,7 @@ class DownsampleFactorMaxGradGrad(Op):
raise NotImplementedError(
'padding_h and padding_w must be smaller than strides')
self.mode = mode
super(DownsampleFactorMaxGradGrad, self).__init__(openmp=openmp)
assert self.mode == 'max'
def make_node(self, x, maxout, gz):
@@ -990,6 +1005,10 @@ class DownsampleFactorMaxGradGrad(Op):
ds0, ds1 = self.ds
st0, st1 = self.st
pd0, pd1 = self.padding
if self.openmp:
omp_parallel = '#pragma omp parallel for private(r_st, r_end, c_st, c_end, maximum) schedule(static)'
else:
omp_parallel = ''
return """
int z_typenum = PyArray_ObjectType((PyObject*)%(maxout)s, 0);
int z_r, z_c;
@@ -1017,10 +1036,12 @@ class DownsampleFactorMaxGradGrad(Op):
PyArray_FILLWBYTE(%(z)s, 0);
}
dtype_%(maxout)s maximum; // temp var for maximum value in a region
int r_st, r_end, c_st, c_end; // used to index into the input img x
for(int b=0; b<PyArray_DIMS(%(x)s)[0]; b++){
for(int k=0; k<PyArray_DIMS(%(x)s)[1]; k++){
for(int i=0; i< z_r; i++){
int r_st, r_end, c_st, c_end;
%(omp_parallel)s
for(int t = 0; t < PyArray_DIMS(%(x)s)[0] * PyArray_DIMS(%(x)s)[1]; t++){
int b = t %% PyArray_DIMS(%(x)s)[0];
int k = t / PyArray_DIMS(%(x)s)[0];
for(int i=0; i < z_r; i++){
r_st = i * %(st0)s;
r_end = r_st + %(ds0)s;
// skip the padding
@@ -1058,8 +1079,7 @@ class DownsampleFactorMaxGradGrad(Op):
}
}
}
}
""" % locals()
def c_code_cache_version(self):
return (0, 1)
return (0, 1, self.openmp)
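Each `c_code_cache_version` in this diff now folds `self.openmp` into the version tuple, so compiled kernels built with and without OpenMP never share a cache entry. A toy illustration of the design choice (the helper name is hypothetical):

```python
def cache_key(base_version, openmp):
    # compiled-code caches are looked up by version tuple; folding the
    # openmp flag in guarantees the two build variants never collide
    return base_version + (openmp,)
```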
@@ -84,14 +84,13 @@ class test_sort(unittest.TestCase):
data = np.random.rand(2, 3, 4).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, None), [data])
def test_grad_negative_axis(self):
# test 2D
def test_grad_negative_axis_2d(self):
data = np.random.rand(2, 3).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, -1), [data])
data = np.random.rand(2, 3).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, -2), [data])
# test 3D
def test_grad_negative_axis_3d(self):
data = np.random.rand(2, 3, 4).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, -1), [data])
data = np.random.rand(2, 3, 4).astype(theano.config.floatX)
@@ -99,7 +98,7 @@ class test_sort(unittest.TestCase):
data = np.random.rand(2, 3, 4).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, -3), [data])
# test 4D
def test_grad_negative_axis_4d(self):
data = np.random.rand(2, 3, 4, 2).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, -1), [data])
data = np.random.rand(2, 3, 4, 2).astype(theano.config.floatX)
@@ -109,14 +108,13 @@ class test_sort(unittest.TestCase):
data = np.random.rand(2, 3, 4, 2).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, -4), [data])
def test_grad_nonnegative_axis(self):
# test 2D
def test_grad_nonnegative_axis_2d(self):
data = np.random.rand(2, 3).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, 0), [data])
data = np.random.rand(2, 3).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, 1), [data])
# test 3D
def test_grad_nonnegative_axis_3d(self):
data = np.random.rand(2, 3, 4).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, 0), [data])
data = np.random.rand(2, 3, 4).astype(theano.config.floatX)
@@ -124,7 +122,7 @@ class test_sort(unittest.TestCase):
data = np.random.rand(2, 3, 4).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, 2), [data])
# test 4D
def test_grad_nonnegative_axis_4d(self):
data = np.random.rand(2, 3, 4, 2).astype(theano.config.floatX)
utt.verify_grad(lambda x: sort(x, 0), [data])
data = np.random.rand(2, 3, 4, 2).astype(theano.config.floatX)
......