Commit 8761d3f7 authored by Razvan Pascanu

Merge pull request #532 from delallea/minor

Minor stuff
@@ -33,9 +33,9 @@ New Features
Sparse Sandbox Addition (Not reviewed/documented/tested, but used by some people)
* They are all in the theano.sparse.sandbox.sp2 module
* Op class: Cast, Poisson, Multinomial, EliminateZeros, Sum, Binomial
* Op class: SamplingDot, SamplingDotCsr (inserted automatically)
* Op function: structured_sigmoid, structured_exp, structured_pow, structured_minimum
* Op class: StructuredAddSV, StrucutedAddSVCSR (inserted automatically)
* opt: local_sampling_dot_csr, local_structured_add_s_v

Internal changes
@@ -43,9 +43,10 @@ Internal changes
in theano.function, instead of TypeError and ValueError. (Pascal L.)

Crash Fix
* Fix crash: do not try to use the BLAS library when blas.ldflags is manually
  set to an empty string (Frederic B.)
* Fix crash when importing theano on a computer without a GPU while the Theano
  flags 'device' or 'init_gpu_device' are set to gpu* (Frederic B., reported by Luo Heng)
=============
...
@@ -317,13 +317,13 @@ bindings to work only on Python files.
Emacs
~~~~~

There is an **excellent** system to configure emacs for Python:
`emacs-for-python
<https://github.com/gabrielelanaro/emacs-for-python>`_. It gathers many
emacs configs into one, and modifies them to behave together nicely. You
can use it to check for pep8 compliance and for Python syntax errors.

To install it on Linux, you can do the following:

.. code-block:: bash
...
@@ -46,7 +46,7 @@ def _atexit_print_fn():
    if len(_atexit_print_list) > 1:
        # Make a global profile
        cum = copy.copy(_atexit_print_list[0])
        cum.message = "Sum of all printed profiles at exit"
        for ps in _atexit_print_list[1:]:
            # for ps in [ps for ps in _atexit_print_list[1:]
            #            if not isinstance(ps, ScanProfileStats)]:
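The accumulation of per-function profiles into one global profile can be sketched outside Theano; the class and field names below are hypothetical stand-ins for the real ProfileStats:

```python
import copy

class ProfileStats(object):
    # Minimal stand-in for Theano's ProfileStats: just a message and a
    # mapping of op -> seconds spent.
    def __init__(self, message, apply_time=None):
        self.message = message
        self.apply_time = dict(apply_time or {})

def sum_profiles(profiles):
    # Mirror of the loop above: copy the first profile, then fold the
    # timings of the remaining ones into it.
    cum = copy.copy(profiles[0])
    cum.apply_time = dict(cum.apply_time)  # don't mutate profiles[0]
    cum.message = "Sum of all printed profiles at exit"
    for ps in profiles[1:]:
        for op, t in ps.apply_time.items():
            cum.apply_time[op] = cum.apply_time.get(op, 0.0) + t
    return cum
```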
@@ -202,7 +202,7 @@ class ProfileStats(object):
        if op_flops:
            flops_msg = ' <MFlops/s>'
            print ('\nHACK WARNING: we print the flops for some OP, but the'
                   ' logic does not always work. You need to know the'
                   ' internals of Theano to make it work correctly.'
                   ' Otherwise don\'t use!')
        print ('\nOp-wise summary:'
@@ -679,17 +679,33 @@ if 0: # old code still to be ported from ProfileMode
    def print_summary(self,
                      n_apply_to_print=config.ProfileMode.n_apply_to_print,
                      n_ops_to_print=config.ProfileMode.n_ops_to_print):
        """
        Print 3 summaries that show where the time is spent. The first
        shows an Apply-wise summary, the second shows an Op-wise summary,
        the third shows a type-Op-wise summary.

        The Apply-wise summary prints the timing information for the worst
        offending Apply nodes. This corresponds to the individual Op
        applications within your graph which take the longest to execute
        (so if you use dot twice, you will see two entries there).

        In the Op-wise summary, all Apply nodes executing the same Op are
        grouped together and the total execution time per Op is shown (so
        if you use dot twice, you will see only one entry there,
        corresponding to the sum of the time spent in each of them). If
        two Ops have different hash values, they are kept separate.

        The type-Op-wise summary groups the results by type of Op, so even
        if two Ops have different hash values, they are merged.

        There is a hack with the Op-wise summary. Go see it if you want to
        know more.

        :param n_apply_to_print: the number of Apply nodes to print.
            Default 15, or the ProfileMode.n_apply_to_print flag.

        :param n_ops_to_print: the number of Ops to print. Default 20, or
            the ProfileMode.n_ops_to_print flag.
        """
        fct_call_time = self.mode.fct_call_time
        fct_call = self.mode.fct_call
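The Op-wise grouping described in the docstring can be sketched as follows; the key layout here is a hypothetical simplification of Theano's apply_time mapping:

```python
from collections import defaultdict

def op_wise_summary(apply_time):
    # apply_time maps (op_name, apply_index) -> seconds. Apply nodes that
    # execute the same Op are grouped and their times summed, so two uses
    # of 'dot' collapse into a single entry.
    totals = defaultdict(float)
    for (op_name, _idx), t in apply_time.items():
        totals[op_name] += t
    return dict(totals)
```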
@@ -709,42 +725,48 @@ if 0: # old code still to be ported from ProfileMode
                            n_apply_to_print,
                            n_ops_to_print)

    def print_diff_summary(self, other, n_apply_to_print=15,
                           n_ops_to_print=20):
        """
        As print_summary, but print the difference between two different
        profile modes.

        TODO: We also don't print the Apply-wise summary, as it doesn't
        work for now.
        TODO: make the comparison with gpu code.

        :param other: the other instance of ProfileMode that we want to be
            compared to.

        :param n_apply_to_print: the number of Apply nodes to print.
            Default 15.

        :param n_ops_to_print: the number of Ops to print. Default 20.
        """
        def diff_dict(a_time, b_time_):
            r = {}
            b_time = copy.copy(b_time_)
            for a, ta in a_time.items():
                r.setdefault(a, 0)
                tb = b_time.pop(a, 0)
                r[a] += ta - tb
            # Entries remaining in b_time are missing from a_time.
            for a, t in b_time.items():
                r.setdefault(a, 0)
                r[a] += t
            return r
        compile_time = self.compile_time - other.compile_time
        fct_call_time = diff_dict(self.fct_call_time, other.fct_call_time)
        fct_call = diff_dict(self.fct_call, other.fct_call)
        apply_time = diff_dict(self.apply_time, other.apply_time)
        op_cimpl = self.op_cimpl and other.op_cimpl
        message = self.message
        outputs_size = diff_dict(self.outputs_size, other.outputs_size)
        self.print_summary_(
            "print_diff_summary", compile_time, fct_call_time, fct_call,
            apply_time, op_cimpl, message, outputs_size,
            n_apply_to_print=n_apply_to_print,
            n_ops_to_print=n_ops_to_print, print_apply=False)
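Standalone, the diff_dict helper behaves like this sketch (reproduced outside the class for illustration). Note that keys present only in the second dict are added with a positive sign, exactly as in the method above:

```python
import copy

def diff_dict(a_time, b_time_):
    # Per-key difference a_time[k] - b_time_[k]; a key missing on one
    # side is treated as contributing 0 on that side.
    r = {}
    b_time = copy.copy(b_time_)
    for a, ta in a_time.items():
        r.setdefault(a, 0)
        tb = b_time.pop(a, 0)
        r[a] += ta - tb
    # Entries remaining in b_time are missing from a_time.
    for a, t in b_time.items():
        r.setdefault(a, 0)
        r[a] += t
    return r
```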
...
@@ -1262,13 +1262,13 @@ class CLinker(link.Linker):
        struct_name = self.struct_name
        print >> code, "static PyObject * instantiate(PyObject * self, PyObject *argtuple) {"
        print >> code, ' assert(PyTuple_Check(argtuple));'
        print >> code, ' if (%(n_args)i != PyTuple_Size(argtuple)){ ' % locals()
        print >> code, ' PyErr_Format(PyExc_TypeError, "Wrong number of arguments, expected %(n_args)i, got %%i", (int)PyTuple_Size(argtuple));' % locals()
        print >> code, ' return NULL;'
        print >> code, ' }'
        print >> code, ' %(struct_name)s* struct_ptr = new %(struct_name)s();' % locals()
        print >> code, ' struct_ptr->init(', ','.join('PyTuple_GET_ITEM(argtuple, %i)' % n for n in xrange(n_args)), ');'
        print >> code, ' PyObject* thunk = PyCObject_FromVoidPtrAndDesc((void*)(&%(struct_name)s_executor), struct_ptr, %(struct_name)s_destructor);' % locals()
        print >> code, " return thunk; }"
        return code.getvalue()
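The generated C stub relies on Python's `%(name)s` formatting against `locals()`. A trimmed-down illustration (the template text below is hypothetical, not the real CLinker output):

```python
def render_stub(struct_name, n_args):
    # Named %(...)s / %(...)i placeholders are filled from the local
    # variable dict, and %% escapes a literal percent sign that must
    # survive into the generated C code.
    template = (
        'if (%(n_args)i != PyTuple_Size(argtuple))\n'
        '    PyErr_Format(PyExc_TypeError, "expected %(n_args)i, got %%i", n);\n'
        '%(struct_name)s* struct_ptr = new %(struct_name)s();'
    )
    return template % locals()
```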
...
@@ -797,10 +797,10 @@ class TensorType(Type):
    @staticmethod
    def values_eq_approx(a, b, allow_remove_inf=False, allow_remove_nan=False):
        """
        :param allow_remove_inf: If True, when there is an inf in a,
                                 we allow any value in b in that position.
                                 Even -inf
        :param allow_remove_nan: If True, when there is a nan in a,
                                 we allow any value in b in that position.
                                 Even +-inf
        """
@@ -3985,7 +3985,6 @@ class IncSubtensor(Op):
        else:
            return ()

    def infer_shape(self, node, shapes):
        return [shapes[0]]
@@ -4004,13 +4003,13 @@ class IncSubtensor(Op):
        if self.set_instead_of_inc:
            gx = set_subtensor(
                Subtensor(idx_list=self.idx_list)(g_output, *idx_list),
                zeros_like(y))
        else:
            gx = g_output
        gy = Subtensor(idx_list=self.idx_list)(g_output, *idx_list)
        return [gx, gy] + [None] * len(idx_list)
def split(x, splits_size, n_splits, axis=0):
@@ -4149,7 +4148,6 @@ class Rebroadcast(Op):
        items.sort()  # no ambiguity because each item key is unique
        return hash(type(self)) ^ hash(tuple(items))

    def __str__(self):
        if len(self.axis) == 0:
            broadcast_pattern = []
@@ -4215,6 +4213,7 @@ def addbroadcast(x, *axes):
    rval = Rebroadcast(*[(axis, True) for axis in axes])(x)
    return theano.tensor.opt.apply_rebroadcast_opt(rval)


def unbroadcast(x, *axes):
    """
    Make the input impossible to broadcast in the specified axes.
@@ -4230,9 +4229,11 @@ def patternbroadcast(x, broadcastable):
""" """
Make the input adopt a specific broadcasting pattern. Make the input adopt a specific broadcasting pattern.
We apply the opt here not to pollute the graph especially during the gpu optimization We apply the opt here not to pollute the graph especially during the gpu
optimization.
""" """
rval = Rebroadcast(*[(i,broadcastable[i]) for i in xrange(len(broadcastable))])(x) rval = Rebroadcast(*[(i, broadcastable[i])
for i in xrange(len(broadcastable))])(x)
return theano.tensor.opt.apply_rebroadcast_opt(rval) return theano.tensor.opt.apply_rebroadcast_opt(rval)
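For readers unfamiliar with the term: a "broadcastable pattern" is one boolean per dimension, True where that dimension is guaranteed to have size 1 and may therefore be broadcast. A minimal sketch of the idea (a hypothetical helper, not a Theano function):

```python
def broadcast_pattern(shape):
    # One boolean per dimension: True where the dimension has size 1
    # and can be broadcast against a larger dimension.
    return tuple(s == 1 for s in shape)
```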
@@ -4554,16 +4555,17 @@ def stack(*tensors):
    # This should be an optimization!
    # Doing it here makes the graph less canonicalized
    # (more types need to be understood by all optimizations).
    # And DebugMode can't detect errors in this code, as it is not in an
    # optimization.
    # See ticket #660
    if numpy.all([
            # in case there is a direct int in tensors.
            isinstance(t, (numpy.number, float, int, python_complex)) or
            (isinstance(t, Variable) and
             isinstance(t.type, TensorType) and
             t.ndim == 0)
            for t in tensors]):
        # in case there is a direct int
        tensors = map(as_tensor_variable, tensors)
        dtype = scal.upcast(*[i.dtype for i in tensors])
        return theano.tensor.opt.MakeVector(dtype)(*tensors)
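The intent of the guard above — take a cheap MakeVector-style path only when every argument is scalar — can be sketched without Theano (a hypothetical simplification; the real check also accepts 0-d TensorType Variables):

```python
import numbers

def all_scalars(*values):
    # True when every argument is a plain Python/NumPy-style scalar,
    # so a vector can be built directly instead of via a full join.
    return all(isinstance(v, numbers.Number) for v in values)
```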
@@ -4650,7 +4652,9 @@ def vertical_stack(*args):
    return concatenate(args, axis=0)


# Vertical and horizontal stacking are deprecated. Better to use stack() and
# join().
if 0:
    class VerticalStack(Op):
        """
        Vertically stack two L{TensorType}s.
@@ -5168,17 +5172,17 @@ class PermuteRowElements(Op):
        if xs0 == ys0:
            for i in xrange(xs0):
                self._rec_perform(node, x[i], y[i], inverse, out[i],
                                  curdim + 1)
        elif ys0 == 1 and node.inputs[1].type.broadcastable[curdim]:
            # Broadcast y
            for i in xrange(xs0):
                self._rec_perform(node, x[i], y[0], inverse, out[i],
                                  curdim + 1)
        elif xs0 == 1 and node.inputs[0].type.broadcastable[curdim]:
            # Broadcast x
            for i in xrange(ys0):
                self._rec_perform(node, x[0], y[i], inverse, out[i],
                                  curdim + 1)
        else:
            raise ValueError('Dimension mismatch: %s, %s' % (xs0, ys0))
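The recursive broadcast dispatch above can be sketched with nested lists in place of ndarrays (a deliberately simplified stand-in; the real Op also handles the inverse permutation and checks broadcastable flags):

```python
def rec_permute(x, y):
    # x: nested lists of values; y: nested lists whose innermost lists
    # are index permutations into the rows of x. Descend matching
    # leading dimensions, repeating whichever side has length 1,
    # mirroring the elif chain above.
    if isinstance(y[0], list):
        xs0, ys0 = len(x), len(y)
        if xs0 == ys0:
            return [rec_permute(x[i], y[i]) for i in range(xs0)]
        if ys0 == 1:  # broadcast y
            return [rec_permute(x[i], y[0]) for i in range(xs0)]
        if xs0 == 1:  # broadcast x
            return [rec_permute(x[0], y[i]) for i in range(ys0)]
        raise ValueError('Dimension mismatch: %s, %s' % (xs0, ys0))
    # Base case: y is a permutation of indices into the row x.
    return [x[j] for j in y]
```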
...
(diff collapsed)
@@ -583,8 +583,8 @@ class T_examples(unittest.TestCase):
    def test_examples_8(self):
        from theano import shared
        # Force the dtype to int64 to work correctly on 32 bit computers.
        # Otherwise, it creates an int32 by default on 32 bit computers.
        state = shared(numpy.int64(0))
        inc = T.iscalar('inc')
        accumulator = function([inc], state, updates=[(state, state + inc)])
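The accumulator built above returns the shared state as it was *before* the update is applied. A plain-Python sketch of that behavior (a hypothetical stand-in, not Theano's shared/function machinery):

```python
class SharedAccumulator(object):
    # Calling the accumulator returns the previous state, then applies
    # the update state <- state + inc, mirroring how the updates list
    # of a compiled Theano function behaves.
    def __init__(self, start=0):
        self.state = start

    def __call__(self, inc):
        old = self.state
        self.state = self.state + inc
        return old
```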
......