Commit aad941fc authored by sebastien-j

Documentation: error messages

Parent 6a6e9914
You can run your Theano function in a :ref:`DebugMode<using_debugmode>`.
This tests the Theano optimizations and helps to find where NaN, inf and other problems come from.

Interpreting Error Messages
---------------------------

Even in its default configuration, Theano tries to display useful error
messages. Consider the following faulty code.

.. code-block:: python

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.vector()
    y = T.vector()
    z = x + x
    z = z + y
    f = theano.function([x, y], z)
    f(np.ones((2,)), np.ones((3,)))

Running the code above we see:

.. code-block:: bash

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-1-902f5bb7425a> in <module>()
          8 z = z + y
          9 f = theano.function([x, y], z)
    ---> 10 f(np.ones((2,)), np.ones((3,)))

    /data/lisa/exp/jeasebas/Theano/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
        603                 gof.link.raise_with_op(
        604                     self.fn.nodes[self.fn.position_of_error],
    --> 605                     self.fn.thunks[self.fn.position_of_error])
        606             else:
        607                 # For the c linker We don't have access from

    /data/lisa/exp/jeasebas/Theano/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
        593         t0_fn = time.time()
        594         try:
    --> 595             outputs = self.fn()
        596         except Exception:
        597             if hasattr(self.fn, 'position_of_error'):

    ValueError: Input dimension mis-match. (input[0].shape[0] = 3, input[1].shape[0] = 2)
    Apply node that caused the error: Elemwise{add,no_inplace}(<TensorType(float64, vector)>, <TensorType(float64, vector)>, <TensorType(float64, vector)>)
    Inputs types: [TensorType(float64, vector), TensorType(float64, vector), TensorType(float64, vector)]
    Inputs shapes: [(3,), (2,), (2,)]
    Inputs strides: [(8,), (8,), (8,)]
    Inputs scalar values: ['not scalar', 'not scalar', 'not scalar']
    HINT: Re-running with most Theano optimization disabled could give you a back-traces when this node was created. This can be done with by setting the Theano flags optimizer=fast_compile
    HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint of this apply node.

Arguably the most useful information is found approximately half-way through
the error message, where the kind of error is displayed along with its
cause (``ValueError: Input dimension mis-match. (input[0].shape[0] = 3,
input[1].shape[0] = 2)``).

Below it, some other information is given, such as the apply node that
caused the error, as well as the input types, shapes, strides and
scalar values.

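The reported shapes make the cause concrete: elementwise addition requires inputs of matching shape, the same rule NumPy enforces. A minimal NumPy-only sketch of the mismatch (independent of Theano, for illustration):

```python
import numpy as np

a = np.ones((3,))  # shape (3,), like the first reported input
b = np.ones((2,))  # shape (2,), like the other two inputs

try:
    a + b  # elementwise add needs matching (or broadcastable) shapes
except ValueError as e:
    print("ValueError:", e)
```
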
The two hints can also be helpful when debugging. Using the Theano flag
``optimizer=fast_compile`` can in some cases tell you the faulty line,
while ``exception_verbosity=high`` will display a debugprint of the
apply node. Using these hints, the end of the error message becomes:

.. code-block:: bash

    Backtrace when the node is created:
      File "/opt/lisa/os/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2690, in run_ast_nodes
        if self.run_code(code):
      File "/opt/lisa/os/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2746, in run_code
        exec code_obj in self.user_global_ns, self.user_ns
      File "<ipython-input-1-902f5bb7425a>", line 8, in <module>
        z = z + y

    Debugprint of the apply node:
    Elemwise{add,no_inplace} [@A] <TensorType(float64, vector)> ''
     |Elemwise{add,no_inplace} [@B] <TensorType(float64, vector)> ''
     | |<TensorType(float64, vector)> [@C] <TensorType(float64, vector)>
     | |<TensorType(float64, vector)> [@C] <TensorType(float64, vector)>
     |<TensorType(float64, vector)> [@D] <TensorType(float64, vector)>

Here we can see that the error can be traced back to the line ``z = z + y``.
Nevertheless, setting ``optimizer=fast_compile`` does not always yield a
backtrace of the error. In that case, you can use test values.

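The flags mentioned above can be supplied through the ``THEANO_FLAGS`` environment variable, or assigned on ``theano.config`` before any function is compiled; a minimal sketch (assuming Theano is installed):

```python
# From the shell, before Python starts:
#   THEANO_FLAGS='optimizer=fast_compile,exception_verbosity=high' python script.py
#
# Or, equivalently, from within Python before building any theano.function:
import theano

theano.config.optimizer = 'fast_compile'
theano.config.exception_verbosity = 'high'
```

Note that some config values are only consulted at compilation time, so they should be set before ``theano.function`` is called.
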
Using Test Values
-----------------

on-the-fly, before a ``theano.function`` is ever compiled. Since optimizations
haven't been applied at this stage, it is easier for the user to locate the
source of some bug. This functionality is enabled through the config flag
``theano.config.compute_test_value``. Its use is best shown through the
following example (both ``optimizer=fast_compile`` and ``exception_verbosity=high``
are already used here).

.. code-block:: python

Running the above code generates the following error message:

.. code-block:: bash

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-1-f144ea5aa857> in <module>()
         27 # compile and call the actual function
         28 f = theano.function([x], h2)
    ---> 29 f(numpy.random.rand(5, 10))

    /data/lisa/exp/jeasebas/Theano/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
        603                 gof.link.raise_with_op(
        604                     self.fn.nodes[self.fn.position_of_error],
    --> 605                     self.fn.thunks[self.fn.position_of_error])
        606             else:
        607                 # For the c linker We don't have access from

    /data/lisa/exp/jeasebas/Theano/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
        593         t0_fn = time.time()
        594         try:
    --> 595             outputs = self.fn()
        596         except Exception:
        597             if hasattr(self.fn, 'position_of_error'):

    ValueError: Shape mismatch: x has 10 cols (and 5 rows) but y has 20 rows (and 10 cols)
    Apply node that caused the error: Dot22(x, DimShuffle{1,0}.0)
    Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
    Inputs shapes: [(5, 10), (20, 10)]
    Inputs strides: [(80, 8), (8, 160)]
    Inputs scalar values: ['not scalar', 'not scalar']

    Debugprint of the apply node:
    Dot22 [@A] <TensorType(float64, matrix)> ''
     |x [@B] <TensorType(float64, matrix)>
     |DimShuffle{1,0} [@C] <TensorType(float64, matrix)> ''
       |Flatten{2} [@D] <TensorType(float64, matrix)> ''
         |DimShuffle{2,0,1} [@E] <TensorType(float64, 3D)> ''
           |W1 [@F] <TensorType(float64, 3D)>

    HINT: Re-running with most Theano optimization disabled could give you a back-traces when this node was created. This can be done with by setting the Theano flags optimizer=fast_compile

If the above is not informative enough, by instrumenting the code ever
so slightly, we can get Theano to reveal the exact source of the error.

.. code-block:: python

value. This allows Theano to evaluate symbolic expressions on-the-fly (by
calling the ``perform`` method of each op), as they are being defined. Sources
of error can thus be identified with much more precision and much earlier in
the compilation pipeline. For example, running the above code yields the
following error message, which properly identifies *line 24* as the culprit.

.. code-block:: bash

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-1-320832559ee8> in <module>()
         22
         23 # source of error: dot product of 5x10 with 20x10
    ---> 24 h1 = T.dot(x, func_of_W1)
         25
         26 # do more stuff

    /data/lisa/exp/jeasebas/Theano/theano/tensor/basic.pyc in dot(a, b)
       4732         return tensordot(a, b, [[a.ndim - 1], [numpy.maximum(0, b.ndim - 2)]])
       4733     else:
    -> 4734         return _dot(a, b)
       4735
       4736

    /data/lisa/exp/jeasebas/Theano/theano/gof/op.pyc in __call__(self, *inputs, **kwargs)
        543             thunk.outputs = [storage_map[v] for v in node.outputs]
        544
    --> 545             required = thunk()
        546             assert not required  # We provided all inputs
        547

    /data/lisa/exp/jeasebas/Theano/theano/gof/op.pyc in rval(p, i, o, n)
        750
        751         def rval(p=p, i=node_input_storage, o=node_output_storage, n=node):
    --> 752             r = p(n, [x[0] for x in i], o)
        753             for o in node.outputs:
        754                 compute_map[o][0] = True

    /data/lisa/exp/jeasebas/Theano/theano/tensor/basic.pyc in perform(self, node, inp, out)
       4552         # gives a numpy float object but we need to return a 0d
       4553         # ndarray
    -> 4554         z[0] = numpy.asarray(numpy.dot(x, y))
       4555
       4556     def grad(self, inp, grads):

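The final frame shows the failure surfacing inside ``numpy.dot``: a 5x10 matrix cannot be multiplied by a 20x10 matrix because the inner dimensions (10 and 20) disagree. The same misalignment can be reproduced with NumPy alone (the array names here are illustrative):

```python
import numpy as np

x = np.random.rand(5, 10)   # stands in for the input to T.dot
w = np.random.rand(20, 10)  # wrong orientation: inner dims 10 vs 20

try:
    np.dot(x, w)  # (5, 10) . (20, 10) -> inner dimensions do not match
except ValueError as e:
    print("ValueError:", e)

print(np.dot(x, w.T).shape)  # transposing aligns the dims -> (5, 20)
```
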
The ``compute_test_value`` mechanism works as follows: