Commit 90399d7e authored by Jakub Sygnowski

(nit) tutorial typos

Parent 08963819
@@ -19,7 +19,7 @@ in most programming languages.
Theano represents symbolic mathematical computations as graphs. These
graphs are composed of interconnected :ref:`apply`, :ref:`variable` and
:ref:`op` nodes. *Apply* node represents the application of an *op* to some
*variables*. It is important to draw the difference between the
definition of a computation represented by an *op* and its application
to some actual data which is represented by the *apply* node.
@@ -394,7 +394,7 @@ it is assumed that the *gradient* is not defined.
Using the
`chain rule <http://en.wikipedia.org/wiki/Chain_rule>`_
these gradients can be composed in order to obtain the expression of the
*gradient* of the graph's output with respect to the graph's inputs.
A following section of this tutorial will examine the topic of :ref:`differentiation<tutcomputinggrads>`
in greater detail.
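Concretely, if the graph computes :math:`y = f(g(x))`, the per-op gradients the paragraph describes compose via the chain rule as:

.. math::

   \frac{\partial y}{\partial x} =
   \frac{\partial f}{\partial g} \cdot \frac{\partial g}{\partial x}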
......
@@ -552,7 +552,7 @@ If you are reading this, there is a high chance that you emailed our
mailing list and we asked you to read this section. This section
explains how to dump all the parameters passed to
``theano.function()``. This is useful to help us reproduce a problem
during compilation and it doesn't require you to make a self-contained
example.
For this to work, we need to be able to import the code for all Ops in
......
@@ -26,7 +26,7 @@ start to get reasonable output values.
Other hyperparameters may also play a role. For example, do your training
algorithms involve regularization terms? If so, are their corresponding
penalties set reasonably? Search a wider hyperparameter space with a few (one or
two) training epochs each to see if the NaNs disappear.
Some models can be very sensitive to the initialization of weight vectors. If
those weights are not initialized in a proper range, then it is not surprising
@@ -37,15 +37,15 @@ Run in NanGuardMode, DebugMode, or MonitorMode
-----------------------------------------------
If adjusting hyperparameters doesn't work for you, you can still get help from
Theano's NanGuardMode. Change the mode of your theano function to NanGuardMode
and run it again. NanGuardMode will monitor all input/output variables in
each node and raise an error if NaNs are detected. For how to use
``NanGuardMode``, please refer to :ref:`nanguardmode`.
DebugMode can also help. Run your code in DebugMode with the flag
``mode=DebugMode,DebugMode.check_py=False``. This will give you a clue about which
op is causing the problem, and then you can inspect that op in more detail. For
details of using ``DebugMode``, please refer to :ref:`debugmode`.
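The check these modes perform is conceptually simple: scan every value an op produces and fail fast on NaN or Inf. Below is a minimal pure-Python sketch of that idea (an illustration only, not Theano's implementation — with Theano you would instead pass ``mode=NanGuardMode(...)`` to ``theano.function``):

```python
import math

def nan_guard(fn):
    """Wrap a numeric function and raise as soon as it yields NaN or Inf,
    mimicking the per-node check NanGuardMode runs inside a graph."""
    def wrapped(*args):
        out = fn(*args)
        if isinstance(out, float) and not math.isfinite(out):
            raise FloatingPointError(
                "%r applied to %r produced %r" % (fn, args, out))
        return out
    return wrapped

# A healthy op passes through untouched.
guarded_log = nan_guard(math.log)
print(guarded_log(math.e))

# inf - inf yields NaN, and the guard flags it at the offending node.
guarded_sub = nan_guard(lambda a, b: a - b)
try:
    guarded_sub(math.inf, math.inf)
except FloatingPointError as err:
    print("caught:", err)
```

The point of failing at the node, rather than at the final output, is that you learn *which* op first produced the bad value.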
Theano's MonitorMode provides another helping hand. It can be used to step
through the execution of a function. You can inspect the inputs and outputs of
@@ -57,9 +57,9 @@ Numerical Stability
-------------------
After you have located the op which causes the problem, it may turn out that the
NaNs yielded by that op are related to numerical issues. For example,
:math:`1 / log(p(x) + 1)` may result in NaNs for those nodes that have learned to
yield a low probability p(x) for some input x.
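This failure mode can be reproduced with plain Python floats: once p(x) is small enough, ``p(x) + 1`` rounds to exactly ``1.0``, the log collapses to zero, and the downstream arithmetic degenerates (a stdlib sketch, independent of Theano):

```python
import math

p = 1e-20                     # a probability the model has driven to ~0

# 1 + 1e-20 rounds to exactly 1.0 in double precision, so the log is 0.0.
denom = math.log(p + 1)
print(denom)

# Under IEEE semantics (as inside a compiled graph) 1/0.0 is inf,
# and any later inf - inf becomes the NaN that gets reported.
print(math.isnan(math.inf - math.inf))

# log1p evaluates log(1 + p) without that rounding step and stays finite:
print(1.0 / math.log1p(p))
```

Rewriting the expression with ``log1p`` (or clipping p(x) away from zero) is the usual fix once such an op has been identified.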
Algorithm Related
@@ -74,5 +74,5 @@ and find out if everything is derived correctly.
Cuda Specific Option
--------------------
The Theano flag ``nvcc.fastmath=True`` can generate NaNs. Don't set
this flag while debugging NaNs.
@@ -25,7 +25,7 @@ functions using either of the following two options:
to :attr:`config.profile`.
- You can also use the Theano flags :attr:`profiling.n_apply`,
:attr:`profiling.n_ops` and :attr:`profiling.min_memory_size`
to modify the quantity of information printed.
2. Pass the argument :attr:`profile=True` to the function :func:`theano.function <function.function>`. Then call :attr:`f.profile.print_summary()` for a single function.
- Use this option when you want to profile not all the
@@ -44,19 +44,19 @@ functions using either of the following two options:
The profiler will output one profile per Theano function and one
profile that is the sum of the printed profiles. Each profile contains 4
sections: global info, class info, Ops info and Apply node info.
In the global section, the "Message" is the name of the Theano
function. theano.function() has an optional parameter ``name`` that
defaults to None. Change it to something else to help you profile many
Theano functions. In that section, we also see the number of times the
function was called (1) and the total time spent in all those
calls. The time spent in Function.fn.__call__ and in thunks is useful
to understand Theano overhead.
Also, we see the time spent in the two parts of the compilation
process: optimization (modify the graph to make it more stable/faster)
and linking (compile c code and make the Python callable returned
by function).
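Both profiling routes above hinge on switching the profiler on before the run; the flag-based option can be sketched from the shell (``my_script.py`` is a hypothetical placeholder for your own program):

```shell
# Profile every theano.function compiled by the script;
# the summary described above is printed at process exit.
THEANO_FLAGS=profile=True python my_script.py
```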
@@ -93,8 +93,8 @@ optimizations enabled, there would be only one op left in the graph.
memory allocated by Theano. The second is the peak GPU memory
that was allocated by Theano.
Do not always enable this, as it slows down memory allocation and
freeing. As this slows down the computation, it will affect speed
profiling. So don't use both at the same time.
To run the example:
......