Commit 87867d4c authored by hantek

adjust grammar and rewrite the last part

Parent 8ff173ff
@@ -16,4 +16,4 @@ Guide
The NanGuardMode aims to prevent the model from outputting NaNs or Infs. It
has a number of self-checks, which can help to find out which apply node is
generating those incorrect outputs.
@@ -8,24 +8,24 @@ Dealing with NaNs
Having a model yielding NaNs or Infs is quite common if some of the tiny
components in your model are not set properly. NaNs are hard to deal with
because sometimes they are caused by a bug or error in the code, sometimes by
the numerical stability of your computational environment (library versions,
etc.), and sometimes they relate to your algorithm. Here we try to outline
the common issues which cause a model to yield NaNs, and to provide the tools
to diagnose them.
Check Hyperparameters and Weight Initialization
-----------------------------------------------
Most frequently, the cause is that some of the hyperparameters, especially
learning rates, are set incorrectly. A high learning rate can blow up your
whole model into NaN outputs even within one epoch of training. So the first
and easiest solution is to try lowering it. Keep halving your learning rate
until you start to get reasonable output values.
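The halving loop above can be sketched on a toy model; the quartic objective, starting point, and step count here are illustrative assumptions, not part of the original text:

```python
import math

def train(lr, steps=50, w=5.0):
    """Toy gradient descent on f(w) = w**4 (gradient 4*w**3).
    Too high a learning rate makes w blow up to inf and then NaN."""
    for _ in range(steps):
        w -= lr * 4 * w * w * w
        if not math.isfinite(w):
            break   # diverged: w is inf or NaN
    return w

# Keep halving the learning rate until a short run stays finite.
lr = 1.0
while not math.isfinite(train(lr)):
    lr /= 2
print("first stable learning rate:", lr)
```

In a real model, the same loop would wrap a short training run and a finiteness check on the loss instead of a toy update rule.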
Other hyperparameters may also play a role. For example, do your training
algorithms involve regularization terms? If so, are their corresponding
penalties set reasonably? Search a wider hyperparameter space with a few (one
or two) training epochs each to see if the NaNs disappear.
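Such a short-run search can be sketched as a small grid probe; the toy objective, grid values, and L2 penalty below are illustrative assumptions:

```python
import itertools
import math

def nan_probe(lr, penalty, steps=200, w=5.0):
    """Run a short training probe on a toy L2-regularised quartic model
    and report whether the weight stayed finite (i.e. no NaN/Inf)."""
    for _ in range(steps):
        grad = 4 * w * w * w + penalty * w   # d/dw of w**4 + penalty*w**2/2
        w -= lr * grad
        if not math.isfinite(w):
            return False
    return True

# Keep only the settings whose short runs never blow up.
stable = [(lr, pen)
          for lr, pen in itertools.product([0.5, 0.05, 0.005], [0.0, 1.0])
          if nan_probe(lr, pen)]
print(stable)
```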
Some models can be very sensitive to the initialization of weight vectors. If
@@ -36,10 +36,11 @@ that the model ends up yielding NaNs.
Run in DebugMode
-----------------
If adjusting hyperparameters doesn't work for you, you can still get help from
Theano's DebugMode. Run your code in DebugMode with the flags mode=DebugMode,
DebugMode.check_py=False. This will give you a clue about which op is causing
the problem, and then you can inspect that op in more detail. For a detailed
guide to using DebugMode, please refer to :ref:`debugmode`.
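These flags are normally passed through the THEANO_FLAGS environment variable, which must be set before Theano is imported; a minimal sketch of that setup:

```python
import os

# Equivalent to launching the script with
#   THEANO_FLAGS='mode=DebugMode,DebugMode.check_py=False' python script.py
# The variable must be set before `import theano` for the flags to take effect.
os.environ['THEANO_FLAGS'] = 'mode=DebugMode,DebugMode.check_py=False'
```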
Theano's MonitorMode can also help. It can be used to step through the
execution of a function. You can inspect the inputs and outputs of each node
being executed.
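Stripped of Theano specifics, the hook you would hand to MonitorMode as a post-execution function simply scans each node's outputs for non-finite values; a minimal sketch, with illustrative names:

```python
import math

def detect_nonfinite(node_name, outputs):
    """Report which outputs of a node are NaN or Inf.
    In Theano, this logic would live in a MonitorMode post_func."""
    bad = [i for i, v in enumerate(outputs) if not math.isfinite(v)]
    if bad:
        print(f"*** non-finite output(s) {bad} from node {node_name} ***")
    return bad

detect_nonfinite('log', [0.5, float('nan'), float('inf')])  # flags indices 1 and 2
detect_nonfinite('add', [1.0, 2.0])                         # silent: all finite
```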
@@ -51,15 +52,15 @@ Numerical Stability
-------------------
After you have located the op which causes the problem, it may turn out that
the NaNs yielded by that op are related to numerical issues. For example,
:math:`1 / \log(p(x) + 1)` may result in NaNs for those nodes which have
learned to yield a low probability p(x) for some input x.
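Concretely: as p(x) approaches 0, p(x) + 1 rounds to exactly 1.0 in double precision, its log is exactly 0, and the reciprocal fails. Clamping the denominator with a small epsilon (the constant here is an illustrative choice) keeps the result finite:

```python
import math

def unstable(p):
    return 1.0 / math.log(p + 1.0)

def stabilised(p, eps=1e-8):
    # Clamp the denominator away from zero before dividing.
    return 1.0 / max(math.log(p + 1.0), eps)

p = 1e-300                 # a node that learned a vanishing probability
# unstable(p) raises ZeroDivisionError: log(1e-300 + 1.0) is exactly 0.0
print(stabilised(p))       # finite, on the order of 1e8
```

Using math.log1p(p) instead of log(p + 1.0) also avoids the rounding that collapses the denominator to exactly zero, though for a vanishing p(x) the reciprocal is still enormous.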
Algorithm Related
-----------------
In the most difficult situations, you may go through all of the above steps
and find nothing wrong. If these methods fail to uncover the cause, there is
a good chance that something is wrong with your algorithm. Go back to the
mathematics and find out if everything is derived correctly.