Remove None from return values of grad.
Also change the checks to test the output dtype of the Op itself rather
than the dtype of the inputs or of the gradient, since the output dtype
can depend on several factors.
The idea is that if the Op's output is continuous, then the gradient
should be propagated to the inputs, regardless of whether they are
continuous or discrete. However, if the output is discrete, then the
gradient wrt the inputs will be a continuous zero.
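
The rule above can be sketched in plain Python. This is only an illustration, not the Op code touched by this change: `is_discrete`, `grad_for_inputs`, the string dtypes and the `floatx` default are all hypothetical stand-ins.

```python
# Hypothetical sketch: dtypes are plain strings and gradients are lists of floats.
DISCRETE_PREFIXES = ("int", "uint", "bool")


def is_discrete(dtype):
    """Return True for integer-like or boolean dtype names (assumed naming)."""
    return dtype.startswith(DISCRETE_PREFIXES)


def grad_for_inputs(output_dtype, input_sizes, local_grads, floatx="float64"):
    """Return one (dtype, values) gradient per input, never None.

    The decision looks only at the Op's output dtype; the dtypes of the
    inputs are deliberately not consulted.
    """
    if is_discrete(output_dtype):
        # Discrete output: every input gets a continuous (float) zero gradient.
        return [(floatx, [0.0] * size) for size in input_sizes]
    # Continuous output: propagate the gradient to every input,
    # whether that input is continuous or discrete.
    return [(floatx, list(grad)) for grad in local_grads]
```

For example, `grad_for_inputs("int64", [2], [[1.0, 2.0]])` yields a float zero gradient of length 2, while the same call with `"float32"` as the output dtype passes the gradient `[1.0, 2.0]` through.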