Commit 680275ad authored by Frederic Bastien

scan doc and small stuff

Parent aaab6ea5
@@ -855,21 +855,27 @@ print f([0, 1, 2])
\end{frame}
\subsection{Scan}
\begin{frame}
\frametitle{Scan}
\begin{itemize}
\item Allows looping (for, map, while)
\item Allows recursion (reduce)
\item Allows recursion with dependencies on many of the previous time steps
\item Optimizes some cases, e.g. by moving computation outside of scan.
\item The Scan grad is done via Backpropagation Through Time (BPTT)
\end{itemize}
\end{frame}
\begin{frame}{When not to use scan}
\begin{itemize}
\item If it is only needed for ``vectorization'' or ``broadcasting'':
tensor and numpy.ndarray support these natively, which will be much
faster.
\item If you do a very small, fixed number of iterations (2, 3): you
are probably better off just unrolling the graph.
\end{itemize}
\end{frame}
@@ -1069,6 +1075,17 @@ The result is then sliced to obtain the pre-nonlinearity activations for i, f, $
\end{itemize}
\end{frame}
\begin{frame}{LSTM Tips For Training}
\begin{itemize}
\item Don't use plain SGD; use something like adagrad or rmsprop.
\item Initialize any recurrent weights as orthogonal matrices (orth\_weights). This helps optimization.
\item Move any operation that does not have to be inside ``scan'' out of it.
Theano does this in many cases, but not all.
\item Rescale (clip) the L2 norm of the gradient, if necessary.
\item You can use weight noise or dropout at the output of the recurrent layer for regularization.
\end{itemize}
\end{frame}
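Not part of the slides: a minimal numpy sketch of two of the tips above, gradient L2-norm clipping and orthogonal initialization via QR. The function names and the clipping threshold are illustrative choices, not from the tutorial code.

```python
import numpy as np

def clip_grad_norm(grad, threshold=5.0):
    # Rescale the gradient if its L2 norm exceeds the threshold
    # (the threshold value here is an arbitrary illustration).
    norm = np.sqrt(np.sum(grad ** 2))
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad

def orthogonal_init(n, rng=np.random):
    # QR decomposition of a random Gaussian matrix yields an
    # orthogonal Q, usable to initialize recurrent weights.
    q, _ = np.linalg.qr(rng.randn(n, n))
    return q

g = clip_grad_norm(np.array([30.0, 40.0]))  # norm 50 -> rescaled to 5
W = orthogonal_init(4)                      # W.T @ W is the identity
```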
\begin{frame}
\frametitle{}
\begin{itemize}
@@ -1076,19 +1093,10 @@ The result is then sliced to obtain the pre-nonlinearity activations for i, f, $
\end{itemize}
\end{frame}
\begin{frame}{Conclusion}
Theano/Pylearn2/libgpuarray provide an environment for machine learning that is:
\textbf{Fast to develop}\newline
\textbf{Fast to run}\newline
\end{frame}
\section{Exercises}
\begin{frame}{Exercises}
\begin{itemize}
\item Theano exercise: Work through the ``0[1-4]*'' directory exercises:
Available at ``git~clone~https://github.com/abergeron/ccw\_tutorial\_theano.git''.
...