Commit 2f046666 authored by Dustin Webb

Moved general FAQ to top-level index page.

Added entries to FAQ related to speeding up Theano function compilation and out-of-memory errors.
Parent c8e7eea6
......@@ -39,6 +39,15 @@ CPUs. In fact, Theano asks g++ what are the equivalent flags it uses, and re-use
them directly.
Faster Theano Function Compilation
----------------------------------
Theano function compilation can be time consuming. It can be sped up by setting the flag ``mode=FAST_COMPILE``, which disables a number of time-consuming optimizations and the generation of any C/CUDA code. This is useful for quickly testing a simple idea.
If C/CUDA code is necessary, as when using a GPU, the flag ``optimizer=fast_compile`` can be used instead. It instructs Theano to skip time-consuming optimizations but still generate C/CUDA code. Getting the most out of this flag requires a development version of Theano rather than the latest release (0.6).
Similarly, the flag ``optimizer_excluding=inplace`` speeds up compilation by disabling the optimizations that replace operations with versions that reuse memory where doing so does not affect the correctness of the result. Such optimizations can be time consuming. However, this flag results in greater memory usage, because space must be allocated for results that could otherwise have reused existing memory. In short, this flag speeds up compilation at the cost of higher memory use.
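One common way to pass these flags is through the ``THEANO_FLAGS`` environment variable, written as a comma-separated list of ``key=value`` pairs. The following is a minimal sketch, assuming the variable is set before Theano is first imported. ``build_theano_flags`` is a hypothetical helper written for this example and is not part of Theano.

```python
import os


def build_theano_flags(**flags):
    """Join keyword arguments into a comma-separated key=value string.

    Hypothetical helper for illustration; not part of Theano.
    """
    return ",".join("%s=%s" % (key, value) for key, value in sorted(flags.items()))


# Keep C/CUDA code generation, but skip slow optimizations and inplace rewrites.
flags = build_theano_flags(optimizer="fast_compile", optimizer_excluding="inplace")
os.environ["THEANO_FLAGS"] = flags
print(flags)

# import theano  # must come after the environment variable has been set
```

Setting the variable from the shell before launching Python has the same effect and avoids any dependence on import order.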
Faster Theano function
----------------------
......@@ -62,12 +71,42 @@ can disable it by setting ``f.trust_input`` to True.
Make sure the types of arguments you provide match those defined when
the function was compiled.
For example, replace the following
.. code-block:: python

    import theano
    from theano import function

    x = theano.tensor.scalar('x')
    f = function([x], x + 1.)
    f(10.)  # Theano checks and converts the input on every call
with
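.. code-block:: python

    import numpy
    import theano
    from theano import function

    x = theano.tensor.scalar('x')
    f = function([x], x + 1.)
    f.trust_input = True
    # With trust_input, pass a 0-d array that already has the right dtype.
    f(numpy.asarray(10., dtype=theano.config.floatX))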
Also, for small Theano functions, you can remove more Python overhead by
making a Theano function that does not take any input. You can use shared
variables to achieve this. Then you can call it like this: ``f.fn()`` or
``f.fn(n_calls=N)`` to speed it up. In the latter case, only the output of
the last of the N calls is returned.
Out of memory... but not really
-------------------------------
Occasionally Theano may fail to allocate memory when there appears to be more than enough available, reporting:
Error allocating X bytes of device memory (out of memory). Driver report Y bytes free and Z total.
where X is far less than Y and Z (i.e. X << Y < Z).
This occurs when an operation requires allocation of a large contiguous block of memory but no block of sufficient size is available.
A known example is related to writing data to shared variables. When updating a shared variable, Theano allocates new space if the size of the data does not match the size of the space already assigned to the variable. The error above results if there is no contiguous block of memory large enough to hold the data, even if the overall free memory is sufficient.
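The difference between total free memory and the largest contiguous free block can be illustrated with a toy model. This is only a sketch: the slot list, ``largest_free_run``, and ``can_allocate`` are made up for this example and do not reflect how Theano or the GPU driver actually tracks memory.

```python
def largest_free_run(slots):
    """Return the length of the longest run of consecutive free slots."""
    best = run = 0
    for used in slots:
        run = 0 if used else run + 1
        best = max(best, run)
    return best


def can_allocate(slots, size):
    """A request succeeds only if one contiguous free run can hold it."""
    return largest_free_run(slots) >= size


# Toy device memory of 12 slots; True marks a slot that is in use.
# Every third slot is occupied, so the free space is split into runs of 2.
memory = [i % 3 == 0 for i in range(12)]

total_free = memory.count(False)  # free slots overall (Y in the error message)
request = 4                       # requested size (X), smaller than total_free

print(total_free, largest_free_run(memory), can_allocate(memory, request))
```

In this model the request fails even though twice as many slots are free as are needed, which mirrors the X << Y situation described above.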
Related Projects
----------------
......
......@@ -110,6 +110,7 @@ Roughly in order of what you'll want to check out:
* :ref:`introduction` -- What is Theano?
* :ref:`tutorial` -- Learn the basics.
* :ref:`libdoc` -- Theano's functionality, module by module.
* :ref:`faq` -- A set of commonly asked questions.
* :ref:`optimizations` -- Guide to Theano's graph optimizations.
* :ref:`extending` -- Learn to add a Type, Op, or graph optimization.
* :ref:`dev_start_guide` -- How to contribute code to Theano.
......
......@@ -43,6 +43,5 @@ you out.
debug_faq
profiling
extending_theano
faq
python-memory-management
multi_cores