Commit e41ba9bd authored by lamblin

Merge pull request #1200 from delallea/minor

Minor fixes
@@ -105,7 +105,7 @@ Brian Vandenberg emailed `installation instructions on Gentoo
 <http://groups.google.com/d/msg/theano-dev/-8WCMn2FMR0/bJPasoZXaqoJ>`_,
 focusing on how to install the appropriate dependencies.
-Nicolas Pinto provide `ebuild scripts <https://github.com/npinto/sekyfsr-gentoo-overlay/tree/master/sci-libs/Theano>`_.
+Nicolas Pinto provides `ebuild scripts <https://github.com/npinto/sekyfsr-gentoo-overlay/tree/master/sci-libs/Theano>`_.
 Alternative installation on Mandriva 2010.2
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -657,9 +657,9 @@ Theano dependencies is easy, but be aware that it will take a long time
 Homebrew
 ~~~~~~~~
-There is some :ref:`instruction
-<https://github.com/samueljohn/homebrew-python>` on how to install
-Theano dependencies with Homebrew instead of MacPort by Samuel John.
+There are some :ref:`instructions
+<https://github.com/samueljohn/homebrew-python>` by Samuel John on how to install
+Theano dependencies with Homebrew instead of MacPort.
 .. _gpu_macos:
......
@@ -284,13 +284,13 @@ Tips for Improving Performance on GPU
 Check the line similar to *Spent Xs(X%) in cpu op, Xs(X%) in gpu op and Xs(X%) in transfer op*.
 This can tell you if not enough of your graph is on the GPU or if there
 is too much memory transfer.
-* Use nvcc options. nvcc support those options to speed up some
+* Use nvcc options. nvcc supports those options to speed up some
 computations: `-ftz=true` to `flush denormals values to
 zeros. <https://developer.nvidia.com/content/cuda-pro-tip-flush-denormals-confidence>`_,
-`--prec-div=false` and `--prec-sqrt=false` option to speed up
+`--prec-div=false` and `--prec-sqrt=false` options to speed up
 division and square root operation by being less precise. You can
-enable all of them with with the `nvcc.flags=--use_fast_math` Theano
-flags or you can enable them individually as in this example
+enable all of them with the `nvcc.flags=--use_fast_math` Theano
+flag or you can enable them individually as in this example:
 `nvcc.flags=-ftz=true --prec-div=false`.
 .. _gpu_async:
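
To illustrate the two ways of passing nvcc options described in the hunk above, here is a minimal sketch that builds the `nvcc.flags` setting and places it in the `THEANO_FLAGS` environment variable. The helper name `nvcc_theano_flags` is hypothetical and not part of Theano; only the `nvcc.flags` option and the nvcc options themselves come from the documentation text. It assumes the variable is set before Theano is imported and that no other Theano flags need to be preserved.

```python
import os


def nvcc_theano_flags(*nvcc_options):
    """Hypothetical helper: format nvcc options as one nvcc.flags setting."""
    return "nvcc.flags=" + " ".join(nvcc_options)


# Enable all of the fast-math options at once ...
all_fast = nvcc_theano_flags("--use_fast_math")

# ... or enable them individually, as in the documentation example.
individual = nvcc_theano_flags("-ftz=true", "--prec-div=false")

# Assumption: this runs before `import theano`, and overwriting any
# existing THEANO_FLAGS value is acceptable for this sketch.
os.environ["THEANO_FLAGS"] = individual

print(all_fast)
print(individual)
```

Both variants trade numerical precision for speed, so it may be worth checking results against a run without these flags.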
......
@@ -892,8 +892,8 @@ class ModuleCache(object):
 key_data = None
 # We have never seen this key before.
-# We acquire the lock later only if we where able to
-# generate c code Otherwise, we would take the lock for op
+# We acquire the lock later only if we were able to
+# generate C code. Otherwise, we would take the lock for ops
 # that have only a perform().
 lock_taken = False
 # This try/finally block ensures that the lock is released once we
@@ -920,7 +920,7 @@ class ModuleCache(object):
 src_code = compile_steps.next()
 module_hash = get_module_hash(src_code, key)
-# The op have c_code, so take the lock.
+# The op has c_code, so take the lock.
 compilelock.get_lock()
 lock_taken = True
 assert os.path.exists(location), (
......
@@ -42,7 +42,7 @@ compiledir_format_dict = {"platform": platform.platform(),
 "numpy_version": numpy.__version__,
 "gxx_version": gcc_version_str.replace(" ", "_"),
 }
-compiledir_format_keys = ", ".join(compiledir_format_dict.keys())
+compiledir_format_keys = ", ".join(sorted(compiledir_format_dict.keys()))
 default_compiledir_format =\
 "compiledir_%(platform)s-%(processor)s-%(python_version)s"
......
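
The last hunk above wraps the dictionary keys in `sorted()` before joining them. A plausible motivation, stated here as an assumption rather than something the commit message confirms, is that dictionary ordering was arbitrary in the Python versions of the time, so the joined key list could vary between runs. The sketch below uses placeholder values instead of real platform data to show the sorted key list and how the format string is filled in.

```python
# Placeholder values; the real dictionary is filled from platform and numpy.
compiledir_format_dict = {
    "python_version": "2.7.3",
    "platform": "Linux-x86_64",
    "processor": "x86_64",
}

# Sorting gives the same key listing regardless of dict ordering.
compiledir_format_keys = ", ".join(sorted(compiledir_format_dict.keys()))

default_compiledir_format = \
    "compiledir_%(platform)s-%(processor)s-%(python_version)s"

# %-formatting with a mapping looks each %(name)s up in the dictionary.
compiledir_name = default_compiledir_format % compiledir_format_dict

print(compiledir_format_keys)
print(compiledir_name)
```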