.. _omlw2014_libgpuarray:
***********
libgpuarray
***********
Why a common GPU ndarray?
-------------------------
- Currently there are at least 4 different GPU array data structures in use by Python packages
- CudaNdarray (Theano), GPUArray (PyCUDA), CUDAMatrix (cudamat), GPUArray (PyOpenCL), ...
- There are even more if we include other languages
- All of them are a subset of the functionality of ``numpy.ndarray`` on the GPU
- Lots of duplicated effort
- GPU code is harder/slower to write **correctly** and **fast** than CPU/Python code
- Lack of a common array API makes it harder to port/reuse code
- Also harder to find/distribute code
- Divides development work
Design Goals
------------
- Make it VERY similar to ``numpy.ndarray``
- Be compatible with both CUDA and OpenCL
- Have the base object accessible from C to allow collaboration with more projects, across high-level languages
- We want people from C, C++, Lua, Ruby, R, ... to all use the same base GPU N-dimensional array
Final Note
----------
- Usable directly, but not all functionality is implemented yet.
- It is the next GPU array container for Theano and is already working (not all functionality is available yet)
- Mailing list: http://lists.tiker.net/listinfo/gpundarray
.. _omlw2014_index:
======================================================
Theano, Pylearn2, libgpuarray Presentation @ OMLW 2014
======================================================
August 22, 2014, New York University, US.
By Frédéric Bastien and Bart van Merriënboer. University of Montréal, Canada.
Theano, Pylearn2 and libgpuarray form a software stack for machine learning.
It complements the Python numeric/scientific software stack (e.g. NumPy, SciPy,
scikits, matplotlib, PIL.)
Theano
======
Theano is software for evaluating and manipulating complicated array
expressions.
What does it do?
* aggressive expression optimizations,
* automatic GPU use,
* automatic symbolic differentiation, Jacobian and Hessian computation,
  and R/L operators (for Hessian-free optimization).
The design and feature set have been driven by machine learning research
at the University of Montreal (the groups of Yoshua Bengio, Pascal Vincent,
Aaron Courville and Roland Memisevic).
The result is a very good library for doing research in deep
learning and neural network training, and a flexible framework for
many other models and algorithms in machine learning more generally.
It has proven to be useful for implementing:
- linear and nonlinear neural network classifiers
- including Maxout, Dropout
- convolutional models
- Energy models: RBM, DBN, GRBM, ssRBM, AIS
- Auto-encoders: DAE, CAE
- GP regression
- sparse coding
- recurrent neural networks, echo state, (HMM?) TODO
- online and batch learning and optimization
- Even SVM!
As people's needs change this list will grow, but Theano is built
around vector, matrix, and tensor expressions. It also supports sparse matrices.
Pylearn2
========
Pylearn2 is undergoing rapid development. Don't expect a clean
road without bumps! It is made for machine learning
practitioners and researchers first.
Pylearn2 is a machine learning library. Most of its functionality is
built on top of Theano. This means you can write Pylearn2 plugins (new
models, algorithms, etc) using mathematical expressions, and Theano
will optimize and stabilize those expressions for you, and compile
them to a backend of your choice (CPU or GPU).
Pylearn2 Vision
---------------
TODO: Should we split this in two parts: what is done, and what is the vision not yet done?
* Researchers **add features as they need them**. We avoid getting bogged down by
too much top-down planning in advance.
* A machine learning toolbox for **easy scientific experimentation**.
* All models/algorithms published by the LISA lab should have reference
implementations in Pylearn2. TODO REMOVE???
* Pylearn2 **may wrap other libraries** such as scikits.learn when this is practical
* Pylearn2 **differs from scikits.learn** in that Pylearn2 aims to provide great
flexibility and make it possible for a researcher to do almost anything,
while **scikits.learn aims to work as a "black box"**.
* **Dataset interface** for vectors, images, video, ... TODO (DO WE HAVE VIDEO?)
* A small framework covering everything needed for typical MLP/RBM/SDA/convolution
  experiments. (TODO: I think I would remove this)
* **Easy reuse of sub-components** of Pylearn2.
* Using one sub-component of the library does not force you to use / learn to
  use all of the other sub-components. TODO remove?
* Support cross-platform serialization of learned models. (TODO: I think this isn't done)
* Remain approachable enough to be used in the classroom
libgpuarray
===========
Make a common GPU ndarray (vector, matrix or n-dimensional array) that can be
reused by all projects. It supports CUDA and OpenCL.
Motivation
----------
* Currently there are at least 6 different GPU arrays in Python
* CudaNdarray (Theano), GPUArray (PyCUDA), CUDAMatrix (cudamat), GPUArray (PyOpenCL), Clyther, Copperhead, ...
* There are even more if we include other languages.
* They are incompatible
* None have the same properties and interface.
* All of them are a subset of ``numpy.ndarray`` on the GPU!
Design Goals
------------
* Have the base object in C to allow collaboration with more projects.
* We want people from C, C++, Ruby, R, ... to all use the same base GPU ndarray.
* Be compatible with both CUDA and OpenCL.
* Not too simple (don't support only matrices).
* But still make it easy to write new code that supports only a few memory layouts.
* This eases the development of new code.
Contents
========
.. toctree::

   introduction
   theano
   pylearn2
   gpundarray
   sharing
.. _omlw2014_Introduction:
************
Introduction
************
Python in one slide
-------------------
* General-purpose high-level **OO interpreted language**
* Emphasizes **code readability**
* Comprehensive standard library
* Dynamic typing and memory management
* Built-in types: int, float, str, list, dict, tuple, object
* Slow execution
* Popular in **web-dev** and **scientific communities**
NumPy in one slide
------------------
* Python floats are full-fledged objects on the heap
* Not suitable for high-performance computing!
* NumPy provides an N-dimensional numeric array in Python
* Perfect for high-performance computing.
* Slices return views (no copy)
* NumPy provides
* elementwise computations
* linear algebra, Fourier transforms
* pseudorandom numbers from many distributions
* SciPy provides lots more, including
* more linear algebra
* solvers and optimization algorithms
* matlab-compatible I/O
* I/O and signal processing for images and audio
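A few of these features in action (a quick NumPy-only sketch; the matrix values are arbitrary):

```python
import numpy as np

rng = np.random.RandomState(42)      # pseudorandom numbers, many distributions
A = rng.rand(3, 3) + 3 * np.eye(3)   # a well-conditioned random matrix
b = rng.rand(3)

x = np.linalg.solve(A, b)            # linear algebra: solve A @ x == b
spectrum = np.fft.fft(np.ones(4))    # Fourier transform of a constant signal

assert np.allclose(A.dot(x), b)      # the solver really inverts the system
print(spectrum.real)                 # all energy in the DC bin: [4. 0. 0. 0.]
```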
.. code-block:: python

    ##############################
    # Properties of NumPy arrays
    # that you really need to know
    ##############################
    import numpy as np           # import can rename
    a = np.random.rand(3, 4, 5)  # random generators
    a32 = a.astype('float32')    # arrays are strongly typed
    a.ndim                       # int: 3
    a.shape                      # tuple: (3, 4, 5)
    a.size                       # int: 60
    a.dtype                      # np.dtype object: 'float64'
    a32.dtype                    # np.dtype object: 'float32'
    b = a[1]                     # indexing returns a view, not a copy,
    b[1, 1] = 10                 # so assigning to it changes the
    assert a[1, 1, 1] == 10      # original array
Arrays can be combined with numeric operators, standard mathematical
functions. NumPy has great `documentation <http://docs.scipy.org/doc/numpy/reference/>`_.
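For instance, operators and standard functions apply elementwise and broadcast across compatible shapes (a small illustrative sketch):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)    # [[0, 1, 2], [3, 4, 5]]
b = np.array([10., 20., 30.])     # shape (3,)

# Elementwise operators broadcast b across each row of a
c = a + b                         # shape (2, 3)
d = np.sin(a) * 2                 # math functions also work elementwise

print(c[0])                       # [10. 21. 32.]
print(c[1])                       # [13. 24. 35.]
```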
What's missing?
---------------
* Non-lazy evaluation (required by Python) hurts performance
* NumPy is bound to the CPU
* NumPy lacks symbolic or automatic differentiation
A quick look at a small example:
.. code-block:: python

    #########################
    # Theano for Training a
    # Neural Network on MNIST
    #########################
    import numpy as np
    import theano
    import theano.tensor as tensor

    x = np.load('data_x.npy')
    y = np.load('data_y.npy')

    # symbol declarations
    sx = tensor.matrix()
    sy = tensor.matrix()
    w = theano.shared(np.random.normal(loc=0, scale=.1,
                                       size=(784, 500)))
    b = theano.shared(np.zeros(500))
    v = theano.shared(np.zeros((500, 10)))
    c = theano.shared(np.zeros(10))

    # symbolic expression-building
    hid = tensor.tanh(tensor.dot(sx, w) + b)
    out = tensor.tanh(tensor.dot(hid, v) + c)
    err = 0.5 * tensor.sum((out - sy) ** 2)
    gw, gb, gv, gc = tensor.grad(err, [w, b, v, c])

    # compile a fast training function
    lr = 0.01  # learning rate
    train = theano.function([sx, sy], err,
                            updates={
                                w: w - lr * gw,
                                b: b - lr * gb,
                                v: v - lr * gv,
                                c: c - lr * gc})

    # now do the computations
    batchsize = 100
    for i in range(1000):
        x_i = x[i * batchsize: (i + 1) * batchsize]
        y_i = y[i * batchsize: (i + 1) * batchsize]
        err_i = train(x_i, y_i)
Theano in one slide
-------------------
* High-level domain-specific language tailored to numeric computation
* Compiles most common expressions to C for CPU and GPU.
* Limited expressivity means lots of opportunities for expression-level optimizations
* No function call -> global optimization
* Strongly typed -> compiles to machine instructions
* Array oriented -> easy parallelism
* Support for looping and branching in expressions
* Expression substitution optimizations automatically draw
on many backend technologies for best performance.
* BLAS, SciPy, Cython, CUDA
* Slower fallbacks always available
* Automatic differentiation and R op
* Sparse matrices
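As a rough illustration of what ``tensor.grad`` does, here is a minimal reverse-mode differentiation sketch in plain Python. The ``Var`` class and its methods are a hypothetical toy, not Theano's API; Theano works on whole expression graphs and returns symbolic gradients rather than numbers.

```python
class Var:
    """A scalar node in an expression graph (toy sketch, not Theano's API)."""
    def __init__(self, value, parents=()):
        self.value = value       # forward value
        self.parents = parents   # [(parent_var, local_gradient), ...]
        self.grad = 0.0

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        # d(a+b)/da = d(a+b)/db = 1
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def backward(self, seed=1.0):
        # Accumulate d(output)/d(self) into each ancestor by the chain rule
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(4.0)
z = x * y + x          # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```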
Project status
--------------
* Mature: Theano has been developed and used since January 2008 (6.5 yrs old)
* Has driven over 100 research papers
* Good user documentation
* Active mailing list with participants from outside our lab
* Core technology for a few Silicon-Valley startups
* Many contributors (some from outside our lab)
* Used to teach many university classes
* Used for research at Google and Yahoo. (TODO, should we remove? I think so)
Pylearn2 in one slide
---------------------
TODO
Other global information
------------------------
Theano's basic operations are small ones, not layers:

* Easy reuse
* No need to reimplement the gradient for each variation of a layer

This could cause slowness (more small operations), but the optimizer fixes that.
Pylearn2 wraps the small operations into layers like other
projects do:

* There is no overhead to this extra layer, thanks to the
  compilation of the function by Theano.
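As a rough sketch of the idea, a "layer" is just a few small operations composed together (toy NumPy code; ``tanh_layer`` is a hypothetical name, not a Pylearn2 API, and Pylearn2's real layers wrap Theano expressions rather than NumPy calls):

```python
import numpy as np

def tanh_layer(x, w, b):
    # Three small ops (dot, add, tanh) composed into one reusable "layer"
    return np.tanh(np.dot(x, w) + b)

rng = np.random.RandomState(0)
x = rng.rand(5, 784)                     # a batch of 5 inputs
w = rng.normal(0, .1, size=(784, 500))   # layer parameters
b = np.zeros(500)

h = tanh_layer(x, w, b)
print(h.shape)  # (5, 500)
```

Because the gradient of each small op is known, the gradient of the whole layer (and of any variation of it) comes for free by the chain rule.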
Why scripting for GPUs?
-----------------------
They *complement each other*:
* GPUs are everything that scripting/high level languages are not
* Highly parallel
* Very architecture-sensitive
* Built for maximum FP/memory throughput
* So hard to program that meta-programming is easier.
* CPU: largely restricted to control
* Optimized for sequential code and low latency (rather than high throughput)
* Tasks (1000/sec)
* Scripting fast enough
Best of both: scripted CPU invokes JIT-compiled kernels on GPU.
.. _omlw2014_pylearn2:
********
Pylearn2
********
Pointers
--------
TODO:
* http://deeplearning.net/software/pylearn2/
* User mailing list: http://groups.google.com/group/pylearn-users
* Dev mailing list: http://groups.google.com/group/pylearn-dev
* Installation: http://deeplearning.net/software/pylearn2/index.html#download-and-installation
Description
-----------
TODO:
* ...
Simple example
--------------
(logistic regression?) TODO
Real example
------------
(maxout?) TODO
Known limitations
-----------------
TODO
* It is getting stabilized, but is still heavily modified.
.. _omlw2014_sharing:
************
Sharing code
************
* License (BSD 3-clause suggested; don't forget to add the license info in the code)
* Common base object? libgpuarray.
* If not, an important implementation that uses raw pointers/shapes? Document that interface.
* Important: an *acknowledgement section on the web site* (citation-like) AND *in papers* about the software we reuse (and use)!
*************
Theano future
*************