testgroup / pytensor · Commits · 3e2cc421

Commit 3e2cc421 authored Jul 31, 2011 by James Bergstra

tweaking cifarCS2011 rst pages

Parent: 1192e645
Showing 3 changed files with 106 additions and 56 deletions:

* doc/cifarSC2011/boot_camp_overview.txt (+11 -15)
* doc/cifarSC2011/index.txt (+5 -4)
* doc/cifarSC2011/introduction.txt (+90 -37)
doc/cifarSC2011/boot_camp_overview.txt

@@ -13,13 +13,13 @@ on the afternoons of Aug 2, 3, 5, and 6 (but not Aug 4th).

 Day 1
 -----

 * Show of hands - what is your background?
 * Python & Numpy in a nutshell
 * Theano basics
 * Quick tour through Deep Learning Tutorials (think about projects)

 .. :
     day 1:
@@ -38,25 +38,25 @@ Day 1

 Day 2
 -----

 * Loop/Condition in Theano (10-20m)
 * Propose/discuss projects
 * Form groups and start projects!

 Day 3
 -----

 * Advanced Theano (30 minutes)
   * Debugging, profiling, compilation pipeline
 * Projects / General hacking / code-sprinting.

 Day 4
 -----

 * *You choose* (we can split the group)
 * Extending Theano
@@ -64,9 +64,5 @@ Day 4

   * How to use pycuda code in Theano
 * Projects / General hacking / code-sprinting.

+Note - the schedule here is a guideline.
+We can adapt it in response to developments in the hands-on work.
+The point is for you to learn something about the practice of machine
+learning.
doc/cifarSC2011/index.txt

@@ -21,9 +21,10 @@ What does it do?

 It complements the Python numeric/scientific software stack (e.g. numpy, scipy,
 scikits, matplotlib, PIL.)

-Design and feature set has been driven by research in the machine learning group at the University of
-Montreal (Yoshua Bengio, Pascal Vincent, Douglas Eck).
-Result: a very good library for doing research in deep
+Design and feature set has been driven by machine learning research
+at the University of
+Montreal (groups of Yoshua Bengio, Pascal Vincent, Douglas Eck).
+The result is a very good library for doing research in deep
 learning and neural network training, and a flexible framework for
 many other models and algorithms in machine learning more generally.
@@ -53,7 +54,7 @@ calculations on other data structures.

 Contents
 --------

-The structured part of the course will be a walk-through of the following
+The structured part of these lab sessions will be a walk-through of the following
 material. Interleaved with this structured part will be blocks of time for
 individual or group work. The idea is that you can try out Theano and get help
 from gurus on hand if you get stuck.
doc/cifarSC2011/introduction.txt

@@ -49,8 +49,6 @@ Background Questionaire

 Python in one slide
 -------------------

-Features:

 * General-purpose high-level OO interpreted language
 * Emphasizes code readability
@@ -59,22 +57,59 @@ Features:

 * Dynamic type and memory management
-* builtin types: int, float, str, list, dict, tuple, object
+* Built-in types: int, float, str, list, dict, tuple, object
+* Slow execution
+* Popular in web-dev and scientific communities

+Syntax sample:

 .. code-block:: python

+    #######################
+    # PYTHON SYNTAX EXAMPLE
+    #######################
+    a = 1                     # no type declaration required!
+    b = (1,2,3)               # tuple of three int literals
+    c = [1,2,3]               # list of three int literals
-    a = {'a': 5, 'b': None}   # dictionary of two elements
-    b = [1,2,3]               # list of three int literals
+    d = {'a': 5, b: None}     # dictionary of two elements
+                              # N.B. string literal, None
+    print d['a']              # square brackets index
+    # -> 5
+    print d[(1,2,3)]          # new tuple == b, retrieves None
+    # -> None
+    print d[6]
+    # raises KeyError Exception
+    x, y, z = 10, 100, 100    # multiple assignment from tuple
+    x, y, z = b               # unpacking a sequence
+    b_squared = [b_i**2 for b_i in b]   # list comprehension

     def foo(b, c=3):          # function w default param c
         return a + b + c      # note scoping, indentation

-    b_squared = [b_i**2 for b_i in b]   # list comprehension
+    foo(5)                    # calling a function
+    # -> 1 + 5 + 3 == 9       # N.B. scoping
+    foo(b=6, c=2)             # calling with named args
+    # -> 1 + 6 + 2 == 9

     print b[1:3]              # slicing syntax

+    class Foo(object):        # Defining a class
+        a = 1
+        def hello(self):
+            return self.a
+    class Bar(Foo):           # Defining a subclass
+        def __init__(self):
+            self.a = 6
+    f = Foo()                 # Creating a class instance
+    b = Bar()                 # Creating an instance of Bar
+    f.hello(); b.hello()      # Calling methods of objects
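The commit's syntax sample uses Python 2 `print` statements and so will not run on a modern interpreter. A minimal Python 3 re-run of the same sample (the names and expected values are taken from the slide above) would look like:

```python
# Python 3 version of the commit's syntax sample; print is a function here
a = 1                                # no type declaration required
b = (1, 2, 3)                        # tuple of three int literals
c = [1, 2, 3]                        # list of three int literals
d = {'a': 5, b: None}                # dict; any hashable object is a valid key
print(d['a'])                        # -> 5
print(d[(1, 2, 3)])                  # a new tuple equal to b -> None
x, y, z = b                          # unpacking a sequence
b_squared = [b_i ** 2 for b_i in b]  # list comprehension -> [1, 4, 9]

def foo(b, c=3):                     # default parameter; 'a' read from enclosing scope
    return a + b + c

print(foo(5))                        # -> 1 + 5 + 3 == 9
print(foo(b=6, c=2))                 # -> 1 + 6 + 2 == 9

class Foo:                           # class with a class attribute
    a = 1
    def hello(self):
        return self.a

class Bar(Foo):                      # subclass shadowing a per instance
    def __init__(self):
        self.a = 6

print(Foo().hello(), Bar().hello())  # -> 1 6
```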
 Numpy in one slide
 ------------------

@@ -104,13 +139,16 @@ Numpy in one slide

 * I/O and signal processing for images and audio

-Here are the properties of numpy arrays that you really need to know.

 .. code-block:: python

+    ##############################
+    # Properties of Numpy arrays
+    # that you really need to know
+    ##############################
-    import numpy as np
-    a = np.random.rand(3,4,5)
-    a32 = a.astype('float32')
+    import numpy as np            # import can rename
+    a = np.random.rand(3,4,5)     # random generators
+    a32 = a.astype('float32')     # arrays are strongly typed
     a.ndim                        # int: 3
     a.shape                      # tuple: (3,4,5)
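The array properties listed in this snippet are easy to verify directly. A short sketch, using NumPy's seeded `default_rng` generator in place of the legacy `np.random.rand` for reproducibility:

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator, reproducible output
a = rng.random((3, 4, 5))        # float64 array of shape (3, 4, 5)
a32 = a.astype('float32')        # astype copies into a new, strongly typed array

print(a.ndim)      # -> 3
print(a.shape)     # -> (3, 4, 5)
print(a.dtype)     # -> float64
print(a32.dtype)   # -> float32
print(a.size)      # -> 60 (total number of elements)
```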
@@ -118,44 +156,52 @@ Here are the properties of numpy arrays that you really need to know.

     a.dtype                       # np.dtype object: 'float64'
     a32.dtype                     # np.dtype object: 'float32'

-These arrays can be combined with numeric operators, standard mathematical
-functions. Numpy has XXX great documentation XXX.
+Arrays can be combined with numeric operators and standard mathematical
+functions. Numpy has great `documentation <http://docs.scipy.org/doc/numpy/reference/>`_.

 Training an MNIST-ready classification neural network in pure numpy might look like this:
 .. code-block:: python

+    #########################
+    # Numpy for Training a
+    # Neural Network on MNIST
+    #########################
     x = np.load('data_x.npy')
     y = np.load('data_y.npy')
-    w = np.random.normal(avg=0, std=.1,
+    w = np.random.normal(
+            avg=0,
+            std=.1,
             size=(784, 500))
-    b = np.zeros(500)
+    b = np.zeros((500,))
     v = np.zeros((500, 10))
-    c = np.zeros(10)
+    c = np.zeros((10,))
+    batchsize = 100
     for i in xrange(1000):
         x_i = x[i*batchsize:(i+1)*batchsize]
         y_i = y[i*batchsize:(i+1)*batchsize]
-        hidin = N.dot(x_i, w) + b
+        hidin = np.dot(x_i, w) + b
-        hidout = N.tanh(hidin)
+        hidout = np.tanh(hidin)
-        outin = N.dot(hidout, v) + c
+        outin = np.dot(hidout, v) + c
-        outout = (N.tanh(outin)+1)/2.0
+        outout = (np.tanh(outin)+1)/2.0
         g_outout = outout - y_i
-        err = 0.5 * N.sum(g_outout**2)
+        err = 0.5 * np.sum(g_outout**2)
         g_outin = g_outout * outout * (1.0 - outout)
-        g_hidout = N.dot(g_outin, v.T)
+        g_hidout = np.dot(g_outin, v.T)
         g_hidin = g_hidout * (1 - hidout**2)
-        b -= lr * N.sum(g_hidin, axis=0)
+        b -= lr * np.sum(g_hidin, axis=0)
-        c -= lr * N.sum(g_outin, axis=0)
+        c -= lr * np.sum(g_outin, axis=0)
-        w -= lr * N.dot(x_i.T, g_hidin)
+        w -= lr * np.dot(x_i.T, g_hidin)
-        v -= lr * N.dot(hidout.T, g_outin)
+        v -= lr * np.dot(hidout.T, g_outin)
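The listing above is schematic rather than runnable: NumPy's `normal` takes `loc`/`scale` rather than `avg`/`std`, the learning rate `lr` is never defined, and the `.npy` data files are not part of the repository. A self-contained Python 3 sketch of the same loop, assuming a learning rate of 0.01 and random synthetic data in place of MNIST, might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((1000, 784))                        # stand-in for data_x.npy
y = (rng.random((1000, 10)) > 0.5).astype(float)   # stand-in for data_y.npy

w = rng.normal(loc=0, scale=.1, size=(784, 500))   # NumPy spells avg/std as loc/scale
b = np.zeros((500,))
v = np.zeros((500, 10))
c = np.zeros((10,))

lr = 0.01          # learning rate (assumed; undefined in the slide)
batchsize = 100
for i in range(10):                                # a few minibatches, not 1000
    x_i = x[i*batchsize:(i+1)*batchsize]
    y_i = y[i*batchsize:(i+1)*batchsize]
    hidin = np.dot(x_i, w) + b                     # hidden pre-activation
    hidout = np.tanh(hidin)
    outin = np.dot(hidout, v) + c                  # output pre-activation
    outout = (np.tanh(outin) + 1) / 2.0            # squash outputs into (0, 1)

    g_outout = outout - y_i                        # grad of 0.5*sum((out-y)**2)
    err = 0.5 * np.sum(g_outout**2)
    g_outin = g_outout * outout * (1.0 - outout)   # backprop through the squash
    g_hidout = np.dot(g_outin, v.T)
    g_hidin = g_hidout * (1 - hidout**2)           # tanh' = 1 - tanh**2

    b -= lr * np.sum(g_hidin, axis=0)              # gradient-descent updates
    c -= lr * np.sum(g_outin, axis=0)
    w -= lr * np.dot(x_i.T, g_hidin)
    v -= lr * np.dot(hidout.T, g_outin)

print(err)   # squared-error loss after the last minibatch
```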
 What's missing?

@@ -167,11 +213,16 @@ What's missing?

 * Numpy lacks symbolic or automatic differentiation

-Here's how the algorithm above looks in Theano, and it runs 15 times faster if
-you have GPU (I'm skipping some dtype-details which we'll come back to):
+Now let's have a look at the same algorithm in Theano, which runs 15 times faster if
+you have a GPU (I'm skipping some dtype-details which we'll come back to).

 .. code-block:: python

+    #########################
+    # Theano for Training a
+    # Neural Network on MNIST
+    #########################
     import theano as T
     import theano.tensor as TT
@@ -188,12 +239,13 @@ you have GPU (I'm skipping some dtype-details which we'll come back to):

     c = T.shared(np.zeros(10))

     # symbolic expression-building
-    outout = TT.tanh(TT.dot(TT.tanh(TT.dot(sx, w.T) + b), v.T) + c)
-    err = 0.5 * TT.sum(outout - sy)**2
+    hid = TT.tanh(TT.dot(sx, w) + b)
+    out = TT.tanh(TT.dot(hid, v) + c)
+    err = 0.5 * TT.sum(out - sy)**2
     gw, gb, gv, gc = TT.grad(err, [w,b,v,c])

     # compile a fast training function
-    train = function([sx, sy], cost,
+    train = T.function([sx, sy], err,
                      updates={
                          w:w - lr * gw,
                          b:b - lr * gb,

@@ -201,6 +253,7 @@ you have GPU (I'm skipping some dtype-details which we'll come back to):

                          c:c - lr * gc})

     # now do the computations
+    batchsize = 100
     for i in xrange(1000):
         x_i = x[i*batchsize:(i+1)*batchsize]
         y_i = y[i*batchsize:(i+1)*batchsize]
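What `TT.grad` buys you is visible by comparison with the alternatives: without automatic differentiation, gradients must be derived by hand (as in the numpy listing above) or approximated numerically, at the cost of one function evaluation per parameter entry. A small pure-numpy sketch of that comparison, on a toy one-layer version of the tutorial's loss (the layer sizes here are illustrative, not from the slides):

```python
import numpy as np

def err(w, x, y):
    # toy loss: squared error of a single tanh layer
    out = np.tanh(np.dot(x, w))
    return 0.5 * np.sum((out - y) ** 2)

rng = np.random.default_rng(0)
x = rng.random((5, 3))
y = rng.random((5, 2))
w = rng.normal(size=(3, 2))

# analytic gradient via the chain rule (the kind of expression TT.grad derives)
out = np.tanh(np.dot(x, w))
g_out = (out - y) * (1 - out ** 2)   # d err/d out  times  tanh' = 1 - tanh**2
g_w = np.dot(x.T, g_out)

# central finite differences: one perturbation per entry of w
eps = 1e-6
g_fd = np.zeros_like(w)
for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        wp = w.copy(); wp[i, j] += eps
        wm = w.copy(); wm[i, j] -= eps
        g_fd[i, j] = (err(wp, x, y) - err(wm, x, y)) / (2 * eps)

print(np.max(np.abs(g_w - g_fd)))    # should be tiny; the two gradients agree
```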
@@ -241,12 +294,12 @@ Project status

 * Driven over 40 research papers in the last few years
-* Core technology for a funded Silicon-Valley startup
 * Good user documentation
 * Active mailing list with participants from outside our lab
+* Core technology for a funded Silicon-Valley startup
 * Many contributors (some from outside our lab)
 * Used to teach IFT6266 for two years
@@ -255,11 +308,11 @@ Project status

 * Unofficial RPMs for Mandriva
-* Downloads (on June 8 2011, since last January): Pypi 780, MLOSS: 483, Assembla (`bleeding edge` repository): unknown
+* Downloads (January 2011 - June 8 2011): Pypi 780, MLOSS: 483, Assembla (`bleeding edge` repository): unknown

-Why scripting for GPUs ?
-------------------------
+Why scripting for GPUs?
+-----------------------

 They *Complement each other*: