Commit 13237ee6 authored by Frederic

more old todo/comment and pep8 doc/tutorial

Parent 2ba58c0c
......@@ -178,13 +178,11 @@ with NumPy arrays may be found here: :ref:`tensor creation<libdoc_tensor_creatio
import theano
a = theano.tensor.vector() # declare variable
out = a + a**10 # build symbolic expression
out = a + a ** 10 # build symbolic expression
f = theano.function([a], out) # compile function
print f([0,1,2]) # prints `array([0,2,1026])`
print f([0, 1, 2]) # prints `array([0, 2, 1026])`
Modify and execute this code to compute this expression: a**2 + b**2 + 2*a*b.
Modify and execute this code to compute this expression: a ** 2 + b ** 2 + 2 * a * b.
.. TODO: repair this link
:download:`Solution<adding_solution_1.py>`
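The exercise above can be sanity-checked eagerly with NumPy (a hypothetical stand-in for the symbolic Theano version; the arrays here are made up): `a ** 2 + b ** 2 + 2 * a * b` is the binomial expansion of `(a + b) ** 2`, so the two should match elementwise.

```python
import numpy as np

a = np.array([0., 1., 2.])
b = np.array([3., 4., 5.])

# the exercise expression, evaluated eagerly
out = a ** 2 + b ** 2 + 2 * a * b

# algebraically identical: (a + b) ** 2
expected = (a + b) ** 2
```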
......@@ -231,7 +231,7 @@ that control how ``theano.function`` handles its argument[s] and return value[s]
import theano, theano.tensor
x = theano.tensor.matrix()
y = 2*x
y = 2 * x
f = theano.function([theano.In(x, borrow=True)], theano.Out(y, borrow=True))
Borrowing an input means that Theano will treat the argument you provide as if
......
......@@ -24,33 +24,33 @@ IfElse vs Switch
from theano.ifelse import ifelse
import theano, time, numpy
a,b = T.scalars('a','b')
x,y = T.matrices('x','y')
a,b = T.scalars('a', 'b')
x,y = T.matrices('x', 'y')
z_switch = T.switch(T.lt(a,b), T.mean(x), T.mean(y))
z_lazy = ifelse(T.lt(a,b), T.mean(x), T.mean(y))
z_switch = T.switch(T.lt(a, b), T.mean(x), T.mean(y))
z_lazy = ifelse(T.lt(a, b), T.mean(x), T.mean(y))
f_switch = theano.function([a,b,x,y], z_switch,
f_switch = theano.function([a, b, x, y], z_switch,
mode=theano.Mode(linker='vm'))
f_lazyifelse = theano.function([a,b,x,y], z_lazy,
f_lazyifelse = theano.function([a, b, x, y], z_lazy,
mode=theano.Mode(linker='vm'))
val1 = 0.
val2 = 1.
big_mat1 = numpy.ones((10000,1000))
big_mat2 = numpy.ones((10000,1000))
big_mat1 = numpy.ones((10000, 1000))
big_mat2 = numpy.ones((10000, 1000))
n_times = 10
tic = time.clock()
for i in xrange(n_times):
f_switch(val1, val2, big_mat1, big_mat2)
print 'time spent evaluating both values %f sec'%(time.clock()-tic)
print 'time spent evaluating both values %f sec' % (time.clock() - tic)
tic = time.clock()
for i in xrange(n_times):
f_lazyifelse(val1, val2, big_mat1, big_mat2)
print 'time spent evaluating one value %f sec'%(time.clock()-tic)
print 'time spent evaluating one value %f sec' % (time.clock() - tic)
In this example, the ``IfElse`` op spends less time (about half as much) than ``Switch``
since it computes only one variable out of the two.
......
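The eager-vs-lazy distinction above can be sketched in plain Python (a toy model, not Theano's implementation): a `switch`-style construct evaluates both branches and then selects, while an `ifelse`-style construct evaluates only the branch it needs. Counting calls to the expensive function makes the difference visible.

```python
import numpy as np

calls = {"n": 0}

def expensive_mean(m):
    # stands in for T.mean on a large matrix
    calls["n"] += 1
    return m.mean()

big_mat1 = np.ones((1000, 100))
big_mat2 = np.ones((1000, 100))

def switch(cond, a_fn, b_fn):
    # eager: both branches are computed, then one is selected
    a, b = a_fn(), b_fn()
    return a if cond else b

def ifelse(cond, a_fn, b_fn):
    # lazy: only the needed branch is computed
    return a_fn() if cond else b_fn()

calls["n"] = 0
switch(0. < 1., lambda: expensive_mean(big_mat1), lambda: expensive_mean(big_mat2))
eager_calls = calls["n"]

calls["n"] = 0
ifelse(0. < 1., lambda: expensive_mean(big_mat1), lambda: expensive_mean(big_mat2))
lazy_calls = calls["n"]
```

This mirrors why `IfElse` spends roughly half the time of `Switch` when both branches are equally expensive.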
......@@ -35,27 +35,27 @@ following example.
theano.config.compute_test_value = 'off'
# configure shared variables
W1val = numpy.random.rand(2,10,10).astype(theano.config.floatX)
W1val = numpy.random.rand(2, 10, 10).astype(theano.config.floatX)
W1 = theano.shared(W1val, 'W1')
W2val = numpy.random.rand(15,20).astype(theano.config.floatX)
W2val = numpy.random.rand(15, 20).astype(theano.config.floatX)
W2 = theano.shared(W2val, 'W2')
# input which will be of shape (5,10)
x = T.matrix('x')
# transform the shared variable in some way. Theano does not
# know off hand that the matrix func_of_W1 has shape (20,10)
func_of_W1 = W1.dimshuffle(2,0,1).flatten(2).T
# know off hand that the matrix func_of_W1 has shape (20, 10)
func_of_W1 = W1.dimshuffle(2, 0, 1).flatten(2).T
# source of error: dot product of 5x10 with 20x10
h1 = T.dot(x,func_of_W1)
h1 = T.dot(x, func_of_W1)
# do more stuff
h2 = T.dot(h1,W2.T)
h2 = T.dot(h1, W2.T)
# compile and call the actual function
f = theano.function([x], h2)
f(numpy.random.rand(5,10))
f(numpy.random.rand(5, 10))
Running the above code generates the following error message:
......@@ -98,10 +98,10 @@ can get Theano to reveal the exact source of the error.
...
# input which will be of shape (5,10)
# input which will be of shape (5, 10)
x = T.matrix('x')
# provide Theano with a default test-value
x.tag.test_value = numpy.random.rand(5,10)
x.tag.test_value = numpy.random.rand(5, 10)
In the above, we are tagging the symbolic matrix *x* with a special test
value. This allows Theano to evaluate symbolic expressions on-the-fly (by
......@@ -159,10 +159,10 @@ Theano provides a 'Print' op to do this.
f_with_print = theano.function([x], x_printed * 5)
#this runs the graph without any printing
assert numpy.all( f([1,2,3]) == [5, 10, 15])
assert numpy.all( f([1, 2, 3]) == [5, 10, 15])
#this runs the graph with the message, and value printed
assert numpy.all( f_with_print([1,2,3]) == [5, 10, 15])
assert numpy.all( f_with_print([1, 2, 3]) == [5, 10, 15])
Since Theano runs your program in a topological order, you won't have precise
......@@ -242,7 +242,7 @@ along with its position in the graph, the arguments to the functions ``perform``
``c_code`` and the output it computed.
>>> x = T.dscalar('x')
>>> f = function([x], [5*x], mode=PrintEverythingMode())
>>> f = function([x], [5 * x], mode=PrintEverythingMode())
>>> f(3)
>>> # print: 0 Elemwise{mul,no_inplace}(5, x) [array(5, dtype=int8), array(3.0)] [array(15.0)]
>>> # print: [array(15.0)]
......@@ -280,11 +280,11 @@ Consider this example script ("ex.py"):
a = T.dmatrix('a')
b = T.dmatrix('b')
f = theano.function([a,b], [a*b])
f = theano.function([a, b], [a * b])
# matrices chosen so dimensions are unsuitable for multiplication
mat1 = numpy.arange(12).reshape((3,4))
mat2 = numpy.arange(25).reshape((5,5))
mat1 = numpy.arange(12).reshape((3, 4))
mat2 = numpy.arange(25).reshape((5, 5))
f(mat1, mat2)
......
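The failure mode in "ex.py" can be reproduced eagerly with NumPy (a sketch under the same made-up shapes): elementwise multiplication of a 3x4 matrix with a 5x5 matrix is not broadcastable, so NumPy raises a `ValueError` immediately at the call site, whereas the compiled Theano function only fails when it is finally called.

```python
import numpy as np

# matrices chosen so dimensions are unsuitable for elementwise multiplication
mat1 = np.arange(12).reshape((3, 4))
mat2 = np.arange(25).reshape((5, 5))

try:
    mat1 * mat2
    raised = False
except ValueError:
    raised = True
```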
......@@ -412,11 +412,11 @@ The preceding elements are featured in this more realistic example. It will be
print w.get_value(), b.get_value()
# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(-T.dot(x, w)-b)) # Probability that target = 1
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b)) # Probability that target = 1
prediction = p_1 > 0.5 # The prediction thresholded
xent = -y*T.log(p_1) - (1-y)*T.log(1-p_1) # Cross-entropy loss function
cost = xent.mean() + 0.01*(w**2).sum() # The cost to minimize
gw,gb = T.grad(cost, [w,b]) # Compute the gradient of the cost
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()# The cost to minimize
gw,gb = T.grad(cost, [w, b]) # Compute the gradient of the cost
# (we shall return to this in a
# following section of this tutorial)
......@@ -424,7 +424,7 @@ The preceding elements are featured in this more realistic example. It will be
train = theano.function(
inputs=[x,y],
outputs=[prediction, xent],
updates={w:w-0.1*gw, b:b-0.1*gb})
updates={w: w - 0.1 * gw, b: b - 0.1 * gb})
predict = theano.function(inputs=[x], outputs=prediction)
# Train
......
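The training step above (probability, cross-entropy, L2 penalty, and the updates `w - 0.1 * gw`, `b - 0.1 * gb`) can be sketched with hand-derived gradients in plain NumPy. This is a hypothetical stand-in: the data, sizes, iteration count, and target are made up here, and the gradients are written out by hand rather than obtained from `T.grad`.

```python
import numpy as np

rng = np.random.default_rng(0)
N, feats = 16, 3
x = rng.normal(size=(N, feats))
y = (x[:, 0] > 0).astype(float)  # a learnable toy target
w = np.zeros(feats)
b = 0.0

costs = []
for _ in range(100):
    p_1 = 1.0 / (1.0 + np.exp(-x.dot(w) - b))            # probability that target = 1
    xent = -y * np.log(p_1) - (1 - y) * np.log(1 - p_1)  # cross-entropy loss
    costs.append(xent.mean() + 0.01 * (w ** 2).sum())    # cost to minimize
    err = (p_1 - y) / N         # gradient of mean cross-entropy wrt the logits
    gw = x.T.dot(err) + 0.02 * w  # plus gradient of the 0.01 * (w ** 2).sum() penalty
    gb = err.sum()
    w -= 0.1 * gw               # same update rule as the Theano `train` function
    b -= 0.1 * gb

prediction = p_1 > 0.5
```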
......@@ -121,13 +121,13 @@ class TestOp(utt.InferShapeTester):
y = tensor.dmatrix()
# adapt the choice of the next instruction to the op under test
self._compile_and_check([x, y], [self.op_class()(x, y)], # case 1
[numpy.random.rand(5, 6),
numpy.random.rand(5, 6)],
self.op_class)
"""
self._compile_and_check([x, y], self.op_class()(x, y), # case 2
[numpy.random.rand(5, 6),
numpy.random.rand(5, 6)],
......@@ -141,5 +141,5 @@ if __name__ == "__main__":
t.test_perform()
# comment out next instruction in case 2 since autotesting of
# gradient of multiple output functions is not implemented yet
t.test_gradient() # enable in case 1, disable in case 2
t.test_gradient() # enable in case 1, disable in case 2
t.test_infer_shape()
......@@ -100,9 +100,9 @@ You can use a GPU function compiled with PyCUDA in a Theano op:
def make_thunk(self, node, storage_map, _, _2):
mod = SourceModule("""
__global__ void my_fct(float * i0, float * o0, int size) {
int i = blockIdx.x*blockDim.x + threadIdx.x;
int i = blockIdx.x * blockDim.x + threadIdx.x;
if(i<size){
o0[i] = i0[i]*2;
o0[i] = i0[i] * 2;
}
}""")
pycuda_fct = mod.get_function("my_fct")
......@@ -114,7 +114,7 @@ You can use a GPU function compiled with PyCUDA in a Theano op:
z[0] = cuda.CudaNdarray.zeros(inputs[0][0].shape)
grid = (int(numpy.ceil(inputs[0][0].size / 512.)),1)
pycuda_fct(inputs[0][0], z[0], numpy.intc(inputs[0][0].size),
block=(512,1,1), grid=grid)
block=(512, 1, 1), grid=grid)
return thunk
CUDAMat
......
......@@ -25,7 +25,7 @@ Here is the code to compute this gradient:
>>> from theano import pp
>>> x = T.dscalar('x')
>>> y = x**2
>>> y = x ** 2
>>> gy = T.grad(y, x)
>>> pp(gy) # print out the gradient prior to optimization
'((fill((x ** 2), 1.0) * 2) * (x ** (2 - 1)))'
......@@ -118,10 +118,10 @@ do is to loop over the entries in *y* and compute the gradient of
shall return to :ref:`scan<tutloop>` later in this tutorial.
>>> x = T.dvector('x')
>>> y = x**2
>>> J, updates = theano.scan(lambda i, y,x : T.grad(y[i], x), sequences = T.arange(y.shape[0]), non_sequences = [y,x])
>>> f = function([x], J, updates = updates)
>>> f([4,4])
>>> y = x ** 2
>>> J, updates = theano.scan(lambda i, y,x : T.grad(y[i], x), sequences=T.arange(y.shape[0]), non_sequences=[y,x])
>>> f = function([x], J, updates=updates)
>>> f([4, 4])
array([[ 8., 0.],
[ 0., 8.]])
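The scan result above can be checked by hand: for the elementwise map `y = x ** 2`, the Jacobian is diagonal with entries `2 * x_i`, so at `x = [4, 4]` it is `diag([8, 8])`. A NumPy sketch of that closed form:

```python
import numpy as np

x = np.array([4., 4.])
# for y = x ** 2 (elementwise), dy_i/dx_j is 2 * x_i when i == j, else 0
J = np.diag(2 * x)
```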
......@@ -156,12 +156,12 @@ scalar.
>>> x = T.dvector('x')
>>> y = x**2
>>> y = x ** 2
>>> cost = y.sum()
>>> gy = T.grad(cost, x)
>>> H, updates = theano.scan(lambda i, gy,x : T.grad(gy[i], x), sequences = T.arange(gy.shape[0]), non_sequences = [gy,x])
>>> f = function([x], H, updates = updates)
>>> f([4,4])
>>> H, updates = theano.scan(lambda i, gy,x : T.grad(gy[i], x), sequences=T.arange(gy.shape[0]), non_sequences=[gy, x])
>>> f = function([x], H, updates=updates)
>>> f([4, 4])
array([[ 2., 0.],
[ 0., 2.]])
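Again the scan output agrees with the closed form: the Hessian of `cost = sum(x ** 2)` is the constant matrix `2 * I`, independent of `x`. A NumPy sketch:

```python
import numpy as np

x = np.array([4., 4.])
# Hessian of sum(x ** 2): d2/dx_i dx_j = 2 when i == j, else 0
H = 2 * np.eye(x.size)
```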
......@@ -200,10 +200,10 @@ you need to do something similar to this:
>>> W = T.dmatrix('W')
>>> V = T.dmatrix('V')
>>> x = T.dvector('x')
>>> y = T.dot(x,W)
>>> y = T.dot(x, W)
>>> JV = T.Rop(y, W, V)
>>> f = theano.function([W,V,x], JV)
>>> f([[1,1],[1,1]], [[2,2,],[2,2]], [0,1])
>>> f = theano.function([W, V, x], JV)
>>> f([[1, 1], [1, 1]], [[2, 2], [2, 2]], [0,1])
array([ 2., 2.])
:ref:`List <R_op_list>` of Op that implement Rop.
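Since `y = x.dot(W)` is linear in `W`, the R-operator result above can be verified by hand: the Jacobian of `y` with respect to `W` applied to the direction `V` is simply `x.dot(V)`. A NumPy sketch with the same inputs:

```python
import numpy as np

x = np.array([0., 1.])
V = np.array([[2., 2.], [2., 2.]])
# (JV)_j = sum_i x_i * V_ij, because y_j = sum_i x_i * W_ij is linear in W
JV = x.dot(V)
```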
......@@ -219,10 +219,10 @@ f(x)}{\partial x}`. The *L-operator* is also supported for generic tensors
>>> W = T.dmatrix('W')
>>> v = T.dvector('v')
>>> x = T.dvector('x')
>>> y = T.dot(x,W)
>>> y = T.dot(x, W)
>>> VJ = T.Lop(y, W, v)
>>> f = theano.function([W,v,x], JV)
>>> f([[1,1],[1,1]], [2,2,], [0,1])
>>> f([[1, 1], [1, 1]], [2, 2], [0, 1])
array([[ 0., 0.],
[ 2., 2.]])
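The L-operator output above also has a closed form: with `y = x.dot(W)`, the gradient of `v.dot(y)` with respect to `W` is the outer product of `x` and `v`, which at `x = [0, 1]`, `v = [2, 2]` gives `[[0, 0], [2, 2]]`. A NumPy sketch:

```python
import numpy as np

x = np.array([0., 1.])
v = np.array([2., 2.])
# d(v . y)/dW_ij = x_i * v_j for y = x.dot(W)
VJ = np.outer(x, v)
```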
......@@ -250,11 +250,11 @@ Hence, we suggest profiling the methods before using either one of the two:
>>> x = T.dvector('x')
>>> v = T.dvector('v')
>>> y = T.sum(x**2)
>>> y = T.sum(x ** 2)
>>> gy = T.grad(y, x)
>>> vH = T.grad( T.sum(gy*v), x)
>>> f = theano.function([x,v], vH)
>>> f([4,4],[2,2])
>>> vH = T.grad(T.sum(gy * v), x)
>>> f = theano.function([x, v], vH)
>>> f([4, 4], [2, 2])
array([ 4., 4.])
......@@ -262,11 +262,11 @@ or, making use of the *R-operator*:
>>> x = T.dvector('x')
>>> v = T.dvector('v')
>>> y = T.sum(x**2)
>>> y = T.sum(x ** 2)
>>> gy = T.grad(y, x)
>>> Hv = T.Rop(gy,x,v)
>>> f = theano.function([x,v], Hv)
>>> f([4,4],[2,2])
>>> Hv = T.Rop(gy, x, v)
>>> f = theano.function([x, v], Hv)
>>> f([4, 4], [2, 2])
array([ 4., 4.])
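Both formulations above agree with the closed form: for `y = sum(x ** 2)` the gradient is `2 * x` and the Hessian is `2 * I`, so the Hessian-times-vector product is just `2 * v`, with no dependence on `x`. A NumPy sketch:

```python
import numpy as np

v = np.array([2., 2.])
# Hessian of sum(x ** 2) is 2 * I, so H.dot(v) == 2 * v
Hv = 2 * v
```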
......
......@@ -43,11 +43,11 @@ The full documentation can be found in the library: :ref:`Scan <lib_scan>`.
outputs_info=T.ones_like(A),
non_sequences=A, n_steps=k)
# Scan has provided us with A**1 through A**k. Keep only the last
# Scan has provided us with A ** 1 through A ** k. Keep only the last
# value. Scan notices this and does not waste memory saving them.
final_result = result[-1]
power = theano.function(inputs=[A,k], outputs=final_result,
power = theano.function(inputs=[A, k], outputs=final_result,
updates=updates)
print power(range(10),2)
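The scan in this example can be unrolled into an ordinary loop (a NumPy sketch, not how scan is implemented): start from ones and multiply elementwise by `A`, `k` times, keeping only the final value.

```python
import numpy as np

def power(A, k):
    # unrolled version of the scan: result starts at ones and is
    # multiplied elementwise by A on each of the k steps
    A = np.asarray(A, dtype=float)
    result = np.ones_like(A)
    for _ in range(k):
        result = result * A
    return result

out = power(range(10), 2)  # elementwise squares of 0..9
```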
......@@ -94,6 +94,4 @@ Run both examples.
Modify and execute the polynomial example to have the reduction done by ``scan``.
.. TODO: repair this link as well as the code in the target file
:download:`Solution<loop_solution_1.py>`
......@@ -23,7 +23,7 @@ result, updates = theano.scan(fn=inner_fct,
outputs_info=T.ones_like(A),
non_sequences=A, n_steps=k)
# Scan has provided us with A**1 through A**k. Keep only the last
# Scan has provided us with A ** 1 through A ** k. Keep only the last
# value. Scan notices this and does not waste memory saving them.
final_result = result[-1]
......@@ -83,10 +83,10 @@ outputs_info = T.as_tensor_variable(numpy.asarray(0, 'float64'))
components, updates = theano.scan(fn=lambda prior_value, coeff, power, free_var:
prior_value + (coeff * (free_var ** power)),
outputs_info=outputs_info,
sequences=[coefficients, full_range],
non_sequences=x)
outputs_info=outputs_info,
sequences=[coefficients, full_range],
non_sequences=x)
polynomial = components[-1]
calculate_polynomial = theano.function(inputs=[coefficients, x],
outputs=polynomial, updates=updates)
......@@ -94,4 +94,3 @@ calculate_polynomial = theano.function(inputs=[coefficients, x],
test_coeff = numpy.asarray([1, 0, 2], dtype=numpy.float32)
print calculate_polynomial(test_coeff, 3)
# 19.0
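The polynomial scan above reduces to a running sum of `coeff * x ** power` terms; for coefficients `[1, 0, 2]` at `x = 3` that is `1 + 0*3 + 2*9 = 19`. A plain-Python sketch of the same accumulation:

```python
import numpy as np

coefficients = np.asarray([1, 0, 2], dtype=np.float32)
x = 3.0

# running-sum version of the scan: accumulate coeff * x ** p
value = 0.0
for p, coeff in enumerate(coefficients):
    value += coeff * x ** p
```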
......@@ -108,12 +108,6 @@ Modify and execute this example to run on CPU (the default) with floatX=float32
time the execution using the command line ``time python file.py``. Save your code
as it will be useful later on.
.. TODO: To be resolved:
.. Solution said:
.. You will need to use: ``theano.config.floatX`` and ``ndarray.astype("str")``
.. Note::
* Apply the Theano flag ``floatX=float32`` through (``theano.config.floatX``) in your code.
......@@ -124,8 +118,6 @@ as it will be useful later on.
* Insert manual cast around the mean operator (this involves division by length, which is an *int64*).
* Notice that a new casting mechanism is being developed.
.. TODO: repair this link
:download:`Solution<modes_solution_1.py>`
-------------------------------------------
......
......@@ -63,4 +63,3 @@ print D[1]
print "prediction on D"
print predict(D[0])
......@@ -37,7 +37,7 @@ This is a 3x2 matrix, i.e. there are 3 rows and 2 columns.
To access the entry in the 3rd row (row #2) and the 1st column (column #0):
>>> numpy.asarray([[1., 2], [3, 4], [5, 6]])[2,0]
>>> numpy.asarray([[1., 2], [3, 4], [5, 6]])[2, 0]
5.0
......
......@@ -42,20 +42,20 @@ The following output depicts the pre- and post- compilation graphs.
# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(-T.dot(x, w)-b)) # Probabily of having a one
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b)) # Probabily of having a one
prediction = p_1 > 0.5 # The prediction that is done: 0 or 1
xent = -y*T.log(p_1) - (1-y)*T.log(1-p_1) # Cross-entropy
cost = xent.mean() + 0.01*(w**2).sum() # The cost to optimize
gw,gb = T.grad(cost, [w,b])
xent = -y * T.log(p_1) - (1 - y) * T.log(1 - p_1) # Cross-entropy
cost = xent.mean() + 0.01 * (w ** 2).sum() # The cost to optimize
gw,gb = T.grad(cost, [w, b])
# Compile expressions to functions
train = theano.function(
inputs=[x,y],
inputs=[x, y],
outputs=[prediction, xent],
updates={w:w-0.01*gw, b:b-0.01*gb},
name = "train")
updates={w: w - 0.01 * gw, b: b - 0.01 * gb},
name="train")
predict = theano.function(inputs=[x], outputs=prediction,
name = "predict")
name="predict")
if any( [x.op.__class__.__name__=='Gemv' for x in
train.maker.fgraph.toposort()]):
......
......@@ -24,7 +24,7 @@ Currently, information regarding shape is used in two ways in Theano:
import theano
x = theano.tensor.matrix('x')
f = theano.function([x], (x**2).shape)
f = theano.function([x], (x ** 2).shape)
theano.printing.debugprint(f)
#MakeVector [@43860304] '' 2
# |Shape_i{0} [@43424912] '' 1
......@@ -48,9 +48,9 @@ can lead to errors. Consider this example:
import theano
x = theano.tensor.matrix('x')
y = theano.tensor.matrix('y')
z = theano.tensor.join(0,x,y)
xv = numpy.random.rand(5,4)
yv = numpy.random.rand(3,3)
z = theano.tensor.join(0, x, y)
xv = numpy.random.rand(5, 4)
yv = numpy.random.rand(3, 3)
f = theano.function([x,y], z.shape)
theano.printing.debugprint(f)
......@@ -119,7 +119,7 @@ upgrade. Here is the current state of what can be done:
.. code-block:: python
theano.tensor.nnet.conv2d(..., image_shape=(7,3,5,5), filter_shape=(2,3,4,4))
theano.tensor.nnet.conv2d(..., image_shape=(7, 3, 5, 5), filter_shape=(2, 3, 4, 4))
- You can use the ``SpecifyShape`` op to add shape information anywhere in the
graph. This allows to perform some optimizations. In the following example,
......@@ -129,8 +129,8 @@ upgrade. Here is the current state of what can be done:
import theano
x = theano.tensor.matrix()
x_specify_shape = theano.tensor.specify_shape(x, (2,2))
f = theano.function([x], (x_specify_shape**2).shape)
x_specify_shape = theano.tensor.specify_shape(x, (2, 2))
f = theano.function([x], (x_specify_shape ** 2).shape)
theano.printing.debugprint(f)
# [2 2] [@72791376]
......
......@@ -67,7 +67,7 @@ Take for example the following code:
.. code-block:: python
x = T.dmatrix('x')
y = x*2.
y = x * 2.
If you enter ``type(y.owner)`` you get ``<class 'theano.gof.graph.Apply'>``,
which is the apply node that connects the op and the inputs to get this
......@@ -156,9 +156,9 @@ as we apply it. Consider the following example of optimization:
>>> import theano
>>> a = theano.tensor.vector("a") # declare symbolic variable
>>> b = a + a**10 # build symbolic expression
>>> b = a + a ** 10 # build symbolic expression
>>> f = theano.function([a], b) # compile function
>>> print f([0,1,2]) # prints `array([0,2,1026])`
>>> print f([0, 1, 2]) # prints `array([0,2,1026])`
====================================================== =====================================================
......
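The optimization example's printed result can be confirmed eagerly with NumPy (a stand-in for the compiled Theano function): `a + a ** 10` at `[0, 1, 2]` gives `0 + 0`, `1 + 1`, and `2 + 1024`.

```python
import numpy as np

a = np.array([0., 1., 2.])
out = a + a ** 10  # 2 ** 10 == 1024, so the last entry is 1026
```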
......@@ -389,8 +389,6 @@ What can be done to further increase the speed of the GPU version? Put your idea
* Insert manual cast around the mean operator (this involves division by length, which is an *int64*).
* Notice that a new casting mechanism is being developed.
.. TODO: repair this link
:download:`Solution<using_gpu_solution_1.py>`
-------------------------------------------
......
......@@ -321,4 +321,4 @@ Include the training steps inside the definition of the Theano function.
Implement this solution and put it to test.
"""
\ No newline at end of file
"""