We will try to post student questions and homework clarifications here
in a timely fashion. Please send your questions to cs156ta@work.caltech.edu.
We also have an announcements mailing list, cs156@work.caltech.edu.
You can subscribe to receive emails or read the messages directly online.
| Problem 1 |
Part (a) asks for the best hypothesis.
The best hypothesis does not depend on the training examples.
It is the hypothesis that can achieve the lowest test error
with respect to the input distribution on x.
|
|
Part (b) asks for the expected value of the hypothesis.
The answer should be a complete function on [-π, π] whose value
at a point x is the expected value of g(x) where
g is the hypothesis that the learning model produces based
on the two examples (hence the expectation is with respect to the
two examples).
|
|
For numerically evaluating the integrals, a hint: when you integrate a linear function a*x+b (where x is a variable not being integrated over), you end up with a linear function A*x+B (where A is an
integral involving only a, and B is an integral involving only b). Evaluating A and B separately
may give MATLAB or Mathematica a better form to solve (for one
thing, the non-numerical x won't be there).
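
A minimal Python sketch of this hint. It assumes, purely for illustration, a target f(x) = sin(x), inputs uniform on [-π, π], and a model that fits the line a*x+b through the two examples; the actual target and model in the problem may differ, and the names `target` and `expected_line` are hypothetical. It averages a and b separately over a midpoint grid, so the free variable x never enters the numerical integration.

```python
import math

def target(x):
    # Hypothetical target function; substitute the one from the problem.
    return math.sin(x)

def expected_line(n=200):
    """Approximate A = E[a] and B = E[b] over two examples x1, x2
    drawn uniformly from [-pi, pi], using a midpoint grid."""
    h = 2 * math.pi / n
    grid = [-math.pi + (i + 0.5) * h for i in range(n)]
    sum_a = sum_b = 0.0
    count = 0
    for i, x1 in enumerate(grid):
        for j, x2 in enumerate(grid):
            if i == j:
                continue  # no unique line through a single point; measure-zero set
            a = (target(x2) - target(x1)) / (x2 - x1)  # slope through both examples
            b = target(x1) - a * x1                    # intercept
            sum_a += a
            sum_b += b
            count += 1
    return sum_a / count, sum_b / count

A, B = expected_line()
# The expected hypothesis is then the function x -> A * x + B.
```

Because A and B are plain numbers, the resulting function A*x+B can be evaluated or plotted at any x without redoing the integration.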
|
| Problem 2 |
For parts (v) and (vi), you need to numerically find Δu and Δv that minimize the second-order approximation of ΔE. Another approach is
to use Newton's method, which gives you a direction along with a prescribed step size. You can use that direction and restrict the step size to 0.1.
Either approach is acceptable for the homework, although we encourage you to try both.
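
A minimal Python sketch of the Newton's-method variant, under two assumptions: the error surface E(u, v) = u^2 + 2 v^2 is a hypothetical stand-in for the one in the problem, and "restrict the step size to 0.1" is read as rescaling the Newton direction to length 0.1. The function name `newton_step` is also hypothetical.

```python
import math

def newton_step(grad, hess, step_size=0.1):
    """Return (du, dv): the Newton direction -H^{-1} * grad,
    rescaled to have length step_size."""
    gu, gv = grad
    (huu, huv), (hvu, hvv) = hess
    det = huu * hvv - huv * hvu
    if det == 0:
        raise ValueError("Hessian is singular; Newton direction undefined")
    # Solve H * d = -grad for the 2x2 case via the explicit inverse.
    du = -(hvv * gu - huv * gv) / det
    dv = -(-hvu * gu + huu * gv) / det
    norm = math.hypot(du, dv)
    if norm == 0:
        return 0.0, 0.0  # already at a stationary point
    return step_size * du / norm, step_size * dv / norm

# Hypothetical error surface E(u, v) = u^2 + 2 v^2, evaluated at (u, v) = (1, 1).
u, v = 1.0, 1.0
grad = (2 * u, 4 * v)              # first derivatives of E
hess = ((2.0, 0.0), (0.0, 4.0))    # second derivatives of E
du, dv = newton_step(grad, hess)
```

Note that the Newton direction is only guaranteed to decrease E when the Hessian is positive definite, so it is worth checking that E actually goes down after each step.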
|
| Code |
Source code is required for problems 1 and 4.
|
| Boundary |
The decision boundary is a curve along which two regions with different
classes (labels) meet. The RBF decision boundary
consists of the points x such that g(x) = 0. contour(x,y,z,[0,0])
may be convenient for plotting the boundaries.
|
| Leave-one-out |
Take one point at a time from the training set and assume that
it is the "test data." Classify the point using the remaining training
points. Repeat this for all points in the training set and compute
the average error.
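
The procedure above can be sketched in Python as follows. The 1-D nearest-neighbor classifier is only a hypothetical stand-in so that the example runs; substitute the model the problem asks for. Counting a misclassified point as one error is likewise an assumption.

```python
def leave_one_out_error(points, labels, classify):
    """Average error when each point is held out once and classified
    using only the remaining training points."""
    errors = 0
    for i in range(len(points)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        prediction = classify(rest_points, rest_labels, points[i])
        errors += prediction != labels[i]
    return errors / len(points)

def nearest_neighbor(train_points, train_labels, x):
    # Hypothetical stand-in classifier (1-D nearest neighbor); replace it
    # with the model the problem asks for.
    nearest = min(range(len(train_points)), key=lambda k: abs(train_points[k] - x))
    return train_labels[nearest]

xs = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]   # hypothetical training inputs
ys = [-1, -1, -1, 1, 1, 1]               # hypothetical +/-1 labels
loo = leave_one_out_error(xs, ys, nearest_neighbor)
```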
|
| Problem 4 |
When comparing results with those from HW#1, use the mean square
error.
|
| Problem 1 |
You need to run steepest descent/conjugate gradient for 250 time steps.
|
| quadprog |
When using quadprog in MATLAB, set the upper bounds
to some large number (say 99999) instead of infinity.
|
| Code |
Source code is required for problems 1 and 4.
|
| Error |
For problems 1 and 4, please threshold your outputs to
+/- 1 to obtain g(x) (for classification), and then compute
the mean squared error mean((g(x) - y)^2).
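
A short Python sketch of this computation; the sample outputs and labels are made up for illustration, and mapping an output of exactly 0 to +1 is an assumed convention.

```python
def threshold(outputs):
    # Map each real-valued output to +1 or -1; sending exactly 0 to +1
    # is an arbitrary convention assumed here.
    return [1 if o >= 0 else -1 for o in outputs]

def mean_squared_error(g, y):
    return sum((gi - yi) ** 2 for gi, yi in zip(g, y)) / len(y)

raw_outputs = [0.3, -2.0, 0.7, -0.1]   # hypothetical model outputs
y = [1, -1, -1, -1]                    # hypothetical +/-1 labels
g = threshold(raw_outputs)
mse = mean_squared_error(g, y)         # one mistake out of four gives 4/4 = 1.0
```

With +/- 1 labels, every misclassified point contributes (+/- 2)^2 = 4, so this error is four times the misclassification rate.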
|
| Specialized packages |
The "kmeans" function in MATLAB, the "svmtrain" function, and routines of a similar nature are regarded as specialized packages, and you should not use them to solve the problems. You are allowed to use general-purpose optimization routines such as "quadprog", "minimize", etc.
|