Answer: You can make a call just like the ones being used to test your code for Problem 2. For example:
(test-perceptron (learn-perceptron bc-data-l1 bc-data-l2) bc-data-l2)
(test-perceptron (learn-perceptron bc-data-l1 bc-data-l2) bc-data-l3)
The first example above tests the perceptron against the testing data. With the synthetic data, you'd have to use this option since I've only provided two training/testing sets. The second tests the perceptron against an independent set of data (one not used at all in learning the perceptron). If you wanted to do this with the synthetic data, you could generate your own data sets using the code provided on the A7 page.
You can additionally show how the perceptron is doing on the training data. (And the perceptron should be doing better on its training data than on other data sets.) For example:
(test-perceptron (learn-perceptron bc-data-l1 bc-data-l2) bc-data-l1)
Answer: When you learn weights for a perceptron, the number of inputs to that perceptron is fixed. You cannot use it with a data set with a different number of inputs. Therefore, if you train a perceptron with two dimensional data sets, you can only test it against two dimensional data sets. Your procedure, however, should be able to handle learning and testing perceptrons that take different numbers of inputs.
Answer: No, they are both scalars. The output of the perceptron procedure is a scalar (1 or -1), and so is the correct output in each training example.
Answer: With the perceptron learning rule, for sufficiently small alpha (and assuming no problems with numerical accuracy), the weights of the perceptron should converge to values that perfectly separate the training data, assuming the training data are linearly separable. It is not possible to guarantee that they will perfectly classify other data drawn from the same distribution. For example, suppose you have a training data set consisting of the points in input space (0,0) and (1,1) with outputs -1 and 1 respectively. There are many different hyperplanes that can separate these two points, not all of which will produce the same output for the input point (0,1) or (1,0). If your perceptron has settled to the minimum of an error function defined over the entire set of training data, then the hyperplane chosen should be "halfway" between these two points.
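The (0,0)/(1,1) example above can be made concrete with a small Python sketch. The perceptron function below is a stand-in for the Scheme procedure, using the assignment's convention of a prepended -1 threshold input; the particular weight vectors are chosen for illustration.

```python
# Two weight vectors that both separate the training points
# (0,0) -> -1 and (1,1) -> 1, yet disagree on the unseen point (0,1).

def perceptron_output(weights, inputs):
    x = [-1.0] + list(inputs)   # weights[0] is the threshold weight
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s >= 0 else -1

w_a = [0.5, 1.0, 1.0]   # hyperplane x1 + x2 = 0.5
w_b = [0.5, 1.0, 0.0]   # hyperplane x1 = 0.5

# Both classify the training data perfectly:
for w in (w_a, w_b):
    assert perceptron_output(w, [0, 0]) == -1
    assert perceptron_output(w, [1, 1]) == 1

# ...but they disagree on a point not in the training set:
print(perceptron_output(w_a, [0, 1]))   # 1
print(perceptron_output(w_b, [0, 1]))   # -1
```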
Answer: the correct output is what you are given with the training example. (How else would you get the correct output?)
Answer: The threshold value is one of the weights, so it gets updated along with all the other weights.
Answer: If you do, then the weights and the input vector are not the same length.
Answer: The training example consists of a list of N inputs and a single output (either 1 or -1). You take the N inputs and prepend a -1 to them to form the "actual" input vector. You can then take your vector of N+1 weights and the input vector and give them to the perceptron procedure, which will compute the output of a perceptron for that input. Please examine this procedure in the support code.
You then compute the error, which is the correct output value minus the value that the perceptron returns. You then update the weights by adding alpha (the learning rate) times the error times the input vector to the weight vector. One epoch consists of doing this for all examples in a training data set.
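The single-example update described above can be sketched in Python (the assignment is in Scheme; the function name and data layout here are illustrative, not the support code):

```python
def update(weights, inputs, target, alpha):
    """Apply the perceptron learning rule for one training example."""
    x = [-1.0] + list(inputs)                      # prepend -1 for the threshold weight
    s = sum(w * xi for w, xi in zip(weights, x))
    output = 1 if s >= 0 else -1
    error = target - output                        # always 0, 2, or -2
    return [w + alpha * error * xi for w, xi in zip(weights, x)]

# Misclassified example: with all-zero weights the output for (0,0) is 1,
# but the target is -1, so the threshold weight moves.
print(update([0.0, 0.0, 0.0], [0, 0], -1, 0.1))   # [0.2, 0.0, 0.0]
```

Note that when the perceptron already classifies the example correctly, the error is 0 and the weights are unchanged.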
Answer: Your perceptron-epoch procedure should return a single list of weights which are the current weights after one pass through the training data. If there are N inputs in the training data, both the start-weights and the list of weights you return should have (N+1) elements.
Answer: They are the same sort of thing, but you should use one for testing and the other for training.
Answer: Read the first bullet under "Notes, Conventions, and suggestions" in the assignment handout.
Answer: You can only take a dot product of two vectors. Alpha (the learning rate) is a scalar, and so is the error. So what the perceptron learning rule says is to update the weights by adding those scalars times the input vector.
Answer: The perceptron-epoch function should go through all the examples in the training data and apply the perceptron learning rule for each one. Be sure you go through the training examples in order because the web tester assumes this.
Answer: For each example, you use the current weights to calculate the output of the perceptron for the example inputs. The error is then the difference between the correct output value and the actual output value you computed. You then update the weights with the perceptron learning rule and go on to the next example.
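The loop just described can be sketched in Python (again, the real assignment is in Scheme; names and the (inputs, target) data layout are assumptions for illustration):

```python
def perceptron_epoch(start_weights, data, alpha):
    """One pass, in order, through data (a list of (inputs, target) pairs);
    returns the updated list of N+1 weights."""
    weights = list(start_weights)
    for inputs, target in data:
        x = [-1.0] + list(inputs)                  # threshold input
        s = sum(w * xi for w, xi in zip(weights, x))
        output = 1 if s >= 0 else -1
        error = target - output
        weights = [w + alpha * error * xi for w, xi in zip(weights, x)]
    return weights
```

Note that each example is classified with the weights as updated by the examples before it, which is why the order of traversal matters.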
Answer: See the first bullet under the "Notes, Conventions, and suggestions" section of the assignment handout. Since the threshold value is incorporated into the weights, it is learned by the perceptron along with the other weights and is therefore part of the weight vector returned by the perceptron-epoch procedure.
No, I don't think I will be providing a specific example. I would rather encourage students to make sure that they understand the algorithm they are implementing and to test it themselves. One way to test your procedure is, of course, to learn a perceptron with it; its performance should improve! You could also test your procedure using the ideas we covered in class on Thursday: see whether the value of an error function actually decreases as you take these steps in the direction of the negative gradient.