Assignment 7 information
Announcements
- [12/2] If your learn-perceptron procedure depends on the
perceptron-epoch procedure, you must include perceptron-epoch in
your file for Problem 2. The pretester does not check for the
perceptron-epoch procedure because you are not required to use it.
- [11/30] For Problem 3 (where it says that I will provide a data
set specifically for creating a learning curve), you may use either
the B1 and B2 data from the lsd.scm file, or the bc-data-l1,
bc-data-l2, and bc-data-l3 data from the bc-data.scm file. The
synthetic data may converge so quickly that you don't get a very
wide learning curve; the breast cancer data may not converge to
100%, but it should give you a good learning curve. Either
will be fine.
- [11/30] The web testers are up for problems 1 and 2. Also see
the "Generate your own data" section below...
- [11/30] A student pointed out that I made a mistake in class on
Thursday regarding the convention of using a fixed -1 for the first
input, which makes the first weight act as the threshold value.
- [11/27] I've put together some synthetic data you can use to test
your procedures. See the support code section below and the comments
in the file.
- [11/27] Since I didn't get to perceptrons in class yesterday,
here's some information on learning perceptrons. The text is not so
forthcoming about training perceptrons, focusing instead on multilayer
networks of sigmoid units (which we'll get to), although it does
basically cover the material.
- The "basic" algorithm for training a perceptron is, for each
epoch, to take each example in turn, compute the perceptron's output
with the current weights, compute the error from the correct answer,
and then apply the perceptron learning rule to adjust the weights.
- In the perceptron learning rule, the vector "w" is the vector
of weights and the vector "I" is the vector of inputs. (Sometimes, I
may use the variable "x" instead of "I".) The rule adjusts each
weight in proportion to the error: w_i <- w_i + alpha * (T - O) * I_i,
where alpha is the learning rate, T is the correct output, and O is
the perceptron's output on the example.
- [11/26] Support code is out. I've provided one collection of
data sets; I should have at least one more out shortly.
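The "basic" algorithm described above can be sketched in Scheme. This is only an illustration, not the required interface: the procedure names (dot, perceptron-output, perceptron-epoch) and the assumption that examples are (label inputs) pairs, as in the synthetic data below, are mine.

```scheme
;; Sketch of one training epoch (illustrative, not the required interface).
;; Assumes: an example is a (label inputs) pair, weights is a list whose
;; first element is the threshold (paired with a fixed -1 first input),
;; and alpha is the learning rate.

(define (dot u v)
  (apply + (map * u v)))

(define (perceptron-output weights inputs)
  ;; Prepend the fixed -1 input so the first weight acts as the threshold.
  (if (> (dot weights (cons -1 inputs)) 0) 1 -1))

(define (perceptron-epoch weights examples alpha)
  ;; For each example in turn: compute the output with the current
  ;; weights, compute the error (target - output), and apply the
  ;; perceptron learning rule  w_i <- w_i + alpha * (T - O) * I_i.
  (if (null? examples)
      weights
      (let* ((target (first (car examples)))
             (inputs (cons -1 (second (car examples))))
             (output (perceptron-output weights (second (car examples))))
             (delta (* alpha (- target output))))
        (perceptron-epoch
         (map (lambda (w x) (+ w (* delta x))) weights inputs)
         (cdr examples)
         alpha))))
```

A learn-perceptron procedure would then call perceptron-epoch repeatedly until the weights stop changing or an epoch limit is reached.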
Support code & data sets
Generate your own synthetic data
Using the routines in the file generate-data.scm, you can
produce your own synthetic data that is guaranteed to be linearly
separable. Here's how you use it:
(define h (pick-hyperplane 2))
h
;Value: ((-.8925898031763745 -.45086965218958913) .15687006660372727)
; the format of the hyperplane is (normal offset) so that:
;
; output = 1 if the dot product of the "normal" and the "input" is
; greater than the "offset"
; -1 otherwise
(define d (lsdata h 10))
Of 10 examples, 5 are positive examples.
d
;Value: ((-1 (.8667412929548199 .17378869975659605))
(-1 (-.09185614144878473 -.1411371356122395))
(1 (-.14720896904730352 -.44910735200288165))
(1 (-.24302071105109146 .12993139718884072))
(1 (-.7104105489682528 .9972802749803293))
(1 (-.698340137603623 .8554353330464048))
(1 (-.23009130827860358 -8.647936173537651e-3))
(-1 (.5941178449361095 .7622703851227071))
(-1 (.15212484490140898 .4744542207038662))
(-1 (.17650231762847657 .12650541626674294)))
(test-perceptron (cons (second h) (first h)) d)
;Value: 1.
; i.e. it classified 100% of the examples in d correctly
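The (cons (second h) (first h)) call above converts the hyperplane's (normal offset) representation into a threshold-first weight list, matching the fixed -1 first-input convention mentioned in the announcements. A sketch of why that works (the name classify is hypothetical, not part of the support code):

```scheme
;; Sketch: classify a point using a weight list produced by
;; (cons (second h) (first h)), i.e. (offset n1 n2 ...).
;; With a fixed -1 prepended to the inputs, the activation is
;;   -offset + n1*x1 + n2*x2  =  (dot normal input) - offset,
;; so the output is 1 exactly when the dot product exceeds the offset,
;; matching the hyperplane's definition above.

(define (classify weights inputs)   ; hypothetical helper name
  (let ((activation (apply + (map * weights (cons -1 inputs)))))
    (if (> activation 0) 1 -1)))
```

So a perceptron whose first weight equals the offset and whose remaining weights equal the normal classifies the generated data perfectly, which is why test-perceptron reports 1. (100%).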