© William W. Armstrong, 1999
Our goal now is to overtrain, that is, to reduce the error on the training set as far as possible. So we click on the Smoothing parameters and ranges button and push the slider to the left. This means that neighboring training points are not smoothed over but, where possible, are fitted exactly. We set the allowable error, also called the output tolerance, to a very small value, 0.0001. This causes linear pieces to break in two very frequently. Using the Train parameters button, we set the training to 61 epochs, that is, 61 passes through the training set. Now we click on the Charts button and then start training with the Train ALN button. We train about five times without creating a new ALN.
Paradoxically, while the training error is dropping, the test error is increasing. This is because the ALN is fitting the noise in the training set nearly perfectly, but that takes it further away from the ideal function. After training is complete we see that the training error is about 0.062 while the test error is about 0.463. About 400 linear pieces form the function, which is close to one linear piece per data sample. A graph of the ALN function shows it varying rapidly up and down within the cloud of training points.
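The same effect can be reproduced outside the ALN software. The following is a minimal sketch, not the ALN training algorithm: it stands in for the overtrained ALN with plain piecewise-linear interpolation (`numpy.interp`), which puts one linear piece between each pair of neighboring samples. The function `ideal`, the noise level `noise_sd`, and the sample count are assumptions chosen only for illustration. Unlike the ALN above, this stand-in fits the training set exactly, so its training error is zero rather than about 0.062.

```python
import numpy as np

rng = np.random.default_rng(0)

def ideal(x):
    # Hypothetical smooth "ideal function"; the tutorial's own function is not used here.
    return np.sin(2 * np.pi * x)

noise_sd = 0.36   # assumed noise level, for illustration only
n = 400           # assumed number of samples in each set

# Noisy training and test sets drawn from the same source.
x_train = np.sort(rng.uniform(0.0, 1.0, n))
y_train = ideal(x_train) + rng.normal(0.0, noise_sd, n)
x_test = rng.uniform(0.0, 1.0, n)
y_test = ideal(x_test) + rng.normal(0.0, noise_sd, n)

def rmse(a, b):
    # Root-mean-square difference between two arrays.
    return float(np.sqrt(np.mean((a - b) ** 2)))

def overfit(x):
    # Stand-in for the overtrained model: interpolate straight through every training point.
    return np.interp(x, x_train, y_train)

train_err = rmse(overfit(x_train), y_train)     # the noise is fitted exactly
test_err = rmse(overfit(x_test), y_test)        # the fitted noise hurts on new data
ideal_test_err = rmse(ideal(x_test), y_test)    # what the ideal function would score

print(f"training error:       {train_err:.3f}")
print(f"test error (overfit): {test_err:.3f}")
print(f"test error (ideal):   {ideal_test_err:.3f}")
```

The overfit model's test error should come out clearly above that of the ideal function, even though its training error is far lower.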
The test error is now divided by 1.291 (the correct value for the case of one input variable), giving about 0.36. We use this number as our estimate of the output error, that is, the noise, in the data. Any function we can create, by any process whatsoever, can be expected to have at least this error with respect to a new test set of data from the same source.
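The arithmetic can be checked in a few lines. The sketch below first divides the test error by the factor given in the text. It then runs a small simulation, which is our own assumption rather than part of the ALN software. The simulation supposes that the factor reflects a model that interpolates linearly between neighboring noisy samples; in that case the expected ratio of test error to noise level works out to sqrt(5/3), about 1.291. The sample count `n` and the use of `numpy.interp` are illustrative choices.

```python
import math
import numpy as np

test_error = 0.463   # test error of the overtrained ALN, from the text
factor_1d = 1.291    # correction factor for one input variable, from the text

noise_estimate = test_error / factor_1d
print(round(noise_estimate, 2))   # 0.36

# Assumption (not from the text): pure noise of standard deviation 1 around an
# ideal function that is identically zero, fitted by linear interpolation.
rng = np.random.default_rng(1)
n = 20000
x_train = np.sort(rng.uniform(0.0, 1.0, n))
y_train = rng.normal(0.0, 1.0, n)
x_test = rng.uniform(0.0, 1.0, n)
y_test = rng.normal(0.0, 1.0, n)

# The interpolant reproduces the training noise; measure its error on new noise.
pred = np.interp(x_test, x_train, y_train)
simulated_factor = float(np.sqrt(np.mean((pred - y_test) ** 2)))

print(f"simulated factor: {simulated_factor:.3f}")
print(f"sqrt(5/3):        {math.sqrt(5 / 3):.3f}")
```

Because the noise level in the simulation is 1, the measured test error is itself the ratio of test error to noise. It should land close to 1.291.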