© William W. Armstrong, 1999
The chart shows how we have added a random amount of noise to the values in columns A
and B.
We generate three sets of random samples of data in this way, each set containing 500 sample points.
The first set will be used for training. We shall use it first to estimate the noise inherent in our data, and again to obtain a function that captures the information in the data samples but not the noise. This function will approximate the ideal function we started with in this experiment.
The second set will be used along with the training set to estimate the error in the data due to output noise, and again as a measure of error on a set other than the training set. Sometimes this set is called a validation set.
The third set will be used only to test the final result.
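The three-way split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ideal_function`, the input range, the Gaussian noise, and `noise_scale` are hypothetical stand-ins, since the text here does not specify the ideal function or the noise distribution, and the sketch assumes the noise is added to the output computed from A and B.

```python
import math
import random


def ideal_function(a, b):
    # Hypothetical stand-in for the ideal function of the experiment.
    return math.sin(a) + 0.5 * b


def make_sample_set(rng, n_points=500, noise_scale=0.1):
    """Return n_points rows of (a, b, noisy_output)."""
    samples = []
    for _ in range(n_points):
        # Assumed input range: A and B drawn uniformly from [0, 1].
        a = rng.uniform(0.0, 1.0)
        b = rng.uniform(0.0, 1.0)
        # Assumed noise model: zero-mean Gaussian added to the output.
        noise = rng.gauss(0.0, noise_scale)
        samples.append((a, b, ideal_function(a, b) + noise))
    return samples


rng = random.Random(0)  # fixed seed so the three sets are reproducible

training_set = make_sample_set(rng)    # fit the function, estimate noise
validation_set = make_sample_set(rng)  # estimate noise, check error off the training set
test_set = make_sample_set(rng)        # used only to test the final result
```

Drawing all three sets from the same generator keeps them independent of one another while following the same distribution, which is what makes the error on the second and third sets a fair measure of how well the fitted function generalizes.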