© William W. Armstrong, 1999
In this presentation, I am going to show how supervised learning can produce a function defined by a set of training samples. We start from a simple function: a wave of varying amplitude and frequency. We have drawn 500 sample points at random and listed them in a Microsoft® Excel spreadsheet. Column A, on the left, holds the input, plotted on the horizontal axis of the chart. Column B, on the right, holds the output, plotted on the vertical axis.
This function gives the ideal relationship between inputs and outputs. Usually, this is not known when real-world data is involved. All we normally have is a description of the ideal function through samples of inputs and outputs. Both the inputs and the outputs can contain errors, or noise.
Our first step is to make the problem more realistic by creating some noisy samples. Our goal will then be to recover an approximation of the ideal function by applying machine learning to the samples. By generating the noisy data from an ideal function as an experiment, we can examine the result of training to see how close we came to the ideal function.
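As a rough illustration of this step, here is a minimal Python sketch of generating noisy samples from a known function. The formula in ideal_function, the input range of 0 to 1, and the noise level are assumptions chosen only for illustration; they are not the function or settings used in this presentation. For simplicity the sketch adds noise to the outputs only, although in general the inputs can be noisy as well.

```python
import math
import random


def ideal_function(x):
    # Hypothetical stand-in for the ideal function: a wave whose
    # amplitude and frequency both grow as x increases.
    return (0.5 + x) * math.sin(2.0 * math.pi * (1.0 + 2.0 * x) * x)


def make_noisy_samples(n=500, noise_sd=0.1, seed=0):
    # Draw n inputs uniformly at random and add Gaussian noise to the outputs.
    # noise_sd and seed are illustrative assumptions.
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        y = ideal_function(x) + rng.gauss(0.0, noise_sd)
        samples.append((x, y))
    return samples


samples = make_noisy_samples()
```

Because the ideal function is known in this experiment, the result of training can later be compared against ideal_function directly, which is not possible with real-world data.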
In the case of a real-world problem, we have to hold back some of the sample data for the sole purpose of testing the result of training. If we test the result only against the data used in training, the measured error is likely to be smaller than the error we should expect on new data.
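The effect of holding back test data can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the data are synthetic, the 80/20 split is arbitrary, and the nearest-neighbour predictor is a deliberately simple stand-in, not the learning method used in this presentation. It is chosen because it reproduces its training data exactly, which makes the gap between training error and test error easy to see.

```python
import math
import random

rng = random.Random(1)

# Hypothetical noisy samples (x, y); the wave and the noise level are assumptions.
data = []
for _ in range(500):
    x = rng.uniform(0.0, 1.0)
    data.append((x, math.sin(6.0 * x) + rng.gauss(0.0, 0.2)))

# Hold back 20% of the samples purely for testing.
rng.shuffle(data)
n_test = len(data) // 5
test_set, train_set = data[:n_test], data[n_test:]


def predict(x):
    # Stand-in model: return the output of the training sample
    # whose input is closest to x (1-nearest-neighbour).
    nearest = min(train_set, key=lambda s: abs(s[0] - x))
    return nearest[1]


def mean_squared_error(samples):
    return sum((predict(x) - y) ** 2 for x, y in samples) / len(samples)


train_error = mean_squared_error(train_set)  # error on data seen in training
test_error = mean_squared_error(test_set)    # error on held-back data
print(train_error, test_error)
```

With this stand-in model, each training point is its own nearest neighbour, so the training error gives no indication of how the model behaves on new inputs; only the held-back set does.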