Simple learning algorithm The learning algorithm (tree growth option) starts with a single linear function piece, which fits the data as linear regression does. If the error is too high, the piece splits into two, each new piece fitting only some of the sample points. The two pieces are joined by taking either the minimum or the maximum of their values. Splitting continues, forming a tree, but a piece is never split if its error is within what noise in the data could cause. Estimating the noise in the data is therefore a first step in solving the problem (see below). This is a new, simple and effective technique for achieving good generalization.
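To make the structure concrete, here is a minimal sketch of how a tree of minimum and maximum operators with linear pieces at the leaves could be evaluated. It covers evaluation only, not training or splitting, and the names Linear and Node are hypothetical, chosen for illustration rather than taken from any ALN software.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class Linear:
    """A leaf: one linear piece, bias + sum(w_i * x_i)."""
    weights: Sequence[float]
    bias: float

    def value(self, x: Sequence[float]) -> float:
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))


@dataclass
class Node:
    """An interior node joining its children by "min" or "max"."""
    op: str
    children: List[Union["Node", Linear]]

    def value(self, x: Sequence[float]) -> float:
        vals = [child.value(x) for child in self.children]
        return min(vals) if self.op == "min" else max(vals)


# Two pieces joined by a maximum: max(x, -x), i.e. |x|.
v_shape = Node("max", [Linear([1.0], 0.0), Linear([-1.0], 0.0)])

# Two pieces joined by a minimum: min(x, 2 - x), a tent peaking at x = 1.
tent = Node("min", [Linear([1.0], 0.0), Linear([-1.0], 2.0)])
```

Because every node is a minimum or maximum of continuous functions, the result is continuous wherever the pieces meet.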
Automatic choice of architecture You no longer have to worry about whether the architecture you choose will allow your network to produce a good result. By (optionally) growing a tree of minimum and maximum operators with linear functions at the leaves, and controlling the splitting of pieces with too high an error, the architecture fits the data by construction. 
Simple form of results The result of ALN training is a function made up of linear pieces connected into a continuous function surface.  One goal is to keep the number of pieces small, so noise is not fitted.  A result made up of a small number of linear pieces is easy to analyze and check compared to more complicated expressions. You don't have to fear unexpected results if your application is safety-critical -- you don't have to evaluate it on an astronomical number of points -- you just check all the linear pieces of the result carefully.
Smooth results The linear pieces have quadratic fillets that fill in the corners between them smoothly.  You control how much smoothing is used.  The deviation from the continuous piecewise linear function is bounded by a value you set.
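One way a quadratic fillet with a bounded deviation can work is sketched below for the maximum of two values. The exact fillet used by ALN software is not given here, so the formula and the names smooth_max, smooth_min and w are assumptions made for illustration. In this form the smoothed value never falls below the true maximum and exceeds it by at most w/2.

```python
def smooth_max(a: float, b: float, w: float) -> float:
    """Maximum of a and b with a quadratic fillet of half-width w.

    Writing max(a, b) = s + |t| with s = (a + b)/2 and t = (a - b)/2,
    the corner |t| is replaced by t*t/(2*w) + w/2 whenever |t| < w.
    The parabola matches |t| in value and slope at |t| = w.
    """
    s = 0.5 * (a + b)
    t = 0.5 * (a - b)
    if w <= 0.0 or abs(t) >= w:
        return s + abs(t)          # away from the corner: exact maximum
    return s + t * t / (2.0 * w) + 0.5 * w


def smooth_min(a: float, b: float, w: float) -> float:
    """Minimum with the same fillet, obtained by symmetry."""
    return -smooth_max(-a, -b, w)
```

Setting w to twice the tolerance you want gives a smoothed function that stays within that tolerance of the piecewise linear one.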
Real-time evaluation speed After training, the input space can be partitioned into boxes within each of which only a few linear pieces are active.  This is called an ALN decision tree. Most linear pieces don't have to be evaluated to compute a specific output.  The input lies in a certain box, and only the linear pieces touching that box need to be considered. Computations which don't have to be done are omitted.
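The idea of discarding pieces that cannot be active in a box can be illustrated in one dimension for a function that is the maximum of several lines. This is a simplified sketch under stated assumptions, not the ALN decision tree construction itself: the function active_pieces and the dominance test are hypothetical, and the test is conservative (it may keep a piece that is never actually active).

```python
def active_pieces(pieces, lo, hi):
    """Indices of lines that cannot be ruled out on the interval [lo, hi].

    pieces is a list of (slope, intercept) pairs for f(x) = max_i(slope*x + intercept).
    A line is dropped when one other line lies strictly above it at both
    ends of the interval; since the difference of two lines is linear,
    it is then strictly above throughout the interval.
    """
    def val(p, x):
        return p[0] * x + p[1]

    keep = []
    for i, p in enumerate(pieces):
        dominated = any(
            j != i and val(q, lo) > val(p, lo) and val(q, hi) > val(p, hi)
            for j, q in enumerate(pieces)
        )
        if not dominated:
            keep.append(i)
    return keep


def evaluate(pieces, indices, x):
    """Evaluate the maximum over only the listed pieces."""
    return max(pieces[i][0] * x + pieces[i][1] for i in indices)


# f(x) = max(-x, 0.5, x)
pieces = [(-1.0, 0.0), (0.0, 0.5), (1.0, 0.0)]
right_box = active_pieces(pieces, 2.0, 3.0)       # only the line x survives
middle_box = active_pieces(pieces, -0.25, 0.25)   # only the constant 0.5 survives
```

For an input known to lie in a box, evaluating only the surviving pieces gives the same output as evaluating all of them.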
Scalability The method of omitting computations described under "real-time evaluation speed" becomes increasingly efficient as the size of the problem grows larger.  In the limit of very large problems, this is far better than "massive parallelism" using fast hardware.
Non-normalized data The learning algorithms are invariant to translation and scaling, so you don't have to normalize your data.  This may not be the case with the neural networks you are currently using; e.g., some backpropagation algorithms are not invariant in this way, so they require normalized data.
Localized sensitivity analysis A certain input variable may affect the output to a greater or lesser extent depending on the values of all input components.  You can analyze linear pieces, on which all sensitivities (which are just the weights) are constant.  This helps to interpret your result and gain value from it, say in areas like data mining.
Control of sensitivities If you know the sensitivities (partial derivatives) of your result will always be within certain bounds, you can enforce those bounds and make sure your result has properties you know it should have. This is simply control of the bounds on weights of linear pieces, and is conserved when fillets smooth the piecewise linear solution.
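Since the sensitivities of a linear piece are its weights, bounding them amounts to keeping each weight inside an interval. The sketch below shows one simple way to do that: an ordinary gradient step on a single piece followed by clamping. This is an illustrative assumption, not the training rule used by ALN software, and the names clamp_weights and constrained_step are hypothetical.

```python
import math


def clamp_weights(weights, bounds):
    """Force each weight into its (lower, upper) interval."""
    return [min(max(w, lo), hi) for w, (lo, hi) in zip(weights, bounds)]


def constrained_step(weights, bias, x, y, rate, bounds):
    """One squared-error gradient step on a linear piece, then clamp.

    Assumed update, for illustration only.
    """
    err = bias + sum(w * xi for w, xi in zip(weights, x)) - y
    new_weights = [w - rate * err * xi for w, xi in zip(weights, x)]
    return clamp_weights(new_weights, bounds), bias - rate * err


# First input: output must not decrease (weight >= 0).
# Second input: sensitivity must stay between -1 and 0.
bounds = [(0.0, math.inf), (-1.0, 0.0)]
weights, bias = constrained_step([0.1, -0.2], 0.0, [1.0, 1.0], -5.0, 0.1, bounds)
```

A lower bound of zero on a weight makes the piece non-decreasing in that variable, and an upper bound of zero makes it non-increasing.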
Control over function shape If the ideal functional relationship you seek is convex, you can constrain your result to be convex (up or down).  It is forced to conform to what you know the real world solution has to look like.
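The connection between the operators and the shape can be seen in a small sketch: a maximum of linear pieces is always convex, and a minimum of linear pieces is always concave (convex in the other direction). The function names and the sample pieces below are hypothetical, chosen only to illustrate the property.

```python
def max_of_lines(pieces, x):
    """Convex function: maximum of lines given as (slope, intercept)."""
    return max(a * x + b for a, b in pieces)


def min_of_lines(pieces, x):
    """Concave function: minimum of the same kind of lines."""
    return min(a * x + b for a, b in pieces)


def midpoint_convex(f, xs, tol=1e-12):
    """Check f((u + v)/2) <= (f(u) + f(v))/2 for all pairs of sample points."""
    return all(
        f(0.5 * (u + v)) <= 0.5 * (f(u) + f(v)) + tol
        for u in xs
        for v in xs
    )


pieces = [(-2.0, -1.0), (-0.5, 0.0), (1.0, -0.5), (3.0, -4.0)]
samples = [k * 0.25 for k in range(-12, 13)]
```

Restricting the tree to a single kind of operator is therefore enough to guarantee the shape, whatever values the weights take.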
Control over increase and decrease If you know your result must increase (or decrease) in a certain variable to be realistic, you can enforce that property during training.  This is very useful in real-world systems satisfying physical or economic laws. Some researchers have called these constraints "hints".
Estimation of noise in data One of the most difficult problems in using neural nets, or doing pattern recognition in general, is filtering the noise out of your result.  With ALNs, you estimate that noise first.  After that step, you set an output tolerance so the next training result fits the information in your data, but not the noise.  This represents a significant advance over methods which merely stop training at a certain point.  At that point in other systems, parts of the net may be overtrained and others undertrained.  With ALNs, it is the error of a piece that determines when to stop. This is illustrated in ALNBench in the case where the error is of constant magnitude. If the relative error is constant, the output should be transformed into its logarithm before training.  If the error varies in other ways, the same idea can be applied using the Dendronic Learning Engine. The above process is shown in a series of images.
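The reason for the logarithm can be shown with a short sketch: an error that is a constant fraction of the output becomes an error of constant magnitude after the transform, so a single output tolerance applies everywhere. The sample values and the 10% figure are made up for illustration, and the transform assumes all outputs are positive.

```python
import math


def to_log(outputs):
    """Transform positive outputs before training."""
    return [math.log(y) for y in outputs]


def from_log(values):
    """Map trained results back to the original scale."""
    return [math.exp(z) for z in values]


true_values = [1.0, 100.0, 10000.0]
noisy_values = [1.1 * y for y in true_values]   # constant 10% relative error

# On the original scale the error grows with the output.
abs_errors = [n - t for n, t in zip(noisy_values, true_values)]

# On the log scale the error is the same everywhere: log(1.1).
log_errors = [n - t for n, t in zip(to_log(noisy_values), to_log(true_values))]
```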
