Dendronic Decisions Limited
Advanced Computing Research -- Expertise in Machine Learning
Adaptive Logic Networks Technology
Constant improvement
Adaptive Logic Network (ALN) technology [1] has been under development since 1967. Originally, Adaptive Logic Networks were feed-forward neural networks processing only logical, or Boolean, signals. The networks learned by changing the logic functions at the nodes of a binary tree.
In 1993, linear threshold units, or perceptrons, each producing a logical value representing the sign of a linear expression of real-valued inputs, were placed at the inputs to the logical tree. These linear threshold units were made adaptive instead of the logic gates, which became fixed AND and OR gates. The functions represented by this architecture were piecewise linear, with the Adaptive Logic Network computing not the function value, but a logical value indicating whether an input to the net was greater than the function value. The function value itself could be computed by a binary search technique. This form of Adaptive Logic Networks was made available on the Internet as the Atree 3.0 EK educational kit. The algorithms of the commercial version, Atree 3.0, were used, among other things, to control a mechanical system in real time [2] and had notable success in automatically controlling functional electrical stimulation to provide locomotion to a person with incomplete quadriplegia [3, 4].
In 1995, the AND and OR logic gates were replaced by MINIMUM and MAXIMUM functions. The nets now computed real-valued functions, just like backpropagation neural networks. However, because these node functions made it possible to pick out the parts of the network responsible for processing a given input sample, the "credit assignment" problem was solved for ALNs. Learning became much faster and more dependable. A bit later, the MINIMUM and MAXIMUM functions were made smoother by adding quadratic "fillets". This got rid of the sharp edges at the junctions of linear pieces.
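The credit-assignment idea can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the Dendronic implementation: the nested-tuple tree layout, the names `evaluate_tree` and `train_step`, the example weights, and the simple delta-rule update are all hypothetical. The point it shows is that a MINIMUM/MAXIMUM tree always passes exactly one linear piece through to the output, so that piece can be identified and adjusted on its own.

```python
def evaluate_tree(node, x):
    """Return (value, active_leaf) for a tree of 'min'/'max' nodes over linear leaves.

    A leaf is ("lin", [w0, w1, ..., wn]); an inner node is ("min" or "max", [children]).
    This layout is an assumption made for the sketch.
    """
    kind = node[0]
    if kind == "lin":
        w = node[1]
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)), node
    results = [evaluate_tree(child, x) for child in node[1]]
    pick = min if kind == "min" else max
    # The child that wins the MIN or MAX carries its active leaf upward.
    return pick(results, key=lambda r: r[0])


def train_step(tree, x, target, rate=0.1):
    """Adjust only the linear piece responsible for the output (assumed delta rule)."""
    value, leaf = evaluate_tree(tree, x)
    error = target - value
    w = leaf[1]
    w[0] += rate * error
    for i, xi in enumerate(x):
        w[i + 1] += rate * error * xi
    return error


# Hypothetical example: MAX(MIN(a, b), c) with one input.
piece_a = ("lin", [0.0, 1.0])   # a(x) = x
piece_b = ("lin", [2.0, -1.0])  # b(x) = 2 - x
piece_c = ("lin", [0.5, 0.0])   # c(x) = 0.5
tree = ("max", [("min", [piece_a, piece_b]), piece_c])

value, active = evaluate_tree(tree, [0.2])  # piece c is active at x = 0.2
train_step(tree, [0.2], target=1.0)         # only piece c's weights change
```

Because the other pieces are untouched by the update, a training sample cannot disturb parts of the function that did not contribute to the output for that sample.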
In 1996, much improved learning was achieved by growing the tree of nodes according to the complexity of the training data, instead of using a fixed-architecture tree specified in advance. An Adaptive Logic Network, starting with just one linear piece, would add pieces by splitting the piece with the greatest error and adding a new node to the tree. This eliminated the problem of the learning phase reaching a local minimum of error, and allowed functions of unlimited complexity to be learned easily.
The traditional problem of neural network over-training was solved by developing a criterion for splitting pieces during tree growth. If the error of a linear piece on the training set is smaller than its error on a validation set, further splitting of that piece would only fit noise, so splitting stops there.
A program ALNBench embodied this level of ALNs. Constraining the weights (partial derivatives) and functional forms (convex up, convex down) can both greatly enhance generalization. As a special case, we could force functions to be monotonic increasing or decreasing. This a priori knowledge made learning functions easier for the system and reduced the amount of data required.
The latest ALN learning algorithms use the technique called bagging (bootstrap aggregating) to get better performance on new datasets. A new graphical interface in the program ALNfit Easy made ALNs very simple to use for machine learning problems where there is adequate sample data.
One recent development for ALNs is reinforcement learning for use in innovative control applications. In contrast to supervised learning, the optimal control output associated with a given input is not always known in advance. Reinforcement learning provides a mechanism to determine appropriate control outputs for a sequence of events when the only feedback is a rough indication of success. One technique builds on top of the existing supervised learning capability by implementing a solution to Bellman's equation. This has been demonstrated on a basketball balancing task.
Synthesizing Piece-wise Linear Functions
There are two ways of looking at an ALN:
- as a network that computes a real-valued function y = f(x1, ..., xn) of n real inputs;
- as a network that computes a logical value indicating whether an inequality between y and f(x1, ..., xn) holds, given n+1 real-valued inputs including y.
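The two views are connected by the binary search mentioned earlier: if only the logical test is available, the function value can be recovered by bisecting on y. The following Python sketch shows the idea. The name `value_by_bisection`, the caller-supplied bounds `lo` and `hi`, and the convention that the test returns true when y is greater than the function value are assumptions made for illustration.

```python
from typing import Callable, Sequence


def value_by_bisection(
    is_above: Callable[[Sequence[float], float], bool],
    x: Sequence[float],
    lo: float,
    hi: float,
    tol: float = 1e-9,
) -> float:
    """Recover f(x) from a logical test is_above(x, y) == (y > f(x)).

    Assumes lo <= f(x) <= hi; these bounds are supplied by the caller.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_above(x, mid):
            hi = mid  # mid lies above the graph, so f(x) < mid
        else:
            lo = mid  # mid lies on or below the graph, so f(x) >= mid
    return 0.5 * (lo + hi)


# Hypothetical stand-in for a logical network: tests y against min(x0 + 1, 3 - x0).
def demo_is_above(x: Sequence[float], y: float) -> bool:
    return y > min(x[0] + 1.0, 3.0 - x[0])


estimate = value_by_bisection(demo_is_above, [0.5], lo=-10.0, hi=10.0)
```

Each iteration halves the interval, so the number of logical evaluations grows only with the logarithm of the required precision.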
The passage from a network producing logical values to one producing real values is achieved by converting AND and OR gates to MINIMUM and MAXIMUM nodes respectively, and converting linear threshold units to linear functions. Linear functions of the form w0 + w1 x1 + ... + wn xn are combined using MAXIMUM and MINIMUM operations to produce a continuous piecewise linear function. Placing a threshold on the output of this kind of net gives the same result as a network of simple perceptrons with AND and OR nodes. In the following we will talk about the MINIMUM and MAXIMUM approach, but please remember that thresholding at a fixed level in effect turns this into a perceptron-AND-OR net.
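The equivalence between the two kinds of network can be checked with a short Python sketch. The weights, the helper names `linear`, `real_valued_net` and `logical_net`, and the choice of "greater than or equal to the threshold t" as the perceptron test are assumptions made for illustration. Under that convention, MIN(a, b) >= t holds exactly when both a >= t and b >= t hold, and MAX(a, b) >= t holds exactly when at least one of them does.

```python
from typing import Sequence


def linear(w: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate w[0] + w[1]*x[0] + ... + w[n]*x[n-1]."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


# Hypothetical weights for three linear pieces of one input.
A = (0.0, 1.0)   # a(x) = x
B = (2.0, -1.0)  # b(x) = 2 - x
C = (0.5, 0.0)   # c(x) = 0.5


def real_valued_net(x: Sequence[float]) -> float:
    """MAX(MIN(a, b), c): the real-valued view."""
    return max(min(linear(A, x), linear(B, x)), linear(C, x))


def logical_net(x: Sequence[float], t: float) -> bool:
    """OR(AND(a >= t, b >= t), c >= t): the perceptron-AND-OR view."""
    return (linear(A, x) >= t and linear(B, x) >= t) or linear(C, x) >= t


def views_agree(x: Sequence[float], t: float) -> bool:
    """Thresholding the real-valued output matches the logical network."""
    return (real_valued_net(x) >= t) == logical_net(x, t)
```

Calling `views_agree([0.3], 0.4)` or any other input and threshold should return `True`, since both sides evaluate the same linear pieces and differ only in where the comparison with t is made.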
Taking the MINIMUM of several linear functions yields a convex-up function like the one shown in Figure 1.
Figure 1. MIN (a, b, c) defines a convex-up function.
Taking the MAXIMUM of several linear functions yields a convex-down function like that shown in Figure 2.
Figure 2. MAX (a, b, c, d) defines a convex-down function.
Any continuous function can be approximated using trees of MAXIMUM and MINIMUM operations. An example of a more complex non-linear function appears in Figure 3.
Figure 3. MAX (MIN (a, b, c), MIN (d, e, f, g)) defines a function with two bumps.
The current implementation of Adaptive Logic Networks allows the use of quadratic fillets in conjunction with MAXIMUM and MINIMUM operations when combining the functions computed by any two sub-trees. This results in continuously differentiable functions. A fillet is defined so that the deviation it causes from the MINIMUM or MAXIMUM is less than a certain prescribed amount, and the deviations from the basic piecewise linear function are bounded.
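One possible form of a quadratic fillet is sketched below in Python. The source does not give the formula used in the actual implementation, so the names `fillet_max` and `fillet_min`, the width parameter `h`, and the specific quadratic blend are assumptions. In this form the smoothed value equals the plain MAXIMUM whenever the two arguments differ by at least h, is continuously differentiable across the join, and never deviates from the plain MAXIMUM by more than h/4.

```python
def fillet_max(u: float, v: float, h: float) -> float:
    """Smoothed MAXIMUM of two values with a quadratic fillet of width h > 0.

    Equals max(u, v) when |u - v| >= h; otherwise a quadratic blend whose
    excess over max(u, v) is (|u - v| - h)**2 / (4 * h), at most h / 4.
    """
    d = u - v
    if abs(d) >= h:
        return max(u, v)
    return 0.5 * (u + v) + d * d / (4.0 * h) + h / 4.0


def fillet_min(u: float, v: float, h: float) -> float:
    """Smoothed MINIMUM, obtained by mirroring fillet_max."""
    return -fillet_max(-u, -v, h)


# At the corner itself (u == v) the deviation is largest: h / 4.
corner = fillet_max(1.0, 1.0, 0.4)    # 1.0 + 0.4 / 4
far_away = fillet_max(0.0, 1.0, 0.4)  # plain maximum, since |u - v| >= h
```

Choosing h therefore sets the prescribed bound on the deviation directly: a bound of e corresponds to h = 4e in this sketch.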
Advantages of Using Adaptive Logic Networks
One of the most important attributes of an Adaptive Logic Network is its high speed of evaluation, even if the network uses a very large number of linear pieces. The ALN is analyzed after training to divide the input space into boxes, in each of which only a small number of linear pieces are active. At each stage of the division process a threshold on an input variable is chosen so that about half of the linear pieces in a box are active on one side of the threshold, and about half on the other. This creates two boxes from one. In the smallest boxes, the number of active linear pieces that have to be evaluated can be as low as n+1, independent of the complexity of the function. Given an input vector (x1, ..., xn) for which the function f has to be evaluated, a decision tree quickly finds the box the input vector is in. Then a small, box-specific expression of just a few linear functions and MAXIMUM and MINIMUM operations is computed to get the output. This type of computation is very fast on a standard desktop or personal computer. Typically, no special hardware is required, though a limited amount could be useful if the function is very complex or the number of input variables is high.
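The box-based evaluation can be sketched in Python as a decision tree whose leaves hold small expressions. The class names `Split` and `Box`, the function `evaluate_boxes`, and the two-bump example function with its threshold at 2.5 are assumptions chosen for illustration, not the actual data structures of the implementation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Box:
    """Leaf: the small expression that is valid inside one box."""
    expression: Callable[[Sequence[float]], float]


@dataclass
class Split:
    """Internal node: compare one input variable with a threshold."""
    index: int
    threshold: float
    below: Union["Split", Box]
    above: Union["Split", Box]


def evaluate_boxes(node: Union[Split, Box], x: Sequence[float]) -> float:
    """Walk down to the box containing x, then evaluate its expression."""
    while isinstance(node, Split):
        node = node.below if x[node.index] < node.threshold else node.above
    return node.expression(x)


def full_function(x: Sequence[float]) -> float:
    """Hypothetical two-bump function MAX(MIN(a, b), MIN(c, d)) with four pieces."""
    return max(min(x[0], 2.0 - x[0]), min(x[0] - 3.0, 5.0 - x[0]))


# Left of 2.5 only pieces a and b can be active; right of it only c and d.
boxed = Split(
    index=0,
    threshold=2.5,
    below=Box(lambda x: min(x[0], 2.0 - x[0])),
    above=Box(lambda x: min(x[0] - 3.0, 5.0 - x[0])),
)
```

In this example each box evaluates two linear pieces instead of four. With many pieces and several input variables the tree is deeper, but the work done in each leaf stays small.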
We note that an Adaptive Logic Network can take training data from the output of any continuous function that is computed too slowly (e.g., some complicated mathematical formula or simulation of a trained neural net) and turn it into a fast, easy-to-understand approximation.
The piecewise linear form of the ALN result can potentially be checked for conformance to a specification using the boxes into which the space is divided by the decision tree. Each box has a small number of linear functions associated with it. Proofs of properties of the ALN become possible.
The above properties of Adaptive Logic Networks suggest they can replace other kinds of feed-forward neural networks where computing speed or safety is a concern.
References
1. W. Armstrong and M. Thomas. Adaptive Logic Networks, Handbook of Neural Computation, Emile Fiesler and Russell Beale, editors, Oxford University Press, 1996, pages C1.8:1-14.
2. B. R. Thane. Prediction of intramuscular fat in live and slaughtered beef animals through processing ultrasonic images. Thesis, Texas A&M University, College Station, Texas, 1992.
3. A. Kostov, W. Armstrong and M. Thomas. Adaptive Logic Networks in Rehabilitation of Persons with Incomplete Spinal Cord Injury, Handbook of Neural Computation, Emile Fiesler and Russell Beale, editors, Oxford University Press, 1996, pages G5.1:1-8.
4. Richard B. Stein, Aleksandar Kostov, Dejan Popovic, William W. Armstrong, Monroe Thomas. Functional Electrical Stimulation Aided Locomotion Controlled in Real Time by Artificial Neural Networks, Can. J. Physiol. Pharmacol. 73:A29, 1995 (Abstract).
Copyright © 2003 Dendronic Decisions Limited.
All rights reserved.
Date Modified: March 1, 2003.