Learning Theory Research

Current Work

Data Pruning

Real-world learning problems exhibit two important phenomena. First, they may contain noisy or mislabeled data, which can mislead the learning algorithm. Second, they may contain data that are too complex, making it hard for the algorithm to extract the essence of the problem. In both cases, the performance of the algorithm degrades because of the incorrect or complex data it is given. To obtain better generalization, we want to prune such unfavorable data before launching the learning procedure. This would also help set up analysis tools in areas where data are abundant, such as computational finance.

We have found that several learning algorithms, such as the rho-Learning scenario, AdaBoost, and Support Vector Machines, can help identify unfavorable data (see our poster). We are trying to justify the data selection framework from the learning perspective, and to build useful selection tools by understanding the behavior of different learning algorithms in various environments.
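
As a rough illustration of how a boosting algorithm can point to suspicious examples, the sketch below runs a few rounds of AdaBoost with one-dimensional threshold stumps and ranks the examples by the average weight they receive. This is a toy under stated assumptions, not the selection procedure referred to above: the data set, the stump learner, the use of average weight as a suspicion score, and the names `best_stump` and `average_boosting_weights` are all hypothetical choices made for the example.

```python
import math


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, sign) of the best threshold stump.

    A stump predicts `sign` when x > threshold and `-sign` otherwise.
    """
    pts = sorted(set(xs))
    # One threshold below all points, plus midpoints between neighbours.
    thresholds = [pts[0] - 1.0] + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    best = None
    for theta in thresholds:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if (sign if x > theta else -sign) != y)
            if best is None or err < best[0]:
                best = (err, theta, sign)
    return best


def average_boosting_weights(xs, ys, rounds=3):
    """Run AdaBoost and return each example's weight averaged over rounds."""
    n = len(xs)
    weights = [1.0 / n] * n
    totals = [0.0] * n
    for _ in range(rounds):
        err, theta, sign = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        # Standard AdaBoost update: misclassified examples gain weight.
        weights = [w * math.exp(-alpha * y * (sign if x > theta else -sign))
                   for x, y, w in zip(xs, ys, weights)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
        totals = [t + w for t, w in zip(totals, weights)]
    return [t / rounds for t in totals]


# Toy data: label is -1 below 5 and +1 from 5 upward, with one flipped label.
xs = list(range(10))
ys = [-1 if x < 5 else 1 for x in xs]
ys[2] = 1  # deliberately mislabeled example (assumption of this sketch)

avg = average_boosting_weights(xs, ys, rounds=3)
suspect = max(range(len(xs)), key=lambda i: avg[i])
print("most suspicious index:", suspect)
```

On this toy set the flipped example at index 2 collects the largest average weight, so it is the first candidate for pruning. On real data, a large weight may also mark a hard but correctly labeled example, which is why the score is only a starting point for selection.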

Distributed Learning in Swarm Systems

Distributed learning is the learning process of multiple autonomous agents in a varying environment, where each agent may have only partial information about the environment and the other agents. We model the system and the individual agents, then use techniques such as reinforcement learning to find the strategy for each agent that maximizes group performance. Our experiments with the stick-pulling problem showed that agents became specialized automatically. (full report)

We went on to measure the degree of specialization that emerges from learning. Specialization is defined as the part of diversity that is driven by the need to improve performance. We observed some interesting and sometimes counterintuitive results in the generalized stick-pulling experiments.

Learning from hints

Hints are prior information known about the function to be modeled. As such, they can guide the training process and improve the accuracy of the model. Learning theory relies mainly on the concept of learning from examples. If some prior information is known, for example that the function is monotone or scale-invariant, it can be enforced using the method of learning from hints. Not only will the resulting model satisfy this prior information, but the statistical error due to noise in the data can also be reduced. (Monotonicity Hints)

Bin Model for generalization

For a learning problem, we would like to know how well our system will perform on unseen (out-of-sample) data, that is, how well our hypothesis generalizes. We have developed a model for generalization that allows us to derive a closed-form expression for the expected generalization error in terms of the error on a training set. We have addressed the problem of overfitting and have shown that it does not arise when a simple exhaustive learning algorithm is used. The results are independent of the target function, input distribution, and learning model, and have been extended to problems with noisy data sets. (full report)
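
The relationship between training error and out-of-sample error can be explored numerically with a small Monte Carlo sketch. The code below is not the closed-form expression mentioned above, which this page does not state. It assumes a finite set of hypotheses, each with a fixed out-of-sample error rate, whose mistakes on the training examples are independent. It picks the hypothesis with the lowest training error and reports the average out-of-sample error of that choice. The function name, the error rates, and the sample sizes are hypothetical.

```python
import random


def selected_out_of_sample_error(bin_errors, n_train, trials=200, seed=0):
    """Average true error of the hypothesis with the lowest training error.

    bin_errors: out-of-sample error rate of each candidate hypothesis
                (assumed known here purely for the simulation).
    n_train:    number of training examples used to rank the hypotheses.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        best_train, best_true = None, None
        for true_err in bin_errors:
            # Each training example is misclassified with probability true_err.
            mistakes = sum(rng.random() < true_err for _ in range(n_train))
            train_err = mistakes / n_train
            if best_train is None or train_err < best_train:
                best_train, best_true = train_err, true_err
        total += best_true
    return total / trials


for n in (5, 50, 500):
    estimate = selected_out_of_sample_error([0.1, 0.3, 0.5], n_train=n)
    print(n, round(estimate, 3))
```

With few training examples the selection is sometimes fooled by a poor hypothesis that happens to fit the sample, and the effect fades as the training set grows.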

Density estimation using neural networks

Estimating probability densities is an essential step in many applications, such as pattern classification and time series prediction. We are currently developing neural network models for density estimation. We show that a network whose output is uniformly distributed implements the density estimator, and we have developed an algorithm that achieves this. The resulting estimator achieves a convergence rate close to the optimal. We have also considered the problem of random variate generation; using the fact that it is dual to the density estimation problem, we develop appropriate models.

Learning hardware

The use of hardware systems for learning and pattern recognition can provide great increases in learning and execution speed. This is crucial for real-time pattern recognition and complex learning algorithms. We investigate the use of learning and evolution in hardware for automated digital circuit design. Using a field-programmable gate array as a hardware learning model, we have successfully learned small arithmetic circuits from a set of examples. We are currently applying the technique to simple pattern recognition problems and data sets with noise. (full report)

Fault diagnosis

We have addressed the problem of detecting and diagnosing faults in pumps. Because pump breakdowns are costly, early diagnosis is very important. For this particular problem, only a small data set is available because collecting data is expensive. We developed pattern-recognition techniques designed specifically for this small-data setting.

White blood cell image recognition

We have developed a neural network-based model for the classification of white blood cells. The classifier led to the development of an automated blood analysis system. The model combines image processing, neural networks, and pattern classification, and includes a novel method for using contextual information in the image.

We have developed another system for the problem of urine particle classification, for the purpose of developing an automated urine analysis system.

Updated: 04/24/2004