Boosting is a powerful technique for improving the performance of weak learning algorithms. Since the original boosting algorithm was proposed, various improvements have been made to it. Among them, the boosting algorithm called `AdaBoost' has proven to be effective, practical, and relatively easy to implement, and it has been used successfully in various real-world classification problems. However, it is hard to find references to results of applying the algorithm to regression problems, even though the `boosting regression algorithm' (AdaBoost.R) was developed three years ago. This report describes an implementation of the AdaBoost.R algorithm and its application to a regression problem. In this project, a neural network was used as the weak learning algorithm. At this point the results are not compelling, so the problems that lead to these results are also discussed.
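For illustration only, the following is a minimal sketch of a boosting-for-regression loop in the style of AdaBoost.R2 (Drucker's variant), not the exact AdaBoost.R procedure implemented in the report; the weak learner fit_weak_learner and the weighted-mean combination are stand-ins.
\begin{verbatim}
# Minimal sketch of boosting applied to regression (AdaBoost.R2-style
# reweighting), not the exact AdaBoost.R procedure used in the report.
# fit_weak_learner(X, y, sample_weight) is a hypothetical stand-in for
# training the neural network weak learner; it returns a predictor h.
import numpy as np

def boost_regression(X, y, fit_weak_learner, n_rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)              # distribution over training examples
    learners, betas = [], []
    for _ in range(n_rounds):
        h = fit_weak_learner(X, y, sample_weight=w)
        loss = np.abs(h(X) - y)
        loss = loss / loss.max()         # relative loss in [0, 1]
        eps = np.sum(w * loss)           # weighted average loss
        if eps >= 0.5:                   # weak learner no longer good enough
            break
        beta = eps / (1.0 - eps)
        w = w * beta ** (1.0 - loss)     # shrink weight of well-predicted points
        w = w / w.sum()
        learners.append(h)
        betas.append(beta)

    def predict(Xq):
        # For brevity, combine with a log(1/beta)-weighted mean; the published
        # algorithms combine the weak predictions with a weighted median.
        alphas = np.log(1.0 / np.array(betas))
        preds = np.array([h(Xq) for h in learners])
        return (alphas[:, None] * preds).sum(axis=0) / alphas.sum()

    return predict
\end{verbatim}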
In this paper I compare three algorithms that solve the Permutation Flow-Shop Sequencing Problem: local search, a genetic algorithm called GENITOR, and a genetic local search algorithm that combines GENITOR with local search. In the experiments I designed, each of the three algorithms was allowed the same number of evaluations, so that their performances could be compared. The results show that local search improves the performance of GENITOR, but the combination of GENITOR and local search does not perform better than simple local search.
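As a concrete illustration of the shared evaluation budget, the sketch below computes the makespan of a job permutation and runs a simple swap-neighbourhood local search limited to a fixed number of evaluations; it is not a reproduction of GENITOR, and all names and parameters are illustrative.
\begin{verbatim}
# Hedged sketch: makespan evaluation for the permutation flow-shop problem and
# a swap-neighbourhood local search under a fixed evaluation budget.
import random

def makespan(perm, proc):
    """proc[j][m] = processing time of job j on machine m; perm is a job order."""
    n_machines = len(proc[0])
    completion = [0.0] * n_machines
    for j in perm:
        for m in range(n_machines):
            start = max(completion[m], completion[m - 1] if m > 0 else 0.0)
            completion[m] = start + proc[j][m]
    return completion[-1]

def local_search(perm, proc, max_evals=10000):
    best, best_cost, evals = list(perm), makespan(perm, proc), 1
    while evals < max_evals:
        i, k = random.sample(range(len(best)), 2)
        cand = list(best)
        cand[i], cand[k] = cand[k], cand[i]      # swap two jobs
        cost = makespan(cand, proc)
        evals += 1
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost
\end{verbatim}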
In this report, the velocity picking problem is described first, and a bit-string representation and an evaluation function are given. A genetic algorithm and two local search algorithms (random local search and steepest descent local search) are then applied to search for a good solution to the velocity picking problem, and the solutions found by the three algorithms are compared. The genetic algorithm gives the best performance on average, steepest descent local search is second best, and random local search performs worst.
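For concreteness, here is a minimal sketch of the two local search baselines over a bit-string encoding; the fitness function is a placeholder where the report's velocity-picking evaluation function would be plugged in, and the function names are illustrative.
\begin{verbatim}
# Hedged sketch of the two local search baselines on a bit-string encoding.
import random

def steepest_descent(bits, fitness):
    cur, cur_fit = list(bits), fitness(bits)
    while True:
        # Evaluate the whole one-bit-flip neighbourhood, take the best neighbour.
        neighbours = []
        for i in range(len(cur)):
            cand = list(cur)
            cand[i] = 1 - cand[i]
            neighbours.append((fitness(cand), cand))
        best_fit, best = max(neighbours, key=lambda t: t[0])
        if best_fit <= cur_fit:
            return cur, cur_fit          # local optimum reached
        cur, cur_fit = best, best_fit

def random_local_search(bits, fitness, max_iters=1000):
    best, best_fit = list(bits), fitness(bits)
    for _ in range(max_iters):
        cand = list(best)
        i = random.randrange(len(cand))
        cand[i] = 1 - cand[i]            # flip one random bit
        f = fitness(cand)
        if f > best_fit:
            best, best_fit = cand, f
    return best, best_fit
\end{verbatim}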
This report shows that it is possible to train a neural network to recognize trading opportunities and theoretically trade currency futures profitably.
This paper reports on a project to train artificial neural networks (NNs) to identify profitable trading positions in Japanese Yen futures contracts. The results substantiate the hypothesis that NNs can recognize trading opportunities and, in theory, trade currency futures profitably. Three NNs were used to identify the futures contract position to hold: the first identified when to hold a long position, the second when to hold a short position, and the third when to be neutral. The outputs of these NNs were thresholded to 1 or 0 and the result was used to establish the position; in ambiguous situations the position was set to neutral. The target data for training was produced by a program which, using the benefit of looking forward with historical price data, identified profitable trading positions to hold as of the end of each trading day. The input to the NNs was created by another program using only previous or current price data. Five segments of input and target data were created, each containing the data for 100 sequential trading days. The NNs were trained on a sequence of three of the five segments; the other two segments were used to evaluate the trading results produced by the trained NNs. In the training phase, the indicators were applied as inputs and the profitable positions to hold were used as the training targets. In the evaluation phase, the indicators were applied to the inputs of the NNs and the NNs were evaluated. Their outputs were converted to the appropriate trading orders, ``Buy 1 (or 2) at Close'' or ``Sell 1 (or 2) at Close''. These trading orders were then applied to a trading simulator which returned the trades that would have resulted if the orders had actually been placed. Using the simple delayed difference between the price and the moving average, the NNs were consistently trained to produce a profit on the training data segments after 30 epochs of Resilient Backpropagation. A profit was achieved in 70% of the validation and test data segments, though the profit was not as great as it would have been if the profitable long or short position had been held. Replacing three of the inputs with an enhanced set of indicators produced substantially better results.
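A minimal sketch of how the three network outputs described above might be mapped to a single position, assuming the thresholding and the default-to-neutral rule stated in the abstract; the function name and the 0.5 threshold are illustrative assumptions, not taken from the report.
\begin{verbatim}
# Hedged sketch: combine the long/short/neutral network outputs into one
# position; any conflicting or empty vote is treated as ambiguous -> neutral.
def choose_position(long_out, short_out, neutral_out, threshold=0.5):
    votes = [int(long_out > threshold),
             int(short_out > threshold),
             int(neutral_out > threshold)]
    if votes == [1, 0, 0]:
        return "long"
    if votes == [0, 1, 0]:
        return "short"
    if votes == [0, 0, 1]:
        return "neutral"
    return "neutral"   # ambiguous situations default to the neutral position
\end{verbatim}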
This paper investigates the application of temporal-difference Q-learning to a taxi driver navigation problem. The agent uses exploration to estimate its own action-value function: during exploration, the agent departs from its policy by choosing a random action with some probability. Various fixed values of the exploration probability were investigated, and a dynamic agent that reduced its exploration probability over time was also modeled. Finally, a static agent's exploration was tested by closing one of the roads after the initial model had been learned.
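As a rough illustration, not the report's implementation, the sketch below shows tabular Q-learning with epsilon-greedy exploration and an optional decay factor corresponding to the fixed-probability and dynamic agents; the environment interface (reset, step, n_actions) is an assumption.
\begin{verbatim}
# Minimal tabular Q-learning sketch with epsilon-greedy exploration and an
# optional decay schedule. The environment interface is assumed for the sketch.
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, epsilon_decay=1.0):
    Q = defaultdict(lambda: [0.0] * env.n_actions)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if random.random() < epsilon:                      # explore
                action = random.randrange(env.n_actions)
            else:                                              # follow policy
                action = max(range(env.n_actions), key=lambda a: Q[state][a])
            next_state, reward, done = env.step(action)
            # Temporal-difference update toward the one-step bootstrapped target.
            target = reward + (0.0 if done else gamma * max(Q[next_state]))
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
        epsilon *= epsilon_decay   # < 1 mimics the dynamic agent, 1 keeps it fixed
    return Q
\end{verbatim}
With epsilon_decay = 1 this corresponds to a fixed exploration probability; with epsilon_decay < 1 it behaves like the dynamic agent that reduces exploration over time.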
Many problems in AI can be viewed as constraint satisfaction problems, in which the goal is to discover some problem state that satisfies a given set of constraints. Examples include cryptarithmetic puzzles and some real-world perceptual labeling problems. The game of Mastermind is one such problem that falls into the constraint satisfaction category. In this project, I propose an algorithm that solves the Mastermind game.
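The following is a hedged sketch of one standard constraint-satisfaction approach to Mastermind (consistency-based elimination), offered as an illustration rather than as the algorithm proposed in this project; the six-colour, four-peg setting is the usual default.
\begin{verbatim}
# Hedged sketch: keep every code consistent with all feedback so far and
# guess among the survivors until the secret is found.
from itertools import product
from collections import Counter

def score(guess, secret):
    """Return (black, white): exact matches and colour-only matches."""
    black = sum(g == s for g, s in zip(guess, secret))
    common = sum((Counter(guess) & Counter(secret)).values())
    return black, common - black

def solve(secret, colours=6, pegs=4):
    candidates = list(product(range(colours), repeat=pegs))
    guesses = 0
    while True:
        guess = candidates[0]                 # any consistent code will do
        feedback = score(guess, secret)
        guesses += 1
        if feedback[0] == pegs:
            return guess, guesses
        # Constraint step: discard every code inconsistent with the feedback.
        candidates = [c for c in candidates if score(guess, c) == feedback]
\end{verbatim}
Each round removes every candidate whose feedback against the guess would differ from the observed feedback, so the candidate set shrinks monotonically until only the secret remains.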
The purpose of this project is to control a robot arm in a continuous two-dimensional space. The objective is to use a reinforcement learning technique so that the robot arm learns to reach a goal. We use a neural network to map a state to an action. To make this problem realistic, our model contains many of the same characteristics as real-world control problems involving mechanical systems: the problem's dynamics depend on velocity and acceleration.
In this report we first describe, in Part 1, the environment in which we are working; in Part 2 we discuss the implementation of the neural network and how we apply reinforcement learning; finally, in Part 3, we discuss the results we obtained, the problems that occurred, and what could be done to improve these results.
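As a rough sketch of the kind of velocity- and acceleration-dependent dynamics mentioned above, the code below integrates joint accelerations with a simple Euler step and computes the position of the arm's tip; the names, limits, and planar forward kinematics are illustrative assumptions, not the report's code.
\begin{verbatim}
# Hedged sketch of simple arm dynamics and planar forward kinematics.
import numpy as np

def step(angles, velocities, accelerations, dt=0.05, max_velocity=1.0):
    velocities = np.clip(velocities + accelerations * dt,
                         -max_velocity, max_velocity)
    angles = angles + velocities * dt
    return angles, velocities

def end_effector(angles, link_lengths):
    """Forward kinematics of a planar arm: position of the tip in 2-D."""
    x = y = 0.0
    total = 0.0
    for theta, length in zip(angles, link_lengths):
        total += theta
        x += length * np.cos(total)
        y += length * np.sin(total)
    return x, y
\end{verbatim}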
Artificial life is a relatively new field of artificial intelligence that has been used to model living creatures. An attempt is made here to simulate a world filled with such creatures, each an entity controlled by a neural network evolved using a genetic algorithm. The results show promise that a system modeling living creatures could be created in this way.
This paper begins with a description of two methods used for handwriting recognition. The first transforms input letters into graphs that can be compared against a database of known letters and their graphs. The second employs convolutional neural networks that act as feature extractors for recognizing handwritten characters. Finally, a description of an implementation that uses a large number of neural networks to solve the handwriting recognition problem is given, along with experimental results.
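To make the feature-extractor idea concrete, here is a minimal sketch of a single convolution-and-subsample stage; the kernels, the tanh nonlinearity, and the 2x2 max pooling are illustrative choices, not the networks described in the paper.
\begin{verbatim}
# Illustrative convolution-then-subsample feature extractor sketch.
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.tanh(out)                   # squashing nonlinearity

def pool2x2(feature_map):
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2
    fm = feature_map[:h, :w]
    return fm.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def extract_features(image, kernels):
    """Stack pooled feature maps into one flat vector for a classifier."""
    return np.concatenate([pool2x2(conv2d(image, k)).ravel() for k in kernels])
\end{verbatim}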
To me, reinforcement learning (RL) is a branch of computer science that straddles the fine line between artificial intelligence and black magic. The traditional approach to solving problems with a computer begins with the programmer designing an algorithm, a plan of attack. The program then dutifully carries out the code, regardless of how well it works. In this traditional approach, the problem and its solution are very well defined: the method of solving the problem is fixed, and the computer is not allowed to improvise new solutions dynamically. In RL, however, the problem and solution are left for the computer to determine on its own. By allowing the computer to define the problem and the solution, it can often make use of information or patterns in the data that a person might not notice. In addition, since the programmer does not need to give an algorithm for finding a solution, this method can be used to tackle problems whose solutions or algorithms are so complex that a person could not easily produce them. The problem itself can even be vague: it is not necessary to give the program any information except for how well it is functioning. A small amount of feedback like ``Whatever you are doing, it's working'' or ``Don't do that, it's bad'' is plenty.
An excellent example of this is backgammon. The current grand-master-level backgammon player is an RL program \cite{tes-98}. The program was never explicitly told what constitutes a good move, nor given any strategies. However, after being rewarded with wins or punished with losses over the span of many years, it has become the best backgammon player in the world. Human players study its techniques to try to improve their own backgammon skills. It is this ability of an RL program to outgrow its programmer that is mysterious, powerful, and frightening. I believe that if a computer is ever to gain self-awareness, it will be the result of an RL technique, or of other techniques that share this ability to solve problems in ways that were not preprogrammed.
In many ways, RL appears to simulate organic learning. The concepts of positive and negative reinforcement used in RL have long been studied in biology, psychology, education, and other fields, and these ideas have been used countless times to train rats and other animals to perform simple tasks. The only real difference between carbon-based RL and silicon-based RL is the method of reinforcement: animals like food and other physical stimuli and dislike electric shocks; computers like positive numbers and dislike negative ones, or at least we can program them to have these preferences. It is my hypothesis that reinforcement learning can be used to create programs that mimic the training of simple animals to perform simple tasks.
It may be possible to use neural nets to simulate pilots flying flight simulators. If a neural net can be trained to handle a simulator the way a student pilot does, then human students will not need to be at the controls of the simulator for testing. This could speed up the design of the simulator and make it easier to debug. Additionally, it may be possible for a neural net to learn a particular style of piloting so that simulator sessions can be tailored to individual human pilots.