You don't need a weatherman to know which way the wind blows.
6.1 Introduction
All trading systems have some behaviour which we would like to modify. Finding variables to help us influence this behaviour, however, is often very difficult.
Consider the case of a simple moving average crossover system. Such a system will invariably be subject to whipsaw: a signal occurs but is quickly reversed, perhaps even in the next time period. This causes the system to move in and out of trades, tying up capital and incurring transaction costs. The typical trader's solution is to implement some crossover threshold which must be exceeded before the trade is taken, but this delays entry and, in the case of a profitable outcome, costs time and money.
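The crossover-with-threshold idea can be sketched as follows. This is a minimal illustration, not a production system; the function names and the convention of expressing the threshold as a fraction of the slow moving average are assumptions made for the example. Note how, inside the threshold band, the position is simply held, which is what suppresses whipsaw at the cost of delayed entries.

```python
import numpy as np

def sma(prices, n):
    """Simple moving average; the first n-1 values are NaN."""
    out = np.full(len(prices), np.nan)
    c = np.cumsum(np.insert(prices, 0, 0.0))
    out[n - 1:] = (c[n:] - c[:-n]) / n
    return out

def crossover_positions(prices, fast=5, slow=20, threshold=0.0):
    """Go long (+1) when the fast SMA exceeds the slow SMA by more than
    `threshold` (a fraction of the slow SMA), short (-1) when it is below
    by more than `threshold`; inside the band, hold the previous position."""
    f, s = sma(prices, fast), sma(prices, slow)
    pos = np.zeros(len(prices))
    for t in range(slow, len(prices)):
        if f[t] > s[t] * (1 + threshold):
            pos[t] = 1
        elif f[t] < s[t] * (1 - threshold):
            pos[t] = -1
        else:
            pos[t] = pos[t - 1]  # inside the band: hold, avoiding whipsaw
    return pos
```

With `threshold=0.0` this is the raw crossover system; any positive threshold can only reduce (never increase) the number of position flips, since a flip now requires crossing the entire band.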
A better solution might be to look to a neural network to help solve this problem. An ANN is a universal approximator; it is capable of determining a non-linear relationship between a set of inputs and a set of outputs.
The main challenge is to find a way to express your problem so that an ANN can help provide the solution. Once this is done, the process of creating the required ANN can begin.
The main functions in building an ANN are:
- Expressing the problem to be solved in such a way that an ANN can be used to build the solution
- Partitioning data into training and testing sets
- Finding variables of influence
- Making ANN architecture choices
- Training
- Testing
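The last few steps above (partitioning, training, testing) can be sketched end-to-end with a deliberately tiny network. This is an illustrative toy, not a recipe: the circle-classification problem, the single hidden layer of 8 tanh units, and all hyperparameter values are assumptions chosen so the example is self-contained.

```python
import numpy as np

rng = np.random.default_rng(42)

# A toy non-linear problem: classify points inside vs. outside a circle.
X = rng.uniform(-1, 1, size=(400, 2))
y = (X[:, 0]**2 + X[:, 1]**2 < 0.5).astype(float).reshape(-1, 1)

# Partition the data into training (in-sample) and testing (out-of-sample).
split = int(0.75 * len(X))
X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]

# One hidden layer of 8 tanh units, sigmoid output, full-batch gradient descent.
W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return h, out

lr, n = 0.5, len(X_tr)
for epoch in range(2000):
    h, out = forward(X_tr)
    d2 = out - y_tr                      # output-layer error (cross-entropy gradient)
    d1 = (d2 @ W2.T) * (1 - h**2)        # backpropagated hidden-layer error
    W2 -= lr * (h.T @ d2) / n; b2 -= lr * d2.mean(axis=0)
    W1 -= lr * (X_tr.T @ d1) / n; b1 -= lr * d1.mean(axis=0)

# Testing: accuracy on the held-out partition only.
_, out_te = forward(X_te)
test_accuracy = float(np.mean((out_te > 0.5) == (y_te > 0.5)))
```

The point of the sketch is the discipline, not the network: the weights never see the test partition, and the only figure of merit reported is out-of-sample.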
Whilst ANNs can help solve a great many complex problems, it should be remembered that there are a number of limitations inherent in the use of neural networks. Chiefly, these concern the fact that the neural net is a black box, and that rule extraction, whilst possible in some limited circumstances, is a particularly difficult and uncertain process. Consequently, the neural net is not well suited for use as an explanatory tool.
Neural networks also tend to overfit the data if not very carefully controlled during the training process, and can find non-causal patterns in data very easily. There are no rigorous training methodologies that avoid this problem entirely. Determining a good internal structure for the network also tends to be a rather delicate process and although a number of useful guidelines exist, there are again no definite steps to success.
Despite these clear limitations with neural networks, they are still considered the tool of choice for investigating non-linear relationships amongst noisy and complex data sets.
6.2 Expressing your problem
For new ANN developers, this is perhaps the most complex part of the problem. Before we can start thinking about such matters as variables of influence, architecture and so on, we need to think about how we are going to try to solve the problem; there are often many ways to do this.
Returning our attention to the simple moving average system mentioned in the introduction to this chapter, there are several possible solutions, such as:
- Try to predict which trades will be profitable (or unprofitable),
- Try to predict which securities (or markets) are most likely to exhibit this behaviour,
- Try to find the optimal threshold setting.
Each of these different solutions will most likely have different outcome time frames, different variables of influence, and different degrees of success. For this reason, they may also require differing ANN architecture choices. There is also no reason that only one solution must be chosen; a combination of several may well prove better.
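The first framing, predicting which trades will be profitable, amounts to a labelling exercise: each historical trade becomes a training example, and the label becomes the ANN's target output. A minimal sketch, assuming hypothetical arrays of entry prices, exit prices, and trade direction:

```python
import numpy as np

def label_trades(entries, exits, side):
    """Return 1 for profitable trades and 0 otherwise.

    `entries` and `exits` are the prices at which each trade was opened
    and closed; `side` is +1 for long trades and -1 for short trades.
    The resulting labels would serve as the ANN's target outputs; the
    inputs would be whatever variables of influence are observable at
    entry time (never anything known only after the trade closes)."""
    pnl = (np.asarray(exits, float) - np.asarray(entries, float)) * np.asarray(side, float)
    return (pnl > 0).astype(int)
```

The parenthetical caveat is the important part: inputs must be restricted to information available when the trade is taken, or the network will learn a look-ahead pattern it can never exploit in live trading.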
6.3 Partitioning data
Any study involving optimisation or neural networks must at least logically separate data that will be used for training from data that will be used for testing. It is also good practice to physically separate the training data from the testing data.
It is broadly accepted within the academic community that the relationship between security prices (and returns) and the variables that drive that price (return) changes over time. In other words, the structural mechanics of the market are changing over time, and their effects on prices are also changing. For this reason, it is necessary to partition data vertically rather than horizontally.
A vertical partition of a dataset will divide the dataset into two partitions; one for training, and one for testing. Typically, the training dataset is larger, and covers a significant date range of the overall data, whilst the testing dataset is smaller, and used to provide out-of-sample confidence. These two partitions are typically known as in-sample (training), and out-of-sample (testing) partitions. Using this approach, every security should have its dataset partitioned into training and testing subsets.
The horizontal approach to partitioning splits entire datasets into either a training or a testing block. For example, horizontally partitioning ten datasets, with 60% in training and 40% in testing, would yield six entire datasets used for training and four entire datasets used for testing. Once it is recognised that the structural mechanics of the market change over time, this approach is invalid: a neural network may well learn correlations that could not have been known in chronological time, and later exploit them during the testing phase. This may well lead to higher-quality predictions, but they are clearly unrealistic.
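The two partitioning schemes can be contrasted directly. This sketch assumes each security's data is a chronologically ordered sequence; the horizontal version is included only to make the flaw concrete, since its test securities overlap in time with the training ones.

```python
import numpy as np

def vertical_partition(datasets, train_frac=0.6):
    """Split EACH security's chronologically ordered data into an
    in-sample (training) prefix and an out-of-sample (testing) suffix."""
    train, test = {}, {}
    for name, series in datasets.items():
        cut = int(len(series) * train_frac)
        train[name], test[name] = series[:cut], series[cut:]
    return train, test

def horizontal_partition(datasets, train_frac=0.6):
    """Assign ENTIRE securities to training or testing.  Shown only to
    illustrate the flawed scheme: the test securities cover the same
    dates as the training ones, so cross-sectional correlations learned
    in training leak into testing."""
    names = sorted(datasets)
    cut = int(len(names) * train_frac)
    return ({n: datasets[n] for n in names[:cut]},
            {n: datasets[n] for n in names[cut:]})
```

Under the vertical scheme every security appears in both partitions but the test dates lie strictly after the training dates; under the horizontal scheme the dates fully overlap, which is exactly the look-ahead problem described above.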