Although the term “machine learning” was coined at IBM in the late 1950s, and the methods and models that underpin machine learning applications were developed in the following decades, only since the turn of the century has it exerted significant influence outside of academia and research institutions. But once it entered the mainstream, the machine learning boom began in earnest. Over the last decade, machine learning tools have been adopted by developers, data scientists, and businesses in every sector of the economy.
Today, machine learning is ubiquitous: software built on machine learning models forecasts the weather, runs manufacturing plants, makes medical diagnoses, and recommends what you should watch this evening on Netflix. But it has had perhaps its biggest impact on trading. In 2019, worldwide funding for machine learning applications exceeded $28 billion, much of it from the financial industry, including banks, hedge funds, and startups applying machine learning technology to algorithmic trading.
What is Machine Learning?
Machine learning is a subset of artificial intelligence (AI). While the AI field studies machine intelligence more broadly, machine learning focuses on technologies that allow computers to learn from data and use what they have learned to make predictions and decisions.
Supervised machine learning trains algorithms such as regression models and decision trees on labeled training sets drawn from an area of interest. Each training example pairs input variables, known as predictors, with a known target variable, and the algorithm learns by iteratively adjusting itself to map the predictors to the target. The resulting model is then tested on validation data sets, where its predictions are compared with the known target variables to verify its accuracy. The goal is to create a model that accurately predicts the target variable when it is exposed to new, relevant predictors.
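As a minimal sketch of this train-then-validate workflow, the Python below fits a one-predictor linear regression and scores it on held-out data. The data values and function names are invented for illustration, and the fit uses closed-form least squares rather than an iterative optimizer, purely to keep the example short.

```python
from statistics import mean


def fit_linear(xs, ys):
    """Least-squares fit of y ~ slope * x + intercept for a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Average absolute gap between predictions and known targets."""
    return mean(abs(predict(model, x) - y) for x, y in zip(xs, ys))


# Hypothetical labeled data: x is the predictor, y is the target.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [3.1, 4.9, 7.2, 8.8, 11.0]
# Validation data is held back and never used for fitting.
valid_x = [6.0, 7.0]
valid_y = [13.1, 14.8]

model = fit_linear(train_x, train_y)
validation_error = mean_absolute_error(model, valid_x, valid_y)
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, validation MAE={validation_error:.3f}")
```

A low validation error suggests the model generalizes beyond the examples it was trained on. A low training error alone would not show that.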
In contrast, unsupervised learning algorithms, such as clustering algorithms and certain kinds of neural networks (autoencoders, for example), work with unlabeled data sets. They are generally used to discover hidden patterns in large volumes of data, including anomalies and associations.
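To illustrate learning without labels, here is a small sketch of k-means clustering on one-dimensional data. The values are hypothetical, and the deterministic choice of starting centroids is an assumption made so the example is reproducible. Production libraries usually pick starting points differently.

```python
from statistics import mean


def kmeans_1d(values, k=2, iterations=20):
    """Tiny 1-D k-means. Returns (centroids, assignments)."""
    if k < 2 or len(values) < k:
        raise ValueError("need k >= 2 and at least k values")
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignments


# Hypothetical unlabeled observations containing two natural groups.
observations = [0.1, 0.2, -0.1, 0.0, 0.15, 5.0, 5.2, 4.9]
centroids, assignments = kmeans_1d(observations, k=2)
print(centroids, assignments)
```

No target variable is supplied. The algorithm separates the small values from the large ones purely from the structure of the data.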
One consequence of machine learning, particularly with complex models such as neural networks, is that users often don’t know precisely how the model works, just that it does. Such a model isn’t a rulebook that can be read and understood, but a complex and multilayered set of weights and biases that produce more or less accurate results.
Machine Learning and Algorithmic Trading
Spotting patterns is the key to successful trading. Historically, traders observed market data patterns and used them to make predictions to maximize the return on their trading activities. These strategies can be expressed as a set of rules that trigger buys and sells when certain conditions are met.
Traders often search for patterns in the movement of technical trading indicators: mathematical calculations based on information about prices, volatility, and so on. For example, a moving average crossover is a straightforward trading strategy based on the understanding that a moving average indicator’s behavior relative to other moving averages can help identify trends. When a short-term moving average crosses above a long-term moving average, it is commonly read as the start of an upward trend; a cross below is read as the start of a downward trend.
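A moving average crossover rule can be sketched in a few lines of Python. The window lengths, the price series, and the function names below are illustrative assumptions, not a recommended configuration.

```python
def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def crossover_signals(prices, short_window=2, long_window=4):
    """Return (index, 'buy' | 'sell') wherever the short SMA crosses the long SMA."""
    short, long_ = sma(prices, short_window), sma(prices, long_window)
    signals = []
    for i in range(1, len(prices)):
        if None in (short[i - 1], long_[i - 1], short[i], long_[i]):
            continue  # not enough history yet
        prev_diff = short[i - 1] - long_[i - 1]
        diff = short[i] - long_[i]
        if prev_diff <= 0 < diff:
            signals.append((i, "buy"))   # short average crossed above long
        elif prev_diff >= 0 > diff:
            signals.append((i, "sell"))  # short average crossed below long
    return signals


# Hypothetical closing prices: a dip, a rally, then a decline.
prices = [10, 9, 8, 7, 8, 10, 12, 11, 9, 7]
signals = crossover_signals(prices)
print(signals)
```

With this series the rule emits a buy at index 5, as the rally pulls the short average above the long one, and a sell at index 8, once the decline reverses the relationship.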
While it’s possible to watch the market and make trades based on strategies of this type, humans are slow and inconsistent. Machines are faster and more accurate, and it’s often advantageous to encode strategies in an “if this happens, do that” algorithm for a high-frequency trading platform that can handle thousands of transactions a second.
Algorithmic trading is an improvement on manual trading, and the bulk of trades happening today are algorithmic. However, it still relies on a human being to identify relevant patterns and code an algorithm to take advantage of them. Additionally, returns from algorithmic trading have declined in recent years because of intense competition. It offers less of an advantage when everyone is doing it.
Machine learning offers several benefits over traditional, rules-based algorithmic trading. Machine learning algorithms can spot patterns in volumes of data far too large for a person to review, and they are used to find associations in historical data that can then be applied to algorithmic trading strategies. In this way, machine learning empowers traders to accelerate and automate pattern discovery, one of the most complex, time-consuming, and challenging aspects of algorithmic trading, providing a competitive advantage beyond rules-based trading.
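Before an algorithm can look for associations, historical prices have to be arranged as predictors and targets. The sketch below shows one hypothetical framing, in which the previous two returns are the predictors and the direction of the next return is the target. The prices, the lag count, and the function names are assumptions chosen for illustration.

```python
def simple_returns(prices):
    """Period-over-period fractional price changes."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def make_dataset(prices, lags=2):
    """Build (predictors, target) rows from a price history.

    predictors: the `lags` most recent returns
    target:     1 if the following return is positive, else 0
    """
    r = simple_returns(prices)
    rows = []
    for i in range(lags, len(r)):
        predictors = r[i - lags : i]
        target = 1 if r[i] > 0 else 0
        rows.append((predictors, target))
    return rows


# Hypothetical closing prices.
prices = [100.0, 102.0, 101.0, 103.0, 104.0, 102.0]
dataset = make_dataset(prices, lags=2)
for predictors, target in dataset:
    print([round(p, 4) for p in predictors], "->", target)
```

Rows in this shape can be passed to a supervised learning algorithm, which then searches for any relationship between recent returns and the next move instead of relying on a hand-written rule.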
Machine Learning and Historical Data
Data quality is critical to machine learning. Machine learning algorithms must be trained on accurate and comprehensive historical data, or they will never provide reliable predictions. Ideally, the data should be normalized, and it should include a wide range of relevant indicators for the asset you’re interested in.
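One common form of normalization is z-score scaling, which rescales a series to zero mean and unit standard deviation. The sketch below uses made-up closing prices and hypothetical helper names. It computes the scaling statistics from the training data only and reuses them for later data, so that no information from the future leaks into training.

```python
from statistics import mean, pstdev


def fit_scaler(train_values):
    """Compute the mean and population standard deviation of the training data."""
    return mean(train_values), pstdev(train_values)


def transform(values, scaler):
    """Apply z-score scaling using statistics from fit_scaler."""
    mu, sigma = scaler
    if sigma == 0:
        return [0.0 for _ in values]  # constant series: nothing to scale
    return [(v - mu) / sigma for v in values]


# Hypothetical closing prices, split chronologically.
train_closes = [100.0, 102.0, 104.0, 106.0, 108.0]
later_closes = [110.0, 104.0]

scaler = fit_scaler(train_closes)
train_scaled = transform(train_closes, scaler)
later_scaled = transform(later_closes, scaler)
print([round(v, 3) for v in train_scaled])
print([round(v, 3) for v in later_scaled])
```

Putting every indicator on a comparable scale keeps features with large raw values, such as volume, from dominating features with small ones, such as returns.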
SpiderRock’s data archives include a comprehensive selection of historical data sets suitable for training machine learning algorithms, including stocks, options and greeks, futures, volatility data, and more.