Machine learning models are essentially a subset of mathematical models in which a computer algorithm determines the relationship between the variables, whereas traditional financial models require human intervention to do the same. However, ML models are not easy to implement in a common corporate finance context, where an adequate data set may not be available.
Financial models are commonly employed for business forecasting. But there are also many machine learning models used for predictive modelling. So, a common question I face is whether machine learning models are better than financial models.
In my view, a question like that arises from a slight misunderstanding about financial models. As I will explain later, machine learning models used in finance are just one subset of financial modeling.
But before we get to that, we first need to understand the two broad categories of financial models: (i) structural models and (ii) statistical models.
If you want to understand how a variable affects another, one way to ascertain it is based on their natural or theorized behavior. For example, if one were to ask us what would happen to total raw material cost if sales were to increase, then we can be fairly confident that it should also increase proportionally. This is because we know that is how raw material cost behaves given that it is a variable cost.
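The raw material example can be written as a one-line structural rule. The sketch below is illustrative only: the sales figure and the 40% material cost ratio are made-up assumptions, not numbers from any real company.

```python
def raw_material_cost(sales, material_cost_ratio):
    """Structural rule: raw material cost is a fixed share of sales."""
    return sales * material_cost_ratio

# Hypothetical inputs, chosen only for illustration.
base_sales = 1_000_000.0
material_cost_ratio = 0.40

base_cost = raw_material_cost(base_sales, material_cost_ratio)
# A 10% rise in sales produces a 10% rise in raw material cost.
higher_cost = raw_material_cost(base_sales * 1.10, material_cost_ratio)

print(f"Base cost: {base_cost:,.0f}")
print(f"Cost after 10% sales growth: {higher_cost:,.0f}")
```

No historical data is needed here; the relationship comes entirely from how we know a variable cost behaves.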
Some relationships may not be obvious but can be theorized. For example, the capital asset pricing model (CAPM) uses theory to explain how the expected return on a stock would change if the risk-free rate or market risk premium changes. Similarly, the Black-Scholes model uses risk-neutral pricing theory to explain how various factors such as volatility, risk-free rate, spot prices, and time to maturity affect the value of an option.
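Both theories reduce to closed-form formulas, so they are easy to sketch in code. The example below assumes the standard textbook forms: CAPM with a beta coefficient, and the Black-Scholes price of a European call on a non-dividend-paying stock. The function names and all input values are hypothetical.

```python
from math import erf, exp, log, sqrt

def capm_expected_return(risk_free_rate, beta, market_risk_premium):
    """CAPM: expected return = risk-free rate + beta * market risk premium."""
    return risk_free_rate + beta * market_risk_premium

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(spot, strike, risk_free_rate, volatility, time_to_maturity):
    """Black-Scholes value of a European call (no dividends assumed)."""
    vol_sqrt_t = volatility * sqrt(time_to_maturity)
    d1 = (log(spot / strike)
          + (risk_free_rate + 0.5 * volatility ** 2) * time_to_maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    discounted_strike = strike * exp(-risk_free_rate * time_to_maturity)
    return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)

# Hypothetical inputs, for illustration only.
expected_return = capm_expected_return(risk_free_rate=0.03, beta=1.2,
                                       market_risk_premium=0.05)
call_value = black_scholes_call(spot=100.0, strike=100.0, risk_free_rate=0.05,
                                volatility=0.20, time_to_maturity=1.0)
# Same option with higher volatility: theory says the call is worth more.
call_value_high_vol = black_scholes_call(spot=100.0, strike=100.0, risk_free_rate=0.05,
                                         volatility=0.30, time_to_maturity=1.0)

print(f"CAPM expected return: {expected_return:.2%}")
print(f"Call value at 20% volatility: {call_value:.4f}")
print(f"Call value at 30% volatility: {call_value_high_vol:.4f}")
```

In both cases the direction and size of each effect follow from the theory itself, not from fitting to past observations.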
In statistical models, the behavior of a variable is ascertained based on observation of empirical data. For example, if a company has to estimate how many defects are likely to occur on account of making workers extend their shifts, it may not be able to arrive at the number using logic or theory. However, by comparing and studying historical data of defects during normal shifts and defects during extended shifts, they may be able to arrive at a reasonable estimate. This is an example of statistical models.
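A minimal sketch of the defect example follows. The shift records are invented for illustration; a real analysis would use the company's own production logs and would likely test whether the difference is statistically significant.

```python
# Hypothetical history: (units produced, defects found) per shift.
normal_shifts = [(1000, 12), (1000, 9), (1000, 11), (1000, 8)]
extended_shifts = [(1300, 20), (1300, 23), (1300, 22)]

def defect_rate(shifts):
    """Observed defects per unit produced across a set of shifts."""
    total_units = sum(units for units, _ in shifts)
    total_defects = sum(defects for _, defects in shifts)
    return total_defects / total_units

normal_rate = defect_rate(normal_shifts)
extended_rate = defect_rate(extended_shifts)

# Estimate for a planned extended shift producing 1,300 units.
planned_units = 1300
expected_defects = extended_rate * planned_units
extra_defects = (extended_rate - normal_rate) * planned_units

print(f"Normal-shift defect rate: {normal_rate:.2%}")
print(f"Extended-shift defect rate: {extended_rate:.2%}")
print(f"Expected defects in planned shift: {expected_defects:.1f}")
print(f"Of which attributable to the extension: {extra_defects:.1f}")
```

The estimate here rests only on what was observed, which is what distinguishes it from the structural examples earlier.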
A complex integrated financial model would most often use both structural and statistical models for estimating relationships between various data points. Continuing with the examples above, a company may use theory to forecast their raw material cost and use statistical models to forecast defects to arrive at their overall cost.
Now let us understand where machine learning (ML) fits into this and how it compares with other models.
ML models vs Statistical models
In machine learning, a computer algorithm determines the relationship between various data points that have been supplied to it. Thus, it is essentially a statistical model. There are, however, certain differences in the approach to ML models vs traditional statistical models.
In the pre-ML era, statistical models required the users to identify the variables and run regressions or other statistical techniques to arrive at the relationship. Then they had to test the relationship to ensure it would hold. On account of various factors, these relationships may break or change over time. But unless someone actively reexamined the model, users were unlikely to realize that it was broken and might continue to rely on it.
However, in ML, the computer algorithm can keep re-evaluating the data and adjusting the relationship as new information comes in. This helps keep the models up to date.
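The contrast between a relationship fitted once and one that is re-estimated as data arrives can be sketched with a simple regression. This is a deliberately simplified stand-in, not a real ML algorithm: the data is synthetic, the relationship is made to change halfway through, and "retraining" is just a refit on the ten most recent points.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic data: y = 2x for the first 20 periods, then y = 3x.
xs = [float(x) for x in range(1, 41)]
ys = [2.0 * x if x <= 20 else 3.0 * x for x in xs]

# Fitted once on the early data and never revisited.
static_slope, static_intercept = fit_line(xs[:20], ys[:20])

# Refitted on a rolling window of the 10 most recent observations.
recent_slope, recent_intercept = fit_line(xs[-10:], ys[-10:])

next_x = 41.0
actual = 3.0 * next_x  # what the current relationship would produce
static_error = abs(actual - (static_intercept + static_slope * next_x))
recent_error = abs(actual - (recent_intercept + recent_slope * next_x))

print(f"Static model slope: {static_slope:.2f}, forecast error: {static_error:.2f}")
print(f"Refitted model slope: {recent_slope:.2f}, forecast error: {recent_error:.2f}")
```

The static model keeps forecasting with the old slope and nothing in it signals that the relationship has changed, which is the failure mode described above.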
Another key difference between ML and traditional statistical models is in the validation of models. In traditional models, users look for reasonability of a relationship before accepting it, no matter how strong the statistical relationship is. So, even if the math suggests that a model has a 99.9% chance of being accurate, the users may dismiss it if they cannot reasonably explain how one variable can affect the other.
On the other hand, ML models most often turn out to be what are commonly referred to as “black box” models. Since it is the algorithm that determines the relationship, it may not be easy for one person to explain it to another. This is one of the reasons why regulators are still reluctant to accept ML models.
Before we end, let us address a key concern
Can ML models replace traditional financial models?
One of the prerequisites for ML models is a large set of data points on which the machine can first be trained. We then need another large set of data points to test whether the relationship that the machine determined works correctly.
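The train-then-test idea can be sketched as a simple holdout split. The example below is a minimal illustration under stated assumptions: the 100 data points are synthetic, the 80/20 split is an arbitrary choice, and a plain least-squares line stands in for whatever model would actually be trained.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic data: y = 5 + 2x plus random noise (seeded for repeatability).
rng = random.Random(0)
data = [(float(x), 5.0 + 2.0 * x + rng.gauss(0.0, 1.0)) for x in range(100)]

# Shuffle, then hold out 20% of the points that the model never sees in training.
rng.shuffle(data)
split = len(data) * 4 // 5
train, test = data[:split], data[split:]

slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

# Judge the fitted relationship only on the held-out points.
test_mae = sum(abs(y - (intercept + slope * x)) for x, y in test) / len(test)

print(f"Trained on {len(train)} points, tested on {len(test)} points")
print(f"Fitted slope: {slope:.3f}, intercept: {intercept:.3f}")
print(f"Mean absolute error on test set: {test_mae:.3f}")
```

With only a handful of comparable observations, neither the training set nor the test set would be large enough for this check to mean much, which is the constraint the rest of this section turns on.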
So, in certain application areas where such large data sets exist, ML has already made inroads. For example, ML algorithms can be easily used to look at market prices to make buy or sell technical calls.
However, in other areas where the availability of data is a challenge, it will still be a long while before ML can be used. In the context of forecasting company financials in particular, there are many challenges even in applying traditional statistical techniques. Frequent business reorganizations, changes in regulations, and even changes in accounting and reporting standards make numbers hard to compare over time. And that means it is often not possible to obtain a large enough set of comparable data. That is why typical corporate finance models often rely on human judgment to make assumptions.
If ML is to serve such corporate finance needs, we have to explore alternate ways to forecast that do not rely on a company’s historical data. It is fair to say that those days are still reasonably far away.
Thus, while ML and AI may eventually replace the need for traditional financial models, we will most likely be using spreadsheet-based financial models for quite a long time to come.