AI in Finance

Machine Learning for Financial Forecasting: When It Beats Spreadsheets

9 February 2026

Machine learning is not always better than a well-built financial model. The belief that ML automatically produces better forecasts is one of the most expensive misconceptions in finance AI, and it has driven a significant number of investment decisions that did not deliver.

The organisations that have seen genuine, measurable improvement in forecasting accuracy from ML are real. The 20 to 35% improvement in demand forecast accuracy that well-implemented ML delivers in high-volume ecommerce environments is not marketing fiction. Neither is the 15 to 25% improvement in cash flow forecast precision that mature ML forecasting delivers in businesses with sufficient historical data. Those results are achievable in the right conditions.

The question is what those conditions actually are. The same ML investment that transforms forecasting in one business produces confident-sounding wrong answers in another. Understanding the distinction is the difference between an AI forecasting investment that pays back and one that does not.


When machine learning beats traditional forecasting

ML creates value in financial forecasting in specific circumstances, and those circumstances share a common characteristic: the forecasting problem is too complex for a human-built model to solve well, and sufficient historical data exists for an ML model to learn the patterns.

High-volume data with patterns too complex for manual model building. A business forecasting demand across 5,000 SKUs, with each SKU influenced by dozens of variables including seasonality, promotional uplift, competitive dynamics, and supply constraints, faces a forecasting problem that a spreadsheet model cannot solve well. Not because the modeller lacks skill, but because the dimensionality of the problem exceeds what can be captured in a manually built model and maintained at the frequency the business needs.

ML handles this dimensionality naturally. It identifies which variables are predictive for which SKUs, learns the interaction effects between those variables, and updates its understanding as new data arrives. Demand forecasting implementations in ecommerce businesses at this scale typically improve forecast accuracy by 20 to 35% compared to statistical methods. The improvement in inventory efficiency and stock availability that follows is usually where the ROI is measured.

Cash flow forecasting with multiple interacting input variables. Cash flow forecasting in a business with complex receivables cycles, variable supplier payment patterns, seasonal working capital requirements, and multiple revenue streams is a problem where ML can improve precision materially. When there are dozens of variables influencing cash timing and their interactions are non-linear, ML identifies patterns that manual model-building misses. Businesses with three or more years of consistent historical cash flow data and rich transaction-level detail have seen 15 to 25% improvements in 13-week cash flow forecast precision from ML implementations.

Non-linear relationships that spreadsheet models flatten. Traditional financial models assume linear or log-linear relationships between variables. Many real business relationships are not linear. The relationship between marketing spend and revenue has threshold effects, diminishing returns, and timing lags that vary by channel and customer segment. A spreadsheet model captures an average relationship. An ML model learns the actual shape of the relationship. In businesses where the non-linearity is significant and the data to learn from is available, this matters.
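To make the flattening concrete, here is a minimal sketch with synthetic data, assuming a hypothetical diminishing-returns relationship of revenue proportional to log(1 + spend). Fitting a straight line to that concave curve produces exactly the averaging the paragraph describes: the line is roughly right in the middle of the spend range and systematically wrong at both extremes.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b * x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - b * mx, b

# Hypothetical diminishing-returns relationship: revenue ~ 100 * log(1 + spend).
spend = list(range(1, 21))
revenue = [100 * math.log(1 + s) for s in spend]

a, b = linear_fit(spend, revenue)

# The straight line captures an "average" slope: it under-forecasts in the
# middle of the range and over-forecasts at both extremes of spend.
residual_low = revenue[0] - (a + b * spend[0])     # negative: line too high
residual_mid = revenue[9] - (a + b * spend[9])     # positive: line too low
residual_high = revenue[-1] - (a + b * spend[-1])  # negative: line too high
```

A model that learns the actual shape of the curve removes that systematic error; a linear model can only choose where to be wrong.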

Seasonal patterns that shift over time. Traditional seasonal adjustment applies a fixed seasonal index derived from historical averages. If your seasonal patterns are shifting because of changing customer behaviour, new channel mix, or macroeconomic conditions, a fixed seasonal index becomes increasingly wrong. ML models that update their seasonal pattern estimates as new data arrives adapt to shifting patterns in ways that static seasonal adjustment does not.
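The difference between the two approaches can be sketched in a few lines of plain Python. The function names, the smoothing weight, and the deseasonalised level input are illustrative, not from any specific library: the fixed index is frozen at the historical average, while the adaptive version blends each newly observed seasonal factor into the stored one.

```python
def fixed_seasonal_index(history, period=12):
    """Classic fixed index: mean of each season divided by the overall mean."""
    overall = sum(history) / len(history)
    return [
        (sum(history[s::period]) / len(history[s::period])) / overall
        for s in range(period)
    ]

def update_seasonal_index(index, season, actual, level, alpha=0.3):
    """Adaptive index: blend the newly observed seasonal factor into the
    stored one as each actual arrives, instead of freezing the history."""
    observed = actual / level  # level = current deseasonalised estimate
    index[season] = (1 - alpha) * index[season] + alpha * observed
    return index
```

With alpha at zero the adaptive version collapses back to a fixed index; the higher alpha is set, the faster the index chases a shifting pattern, at the cost of more noise.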


When a spreadsheet is better

There are conditions where a well-built spreadsheet model will outperform ML. This is not a failure of the technology. It is a characteristic of the environment.

Short time series. ML models learn from data. If you have less than two years of consistent historical data, most ML forecasting approaches do not have enough to learn from. They will overfit to the specific patterns in your limited history and perform poorly out of sample. Traditional statistical methods and well-structured financial models tend to be more robust with short time series because they incorporate prior knowledge about the structure of the problem rather than learning everything from data.

Two years is a minimum. Three years is comfortable. For seasonal businesses, four years gives you enough cycles to learn seasonal patterns with confidence. Businesses that have gone through significant structural changes (an acquisition, a major product launch, a channel restructuring) have effectively reset their historical time series for forecasting purposes. Post-change data is the relevant data. Pre-change data may mislead.

Businesses where external shocks dominate. ML models learn patterns from history. They cannot learn from events that have not yet happened. Brexit, a pandemic, a major customer failure, a regulatory change that restructures a market: these are external shocks that a model trained on pre-shock data cannot predict or prepare for.

If your business operates in an environment where external events have driven large, unpredictable step-changes in your financials with some regularity, the historical patterns your ML model learns will be systematically disrupted by the next shock. A simpler model, explicitly built to accommodate scenario analysis and management override, often performs better in this environment because it makes the assumptions explicit rather than encoding them as learned historical patterns.

Processes where the forecast needs to be explainable. A board or an investment committee that does not trust black boxes is not a problem with the people. It is a legitimate governance concern. A forecast that informs a capital allocation decision needs to be explainable. A CFO who presents a revenue forecast and cannot explain the reasoning behind any specific number because it emerged from a model they do not fully understand is in a difficult position.

If the forecast serves a decision-making audience that requires transparency into the assumptions, a well-structured financial model where every assumption is explicit and reviewable is often the right tool, regardless of what an ML model might achieve on raw accuracy. The value of the forecast is not only in the number. It is in the discussion and scrutiny the number enables.

Contexts where the discussion is the value. The monthly management accounts are not only a source of numbers. They are a mechanism for forcing the leadership team to confront assumptions, explain variances, and update their understanding of the business. A forecast that arrives as a model output rather than as a product of deliberate management assumptions changes the nature of that conversation. Not always for the worse, but the change is worth being intentional about.


Data requirements for ML forecasting

The data requirements for ML financial forecasting are specific. Not meeting them is the most common reason ML forecasting investments underperform.

Minimum three years of consistent, clean historical data. Consistent means the same definitions applied throughout: the same revenue recognition approach, the same cost centre structure, the same entity consolidation scope. If your business has changed its accounting policies, restructured entities, or changed its ERP during the period, the historical data is inconsistent and the model will learn a mixture of different regimes rather than a coherent underlying pattern.

Clean means the data quality dimensions are met: complete, accurate, consistent, timely, and with traceable lineage. An ML model trained on data with significant missing values, inconsistencies, and inaccuracies learns those imperfections. It does not average around them.

Defined feature variables with reliable sourcing. ML forecasting models use input variables: macroeconomic indicators, weather data, promotional calendars, headcount, capacity utilisation, whatever is predictive in your specific context. Each of those variables needs to be defined clearly, sourced reliably, and available on the same schedule as the forecast update cycle. If you want to update your cash flow forecast weekly, every input variable needs to be available weekly. If a key input is only updated monthly, the forecast can only be as current as its least current input.

A validation methodology to measure forecast accuracy against actuals. This is the most commonly absent element. Many ML forecasting implementations are deployed without a rigorous framework for measuring whether they are actually more accurate than the previous approach. Without this, you cannot tell whether the investment is working. Forecast accuracy should be measured using defined metrics: mean absolute percentage error (MAPE), bias (systematic over- or underforecasting), and accuracy within defined tolerance bands. These should be calculated and reviewed monthly for at least the first year of production.
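The three metrics named above are straightforward to compute against each period's actuals. A minimal sketch, where the function name and the tolerance default are illustrative rather than taken from any particular tool:

```python
def forecast_accuracy(actuals, forecasts, tolerance=0.05):
    """MAPE, bias, and the share of periods inside a +/- tolerance band."""
    pct_errors = [(f - a) / a for a, f in zip(actuals, forecasts)]
    n = len(pct_errors)
    return {
        "mape": sum(abs(e) for e in pct_errors) / n,
        "bias": sum(pct_errors) / n,  # > 0 means systematic over-forecasting
        "within_tolerance": sum(abs(e) <= tolerance for e in pct_errors) / n,
    }
```

Bias is the reason MAPE alone is not enough: a forecast that runs systematically 5% high can show the same MAPE as one whose errors cancel, and only the bias figure distinguishes them.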

The "cash flow forecasting that works" post covers the accuracy measurement framework in more detail for that specific forecasting domain.


Build versus buy

For most finance functions evaluating ML forecasting, the question is not just whether ML is appropriate. It is whether to build a custom model, buy a specialist forecasting AI product, or improve the statistical methods in existing tools.

Off-the-shelf forecasting AI products. A growing category of tools, from established FP&A platforms to specialist ML forecasting vendors, offer pre-built forecasting models that connect to your financial systems. These are appropriate when the forecasting problem is relatively standard and when your data is clean enough for the tool to work. The advantage is implementation speed. The disadvantage is that the model is not fully customisable and the vendor’s data handling practices need to be scrutinised carefully.

Custom ML models. Built by a data science team or a specialist vendor against your specific data and forecasting problem. Appropriate when the forecasting problem has specific characteristics that off-the-shelf tools do not handle well, or when the business has sufficient data science capability to maintain a custom model in production. Custom models take longer to build and require ongoing maintenance that off-the-shelf tools do not. The capability requirement for running a custom ML model in production is not trivial.

Improved statistical methods in existing tools. This is the option most underweighted in AI forecasting conversations. Excel, Anaplan, Adaptive Insights, and equivalent FP&A tools have statistical forecasting capabilities that have improved substantially in the last five years. Exponential smoothing, ARIMA-style decomposition, and ensemble methods are available in these tools with no ML infrastructure required. For mid-market businesses where data volume is moderate, time series are three to five years, and data quality is mixed, improved statistical methods in existing tools will often match or beat a custom ML model. The data quality constraints that limit ML performance do not limit simpler methods in the same way.
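For a sense of how simple these methods are, here is a minimal sketch of Holt's linear method (double exponential smoothing: a smoothed level plus a smoothed trend), written in plain Python; the smoothing parameters are illustrative defaults, not tuned values.

```python
def holt_forecast(series, alpha=0.5, beta=0.3, horizon=3):
    """Holt's linear method: exponentially smoothed level plus trend.

    alpha smooths the level, beta smooths the trend; the forecast
    projects both forward for `horizon` periods.
    """
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]
```

On a perfectly linear series such as 10, 20, 30, ..., the level and trend converge exactly and the forecasts extend the line. Real series need alpha and beta tuned against held-out actuals, which is exactly the validation discipline described above.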

Custom ML makes sense at a scale where data volume actually benefits from it and where the infrastructure and capability to maintain it in production exists. Below that threshold, the investment in data quality and process discipline that a well-run statistical forecasting programme requires will deliver comparable accuracy improvements at a fraction of the cost.


The decision framework

Before investing in ML forecasting, answer these questions.

Do you have three or more years of consistent, clean historical data in the forecasting domain? If not, the data preparation is the first investment.

Is the forecasting problem characterised by high volume, multiple interacting variables, or non-linear relationships? If not, a well-built statistical model is likely adequate.

Can you implement a rigorous accuracy measurement framework to validate whether ML outperforms your current approach? If not, you cannot demonstrate return on the investment.

Does your forecast audience require model transparency? If yes, explainability needs to be a vendor selection criterion, not an afterthought.

For the organisations where the answers to these questions are favourable, ML forecasting is a genuine, measurable performance improvement. For the organisations where the answers are not, the right investment is in the foundations: data quality, process discipline, and analytical capability. These investments improve forecasting performance with any method, including the well-built spreadsheet that is still, in many contexts, the right tool for the job.

The broader framework for where AI delivers in finance is set out in AI in Finance Strategy. The CFO guide to AI strategy covers how to sequence these investments against your specific finance function priorities.


Maebh Collins is a Fellow Chartered Accountant (FCA, ICAEW) with Big 4 training and twenty years of operational experience as a founder and senior finance leader. She writes about AI in finance transformation from the inside out.
