Use anomaly detection in your forecasts for greater accuracy
Supply chain professionals seek to accurately forecast future demand using historical data. There are many equally valid approaches to the task, but they all require good data. Put another way, regardless of forecasting method, bad data will always yield bad results. This post is about refining historical data using a Machine Learning algorithm to detect data anomalies - known as anomaly detection, or sometimes outlier detection.
As we know all too well, data deficiencies come in all shapes and sizes. For example, a given time-series dataset could suffer from technical errors, such as gaps. In addition to technical problems, data can contain anomalies. Think of anomalies as outliers. Depending on how you view time-series data, anomalies may show themselves as simple spikes or as upward or downward shifts in level. They can be local or global. Local anomalies are difficult to detect because they fall within the normal overall range of the data and only stand out against their immediate neighbors. In contrast, global anomalies are easier to spot because of their greater magnitude.
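To make the local/global distinction concrete, here is a minimal sketch in Python with made-up demand numbers. A plain z-score computed over the whole series catches the global spike but misses the local dip, while a rolling-window test catches both. The window size and thresholds are illustrative assumptions, not tuned values.

```python
import statistics

# Hypothetical weekly demand: a GLOBAL spike at index 5 (300 units)
# and a LOCAL dip at index 12 (85 units) that sits inside the
# overall range of the series.
demand = [100, 102, 98, 101, 99, 300, 100, 103, 97, 100,
          101, 99, 85, 100, 102, 98, 101, 99, 100, 102]

def global_outliers(series, z=3.0):
    """Flag points whose z-score versus the WHOLE series exceeds z."""
    mu = statistics.mean(series)
    sd = statistics.stdev(series)
    return [i for i, x in enumerate(series) if abs(x - mu) > z * sd]

def local_outliers(series, window=4, z=3.0):
    """Flag points that deviate sharply from a trailing rolling window."""
    flagged = []
    for i in range(window, len(series)):
        ctx = series[i - window:i]
        mu = statistics.mean(ctx)
        sd = statistics.stdev(ctx)
        if sd > 0 and abs(series[i] - mu) > z * sd:
            flagged.append(i)
    return flagged
```

Here the spike at index 5 inflates the global standard deviation so much that the dip at index 12 looks unremarkable globally, yet against its four nearest neighbors it is clearly anomalous.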
The most important consideration for our purposes is this: anomalies do not reflect true historical demand. Some may want to argue with that statement by pointing out that outliers are, in fact, part of the historical record. While that’s true, keep in mind that the goal is forecast accuracy. The presence of unhandled outliers hinders accurate demand forecasting. The problem is especially acute in fast-moving consumer goods (FMCG) companies where heavy, sustained promotional activity can create significant variability in order volume.
What are the options for neutralizing anomalies? At a high level there are two. Building a multi-variate model is a bigger investment in time and effort; anomaly detection and removal can provide a quicker, more cost-effective win:
Build a multi-variate model
- Model your outliers with a Machine Learning algorithm, predict their effect, and include those effects in models of future demand
- Effective, but involves considerable effort plus constant iteration and refinement
Use anomaly detection
- Remove outliers (anomalous events still need to be managed where possible)
- Much faster and less expensive
Halo recently collaborated with a very large food distributor and producer to address the company’s chronic forecast inaccuracies.
The first step was to evaluate open-source Machine Learning algorithms designed to detect and remediate anomalies. It turns out that researchers at Twitter had developed a method for anomaly detection in time-series data, released as the open-source AnomalyDetection package and built around a Seasonal Hybrid ESD test. What’s more, Twitter’s model handles both local and global anomalies. Once modified to fit the distributor’s needs, testing began.
How does it work? In short, the method strips the trend and seasonal components out of the raw historical data, exposing the outliers in what’s called the residual. The outliers are then removed and replaced with the sum of the trend and seasonal components at those points, and the cleaned data is used to generate more accurate forecasts.
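The steps above can be sketched in a few lines of Python. This is not the production algorithm the distributor used; it is a simplified stand-in for the same idea: a moving-average trend, period-mean seasonality, and a z-score test on the residual. The period, threshold, and demo numbers are all assumptions chosen for illustration.

```python
import statistics

def clean_series(series, period=4, z=2.5):
    """Simplified decompose-and-replace: estimate trend and seasonality,
    flag points with an extreme residual, and substitute the
    trend + seasonal estimate for each flagged point."""
    n = len(series)
    half = period // 2
    # 1. Trend: centered moving average spanning roughly one period.
    trend = [statistics.mean(series[max(0, i - half):min(n, i + half + 1)])
             for i in range(n)]
    # 2. Seasonality: average detrended value at each position in the cycle.
    detrended = [x - t for x, t in zip(series, trend)]
    phase_means = [statistics.mean(detrended[p::period]) for p in range(period)]
    seasonal = [phase_means[i % period] for i in range(n)]
    # 3. Residual: what remains after removing trend and seasonality.
    resid = [d - s for d, s in zip(detrended, seasonal)]
    mu, sd = statistics.mean(resid), statistics.stdev(resid)
    # 4. Replace residual outliers with trend + seasonal, i.e. the value
    #    the decomposition "expected" at that point.
    cleaned, outliers = list(series), []
    for i, r in enumerate(resid):
        if sd > 0 and abs(r - mu) > z * sd:
            outliers.append(i)
            cleaned[i] = trend[i] + seasonal[i]
    return cleaned, outliers

# Demo: quarterly-style data with a linear trend, a repeating seasonal
# pattern, and one promotional spike injected at index 9.
demand = [100 + i + [10, 0, -10, 0][i % 4] for i in range(16)]
demand[9] += 100  # the anomaly
cleaned, flagged = clean_series(demand)
```

In practice the replaced points would still be reviewed, since genuine demand shifts (a new customer, a delisting) should not be smoothed away along with true outliers.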
The customer realized that an approach such as anomaly detection could reap similar benefits, and the investment was easily justified, especially when compared with building, refining and maintaining a multi-variate model.
According to Gartner, a 1% improvement in forecast accuracy can lead to a:
● 2.4% decrease in order-to-delivery days (cycle time)
● 0.4% increase in perfect order performance (on time, in full)
● 2.7% reduction in finished goods inventory (days)
● 3.2% reduction in transportation costs (percent of sales)
● 3.9% reduction in inventory obsolescence (percent of inventory value)
Source: “Win the Business Case for Investment to Improve Forecast Accuracy,” Gartner, May 2017 (consumer goods – non-food and beverage)