### Improving Forecasting Performance for intermittent data: A New Measurement Approach

With today’s disruptions in global supply chains as a result of the coronavirus pandemic, intermittent sales volumes, shipments and service-parts inventory levels are becoming more common across many industries, especially in retail. **Intermittent demand forecasting** for a product or service with sporadically scattered zero values in the demand data is a challenging problem. As a result, demand forecasting for a modern **consumer demand-driven supply chain** organization is becoming a more vital discipline for business planners to master.

In this article, I introduce a new way of measuring **forecasting performance** that does not have the shortcomings of the widely used **Mean Absolute Percentage Error** (MAPE) metric in this context. To keep matters simpler, I will introduce the new metric using regular trend/seasonal data. An example with intermittent data will follow this article.

Before dealing with the intermittent demand case, we will examine a forecast profile example without intermittency, a smarter starting point if you are considering adding this approach to your toolkit and want to test the methodology first. I have added the notation to the spreadsheet, so you can follow this example with your own data.

In the spreadsheet dataset, the **holdout sample** (in *italics*) is shown in the row for year 2016. A fiscal, budget or planning year does not necessarily have to start in January, so lead-time or forecasting cycle totals are shown also, as we will be using them in our calculations.

### Step 1. Preliminaries. What is a Forecast Profile?

We start with a little notation, so that you can test and follow the steps in a spreadsheet environment for yourself. Suppose we need to create forecasts over a time horizon *m*, such as a lead-time or an annual planning or budget cycle. These multi-step-ahead forecasts, created by a model, method or judgment, make up a sequence **FP** = { FP(1), FP(2), . . . , FP(*m*) }. For our example, we will assume that the forecasts are monthly, starting at time t = T + 1 and ending at time t = T + *m*. Typically, lead-times could be 2 to 6 months, the time for an order to reach an inventory warehouse. For operational and budget planning, the time horizon might be 12 months. We call this pattern a *Forecast Profile*. At the end of the forecast horizon or planning cycle, we will have a corresponding *Actual Profile* **AP** = { AP(1), AP(2), . . . , AP(*m*) } to compare with **FP**. Familiar methods include the calculation of the **Mean Absolute Percentage Error** (MAPE). In the case of intermittent demand, however, some actuals are zero, which leads to undefined terms in the MAPE.

### Step 2. Coding a Profile Into an *Alphabet Profile*

We are interested in evaluating forecasting performance or “accuracy” by considering the information in the Forecast Profile and seeing how different that is from the information in the profile of the actuals.


The performance of the process that created the Forecast Profile is of interest because we will come up with a ‘distance’ metric between the Forecast Profile and the Actual Profile that has its basis in Information Theory. As you can see in the graphs, the *alphabet* profile has the same pattern as the *historical* profile from which it was created, but it is necessary in the construction of the ‘distance’ metric.

For a given profile, the alphabet profile is a set of weights whose sum is one. That is, the *Forecast Alphabet Profile* is **FAP** = { f1, f2, . . . , fm }, where fi is calculated by simply dividing each *forecast* by the sum of the forecasts over the forecast horizon. This gives us fractions whose sum is one. The *Actual Alphabet Profile* is **AAP** = { a1, a2, . . . , am }, where ai is obtained by dividing each *actual* by the sum of the actuals over the forecast horizon. When you compare a historical profile with the corresponding alphabet profile, the pattern does not change.
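In a spreadsheet this is a single normalizing formula per cell; as a minimal sketch in Python (the function name is my own, not from the article):

```python
def alphabet_profile(profile):
    """Normalize a forecast or actual profile into weights that sum to one."""
    total = sum(profile)
    return [x / total for x in profile]

# A hypothetical 4-period forecast profile; the weights keep the same pattern
fap = alphabet_profile([100, 200, 300, 400])
# fap is [0.1, 0.2, 0.3, 0.4], and the weights sum to 1
```

Because each value is divided by the same total, the shape of the profile is preserved, only the scale changes.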

### Step 3. A Forecast Performance Metric for Multi-Step Ahead Forecasts (Forecast Profiles).

The ‘relative entropy’ or ‘divergence’ measure is being used as a performance measure in various applications in meteorology, neuroscience and machine learning. I use it here as a measure of how one forecast profile differs from another profile. We will first define the entropy H(.) as a way of measuring the amount of information in the alphabet profiles, as follows:
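In symbols, a definition consistent with the discussion that follows (units in nats, with –H non-negative) is:

```latex
H(\mathbf{FAP}) = \sum_{i=1}^{m} f_i \ln f_i , \qquad
H(\mathbf{AAP}) = \sum_{i=1}^{m} a_i \ln a_i
```

Because each alphabet weight lies between 0 and 1, every logarithm term is non-positive, so H is less than or equal to zero and –H is greater than or equal to zero.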

Thus, –H(**FAP**) can be interpreted as the amount of information associated with the *Forecast Alphabet Profile*. Similarly, –H(**AAP**) is interpreted as the amount of information associated with the *Actual Alphabet Profile*. Both quantities are greater than or equal to zero and have units of measurement called ‘nats’ (for natural logarithm). If you use logarithms to base 2, the units are in ‘bits’, something that might have a more familiar ring to it.

### A Performance Measure for Forecast Profiles

A performance measure for the forecast alphabet profile (FAP) is given by a divergence measure
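In symbols, the Kullback-Leibler form consistent with the properties described below is:

```latex
D(\mathbf{a}, \mathbf{f}) = \sum_{i=1}^{m} a_i \ln\!\left(\frac{a_i}{f_i}\right)
```

with the convention that terms with ai = 0 contribute zero, which is what makes the measure usable when the actuals are intermittent.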

This can be interpreted as a measure of *dissimilarity* or ‘distance’ between the actual and forecast alphabet profiles. The measure is non-negative and is equal to zero if and only if ai = fi (i = 1, 2, . . . , m). This happens when the alphabet profiles overlap, or what we would call 100% accuracy. I am not aware of this measure having been previously proposed in the forecasting literature or implemented in demand forecasting applications.

It should be pointed out that this measure is *asymmetrical,* in that D(a, f) is **not** equal to D(f, a). That should be evident from the formula. Also, the alphabet profiles are not discrete probability distributions, so some interpretations normally attributed to this divergence measure may not be applicable. This divergence measure is known as the **Kullback-Leibler divergence**.
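A minimal Python sketch of the divergence makes both properties easy to check (the function name and the example profiles are my own):

```python
import math

def kl_divergence(a, f):
    """Kullback-Leibler divergence D(a, f) between two alphabet profiles.

    Terms with a_i == 0 contribute zero (the limit of x*ln(x) as x -> 0),
    which is what makes the measure usable with intermittent actuals.
    """
    return sum(ai * math.log(ai / fi) for ai, fi in zip(a, f) if ai > 0)

# Matching profiles give the lower bound of zero ...
identical = kl_divergence([0.25, 0.25, 0.5], [0.25, 0.25, 0.5])  # 0.0
# ... and the measure is asymmetric: D(a, f) != D(f, a) in general
d_af = kl_divergence([0.1, 0.9], [0.5, 0.5])
d_fa = kl_divergence([0.5, 0.5], [0.1, 0.9])
```

Running this confirms that `d_af` and `d_fa` differ, illustrating the asymmetry noted above.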

### Step 4. Looking at Divergence Measures with Real Data

For the twelve-month **holdout sample**, we have created three forecasts by judgment, method and model:

- For a *judgment forecast*, we will assume last year's actuals as the forecast for the next year. This forecast profile is the Year-1 FP.
- For a *naive method*, we use the **MAVG12 forecast**, which is simply the average of the previous 12-month history. The MAVG12 method has a level profile.
- The model forecast is based on the ETS (A,A,M) model, which has a local level and multiplicative seasonal forecast profile. The model was selected in an automatic mode since the profile is deterministic. The State Space ETS Exponential Smoothing Models are described in **Chapter 8** of my **book**.

### Calculating Alphabet Profiles and the Divergence Measure

The determination of the alphabet profiles is shown in the spreadsheet along with the divergence calculations. The divergence measurements are shown in bold. Since matching profiles result in the lower bound of zero, the closer to zero the better. This suggests that the ETS forecast profile is the least divergent from the actuals of the three profiles.
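The spreadsheet calculation can be reproduced end to end in a few lines. The numbers below are hypothetical monthly values standing in for the spreadsheet data, not the article's actual dataset; the point is the workflow of normalizing, computing divergences and ranking:

```python
import math

def alphabet(profile):
    """Normalize a profile into weights that sum to one."""
    total = sum(profile)
    return [x / total for x in profile]

def divergence(actuals, forecasts):
    """Kullback-Leibler divergence between the two alphabet profiles."""
    a, f = alphabet(actuals), alphabet(forecasts)
    return sum(ai * math.log(ai / fi) for ai, fi in zip(a, f) if ai > 0)

# Hypothetical 12-month holdout, with a seasonal peak in mid-year
actuals = [80, 95, 110, 130, 150, 170, 165, 140, 120, 100, 90, 85]
ets     = [82, 93, 112, 128, 148, 172, 160, 142, 118, 102, 92, 83]  # tracks the seasonal shape
mavg12  = [120] * 12                                                # level profile
year_1  = [90, 85, 100, 125, 160, 180, 150, 130, 125, 95, 85, 80]   # last year's actuals

scores = {name: divergence(actuals, fc)
          for name, fc in [("ETS", ets), ("MAVG12", mavg12), ("Year-1", year_1)]}
best = min(scores, key=scores.get)  # lowest divergence is closest to the actuals
```

With these illustrative numbers the forecast that tracks the seasonal shape most closely comes out with the smallest divergence, mirroring the ranking described above.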

### A Comparison With the Traditional Accuracy Measurement MAPE

If we compare the results with a Mean Absolute Percentage Error (MAPE) calculation for the holdout sample, we see that the ETS model has the best MAPE. This is not to say that it will always be the case, but at least the results are consistent with the divergence measure.

However, there is a more serious issue with the MAPE that I have written about previously in an **article here**: the MAPE is not a reliable, representative or typical value for the APEs when there are unusual or zero values in the actuals. If you also calculate the **Median APE** (MdAPE), you will note that the underlying APEs are skewed, with thicker tails than prescribed by a normal distribution. Forecast practitioners have not adequately noted that the underlying distribution of the data used in a calculation with the arithmetic mean is critical. Forecasters should always quote the MdAPE along with the MAPE to avoid loss of credibility.
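The skewness is easy to demonstrate with a small hypothetical holdout in which one unusually small actual inflates a single APE (the data below is invented for illustration):

```python
import statistics

def apes(actuals, forecasts):
    """Absolute percentage errors; undefined when an actual is zero."""
    return [abs(a - f) / a * 100 for a, f in zip(actuals, forecasts)]

actuals   = [100, 110, 105, 10, 95, 100]   # one unusually small actual
forecasts = [ 98, 112, 103, 40, 97, 99]
errors = apes(actuals, forecasts)

mape  = statistics.mean(errors)    # pulled far upward by the single 300% APE
mdape = statistics.median(errors)  # resistant to the outlier, stays near 2%
```

Here the arithmetic mean is dominated by one extreme APE while the median remains a typical value, which is exactly why the two should be quoted together.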

There is an outlier-resistant measure you can use to validate the MAPE; it is called **HBB TAPE** (**T**ypical **A**bsolute **P**ercentage **E**rror) for which I have shown an example in my book, an **article here** on LinkedIn, and a blog on my **website**.

Hans Levenbach, PhD is Executive Director, **CPDF Professional Development Training and Certification Programs**. Dr. Hans created and conducts hands-on Professional Development Workshops on Demand Forecasting and Planning for multi-national supply chain companies worldwide. Hans is a Past President, Treasurer and former member of the Board of Directors of the **International Institute of Forecasters**. He is group manager of the LinkedIn groups (1) **Demand Forecaster Training and Certification, Blended Learning, Predictive Visualization**, and (2) **New Product Forecasting and Innovation Planning, Cognitive Modeling, Predictive Visualization**.

I invite you to join these groups and share your thoughts and practical experiences with intermittent data and demand forecasting in the supply chain.