### How You Can Determine the Bias in a Lead-Time Demand Forecast

In a global pandemic environment, lead-time demand forecasts become increasingly important in planning production capacities, managing product portfolios, and controlling inventory stockouts in the supply chain. In inventory planning, for example, the **lead-time demand** is the total demand between the present and the anticipated time of the delivery after the next one if a reorder is made now to replenish the inventory. This *delay* is called the **lead-time**. Since lead-time demand is a future demand (not yet observed), it needs to be forecasted. While models are designed to produce unbiased forecasts, how can we determine whether forecasts are biased or not over the desired lead-time? Bias can appear in the individual point forecasts as well as in the lead-time total.

In a **previous article**, I introduced a measure of accuracy for lead-time demand forecasts that, unlike the widely used Mean Absolute Percentage Error (**MAPE**), does not become unusable with intermittent demand. The MAPE can be a **seriously flawed accuracy** measure in general, and not just with zero demand occurrences. In **another article** and **website blog**, I posted a new way you can forecast intermittent demand when the assumption of independence between intervals and nonzero demands is not plausible, making the Croston methods inappropriate. I use information-theoretic concepts, like **KL Divergence**, in this **new approach** to intermittent demand forecasting.

### Creating the Actual Holdout Sample and Forecast Profiles

We start by defining a multi-step-ahead or lead-time forecast as a *Forecast Profile* (**FP**). A forecast profile can be created by a model, a method, or informed judgment. The forecast profile of an *m*-step-ahead forecast is a sequence **FP** = { FP(1), FP(2), . . . , FP(*m*) } of point forecasts over the lead-time horizon. For example, the point forecasts can be for hourly, daily, weekly, or more aggregated time buckets.

I have been using real-world monthly holdout (test) data starting at time t = T and ending at time t = T + *m*, where *m* = 12. Typical lead-times could be 2 months to 6 months or more, the time it takes for an order to reach an inventory warehouse from the manufacturing plant. For operational and budget planning, the time horizon might be 12 months to 18 months. This 12-month pattern of point forecasts is called a *Forecast Profile*.

At the end of the forecast horizon or planning cycle, we can determine a corresponding *Actual Profile* **AP** = { AP(1), AP(2), . . . , AP(*m*) } of actuals to compare with **FP** for an accuracy performance assessment. Familiar methods include the **M**ean **A**bsolute **P**ercentage **E**rror (MAPE). The problem, in the case of intermittent demand, is that some AP values can be zero, which leads to undefined terms in the MAPE.
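To see why zero actuals are a problem, here is a minimal Python sketch of the MAPE. The function name `mape` and the sample numbers are illustrative assumptions, not values from the spreadsheet example.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, in percent.

    Raises ZeroDivisionError when any actual is zero, because the
    term |A - F| / |A| is undefined for that period.
    """
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    terms = []
    for a, f in zip(actuals, forecasts):
        if a == 0:
            raise ZeroDivisionError("MAPE is undefined when an actual is zero")
        terms.append(abs(a - f) / abs(a))
    return 100.0 * sum(terms) / len(terms)


# Regular demand: every term is defined.
regular = mape([100, 200], [110, 180])

# Intermittent demand (made-up numbers): the zero actual breaks the measure.
try:
    mape([0, 5, 0, 12], [2, 4, 1, 10])
    intermittent_ok = True
except ZeroDivisionError:
    intermittent_ok = False
```

With the made-up regular series the function returns a percentage; with the intermittent series it cannot produce a number at all, which is the motivation for a measure based on profiles.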

Here, we approach forecasting performance by examining the biases in the Forecast Profile, using information-theoretic concepts of accuracy.

**Coding a Profile into an *Alphabet* Profile**


For a given Profile, the *Alphabet Profile* is a set of nonnegative weights whose sum is one. That is, a *Forecast Alphabet Profile* is FAP = { f(1), f(2), . . . , f(*m*) }, where f(i) = FP(i) / Sum FP(i). Likewise, the *Actual Alphabet Profile* is AAP = { a(1), a(2), . . . , a(*m*) }, where a(i) = AP(i) / Sum AP(i). The alphabet profiles are defined by:

$$f(i) = \frac{FP(i)}{\sum_{j=1}^{m} FP(j)}, \qquad a(i) = \frac{AP(i)}{\sum_{j=1}^{m} AP(j)}, \qquad i = 1, \ldots, m$$
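The coding step can be sketched in a few lines of Python. This is a minimal illustration, assuming a profile is a plain list of nonnegative numbers; the helper name `alphabet_profile` and the sample values are hypothetical.

```python
def alphabet_profile(profile):
    """Code a profile into its alphabet profile: each value divided by the total."""
    if any(x < 0 for x in profile):
        raise ValueError("profile values must be nonnegative")
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile total must be positive")
    return [x / total for x in profile]


# Made-up 4-period forecast profile with one zero period.
fp = [20.0, 0.0, 30.0, 50.0]
fap = alphabet_profile(fp)

# Doubling every point forecast changes the lead-time total
# but leaves the coded pattern unchanged.
fap_scaled = alphabet_profile([2 * x for x in fp])
```

Because the weights depend only on the shape of the profile, two profiles with different lead-time totals can share the same alphabet profile.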

But what about the *real data* profiles FP and AP shown in the lower frame below? For the spreadsheet data example, the two forecasts show different FP profiles.

**When you code a forecast profile FP into the corresponding alphabet profile FAP, you can see that the demand pattern does not change for the Year-1 method and the ETS(AAM) model**. (The models were explained in the previous articles.)

In practice, assessing the performance of a forecasting process requires metrics that are clearly defined first, so that planners and managers do not talk ‘apples and oranges’ and possibly misinterpret them. The alphabet profile is necessary in the construction of a ‘Statistical Bias’ measure I am proposing for lead-time demand forecasting for both regular and intermittent demand.

The *relative entropy* or *divergence* measure is used as a performance measure in various applications in meteorology, neuroscience, and machine learning. I use it here as a measure of how a forecast profile diverges from the actual data profile. An accuracy measure for the forecast alphabet profile (FAP) is given by a Kullback-Leibler divergence measure D(a|f), which can be readily calculated in a spreadsheet:

$$D(a|f) = \sum_{i=1}^{m} a(i) \, \ln\!\left[\frac{a(i)}{f(i)}\right]$$

For a *perfect* forecast, a(i) = f(i) for all i, so that D(a|f) = 0 and (shown below) FAP Miss = 0. Since D(a|f) can be shown to be greater than or equal to zero, zero is the best you can achieve, and it is attained with a *perfect* forecast. In all other cases, the accuracy measure D(a|f) will be greater than zero. Thus, with a *perfect* forecast the coded alphabet profiles **FAP** and **AAP** are identical and overlap with **no** bias.
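For readers who prefer code to a spreadsheet, the divergence can be sketched in Python as follows. This assumes the standard Kullback-Leibler definition with natural logarithms, takes terms with a(i) = 0 as zero by the usual convention, and returns infinity when a forecast weight is zero where the actual weight is not. The function name and the sample profiles are illustrative.

```python
import math


def kl_divergence(a, f):
    """D(a|f) = sum of a(i) * ln(a(i) / f(i)), in nats, for two alphabet profiles."""
    if len(a) != len(f):
        raise ValueError("profiles must have the same length")
    total = 0.0
    for a_i, f_i in zip(a, f):
        if a_i == 0:
            continue  # convention: 0 * ln(0 / f) contributes nothing
        if f_i == 0:
            return math.inf  # forecast puts no weight where demand occurred
        total += a_i * math.log(a_i / f_i)
    return total


# Perfect forecast of the pattern: divergence is zero.
d_perfect = kl_divergence([0.2, 0.0, 0.3, 0.5], [0.2, 0.0, 0.3, 0.5])

# Imperfect forecast of the pattern (made-up weights): divergence is positive.
d_off = kl_divergence([0.5, 0.5], [0.25, 0.75])
```

Note that zero actuals, which break the MAPE, cause no difficulty here; only a zero forecast weight against a nonzero actual weight makes the measure unbounded.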

Now, a **Forecast Alphabet Profile Miss** or Bias is shown in the formula above. In information-theoretic terms, the summation terms are known as ‘entropies’ and will be interpreted as information about AAP and FAP, respectively. The Miss or Bias is the difference between the two entropies. The unit of measurement is ‘*nats*’ because I am using natural logarithms (“logarithms to the base e”) in the formula. In information theory and climatology applications, it is more common to use “logarithms to the base 2”, so the units of measurement are then ‘*bits*’. The meaning of bias remains the same. By substituting for a(i) and f(i) in the logarithm terms in the above formula, we can show that

With a *perfect* forecast of the alphabet profile, the **AP** and **FP** patterns are *identical* in shape but need not overlap, because the patterns may not have the same *lead-time totals*. The profiles would then be parallel and thus show a bias. Thus, the formula below is the desired bias term between **FAP** Miss and **FP** Miss, shown in the spreadsheet above in the last column (in **bold**). You can verify that when the total lead-time forecast = total lead-time actuals, then ln (1) = 0 and the Bias = 0. In the holdout sample, the ETS model had the largest overall Bias; however, it had fewer and smaller swings of over- and under-forecasting in the point forecasts, so point forecast bias and lead-time demand total bias are both important.
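The ln (1) = 0 check can be tried in a short Python sketch. The bias formula itself is not reproduced in this text, so the sketch assumes the lead-time total bias is the natural log of the ratio of total forecast to total actuals; the orientation of the ratio (and hence the sign) is an assumption, as are the function name and the sample numbers.

```python
import math


def lead_time_total_bias(forecast_profile, actual_profile):
    """ln(sum of FP / sum of AP), in nats.

    ASSUMPTION: positive means the lead-time total is over-forecast,
    negative means under-forecast; the article may use the opposite sign.
    """
    total_forecast = sum(forecast_profile)
    total_actual = sum(actual_profile)
    if total_forecast <= 0 or total_actual <= 0:
        raise ValueError("both lead-time totals must be positive")
    return math.log(total_forecast / total_actual)


# Made-up intermittent actuals with a lead-time total of 60.
ap = [5, 0, 55]

# Different pattern, same total: the total bias is ln(1) = 0.
bias_equal_totals = lead_time_total_bias([10, 20, 30], ap)

# Same pattern as above but doubled: the total is over-forecast.
bias_over = lead_time_total_bias([20, 40, 60], ap)
```

The first case shows that a zero total bias says nothing about the point forecasts, which is why both the pattern divergence and the total bias are worth tracking.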

**But, This is Not All!**

In a follow-up article, I will show that further examination of these information entropies will lead to a **Skill Score** we can use to measure the performance of the forecaster in terms of how much the forecaster's skill benefited the forecasting process! Is it better or worse to have large over- and under-forecasting swings than it is to show an overall under-forecasting bias of the total lead-time forecast? Hopefully, the **Skill Score** will give additional insight into the lead-time demand forecasting performance issue.

Try it out on your own data and see for yourself what biases you have in your lead-time demand forecasts and give me some of your comments in the meantime. I think it depends on the context and application, so be as specific as you can.

Hans Levenbach, PhD is Executive Director, *CPDF Professional Development Training and Certification Programs*. Dr. Hans is author of a **new book** (Change&Chance Embraced) on **Demand Forecasting** in the Supply Chain and created and conducts hands-on **Professional Development Workshops** on Demand Forecasting and Planning for multi-national supply chain companies worldwide. Hans is a Past President, Treasurer and former member of the Board of Directors of the __International Institute of Forecasters__. He is Owner/Manager of the LinkedIn groups (1) **Demand Forecaster Training and Certification, Blended Learning, Predictive Visualization**, and (2) __New Product Forecasting and Innovation Planning, Cognitive Modeling, Predictive Visualization__.