Overlooking Data Quality in Historical Data Will Undermine Accuracy in Demand Forecasting
The quality of historical data will be enhanced if we make the effort to clearly identify key patterns in time series. It should not be assumed that, once a key variable has been identified as a driver of demand, it will remain useful for all time. Data need to be analyzed on an ongoing basis for quality assurance.
Like a detective seeking evidence at a crime scene, the forecaster can use an exploratory ANOVA decomposition (e.g., Excel > Data > Data Analysis > Analysis Tools menu > ANOVA: Two Factor Without Replication > OK) not only to assess the significance of seasonal and trend-cycle effects, but also to spot changes in the nature of demand variability. For example, in the ten-year period 1960–1969 (shown in Figure 2.4 (left panel) in my book), the total variability in the US housing starts data is made up of 0.1% seasonality, 64.5% trend and 35.4% irregular (unknown at this stage of the analysis). On the other hand, for the ten-year period 2004–2013 (right panel), the total variability in the data is made up of 0.3% seasonality, 96.5% trend and 3.2% irregular (unknown). Because these are seasonally adjusted data, no significant seasonal effect should be present. The non-trending variability in the early period, masked by the deep (recession) dip, suggests a highly irregular contribution to the variability. When using these data as an explanatory factor or driver of demand in regression models, we recognize their value in explaining primarily economic and demographic consumer behavior.
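The percentage split described above can be sketched in a few lines of Python. This is a minimal illustration, not the workbook behind the book's figures: it assumes the data are laid out as a table with one row per year and one column per month, so that the row effect is read as trend and the column effect as seasonality, which is the layout a two-factor ANOVA without replication expects. The function name anova_shares and the demo table are hypothetical.

```python
from math import sin, pi

def anova_shares(table):
    """Two-factor ANOVA without replication on a years-by-months table.

    Returns the percentage of the total sum of squares attributed to
    seasonality (month columns), trend (year rows) and the remainder.
    """
    n_years = len(table)
    n_months = len(table[0])
    grand = sum(sum(row) for row in table) / (n_years * n_months)
    year_means = [sum(row) / n_months for row in table]
    month_means = [sum(row[j] for row in table) / n_years
                   for j in range(n_months)]

    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_trend = n_months * sum((m - grand) ** 2 for m in year_means)
    ss_season = n_years * sum((m - grand) ** 2 for m in month_means)
    ss_other = ss_total - ss_trend - ss_season  # irregular / unknown

    return {
        "seasonality": 100 * ss_season / ss_total,
        "trend": 100 * ss_trend / ss_total,
        "other": 100 * ss_other / ss_total,
    }

# Made-up illustration data: 10 years x 12 months with a rising level,
# a smooth seasonal wave and a small non-additive disturbance.
demo = [[100 + 5 * y + 20 * sin(2 * pi * m / 12) + ((7 * y + 3 * m) % 5 - 2)
         for m in range(12)] for y in range(10)]
shares = anova_shares(demo)
print({name: round(value, 1) for name, value in shares.items()})
```

Running the same function on two different ten-year windows of one series is a quick way to see whether the mix of seasonality, trend and irregular variation has shifted over time.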
Historical data must be consistent throughout the period of their application. When definitions change, adjustments need to be made in order to retain logical consistency in historical patterns. Consider monthly refrigerator sales, an example of internal data that would be available to a demand forecaster at a consumer goods manufacturer. The historical data below show a pattern consistent with trend/seasonal consumer demand. If the data pattern shows abrupt level changes or unusual variation, the forecaster should first check how the data were constructed. In the top panel, the total variability in the refrigerator data is made up of 81% seasonality, 7% trend and 13% Other (unknown or unidentified). The second seasonal peak appears unusually low (the domain expert’s explanation to me was a lack of inventory during the Ramadan holidays). A simple interpolation between the seasonal peaks makes the data more representative. When modeling for forecasting, it is advisable to identify and adjust for unusual values before running models, because unexamined data will distort the forecast profile as well as the uncertainty range (the width of the prediction limits).
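One simple way to carry out the interpolation between seasonal peaks mentioned above is to replace the suspect value with the average of the values one season earlier and one season later. The sketch below assumes monthly data with a 12-month season. The helper name interpolate_peak and the sales figures are hypothetical, not the refrigerator data from the article.

```python
def interpolate_peak(series, index, period=12):
    """Return a copy of series in which series[index] is replaced by the
    mean of the observations one seasonal period before and after it."""
    before, after = index - period, index + period
    if before < 0 or after >= len(series):
        raise ValueError("need a full season on both sides of the point")
    adjusted = list(series)
    adjusted[index] = (series[before] + series[after]) / 2
    return adjusted

# Made-up monthly sales for three years; the peak month is position 6.
base = [50, 55, 70, 90, 120, 160, 200, 170, 110, 80, 60, 50]
sales = [v + 5 * year for year in range(3) for v in base]
sales[18] = 120  # second-year peak depressed, e.g. by a stock-out

adjusted = interpolate_peak(sales, 18)
print(sales[18], "->", adjusted[18])
```

The original series is left untouched, so both versions can be kept side by side and the adjustment documented for anyone who later asks how the data were constructed.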
With the adjustment, the total variability in the data is made up of 87% seasonality, 4% trend and 8% Other (unknown). The seasonality is still dominant, but the unknown component, which captures the uncertain variation, is significantly reduced: the impact of only one data point! (Think forecast profiles, not model fitting.)
When using these data as an explanatory factor or driver of demand in causal models, we recognize their value in explaining primarily seasonality (habit in consumer behavior), including a specification of the nature of the uncertainty.
Jumping into the modeling phase before thoroughly investigating the quality of the data and checking for anomalies can produce misleading results. For instance, if we blindly applied a credible trend-seasonal model (e.g., Holt-Winters exponential smoothing or an ARIMA (0,1,1)(0,1,1)12 ‘airline’ model) to the original data, we would see that the seasonal peak profile becomes level and the uncertainty range is wide (top panel). On the other hand, with a single-point adjustment based on informed judgment and the same trend-seasonal model, the results become much more credible and practical (bottom panel). Not only is the forecast profile more representative of what may lie in the future, but the prediction limits have also become much narrower. The variability in the “Other” category of the exploratory ANOVA decomposition is reduced and our forecasts become more precise.
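The effect of a single unadjusted point on a trend-seasonal model can be illustrated with a hand-rolled additive Holt-Winters smoother. This is a rough sketch under stated assumptions: the smoothing weights (alpha, beta, gamma) are fixed at arbitrary values rather than estimated, the initialization is deliberately simple, the data are made up, and the root mean squared one-step-ahead error is used only as a crude stand-in for the width of the prediction limits. A statistical package would normally estimate the weights and compute proper limits.

```python
def holt_winters_additive(y, period=12, alpha=0.3, beta=0.1, gamma=0.2,
                          horizon=12):
    """Additive Holt-Winters smoothing with fixed weights.

    Returns (forecasts, rmse), where rmse is the root mean squared
    one-step-ahead error over all observations after the first season.
    """
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons")
    first, second = y[:period], y[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    season = [v - level for v in first]

    squared_errors = []
    for t in range(period, len(y)):
        s = season[t % period]
        squared_errors.append((y[t] - (level + trend + s)) ** 2)
        new_level = alpha * (y[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season[t % period] = gamma * (y[t] - new_level) + (1 - gamma) * s
        level = new_level

    rmse = (sum(squared_errors) / len(squared_errors)) ** 0.5
    n = len(y)
    forecasts = [level + h * trend + season[(n + h - 1) % period]
                 for h in range(1, horizon + 1)]
    return forecasts, rmse

# Made-up data: three years of a seasonal pattern on a gentle upward trend.
base = [50, 55, 70, 90, 120, 160, 200, 170, 110, 80, 60, 50]
adjusted = [base[t % 12] + 0.5 * t for t in range(36)]
original = list(adjusted)
original[18] = 120.0  # the unusually low second peak, left unadjusted

fc_original, rmse_original = holt_winters_additive(original)
fc_adjusted, rmse_adjusted = holt_winters_additive(adjusted)
print("RMSE original:", round(rmse_original, 1))
print("RMSE adjusted:", round(rmse_adjusted, 1))
```

With identical settings, the only difference between the two runs is the one adjusted observation, which makes it easy to compare both the forecast profiles and the size of the one-step errors.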
These issues are explored throughout my business forecasting book, Change & Chance Embraced: Achieving Agility with Smarter Demand Forecasting in the Supply Chain, available in paperback or as a Kindle e-book on Amazon. A number of insightful 5-star reviews follow:
My mentor on forecasting, Hans Levenbach, has not only updated his previous excellent publications with new topics, but he has given the whole field a fresh feeling written with style understandable by novices and appreciated by experts. In addition to incorporating the latest advances in technology, Dr. Levenbach has added sections on Agility, Big Data and Data Quality that are so critical in today’s business climate.
The focus is on applications across the supply chain in a novel way that makes reading fun and full of practical examples. The “Takeaways” provide a helpful summary at the end of each chapter. Pictures, Cases and Quotes add charm to the story.
Leon Schwartz, PhD, Principal, Informed Decisions Group, LTD; INFORMS Fellow
BETTER INSIGHTS Gather DATA, make the right sorts of simple calculations and end-up literally SEEING how well you are forecasting changes in both demand and its uncertainty. This new book by Dr. Hans Levenbach makes the advantages of tried-and-true “technical” methods more accessible to everybody involved in the forecasting process. Loved the pragmatic “Takeaways” at the end of each chapter!
Bob Obenchain, PhD, Fellow of the American Statistical Association
Dr. Hans Levenbach is Executive Director, CPDF Training and Certification Programs. He conducts hands-on and onsite Professional Development Workshops on Smarter Forecasting and Business Planning for multi-national supply chain organizations worldwide. He is group manager of the LinkedIn groups (1) Demand Forecaster Training and Certification, Blended Learning, Predictive Visualization, and (2) New Product Forecasting and Innovation Planning, Cognitive Modeling, Predictive Visualization.