Different forecasting algorithms are highlighted, and a framework is provided for estimating product demand using a combination of qualitative and quantitative approaches.
BY JITESH SHAH, Integrated Device Technology, San Jose, CA
Nothing in the world of forecasting is more complex than predicting demand for semiconductors, but this is one business where accurate forecasting can be a matter of long-term survival. Not only will the process of forecasting help reduce costs for the company by holding the right amount of inventory in the channels and knowing what parts to build when, but implementing a robust and self-adaptive system will also keep customers happy by providing them with the products they need, when they need them. Other benefits include improved vendor engagements and optimal allocation of resources (labor and capital).
Talking about approaches…
There are two general approaches to forecasting a time-based event: a qualitative approach and a quantitative, more numbers-based approach. If historical time-series data on the variable of interest is sketchy, or if the event being forecasted is related to a new product launch, a more subjective or expert-based predictive approach is necessary, but we all intuitively know that. New product introductions usually involve active customer and vendor engagements, and that allows us to have better control over what to build, when, and in what quantity. Even then, the Bass Diffusion Model, a technique geared towards predicting sales for a new product category, could be employed, but that will not be discussed in this context.
Now, if past data on the forecasted variable is handy and quantifiable, and it's fair to assume that the pattern of the past will likely continue into the future, then a more quant-based, algorithmic and somewhat automated approach is almost a necessity.
But how would one go about deciding whether to use an automated approach to forecasting or a more expert-based approach? A typical semiconductor company’s products could be segmented into four quadrants (FIGURE 1), and deciding whether to automate the process of forecasting will depend on which quadrant the product fits best.
Time series modeling
Past shipment data over time for a product, or group of products, you are trying to forecast demand for is usually readily available, and that is generally the only data you need to design a system that automates the forecasting process. The goal is to discover a pattern in the historical, time-series data and extrapolate that pattern into the future. An ideal system should be built in such a way that it evolves, or self-adapts, and selects the "right" algorithm from the pre-built toolset if the shipment pattern changes. A typical time-series forecasting model has just two variables: an independent time variable and a dependent variable representing the event we are trying to forecast.
That event Qt (order, shipment, etc.) we are trying to forecast is more or less a function of the product's life-cycle or trend, its seasonality or business cycle, and randomness, as shown in the "white board" style illustration of FIGURE 2.
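FIGURE 2 itself isn't reproduced here, but the relationship it sketches can be written compactly. Treating the components as additive, rather than multiplicative, is an assumption made purely for illustration:

```latex
% Additive decomposition of the forecasted event Q_t into
% trend T_t, seasonal/cyclical component S_t, and random noise R_t
Q_t = T_t + S_t + R_t
```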
Trend and seasonality or business cycle are typically associated with longer-range patterns and hence are best suited to making long-term forecasts. The shorter-term, horizontal pattern of past shipment data is usually random and is used to make shorter-term forecasts.
Forecasting near-term events
Past data exhibiting randomness with horizontal patterns can be reasonably forecasted using either a Naïve method or a simple averaging method. The choice between the two will depend on which one gives lower Mean Absolute Error (MAE) and Mean Absolute % Error (MAPE).
Naïve Method The sample table in FIGURE 3 shows 10 weeks' worth of sales data. Using the Naïve approach, the forecasted value for the 2nd week is simply what was shipped in the 1st week. The forecasted value for the 3rd week is the actual sales value in the 2nd week, and so on. The difference between the actual value and the forecasted value represents the forecast error, and its absolute value is used to calculate the total error. MAE is that total error divided by the number of forecasted values. A similar approach is used to calculate MAPE, except that each individual error is first divided by the actual sales volume to calculate a % error; those % errors are then summed and divided by the number of forecasted values.
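A minimal sketch of the Naïve method and the two error metrics in Python follows; the sales figures are illustrative stand-ins, not the values from FIGURE 3:

```python
# Naive forecast: next week's forecast is simply the last observed value.
# The sales series below is illustrative, not the data from FIGURE 3.
sales = [24.0, 28.0, 22.0, 25.0, 30.0, 26.0, 23.0, 27.0, 29.0, 25.0]

def naive_forecast(series):
    # Forecast for week t+1 is the actual value from week t,
    # so the forecasts align with the actuals series[1:].
    return series[:-1]

def mae_mape(actuals, forecasts):
    # MAE: mean of absolute errors; MAPE: mean of errors scaled by actuals.
    errors = [abs(a - f) for a, f in zip(actuals, forecasts)]
    mae = sum(errors) / len(errors)
    mape = sum(e / a for e, a in zip(errors, actuals)) / len(errors)
    return mae, mape

forecasts = naive_forecast(sales)
mae, mape = mae_mape(sales[1:], forecasts)
print(f"Naive MAE = {mae:.2f}, MAPE = {mape:.0%}")
```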
Averaging Instead of taking the last observed event as the next period's forecast, a better approach is to use the mean of all past observations. For example, the forecasted value for the 3rd week is the mean of the 1st and 2nd weeks' actual sales values. The forecasted value for the 4th week is the mean of the previous three actual sales values, and so on (FIGURE 4).
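The same skeleton extends to the averaging method; this sketch reuses the illustrative sales list and the mae_mape helper from the snippet above:

```python
# Cumulative-average forecast: week t+1's forecast is the mean of all
# actuals observed through week t (reuses sales and mae_mape from above).
def average_forecast(series):
    return [sum(series[:t]) / t for t in range(1, len(series))]

forecasts = average_forecast(sales)
mae, mape = mae_mape(sales[1:], forecasts)
print(f"Averaging MAE = {mae:.2f}, MAPE = {mape:.0%}")
```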
MAE and MAPE for the Naïve method are 4.56 and 19% respectively, versus 3.01 and 13% for the averaging method. Right there, one can conclude that, for this dataset, averaging is better than the simple Naïve approach.
Horizontal Pattern with Level Shift But what happens when there is a sudden shift (anticipated or not) in the sales pattern like the one shown in FIGURE 5?
The simple averaging approach needs to be tweaked to account for that, and that is where a moving average approach is better suited. Instead of averaging across the entire time series, only the 2, 3 or 4 most recent time events are used to calculate the forecast value. How many periods to use depends on which window gives the smallest MAE and MAPE values, and that can, and should, be parameterized and coded. The tables in FIGURE 6 compare the two approaches, and the moving average approach is clearly the better fit for predicting future events.
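A sketch of that parameterization, again reusing the illustrative data and helpers from the earlier snippets; the candidate windows of 2, 3 and 4 follow the text above:

```python
# Moving-average forecast with a parameterized window, plus a simple
# search for the window that minimizes MAE (reuses sales and mae_mape).
def moving_average_forecast(series, window):
    # Forecast for period t+1 is the mean of the last `window` actuals;
    # no forecast is produced until `window` observations exist.
    return [sum(series[t - window:t]) / window
            for t in range(window, len(series))]

def best_window(series, candidates=(2, 3, 4)):
    scores = {}
    for w in candidates:
        forecasts = moving_average_forecast(series, w)
        mae, _ = mae_mape(series[w:], forecasts)
        scores[w] = mae
    return min(scores, key=scores.get), scores

window, scores = best_window(sales)
print(f"Best window = {window}, MAE by window = {scores}")
```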
Exponential Smoothing But oftentimes, there is a better approach, especially when the past data exhibits severe and random level shifts.
This approach is well suited for such situations because the exponentially weighted moving average of the entire time series deemphasizes older data over time, but still includes it, while weighing recent observations more heavily. That relationship between the actual and forecasted values is shown in FIGURE 7.
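FIGURE 7 itself isn't shown here, but the relationship it depicts is the standard simple exponential smoothing recursion, where the smoothing constant α lies between 0 and 1:

```latex
% Simple exponential smoothing: the next forecast F_{t+1} blends the
% latest actual A_t with the latest forecast F_t via the constant \alpha
F_{t+1} = \alpha A_t + (1 - \alpha) F_t
```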
Again, the lowest MAE and MAPE will help decide the optimal value for the smoothing constant and, as always, this can easily be coded based on the data you already have, and can be automatically updated as new data trickles in.
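A sketch of that search in Python, reusing the illustrative series and the mae_mape helper from earlier; the α grid here is arbitrary and could be made as fine as the data warrants:

```python
# Simple exponential smoothing with a grid search over the smoothing
# constant alpha, scored by MAE (reuses sales and mae_mape from above).
def exp_smoothing_forecast(series, alpha):
    forecasts = [series[0]]  # F_2 is seeded with the first actual, A_1
    for actual in series[1:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts         # aligns with the actuals series[1:]

def best_alpha(series, grid=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    scores = {a: mae_mape(series[1:], exp_smoothing_forecast(series, a))[0]
              for a in grid}
    return min(scores, key=scores.get), scores

alpha, scores = best_alpha(sales)
print(f"Best alpha = {alpha}")
```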
But based on the smoothing equation above, one must wonder how the entire time series is factored in when only the most recent actual and forecasted values are used as part of the next period’s forecast. The math in FIGURE 8 explains how.
The forecast for the second period is assumed to be the first observed value. The third period's forecast is the first truly derived one, and with subsequent substitutions, one quickly finds that the forecast for the nth period is a weighted average of all previously observed events. The weight ascribed to later events relative to earlier ones is shown in the plot in FIGURE 9.
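A reconstruction of the substitution that FIGURE 8 walks through, consistent with the seeding described above:

```latex
% Seed: F_2 = A_1. Substituting the recursion into itself:
F_4 = \alpha A_3 + (1-\alpha) F_3
    = \alpha A_3 + \alpha(1-\alpha) A_2 + (1-\alpha)^2 A_1
% and in general, after repeated substitution:
F_n = \alpha \sum_{k=1}^{n-2} (1-\alpha)^{k-1} A_{n-k} + (1-\alpha)^{n-2} A_1
```

The weights decay geometrically with age, which is exactly the declining curve FIGURE 9 plots.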
Making longer term forecasts
A semiconductor product's lifecycle is usually measured in months, but surprisingly, there are quite a few products with lifespans measured in years, especially when the end applications exhibit long and growing adoption cycles. These products not only exhibit shorter-term randomness in their time series but also show a longer-term seasonal or cyclical nature, with a growing or declining trend over the years.
The first step in estimating the forecast over the longer term is to smooth out some of that short-term randomness using the approaches discussed before. The unsmoothed and smoothed curves might resemble the plot in FIGURE 10.
Clearly, the data exhibits a long-term trend along with a seasonal or cyclical pattern that repeats every year, and Ordinary Least Squares, or OLS, regression is the ideal approach to forming a function that estimates that trend and the parameters involved. But before crunching the numbers, the dataset has to be prepped to include a set of dichotomous (dummy) variables representing the different intervals of that seasonal behavior. Since the seasonality in this situation is by quarters, representing Q1, Q2, Q3 and Q4, only three of them are included in the model. The fourth one, Q2 in this case, forms the baseline against which the significance of the other three quarters is measured (FIGURE 11).
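A minimal sketch of that prep and fit in Python, using plain NumPy least squares; the quarterly series below is illustrative, not the data behind FIGURE 10 or FIGURE 11:

```python
import numpy as np

# OLS with quarterly dummies and Q2 as the omitted baseline, per the text.
# The smoothed quarterly series is illustrative (four years of data).
quarters = ["Q1", "Q2", "Q3", "Q4"] * 4
q_sales = np.array([30, 26, 34, 40, 33, 29, 37, 44,
                    36, 31, 41, 48, 39, 34, 44, 52], dtype=float)

t = np.arange(1, len(q_sales) + 1)                     # trend variable
d_q1 = np.array([q == "Q1" for q in quarters], float)  # dummies; Q2 is
d_q3 = np.array([q == "Q3" for q in quarters], float)  # the omitted
d_q4 = np.array([q == "Q4" for q in quarters], float)  # baseline

# Design matrix: intercept, trend, and the three quarterly dummies
X = np.column_stack([np.ones(len(t)), t, d_q1, d_q3, d_q4])
b, *_ = np.linalg.lstsq(X, q_sales, rcond=None)
print(dict(zip(["b0", "b1(trend)", "b2(Q1)", "b3(Q3)", "b4(Q4)"],
               np.round(b, 2))))
```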
The functional form of the forecasted value by quarter looks something like what’s shown in FIGURE 12.
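FIGURE 12 isn't reproduced here; a plausible reconstruction, consistent with the parameter roles described below, is the following, where the trend term b1·t is an assumption on my part:

```latex
% Forecast for quarter t: intercept b_0 (the Q2 baseline), a trend term,
% and dummy shifts for Q1, Q3 and Q4 relative to Q2
\hat{Q}_t = b_0 + b_1 t + b_2 D_{Q1,t} + b_3 D_{Q3,t} + b_4 D_{Q4,t}
```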
The effective intercept is just b0 when the quarter in question is Q2, and it shifts up or down by the corresponding dummy coefficient otherwise. If b2, b3 and b4 are positive, Q2 will exhibit the lowest expected sales volume, and the other three quarters will show higher expected sales in line with their respective estimated parameter values. This equation can then be readily used to reasonably forecast an event a few quarters, or even a few years, down the road.
So there you have it. This shows how easy it is to automate some features of the forecasting process, and the importance of building an intelligent, self-aware and adaptive forecasting system. The results will not only reduce cost but also help refocus your supply-chain planning efforts on bigger and better challenges.
JITESH SHAH is a principal engineer with Integrated Device Technology, San Jose, CA