Time-Series Analysis


Time Series refers to __data ordered at evenly spaced intervals of time__. This is univariate data, where only one variable of interest (Price, Sales, Population etc.) is collected at various intervals of time (daily, weekly, monthly etc.).

The underlying idea in Time-series analysis is that analyzing historical data may reveal patterns that can be used to predict future values. The raw data may exhibit four components as shown below.

**Trend**: Trend, or Secular Trend, is the broad direction in which the data moves. It may be increasing, decreasing or level over the entire time span under consideration. It usually persists over a long period (more than a year or so) and results from long-term changes in the environment such as population, literacy, energy security etc.

**Cyclic**: Cyclic patterns are recurring patterns that usually last longer than a year and accompany inherent business cycles such as economic recovery, depression etc.

**Seasonal**: A Seasonal pattern is very similar in shape to a Cyclic pattern, but repeats over a shorter period (less than a year). Seasonal patterns recur over weeks, months, quarters etc.

**Irregular (Random)**: The Random component is the noise in the data; it is unexplainable and results from the chance occurrence of some known or unknown factor.

In real life, the components may not be as easily recognizable as shown earlier. The raw Sales data for a Retailer over 3 years is shown in the right-hand picture above. Except for Trend (Excel Trendline), the other components cannot be detected visually.

The prime motivation in Time-series Analysis is to segregate these components as clearly as possible and use them for finding Stable Components, Root-Cause-analysis or Forecasting. The following are some of the most useful applications of Time-series Analysis.

- **Demand Forecasting**
- **Commodity Price Analysis**
- **Stock Price Analysis**
- **Econometrics**

The following pictures show some of the popular approaches and applications in Time-Series Analysis.

Inherent in the collection of data taken over time is some form of random variation. It is useful to smooth the time series data to remove the random parts or noise in data. An often-used technique in industry is "**Smoothing**". This technique, when properly applied, reveals more clearly the underlying trend, seasonal and cyclic components.

Smoothing techniques are used when there are no appreciable Trend, Cyclical or Seasonal patterns in the data and the prime objective is to average out the irregular components. Hence Smoothing techniques rely on different kinds of Averaging Methods as given below.

There are two widely used families of smoothing methods:

- **Simple, Centered and Weighted Moving Average**
- **Single, Double and Triple Exponential Moving Average**

Why not use the Mean of all past values as a Smoothing technique? This Naive Smoothing method leads to very high error, because old observations count as much as recent ones. For this reason various Moving Average techniques are used, which average only the most recent data points.

**Simple Moving Average (SMA)**: SMA uses the average of the past n values (n is called the order or period of the SMA) to predict the current value. The order is the user's choice; usually the best n is found by trial and error, as the one that gives the least error.
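As a minimal sketch of the SMA described above (the function name `sma_forecast` and the sample numbers are made up for illustration, not taken from the Example Sheet), each value is predicted from the mean of the n values before it:

```python
def sma_forecast(values, n):
    """Predict each point as the mean of the n values immediately before it.

    The result is aligned with `values`; the first n entries are None
    because there is not yet enough history to average.
    """
    if n < 1:
        raise ValueError("order n must be at least 1")
    forecasts = [None] * len(values)
    for t in range(n, len(values)):
        window = values[t - n:t]          # the n most recent past values
        forecasts[t] = sum(window) / n
    return forecasts


sales = [10, 12, 14, 16, 18, 20]          # illustrative data only
print(sma_forecast(sales, 3))             # [None, None, None, 12.0, 14.0, 16.0]
```

On this steadily rising series the forecasts (12, 14, 16) trail the actual values (16, 18, 20), which illustrates why SMA tends to lag when a Trend is present.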

**Centered Moving Average**: In this case, the average of n data points is assigned to the middle value. This technique is mainly used to find the Seasonal Index of a Time Series data (Deseasonalization).
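A rough sketch of the centered average follows. The handling of even n (averaging two adjacent n-point averages, often called a 2 x n centered average) is a common convention and an assumption here, since the text does not say how the Example Sheet treats it; the function name is hypothetical.

```python
def centered_moving_average(values, n):
    """Assign the average of n neighbouring points to the middle point.

    For even n there is no single middle point, so two adjacent n-point
    averages are averaged again. Positions without a full window stay None.
    """
    if n < 1:
        raise ValueError("order n must be at least 1")
    size = len(values)
    half = n // 2
    result = [None] * size
    for t in range(half, size - half):
        if n % 2 == 1:
            window = values[t - half:t + half + 1]
            result[t] = sum(window) / n
        else:
            first = sum(values[t - half:t + half]) / n
            second = sum(values[t - half + 1:t + half + 1]) / n
            result[t] = (first + second) / 2
    return result


data = [2, 4, 6, 8, 10]                    # illustrative data only
print(centered_moving_average(data, 3))    # [None, 4.0, 6.0, 8.0, None]
print(centered_moving_average(data, 4))    # [None, None, 6.0, None, None]
```

Unlike the SMA forecast, the centered average sits in the middle of its window, so it does not lag behind the data; this is what makes it suitable for estimating seasonal indices.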

**Weighted Moving Average**: Weighted Moving Average is similar to SMA, but a weight is attached to each previous data point. Depending on whether recent points are more or less important, higher or lower weights can be attached to them, with all weights adding up to 1. Weighted MA is used when there is a clear indication of a Trend component in the Time series data.
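The weighting can be sketched as below. The weights 0.2, 0.3 and 0.5 are arbitrary illustrative choices (not values from the Example Sheet), ordered from the oldest to the most recent point.

```python
def weighted_moving_average(values, weights):
    """Predict each point as a weighted sum of the preceding len(weights) values.

    `weights` runs from the oldest to the most recent point and must sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must add up to 1")
    n = len(weights)
    forecasts = [None] * len(values)
    for t in range(n, len(values)):
        window = values[t - n:t]
        forecasts[t] = sum(w * x for w, x in zip(weights, window))
    return forecasts


data = [10, 20, 30, 40]                    # illustrative data only
# Forecast for the 4th point: 0.2*10 + 0.3*20 + 0.5*30, i.e. about 23
print(weighted_moving_average(data, [0.2, 0.3, 0.5]))
```

With the heaviest weight on the latest point, the forecast of about 23 is closer to the actual 40 than the plain 3-point average of 20 would be, which is why higher recent weights help when a Trend is present.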

SMA is not very effective when a Trend component is present and is not very useful as a forecasting tool. Its main use is in averaging out sudden changes and finding the stable pattern in the data.

**Exponential Moving Average (EMA)**: EMA attaches exponentially decreasing weights to past data points, unlike the equal weights of SMA or the linearly decreasing (or increasing) weights typically used in Weighted MA. EMA can use 1, 2 or 3 smoothing parameters and is accordingly called Single EMA, Double EMA or Triple EMA. Triple EMA is also known as the **Holt-Winters** technique, after Holt and Winters, who developed the most widely used Triple EMA algorithm.

As with SMA, Single EMA is used when there is no Trend or Seasonal components in the data. It is also not very useful as a forecasting tool.
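A minimal sketch of Single EMA, assuming the common recursion in which each smoothed value is alpha times the new observation plus (1 - alpha) times the previous smoothed value. Starting the recursion from the first observation is an assumption; spreadsheets and software packages may initialise differently.

```python
def single_ema(values, alpha):
    """Single exponential smoothing with smoothing constant alpha (0 < alpha <= 1).

    An observation k steps in the past ends up with weight
    alpha * (1 - alpha) ** k, hence the name "exponential".
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [values[0]]                 # assumed start: first observation
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


data = [10, 20, 30]                        # illustrative data only
print(single_ema(data, 0.5))               # [10, 15.0, 22.5]
```

A larger alpha follows the data more closely; a smaller alpha smooths more heavily.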

If a Trend Component is evident in the data, Double EMA is used. It can serve as a forecasting tool, as it forecasts future values aligned with the Trend (Increasing or Decreasing).
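One common formulation of Double EMA is Holt's linear method, sketched below; the Example Sheet may use a different variant. The parameter names `alpha` (level) and `beta` (trend) and the initialisation from the first two points are assumptions made for illustration.

```python
def holt_forecast(values, alpha, beta, horizon):
    """Holt's linear (double) exponential smoothing.

    Keeps a smoothed level and a smoothed trend, then extrapolates
    the trend `horizon` steps beyond the end of the series.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    level = values[0]                      # assumed initial level
    trend = values[1] - values[0]          # assumed initial trend
    for x in values[1:]:
        previous_level = level
        level = alpha * x + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


data = [10, 12, 14, 16, 18]                # illustrative data, rising by 2 per period
print(holt_forecast(data, 0.5, 0.5, 3))    # [20.0, 22.0, 24.0]
```

On this perfectly linear series the forecasts simply continue the line, which is the behaviour Single EMA and SMA cannot reproduce.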

If a Seasonal Component is also present, Triple EMA (the Holt-Winters Method) is more appropriate.
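The sketch below shows the additive form of Triple EMA with a deliberately simple initialisation (level from the first season's mean, trend from the change between the first two seasons). Both the additive form and this initialisation are assumptions; production software usually estimates the starting values and the three parameters more carefully.

```python
def holt_winters_additive(values, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters (triple exponential smoothing).

    alpha smooths the level, beta the trend and gamma the seasonal terms.
    Requires at least two full seasons of data.
    """
    m = season_length
    if len(values) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    first, second = values[:m], values[m:2 * m]
    level = sum(first) / m                          # assumed initial level
    trend = (sum(second) - sum(first)) / (m * m)    # assumed initial trend
    seasonal = [x - level for x in first]           # assumed initial seasonal terms
    for t in range(m, len(values)):
        x = values[t]
        s = seasonal[t % m]
        previous_level = level
        level = alpha * (x - s) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (x - level) + (1 - gamma) * s
    n = len(values)
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


data = [10, 20, 30] * 3                    # illustrative: pure 3-period season, no trend
print(holt_winters_additive(data, 3, 0.5, 0.5, 0.5, 3))   # [10.0, 20.0, 30.0]
```

On this trend-free seasonal series the forecast reproduces the seasonal pattern, which neither Single nor Double EMA can do.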

Please see the calculations with various methods in the Example Sheet. The SMA for orders n = 2, 3, 4, 5 and 6 is given. A very small order (n = 2 or 3) does little averaging and follows the wiggles in the data. An order of 4 or higher starts smoothing the series: the small wiggles disappear and the peaks are reduced too. Based on accuracy (Mean Absolute Deviation), n = 5 appears to be the best model.
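The trial-and-error search over orders can be sketched as follows. The data is invented (it is not the Example Sheet series), the helper names are hypothetical, and Mean Absolute Deviation is taken here as the mean of the absolute forecast errors.

```python
def sma_forecast(values, n):
    """Mean of the n previous values; None where history is too short."""
    forecasts = [None] * len(values)
    for t in range(n, len(values)):
        forecasts[t] = sum(values[t - n:t]) / n
    return forecasts


def mean_absolute_deviation(values, forecasts):
    """Average absolute error over the points that have a forecast."""
    errors = [abs(x - f) for x, f in zip(values, forecasts) if f is not None]
    if not errors:
        raise ValueError("no forecasts to evaluate")
    return sum(errors) / len(errors)


def best_sma_order(values, orders):
    """Return the order with the lowest MAD, plus the MAD of every order tried."""
    scores = {n: mean_absolute_deviation(values, sma_forecast(values, n))
              for n in orders}
    return min(scores, key=scores.get), scores


data = [10, 14, 10, 14, 10, 14, 10, 14]    # illustrative zig-zag data
best, scores = best_sma_order(data, [1, 2, 3])
print(best, scores)                        # order 2 has the lowest MAD here
```

Note that each order is scored on a slightly different number of points, so on short series the comparison is only approximate.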

Weighted MA and Centered MA calculations are also shown in another worksheet. Please observe the lag between Centered and Weighted MA techniques.

The Single EMA is shown in the next worksheet, with a comparison of the values obtained from the formula and from Excel's output. Please vary alpha between the extremes of 0.1 and 0.9 to see the impact on the forecast. A commonly used value is 0.5; however, depending on the case (the amount of smoothing desired and the error), other values can be chosen. Two more parameters, called Beta and Gamma, come into play with Double EMA (Beta) and Triple EMA (Beta and Gamma). There are efficient algorithms to find the best values of these parameters, and Commercial Software uses them internally, so the user may not need to provide explicit values and iterate on them.

**Autoregressive, Integrated, Moving Average (ARIMA) Model**

It is also called the Box-Jenkins method, after its inventors. ARIMA is usually more accurate and more general-purpose than the Smoothing and Decomposition techniques discussed above. However, it needs more data points than Smoothing techniques and is suitable when the data is relatively stable and not very volatile.

The first step in applying the ARIMA methodology is to check for **Trend**. If a Trend is evident, then the data has to be made "**Stationary**". A simple Run Plot (Line chart) may show the Trend, or Smoothing or Decomposition techniques may be used to detect it. In more complicated data, one has to resort to what is known as the **Autocorrelation Function Plot** (ACF), which can be generated by Computer Programs. A gradual decay in this plot is also a sign of “Nonstationarity” in the data. In order to make the data stationary, the previous data point is subtracted from the current one over the entire series. This may remove the trend if the data is changing at a constant rate. If the change is not at a constant rate, then the differencing process may be needed one more time. Accordingly we call it 1st Order or 2nd Order Differencing.

Please check a simple illustration in the Example Sheet in worksheet “**ARIMA**”. Weekly sales have been shown for two Trends – one changing at a constant Rate and other changing at an increasing rate. The Trend with Constant rate can be made Stationary by differencing once, whereas the one with an increasing rate needs differencing twice.
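The effect of differencing can be sketched with two invented series (these are not the worksheet's weekly sales): one rising at a constant rate and one rising at an increasing rate.

```python
def difference(values, order=1):
    """Subtract each value's predecessor from it, repeated `order` times."""
    result = list(values)
    for _ in range(order):
        result = [current - previous
                  for previous, current in zip(result, result[1:])]
    return result


constant_rate = [100 + 5 * t for t in range(6)]     # 100, 105, 110, ...
increasing_rate = [100 + t * t for t in range(6)]   # 100, 101, 104, 109, ...

print(difference(constant_rate))           # [5, 5, 5, 5, 5]
print(difference(increasing_rate))         # [1, 3, 5, 7, 9]
print(difference(increasing_rate, 2))      # [2, 2, 2, 2]
```

One round of differencing flattens the constant-rate series, while the increasing-rate series still trends after the first round and only becomes level after the second. Each round shortens the series by one point.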

After detecting and taking care of Nonstationarity, one should look for “**Autocorrelation**” in the data series. As the name suggests, it refers to the correlation between the data at any time period and the data at periods prior to it. The correlation between consecutive time periods is the autocorrelation at **lag** = 1; the correlation between periods spaced n steps apart is the autocorrelation at lag = n. Hence the autocorrelation at lag 2 measures how data two periods apart are correlated throughout the series.

The autocorrelations at different lags are plotted together, and the result is known as the Autocorrelation Function plot. This is a very important visual tool in ARIMA modelling and is used for several purposes. We have just discussed how the ACF can be used to detect Nonstationarity.
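The numbers behind an ACF plot can be sketched as below, using the usual sample autocorrelation (lagged products of deviations from the mean, divided by the total sum of squared deviations). The function name and the data are illustrative assumptions.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(values) - 1")
    mean = sum(values) / n
    denominator = sum((x - mean) ** 2 for x in values)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum((values[t] - mean) * (values[t - lag] - mean)
                    for t in range(lag, n))
    return numerator / denominator


trending = list(range(10))                 # illustrative series with a clear trend
for lag in range(1, 4):
    print(lag, round(autocorrelation(trending, lag), 3))
```

For this trending series the values decline gradually with the lag (roughly 0.7, 0.41, 0.15) instead of dropping to near zero straight away, which is the gradual decay that signals Nonstationarity.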

With the presence of Autocorrelation, two models can be used for Time-series data.

**Autoregressive Model**: This model (AR) uses a Regression Equation to model the data at time t in terms of the data at the p periods prior to it, where p is called the order of the AR model. Hence p = 1 means a Simple Regression Model on just the one data point immediately before t, p = 2 means a Multiple Regression Model on the 2 preceding data points, and so on. The model can be shown as below, where Delta is the Error term.
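The original equation image is not available in this text. In the standard textbook notation, an AR model of order p can be written as follows, where the constant c and the coefficients φ are assumed symbols and δ is the error term mentioned above:

```latex
X_t = c + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \dots + \phi_p X_{t-p} + \delta_t
```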

Essentially we presume the data at t to be impacted by p data points prior to it, where p is the order.
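For p = 1 the regression can be sketched with ordinary least squares, regressing each value on the one before it. This is only an illustration of the idea (statistical packages typically estimate AR models with more refined methods), and the data is a made-up, noise-free series.

```python
def fit_ar1(values):
    """Least-squares fit of X_t = c + phi * X_(t-1); returns (c, phi)."""
    if len(values) < 3:
        raise ValueError("need at least three observations")
    x = values[:-1]                        # lagged values X_(t-1)
    y = values[1:]                         # current values X_t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("lagged values are constant; slope is undefined")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


# Illustrative series generated exactly by X_t = 2 + 0.5 * X_(t-1)
data = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(data)
next_value = c + phi * data[-1]            # one-step-ahead forecast
print(round(c, 6), round(phi, 6), round(next_value, 6))
```

Because the series contains no noise, the fit recovers the generating constant and coefficient (about 2 and 0.5); with real data the error term would account for the remaining scatter.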

**Moving Average Model**: This model (MA) is once again a Regression Model; however, it uses the error terms at earlier times rather than the values themselves. Hence in the above equation, X_{t-1} would be replaced by the error term (random part or noise) at that period (t-1).
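In the standard textbook notation, an MA model of order q can be written as follows. The mean μ and the coefficients θ are assumed symbols; δ is the error term, as in the AR description:

```latex
X_t = \mu + \delta_t + \theta_1 \delta_{t-1} + \theta_2 \delta_{t-2} + \dots + \theta_q \delta_{t-q}
```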

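How do we know whether an AR or MA model is appropriate? This is judged from diagnostic plots: the PACF for AR and the ACF for MA. An AR model of order p typically shows a PACF that cuts off after lag p, while an MA model of order q shows an ACF that cuts off after lag q. These plots are generated by Commercial Software and should be examined before picking an order for AR or MA.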

When AR and MA are used together, the result is called an ARMA model. If differencing is also used to make the data stationary, it is called an Integrated ARMA, or ARIMA, model. The biggest challenge in ARIMA modelling is picking the orders for AR, I (Differencing) and MA, often denoted p, d and q respectively. Usually an order above 1 is not needed, and we seldom have to go beyond an order of 2.

Autocorrelation Function Plots (ACF) and Partial-autocorrelation Function Plots (PACF) are used to judge p, d and q. These plots are generated by Computer Programs and show a series of Bars at different values of lag (on the X-axis). The following guidelines should be used to judge the presence of AR, Trend and MA in time-series data.

To summarize, ARIMA models are very general-purpose Time-series techniques that can handle a wide variety of applications and are among the most reliable models for forecasting. However, we need sufficient data (more than 40 points) and a stable (not too volatile) pattern.