Autocorrelation and partial autocorrelation plots are widely used in the analysis and forecasting of time series. They summarize graphically how strongly the observations in a time series are related to observations at prior time steps. The difference between the two kinds of correlation can be confusing, however, especially for those who are new to time-series forecasting.
In statistics, correlation measures the strength of the relationship between two variables. If each variable can be assumed to follow a Gaussian (bell curve) distribution, the relationship can be summarized with Pearson's correlation coefficient. Its value lies between -1 and 1: the sign shows a negative or positive correlation, and a value of zero means no correlation. The same calculation can be applied to a time series and its own values at previous time steps, which are called lags. Because the correlation is computed between the series and itself, it is known as autocorrelation or serial correlation, and as a function of the lag it is the AutoCorrelation Function (ACF). A plot of the autocorrelation by lag is called the autocorrelation plot or correlogram.
In other words, the autocorrelation at lag k is the correlation between the observations of a time series and the observations of the same series k time steps earlier.
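As a rough illustration of that definition, the sketch below computes the lag-k autocorrelation as a Pearson correlation between a series and a shifted copy of itself. The helper name lag_autocorrelation and the noisy sine wave are assumptions made for the demo, not part of the temperature example. Library ACF estimators normalize slightly differently (they use the overall mean and variance), so values may differ a little at large lags.

```python
import numpy as np
import pandas as pd


def lag_autocorrelation(series: pd.Series, lag: int) -> float:
    """Pearson correlation between a series and itself shifted by `lag` steps."""
    shifted = series.shift(lag)
    # Series.corr ignores the pairs where the shifted value is missing.
    return series.corr(shifted)


# Hypothetical demo data: a cycle with a period of 20 steps plus a little noise.
rng = np.random.default_rng(0)
steps = np.arange(400)
demo = pd.Series(np.sin(2 * np.pi * steps / 20) + 0.1 * rng.standard_normal(steps.size))

r_full_cycle = lag_autocorrelation(demo, 20)  # one full cycle apart
r_half_cycle = lag_autocorrelation(demo, 10)  # half a cycle apart

print(f"lag 20: {r_full_cycle:.2f}")
print(f"lag 10: {r_half_cycle:.2f}")
```

Observations one full cycle apart move together, so the lag-20 value is strongly positive, while observations half a cycle apart move in opposite directions, so the lag-10 value is strongly negative. pandas also provides Series.autocorr(lag), which performs the same shifted-correlation calculation.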
As an example, take the Minimum Daily Temperatures dataset.
This dataset records the daily minimum temperatures in Melbourne, Australia, for the decade 1981-90, as reported by the Australian Bureau of Meteorology. The temperatures are in degrees Celsius and there are about 3,650 observations. First, download the dataset into the current working directory and save it under the filename "daily-minimum-temperatures.csv". Then load it with the read_csv() function from pandas, setting header and index_col to zero so that the first row is read as the column names and the first column (the date) becomes the index. The loaded temperatures can then be handled as a pandas Series and drawn as a line plot of the time series with pyplot from matplotlib.
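The loading step might look like the sketch below. It assumes the CSV has a header row followed by two columns, a date and a temperature; the function names load_temperatures and plot_series are illustrative. The plot is only drawn if the file is actually present in the working directory.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd


def load_temperatures(path):
    """Read the CSV and return the single data column as a date-indexed Series."""
    frame = pd.read_csv(path, header=0, index_col=0, parse_dates=True)
    # With one data column, squeeze("columns") turns the DataFrame into a Series.
    return frame.squeeze("columns")


def plot_series(series):
    """Draw the time series as a line plot and return the figure."""
    fig, ax = plt.subplots()
    series.plot(ax=ax)
    ax.set_xlabel("Date")
    ax.set_ylabel("Minimum temperature (Celsius)")
    return fig


csv_path = Path("daily-minimum-temperatures.csv")  # filename used in this article
if csv_path.exists():
    series = load_temperatures(csv_path)
    print(series.head())
    plot_series(series)
    plt.show()
```

If the downloaded file contains extra footer lines or malformed values, read_csv may need additional cleaning steps that are not shown here.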
The autocorrelation plot for the Minimum Daily Temperatures can be calculated and drawn with the plot_acf() function from the statsmodels library. The result is a 2D plot with the lag values on the X-axis and the autocorrelation, a value between -1 and 1, on the Y-axis. The plot also draws a confidence interval as a cone, set to 95% by default: correlation values outside the cone are very likely genuine correlations and not statistical flukes.
When every lag value in the series is plotted, the ACF plot is crowded and hard to read. To make it easier to read, the number of lags on the X-axis can be limited to 50.
A partial autocorrelation summarizes the relationship between an observation in a time series and observations at prior time steps, with the relationships of the intervening observations removed. The partial autocorrelation at lag k, given by the Partial AutoCorrelation Function (PACF), is the correlation that remains after removing the effect of any correlations due to terms at shorter lags. The plain autocorrelation between an observation and an observation at a prior time step is made up of both direct and indirect correlations. The indirect correlations are a linear function of the correlations of the observation with the observations at the intervening time steps.
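One way to make this concrete is the regression view of partial autocorrelation: regress each observation on its previous k values and read off the coefficient on the lag-k term. The sketch below does this with NumPy on a simulated AR(1) series; the helper name pacf_by_regression, the coefficient 0.7 and the series length are assumptions chosen for the demo.

```python
import numpy as np


def pacf_by_regression(x, lag):
    """Coefficient on x[t-lag] when x[t] is regressed on x[t-1], ..., x[t-lag]."""
    x = np.asarray(x, dtype=float)
    n = x.size
    target = x[lag:]
    # Column j holds the series shifted back by j steps, for j = 1..lag.
    lagged = np.column_stack([x[lag - j : n - j] for j in range(1, lag + 1)])
    design = np.column_stack([np.ones(n - lag), lagged])  # intercept + lags
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs[-1]  # the last column is the lag-`lag` term


# Simulated AR(1) series: each value depends directly only on the previous one.
rng = np.random.default_rng(42)
n_obs = 5000
noise = rng.standard_normal(n_obs)
ar1 = np.zeros(n_obs)
for t in range(1, n_obs):
    ar1[t] = 0.7 * ar1[t - 1] + noise[t]

lag2_plain = np.corrcoef(ar1[2:], ar1[:-2])[0, 1]  # direct + indirect correlation
lag1_partial = pacf_by_regression(ar1, 1)
lag2_partial = pacf_by_regression(ar1, 2)          # indirect part removed

print(f"lag-2 autocorrelation:         {lag2_plain:.2f}")
print(f"lag-1 partial autocorrelation: {lag1_partial:.2f}")
print(f"lag-2 partial autocorrelation: {lag2_partial:.2f}")
```

The plain lag-2 autocorrelation is clearly positive (about 0.7 squared in theory) even though the series has no direct lag-2 dependence; it arrives indirectly through lag 1. Once lag 1 is accounted for, the lag-2 partial autocorrelation is close to zero.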
In short, it is these indirect correlations that the partial autocorrelation function removes, and that is the intuition for the PACF. Using the same dataset, the PACF for the first 50 lags can be plotted with the plot_pacf() function from the statsmodels library.
The ACF and PACF plots of the same time series can look very different, and the intuition above explains why. Two thought experiments show what to expect.
First, consider a time series generated by an autoregression (AR) process with a lag of k. The ACF describes the correlation between an observation and the observations at prior time steps, including both direct and indirect dependence. For an AR(k) series we would therefore expect the ACF to be strong up to lag k, and the inertia of that relationship to carry on to subsequent lag values, trailing off as the effect weakens. The PACF, in contrast, describes only the direct relationship between an observation and its lag, so we would expect no correlation for lag values beyond k. This is exactly what the ACF and PACF plots of an AR(k) process are expected to show.
Next, consider a time series generated by a moving average (MA) process with a lag of k. A moving average process is an autoregression model of the time series of residual errors from prior predictions; put another way, it corrects future forecasts using the errors made on recent forecasts. For an MA(k) process we would expect the ACF to show a strong correlation with recent values up to the lag of k, followed by a sharp decline to low or no correlation. For the PACF, we would expect a strong relationship up to the lag and a trailing off of correlation from the lag onwards. Again, this is exactly what the ACF and PACF plots of an MA(k) process are expected to show.
To summarize, the ACF and the PACF both describe the correlation between a time series and prior values of the same series: the ACF includes the indirect correlations, while the PACF removes them. Need to know more about the ACF and PACF of a time series? Try these resources: Wikipedia's articles on Autocorrelation, Correlogram, Correlation and dependence, and Partial autocorrelation function, and the book Time Series Analysis: Forecasting and Control.