Autocorrelation

Once our data is stationary, we can investigate other key time series attributes: partial autocorrelation and autocorrelation. In formal terms:

The autocorrelation function (ACF) measures the linear relationship between lagged values of a time series. In other words, it measures the correlation of the time series with itself. [2]
The partial autocorrelation function (PACF) measures the correlation between a time series and one of its lagged values after removing the influence of the lags in between. Those intermediate lags act as confounding variables. [3]

Both metrics can be visualized with statistical plots known as correlograms. But first, it is important to develop a better understanding of them.

Since this article is focused on exploratory analysis and these concepts are fundamental to statistical forecasting models, I will keep the explanation brief. Bear in mind, though, that a solid intuition for these ideas is essential when working with time series. For a comprehensive read, I recommend the great kernel “Time Series: Interpreting ACF and PACF” by the Kaggle Notebooks Grandmaster Leonie Monigatti.

As noted above, autocorrelation measures how the time series correlates with itself at a lag of q periods. You can think of it as a measurement of the linear relationship between your data and a copy of itself shifted back by q periods. Autocorrelation, or ACF, is an important metric for determining the order q of Moving Average (MA) models.
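To make the "shifted copy" idea concrete, here is a minimal sketch that computes the sample autocorrelation by hand with NumPy. The function name `sample_acf`, the simulated MA(1) series, and its 0.8 coefficient are assumptions made purely for illustration; they are not part of the dataset used in this article.

```python
import numpy as np


def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    denominator = np.sum(centered ** 2)
    values = [1.0]  # a series is perfectly correlated with itself at lag 0
    for lag in range(1, max_lag + 1):
        # Overlap the series with a copy of itself shifted back by `lag` periods.
        numerator = np.sum(centered[lag:] * centered[:-lag])
        values.append(numerator / denominator)
    return np.array(values)


# Simulated MA(1) process, assumed for illustration: y_t = e_t + 0.8 * e_{t-1}
rng = np.random.default_rng(42)
noise = rng.normal(size=2001)
ma1_series = noise[1:] + 0.8 * noise[:-1]

acf_values = sample_acf(ma1_series, max_lag=5)
print(np.round(acf_values, 2))
```

For an MA(1) process with a coefficient of 0.8, the theoretical autocorrelation is 0.8 / (1 + 0.8²) ≈ 0.49 at lag 1 and zero afterwards, so the printed values should drop to roughly zero after the first lag. That cut-off is what makes the ACF useful for picking q. In practice you would likely rely on the `acf()` function in `statsmodels.tsa.stattools` rather than a hand-written helper.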

On the other hand, partial autocorrelation is the correlation of the time series with its p-th lagged version, but now considering only the direct effect. For example, if I want to check the partial autocorrelation between t-3 and my current t0 value, I won’t care about how t-3 influences t-2 and t-1, or how t-2 influences t-1, and how those influences carry over to t0. I’ll be exclusively focused on the direct effect of t-3 on my current time stamp, t0, with t-2 and t-1 held fixed. Partial autocorrelation, or PACF, is an important metric for determining the order p of Autoregressive (AR) models.
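One way to see what "direct effect" means is to estimate the partial autocorrelation with a regression: regress the current value on lags 1 through k and keep the coefficient of lag k. The sketch below does this with NumPy least squares. The helper `pacf_by_regression` and the simulated AR(2) series with coefficients 0.5 and 0.3 are assumptions for illustration only, and this is just one of several estimators of the PACF.

```python
import numpy as np


def pacf_by_regression(series, max_lag):
    """Partial autocorrelation for lags 0..max_lag via least squares (hypothetical helper)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = len(x)
    values = [1.0]
    for k in range(1, max_lag + 1):
        target = x[k:]
        # Column j-1 holds the series lagged by j periods, for j = 1..k.
        lagged = np.column_stack([x[k - j:n - j] for j in range(1, k + 1)])
        coefficients, *_ = np.linalg.lstsq(lagged, target, rcond=None)
        # The last coefficient is the effect of lag k with lags 1..k-1 held fixed.
        values.append(coefficients[-1])
    return np.array(values)


# Simulated AR(2) process, assumed for illustration:
# y_t = 0.5 * y_{t-1} + 0.3 * y_{t-2} + e_t
rng = np.random.default_rng(0)
n_obs, burn_in = 3000, 200
noise = rng.normal(size=n_obs + burn_in)
ar2 = np.zeros(n_obs + burn_in)
for t in range(2, n_obs + burn_in):
    ar2[t] = 0.5 * ar2[t - 1] + 0.3 * ar2[t - 2] + noise[t]
ar2_series = ar2[burn_in:]  # drop the warm-up period

pacf_values = pacf_by_regression(ar2_series, max_lag=5)
print(np.round(pacf_values, 2))
```

For this AR(2) series the estimate at lag 2 should land near 0.3 and the estimates beyond lag 2 should hover around zero, which is the cut-off pattern that points to p = 2. The `pacf()` function in `statsmodels.tsa.stattools` offers comparable estimators, although its default method is not plain least squares, so small numerical differences are expected.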

With these concepts cleared up, we can now come back to our data. Since the two metrics are often analyzed together, our last function will combine the PACF and ACF plots in a grid plot that will return correlograms for multiple variables. It will make use of the statsmodels plot_pacf() and plot_acf() functions and map them to a Matplotlib subplots() grid.