Tutorial 4: Autoregression

Autoregression is a time series model that uses observations from previous time steps as input to a regression equation to predict the value at the next time step.
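As a rough illustration of this definition, an autoregressive prediction is a weighted sum of the most recent observations plus a bias. The following is a minimal sketch in plain Python, not NeuralProphet's implementation; the function name `ar_predict` and the weights are hypothetical and chosen only for illustration.

```python
def ar_predict(history, weights, bias=0.0):
    """Predict the next value from the last len(weights) observations.

    weights[0] applies to the most recent observation, weights[1] to the
    one before it, and so on. All names here are illustrative only.
    """
    n_lags = len(weights)
    if len(history) < n_lags:
        raise ValueError("history must contain at least len(weights) values")
    # Most recent observation first, so it lines up with weights[0]
    recent = history[::-1][:n_lags]
    return bias + sum(w * y for w, y in zip(weights, recent))


# Hypothetical series and hand-picked weights (not fitted to any data)
series = [1.0, 2.0, 3.0]
next_value = ar_predict(series, weights=[0.5, 0.3], bias=0.1)
print(next_value)
```

In a fitted model the weights are learned from the data rather than chosen by hand.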

We start with the same model as in the previous tutorial.

[4]:
import pandas as pd
from neuralprophet import NeuralProphet, set_log_level

# Disable logging messages unless there is an error
set_log_level("ERROR")

# Load the dataset from the CSV file using pandas
df = pd.read_csv("https://github.com/ourownstory/neuralprophet-data/raw/main/kaggle-energy/datasets/tutorial01.csv")

# Model and prediction
m = NeuralProphet(
    n_changepoints=10,
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True,
)
m.set_plotting_backend("plotly-static")
metrics = m.fit(df)
forecast = m.predict(df)
m.plot(forecast)
../_images/tutorials_tutorial04_2_3.png

To better understand the remaining mismatch between our model and the real data, we can look at the residuals. The residuals are the differences between the real data and the model’s predictions. For a perfect model, all residuals would be zero.

[5]:
df_residuals = pd.DataFrame({"ds": df["ds"], "residuals": df["y"] - forecast["yhat1"]})
fig = df_residuals.plot(x="ds", y="residuals", figsize=(10, 6))
../_images/tutorials_tutorial04_4_0.png

Let us explore what a good number of lags for the autoregression would be. To do so, we create an autocorrelation chart of the residuals.

[6]:
from statsmodels.graphics.tsaplots import plot_acf

# plot_acf returns a matplotlib Figure, so avoid naming it like the pyplot module
fig = plot_acf(df_residuals["residuals"], lags=50)
../_images/tutorials_tutorial04_6_0.svg
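To make explicit what an autocorrelation chart measures: the value at lag k is the correlation of the series with a copy of itself shifted by k steps. The sketch below computes this by hand in plain Python with a commonly used estimator (lagged co-variation divided by the total variation). The helper `autocorr` and the toy series are hypothetical; `plot_acf` additionally draws confidence bands, which are not reproduced here.

```python
def autocorr(series, lag):
    """Sample autocorrelation of `series` at a non-negative `lag`."""
    n = len(series)
    mean = sum(series) / n
    # Total variation around the mean (the lag-0 term)
    denom = sum((x - mean) ** 2 for x in series)
    # Co-variation between the series and itself shifted by `lag` steps
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Toy series that alternates every step (hypothetical, for illustration only)
toy = [1.0, -1.0] * 4
acf_values = [autocorr(toy, k) for k in range(4)]
print(acf_values)
```

Lags at which the residuals show a large absolute autocorrelation point to structure the model has not yet captured, which makes them reasonable candidates to cover with the number of lags.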

Now we add autoregression to our model with the n_lags parameter.

[7]:
# Model and prediction
m = NeuralProphet(
    # Allow up to 10 trend changepoints
    n_changepoints=10,
    # Enable seasonality components
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True,
    # Add the autoregression
    n_lags=10,
)
m.set_plotting_backend("matplotlib")  # Use matplotlib due to #1235
metrics = m.fit(df)
forecast = m.predict(df)
m.plot(forecast)
../_images/tutorials_tutorial04_8_3.png

As we can see, the forecasting model with autoregression fits the data a lot better than the base model. Feel free to explore how different values of n_lags affect the model.
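When experimenting with n_lags, it can help to picture how the number of lags shapes the training samples: each target value is paired with the n_lags observations that precede it, so the first n_lags points of the series have no complete input window. The sketch below illustrates this windowing in plain Python; `make_lagged_samples` is a hypothetical helper, not part of NeuralProphet, and the library's internal data preparation may differ.

```python
def make_lagged_samples(series, n_lags):
    """Pair each value with the n_lags values that precede it."""
    samples = []
    for t in range(n_lags, len(series)):
        inputs = series[t - n_lags:t]  # the n_lags previous observations
        target = series[t]             # the value to predict
        samples.append((inputs, target))
    return samples


# Hypothetical toy series
toy = [10, 11, 12, 13, 14]
for n_lags in (1, 3):
    samples = make_lagged_samples(toy, n_lags)
    print(n_lags, len(samples), samples[0])
```

More lags give each sample more context but leave fewer samples to train on, which is one trade-off to keep in mind when varying n_lags.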

[8]:
m.plot_parameters(components=["autoregression"])
../_images/tutorials_tutorial04_10_0.png
[9]:
m.plot_components(forecast, components=["autoregression"])
../_images/tutorials_tutorial04_11_0.png