Tutorial 5: Lagged regressors#

Lagged regressors are used to correlate other observed variables with our target time series. For example, the previous day's temperature may be a good predictor of the next day's temperature.

They are often referred to as covariates. Unlike future regressors, the future values of lagged regressors are unknown to us.

At the time \(t\) of forecasting, we only have access to their observed, past values up to and including \(t − 1\).

\[\text{Lagged regressor}(t) = L(t) = \sum_{x \in X}L_x(x_{t-1},x_{t-2},...,x_{t-p})\]
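To make the lag structure concrete, here is a minimal sketch (using a hypothetical toy series, not the tutorial data) of the lagged columns the model effectively sees: each lag shifts the covariate back in time, so the row for time \(t\) holds \(x_{t-1}\), \(x_{t-2}\), and so on, up to \(x_{t-p}\).

```python
import pandas as pd

# Hypothetical toy covariate standing in for the temperature column
df_toy = pd.DataFrame({"temperature": [277.0, 277.95, 278.83, 279.64, 279.05]})

# With p = 2 lags, the model at time t only sees temperature(t-1) and temperature(t-2)
for lag in (1, 2):
    df_toy[f"temperature_lag{lag}"] = df_toy["temperature"].shift(lag)

print(df_toy)
```

The first `p` rows of the lagged columns are `NaN`, since no earlier observations exist for them.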

First we load a new dataset which also contains the temperature of the previous day.

[1]:
import pandas as pd

# Load the dataset for tutorial 4 with the extra temperature column
df = pd.read_csv("https://github.com/ourownstory/neuralprophet-data/raw/main/kaggle-energy/datasets/tutorial04.csv")
df.head()
[1]:
ds y temperature
0 2015-01-01 64.92 277.00
1 2015-01-02 58.46 277.95
2 2015-01-03 63.35 278.83
3 2015-01-04 50.54 279.64
4 2015-01-05 64.89 279.05
[2]:
fig = df.plot(x="ds", y=["y", "temperature"], figsize=(10, 6))
../_images/tutorials_tutorial05_4_0.png

After viewing the additional data, we add it as a lagged regressor to our model. We start with the model from the previous tutorial and then add the temperature as a lagged regressor to get a better energy price prediction.

[3]:
from neuralprophet import NeuralProphet, set_log_level

# Disable logging messages unless there is an error
set_log_level("ERROR")

# Model and prediction
m = NeuralProphet(
    # Use 10 trend changepoints
    n_changepoints=10,
    # Enable the seasonality components
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True,
    # Add autoregression with 10 lags
    n_lags=10,
)
m.set_plotting_backend("plotly-static")

# Add the new lagged regressor
m.add_lagged_regressor("temperature")

# Train the model and make a prediction
metrics = m.fit(df)
forecast = m.predict(df)
m.plot(forecast)
../_images/tutorials_tutorial05_6_3.svg
[4]:
m.plot_components(forecast, components=["lagged_regressors"])
../_images/tutorials_tutorial05_7_0.svg
[5]:
m.plot_parameters(components=["lagged_regressors"])
../_images/tutorials_tutorial05_8_0.svg

Let us explore how our model improved after adding the lagged regressor.

[6]:
metrics
[6]:
MAE RMSE Loss RegLoss epoch
0 57.422047 68.648613 0.454331 0.0 0
1 53.999779 64.757294 0.413841 0.0 1
2 50.681854 61.224556 0.374771 0.0 2
3 45.628796 55.634834 0.319898 0.0 3
4 40.890278 49.883217 0.266118 0.0 4
... ... ... ... ... ...
168 4.899452 6.577740 0.005222 0.0 168
169 4.902421 6.521432 0.005134 0.0 169
170 4.891082 6.474842 0.005115 0.0 170
171 4.916732 6.520473 0.005170 0.0 171
172 4.910669 6.507926 0.005161 0.0 172

173 rows × 5 columns
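The metrics frame has one row per training epoch, so the last row holds the final training errors. A short sketch of pulling them out, using hypothetical values shaped like the output above:

```python
import pandas as pd

# Hypothetical metrics frame shaped like the fit() output above
metrics_toy = pd.DataFrame({
    "MAE": [57.42, 40.89, 4.91],
    "RMSE": [68.65, 49.88, 6.51],
    "epoch": [0, 1, 2],
})

# The last row corresponds to the final training epoch
final = metrics_toy.iloc[-1]
print(f"final MAE: {final['MAE']:.2f}, final RMSE: {final['RMSE']:.2f}")
```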

[7]:
df_residuals = pd.DataFrame({"ds": df["ds"], "residuals": df["y"] - forecast["yhat1"]})
fig = df_residuals.plot(x="ds", y="residuals", figsize=(10, 6))
../_images/tutorials_tutorial05_11_0.png
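Note that with `n_lags=10` the first 10 rows of `yhat1` are `NaN` (there is not enough history to predict from), so the residual series starts with gaps. A sketch, with hypothetical residual values, of dropping those rows before summarizing:

```python
import pandas as pd
import numpy as np

# Hypothetical residuals: the leading rows are NaN where no forecast exists
df_res = pd.DataFrame({"residuals": [np.nan, np.nan, 1.2, -0.5, 0.3]})

# Drop the NaN rows, then summarize the remaining residuals
clean = df_res["residuals"].dropna()
print(f"mean absolute residual: {clean.abs().mean():.3f}")
```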