Integrating with MLflow
MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. It provides tools for tracking experiments, packaging code into reproducible runs, and sharing and deploying models.
NeuralProphet is compatible with MLflow, so we can track our training runs on an MLflow tracking server.
[1]:
# For this tutorial, we need to install MLflow.
# !pip install mlflow
# Start an MLflow tracking server on your local machine
# (run it in a separate terminal, since the command blocks)
# !mlflow server --host 127.0.0.1 --port 8080
if "google.colab" in str(get_ipython()):
    # uninstall preinstalled packages from Colab to avoid conflicts
    !pip uninstall -y torch notebook notebook_shim tensorflow tensorflow-datasets prophet torchaudio torchdata torchtext torchvision
    !pip install git+https://github.com/ourownstory/neural_prophet.git # may take a while
    # much faster using the following code, but may not have the latest upgrades/bugfixes
    # pip install neuralprophet
[2]:
import pandas as pd
from neuralprophet import NeuralProphet, set_log_level, save
import mlflow
import time
set_log_level("ERROR")
data_location = "https://raw.githubusercontent.com/ourownstory/neuralprophet-data/main/datasets/"
df = pd.read_csv(data_location + "air_passengers.csv")
Setting Up the MLflow Tracking Server
In this step, we configure MLflow to log to a tracking server. The tracking server acts as a central repository where MLflow stores experiment data such as model parameters, metrics, and output files.
[3]:
# Set variable 'local' to True if you want to run this notebook locally
local = False
[4]:
# Set our tracking server URI for logging
if local:
    mlflow.set_tracking_uri(uri="http://127.0.0.1:8080")
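Setting the tracking URI does not contact the server, so a server that is not running typically only surfaces as an error once the first run is logged. A minimal sketch of a pre-flight check, assuming the host and port from the `mlflow server` command above. The helper name `tracking_server_reachable` is hypothetical, and it only checks that something accepts TCP connections on that port, not that it is an MLflow server:

```python
import socket


def tracking_server_reachable(host: str = "127.0.0.1", port: int = 8080, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

One way to use it is to call `tracking_server_reachable()` before setting `local = True` and to start the server first if it returns `False`.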
End Previous Run
MLflow raises an error if you start a new run while another run is still active, so end any active run before you begin logging. In a normal setting there is no active run and you can skip the following cell.
[5]:
# End previous run if any
# mlflow.end_run()
Starting an MLflow Experiment with NeuralProphet
Next, we set up and run an MLflow experiment for training a NeuralProphet model: we select the experiment, define the model hyperparameters, and log the key training metrics.
[6]:
if local:
    # Select the MLflow experiment (it is created if it does not exist yet).
    # This has to happen before the run starts so that the run is logged under it.
    mlflow.set_experiment("NP-MLflow Quickstart_v1")
    # Start a new MLflow run
    with mlflow.start_run():
        # Set a tag for the run
        mlflow.set_tag("Description", "NeuralProphet MLflow Quickstart")
        # Define NeuralProphet hyperparameters
        params = {
            "n_lags": 5,
            "n_forecasts": 3,
        }
        # Log hyperparameters
        mlflow.log_params(params)
        # Initialize NeuralProphet model and fit
        start = time.time()
        m = NeuralProphet(**params)
        metrics_train = m.fit(df=df, freq="MS")
        end = time.time()
        # Log training duration in seconds
        mlflow.log_metric("duration", end - start)
        # Log the training metrics of the final epoch
        mlflow.log_metric("MAE_train", value=metrics_train["MAE"].iloc[-1])
        mlflow.log_metric("RMSE_train", value=metrics_train["RMSE"].iloc[-1])
        mlflow.log_metric("Loss_train", value=metrics_train["Loss"].iloc[-1])
        # Save the model to disk
        model_path = "np-model.np"
        save(m, model_path)
        # Log the saved model file as an artifact of the run
        mlflow.log_artifact(model_path, "np-model")
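The cell above logs each final-epoch metric with a separate call. A minimal sketch of collecting them into one dictionary instead, assuming the metrics DataFrame has the `MAE`, `RMSE`, and `Loss` columns used above. The helper name `final_epoch_metrics` is hypothetical, and the example DataFrame is a made-up stand-in for the return value of `m.fit()`:

```python
import pandas as pd


def final_epoch_metrics(metrics: pd.DataFrame, columns=("MAE", "RMSE", "Loss"), suffix: str = "_train") -> dict:
    """Map each selected metric column to its last-row (final epoch) value."""
    return {f"{name}{suffix}": float(metrics[name].iloc[-1]) for name in columns}


# Illustrative stand-in for the DataFrame returned by m.fit(); the numbers are made up.
example = pd.DataFrame({"MAE": [30.0, 12.5], "RMSE": [40.0, 15.0], "Loss": [0.5, 0.01]})
print(final_epoch_metrics(example))
```

Inside an active run, the resulting dictionary can be passed to `mlflow.log_metrics` in a single call.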
View the NeuralProphet Run in the MLflow UI
To see the results of our run, we can open the MLflow UI. Since the tracking server is already running at http://localhost:8080, we can navigate to that URL in a browser and browse our experiments. Clicking an experiment lists all runs associated with it, and clicking a run opens the run page, which shows everything we logged.
Advanced Example
This example trains another NeuralProphet model in a new MLflow run and adds MLflow autologging, dataset logging, and requirements and environment metadata with model signature inference.
[7]:
# MLflow setup
# Run this command with the environment activated: mlflow ui --port xxxx (e.g. 5000, 5001, 5002)
# Copy and paste the URL from the command line into a web browser
import mlflow
import mlflow.pytorch
from mlflow.data.pandas_dataset import PandasDataset
from neuralprophet import NeuralProphet, set_log_level

if local:
    mlflow.pytorch.autolog(
        log_every_n_epoch=1,
        log_every_n_step=None,
        log_models=True,
        log_datasets=True,
        disable=False,
        exclusive=False,
        disable_for_unsupported_versions=False,
        silent=False,
        registered_model_name=None,
        extra_tags=None,
    )

    model_name = "NeuralProphet"
    with mlflow.start_run() as run:
        dataset: PandasDataset = mlflow.data.from_pandas(df, source="AirPassengersDataset")
        # Log the dataset to the MLflow run. Specify the "training" context to indicate that the
        # dataset is used for model training
        mlflow.log_input(dataset, context="training")
        mlflow.log_param("model_type", "NeuralProphet")
        mlflow.log_param("n_lags", 8)
        mlflow.log_param("ar_layers", [8, 8, 8, 8])
        mlflow.log_param("accelerator", "gpu")
        # Disable logging messages unless there is an error
        set_log_level("ERROR")
        # Create a NeuralProphet model with 8 lags and four AR layers, trained on a GPU
        # (change "gpu" here and in the logged parameter above if no GPU is available)
        m = NeuralProphet(n_lags=8, ar_layers=[8, 8, 8, 8], trainer_config={"accelerator": "gpu"})
        # Use the plotly-resampler plotting backend
        m.set_plotting_backend("plotly-resampler")
        # Fit the model on the dataset
        metrics = m.fit(df)
        df_future = m.make_future_dataframe(df, n_historic_predictions=48, periods=12)
        # Predict the future
        forecast = m.predict(df_future)
        # Build an MAE metric definition for use with mlflow.evaluate();
        # on its own this call does not log anything
        mlflow.metrics.mae()
        # Log the default conda environment of the PyTorch flavor as a run artifact
        mlflow.log_dict(mlflow.pytorch.get_default_conda_env(), "conda_env.json")
        # Log the default pip requirements of the PyTorch flavor as a run artifact
        mlflow.log_text("\n".join(mlflow.pytorch.get_default_pip_requirements()), "requirements.txt")
        # Register the model. This assumes a model artifact exists under the path
        # "NeuralProphet_test" in this run; autologging may use a different path (e.g. "model")
        model_uri = f"runs:/{run.info.run_id}/NeuralProphet_test"
        mlflow.register_model(model_uri=model_uri, name=model_name)