Integrating with MLflow#

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. It provides functionality for tracking experiments, packaging code into reproducible runs, and sharing and deploying models.

NeuralProphet is compatible with MLflow, so we can track our training runs on the MLflow platform.

[1]:
# For this tutorial, we need to install MLflow:
# !pip install mlflow

# Start an MLflow tracking server on your local machine:
# !mlflow server --host 127.0.0.1 --port 8080

if "google.colab" in str(get_ipython()):
    # uninstall preinstalled packages from Colab to avoid conflicts
    !pip uninstall -y torch notebook notebook_shim tensorflow tensorflow-datasets prophet torchaudio torchdata torchtext torchvision
    !pip install git+https://github.com/ourownstory/neural_prophet.git  # may take a while

# Installing from PyPI is much faster, but may not have the latest upgrades/bugfixes:
# !pip install neuralprophet
[2]:
import pandas as pd
from neuralprophet import NeuralProphet, set_log_level, save
import mlflow
import time

set_log_level("ERROR")

data_location = "https://raw.githubusercontent.com/ourownstory/neuralprophet-data/main/datasets/"
df = pd.read_csv(data_location + "air_passengers.csv")
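
# Peek at the data: NeuralProphet expects a 'ds' (datestamp) column and a 'y' (value) column
df.head()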

Setting Up the MLflow Tracking Server#

In this step, we’re configuring MLflow to use a tracking server for logging and monitoring our machine learning experiments. The tracking server acts as a central repository for MLflow to store experiment data. This includes information like model parameters, metrics, and output files.

[3]:
# Set variable 'local' to True if you want to run this notebook locally
local = False
[4]:
# Set our tracking server URI for logging
if local:
    mlflow.set_tracking_uri(uri="http://127.0.0.1:8080")

End Previous Run#

If a run is still active when you start logging and monitoring, MLflow will raise an error. Therefore, make sure to end any previously active runs first. In a normal setting you should not have any active runs, and you can ignore the following cell.

[5]:
# End previous run if any
# mlflow.end_run()
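
# A safer variant (a sketch, not part of the original notebook):
# only end a run if one is actually active
# if mlflow.active_run() is not None:
#     mlflow.end_run()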

Starting an MLflow Experiment with NeuralProphet#

In the next step, we initiate and manage an MLflow experiment for training a NeuralProphet model: we set up the experiment, define the model hyperparameters, and log essential training metrics.

[6]:
if local:
    # Create (or select) the MLflow experiment before starting the run,
    # so the run is logged under it instead of the default experiment
    mlflow.set_experiment("NP-MLflow Quickstart_v1")

    # Start a new MLflow run
    with mlflow.start_run():

        # Set a tag for the run
        mlflow.set_tag("Description", "NeuralProphet MLflow Quickstart")

        # Define NeuralProphet hyperparameters
        params = {
            "n_lags": 5,
            "n_forecasts": 3,
        }

        # Log Hyperparameters
        mlflow.log_params(params)

        # Initialize NeuralProphet model and fit
        start = time.time()
        m = NeuralProphet(**params)
        metrics_train = m.fit(df=df, freq="MS")
        end = time.time()

        # Log training duration
        mlflow.log_metric("duration", end - start)

        # Log final training metrics (last epoch)
        mlflow.log_metric("MAE_train", metrics_train["MAE"].iloc[-1])
        mlflow.log_metric("RMSE_train", metrics_train["RMSE"].iloc[-1])
        mlflow.log_metric("Loss_train", metrics_train["Loss"].iloc[-1])

        # Save the fitted model to a local file
        model_path = "np-model.np"
        save(m, model_path)

        # Log the model in MLflow
        mlflow.log_artifact(model_path, "np-model")
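
Because the model was logged as a run artifact, it can later be retrieved from the tracking server and restored with NeuralProphet's load utility. A minimal sketch (an assumption on top of the tutorial, not part of the original flow; requires MLflow 2.x for mlflow.artifacts, and run_id is a placeholder for the run's ID, e.g. copied from the UI):

from neuralprophet import load

# Download the logged artifact from the tracking server (run_id is a placeholder)
local_path = mlflow.artifacts.download_artifacts(run_id=run_id, artifact_path="np-model/np-model.np")

# Restore the NeuralProphet model from the downloaded file
m_restored = load(local_path)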

View the NeuralProphet Run in the MLflow UI#

To see the results of our run, we can open the MLflow UI. Since we have already started the tracking server at http://localhost:8080, we can simply navigate to that URL in the browser. Clicking on an experiment shows a list of all runs associated with it; clicking on a run opens the run page, where the details of what we've logged are shown.
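
The same information can also be queried programmatically instead of through the UI. A small sketch using mlflow.search_runs, assuming the experiment name set above:

# Fetch all runs of the experiment as a pandas DataFrame
runs = mlflow.search_runs(experiment_names=["NP-MLflow Quickstart_v1"])
print(runs[["run_id", "metrics.MAE_train", "metrics.RMSE_train"]])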

Advanced Example#

This example adds MLflow autologging, dataset logging, and requirements and environment metadata with model signature inference, in another MLflow experiment with NeuralProphet.

[7]:
# MLflow setup
# Run this command with the environment activated: mlflow ui --port xxxx (e.g. 5000, 5001, 5002)
# Then copy and paste the printed URL into your web browser

import mlflow
import torchmetrics
from mlflow.data.pandas_dataset import PandasDataset

if local:

    mlflow.pytorch.autolog(
        log_every_n_epoch=1,
        log_every_n_step=None,
        log_models=True,
        log_datasets=True,
        disable=False,
        exclusive=False,
        disable_for_unsupported_versions=False,
        silent=False,
        registered_model_name=None,
        extra_tags=None,
    )

    import mlflow.pytorch
    from mlflow.client import MlflowClient

    model_name = "NeuralProphet"

    with mlflow.start_run() as run:

        dataset: PandasDataset = mlflow.data.from_pandas(df, source="AirPassengersDataset")

        # Log the dataset to the MLflow Run. Specify the "training" context to indicate that the
        # dataset is used for model training
        mlflow.log_input(dataset, context="training")

        mlflow.log_param("model_type", "NeuralProphet")
        mlflow.log_param("n_lags", 8)
        mlflow.log_param("ar_layers", [8, 8, 8, 8])
        mlflow.log_param("accelerator", "gpu")

        # To Train
        # Import the NeuralProphet class
        from neuralprophet import NeuralProphet, set_log_level

        # Disable logging messages unless there is an error
        set_log_level("ERROR")

        # Create a NeuralProphet model with the hyperparameters logged above
        m = NeuralProphet(n_lags=8, ar_layers=[8, 8, 8, 8], trainer_config={"accelerator": "gpu"})

        # Use the plotly-resampler backend for plotting in notebooks
        m.set_plotting_backend("plotly-resampler")

        # Fit the model on the dataset
        metrics = m.fit(df)

        df_future = m.make_future_dataframe(df, n_historic_predictions=48, periods=12)

        # Predict the future
        forecast = m.predict(df_future)

    # Built-in MAE metric definition (for use with mlflow.evaluate)
    mlflow.metrics.mae()

    # Inspect the default conda environment that MLflow logs with the model
    mlflow.pytorch.get_default_conda_env()

    # Inspect the default pip requirements that MLflow logs with the model
    mlflow.pytorch.get_default_pip_requirements()

    # Register the autologged model; mlflow.pytorch.autolog stores it under the "model" artifact path
    model_uri = f"runs:/{run.info.run_id}/model"
    mlflow.register_model(model_uri=model_uri, name=model_name)
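
Once registered, the model can be loaded back from the model registry by name and version. A minimal sketch, assuming version 1 was just created and that the autologged artifact is a loadable PyTorch-flavor model:

# Load version 1 of the registered model from the model registry
loaded_model = mlflow.pytorch.load_model(f"models:/{model_name}/1")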