From 48ffcbdcbca9368bfd37eef4bcc8a3b27c04de50 Mon Sep 17 00:00:00 2001 From: Eli Fajardo Date: Wed, 13 Dec 2023 16:15:45 -0500 Subject: [PATCH 1/7] Fixes to modular DFP examples and benchmarks (#1429) - Fix modular DFP benchmark and examples after `DFPArgParser` updates from PR #1245 - Shorten `load` file paths in example control messages - Doc fixes Fixes #1431 Fixes #1432 ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md). - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. Authors: - Eli Fajardo (https://github.com/efajardo-nv) Approvers: - Michael Demoret (https://github.com/mdemoret-nv) URL: https://github.com/nv-morpheus/Morpheus/pull/1429 --- .../10_modular_pipeline_digital_fingerprinting.md | 8 ++++---- .../production/morpheus/benchmarks/README.md | 8 ++++---- .../benchmarks/benchmark_conf_generator.py | 2 ++ .../control_messages/azure_payload_inference.json | 2 +- .../azure_payload_load_train_inference.json | 4 ++-- .../azure_payload_load_training.json | 2 +- .../control_messages/azure_payload_lti.json | 4 ++-- .../control_messages/azure_payload_training.json | 2 +- .../azure_streaming_inference.json | 2 +- .../control_messages/azure_streaming_lti.json | 4 ++-- .../azure_streaming_training.json | 2 +- .../control_messages/duo_payload_inference.json | 2 +- .../duo_payload_load_train_inference.json | 4 ++-- .../control_messages/duo_payload_lti.json | 4 ++-- .../control_messages/duo_payload_only_load.json | 2 +- .../control_messages/duo_payload_training.json | 2 +- .../control_messages/duo_streaming_inference.json | 2 +- .../control_messages/duo_streaming_lti.json | 4 ++-- .../control_messages/duo_streaming_only_load.json | 2 +- .../control_messages/duo_streaming_payload.json | 4 ++--
.../control_messages/duo_streaming_training.json | 2 +- .../dfp_integrated_training_batch_pipeline.py | 14 ++++++++++---- .../dfp_integrated_training_streaming_pipeline.py | 15 +++++++++++---- 23 files changed, 56 insertions(+), 41 deletions(-) diff --git a/docs/source/developer_guide/guides/10_modular_pipeline_digital_fingerprinting.md b/docs/source/developer_guide/guides/10_modular_pipeline_digital_fingerprinting.md index 408d6e6804..e30952dd41 100644 --- a/docs/source/developer_guide/guides/10_modular_pipeline_digital_fingerprinting.md +++ b/docs/source/developer_guide/guides/10_modular_pipeline_digital_fingerprinting.md @@ -527,7 +527,7 @@ From the `examples/digital_fingerprinting/production` dir, run: ```bash docker compose run morpheus_pipeline bash ``` -To run the DFP pipelines with the example datasets within the container, run: +To run the DFP pipelines with the example datasets within the container, run the following from `examples/digital_fingerprinting/production/morpheus`: * Duo Training Pipeline ```bash @@ -560,7 +560,7 @@ To run the DFP pipelines with the example datasets within the container, run: --start_time "2022-08-01" \ --duration "60d" \ --train_users generic \ - --input_file "./control_messages/duo_payload_load_train_inference.json" + --input_file "./control_messages/duo_payload_load_train_inference.json" ``` * Azure Training Pipeline @@ -594,7 +594,7 @@ To run the DFP pipelines with the example datasets within the container, run: --start_time "2022-08-01" \ --duration "60d" \ --train_users generic \ - --input_file "./control_messages/azure_payload_load_train_inference.json" + --input_file "./control_messages/azure_payload_load_train_inference.json" ``` ### Output Fields @@ -615,4 +615,4 @@ In addition to this, for each input feature the following output fields will exi | `_z_loss` | FLOAT | The loss z-score | | `_pred` | FLOAT | The predicted value | -Refer to 
[DFPInferenceStage](6_digital_fingerprinting_reference.md#inference-stage-dfpinferencestage) for more on these fields. \ No newline at end of file +Refer to [DFPInferenceStage](6_digital_fingerprinting_reference.md#inference-stage-dfpinferencestage) for more on these fields. diff --git a/examples/digital_fingerprinting/production/morpheus/benchmarks/README.md b/examples/digital_fingerprinting/production/morpheus/benchmarks/README.md index cad1fe96d6..4ff9f77e99 100644 --- a/examples/digital_fingerprinting/production/morpheus/benchmarks/README.md +++ b/examples/digital_fingerprinting/production/morpheus/benchmarks/README.md @@ -103,11 +103,11 @@ To ensure the [file_to_df_loader.py](../../../../../morpheus/loaders/file_to_df_ export MORPHEUS_FILE_DOWNLOAD_TYPE=dask ``` -Benchmarks for an individual workflow can be run using the following in your dev container: +Benchmarks for an individual workflow can be run from `examples/digital_fingerprinting/production/morpheus` in your dev container: ``` -pytest -s --log-level=WARN --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave test_bench_e2e_dfp_pipeline.py:: +pytest -s --log-level=WARN --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave benchmarks/test_bench_e2e_dfp_pipeline.py:: ``` The `-s` option allows outputs of pipeline execution to be displayed so you can ensure there are no errors while running your benchmarks. 
@@ -137,12 +137,12 @@ The `--benchmark-warmup` and `--benchmark-warmup-iterations` options are used to For example, to run E2E benchmarks on the DFP training (modules) workflow on the azure logs: ``` -pytest -s --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave test_bench_e2e_dfp_pipeline.py::test_dfp_modules_azure_payload_lti_e2e +pytest -s --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave benchmarks/test_bench_e2e_dfp_pipeline.py::test_dfp_modules_azure_payload_lti_e2e ``` To run E2E benchmarks on all workflows: ``` -pytest -s --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave test_bench_e2e_dfp_pipeline.py +pytest -s --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave benchmarks/test_bench_e2e_dfp_pipeline.py ``` Here are the benchmark comparisons for individual tests. When the control message type is "payload", the rolling window stage is bypassed, whereas when it is "streaming", the windows are created with historical data. 
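The payload/streaming distinction called out above can be sketched in a few lines of plain Python. This is an illustrative toy, not the actual Morpheus rolling-window module; the function name and signature are hypothetical:

```python
def window_batches(message_type: str, batch: list, history: list) -> list:
    """Toy sketch of the behavior described above (not the real DFP code).

    "payload" control messages bypass the rolling window and are processed
    as-is; "streaming" messages build their window from historical data.
    """
    if message_type == "payload":
        return batch                    # rolling window bypassed
    if message_type == "streaming":
        return history + batch          # window includes historical rows
    raise ValueError(f"unknown control message type: {message_type}")
```

This is why the benchmark comparisons list the control message type per test: the two paths do materially different amounts of work.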
diff --git a/examples/digital_fingerprinting/production/morpheus/benchmarks/benchmark_conf_generator.py b/examples/digital_fingerprinting/production/morpheus/benchmarks/benchmark_conf_generator.py index d86ddbc660..0cb187a51b 100644 --- a/examples/digital_fingerprinting/production/morpheus/benchmarks/benchmark_conf_generator.py +++ b/examples/digital_fingerprinting/production/morpheus/benchmarks/benchmark_conf_generator.py @@ -161,6 +161,8 @@ def get_module_conf(self): source=(self.source), tracking_uri=mlflow.get_tracking_uri(), silence_monitors=True, + mlflow_experiment_name_formatter=self._get_experiment_name_formatter(), + mlflow_model_name_formatter=self._get_model_name_formatter(), train_users='generic') dfp_arg_parser.init() config_generator = ConfigGenerator(self.pipe_config, dfp_arg_parser, self.get_schema()) diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_inference.json b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_inference.json index a258cf3fd5..f8d2b9ed3a 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_inference.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_inference.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-inference-data/*.json" + "../../../data/dfp/azure-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_load_train_inference.json b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_load_train_inference.json index 0286129151..bf0a3771ef 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_load_train_inference.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_load_train_inference.json @@ -7,7 +7,7 @@ "properties": 
{ "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-training-data/*.json" + "../../../data/dfp/azure-training-data/*.json" ] } }, @@ -28,7 +28,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-inference-data/*.json" + "../../../data/dfp/azure-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_load_training.json b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_load_training.json index d6e028d4eb..dad09e6062 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_load_training.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_load_training.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-training-data/*.json" + "../../../data/dfp/azure-training-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_lti.json b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_lti.json index 97053f82a4..1b1e226145 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_lti.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_lti.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-training-data/*.json" + "../../../data/dfp/azure-training-data/*.json" ] } }, @@ -34,7 +34,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-inference-data/*.json" + "../../../data/dfp/azure-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_training.json 
b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_training.json index d6e028d4eb..dad09e6062 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_training.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_payload_training.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-training-data/*.json" + "../../../data/dfp/azure-training-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_inference.json b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_inference.json index d122241e8c..9c5d889d5c 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_inference.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_inference.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-inference-data/*.json" + "../../../data/dfp/azure-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_lti.json b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_lti.json index f798a1a475..7a28c85d73 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_lti.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_lti.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-training-data/*.json" + "../../../data/dfp/azure-training-data/*.json" ] } }, @@ -34,7 +34,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-inference-data/*.json" + "../../../data/dfp/azure-inference-data/*.json" ] } }, diff --git 
a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_training.json b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_training.json index 8397489cb2..882b23a0a3 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_training.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/azure_streaming_training.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/azure-training-data/*.json" + "../../../data/dfp/azure-training-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_inference.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_inference.json index 7b4bb9672a..ebd6669be7 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_inference.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_inference.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-inference-data/*.json" + "../../../data/dfp/duo-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_load_train_inference.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_load_train_inference.json index e378d82760..e30f66ccfe 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_load_train_inference.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_load_train_inference.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-training-data/*.json" + "../../../data/dfp/duo-training-data/*.json" ] } }, @@ -28,7 +28,7 @@ "properties": { "loader_id": "fsspec", "files": [ - 
"../../../../examples/data/dfp/duo-inference-data/*.json" + "../../../data/dfp/duo-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_lti.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_lti.json index 07c69233ba..382ff22808 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_lti.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_lti.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-training-data/*.json" + "../../../data/dfp/duo-training-data/*.json" ] } }, @@ -34,7 +34,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-inference-data/*.json" + "../../../data/dfp/duo-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_only_load.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_only_load.json index 3a214ea810..ed69cbf5af 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_only_load.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_only_load.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-inference-data/*.json" + "../../../data/dfp/duo-inference-data/*.json" ] } } diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_training.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_training.json index df21751d36..2b9995875f 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_training.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_payload_training.json @@ -7,7 +7,7 @@ "properties": { 
"loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-training-data/*.json" + "../../../data/dfp/duo-training-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_inference.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_inference.json index fc1db57669..2b033b9d7c 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_inference.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_inference.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-inference-data/*.json" + "../../../data/dfp/duo-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_lti.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_lti.json index 91d41cad22..d74bae952b 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_lti.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_lti.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../../examples/data/dfp/duo-training-data/*.json" + "../../../data/dfp/duo-training-data/*.json" ] } }, @@ -34,7 +34,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../../examples/data/dfp/duo-inference-data/*.json" + "../../../data/dfp/duo-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_only_load.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_only_load.json index 28b2a2f7f1..205842d72b 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_only_load.json +++ 
b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_only_load.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-inference-data/*.json" + "../../../data/dfp/duo-inference-data/*.json" ] } } diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_payload.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_payload.json index ab5dadf0e5..3daf318961 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_payload.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_payload.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-training-data/*.json" + "../../../data/dfp/duo-training-data/*.json" ] } }, @@ -28,7 +28,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-inference-data/*.json" + "../../../data/dfp/duo-inference-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_training.json b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_training.json index 73cc9046d9..4486149127 100644 --- a/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_training.json +++ b/examples/digital_fingerprinting/production/morpheus/control_messages/duo_streaming_training.json @@ -7,7 +7,7 @@ "properties": { "loader_id": "fsspec", "files": [ - "../../../../examples/data/dfp/duo-training-data/*.json" + "../../../data/dfp/duo-training-data/*.json" ] } }, diff --git a/examples/digital_fingerprinting/production/morpheus/dfp_integrated_training_batch_pipeline.py b/examples/digital_fingerprinting/production/morpheus/dfp_integrated_training_batch_pipeline.py index c18da19ee4..a1ab2cb9c6 100644 --- 
a/examples/digital_fingerprinting/production/morpheus/dfp_integrated_training_batch_pipeline.py +++ b/examples/digital_fingerprinting/production/morpheus/dfp_integrated_training_batch_pipeline.py @@ -103,12 +103,14 @@ help=("The MLflow tracking URI to connect to the tracking backend.")) @click.option('--mlflow_experiment_name_template', type=str, - default="dfp/{source}/training/{reg_model_name}", - help="The MLflow experiment name template to use when logging experiments. ") + default=None, + help=("The MLflow experiment name template to use when logging experiments. " + "If None, defaults to dfp/source/training/{reg_model_name}")) @click.option('--mlflow_model_name_template', type=str, - default="DFP-{source}-{user_id}", - help="The MLflow model name template to use when logging models. ") + default=None, + help=("The MLflow model name template to use when logging models. " + "If None, defaults to DFP-source-{user_id}")) @click.option("--disable_pre_filtering", is_flag=True, help=("Enabling this option will skip pre-filtering of json messages. " @@ -140,6 +142,10 @@ def run_pipeline(source: str, if (skip_user and only_user): logging.error("Option --skip_user and --only_user are mutually exclusive.
Exiting") + if mlflow_experiment_name_template is None: + mlflow_experiment_name_template = f'dfp/{source}/training/' + '{reg_model_name}' + if mlflow_model_name_template is None: + mlflow_model_name_template = f'DFP-{source}-' + '{user_id}' dfp_arg_parser = DFPArgParser(skip_user, only_user, start_time, diff --git a/examples/digital_fingerprinting/production/morpheus/dfp_integrated_training_streaming_pipeline.py b/examples/digital_fingerprinting/production/morpheus/dfp_integrated_training_streaming_pipeline.py index e60792d6d3..29e7893d04 100644 --- a/examples/digital_fingerprinting/production/morpheus/dfp_integrated_training_streaming_pipeline.py +++ b/examples/digital_fingerprinting/production/morpheus/dfp_integrated_training_streaming_pipeline.py @@ -103,12 +103,14 @@ help=("The MLflow tracking URI to connect to the tracking backend.")) @click.option('--mlflow_experiment_name_template', type=str, - default="dfp/{source}/training/{reg_model_name}", - help="The MLflow experiment name template to use when logging experiments. ") + default=None, + help=("The MLflow experiment name template to use when logging experiments. " + "If None, defaults to dfp/source/training/{reg_model_name}")) @click.option('--mlflow_model_name_template', type=str, - default="DFP-{source}-{user_id}", - help="The MLflow model name template to use when logging models. ") + default=None, + help=("The MLflow model name template to use when logging models. " + "If None, defaults to DFP-source-{user_id}")) @click.option('--bootstrap_servers', type=str, default="localhost:9092", @@ -152,6 +154,11 @@ def run_pipeline(source: str, if (skip_user and only_user): logging.error("Option --skip_user and --only_user are mutually exclusive.
Exiting") + if mlflow_experiment_name_template is None: + mlflow_experiment_name_template = f'dfp/{source}/training/' + '{reg_model_name}' + if mlflow_model_name_template is None: + mlflow_model_name_template = f'DFP-{source}-' + '{user_id}' + dfp_arg_parser = DFPArgParser(skip_user, only_user, start_time, From a30c0bfd3657da6ce9ec34ca696049cf1614586d Mon Sep 17 00:00:00 2001 From: David Gardner <96306125+dagardner-nv@users.noreply.github.com> Date: Wed, 13 Dec 2023 13:16:56 -0800 Subject: [PATCH 2/7] Document incompatible mlflow models issue (#1434) * Adds entry to `docs/source/extra_info/troubleshooting.md` regarding incompatible MLflow models. Closes #1409 ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md). - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. Authors: - David Gardner (https://github.com/dagardner-nv) Approvers: - Pete MacKinnon (https://github.com/pdmack) - Michael Demoret (https://github.com/mdemoret-nv) URL: https://github.com/nv-morpheus/Morpheus/pull/1434 --- docs/source/extra_info/troubleshooting.md | 36 +++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/docs/source/extra_info/troubleshooting.md b/docs/source/extra_info/troubleshooting.md index b1ef6005b9..1a1ba3b9ad 100644 --- a/docs/source/extra_info/troubleshooting.md +++ b/docs/source/extra_info/troubleshooting.md @@ -29,6 +29,42 @@ rm -rf ${MORPHEUS_ROOT}/build # Restart the build ./scripts/compile.sh ``` + +**Incompatible MLflow Models** + +Models trained with a previous version of Morpheus and stored in MLflow may be incompatible with the current version. This incompatibility can be identified by the following error message occurring in an MLflow-based pipeline such as DFP.
+ +``` +Error trying to get model + +Traceback (most recent call last): + +File "/workspace/examples/digital_fingerprinting/production/morpheus/dfp/stages/dfp_inference_stage.py", line 101, in on_data + +loaded_model = model_cache.load_model(self._client) +``` +``` +ModuleNotFoundError: No module named 'dfencoder' +``` + +The workarounds available for this issue are: + +* Revert to the previous version of Morpheus until the models can be re-trained. +* Re-train the model using the current version of Morpheus. + +In the case of models trained by the DFP example, the existing models can be deleted by running the following commands: + +```bash +docker volume ls # list current docker volumes +docker volume rm production_db_data production_mlflow_data + +# Re-build the MLflow container for DFP +cd ${MORPHEUS_ROOT}/examples/digital_fingerprinting/production/ +docker compose build +docker compose up mlflow +``` + + **Debugging Python Code** To debug issues in python code, several Visual Studio Code launch configurations have been included in the repo. These launch configurations can be found in `${MORPHEUS_ROOT}/morpheus.code-workspace`. To launch the debugging environment, ensure Visual Studio Code has opened the morpheus workspace file (File->Open Workspace from File...). Once the workspace has been loaded, the launch configurations should be available in the debugging tab. From fb35a5eef27c5c9d5c17fee33d782290598df47d Mon Sep 17 00:00:00 2001 From: David Gardner <96306125+dagardner-nv@users.noreply.github.com> Date: Wed, 13 Dec 2023 16:45:36 -0800 Subject: [PATCH 3/7] Add benchmarks for stand-alone RAG & vdb upload pipelines (#1421) * Adds a benchmark for the vdb_upload & RAG stand-alone pipelines. * The vdb_upload benchmark mocks the HTTP requests performed by the RSS source and webscraper stages.
* Update benchmark instructions in `tests/benchmarks/README.md` ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md). - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. Authors: - David Gardner (https://github.com/dagardner-nv) Approvers: - Michael Demoret (https://github.com/mdemoret-nv) URL: https://github.com/nv-morpheus/Morpheus/pull/1421 --- tests/_utils/milvus.py | 30 ++++ tests/benchmarks/README.md | 25 ++- tests/benchmarks/conftest.py | 8 +- .../test_bench_completion_pipeline.py | 6 +- .../test_bench_rag_standalone_pipeline.py | 156 ++++++++++++++++++ .../test_bench_vdb_upload_pipeline.py | 141 ++++++++++++++++ tests/llm/test_rag_standalone_pipe.py | 45 +++-- 7 files changed, 373 insertions(+), 38 deletions(-) create mode 100644 tests/_utils/milvus.py create mode 100644 tests/benchmarks/test_bench_rag_standalone_pipeline.py create mode 100644 tests/benchmarks/test_bench_vdb_upload_pipeline.py diff --git a/tests/_utils/milvus.py b/tests/_utils/milvus.py new file mode 100644 index 0000000000..63c7f82af2 --- /dev/null +++ b/tests/_utils/milvus.py @@ -0,0 +1,30 @@ +# SPDX-FileCopyrightText: Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +"""Utilities for testing Morpheus with Milvus""" + +import cudf + +from morpheus.service.vdb.milvus_vector_db_service import MilvusVectorDBService + + +def populate_milvus(milvus_server_uri: str, + collection_name: str, + resource_kwargs: dict, + df: cudf.DataFrame, + overwrite: bool = False): + milvus_service = MilvusVectorDBService(uri=milvus_server_uri) + milvus_service.create(collection_name, overwrite=overwrite, **resource_kwargs) + resource_service = milvus_service.load_resource(name=collection_name) + resource_service.insert_dataframe(name=collection_name, df=df, **resource_kwargs) diff --git a/tests/benchmarks/README.md b/tests/benchmarks/README.md index 2ef44d537c..7154376afa 100644 --- a/tests/benchmarks/README.md +++ b/tests/benchmarks/README.md @@ -29,9 +29,7 @@ docker pull nvcr.io/nvidia/tritonserver:23.06-py3 ##### Start Triton Inference Server container ```bash -cd ${MORPHEUS_ROOT}/models - -docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD:/models nvcr.io/nvidia/tritonserver:23.06-py3 tritonserver --model-repository=/models/triton-model-repo --model-control-mode=explicit --load-model sid-minibert-onnx --load-model abp-nvsmi-xgb --load-model phishing-bert-onnx +docker run --gpus=all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:23.06-py3 tritonserver --model-repository=/models/triton-model-repo --model-control-mode=explicit --load-model sid-minibert-onnx --load-model abp-nvsmi-xgb --load-model phishing-bert-onnx --load-model all-MiniLM-L6-v2 ``` ##### Verify Model Deployments @@ -42,6 +40,7 @@ Once Triton server finishes starting up, it will display the status of all loade | Model | Version | Status | +--------------------+---------+--------+ | abp-nvsmi-xgb | 1 | READY | +| all-MiniLM-L6-v2 | 1 | READY | | phishing-bert-onnx | 1 | READY | | sid-minibert-onnx | 1 | READY | 
+--------------------+---------+--------+ @@ -100,17 +99,27 @@ Morpheus configurations for each workflow are managed using `e2e_test_configs.js ... ``` -Benchmarks for an individual workflow can be run using the following: - +To run all benchmarks, run the following: ```bash cd tests/benchmarks -pytest -s --run_benchmark --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave test_bench_e2e_pipelines.py:: +pytest -s --run_benchmark --run_milvus --benchmark-enable --benchmark-warmup=on --benchmark-warmup-iterations=1 --benchmark-autosave ``` + The `-s` option allows outputs of pipeline execution to be displayed so you can ensure there are no errors while running your benchmarks. -The `--benchmark-warmup` and `--benchmark-warmup-iterations` options are used to run the workflow(s) once before starting measurements. This is because the models deployed to Triton are configured to convert from ONNX to TensorRT on first use. Since the conversion can take a considerable amount of time, we don't want to include it in the measurements. +The `--benchmark-warmup` and `--benchmark-warmup-iterations` options are used to run the workflow(s) once before starting measurements. This is because the models deployed to Triton are configured to convert from ONNX to TensorRT on first use. Since the conversion can take a considerable amount of time, we don't want to include it in the measurements. The `--run_milvus` flag enables benchmarks which require the Milvus database. +#### Running with an existing Milvus database + +By default, when the `--run_milvus` flag is provided, pytest will start a new Milvus database. If you wish to use an existing Milvus database, you can set the `MORPHEUS_MILVUS_URI` environment variable.
For a local Milvus database, running on the default port, you can set the environment variable as follows: +```bash +export MORPHEUS_MILVUS_URI="http://127.0.0.1:19530" +``` + +#### test_bench_e2e_pipelines.py + +The `test_bench_e2e_pipelines.py` script contains several benchmarks within it. `` is the name of the test to run benchmarks on. This can be one of the following: - `test_sid_nlp_e2e` - `test_abp_fil_e2e` @@ -195,4 +204,4 @@ conda run -n base --live-stream conda-merge docker/conda/environments/cuda${CUDA docker/conda/environments/cuda${CUDA_VER}_examples.yml > .tmp/merged.yml \ && mamba env update -n ${CONDA_DEFAULT_ENV} --file .tmp/merged.yml -``` \ No newline at end of file +``` diff --git a/tests/benchmarks/conftest.py b/tests/benchmarks/conftest.py index f4a1bbbf7b..c65b583765 100644 --- a/tests/benchmarks/conftest.py +++ b/tests/benchmarks/conftest.py @@ -81,10 +81,11 @@ def pytest_benchmark_update_json(config, benchmarks, output_json): @pytest.fixture(name="mock_chat_completion") @pytest.mark.usefixtures() def mock_chat_completion_fixture(mock_chat_completion: mock.MagicMock): + sleep_time = float(os.environ.get("MOCK_OPENAI_REQUEST_TIME", 1.265)) async def sleep_first(*args, **kwargs): # Sleep time is based on average request time - await asyncio.sleep(1.265) + await asyncio.sleep(sleep_time) return mock.DEFAULT mock_chat_completion.acreate.side_effect = sleep_first @@ -95,11 +96,12 @@ async def sleep_first(*args, **kwargs): @pytest.mark.usefixtures("nemollm") @pytest.fixture(name="mock_nemollm") def mock_nemollm_fixture(mock_nemollm: mock.MagicMock): - # The generate function is a blocking call that returns a future when return_type="async" + sleep_time = float(os.environ.get("MOCK_NEMOLLM_REQUEST_TIME", 0.412)) + # The generate function is a blocking call that returns a future when return_type="async" async def sleep_first(fut: asyncio.Future, value: typing.Any = mock.DEFAULT): # Sleep time is based on average request time - await 
asyncio.sleep(0.412) + await asyncio.sleep(sleep_time) fut.set_result(value) def create_future(*args, **kwargs) -> asyncio.Future: diff --git a/tests/benchmarks/test_bench_completion_pipeline.py b/tests/benchmarks/test_bench_completion_pipeline.py index 59f67e9cb3..a583a07964 100644 --- a/tests/benchmarks/test_bench_completion_pipeline.py +++ b/tests/benchmarks/test_bench_completion_pipeline.py @@ -38,7 +38,7 @@ from morpheus.stages.preprocess.deserialize_stage import DeserializeStage -def _build_engine(llm_service_cls: LLMService, model_name: str = "test_model"): +def _build_engine(llm_service_cls: type[LLMService], model_name: str = "test_model"): llm_service = llm_service_cls() llm_client = llm_service.get_client(model_name=model_name) @@ -54,7 +54,7 @@ def _build_engine(llm_service_cls: LLMService, model_name: str = "test_model"): def _run_pipeline(config: Config, - llm_service_cls: LLMService, + llm_service_cls: type[LLMService], source_df: cudf.DataFrame, model_name: str = "test_model") -> dict: """ @@ -83,5 +83,5 @@ def _run_pipeline(config: Config, def test_completion_pipe(benchmark: collections.abc.Callable[[collections.abc.Callable], typing.Any], config: Config, dataset: DatasetManager, - llm_service_cls: LLMService): + llm_service_cls: type[LLMService]): benchmark(_run_pipeline, config, llm_service_cls, source_df=dataset["countries.csv"]) diff --git a/tests/benchmarks/test_bench_rag_standalone_pipeline.py b/tests/benchmarks/test_bench_rag_standalone_pipeline.py new file mode 100644 index 0000000000..a0bfa56ebb --- /dev/null +++ b/tests/benchmarks/test_bench_rag_standalone_pipeline.py @@ -0,0 +1,156 @@ +# SPDX-FileCopyrightText: Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Benchmark the examples/llm/rag/standalone_pipeline.py example""" + +import collections.abc +import os +import types +import typing + +import pytest + +import cudf + +from _utils import TEST_DIRS +from _utils.dataset_manager import DatasetManager +from _utils.milvus import populate_milvus +from morpheus.config import Config +from morpheus.config import PipelineModes +from morpheus.llm import LLMEngine +from morpheus.llm.nodes.extracter_node import ExtracterNode +from morpheus.llm.nodes.rag_node import RAGNode +from morpheus.llm.task_handlers.simple_task_handler import SimpleTaskHandler +from morpheus.messages import ControlMessage +from morpheus.pipeline.linear_pipeline import LinearPipeline +from morpheus.stages.input.in_memory_source_stage import InMemorySourceStage +from morpheus.stages.llm.llm_engine_stage import LLMEngineStage +from morpheus.stages.output.in_memory_sink_stage import InMemorySinkStage +from morpheus.stages.preprocess.deserialize_stage import DeserializeStage + +EMBEDDING_SIZE = 384 +QUESTION = "What are some new attacks discovered in the cyber security industry?" +PROMPT = """You are a helpful assistant. 
Given the following background information:\n +{% for c in contexts -%} +Title: {{ c.title }} +Summary: {{ c.summary }} +Text: {{ c.page_content }} +{% endfor %} + +Please answer the following question: \n{{ query }}""" +EXPECTED_RESPONSE = "Ransomware, Phishing, Malware, Denial of Service, SQL injection, and Password Attacks" + + +def _build_engine(llm_service_name: str, + model_name: str, + milvus_server_uri: str, + collection_name: str, + utils_mod: types.ModuleType): + engine = LLMEngine() + engine.add_node("extracter", node=ExtracterNode()) + + vector_service = utils_mod.build_milvus_service(embedding_size=EMBEDDING_SIZE, uri=milvus_server_uri) + embeddings = utils_mod.build_huggingface_embeddings("sentence-transformers/all-MiniLM-L6-v2", + model_kwargs={'device': 'cuda'}, + encode_kwargs={'batch_size': 100}) + + llm_service = utils_mod.build_llm_service(model_name=model_name, + llm_service=llm_service_name, + temperature=0.5, + tokens_to_generate=200) + + # Async wrapper around embeddings + async def calc_embeddings(texts: list[str]) -> list[list[float]]: + return embeddings.embed_documents(texts) + + engine.add_node("rag", + inputs=["/extracter"], + node=RAGNode(prompt=PROMPT, + vdb_service=vector_service.load_resource(collection_name), + embedding=calc_embeddings, + llm_client=llm_service)) + + engine.add_task_handler(inputs=["/rag"], handler=SimpleTaskHandler()) + + return engine + + +def _run_pipeline(config: Config, + llm_service_name: str, + model_name: str, + milvus_server_uri: str, + collection_name: str, + repeat_count: int, + utils_mod: types.ModuleType): + + config.mode = PipelineModes.NLP + config.edge_buffer_size = 128 + config.pipeline_batch_size = 1024 + config.model_max_batch_size = 64 + + questions = [QUESTION] * repeat_count + source_df = cudf.DataFrame({"questions": questions}) + + completion_task = {"task_type": "completion", "task_dict": {"input_keys": ["questions"], }} + pipe = LinearPipeline(config) + + 
pipe.set_source(InMemorySourceStage(config, dataframes=[source_df])) + + pipe.add_stage( + DeserializeStage(config, message_type=ControlMessage, task_type="llm_engine", task_payload=completion_task)) + + pipe.add_stage( + LLMEngineStage(config, + engine=_build_engine(llm_service_name=llm_service_name, + model_name=model_name, + milvus_server_uri=milvus_server_uri, + collection_name=collection_name, + utils_mod=utils_mod))) + pipe.add_stage(InMemorySinkStage(config)) + + pipe.run() + + +@pytest.mark.milvus +@pytest.mark.use_python +@pytest.mark.use_cudf +@pytest.mark.benchmark +@pytest.mark.import_mod(os.path.join(TEST_DIRS.examples_dir, 'llm/common/utils.py')) +@pytest.mark.usefixtures("mock_nemollm", "mock_chat_completion") +@pytest.mark.parametrize("llm_service_name", ["nemollm", "openai"]) +@pytest.mark.parametrize("repeat_count", [10, 100]) +def test_rag_standalone_pipe(benchmark: collections.abc.Callable[[collections.abc.Callable], typing.Any], + config: Config, + dataset: DatasetManager, + milvus_server_uri: str, + repeat_count: int, + import_mod: types.ModuleType, + llm_service_name: str): + collection_name = f"test_bench_rag_standalone_pipe_{llm_service_name}" + populate_milvus(milvus_server_uri=milvus_server_uri, + collection_name=collection_name, + resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), + df=dataset["service/milvus_rss_data.json"], + overwrite=True) + + benchmark( + _run_pipeline, + config=config, + llm_service_name=llm_service_name, + model_name="test_model", + milvus_server_uri=milvus_server_uri, + collection_name=collection_name, + repeat_count=repeat_count, + utils_mod=import_mod, + ) diff --git a/tests/benchmarks/test_bench_vdb_upload_pipeline.py b/tests/benchmarks/test_bench_vdb_upload_pipeline.py new file mode 100644 index 0000000000..affd8a91e3 --- /dev/null +++ b/tests/benchmarks/test_bench_vdb_upload_pipeline.py @@ -0,0 +1,141 @@ +# SPDX-FileCopyrightText: Copyright (c) 2023, NVIDIA CORPORATION & 
AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" +Benchmark the examples/llm/vdb_upload/pipeline.py example, talking with live Triton and Milvus servers, but mocked +RSS and web scraper responses.""" + +import collections.abc +import json +import os +import time +import types +import typing +from unittest import mock + +import pytest + +from _utils import TEST_DIRS +from morpheus.config import Config +from morpheus.config import PipelineModes +from morpheus.pipeline.linear_pipeline import LinearPipeline +from morpheus.stages.inference.triton_inference_stage import TritonInferenceStage +from morpheus.stages.input.rss_source_stage import RSSSourceStage +from morpheus.stages.output.write_to_vector_db_stage import WriteToVectorDBStage +from morpheus.stages.preprocess.deserialize_stage import DeserializeStage +from morpheus.stages.preprocess.preprocess_nlp_stage import PreprocessNLPStage + +EMBEDDING_SIZE = 384 +MODEL_MAX_BATCH_SIZE = 64 +MODEL_FEA_LENGTH = 512 + + +def _run_pipeline(config: Config, + milvus_server_uri: str, + collection_name: str, + rss_urls: list[str], + utils_mod: types.ModuleType, + web_scraper_stage_mod: types.ModuleType): + + config.mode = PipelineModes.NLP + config.pipeline_batch_size = 1024 + config.model_max_batch_size = MODEL_MAX_BATCH_SIZE + config.feature_length = MODEL_FEA_LENGTH + config.edge_buffer_size = 128 + config.class_labels = [str(i) for i in 
range(EMBEDDING_SIZE)] + + pipe = LinearPipeline(config) + + pipe.set_source( + RSSSourceStage(config, feed_input=rss_urls, batch_size=128, run_indefinitely=False, enable_cache=False)) + pipe.add_stage(web_scraper_stage_mod.WebScraperStage(config, chunk_size=MODEL_FEA_LENGTH, enable_cache=False)) + pipe.add_stage(DeserializeStage(config)) + + pipe.add_stage( + PreprocessNLPStage(config, + vocab_hash_file=os.path.join(TEST_DIRS.data_dir, 'bert-base-uncased-hash.txt'), + do_lower_case=True, + truncation=True, + add_special_tokens=False, + column='page_content')) + + pipe.add_stage( + TritonInferenceStage(config, + model_name='all-MiniLM-L6-v2', + server_url='localhost:8001', + force_convert_inputs=True)) + + pipe.add_stage( + WriteToVectorDBStage(config, + resource_name=collection_name, + resource_kwargs=utils_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), + recreate=True, + service="milvus", + uri=milvus_server_uri)) + pipe.run() + + +@pytest.mark.milvus +@pytest.mark.use_python +@pytest.mark.use_pandas +@pytest.mark.benchmark +@pytest.mark.import_mod([ + os.path.join(TEST_DIRS.examples_dir, 'llm/common/utils.py'), + os.path.join(TEST_DIRS.examples_dir, 'llm/common/web_scraper_stage.py'), +]) +@mock.patch('feedparser.http.get') +@mock.patch('requests.Session') +def test_vdb_upload_pipe(mock_requests_session: mock.MagicMock, + mock_feedparser_http_get: mock.MagicMock, + benchmark: collections.abc.Callable[[collections.abc.Callable], typing.Any], + config: Config, + milvus_server_uri: str, + import_mod: list[types.ModuleType]): + + with open(os.path.join(TEST_DIRS.tests_data_dir, 'service/cisa_web_responses.json'), encoding='utf-8') as fh: + web_responses = json.load(fh) + + mock_web_scraper_request_time = float(os.environ.get("MOCK_WEB_SCRAPER_REQUEST_TIME", 0.5)) + + def mock_get_fn(url: str): + mock_response = mock.MagicMock() + mock_response.ok = True + mock_response.status_code = 200 + mock_response.text = web_responses[url] + 
time.sleep(mock_web_scraper_request_time) + return mock_response + + mock_requests_session.return_value = mock_requests_session + mock_requests_session.get.side_effect = mock_get_fn + + mock_feedparser_request_time = float(os.environ.get("MOCK_FEEDPARSER_REQUEST_TIME", 0.5)) + + def mock_feedparser_http_get_fn(*args, **kwargs): # pylint: disable=unused-argument + time.sleep(mock_feedparser_request_time) + # The RSS Parser expects a bytes string + with open(os.path.join(TEST_DIRS.tests_data_dir, 'service/cisa_rss_feed.xml'), 'rb') as fh: + return fh.read() + + mock_feedparser_http_get.side_effect = mock_feedparser_http_get_fn + + (utils_mod, web_scraper_stage_mod) = import_mod + collection_name = "test_bench_vdb_upload_pipeline" + + benchmark(_run_pipeline, + config=config, + milvus_server_uri=milvus_server_uri, + collection_name=collection_name, + rss_urls=["https://www.us-cert.gov/ncas/current-activity.xml"], + utils_mod=utils_mod, + web_scraper_stage_mod=web_scraper_stage_mod) diff --git a/tests/llm/test_rag_standalone_pipe.py b/tests/llm/test_rag_standalone_pipe.py index cd44e0a152..a98c9e1c1a 100644 --- a/tests/llm/test_rag_standalone_pipe.py +++ b/tests/llm/test_rag_standalone_pipe.py @@ -25,6 +25,7 @@ from _utils import TEST_DIRS from _utils import assert_results from _utils.dataset_manager import DatasetManager +from _utils.milvus import populate_milvus from morpheus.config import Config from morpheus.config import PipelineModes from morpheus.llm import LLMEngine @@ -33,7 +34,6 @@ from morpheus.llm.task_handlers.simple_task_handler import SimpleTaskHandler from morpheus.messages import ControlMessage from morpheus.pipeline.linear_pipeline import LinearPipeline -from morpheus.service.vdb.milvus_vector_db_service import MilvusVectorDBService from morpheus.stages.input.in_memory_source_stage import InMemorySourceStage from morpheus.stages.llm.llm_engine_stage import LLMEngineStage from morpheus.stages.output.compare_dataframe_stage import CompareDataFrameStage 
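The hunks that follow switch these tests from a local `_populate_milvus` helper to the shared `populate_milvus` utility, passing `overwrite=True` explicitly. Its create-then-insert shape can be illustrated with a self-contained stand-in that substitutes a mock for the live Milvus service (the collection name and kwargs below are illustrative):

```python
from unittest import mock


def populate_collection(service, collection_name: str, df, resource_kwargs: dict, overwrite: bool = False):
    # Mirrors the shared helper's shape: create the collection first
    # (optionally replacing an existing one), then insert the dataframe.
    service.create(collection_name, overwrite=overwrite, **resource_kwargs)
    resource = service.load_resource(name=collection_name)
    resource.insert_dataframe(name=collection_name, df=df, **resource_kwargs)


# A MagicMock stands in for the vector DB service, so the call shape can
# be checked without a running Milvus server.
service = mock.MagicMock()
populate_collection(service, "demo_collection", df=[{"id": 0}], resource_kwargs={"embedding_size": 384}, overwrite=True)
service.create.assert_called_once_with("demo_collection", overwrite=True, embedding_size=384)
service.load_resource.assert_called_once_with(name="demo_collection")
```

Making `overwrite` an explicit parameter lets each test decide whether re-runs should recreate the collection, rather than baking that choice into the helper.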
@@ -52,13 +52,6 @@ EXPECTED_RESPONSE = "Ransomware, Phishing, Malware, Denial of Service, SQL injection, and Password Attacks" -def _populate_milvus(milvus_server_uri: str, collection_name: str, resource_kwargs: dict, df: cudf.DataFrame): - milvus_service = MilvusVectorDBService(uri=milvus_server_uri) - milvus_service.create(collection_name, **resource_kwargs) - resource_service = milvus_service.load_resource(name=collection_name) - resource_service.insert_dataframe(name=collection_name, df=df, **resource_kwargs) - - def _build_engine(llm_service_name: str, model_name: str, milvus_server_uri: str, @@ -150,10 +143,11 @@ def test_rag_standalone_pipe_nemo( repeat_count: int, import_mod: types.ModuleType): collection_name = "test_rag_standalone_pipe_nemo" - _populate_milvus(milvus_server_uri=milvus_server_uri, - collection_name=collection_name, - resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), - df=dataset["service/milvus_rss_data.json"]) + populate_milvus(milvus_server_uri=milvus_server_uri, + collection_name=collection_name, + resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), + df=dataset["service/milvus_rss_data.json"], + overwrite=True) mock_asyncio_gather.return_value = [mock.MagicMock() for _ in range(repeat_count)] mock_nemollm.post_process_generate_response.side_effect = [{"text": EXPECTED_RESPONSE} for _ in range(repeat_count)] results = _run_pipeline( @@ -189,10 +183,11 @@ def test_rag_standalone_pipe_openai(config: Config, } for _ in range(repeat_count)] collection_name = "test_rag_standalone_pipe_openai" - _populate_milvus(milvus_server_uri=milvus_server_uri, - collection_name=collection_name, - resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), - df=dataset["service/milvus_rss_data.json"]) + populate_milvus(milvus_server_uri=milvus_server_uri, + collection_name=collection_name, + resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), + 
df=dataset["service/milvus_rss_data.json"], + overwrite=True) results = _run_pipeline( config=config, @@ -219,10 +214,11 @@ def test_rag_standalone_pipe_integration_nemo(config: Config, repeat_count: int, import_mod: types.ModuleType): collection_name = "test_rag_standalone_pipe__integration_nemo" - _populate_milvus(milvus_server_uri=milvus_server_uri, - collection_name=collection_name, - resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), - df=dataset["service/milvus_rss_data.json"]) + populate_milvus(milvus_server_uri=milvus_server_uri, + collection_name=collection_name, + resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), + df=dataset["service/milvus_rss_data.json"], + overwrite=True) results = _run_pipeline( config=config, llm_service_name="nemollm", @@ -251,10 +247,11 @@ def test_rag_standalone_pipe_integration_openai(config: Config, repeat_count: int, import_mod: types.ModuleType): collection_name = "test_rag_standalone_pipe_integration_openai" - _populate_milvus(milvus_server_uri=milvus_server_uri, - collection_name=collection_name, - resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), - df=dataset["service/milvus_rss_data.json"]) + populate_milvus(milvus_server_uri=milvus_server_uri, + collection_name=collection_name, + resource_kwargs=import_mod.build_milvus_config(embedding_size=EMBEDDING_SIZE), + df=dataset["service/milvus_rss_data.json"], + overwrite=True) results = _run_pipeline( config=config, From fbc68def01ddd68bf53184421422ecd8ae864475 Mon Sep 17 00:00:00 2001 From: David Gardner <96306125+dagardner-nv@users.noreply.github.com> Date: Wed, 13 Dec 2023 20:51:22 -0800 Subject: [PATCH 4/7] Add mocked test & benchmark for LLM agents pipeline (#1424) * Add a test for the LLM agents pipeline that doesn't communicate with OpenAI or SerpAPI. * Add benchmark for LLM agents pipeline. 
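The mocking strategy this PR describes — replacing the live API call with an async stub that sleeps for a representative latency and then returns a canned value — can be sketched in isolation (the timing constant and response payload here are illustrative):

```python
import asyncio
import time
from unittest import mock

# Illustrative latency; the real tests read this from an environment variable.
MOCK_REQUEST_TIME = 0.05


async def sleep_first(*args, **kwargs):
    # Simulate the average latency of a real request, then fall through
    # to the mock's configured return_value via the DEFAULT sentinel.
    await asyncio.sleep(MOCK_REQUEST_TIME)
    return mock.DEFAULT


client = mock.AsyncMock()
client.acreate.return_value = {"choices": [{"text": "canned response"}]}
client.acreate.side_effect = sleep_first


async def ask(prompt: str):
    start = time.monotonic()
    response = await client.acreate(prompt=prompt)
    return response, time.monotonic() - start
```

Because the stub returns `mock.DEFAULT`, the mock's `return_value` is what the caller receives, so the benchmark exercises the full pipeline path with realistic timing but no network traffic.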
Includes changes from PR #1421 ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md). - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. Authors: - David Gardner (https://github.com/dagardner-nv) Approvers: - Michael Demoret (https://github.com/mdemoret-nv) URL: https://github.com/nv-morpheus/Morpheus/pull/1424 --- tests/benchmarks/conftest.py | 36 +++- .../test_bench_agents_simple_pipeline.py | 174 ++++++++++++++++++ .../test_bench_vdb_upload_pipeline.py | 6 +- tests/llm/test_agents_simple_pipe.py | 122 ++++++++++-- 4 files changed, 308 insertions(+), 30 deletions(-) create mode 100644 tests/benchmarks/test_bench_agents_simple_pipeline.py diff --git a/tests/benchmarks/conftest.py b/tests/benchmarks/conftest.py index c65b583765..ff83e4b9a3 100644 --- a/tests/benchmarks/conftest.py +++ b/tests/benchmarks/conftest.py @@ -77,15 +77,38 @@ def pytest_benchmark_update_json(config, benchmarks, output_json): bench['stats']['median_throughput_bytes'] = (byte_count * repeat) / bench['stats']['median'] +@pytest.fixture(name="mock_openai_request_time") +def mock_openai_request_time_fixture(): + return float(os.environ.get("MOCK_OPENAI_REQUEST_TIME", 1.265)) + + +@pytest.fixture(name="mock_nemollm_request_time") +def mock_nemollm_request_time_fixture(): + return float(os.environ.get("MOCK_NEMOLLM_REQUEST_TIME", 0.412)) + + +@pytest.fixture(name="mock_web_scraper_request_time") +def mock_web_scraper_request_time_fixture(): + return float(os.environ.get("MOCK_WEB_SCRAPER_REQUEST_TIME", 0.5)) + + +@pytest.fixture(name="mock_feedparser_request_time") +def mock_feedparser_request_time_fixture(): + return float(os.environ.get("MOCK_FEEDPARSER_REQUEST_TIME", 0.5)) + + +@pytest.fixture(name="mock_serpapi_request_time") +def 
mock_serpapi_request_time_fixture(): + return float(os.environ.get("MOCK_SERPAPI_REQUEST_TIME", 1.7)) + + @pytest.mark.usefixtures("openai") @pytest.fixture(name="mock_chat_completion") -@pytest.mark.usefixtures() -def mock_chat_completion_fixture(mock_chat_completion: mock.MagicMock): - sleep_time = float(os.environ.get("MOCK_OPENAI_REQUEST_TIME", 1.265)) +def mock_chat_completion_fixture(mock_chat_completion: mock.MagicMock, mock_openai_request_time: float): async def sleep_first(*args, **kwargs): # Sleep time is based on average request time - await asyncio.sleep(sleep_time) + await asyncio.sleep(mock_openai_request_time) return mock.DEFAULT mock_chat_completion.acreate.side_effect = sleep_first @@ -95,13 +118,12 @@ async def sleep_first(*args, **kwargs): @pytest.mark.usefixtures("nemollm") @pytest.fixture(name="mock_nemollm") -def mock_nemollm_fixture(mock_nemollm: mock.MagicMock): - sleep_time = float(os.environ.get("MOCK_NEMOLLM_REQUEST_TIME", 0.412)) +def mock_nemollm_fixture(mock_nemollm: mock.MagicMock, mock_nemollm_request_time: float): # The generate function is a blocking call that returns a future when return_type="async" async def sleep_first(fut: asyncio.Future, value: typing.Any = mock.DEFAULT): # Sleep time is based on average request time - await asyncio.sleep(sleep_time) + await asyncio.sleep(mock_nemollm_request_time) fut.set_result(value) def create_future(*args, **kwargs) -> asyncio.Future: diff --git a/tests/benchmarks/test_bench_agents_simple_pipeline.py b/tests/benchmarks/test_bench_agents_simple_pipeline.py new file mode 100644 index 0000000000..a59b4e594c --- /dev/null +++ b/tests/benchmarks/test_bench_agents_simple_pipeline.py @@ -0,0 +1,174 @@ +# SPDX-FileCopyrightText: Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import asyncio +import collections.abc +import os +import typing +from unittest import mock + +import langchain +import pytest +from langchain.agents import AgentType +from langchain.agents import initialize_agent +from langchain.agents import load_tools +from langchain.agents.tools import Tool +from langchain.utilities import serpapi + +import cudf + +from morpheus.config import Config +from morpheus.llm import LLMEngine +from morpheus.llm.nodes.extracter_node import ExtracterNode +from morpheus.llm.nodes.langchain_agent_node import LangChainAgentNode +from morpheus.llm.task_handlers.simple_task_handler import SimpleTaskHandler +from morpheus.messages import ControlMessage +from morpheus.pipeline.linear_pipeline import LinearPipeline +from morpheus.stages.input.in_memory_source_stage import InMemorySourceStage +from morpheus.stages.llm.llm_engine_stage import LLMEngineStage +from morpheus.stages.output.in_memory_sink_stage import InMemorySinkStage +from morpheus.stages.preprocess.deserialize_stage import DeserializeStage + + +def _build_agent_executor(model_name: str): + + llm = langchain.OpenAI(model=model_name, temperature=0, cache=False) + + # Explicitly construct the serpapi tool, loading it via load_tools makes it too difficult to mock + tools = [ + Tool( + name="Search", + description="", + func=serpapi.SerpAPIWrapper().run, + coroutine=serpapi.SerpAPIWrapper().arun, + ) + ] + tools.extend(load_tools(["llm-math"], llm=llm)) + + agent_executor = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True) + + return 
agent_executor + + +def _build_engine(model_name: str): + + engine = LLMEngine() + + engine.add_node("extracter", node=ExtracterNode()) + + engine.add_node("agent", + inputs=[("/extracter")], + node=LangChainAgentNode(agent_executor=_build_agent_executor(model_name=model_name))) + + engine.add_task_handler(inputs=["/agent"], handler=SimpleTaskHandler()) + + return engine + + +def _run_pipeline(config: Config, source_dfs: list[cudf.DataFrame], model_name: str = "test_model"): + completion_task = {"task_type": "completion", "task_dict": {"input_keys": ["questions"]}} + + pipe = LinearPipeline(config) + + pipe.set_source(InMemorySourceStage(config, dataframes=source_dfs)) + + pipe.add_stage( + DeserializeStage(config, message_type=ControlMessage, task_type="llm_engine", task_payload=completion_task)) + + pipe.add_stage(LLMEngineStage(config, engine=_build_engine(model_name=model_name))) + + pipe.add_stage(InMemorySinkStage(config)) + + pipe.run() + + +@pytest.mark.usefixtures("openai", "restore_environ") +@pytest.mark.use_python +@pytest.mark.benchmark +@mock.patch("langchain.utilities.serpapi.SerpAPIWrapper.aresults") +@mock.patch("langchain.OpenAI._agenerate", autospec=True) # autospec is needed as langchain will inspect the function +def test_agents_simple_pipe(mock_openai_agenerate: mock.AsyncMock, + mock_serpapi_aresults: mock.AsyncMock, + mock_openai_request_time: float, + mock_serpapi_request_time: float, + benchmark: collections.abc.Callable[[collections.abc.Callable], typing.Any], + config: Config): + os.environ.update({'OPENAI_API_KEY': 'test_api_key', 'SERPAPI_API_KEY': 'test_api_key'}) + + from langchain.schema import Generation + from langchain.schema import LLMResult + + assert serpapi.SerpAPIWrapper().aresults is mock_serpapi_aresults + + model_name = "test_model" + + mock_responses = [ + LLMResult(generations=[[ + Generation(text="I should use a search engine to find information about unittests.\n" + "Action: Search\nAction Input: \"unittests\"", + 
generation_info={ + 'finish_reason': 'stop', 'logprobs': None + }) + ]], + llm_output={ + 'token_usage': {}, 'model_name': model_name + }), + LLMResult(generations=[[ + Generation(text="I now know the final answer.\nFinal Answer: 3.99.", + generation_info={ + 'finish_reason': 'stop', 'logprobs': None + }) + ]], + llm_output={ + 'token_usage': {}, 'model_name': model_name + }) + ] + + async def _mock_openai_agenerate(self, *args, **kwargs): # pylint: disable=unused-argument + nonlocal mock_responses + call_count = getattr(self, '_unittest_call_count', 0) + response = mock_responses[call_count % 2] + + # The OpenAI object will raise a ValueError if we attempt to set the attribute directly or use setattr + self.__dict__['_unittest_call_count'] = call_count + 1 + await asyncio.sleep(mock_openai_request_time) + return response + + mock_openai_agenerate.side_effect = _mock_openai_agenerate + + async def _mock_serpapi_aresults(*args, **kwargs): # pylint: disable=unused-argument + await asyncio.sleep(mock_serpapi_request_time) + return { + 'answer_box': { + 'answer': '25 years', 'link': 'http://unit.test', 'people_also_search_for': [] + }, + 'inline_people_also_search_for': [], + 'knowledge_graph': {}, + 'organic_results': [], + 'pagination': {}, + 'related_questions': [], + 'related_searches': [], + 'search_information': {}, + 'search_metadata': {}, + 'search_parameters': {}, + 'serpapi_pagination': None + } + + mock_serpapi_aresults.side_effect = _mock_serpapi_aresults + + source_df = cudf.DataFrame( + {"questions": ["Who is Leo DiCaprio's girlfriend? 
What is her current age raised to the 0.43 power?"]}) + + benchmark(_run_pipeline, config, source_dfs=[source_df], model_name=model_name) diff --git a/tests/benchmarks/test_bench_vdb_upload_pipeline.py b/tests/benchmarks/test_bench_vdb_upload_pipeline.py index affd8a91e3..9ebbe0de80 100644 --- a/tests/benchmarks/test_bench_vdb_upload_pipeline.py +++ b/tests/benchmarks/test_bench_vdb_upload_pipeline.py @@ -98,6 +98,8 @@ def _run_pipeline(config: Config, @mock.patch('requests.Session') def test_vdb_upload_pipe(mock_requests_session: mock.MagicMock, mock_feedparser_http_get: mock.MagicMock, + mock_web_scraper_request_time: float, + mock_feedparser_request_time: float, benchmark: collections.abc.Callable[[collections.abc.Callable], typing.Any], config: Config, milvus_server_uri: str, @@ -106,8 +108,6 @@ def test_vdb_upload_pipe(mock_requests_session: mock.MagicMock, with open(os.path.join(TEST_DIRS.tests_data_dir, 'service/cisa_web_responses.json'), encoding='utf-8') as fh: web_responses = json.load(fh) - mock_web_scraper_request_time = float(os.environ.get("MOCK_WEB_SCRAPER_REQUEST_TIME", 0.5)) - def mock_get_fn(url: str): mock_response = mock.MagicMock() mock_response.ok = True @@ -119,8 +119,6 @@ def mock_get_fn(url: str): mock_requests_session.return_value = mock_requests_session mock_requests_session.get.side_effect = mock_get_fn - mock_feedparser_request_time = float(os.environ.get("MOCK_FEEDPARSER_REQUEST_TIME", 0.5)) - def mock_feedparser_http_get_fn(*args, **kwargs): # pylint: disable=unused-argument time.sleep(mock_feedparser_request_time) # The RSS Parser expects a bytes string diff --git a/tests/llm/test_agents_simple_pipe.py b/tests/llm/test_agents_simple_pipe.py index c271a54e45..5e3d1d2223 100644 --- a/tests/llm/test_agents_simple_pipe.py +++ b/tests/llm/test_agents_simple_pipe.py @@ -13,17 +13,21 @@ # See the License for the specific language governing permissions and # limitations under the License. 
+import os
 import re
+from unittest import mock

-import pandas as pd
+import langchain
 import pytest
-from langchain import OpenAI
 from langchain.agents import AgentType
 from langchain.agents import initialize_agent
 from langchain.agents import load_tools
+from langchain.agents.tools import Tool
+from langchain.utilities import serpapi

 import cudf

+from _utils import assert_results
 from morpheus.config import Config
 from morpheus.llm import LLMEngine
 from morpheus.llm.nodes.extracter_node import ExtracterNode
@@ -33,16 +37,31 @@
 from morpheus.pipeline.linear_pipeline import LinearPipeline
 from morpheus.stages.input.in_memory_source_stage import InMemorySourceStage
 from morpheus.stages.llm.llm_engine_stage import LLMEngineStage
+from morpheus.stages.output.compare_dataframe_stage import CompareDataFrameStage
 from morpheus.stages.output.in_memory_sink_stage import InMemorySinkStage
 from morpheus.stages.preprocess.deserialize_stage import DeserializeStage
 from morpheus.utils.concat_df import concat_dataframes


+@pytest.fixture(name="questions")
+def questions_fixture():
+    return ["Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"]
+
+
 def _build_agent_executor(model_name: str):
-    llm = OpenAI(model=model_name, temperature=0)
+    llm = langchain.OpenAI(model=model_name, temperature=0, cache=False)

-    tools = load_tools(["serpapi", "llm-math"], llm=llm)
+    # Explicitly construct the serpapi tool, loading it via load_tools makes it too difficult to mock
+    tools = [
+        Tool(
+            name="Search",
+            description="",
+            func=serpapi.SerpAPIWrapper().run,
+            coroutine=serpapi.SerpAPIWrapper().arun,
+        )
+    ]
+    tools.extend(load_tools(["llm-math"], llm=llm))

     agent_executor = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
@@ -64,10 +83,10 @@ def _build_engine(model_name: str):
     return engine


-def _run_pipeline(config: Config, questions: list[str], model_name: str = "test_model") -> pd.DataFrame:
-    """
-    Loosely patterned after `examples/llm/completion`
-    """
+def _run_pipeline(config: Config,
+                  questions: list[str],
+                  model_name: str = "test_model",
+                  expected_df: cudf.DataFrame = None) -> InMemorySinkStage:
     source_df = cudf.DataFrame({"questions": questions})

     completion_task = {"task_type": "completion", "task_dict": {"input_keys": ["questions"]}}
@@ -80,27 +99,92 @@ def _run_pipeline(config: Config, questions: list[str], model_name: str = "test_
         DeserializeStage(config, message_type=ControlMessage, task_type="llm_engine", task_payload=completion_task))

     pipe.add_stage(LLMEngineStage(config, engine=_build_engine(model_name=model_name)))
-    sink = pipe.add_stage(InMemorySinkStage(config))
-    pipe.run()
+    if expected_df is not None:
+        sink = pipe.add_stage(CompareDataFrameStage(config, compare_df=expected_df))
+    else:
+        sink = pipe.add_stage(InMemorySinkStage(config))

-    result_df = concat_dataframes(sink.get_messages())
+    pipe.run()

-    return result_df
+    return sink


-@pytest.mark.usefixtures("openai")
-@pytest.mark.usefixtures("openai_api_key")
-@pytest.mark.usefixtures("serpapi_api_key")
+@pytest.mark.usefixtures("openai", "openai_api_key", "serpapi_api_key")
 @pytest.mark.use_python
-def test_agents_simple_pipe_integration_openai(config: Config):
-    questions = ["Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"]
-    result_df = _run_pipeline(config, questions=questions, model_name="gpt-3.5-turbo-instruct")
+def test_agents_simple_pipe_integration_openai(config: Config, questions: list[str]):
+    sink = _run_pipeline(config, questions=questions, model_name="gpt-3.5-turbo-instruct")
+    result_df = concat_dataframes(sink.get_messages())

     assert len(result_df.columns) == 2
-    assert any(result_df.columns == ["questions", "response"])
+    assert sorted(result_df.columns) == ["questions", "response"]
     response_txt = result_df.response.iloc[0]
     response_match = re.match(r".*(\d+\.\d+)\.?$", response_txt)
     assert response_match is not None
     assert float(response_match.group(1)) >= 3.7
+
+
+@pytest.mark.usefixtures("openai", "restore_environ")
+@pytest.mark.use_python
+@mock.patch("langchain.utilities.serpapi.SerpAPIWrapper.aresults")
+@mock.patch("langchain.OpenAI._agenerate", autospec=True)  # autospec is needed as langchain will inspect the function
+def test_agents_simple_pipe(mock_openai_agenerate: mock.AsyncMock,
+                            mock_serpapi_aresults: mock.AsyncMock,
+                            config: Config,
+                            questions: list[str]):
+    os.environ.update({'OPENAI_API_KEY': 'test_api_key', 'SERPAPI_API_KEY': 'test_api_key'})
+
+    from langchain.schema import Generation
+    from langchain.schema import LLMResult
+
+    assert serpapi.SerpAPIWrapper().aresults is mock_serpapi_aresults
+
+    model_name = "test_model"
+
+    mock_openai_agenerate.side_effect = [
+        LLMResult(generations=[[
+            Generation(text="I should use a search engine to find information about unittests.\n"
+                       "Action: Search\nAction Input: \"unittests\"",
+                       generation_info={
+                           'finish_reason': 'stop', 'logprobs': None
+                       })
+        ]],
+                  llm_output={
+                      'token_usage': {}, 'model_name': model_name
+                  }),
+        LLMResult(generations=[[
+            Generation(text="I now know the final answer.\nFinal Answer: 3.99.",
+                       generation_info={
+                           'finish_reason': 'stop', 'logprobs': None
+                       })
+        ]],
+                  llm_output={
+                      'token_usage': {}, 'model_name': model_name
+                  })
+    ]
+
+    mock_serpapi_aresults.return_value = {
+        'answer_box': {
+            'answer': '25 years', 'link': 'http://unit.test', 'people_also_search_for': []
+        },
+        'inline_people_also_search_for': [],
+        'knowledge_graph': {},
+        'organic_results': [],
+        'pagination': {},
+        'related_questions': [],
+        'related_searches': [],
+        'search_information': {},
+        'search_metadata': {},
+        'search_parameters': {},
+        'serpapi_pagination': None
+    }
+
+    expected_df = cudf.DataFrame({'questions': questions, 'response': ["3.99."]})
+
+    sink = _run_pipeline(config, questions=questions, model_name=model_name, expected_df=expected_df)
+
+    assert len(mock_openai_agenerate.mock_calls) == 2
+    mock_serpapi_aresults.assert_awaited_once()
+
+    assert_results(sink.get_results())

From 006e3c51c314caed3855fa5214fa2bf8658a8ce9 Mon Sep 17 00:00:00 2001
From: Devin Robison
Date: Thu, 14 Dec 2023 10:21:23 -0700
Subject: [PATCH 5/7] Update phishing-model-card.md (#1437)

Requested updates on behalf of Michael Boone.

Authors:
  - Devin Robison (https://github.com/drobison00)
  - https://github.com/HesAnEasyCoder

Approvers:
  - https://github.com/raykallen
  - https://github.com/shawn-davis

URL: https://github.com/nv-morpheus/Morpheus/pull/1437
---
 models/model-cards/phishing-model-card.md | 54 ++++++-----------------
 1 file changed, 13 insertions(+), 41 deletions(-)

diff --git a/models/model-cards/phishing-model-card.md b/models/model-cards/phishing-model-card.md
index 914fee31a2..fc08de4528 100644
--- a/models/model-cards/phishing-model-card.md
+++ b/models/model-cards/phishing-model-card.md
@@ -21,7 +21,7 @@ limitations under the License.
# Model Overview

## Description:
-* Phishing detection is a binary classifier differentiating between phishing/spam and benign emails and SMS messages.
+* Phishing detection is a binary classifier differentiating between phishing/spam and benign emails and SMS messages. This model is for demonstration purposes and not for production usage.
## References(s):
* https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection
@@ -114,24 +114,15 @@ limitations under the License.
**Test Hardware:**
-* Other
+* DGX (V100)
+
+## Ethical Considerations:
+NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards below. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

# Subcards

## Model Card ++ Bias Subcard

-### What is the gender balance of the model validation data?
-
-* Not Applicable
-
-### What is the racial/ethnicity balance of the model validation data?
-
-* Not Applicable
-
-### What is the age balance of the model validation data?
-
-* Not Applicable
-
### What is the language balance of the model validation data?

* English
@@ -140,26 +131,6 @@ limitations under the License.

* UK

-### What is the educational background balance of the model validation data?
-
-* Not Applicable
-
-### What is the accent balance of the model validation data?
-
-* Not Applicable
-
-### What is the face/key point balance of the model validation data?
-
-* Not Applicable
-
-### What is the skin/tone balance of the model validation data?
-
-* Not Applicable
-
-### What is the religion balance of the model validation data?
-
-* Not Applicable
-
### Individuals from the following adversely impacted (protected classes) groups participate in model design and testing.

* Not Applicable
@@ -193,6 +164,10 @@ limitations under the License.

### List the technical limitations of the model.

* For different email/SMS types and content, different models need to be trained.
+### Has this been verified to have met prescribed NVIDIA standards?
+
+* Yes
+
### What performance metrics were used to affirm the model's performance?

* F1
@@ -210,7 +185,7 @@ limitations under the License.

### Is the model used in an application with physical safety impact?
* No

-### Describe physical safety impact (if present).
+### Describe life-critical impact (if present).
* None

### Was model and dataset assessed for vulnerability for potential form of attack?
@@ -223,9 +198,6 @@ limitations under the License.

### Name use case restrictions for the model.
* This pretrained model's use case is restricted to testing the Morpheus pipeline and may not be suitable for other applications.

-### Has this been verified to have met prescribed quality standards?
-* No
-
### Name target quality Key Performance Indicators (KPIs) for which this has been tested.
* N/A
@@ -246,12 +218,12 @@ limitations under the License.

### Generatable or reverse engineerable personally-identifiable information (PII)?

-* Neither
+* None

-### Was consent obtained for any PII used?
+### Protected classes used to create this model? (The following were used in model the model's training:)
* N/A

-### Protected classes used to create this model? (The following were used in model the model's training:)
+### Was consent obtained for any PII used?
* N/A

### How often is dataset reviewed?

From 1afe6685ecf4139e30b8ebe9b507d599ecca79e5 Mon Sep 17 00:00:00 2001
From: Devin Robison
Date: Thu, 14 Dec 2023 10:21:45 -0700
Subject: [PATCH 6/7] Update gnn-fsi-model-card.md (#1438)

Requested updates on behalf of Michael Boone.
Authors:
  - Devin Robison (https://github.com/drobison00)
  - https://github.com/HesAnEasyCoder

Approvers:
  - https://github.com/raykallen

URL: https://github.com/nv-morpheus/Morpheus/pull/1438
---
 models/model-cards/gnn-fsi-model-card.md | 46 +++++++++---------------
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/models/model-cards/gnn-fsi-model-card.md b/models/model-cards/gnn-fsi-model-card.md
index 66a22fb706..87f72340c7 100644
--- a/models/model-cards/gnn-fsi-model-card.md
+++ b/models/model-cards/gnn-fsi-model-card.md
@@ -18,7 +18,7 @@ limitations under the License.
# Model Overview

### Description:
-* This model shows an application of a graph neural network for fraud detection in a credit card transaction graph. A transaction dataset that includes three types of nodes, transaction, client, and merchant nodes is used for modeling. A combination of `GraphSAGE` along with `XGBoost` is used to identify frauds in the transaction networks.
+* This model shows an application of a graph neural network for fraud detection in a credit card transaction graph. A transaction dataset that includes three types of nodes, transaction, client, and merchant nodes is used for modeling. A combination of `GraphSAGE` along with `XGBoost` is used to identify frauds in the transaction networks. This model is for demonstration purposes and not for production usage.
## References(s):
1. https://stellargraph.readthedocs.io/en/stable/hinsage.html?highlight=hinsage
@@ -93,30 +93,15 @@ This model is an example of a fraud detection pipeline using a graph neural netw
* Triton
**Test Hardware:**
-* Other
+* DGX (V100)
+
+## Ethical Considerations:
+NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards below. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

# Subcards

## Model Card ++ Bias Subcard

-### What is the gender balance of the model validation data?
-* Not Applicable
-
-### What is the racial/ethnicity balance of the model validation data?
-* Not Applicable
-
-### What is the age balance of the model validation data?
-* Not Applicable
-
-### What is the language balance of the model validation data?
-* Not Applicable
-
-### What is the geographic origin language balance of the model validation data?
-* Not Applicable
-
-### What is the educational background balance of the model validation data?
-* Not Applicable
-
-### What is the accent balance of the model validation data?
+### Individuals from the following adversely impacted (protected classes) groups participate in model design and testing.
* Not Applicable

### Describe measures taken to mitigate against unwanted bias.
@@ -145,6 +130,10 @@ This model is an example of a fraud detection pipeline using a graph neural netw

### List the technical limitations of the model.
* This model version requires a transactional data schema with entities (user, merchant, transaction) as requirement for the model.

+### Has this been verified to have met prescribed NVIDIA standards?
+
+* Yes
+
### What performance metrics were used to affirm the model's performance?
* Area under ROC curve and Accuracy
@@ -162,7 +151,7 @@ This model is an example of a fraud detection pipeline using a graph neural netw

### Is the model used in an application with physical safety impact?
* No

-### Describe physical safety impact (if present).
+### Describe life-critical impact (if present).
* Not Applicable

### Was model and dataset assessed for vulnerability for potential form of attack?
@@ -174,9 +163,6 @@ This model is an example of a fraud detection pipeline using a graph neural netw

### Name use case restrictions for the model.
* The model's use case is restricted to testing the Morpheus pipeline and may not be suitable for other applications.

-### Has this been verified to have met prescribed quality standards?
-* No
-
### Name target quality Key Performance Indicators (KPIs) for which this has been tested.
* Not Applicable
@@ -195,14 +181,14 @@ This model is an example of a fraud detection pipeline using a graph neural netw

## Model Card ++ Privacy Subcard

### Generatable or reverse engineerable personally-identifiable information (PII)?
-* Neither
-
-### Was consent obtained for any PII used?
-* Not Applicable (Data is extracted from synthetically created credit card transaction, refer [3] for the source of data creation)
+* None

### Protected classes used to create this model? (The following were used in model the model's training:)
* Not applicable

+### Was consent obtained for any PII used?
+* Not Applicable (Data is extracted from synthetically created credit card transaction, refer [3] for the source of data creation)
+
### How often is dataset reviewed?
* The dataset is initially reviewed upon addition, and subsequent reviews are conducted as needed or upon request for any changes.
@@ -222,4 +208,4 @@ This model is an example of a fraud detection pipeline using a graph neural netw
* Not applicable

### Is data compliant with data subject requests for data correction or removal, if such a request was made?
-* Not applicable
\ No newline at end of file
+* Not applicable

From 9f951a841fe13ad6088505ca4e83360137b367a4 Mon Sep 17 00:00:00 2001
From: Devin Robison
Date: Thu, 14 Dec 2023 10:22:46 -0700
Subject: [PATCH 7/7] Update abp-model-card.md (#1439)

Requested updates on behalf of Michael Boone

Authors:
  - Devin Robison (https://github.com/drobison00)
  - https://github.com/HesAnEasyCoder

Approvers:
  - https://github.com/raykallen

URL: https://github.com/nv-morpheus/Morpheus/pull/1439
---
 models/model-cards/abp-model-card.md | 58 +++++-----------------
 1 file changed, 10 insertions(+), 48 deletions(-)

diff --git a/models/model-cards/abp-model-card.md b/models/model-cards/abp-model-card.md
index bcca363bbb..c69fc62329 100644
--- a/models/model-cards/abp-model-card.md
+++ b/models/model-cards/abp-model-card.md
@@ -22,7 +22,7 @@ limitations under the License.
## Description:

-* This model is an example of a binary XGBoost classifier to differentiate between anomalous GPU behavior, such as crypto mining / GPU malware, and non-anomalous GPU-based workflows (e.g., ML/DL training). The model is an XGBoost model.
+* This model is an example of a binary XGBoost classifier to differentiate between anomalous GPU behavior, such as crypto mining / GPU malware, and non-anomalous GPU-based workflows (e.g., ML/DL training). This model is for demonstration purposes and not for production usage.
## References(s):
@@ -112,52 +112,15 @@ limitations under the License.
**Test Hardware:**
-* Other
+* DGX (V100)
+
+## Ethical Considerations:
+NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards below. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

# Subcards

## Model Card ++ Bias Subcard

-### What is the gender balance of the model validation data?
-
-* Not Applicable
-
-### What is the racial/ethnicity balance of the model validation data?
-
-* Not Applicable
-
-### What is the age balance of the model validation data?
-
-* Not Applicable
-
-### What is the language balance of the model validation data?
-
-* Not Applicable
-
-### What is the geographic origin language balance of the model validation data?
-
-* Not Applicable
-
-### What is the educational background balance of the model validation data?
-
-* Not Applicable
-
-### What is the accent balance of the model validation data?
-
-* Not Applicable
-
-### What is the face/key point balance of the model validation data?
-
-* Not Applicable
-
-### What is the skin/tone balance of the model validation data?
-
-* Not Applicable
-
-### What is the religion balance of the model validation data?
-
-* Not Applicable
-
### Individuals from the following adversely impacted (protected classes) groups participate in model design and testing.

* Not Applicable
@@ -196,7 +159,10 @@ limitations under the License.

* For different GPU workloads different models need to be trained.
+### Has this been verified to have met prescribed NVIDIA standards?
+* Yes
+
### What performance metrics were used to affirm the model's performance?

* Accuracy
@@ -220,7 +186,7 @@ limitations under the License.

* No

-### Describe physical safety impact (if present).
+### Describe life-critical impact (if present).

* N/A
@@ -236,10 +202,6 @@ limitations under the License.

* The model's use case is restricted to testing the Morpheus pipeline and may not be suitable for other applications.

-### Has this been verified to have met prescribed quality standards?
-
-* No
-
### Name target quality Key Performance Indicators (KPIs) for which this has been tested.

* N/A
@@ -265,7 +227,7 @@ limitations under the License.

### Generatable or reverse engineerable personally-identifiable information (PII)?

-* Neither
+* None

### Was consent obtained for any PII used?