Merge pull request mlcommons#138 from mlcommons/dev
updating reproducibility initiatives from cTuning.org and cKnowledge.org
ctuning-admin authored Jul 22, 2024
2 parents e459f8d + 7e5449b commit 31afd71
Showing 199 changed files with 7,278 additions and 296 deletions.
92 changes: 79 additions & 13 deletions README.md
[![CM script automation features test](https://github.com/mlcommons/cm4mlops/actions/workflows/test-cm-script-features.yml/badge.svg)](https://github.com/mlcommons/cm4mlops/actions/workflows/test-cm-script-features.yml)
[![MLPerf inference MLCommons C++ ResNet50](https://github.com/mlcommons/cm4mlops/actions/workflows/test-mlperf-inference-mlcommons-cpp-resnet50.yml/badge.svg)](https://github.com/mlcommons/cm4mlops/actions/workflows/test-mlperf-inference-mlcommons-cpp-resnet50.yml)

This repository contains reusable and cross-platform automation recipes to run DevOps, MLOps, and MLPerf
via a simple and human-readable [Collective Mind interface (CM)](https://github.com/mlcommons/ck)
while adapting to different operating systems, software and hardware.

...and unified input/output to make them reusable in different projects either individually
or by chaining them together into portable automation workflows, applications
and web services adaptable to continuously changing models, data sets, software and hardware.

We develop and test [CM scripts](script) as a community effort to support the following projects:
* [CM for MLPerf](https://docs.mlcommons.org/inference): modularize and automate MLPerf benchmarks
(maintained by [MLCommons](https://mlcommons.org) and originally developed by [cKnowledge.org](https://cKnowledge.org), [OctoML](https://octoml.ai) and [cTuning.org](https://cTuning.org))
* [CM for research and education](https://cTuning.org/ae): provide a common interface to automate and reproduce results from research papers
and MLPerf benchmarks (maintained by [cTuning foundation](https://cTuning.org) and [cKnowledge.org](https://cKnowledge.org))
* [CM for ABTF](https://github.com/mlcommons/cm4abtf): provide a unified CM interface to run automotive benchmarks
(maintained by [MLCommons](https://mlcommons.org) and originally developed by [cKnowledge.org](https://cKnowledge.org))
* [CM for optimization](https://access.cknowledge.org/playground/?action=challenges): co-design efficient and cost-effective
software and hardware for AI, ML and other emerging workloads via open challenges
(maintained by [cKnowledge.org](https://cKnowledge.org))

You can read this [ArXiv paper](https://arxiv.org/abs/2406.16791) to learn more about the CM motivation and long-term vision.

Please provide your feedback or submit your issues [here](https://github.com/mlcommons/cm4mlops/issues).

## Catalog

Online catalog: [cKnowledge](https://access.cknowledge.org/playground/?action=scripts), [MLCommons](https://docs.mlcommons.org/cm4mlops/scripts).

## Examples
## Citation

Please use this [BibTeX file](https://github.com/mlcommons/ck/blob/master/citation.bib) to cite this project.

## A few demos

### Install CM and virtual env

Install the [MLCommons CM automation language](https://access.cknowledge.org/playground/?action=install).
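
For example, installation in a fresh virtual environment may look as follows (a minimal sketch assuming Python 3.7+ with `pip` and `venv` available; see the guide linked above for the authoritative, OS-specific steps):

```bash
# Create and activate a virtual environment (optional but recommended)
python3 -m venv cm
source cm/bin/activate

# Install the MLCommons CM automation language (the "cmind" package) from PyPI
pip install cmind -U
```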

cmr "python app image-classification onnx" --quiet
### Pull this repository

```bash
cm pull repo mlcommons@cm4mlops --branch=dev
```

### Run image classification using CM

```bash
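# Show the help and available options for this automation recipe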
cm run script "python app image-classification onnx _cpu" --help

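# Download a test image and classify it with ONNX on the CPU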
cm run script "download file _wget" --url=https://cKnowledge.org/ai/data/computer_mouse.jpg --verify=no --env.CM_DOWNLOAD_CHECKSUM=45ae5c940233892c2f860efdf0b66e7e
cm run script "python app image-classification onnx _cpu" --input=computer_mouse.jpg

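# Equivalent invocations: 'cmr' is shorthand for 'cm run script', and a script
# can be addressed by quoted tags, an explicit --tags list or its unique ID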
cmr "python app image-classification onnx _cpu" --input=computer_mouse.jpg
cmr --tags=python,app,image-classification,onnx,_cpu --input=computer_mouse.jpg
cmr 3d5e908e472b417e --input=computer_mouse.jpg

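# Run the same demo inside an automatically generated Docker container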
cm docker script "python app image-classification onnx _cpu" --input=computer_mouse.jpg

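# Open a GUI that helps compose and launch the command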
cm gui script "python app image-classification onnx _cpu"
```
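
Note that the first run may take a while: CM should automatically download and cache all dependencies (the ONNX runtime, the model and the test image), while subsequent runs reuse the cached artifacts.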

### Re-run experiments from the ACM/IEEE MICRO'23 paper

Check [script/reproduce-ieee-acm-micro2023-paper-96](script/reproduce-ieee-acm-micro2023-paper-96/README.md).

### Run MLPerf ResNet CPU inference benchmark via CM

```bash
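# Short performance-only run of the ResNet50 benchmark on CPU
# (open division, edge category)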
cm run script --tags=run-mlperf,inference,_performance-only,_short \
--division=open \
--category=edge \
     ... \
--time
```

### Run MLPerf BERT CUDA inference benchmark v4.1 via CM

```bash
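# Test run (100 queries) of the Nvidia TensorRT implementation of BERT-99
# for MLPerf inference v4.1 inside an automatically generated Docker container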
cmr "run-mlperf inference _find-performance _full _r4.1" \
--model=bert-99 \
--implementation=nvidia \
--framework=tensorrt \
--category=datacenter \
--scenario=Offline \
--execution_mode=test \
--device=cuda \
--docker \
--docker_cm_repo=mlcommons@cm4mlops \
--docker_cm_repo_flags="--branch=mlperf-inference" \
--test_query_count=100 \
--quiet
```

### Run MLPerf SDXL reference inference benchmark v4.1 via CM

```bash
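# Valid run of the reference PyTorch implementation of the SDXL benchmark
# on a CUDA device (datacenter category, Offline scenario)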
cm run script \
--tags=run-mlperf,inference,_r4.1 \
--model=sdxl \
--implementation=reference \
--framework=pytorch \
--category=datacenter \
--scenario=Offline \
--execution_mode=valid \
--device=cuda \
--quiet
```


## License

[Apache 2.0](LICENSE.md)

## Acknowledgments

We thank [cKnowledge.org](https://cKnowledge.org), [cTuning foundation](https://cTuning.org)
and [MLCommons](https://mlcommons.org) for sponsoring this project!

We also thank all [volunteers, collaborators and contributors](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md)
for their support, fruitful discussions, and useful feedback!
4 changes: 4 additions & 0 deletions automation/script/module_misc.py
@@ -1077,6 +1077,10 @@ def doc(i):
r = utils.save_txt(output_file, s)
if r['return']>0: return r

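# Also save the generated page into the mirrored docs tree
# (../docs/scripts/<category>/<alias>/index.md) used by the documentation site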
out_docs_file = os.path.join("..", "docs", "scripts", category, alias, "index.md")
r = utils.save_txt(out_docs_file, s)
if r['return']>0: return r

return {'return':0}


32 changes: 32 additions & 0 deletions challenge/add-derived-metrics-to-mlperf-inference/README.md
### Challenge

Check past MLPerf inference results in [this MLCommons repository](https://github.com/mlcommons/cm4mlperf-results)
and add derived metrics such as results per number of cores, power efficiency, device cost, operational costs, etc.

Add clock speed as a third dimension to graphs and improve the bar-graph visualization.

Read [this documentation](https://github.com/mlcommons/ck/blob/master/docs/mlperf/inference/README.md)
to run reference implementations of MLPerf inference benchmarks
using the CM automation language and use them as a base for your developments.

Check [this ACM REP'23 keynote](https://doi.org/10.5281/zenodo.8105339) to learn more about our open-source project and long-term vision.


### Prizes

* *All contributors will receive 1 point for submitting valid results for 1 complete benchmark on one system.*
* *All contributors will receive an official MLCommons Collective Knowledge contributor award (see [this example](https://ctuning.org/awards/ck-award-202307-zhu.pdf)).*


### Organizers

* [MLCommons](https://cKnowledge.org/mlcommons-taskforce)
* [cTuning.org](https://www.linkedin.com/company/ctuning-foundation)
* [cKnowledge.org](https://www.linkedin.com/company/cknowledge)

### Results

All accepted results will be publicly available in the CM format with derived metrics
in this [MLCommons repository](https://github.com/mlcommons/cm4mlperf-results),
in [MLCommons Collective Knowledge explorer](https://access.cknowledge.org/playground/?action=experiments)
and at the official [MLCommons website](https://mlcommons.org).
22 changes: 22 additions & 0 deletions challenge/add-derived-metrics-to-mlperf-inference/_cm.json
{
"alias": "add-derived-metrics-to-mlperf-inference",
"automation_alias": "challenge",
"automation_uid": "3d84abd768f34e08",
"date_close_extension": true,
"date_open": "20240204",
"points": 2,
"tags": [
"modularize",
"optimize",
"reproduce",
"replicate",
"benchmark",
"automate",
"derived-metrics",
"mlperf-inference",
"mlperf-inference-derived-metrics"
],
"title": "Add derived metrics to MLPerf inference benchmarks (power efficiency, results / No of cores, costs, etc)",
"trophies": true,
"uid": "c65b56d7770946ee"
}
20240220:
* A prototype of a GUI to generate CM commands to run MLPerf inference benchmarks is ready: [link](https://access.cknowledge.org/playground/?action=howtorun&bench_uid=39877bb63fb54725)
* A prototype of the infrastructure to reproduce MLPerf inference benchmark results is ready: [link](https://access.cknowledge.org/playground/?action=reproduce)
* Ongoing efforts: https://github.com/mlcommons/ck/issues/1052
21 changes: 21 additions & 0 deletions challenge/automate-mlperf-inference-v3.1-and-v4.0-2024/_cm.yaml
alias: automate-mlperf-inference-v3.1-and-v4.0-2024
uid: f89f152fc2614240

automation_alias: challenge
automation_uid: 3d84abd768f34e08

title: Add MLCommons CM workflows and unified interface to automate MLPerf inference v3.1 and v4.0 benchmarks (Intel, Nvidia, Qualcomm, Arm64, TPU ...)

date_open: '20231215'
date_close: '20240315'

hot: true

tags:
- automate
- mlperf-inference-v3.1-and-v4.0
- 2024

experiments:
- tags: mlperf-inference,v3.1
- tags: mlperf-inference,v4.0
This challenge is under preparation. You can read about the motivation behind this challenge in our [invited talk at MLPerf-Bench @ HPCA'24](https://doi.org/10.5281/zenodo.10786893).

We plan to extend [MLCommons CM framework](https://github.com/mlcommons/ck)
to automatically compose high-performance and cost-efficient AI systems
based on MLPerf inference v4.0 results and [CM automation recipes](https://access.cknowledge.org/playground/?action=scripts).

* A prototype of a GUI to generate CM commands to run MLPerf inference benchmarks is ready: [link](https://access.cknowledge.org/playground/?action=howtorun&bench_uid=39877bb63fb54725)
* A prototype of the infrastructure to reproduce MLPerf inference benchmark results is ready: [link](https://access.cknowledge.org/playground/?action=reproduce)

Contact the [MLCommons Task Force on Automation and Reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md) for more details.
alias: compose-high-performance-and-cost-efficient-ai-systems-based-on-mlperf-4.0-2024
uid: 7c983102d89e4869

automation_alias: challenge
automation_uid: 3d84abd768f34e08

title: "Compose high-performance and cost-efficint AI systems using MLCommons' Collective Mind and MLPerf inference"

date_open: '20240101'

tags:
- compose
- ai
- systems
- mlperf-inference-v4.0
- cm
- mlcommons-cm
- mlperf
- v4.0
- performance
- energy
- cost

experiments:
- tags: mlperf-inference,v4.0
### Challenge

Connect CM workflows to run MLPerf inference benchmarks with [OpenBenchmarking.org](https://openbenchmarking.org).

Read [this documentation](https://github.com/mlcommons/ck/blob/master/docs/mlperf/inference/README.md)
to run reference implementations of MLPerf inference benchmarks
using the CM automation language and use them as a base for your developments.

Check [this ACM REP'23 keynote](https://doi.org/10.5281/zenodo.8105339) to learn more about our open-source project and long-term vision.


### Prizes

* *All contributors will receive 1 point for submitting valid results for 1 complete benchmark on one system.*
* *All contributors will receive an official MLCommons Collective Knowledge contributor award (see [this example](https://ctuning.org/awards/ck-award-202307-zhu.pdf)).*



### Organizers

* Michael Larabel
* Grigori Fursin
* [MLCommons](https://cKnowledge.org/mlcommons-taskforce)
* [cTuning.org](https://www.linkedin.com/company/ctuning-foundation)
* [cKnowledge.org](https://www.linkedin.com/company/cknowledge)

### Results

Results will be available at [OpenBenchmarking.org](https://openbenchmarking.org)
and in the [MLCommons CK playground](https://access.cknowledge.org/playground/?action=experiments).
{
"alias": "connect-mlperf-inference-v3.1-with-openbenchmarking",
"automation_alias": "challenge",
"automation_uid": "3d84abd768f34e08",
"date_open": "20240101",
"date_close_extension": true,
"points": 2,
"tags": [
"modularize",
"optimize",
"reproduce",
"replicate",
"benchmark",
"automate",
"openbenchmarking",
"mlperf-inference",
"mlperf-inference-openbenchmarking"
],
"title": "Run MLPerf inference benchmarks using CM via OpenBenchmarking.org",
"trophies": true,
"uid": "534592626eb44efe"
}
23 changes: 23 additions & 0 deletions challenge/connect-mlperf-with-medperf/README.md
### Challenge

Evaluate models from the [MLCommons MedPerf platform](https://www.medperf.org) in terms of latency, throughput, power consumption and other metrics
using MLPerf loadgen and MLCommons CM automation language.

See the [Nature 2023 article about MedPerf](https://www.nature.com/articles/s42256-023-00652-2)
and [ACM REP'23 keynote about CM](https://doi.org/10.5281/zenodo.8105339) to learn more about these projects.

Read [this documentation](https://github.com/mlcommons/ck/blob/master/docs/mlperf/inference/README.md)
to run reference implementations of MLPerf inference benchmarks
using the CM automation language and use them as a base for your developments.


### Prizes

* *All contributors will receive an official MLCommons Collective Knowledge contributor award (see [this example](https://ctuning.org/awards/ck-award-202307-zhu.pdf)).*


### Organizers

* [cKnowledge.org](https://www.linkedin.com/company/cknowledge)
* [cTuning.org](https://www.linkedin.com/company/ctuning-foundation)
* [MLCommons](https://cKnowledge.org/mlcommons-taskforce)
26 changes: 26 additions & 0 deletions challenge/connect-mlperf-with-medperf/_cm.json
{
"alias": "connect-mlperf-with-medperf",
"automation_alias": "challenge",
"automation_uid": "3d84abd768f34e08",
"date_close_extension": true,
"date_open": "20240105",
"points": 2,
"tags": [
"modularize",
"optimize",
"reproduce",
"replicate",
"benchmark",
"automate",
"medperf",
"mlperf-inference",
"mlperf-inference-medperf",
"mlperf-inference-medperf",
"mlperf-inference-medperf-v3.1",
"mlperf-inference-medperf-v3.1-2023",
"v3.1"
],
"title": "Connect MedPerf with MLPerf and CM",
"trophies": true,
"uid": "c26d1fbf89164728"
}
16 changes: 16 additions & 0 deletions challenge/optimize-mlperf-inference-scc2023/README.md
### CM tutorial

https://github.com/mlcommons/ck/blob/master/docs/tutorials/scc23-mlperf-inference-bert.md

### Challenge

Reproduce and optimize MLPerf inference benchmarks during the Student Cluster Competition at SuperComputing'23.

See our [related challenge from 2022](https://access.cknowledge.org/playground/?action=challenges&name=repro-mlperf-inference-retinanet-scc2022).

### Organizers

* [MLCommons taskforce on automation and reproducibility](https://cKnowledge.org/mlcommons-taskforce)
* [cTuning foundation](https://cTuning.org)
* [cKnowledge.org](https://cKnowledge.org)

