diff --git a/examples/huggingface-transformers/README.md b/examples/huggingface-transformers/README.md
index b3414a7499..8f1850a438 100644
--- a/examples/huggingface-transformers/README.md
+++ b/examples/huggingface-transformers/README.md
@@ -38,7 +38,7 @@ Question-Answering task. The current version of the pipeline supports only
 from pipelines import pipeline
 
 # SparseZoo model stub or path to ONNX file
-onnx_filepath="zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-moderate"
+onnx_filepath="zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98"
 
 num_cores=None # uses all available CPU cores by default
 
@@ -70,7 +70,7 @@ benchmark.py -h`.
 To run a benchmark using the DeepSparse Engine with a pruned BERT model that uses all available CPU cores and batch size 1, run:
 ```bash
 python benchmark.py \
-    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-moderate \
+    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98 \
     --batch-size 1
 ```
 
@@ -94,7 +94,7 @@ also supported.
 Example command:
 ```bash
 python server.py \
-    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-moderate
+    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98
 ```
 
 You can leave that running as a detached process or in a spare terminal.
@@ -142,10 +142,8 @@ Learn more at
 
 | Model Name | Stub | Description |
 |----------|-------------|-------------|
-| bert-pruned-moderate | zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-moderate |This model is the result of pruning BERT base uncased on the SQuAD dataset. The sparsity level is 90% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs.|
-| bert-6layers-aggressive-pruned| zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_6layers-aggressive_96 |This model is the result of pruning a modified BERT base uncased with 6 layers on the SQuAD dataset. The sparsity level is 95% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs.|
+| bert-6layers-aggressive-pruned-96| zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_6layers-aggressive_96 |This model is the result of pruning a modified BERT base uncased with 6 layers on the SQuAD dataset. The sparsity level is 95% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs.|
 | bert-pruned-conservative| zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-conservative |This model is the result of pruning BERT base uncased on the SQuAD dataset. The sparsity level is 80% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs.|
-| pruned_6layers-moderate | zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_6layers-moderate |This model is the result of pruning a modified BERT base uncased with 6 layers on the SQuAD dataset. The sparsity level is 90% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs. The integration with Hugging Face's Transformers can be found [here](https://github.com/neuralmagic/sparseml/tree/main/integrations/huggingface-transformers).|
-| pruned-aggressive_94 | zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_94|This model is the result of pruning BERT base uncased on the SQuAD dataset. The sparsity level is 95% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs.|
-| pruned_6layers-conservative| zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_6layers-conservative|This model is the result of pruning a modified BERT base uncased with 6 layers on the SQuAD dataset. The sparsity level is 80% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs.|
-| bert-base|zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none |This model is the result of a BERT base uncased model fine-tuned on the SQuAD dataset for two epochs.|
+| pruned-aggressive_94 | zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_94|This model is the result of pruning BERT base uncased on the SQuAD dataset. The sparsity level is 95% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs.|
+| bert-3layers-pruned-aggressive-89| zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_3layers-aggressive_89|This model is the result of pruning a modified BERT base uncased with 3 layers on the SQuAD dataset. The sparsity level is 89% uniformly applied to all encoder layers. Distillation was used with the teacher being the BERT model fine-tuned on the dataset for two epochs.|
+| bert-base|zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none |This model is the result of a BERT base uncased model fine-tuned on the SQuAD dataset for two epochs.|
\ No newline at end of file
diff --git a/examples/huggingface-transformers/benchmark.py b/examples/huggingface-transformers/benchmark.py
index aa260a7a31..6a4b1c368e 100644
--- a/examples/huggingface-transformers/benchmark.py
+++ b/examples/huggingface-transformers/benchmark.py
@@ -66,7 +66,7 @@
 ##########
 Example for benchmarking on a pruned BERT model from sparsezoo with deepsparse:
 python benchmark.py \
-    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-moderate \
+    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98 \
 
 ##########
 Example for benchmarking on a local ONNX model with deepsparse:
diff --git a/examples/huggingface-transformers/server.py b/examples/huggingface-transformers/server.py
index d1026636ff..5b87693ca3 100644
--- a/examples/huggingface-transformers/server.py
+++ b/examples/huggingface-transformers/server.py
@@ -38,7 +38,7 @@
 ##########
 Example command for running using a model from sparsezoo:
 python server.py \
-    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-moderate
+    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98
 """
 import argparse
 import json
diff --git a/examples/huggingface-transformers/squad_eval.py b/examples/huggingface-transformers/squad_eval.py
index 661f7c4c87..f5005ed5dd 100644
--- a/examples/huggingface-transformers/squad_eval.py
+++ b/examples/huggingface-transformers/squad_eval.py
@@ -48,7 +48,7 @@
 ##########
 Example command for evaluating a sparse BERT QA model from sparsezoo:
 python squad_eval.py \
-    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-moderate
+    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98
 """
 
diff --git a/examples/huggingface-transformers/squad_inference.py b/examples/huggingface-transformers/squad_inference.py
index 85a55c9be7..d6f23a6536 100644
--- a/examples/huggingface-transformers/squad_inference.py
+++ b/examples/huggingface-transformers/squad_inference.py
@@ -60,7 +60,7 @@
 ##########
 Example command for running 1000 samples using a model from sparsezoo:
 python squad_inference.py \
-    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-moderate \
+    zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98 \
     --num-samples 1000
 """
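
For reference, below is a minimal sketch of how the new `pruned-aggressive_98` stub would be passed to this example's question-answering pipeline, following the README snippet changed above. The `"question-answering"` task name and the `model_path`/`num_cores` keyword arguments are assumptions inferred from the example code in this directory, not a confirmed signature; check `examples/huggingface-transformers/pipelines.py` before relying on it.

```python
# Minimal usage sketch for the updated SparseZoo stub (assumptions noted below).
from pipelines import pipeline  # helper module shipped with this example directory

# SparseZoo model stub or path to ONNX file (matches the README change above)
onnx_filepath = (
    "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/"
    "pruned-aggressive_98"
)
num_cores = None  # None -> use all available CPU cores

# ASSUMPTION: task name and keyword arguments follow the README snippet;
# verify against pipelines.py in this example before use.
qa_pipeline = pipeline(
    "question-answering",
    model_path=onnx_filepath,
    num_cores=num_cores,
)

answer = qa_pipeline(
    question="What does pruning remove from the model?",
    context="Pruning removes redundant weights so sparse BERT models run "
    "faster on CPUs with the DeepSparse Engine.",
)
print(answer)
```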