Update example to not fail hessian inversion (#904)

* update

Signed-off-by: Dipika <dipikasikka1@gmail.com>

* quality

---------

Signed-off-by: Dipika <dipikasikka1@gmail.com>
Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
vllm-project · Nov 9, 2024 · a173a0c
1 parent 644a500
Showing 1 changed file with 3 additions and 1 deletion.
diff --git a/examples/big_models_with_accelerate/multi_gpu_int8.py b/examples/big_models_with_accelerate/multi_gpu_int8.py
@@ -59,7 +59,9 @@ def tokenize(sample):
 #   * quantize the weights to int8 with GPTQ (static per channel)
 #   * quantize the activations to int8 (dynamic per token)
 recipe = [
-    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
+    GPTQModifier(
+        targets="Linear", scheme="W8A8", ignore=["lm_head"], dampening_frac=0.1
+    ),
 ]
 
 # 4) Apply algorithms and save in `compressed-tensors` format.
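
Note on the change: the new `dampening_frac=0.1` argument is what keeps the Hessian inversion from failing. In GPTQ, dampening generally means adding a fraction of the mean Hessian diagonal to every diagonal entry before the Cholesky-based inversion, which turns a singular or near-singular matrix into a positive definite one. The pure-Python sketch below illustrates that idea only; it is not the llm-compressor implementation, and `cholesky`, `dampen`, `can_factor`, and the toy Hessian are hypothetical names and data chosen for illustration.

```python
import math


def cholesky(h):
    """Return lower-triangular L with L * L^T == h.

    Raises ValueError if h is not positive definite.
    """
    n = len(h)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = h[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (h[i][j] - s) / low[j][j]
    return low


def dampen(h, dampening_frac):
    """Return a copy of h with dampening_frac * mean(diag(h)) added to the diagonal."""
    n = len(h)
    damp = dampening_frac * sum(h[i][i] for i in range(n)) / n
    return [
        [h[i][j] + (damp if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def can_factor(h):
    """Report whether the Cholesky factorization of h succeeds."""
    try:
        cholesky(h)
        return True
    except ValueError:
        return False


# Toy Hessian H = X^T X built from a single calibration sample with two
# perfectly correlated features. It has rank 1, so it is singular.
x = [[2.0, 4.0]]
hessian = [
    [sum(row[i] * row[j] for row in x) for j in range(2)]
    for i in range(2)
]

print("undamped factorizes:", can_factor(hessian))
print("damped factorizes:  ", can_factor(dampen(hessian, 0.1)))
```

A larger fraction makes the factorization more robust but perturbs the Hessian further from the calibration statistics, so the value is a trade-off between numerical stability and quantization accuracy.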