diff --git a/README.md b/README.md
index 25dcd81682d..75be0848d06 100644
--- a/README.md
+++ b/README.md
@@ -62,7 +62,7 @@
 | Model | Technique | Gen. Token [3] | Batch | End-to-end throughput [1] | Device throughput [2] | Target |
 |-----------------------------------------------------------|--------------------|---------------------|-----------------------|------------------------------|------------------------------|-----------------|
-| [Falcon7B-decode](./models/demos/t3000/falcon7b) | Data Parallel | 129th | 256 | 4.9 t/s/u - 1247 t/s | 15.5 t/s/u - 3968 t/s | 26 t/s/u |
+| [Falcon7B-decode](./models/demos/t3000/falcon7b) | Data Parallel | 129th | 256 | 7.4 t/s/u - 1901 t/s | 15.5 t/s/u - 3968 t/s | 26 t/s/u |
 | [LLaMA-2-70B-decode](./models/demos/t3000/llama2_70b) | Tensor Parallel | 129th | 32 | 10.4 t/s/u - 333 t/s | 16.6 t/s/u - 532 t/s | 20 t/s/u |
 | [LLaMA-3-70B-decode](./models/demos/t3000/llama3_70b) | Tensor Parallel | 129th | 32 | 10.4 t/s/u - 333 t/s | 15.8 t/s/u - 506 t/s | 20 t/s/u |
 | [Falcon40B-decode](./models/demos/t3000/falcon40b) | Tensor Parallel | 129th | 32 | work-in-progress | 10.0 t/s/u - 320 t/s | 36 t/s/u |