[deepsparse.benchmark] enable internal kv cache by default #1335

bfineran · 2023-10-19T15:25:20Z

If a model is detected to have a KV cache, the default behavior on the DeepSparse engine is now to run with the internal KV cache enabled.

Also adds the argument `--disable-kv-cache-overrides` to skip any KV cache updates to the model (this addresses the previous requirement that sequence_length be set).
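The flag behavior described above can be sketched as follows. This is a hypothetical illustration, not the code from this PR: the parser, the `resolve_kv_cache_mode` helper, the mode names, and the precedence between the two flags are all assumptions. Only the flag names `--no-internal-kv-cache` and `--disable-kv-cache-overrides` come from the PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the benchmark CLI; only the two flag names
    # below are taken from the PR description.
    parser = argparse.ArgumentParser(prog="benchmark-sketch")
    parser.add_argument("model_path")
    parser.add_argument(
        "--no-internal-kv-cache",
        action="store_true",
        help="run with an external KV cache instead of the internal one",
    )
    parser.add_argument(
        "--disable-kv-cache-overrides",
        action="store_true",
        help="skip all KV cache related edits to the model",
    )
    return parser


def resolve_kv_cache_mode(
    model_has_kv_cache: bool,
    no_internal_kv_cache: bool,
    disable_kv_cache_overrides: bool,
) -> str:
    # Assumption: disabling overrides takes precedence over the other flag,
    # and a model without a KV cache is always left untouched.
    if not model_has_kv_cache or disable_kv_cache_overrides:
        return "unmodified"
    if no_internal_kv_cache:
        return "external"
    # Default when a KV cache is detected: internal cache enabled.
    return "internal"


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Assumption for the demo: pretend a KV cache was detected in the model.
    mode = resolve_kv_cache_mode(
        True, args.no_internal_kv_cache, args.disable_kv_cache_overrides
    )
    print(mode)
```

Both flags use `store_true`, so omitting them gives the new default (internal cache) whenever a KV cache is detected.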

Example:
With internal cache (default):
deepsparse.benchmark zoo:nlg/text_generation/mpt-7b/pytorch/huggingface/mpt_chat/base-none
Throughput (items/sec): 2.6370

With external cache:
deepsparse.benchmark zoo:nlg/text_generation/mpt-7b/pytorch/huggingface/mpt_chat/base-none --no-internal-kv-cache
Throughput (items/sec): 2.2022

With no KV cache overrides (no model edits):
deepsparse.benchmark zoo:nlg/text_generation/mpt-7b/pytorch/huggingface/mpt_chat/base-none --disable-kv-cache-overrides
Throughput (items/sec): 2.1987
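To put the three throughput figures above in proportion, the relative gain of the internal cache over the other two configurations can be computed directly. This is a small arithmetic sketch; the `relative_gain_percent` helper and the dictionary keys are names introduced here for illustration, and the numbers are the single-run results quoted in this PR, so they should be read as indicative only.

```python
def relative_gain_percent(new: float, baseline: float) -> float:
    """Percentage by which `new` throughput exceeds `baseline` throughput."""
    return (new / baseline - 1.0) * 100.0


# Throughput (items/sec) as reported in the PR description.
throughput = {
    "internal": 2.6370,
    "external": 2.2022,
    "no_overrides": 2.1987,
}

gain_vs_external = relative_gain_percent(
    throughput["internal"], throughput["external"]
)
gain_vs_no_overrides = relative_gain_percent(
    throughput["internal"], throughput["no_overrides"]
)

print(f"internal vs external cache: +{gain_vs_external:.1f}%")
print(f"internal vs no overrides:   +{gain_vs_no_overrides:.1f}%")
```

On these figures the internal cache is about 20% faster than either alternative, while the external-cache and no-override runs are nearly identical.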

mgoin

appreciate it!

[deepsparse.benchmark] enable internal kv cache by default

9fb2d83

bfineran requested review from mgoin and dbogunowicz October 19, 2023 15:25

bfineran self-assigned this Oct 19, 2023

Benjamin added 3 commits October 19, 2023 11:45

remove requirement for sequence length to be set to run in kv cache mode

7685510

add option to disable all kv cache overrides

9c14c76

add store_true

e4b1f3e

mgoin previously approved these changes Oct 19, 2023

argparse fix

85fea46

bfineran dismissed mgoin’s stale review via 85fea46 October 19, 2023 16:05

mgoin approved these changes Oct 19, 2023

mgoin merged commit eb114ba into main Oct 19, 2023

mgoin deleted the default-benchmark-internal-kv branch October 19, 2023 16:45
