Fix the logic for perplexity evaluation (Not enough kv_cache capacity to run generation. Please use a larger sequence_length or a shorter prompt)
#5445
| Job | Run time |
|---|---|
| | 42s |
| | 36s |
| | 1m 18s |