Problems with all tasks execution #2


Open

JohnConnor123 opened this issue Apr 2, 2025 · 6 comments

@JohnConnor123

When running the bash run_tests.sh command from the evaluation folder, the tests start at context length 250 and then hang instead of moving on to context lengths 500, 1000, 2000, ...

Here is the traceback after interrupting the process from the keyboard:

Results saved at ./results/CL250/0408_T04_C02_twohop2/minimax-01_book_0408_T04_C02_twohop2_1743612171.json
100%|███████████████████████████████████████████████| 26/26 [00:00<00:00, 35.32it/s]
100%|███████████████████████████████████████████| 26/26 [00:00<00:00, 146378.39it/s]
Results saved at ./results/CL250/0408_T05_C02_twohop2/minimax-01_book_0408_T05_C02_twohop2_1743612175.json
100%|███████████████████████████████████████████████| 26/26 [00:00<00:00, 34.85it/s]
^[[A
^CTraceback (most recent call last):
  File "/home/calibri/experiments/RULER/LongContext/NoLiMa/evaluation/run_tests.py", line 113, in <module>
    tester.evaluate()
  File "/home/calibri/experiments/RULER/LongContext/NoLiMa/evaluation/async_evaluate.py", line 235, in evaluate
    responses = loop.run_until_complete(asyncio.gather(*async_tasks))
  File "/home/calibri/.pyenv/versions/3.10.16/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/home/calibri/.pyenv/versions/3.10.16/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/home/calibri/.pyenv/versions/3.10.16/lib/python3.10/asyncio/base_events.py", line 1871, in _run_once
    event_list = self._selector.select(timeout)
  File "/home/calibri/.pyenv/versions/3.10.16/lib/python3.10/selectors.py", line 469, in select
    fd_event_list = self._selector.poll(timeout, max_ev)
KeyboardInterrupt
^[[A^CException ignored in: <module 'threading' from '/home/calibri/.pyenv/versions/3.10.16/lib/python3.10/threading.py'>
Traceback (most recent call last):
  File "/home/calibri/.pyenv/versions/3.10.16/lib/python3.10/threading.py", line 1567, in _shutdown
    lock.acquire()
KeyboardInterrupt:
@amodaresi
Collaborator

Which model are you testing? Is it running locally (e.g. via vLLM) or served via a cloud-based API?

@JohnConnor123
Author

JohnConnor123 commented Apr 4, 2025

Which model are you testing? Is it running locally (e.g. via vLLM) or served via a cloud-based API?

Qwen-0.5B-Instruct and Llama-3.1-8B-Instruct, running locally via vLLM, and MiniMaxAI/MiniMax-Text-01, served via OpenRouter. (For MiniMax-Text-01 I specified vLLM in the config and pointed the tokenizer to the Hugging Face repository so that it loads without errors.)

@JohnConnor123
Author

Which model are you testing? Is it running locally (e.g. via vLLM) or served via a cloud-based API?

Is it possible to fix this bug?

@amodaresi
Collaborator

Have you tried lowering the timeout? (e.g. using 120 instead of 700 seconds; we opted for the larger value for longer contexts)

It is possible that the hang is just an API request failing and then being retried with long timeouts.
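
As an illustration, here is a minimal sketch of the idea (not the repo's actual code; `fetch_response` and the timeout values are hypothetical): wrapping each request in a per-request timeout means one stuck call fails fast instead of stalling `asyncio.gather` forever.

```python
import asyncio

# Hypothetical stand-in for a single API request coroutine.
async def fetch_response(prompt: str) -> str:
    await asyncio.sleep(1000)  # simulate a request that never returns
    return "response"

async def bounded_fetch(prompt: str, timeout: float = 120.0) -> str | None:
    # wait_for cancels the request once the timeout elapses, so a single
    # hung call cannot block the gather() below indefinitely.
    try:
        return await asyncio.wait_for(fetch_response(prompt), timeout=timeout)
    except asyncio.TimeoutError:
        return None  # the caller can log this and retry with its own backoff

async def main() -> None:
    results = await asyncio.gather(*(bounded_fetch(p, timeout=2.0) for p in ("a", "b")))
    print(results)  # prints [None, None] after ~2 s instead of hanging

asyncio.run(main())
```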

@JohnConnor123
Author

Have you tried lowering the timeout? (e.g. using 120 instead of 700 seconds; we opted for the larger value for longer contexts)

NoLiMa/evaluation/model_configs/llama_3.3_70b.json, line 9 (at a02da41):

"timeout": 700,

It is possible that the hang is just an API request failing and then being retried with long timeouts.

So your solution is to change the timeout from 700 to 120, right?

@amodaresi
Collaborator

amodaresi commented Apr 10, 2025

Yes, exactly.
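
For anyone else hitting this: assuming your model config follows the same shape as llama_3.3_70b.json (all other fields omitted here), the edit is just lowering the timeout value:

```json
{
    "timeout": 120
}
```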
