experience_replay_interface.py: terminal_actions defined as int, but de-facto float #83

Wuodan · 2024-08-22T10:11:38Z

linesight/trackmania_rl/experience_replay/experience_replay_interface.py

Line 53 in 09d84b5

terminal_actions: int,

There is a bit of inconsistency in class Experience in experience_replay/experience_replay_interface.py.
terminal_actions is defined as int, but the documentation says it's int or math.inf (float). In the code it is only filled by buffer_management.py, with this line:
terminal_actions = float((n_frames - 1) - i) if "race_time" in rollout_results else math.inf

So de-facto terminal_actions is always float, not just when it's math.inf.
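
A minimal sketch of that point, wrapping the quoted expression in a hypothetical function (terminal_actions_current is not a name from the repository, and the argument values are made up):

```python
import math


def terminal_actions_current(n_frames, i, rollout_results):
    # Same expression as the line quoted from buffer_management.py.
    return float((n_frames - 1) - i) if "race_time" in rollout_results else math.inf


# Made-up inputs: one rollout that finished the race, one that did not.
finished = terminal_actions_current(10, 3, {"race_time": 12340})
unfinished = terminal_actions_current(10, 3, {})

print(type(finished).__name__, finished)      # float 6.0
print(type(unfinished).__name__, unfinished)  # float inf
```

Both branches return a float, so the int annotation never matches the value that is actually stored.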

I tried replacing float() with int() and math.inf with sys.maxsize and got an error like "int cannot be converted to C long" or "int too large for C long". Other easy workarounds also failed.


Wuodan · 2024-08-22T10:45:36Z

I tried changing the above line to
terminal_actions = n_frames - 1 - i if "race_time" in rollout_results else sys.maxsize
so terminal_actions is always int, but it fails after running for quite a while.
The error is:

OverflowError: Python int too large to convert to C long

I still think terminal_actions should be and can be int.

Full stack-trace on failure is:

All rollout queues were empty. Learner sleeps 1 second.
Race time ratio   3.518026737413334
 NMG=7992970 
Process Process-1:
Traceback (most recent call last):
  File "C:\Users\stefa\miniconda3\envs\linesight\Lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "C:\Users\stefa\miniconda3\envs\linesight\Lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\development\github\Linesight-RL\linesight\trackmania_rl\multiprocess\learner_process.py", line 494, in learner_process_fn
    loss, grad_norm = trainer.train_on_batch(buffer, do_learn=True)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\development\github\Linesight-RL\linesight\trackmania_rl\agents\iqn.py", line 231, in train_on_batch
    batch, batch_info = buffer.sample(self.batch_size, return_info=True)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stefa\miniconda3\envs\linesight\Lib\site-packages\torchrl\data\replay_buffers\replay_buffers.py", line 671, in sample
    ret = self._prefetch_queue.popleft().result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stefa\miniconda3\envs\linesight\Lib\concurrent\futures\_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stefa\miniconda3\envs\linesight\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "C:\Users\stefa\miniconda3\envs\linesight\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stefa\miniconda3\envs\linesight\Lib\site-packages\torchrl\data\replay_buffers\utils.py", line 72, in decorated_fun
    output = fun(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stefa\miniconda3\envs\linesight\Lib\site-packages\torchrl\data\replay_buffers\replay_buffers.py", line 607, in _sample
    data = self._collate_fn(data)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\development\github\Linesight-RL\linesight\trackmania_rl\buffer_utilities.py", line 61, in buffer_collate_function
    ) = tuple(
        ^^^^^^
  File "C:\development\github\Linesight-RL\linesight\trackmania_rl\buffer_utilities.py", line 63, in <lambda>
    lambda attr_name: fast_collate_cpu(batch, attr_name),
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\development\github\Linesight-RL\linesight\trackmania_rl\buffer_utilities.py", line 38, in fast_collate_cpu
    buffer[:] = source[:]
    ~~~~~~^^^
OverflowError: Python int too large to convert to C long
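
A possible explanation, stated as an assumption rather than something confirmed from the project code: the paths in the traceback point to a Windows machine, where a C long is 32 bits wide, while sys.maxsize on 64-bit Python is 2**63 - 1. If the buffer written in fast_collate_cpu stores this field as a C long, the sentinel cannot fit. The sketch below does not touch the project's buffer; it only uses the standard struct module to check whether a value fits in a 4-byte signed integer. fits_in_int32 is a hypothetical helper.

```python
import struct

INT32_MAX = 2**31 - 1


def fits_in_int32(value: int) -> bool:
    """Return True if value can be packed as a 4-byte signed integer."""
    try:
        # "<i" is the standard-size (always 4 bytes) signed int format.
        struct.pack("<i", value)
        return True
    except (struct.error, OverflowError):
        return False


print(fits_in_int32(INT32_MAX))   # largest 32-bit signed value
print(fits_in_int32(2**63 - 1))   # sys.maxsize on 64-bit CPython
```

Under that assumption, any sentinel up to 2**31 - 1 would be safe, and anything larger would trigger the overflow as soon as an unfinished rollout is sampled.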

Wuodan · 2024-08-22T12:13:24Z

There seems to be no problem when using int and 2**31 - 1 instead of math.inf.
terminal_actions = n_frames - 1 - i if "race_time" in rollout_results else 2**31 - 1
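
A small sketch of the same idea with the sentinel given a name. NO_TERMINAL_ACTIONS and terminal_actions_int are hypothetical names used for illustration only; they are not taken from the repository or the linked pull request.

```python
# Hypothetical sentinel name; the value is the one from the line above.
NO_TERMINAL_ACTIONS = 2**31 - 1


def terminal_actions_int(n_frames: int, i: int, rollout_results: dict) -> int:
    # Count the remaining actions if the race finished, else use the int sentinel.
    return n_frames - 1 - i if "race_time" in rollout_results else NO_TERMINAL_ACTIONS


print(terminal_actions_int(10, 3, {"race_time": 12340}))  # made-up inputs
print(terminal_actions_int(10, 3, {}) == NO_TERMINAL_ACTIONS)
```

Any code that relied on comparisons against math.inf would have to compare against the sentinel instead; whether such code exists is not covered here.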

Wuodan added a commit to Wuodan/linesight that referenced this issue Aug 22, 2024

Make terminal_actions int

58883c8

Warning was: Expected type 'int', got 'float' instead Fixes Linesight-RL#83

Wuodan linked a pull request Aug 22, 2024 that will close this issue

Make terminal_actions int #84

Open

Wuodan added a commit to Wuodan/linesight that referenced this issue Aug 22, 2024

Make terminal_actions int

53c4d2f

Warning was: Expected type 'int', got 'float' instead Fixes Linesight-RL#83

Wuodan added a commit to Wuodan/linesight that referenced this issue Aug 22, 2024

Linesight-RL#83 Make terminal_actions int

7aca7d8

Warning was: Expected type 'int', got 'float' instead

Wuodan added a commit to Wuodan/linesight that referenced this issue Aug 22, 2024

Linesight-RL#83 Make terminal_actions int

602455d

Warning was: Expected type 'int', got 'float' instead

Wuodan added a commit to Wuodan/linesight that referenced this issue Aug 30, 2024

Make terminal_actions int

c0f690f

Warning was: Expected type 'int', got 'float' instead Fixes Linesight-RL#83
