forked from hlin01/mini_behavior
nohup.out
7188 lines (7150 loc) · 292 KB
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
begin training
TRAINING PARAMETERS
---------------------------------------
Total timesteps: 1000000.0
Learning rate: 0.0001
Number of total updates: 1000
Number of parallel environments: 8
Number of steps per rollout (Used for each kNN update and curiosity reward calculation): 125
Batch size: 1000
Number of PPO update epochs: 4
Minibatch size: 250
K parameter: 50
---------------------------------------
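The parameters above appear to be related by simple arithmetic. The sketch below reproduces the logged values under the assumption that the batch size is the number of parallel environments times the steps per rollout, and that the number of updates is the total timesteps divided by the batch size. The variable names are illustrative and are not taken from train_APT.py or APT_PPO.py.

```python
# Values copied from the TRAINING PARAMETERS block in the log.
total_timesteps = int(1000000.0)
num_envs = 8
steps_per_rollout = 125
minibatch_size = 250
ppo_epochs = 4

# Assumed relationships (not confirmed by the training script itself).
batch_size = num_envs * steps_per_rollout              # transitions per update
num_updates = total_timesteps // batch_size            # rollouts over the run
minibatches_per_epoch = batch_size // minibatch_size   # splits of one batch
gradient_steps_per_update = ppo_epochs * minibatches_per_epoch

print(batch_size, num_updates, minibatches_per_epoch, gradient_steps_per_update)
```

Under these assumptions the derived batch size (1000) and update count (1000) match the logged values, and each update would perform 16 minibatch gradient steps.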
wandb: ERROR api_key not configured (no-tty). call wandb.login(key=[your_api_key])
Traceback (most recent call last):
File "/home/kevin/mini_behavior/train_APT.py", line 75, in <module>
model.train()
File "/home/kevin/mini_behavior/algorithms/APT_PPO.py", line 94, in train
wandb.init(project="APT_PPO_Training",
File "/home/kevin/miniconda3/envs/babyRL/lib/python3.9/site-packages/wandb/sdk/wandb_init.py", line 1270, in init
wandb._sentry.reraise(e)
File "/home/kevin/miniconda3/envs/babyRL/lib/python3.9/site-packages/wandb/analytics/sentry.py", line 161, in reraise
raise exc.with_traceback(sys.exc_info()[2])
File "/home/kevin/miniconda3/envs/babyRL/lib/python3.9/site-packages/wandb/sdk/wandb_init.py", line 1255, in init
wi.setup(kwargs)
File "/home/kevin/miniconda3/envs/babyRL/lib/python3.9/site-packages/wandb/sdk/wandb_init.py", line 304, in setup
wandb_login._login(
File "/home/kevin/miniconda3/envs/babyRL/lib/python3.9/site-packages/wandb/sdk/wandb_login.py", line 347, in _login
wlogin.prompt_api_key()
File "/home/kevin/miniconda3/envs/babyRL/lib/python3.9/site-packages/wandb/sdk/wandb_login.py", line 281, in prompt_api_key
raise UsageError("api_key not configured (no-tty). call " + directive)
wandb.errors.errors.UsageError: api_key not configured (no-tty). call wandb.login(key=[your_api_key])
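The traceback above shows wandb failing because the process had no terminal to prompt for an API key. One possible way to avoid this under nohup, sketched here as an assumption rather than the fix actually used for the later run, is to supply the key through the WANDB_API_KEY environment variable and fall back to offline logging when it is absent. The helper name choose_wandb_mode is hypothetical.

```python
import os


def choose_wandb_mode(env=None):
    """Return "online" if an API key is available, else "offline".

    `env` defaults to the process environment; pass a dict to test.
    """
    env = os.environ if env is None else env
    return "online" if env.get("WANDB_API_KEY") else "offline"


mode = choose_wandb_mode()
print(f"wandb mode: {mode}")

# Hypothetical use in the training script (requires the wandb package):
#   import wandb
#   wandb.init(project="APT_PPO_Training", mode=choose_wandb_mode())
```

With the key exported before launching (for example in the shell that starts nohup), wandb can authenticate without a prompt. In offline mode the run is only written to the local wandb directory.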
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
wandb: Currently logged in as: kevinhan (kevinhan-the-university-of-texas-at-austin). Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.18.6
wandb: Run data is saved locally in /home/kevin/mini_behavior/wandb/run-20241125_212333-d7akopqu
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run helpful-bird-24
wandb: ⭐️ View project at https://wandb.ai/kevinhan-the-university-of-texas-at-austin/APT_PPO_Training
wandb: 🚀 View run at https://wandb.ai/kevinhan-the-university-of-texas-at-austin/APT_PPO_Training/runs/d7akopqu
=== Observation Space ===
Shape: (59,)
Type: float32
=== Observation Space ===
Shape: (59,)
Type: float32
=== Observation Space ===
Shape: (59,)
Type: float32
=== Observation Space ===
Shape: (59,)
Type: float32
=== Observation Space ===
Shape: (59,)
Type: float32
=== Observation Space ===
Shape: (59,)
Type: float32
=== Observation Space ===
Shape: (59,)
Type: float32
=== Observation Space ===
Shape: (59,)
Type: float32
begin training
TRAINING PARAMETERS
---------------------------------------
Total timesteps: 1000000.0
Learning rate: 0.0001
Number of total updates: 1000
Number of parallel environments: 8
Number of steps per rollout (Used for each kNN update and curiosity reward calculation): 125
Batch size: 1000
Number of PPO update epochs: 4
Minibatch size: 250
K parameter: 50
---------------------------------------
UPDATE: 1/1000
Average reward: 8.366446
UPDATE: 2/1000
Average reward: 22.536833
UPDATE: 3/1000
Average reward: 35.343876
UPDATE: 4/1000
Average reward: 47.23839
UPDATE: 5/1000
Average reward: 58.72467
UPDATE: 6/1000
Average reward: 70.07354
UPDATE: 7/1000
Average reward: 80.81484
UPDATE: 8/1000
Average reward: 91.09367
UPDATE: 9/1000
Average reward: 99.438515
UPDATE: 10/1000
Average reward: 106.62314
UPDATE: 11/1000
Average reward: 113.51161
UPDATE: 12/1000
Average reward: 119.564095
UPDATE: 13/1000
Average reward: 124.290825
UPDATE: 14/1000
Average reward: 128.73979
UPDATE: 15/1000
Average reward: 133.57414
UPDATE: 16/1000
Average reward: 138.9045
UPDATE: 17/1000
Average reward: 144.76987
UPDATE: 18/1000
Average reward: 150.83353
UPDATE: 19/1000
Average reward: 157.14632
UPDATE: 20/1000
Average reward: 164.11305
UPDATE: 21/1000
Average reward: 169.28952
UPDATE: 22/1000
Average reward: 172.69254
UPDATE: 23/1000
Average reward: 175.3569
UPDATE: 24/1000
Average reward: 176.95532
UPDATE: 25/1000
Average reward: 178.9683
UPDATE: 26/1000
Average reward: 181.20168
UPDATE: 27/1000
Average reward: 181.80255
UPDATE: 28/1000
Average reward: 180.2777
UPDATE: 29/1000
Average reward: 180.12268
UPDATE: 30/1000
Average reward: 181.02531
UPDATE: 31/1000
Average reward: 181.4665
UPDATE: 32/1000
Average reward: 183.60388
UPDATE: 33/1000
Average reward: 185.65347
UPDATE: 34/1000
Average reward: 187.14803
UPDATE: 35/1000
Average reward: 188.9467
UPDATE: 36/1000
Average reward: 190.9636
UPDATE: 37/1000
Average reward: 191.77216
UPDATE: 38/1000
Average reward: 192.87901
UPDATE: 39/1000
Average reward: 193.51202
UPDATE: 40/1000
Average reward: 193.63196
UPDATE: 41/1000
Average reward: 193.61353
UPDATE: 42/1000
Average reward: 193.45023
UPDATE: 43/1000
Average reward: 192.99863
UPDATE: 44/1000
Average reward: 192.70074
UPDATE: 45/1000
Average reward: 193.9325
UPDATE: 46/1000
Average reward: 194.94778
UPDATE: 47/1000
Average reward: 195.69595
UPDATE: 48/1000
Average reward: 196.64128
UPDATE: 49/1000
Average reward: 198.00903
UPDATE: 50/1000
Average reward: 198.79962
UPDATE: 51/1000
Average reward: 198.57443
UPDATE: 52/1000
Average reward: 198.8353
UPDATE: 53/1000
Average reward: 200.55789
UPDATE: 54/1000
Average reward: 201.96692
UPDATE: 55/1000
Average reward: 203.36748
UPDATE: 56/1000
Average reward: 205.30782
UPDATE: 57/1000
Average reward: 206.64304
UPDATE: 58/1000
Average reward: 207.92769
UPDATE: 59/1000
Average reward: 207.49287
UPDATE: 60/1000
Average reward: 206.19316
UPDATE: 61/1000
Average reward: 204.89262
UPDATE: 62/1000
Average reward: 205.22922
UPDATE: 63/1000
Average reward: 205.86087
UPDATE: 64/1000
Average reward: 204.29387
UPDATE: 65/1000
Average reward: 205.1164
UPDATE: 66/1000
Average reward: 205.633
UPDATE: 67/1000
Average reward: 206.39139
UPDATE: 68/1000
Average reward: 206.53702
UPDATE: 69/1000
Average reward: 206.12703
UPDATE: 70/1000
Average reward: 204.93811
UPDATE: 71/1000
Average reward: 203.72595
UPDATE: 72/1000
Average reward: 203.89532
UPDATE: 73/1000
Average reward: 203.45105
UPDATE: 74/1000
Average reward: 205.19106
UPDATE: 75/1000
Average reward: 207.33713
UPDATE: 76/1000
Average reward: 207.80113
UPDATE: 77/1000
Average reward: 208.14928
UPDATE: 78/1000
Average reward: 208.02097
UPDATE: 79/1000
Average reward: 206.63513
UPDATE: 80/1000
Average reward: 205.82341
UPDATE: 81/1000
Average reward: 205.6436
UPDATE: 82/1000
Average reward: 206.319
UPDATE: 83/1000
Average reward: 206.9334
UPDATE: 84/1000
Average reward: 207.39674
UPDATE: 85/1000
Average reward: 206.10637
UPDATE: 86/1000
Average reward: 204.64835
UPDATE: 87/1000
Average reward: 204.1276
UPDATE: 88/1000
Average reward: 204.91335
UPDATE: 89/1000
Average reward: 204.57014
UPDATE: 90/1000
Average reward: 203.99928
UPDATE: 91/1000
Average reward: 203.80013
UPDATE: 92/1000
Average reward: 205.24799
UPDATE: 93/1000
Average reward: 206.34262
UPDATE: 94/1000
Average reward: 207.81686
UPDATE: 95/1000
Average reward: 208.50906
UPDATE: 96/1000
Average reward: 209.99718
UPDATE: 97/1000
Average reward: 211.37968
UPDATE: 98/1000
Average reward: 211.50465
UPDATE: 99/1000
Average reward: 210.86778
UPDATE: 100/1000
Saving model...
=== Testing Agent: 1 Episodes ===
=== Observation Space ===
Shape: (59,)
Type: float32
=== Episode 1/1 ===
Step 0 | Action: left | Reward: 0.00 |
Step 1 | Action: left | Reward: 0.00 |
Step 2 | Action: shake_bang | Reward: 0.00 |
Step 3 | Action: left | Reward: 0.00 |
Step 4 | Action: pickup_0 | Reward: 0.00 |
Step 5 | Action: left | Reward: 0.00 |
Step 6 | Action: pickup_0 | Reward: 0.00 |
Step 7 | Action: toggle | Reward: 0.00 |
Step 8 | Action: toggle | Reward: 0.00 |
Step 9 | Action: left | Reward: 0.00 |
Step 10 | Action: toggle | Reward: 0.00 |
Step 11 | Action: left | Reward: 0.00 |
Step 12 | Action: pickup_0 | Reward: 0.00 |
Step 13 | Action: right | Reward: 0.00 |
Step 14 | Action: right | Reward: 0.00 |
Step 15 | Action: left | Reward: 0.00 |
Step 16 | Action: drop_0 | Reward: 0.00 |
Step 17 | Action: forward | Reward: 0.00 |
Step 18 | Action: forward | Reward: 0.00 |
Step 19 | Action: forward | Reward: 0.00 |
Step 20 | Action: toggle | Reward: 0.00 |
Step 21 | Action: right | Reward: 0.00 |
Step 22 | Action: drop_0 | Reward: 0.00 |
Step 23 | Action: pickup_0 | Reward: 0.00 |
Step 24 | Action: toggle | Reward: 0.00 |
Step 25 | Action: shake_bang | Reward: 0.00 |
Step 26 | Action: right | Reward: 0.00 |
Step 27 | Action: toggle | Reward: 0.00 |
Step 28 | Action: toggle | Reward: 0.00 |
Step 29 | Action: right | Reward: 0.00 |
Step 30 | Action: left | Reward: 0.00 |
Step 31 | Action: toggle | Reward: 0.00 |
Step 32 | Action: right | Reward: 0.00 |
Step 33 | Action: left | Reward: 0.00 |
Step 34 | Action: right | Reward: 0.00 |
Step 35 | Action: left | Reward: 0.00 |
Step 36 | Action: pickup_0 | Reward: 0.00 |
Step 37 | Action: drop_0 | Reward: 0.00 |
Step 38 | Action: left | Reward: 0.00 |
Step 39 | Action: drop_0 | Reward: 0.00 |
Step 40 | Action: shake_bang | Reward: 0.00 |
Step 41 | Action: right | Reward: 0.00 |
Step 42 | Action: forward | Reward: 0.00 |
Step 43 | Action: shake_bang | Reward: 0.00 |
Step 44 | Action: right | Reward: 0.00 |
Step 45 | Action: right | Reward: 0.00 |
Step 46 | Action: pickup_0 | Reward: 0.00 |
Step 47 | Action: left | Reward: 0.00 |
Step 48 | Action: toggle | Reward: 0.00 |
Step 49 | Action: right | Reward: 0.00 |
Step 50 | Action: forward | Reward: 0.00 |
Step 51 | Action: toggle | Reward: 0.00 |
Step 52 | Action: toggle | Reward: 0.00 |
Step 53 | Action: forward | Reward: 0.00 |
Step 54 | Action: toggle | Reward: 0.00 |
Step 55 | Action: toggle | Reward: 0.00 |
Step 56 | Action: shake_bang | Reward: 0.00 |
Step 57 | Action: shake_bang | Reward: 0.00 |
Step 58 | Action: pickup_0 | Reward: 0.00 |
Step 59 | Action: shake_bang | Reward: 0.00 |
Step 60 | Action: right | Reward: 0.00 |
Step 61 | Action: left | Reward: 0.00 |
Step 62 | Action: pickup_0 | Reward: 0.00 |
Step 63 | Action: drop_0 | Reward: 0.00 |
Step 64 | Action: shake_bang | Reward: 0.00 |
Step 65 | Action: pickup_0 | Reward: 0.00 |
Step 66 | Action: drop_0 | Reward: 0.00 |
Step 67 | Action: shake_bang | Reward: 0.00 |
Step 68 | Action: forward | Reward: 0.00 |
Step 69 | Action: pickup_0 | Reward: 0.00 |
Step 70 | Action: toggle | Reward: 0.00 |
Step 71 | Action: forward | Reward: 0.00 |
Step 72 | Action: pickup_0 | Reward: 0.00 |
Step 73 | Action: pickup_0 | Reward: 0.00 |
Step 74 | Action: toggle | Reward: 0.00 |
Step 75 | Action: forward | Reward: 0.00 |
Step 76 | Action: left | Reward: 0.00 |
Step 77 | Action: drop_0 | Reward: 0.00 |
Step 78 | Action: forward | Reward: 0.00 |
Step 79 | Action: forward | Reward: 0.00 |
Step 80 | Action: right | Reward: 0.00 |
Step 81 | Action: drop_0 | Reward: 0.00 |
Step 82 | Action: right | Reward: 0.00 |
Step 83 | Action: forward | Reward: 0.00 |
Step 84 | Action: shake_bang | Reward: 0.00 |
Step 85 | Action: forward | Reward: 0.00 |
Step 86 | Action: forward | Reward: 0.00 |
Step 87 | Action: drop_0 | Reward: 0.00 |
Step 88 | Action: shake_bang | Reward: 0.00 |
Step 89 | Action: forward | Reward: 0.00 |
Step 90 | Action: right | Reward: 0.00 |
Step 91 | Action: left | Reward: 0.00 |
Step 92 | Action: drop_0 | Reward: 0.00 |
Step 93 | Action: pickup_0 | Reward: 0.00 |
Step 94 | Action: pickup_0 | Reward: 0.00 |
Step 95 | Action: pickup_0 | Reward: 0.00 |
Step 96 | Action: left | Reward: 0.00 |
Step 97 | Action: shake_bang | Reward: 0.00 |
Step 98 | Action: pickup_0 | Reward: 0.00 |
Step 99 | Action: pickup_0 | Reward: 0.00 |
Step 100 | Action: left | Reward: 0.00 |
Step 101 | Action: drop_0 | Reward: 0.00 |
Step 102 | Action: shake_bang | Reward: 0.00 |
Step 103 | Action: pickup_0 | Reward: 0.00 |
Step 104 | Action: shake_bang | Reward: 0.00 |
Step 105 | Action: pickup_0 | Reward: 0.00 |
Step 106 | Action: left | Reward: 0.00 |
Step 107 | Action: forward | Reward: 0.00 |
Step 108 | Action: right | Reward: 0.00 |
Step 109 | Action: forward | Reward: 0.00 |
Step 110 | Action: pickup_0 | Reward: 0.00 |
Step 111 | Action: right | Reward: 0.00 |
Step 112 | Action: right | Reward: 0.00 |
Step 113 | Action: left | Reward: 0.00 |
Step 114 | Action: forward | Reward: 0.00 |
Step 115 | Action: pickup_0 | Reward: 0.00 |
Step 116 | Action: pickup_0 | Reward: 0.00 |
Step 117 | Action: left | Reward: 0.00 |
Step 118 | Action: pickup_0 | Reward: 0.00 |
Step 119 | Action: shake_bang | Reward: 0.00 |
Step 120 | Action: pickup_0 | Reward: 0.00 |
Step 121 | Action: drop_0 | Reward: 0.00 |
Step 122 | Action: pickup_0 | Reward: 0.00 |
Step 123 | Action: forward | Reward: 0.00 |
Step 124 | Action: forward | Reward: 0.00 |
Step 125 | Action: toggle | Reward: 0.00 |
Step 126 | Action: right | Reward: 0.00 |
Step 127 | Action: drop_0 | Reward: 0.00 |
Step 128 | Action: pickup_0 | Reward: 0.00 |
Step 129 | Action: toggle | Reward: 0.00 |
Step 130 | Action: left | Reward: 0.00 |
Step 131 | Action: left | Reward: 0.00 |
Step 132 | Action: drop_0 | Reward: 0.00 |
Step 133 | Action: right | Reward: 0.00 |
Step 134 | Action: shake_bang | Reward: 0.00 |
Step 135 | Action: drop_0 | Reward: 0.00 |
Step 136 | Action: right | Reward: 0.00 |
Step 137 | Action: left | Reward: 0.00 |
Step 138 | Action: left | Reward: 0.00 |
Step 139 | Action: drop_0 | Reward: 0.00 |
Step 140 | Action: drop_0 | Reward: 0.00 |
Step 141 | Action: toggle | Reward: 0.00 |
Step 142 | Action: left | Reward: 0.00 |
Step 143 | Action: pickup_0 | Reward: 0.00 |
Step 144 | Action: left | Reward: 0.00 |
Step 145 | Action: shake_bang | Reward: 0.00 |
Step 146 | Action: drop_0 | Reward: 0.00 |
Step 147 | Action: shake_bang | Reward: 0.00 |
Step 148 | Action: drop_0 | Reward: 0.00 |
Step 149 | Action: shake_bang | Reward: 0.00 |
Step 150 | Action: drop_0 | Reward: 0.00 |
Step 151 | Action: toggle | Reward: 0.00 |
Step 152 | Action: right | Reward: 0.00 |
Step 153 | Action: pickup_0 | Reward: 0.00 |
Step 154 | Action: toggle | Reward: 0.00 |
Step 155 | Action: left | Reward: 0.00 |
Step 156 | Action: toggle | Reward: 0.00 |
Step 157 | Action: forward | Reward: 0.00 |
Step 158 | Action: left | Reward: 0.00 |
Step 159 | Action: pickup_0 | Reward: 0.00 |
Step 160 | Action: shake_bang | Reward: 0.00 |
Step 161 | Action: drop_0 | Reward: 0.00 |
Step 162 | Action: pickup_0 | Reward: 0.00 |
Step 163 | Action: toggle | Reward: 0.00 |
Step 164 | Action: drop_0 | Reward: 0.00 |
Step 165 | Action: pickup_0 | Reward: 0.00 |
Step 166 | Action: drop_0 | Reward: 0.00 |
Step 167 | Action: pickup_0 | Reward: 0.00 |
Step 168 | Action: forward | Reward: 0.00 |
Step 169 | Action: shake_bang | Reward: 0.00 |
Step 170 | Action: drop_0 | Reward: 0.00 |
Step 171 | Action: shake_bang | Reward: 0.00 |
Step 172 | Action: left | Reward: 0.00 |
Step 173 | Action: pickup_0 | Reward: 0.00 |
Step 174 | Action: pickup_0 | Reward: 0.00 |
Step 175 | Action: drop_0 | Reward: 0.00 |
Step 176 | Action: shake_bang | Reward: 0.00 |
Step 177 | Action: left | Reward: 0.00 |
Step 178 | Action: pickup_0 | Reward: 0.00 |
Step 179 | Action: left | Reward: 0.00 |
Step 180 | Action: forward | Reward: 0.00 |
Step 181 | Action: toggle | Reward: 0.00 |
Step 182 | Action: pickup_0 | Reward: 0.00 |
Step 183 | Action: right | Reward: 0.00 |
Step 184 | Action: shake_bang | Reward: 0.00 |
Step 185 | Action: left | Reward: 0.00 |
Step 186 | Action: toggle | Reward: 0.00 |
Step 187 | Action: toggle | Reward: 0.00 |
Step 188 | Action: left | Reward: 0.00 |
Step 189 | Action: left | Reward: 0.00 |
Step 190 | Action: pickup_0 | Reward: 0.00 |
Step 191 | Action: pickup_0 | Reward: 0.00 |
Step 192 | Action: right | Reward: 0.00 |
Step 193 | Action: pickup_0 | Reward: 0.00 |
Step 194 | Action: right | Reward: 0.00 |
Step 195 | Action: pickup_0 | Reward: 0.00 |
Step 196 | Action: pickup_0 | Reward: 0.00 |
Step 197 | Action: drop_0 | Reward: 0.00 |
Step 198 | Action: drop_0 | Reward: 0.00 |
Step 199 | Action: forward | Reward: 0.00 |
Step 200 | Action: forward | Reward: 0.00 |
Step 201 | Action: pickup_0 | Reward: 0.00 |
Step 202 | Action: pickup_0 | Reward: 0.00 |
Step 203 | Action: forward | Reward: 0.00 |
Step 204 | Action: left | Reward: 0.00 |
Step 205 | Action: forward | Reward: 0.00 |
Step 206 | Action: toggle | Reward: 0.00 |
Step 207 | Action: drop_0 | Reward: 0.00 |
Step 208 | Action: pickup_0 | Reward: 0.00 |
Step 209 | Action: pickup_0 | Reward: 0.00 |
Step 210 | Action: forward | Reward: 0.00 |
Step 211 | Action: drop_0 | Reward: 0.00 |
Step 212 | Action: pickup_0 | Reward: 0.00 |
Step 213 | Action: drop_0 | Reward: 0.00 |
Step 214 | Action: forward | Reward: 0.00 |
Step 215 | Action: toggle | Reward: 0.00 |
Step 216 | Action: right | Reward: 0.00 |
Step 217 | Action: right | Reward: 0.00 |
Step 218 | Action: pickup_0 | Reward: 0.00 |
Step 219 | Action: forward | Reward: 0.00 |
Step 220 | Action: pickup_0 | Reward: 0.00 |
Step 221 | Action: toggle | Reward: 0.00 |
Step 222 | Action: pickup_0 | Reward: 0.00 |
Step 223 | Action: left | Reward: 0.00 |
Step 224 | Action: pickup_0 | Reward: 0.00 |
Step 225 | Action: toggle | Reward: 0.00 |
Step 226 | Action: pickup_0 | Reward: 0.00 |
Step 227 | Action: forward | Reward: 0.00 |
Step 228 | Action: pickup_0 | Reward: 0.00 |
Step 229 | Action: left | Reward: 0.00 |
Step 230 | Action: left | Reward: 0.00 |
Step 231 | Action: forward | Reward: 0.00 |
Step 232 | Action: toggle | Reward: 0.00 |
Step 233 | Action: left | Reward: 0.00 |
Step 234 | Action: toggle | Reward: 0.00 |
Step 235 | Action: forward | Reward: 0.00 |
Step 236 | Action: pickup_0 | Reward: 0.00 |
Step 237 | Action: left | Reward: 0.00 |
Step 238 | Action: toggle | Reward: 0.00 |
Step 239 | Action: drop_0 | Reward: 0.00 |
Step 240 | Action: right | Reward: 0.00 |
Step 241 | Action: shake_bang | Reward: 0.00 |
Step 242 | Action: forward | Reward: 0.00 |
Step 243 | Action: left | Reward: 0.00 |
Step 244 | Action: pickup_0 | Reward: 0.00 |
Step 245 | Action: right | Reward: 0.00 |
Step 246 | Action: left | Reward: 0.00 |
Step 247 | Action: pickup_0 | Reward: 0.00 |
Step 248 | Action: left | Reward: 0.00 |
Step 249 | Action: right | Reward: 0.00 |
Step 250 | Action: right | Reward: 0.00 |
Step 251 | Action: toggle | Reward: 0.00 |
Step 252 | Action: toggle | Reward: 0.00 |
Step 253 | Action: forward | Reward: 0.00 |
Step 254 | Action: right | Reward: 0.00 |
Step 255 | Action: left | Reward: 0.00 |
Step 256 | Action: right | Reward: 0.00 |
Step 257 | Action: forward | Reward: 0.00 |
Step 258 | Action: left | Reward: 0.00 |
Step 259 | Action: left | Reward: 0.00 |
Step 260 | Action: pickup_0 | Reward: 0.00 |
Step 261 | Action: forward | Reward: 0.00 |
Step 262 | Action: shake_bang | Reward: 0.00 |
Step 263 | Action: drop_0 | Reward: 0.00 |
Step 264 | Action: shake_bang | Reward: 0.00 |
Step 265 | Action: pickup_0 | Reward: 0.00 |
Step 266 | Action: left | Reward: 0.00 |
Step 267 | Action: pickup_0 | Reward: 0.00 |
Step 268 | Action: pickup_0 | Reward: 0.00 |
Step 269 | Action: shake_bang | Reward: 0.00 |
Step 270 | Action: toggle | Reward: 0.00 |
Step 271 | Action: toggle | Reward: 0.00 |
Step 272 | Action: pickup_0 | Reward: 0.00 |
Step 273 | Action: pickup_0 | Reward: 0.00 |
Step 274 | Action: right | Reward: 0.00 |
Step 275 | Action: drop_0 | Reward: 0.00 |
Step 276 | Action: drop_0 | Reward: 0.00 |
Step 277 | Action: right | Reward: 0.00 |
Step 278 | Action: toggle | Reward: 0.00 |
Step 279 | Action: right | Reward: 0.00 |
Step 280 | Action: shake_bang | Reward: 0.00 |
Step 281 | Action: right | Reward: 0.00 |
Step 282 | Action: toggle | Reward: 0.00 |
Step 283 | Action: toggle | Reward: 0.00 |
Step 284 | Action: drop_0 | Reward: 0.00 |
Step 285 | Action: left | Reward: 0.00 |
Step 286 | Action: pickup_0 | Reward: 0.00 |
Step 287 | Action: shake_bang | Reward: 0.00 |
Step 288 | Action: forward | Reward: 0.00 |
Step 289 | Action: forward | Reward: 0.00 |
Step 290 | Action: pickup_0 | Reward: 0.00 |
Step 291 | Action: pickup_0 | Reward: 0.00 |
Step 292 | Action: forward | Reward: 0.00 |
Step 293 | Action: left | Reward: 0.00 |
Step 294 | Action: forward | Reward: 0.00 |
Step 295 | Action: forward | Reward: 0.00 |
Step 296 | Action: left | Reward: 0.00 |
Step 297 | Action: left | Reward: 0.00 |
Step 298 | Action: pickup_0 | Reward: 0.00 |
Step 299 | Action: shake_bang | Reward: 0.00 |
Step 300 | Action: drop_0 | Reward: 0.00 |
Step 301 | Action: right | Reward: 0.00 |
Step 302 | Action: left | Reward: 0.00 |
Step 303 | Action: forward | Reward: 0.00 |
Step 304 | Action: pickup_0 | Reward: 0.00 |
Step 305 | Action: drop_0 | Reward: 0.00 |
Step 306 | Action: pickup_0 | Reward: 0.00 |
Step 307 | Action: forward | Reward: 0.00 |
Step 308 | Action: forward | Reward: 0.00 |
Step 309 | Action: left | Reward: 0.00 |
Step 310 | Action: right | Reward: 0.00 |
Step 311 | Action: pickup_0 | Reward: 0.00 |
Step 312 | Action: left | Reward: 0.00 |
Step 313 | Action: forward | Reward: 0.00 |
Step 314 | Action: forward | Reward: 0.00 |
Step 315 | Action: drop_0 | Reward: 0.00 |
Step 316 | Action: left | Reward: 0.00 |
Step 317 | Action: left | Reward: 0.00 |
Step 318 | Action: pickup_0 | Reward: 0.00 |
Step 319 | Action: left | Reward: 0.00 |
Step 320 | Action: forward | Reward: 0.00 |
Step 321 | Action: left | Reward: 0.00 |
Step 322 | Action: forward | Reward: 0.00 |
Step 323 | Action: pickup_0 | Reward: 0.00 |
Step 324 | Action: left | Reward: 0.00 |
Step 325 | Action: forward | Reward: 0.00 |
Step 326 | Action: left | Reward: 0.00 |
Step 327 | Action: left | Reward: 0.00 |
Step 328 | Action: toggle | Reward: 0.00 |
Step 329 | Action: forward | Reward: 0.00 |
Step 330 | Action: forward | Reward: 0.00 |
Step 331 | Action: drop_0 | Reward: 0.00 |
Step 332 | Action: toggle | Reward: 0.00 |
Step 333 | Action: pickup_0 | Reward: 0.00 |
Step 334 | Action: left | Reward: 0.00 |
Step 335 | Action: forward | Reward: 0.00 |
Step 336 | Action: forward | Reward: 0.00 |
Step 337 | Action: forward | Reward: 0.00 |
Step 338 | Action: drop_0 | Reward: 0.00 |
Step 339 | Action: left | Reward: 0.00 |
Step 340 | Action: forward | Reward: 0.00 |
Step 341 | Action: forward | Reward: 0.00 |
Step 342 | Action: pickup_0 | Reward: 0.00 |
Step 343 | Action: drop_0 | Reward: 0.00 |
Step 344 | Action: drop_0 | Reward: 0.00 |
Step 345 | Action: pickup_0 | Reward: 0.00 |
Step 346 | Action: toggle | Reward: 0.00 |
Step 347 | Action: shake_bang | Reward: 0.00 |
Step 348 | Action: drop_0 | Reward: 0.00 |
Step 349 | Action: forward | Reward: 0.00 |
Step 350 | Action: right | Reward: 0.00 |
Step 351 | Action: forward | Reward: 0.00 |
Step 352 | Action: forward | Reward: 0.00 |
Step 353 | Action: drop_0 | Reward: 0.00 |
Step 354 | Action: left | Reward: 0.00 |
Step 355 | Action: right | Reward: 0.00 |
Step 356 | Action: pickup_0 | Reward: 0.00 |
Step 357 | Action: forward | Reward: 0.00 |
Step 358 | Action: shake_bang | Reward: 0.00 |
Step 359 | Action: left | Reward: 0.00 |
Step 360 | Action: pickup_0 | Reward: 0.00 |
Step 361 | Action: forward | Reward: 0.00 |
Step 362 | Action: shake_bang | Reward: 0.00 |
Step 363 | Action: pickup_0 | Reward: 0.00 |
Step 364 | Action: forward | Reward: 0.00 |
Step 365 | Action: pickup_0 | Reward: 0.00 |
Step 366 | Action: toggle | Reward: 0.00 |
Step 367 | Action: forward | Reward: 0.00 |
Step 368 | Action: left | Reward: 0.00 |
Step 369 | Action: forward | Reward: 0.00 |
Step 370 | Action: left | Reward: 0.00 |
Step 371 | Action: pickup_0 | Reward: 0.00 |
Step 372 | Action: right | Reward: 0.00 |
Step 373 | Action: left | Reward: 0.00 |
Step 374 | Action: left | Reward: 0.00 |
Step 375 | Action: pickup_0 | Reward: 0.00 |
Step 376 | Action: pickup_0 | Reward: 0.00 |
Step 377 | Action: forward | Reward: 0.00 |
Step 378 | Action: left | Reward: 0.00 |
Step 379 | Action: pickup_0 | Reward: 0.00 |
Step 380 | Action: right | Reward: 0.00 |
Step 381 | Action: drop_0 | Reward: 0.00 |
Step 382 | Action: drop_0 | Reward: 0.00 |
Step 383 | Action: forward | Reward: 0.00 |
Step 384 | Action: left | Reward: 0.00 |
Step 385 | Action: right | Reward: 0.00 |
Step 386 | Action: pickup_0 | Reward: 0.00 |
wandb: WARNING `fps` argument does not affect the frame rate of the video when providing a file path or raw bytes.
Step 387 | Action: right | Reward: 0.00 |
Step 388 | Action: toggle | Reward: 0.00 |
Step 389 | Action: shake_bang | Reward: 0.00 |
Step 390 | Action: toggle | Reward: 0.00 |
Step 391 | Action: right | Reward: 0.00 |
Step 392 | Action: drop_0 | Reward: 0.00 |
Step 393 | Action: left | Reward: 0.00 |
Step 394 | Action: left | Reward: 0.00 |
Step 395 | Action: forward | Reward: 0.00 |
Step 396 | Action: right | Reward: 0.00 |
Step 397 | Action: left | Reward: 0.00 |
Step 398 | Action: forward | Reward: 0.00 |
Step 399 | Action: shake_bang | Reward: 0.00 |
Step 400 | Action: forward | Reward: 0.00 |
Step 401 | Action: pickup_0 | Reward: 0.00 |
Step 402 | Action: drop_0 | Reward: 0.00 |
Step 403 | Action: shake_bang | Reward: 0.00 |
Step 404 | Action: drop_0 | Reward: 0.00 |
Step 405 | Action: pickup_0 | Reward: 0.00 |
Step 406 | Action: right | Reward: 0.00 |
Step 407 | Action: toggle | Reward: 0.00 |
Step 408 | Action: left | Reward: 0.00 |
Step 409 | Action: forward | Reward: 0.00 |
Step 410 | Action: pickup_0 | Reward: 0.00 |
Step 411 | Action: drop_0 | Reward: 0.00 |
Step 412 | Action: toggle | Reward: 0.00 |
Step 413 | Action: shake_bang | Reward: 0.00 |
Step 414 | Action: shake_bang | Reward: 0.00 |
Step 415 | Action: pickup_0 | Reward: 0.00 |
Step 416 | Action: forward | Reward: 0.00 |
Step 417 | Action: forward | Reward: 0.00 |
Step 418 | Action: shake_bang | Reward: 0.00 |
Step 419 | Action: shake_bang | Reward: 0.00 |
Step 420 | Action: shake_bang | Reward: 0.00 |
Step 421 | Action: right | Reward: 0.00 |
Step 422 | Action: forward | Reward: 0.00 |
Step 423 | Action: shake_bang | Reward: 0.00 |
Step 424 | Action: forward | Reward: 0.00 |
Step 425 | Action: forward | Reward: 0.00 |
Step 426 | Action: drop_0 | Reward: 0.00 |
Step 427 | Action: left | Reward: 0.00 |
Step 428 | Action: pickup_0 | Reward: 0.00 |
Step 429 | Action: left | Reward: 0.00 |
Step 430 | Action: toggle | Reward: 0.00 |
Step 431 | Action: toggle | Reward: 0.00 |
Step 432 | Action: shake_bang | Reward: 0.00 |
Step 433 | Action: pickup_0 | Reward: 0.00 |
Step 434 | Action: left | Reward: 0.00 |
Step 435 | Action: shake_bang | Reward: 0.00 |
Step 436 | Action: forward | Reward: 0.00 |
Step 437 | Action: drop_0 | Reward: 0.00 |
Step 438 | Action: left | Reward: 0.00 |
Step 439 | Action: pickup_0 | Reward: 0.00 |
Step 440 | Action: forward | Reward: 0.00 |
Step 441 | Action: drop_0 | Reward: 0.00 |
Step 442 | Action: left | Reward: 0.00 |
Step 443 | Action: drop_0 | Reward: 0.00 |
Step 444 | Action: right | Reward: 0.00 |
Step 445 | Action: pickup_0 | Reward: 0.00 |
Step 446 | Action: shake_bang | Reward: 0.00 |
Step 447 | Action: right | Reward: 0.00 |
Step 448 | Action: left | Reward: 0.00 |
Step 449 | Action: forward | Reward: 0.00 |
Step 450 | Action: drop_0 | Reward: 0.00 |
Step 451 | Action: shake_bang | Reward: 0.00 |
Step 452 | Action: forward | Reward: 0.00 |
Step 453 | Action: forward | Reward: 0.00 |
Step 454 | Action: forward | Reward: 0.00 |
Step 455 | Action: right | Reward: 0.00 |
Step 456 | Action: pickup_0 | Reward: 0.00 |
Step 457 | Action: forward | Reward: 0.00 |
Step 458 | Action: right | Reward: 0.00 |
Step 459 | Action: pickup_0 | Reward: 0.00 |
Step 460 | Action: shake_bang | Reward: 0.00 |
Step 461 | Action: left | Reward: 0.00 |
Step 462 | Action: left | Reward: 0.00 |
Step 463 | Action: pickup_0 | Reward: 0.00 |
Step 464 | Action: pickup_0 | Reward: 0.00 |
Step 465 | Action: forward | Reward: 0.00 |
Step 466 | Action: toggle | Reward: 0.00 |
Step 467 | Action: right | Reward: 0.00 |
Step 468 | Action: left | Reward: 0.00 |
Step 469 | Action: drop_0 | Reward: 0.00 |
Step 470 | Action: drop_0 | Reward: 0.00 |
Step 471 | Action: pickup_0 | Reward: 0.00 |
Step 472 | Action: left | Reward: 0.00 |
Step 473 | Action: left | Reward: 0.00 |
Step 474 | Action: toggle | Reward: 0.00 |
Step 475 | Action: drop_0 | Reward: 0.00 |
Step 476 | Action: forward | Reward: 0.00 |
Step 477 | Action: drop_0 | Reward: 0.00 |
Step 478 | Action: right | Reward: 0.00 |
Step 479 | Action: pickup_0 | Reward: 0.00 |
Step 480 | Action: left | Reward: 0.00 |
Step 481 | Action: left | Reward: 0.00 |
Step 482 | Action: forward | Reward: 0.00 |
Step 483 | Action: drop_0 | Reward: 0.00 |
Step 484 | Action: drop_0 | Reward: 0.00 |
Step 485 | Action: shake_bang | Reward: 0.00 |
Step 486 | Action: pickup_0 | Reward: 0.00 |
Step 487 | Action: drop_0 | Reward: 0.00 |
Step 488 | Action: forward | Reward: 0.00 |
Step 489 | Action: forward | Reward: 0.00 |
Step 490 | Action: left | Reward: 0.00 |
Step 491 | Action: left | Reward: 0.00 |
Step 492 | Action: forward | Reward: 0.00 |
Step 493 | Action: shake_bang | Reward: 0.00 |
Step 494 | Action: left | Reward: 0.00 |
Step 495 | Action: forward | Reward: 0.00 |
Step 496 | Action: right | Reward: 0.00 |
Step 497 | Action: pickup_0 | Reward: 0.00 |
Step 498 | Action: pickup_0 | Reward: 0.00 |
Step 499 | Action: forward | Reward: 0.00 |
Average reward: 210.59023
UPDATE: 101/1000
Average reward: 208.96217
UPDATE: 102/1000
Average reward: 209.10728
UPDATE: 103/1000
Average reward: 210.53117
UPDATE: 104/1000
Average reward: 211.26884
UPDATE: 105/1000
Average reward: 211.80267
UPDATE: 106/1000
Average reward: 213.19891
UPDATE: 107/1000
Average reward: 214.96353
UPDATE: 108/1000
Average reward: 216.98596
UPDATE: 109/1000
Average reward: 219.31085
UPDATE: 110/1000
Average reward: 220.85034
UPDATE: 111/1000
Average reward: 221.86388
UPDATE: 112/1000
Average reward: 223.90991
UPDATE: 113/1000
Average reward: 224.59753
UPDATE: 114/1000
Average reward: 223.98422
UPDATE: 115/1000
Average reward: 223.68123
UPDATE: 116/1000
Average reward: 223.829
UPDATE: 117/1000
Average reward: 223.96353
UPDATE: 118/1000
Average reward: 222.09535
UPDATE: 119/1000
Average reward: 220.20438
UPDATE: 120/1000
Average reward: 218.17021
UPDATE: 121/1000
Average reward: 217.5609
UPDATE: 122/1000
Average reward: 217.98181
UPDATE: 123/1000
Average reward: 219.6976
UPDATE: 124/1000
Average reward: 221.42873
UPDATE: 125/1000
Average reward: 222.9676
UPDATE: 126/1000
Average reward: 224.37114
UPDATE: 127/1000
Average reward: 226.11658
UPDATE: 128/1000
Average reward: 228.21962
UPDATE: 129/1000
Average reward: 228.63466
UPDATE: 130/1000
Average reward: 228.58948
UPDATE: 131/1000
Average reward: 229.96913
UPDATE: 132/1000
Average reward: 231.26923
UPDATE: 133/1000
Average reward: 232.59941
UPDATE: 134/1000
Average reward: 234.44225
UPDATE: 135/1000
Average reward: 235.87749
UPDATE: 136/1000
Average reward: 236.83362
UPDATE: 137/1000
Average reward: 235.08096
UPDATE: 138/1000
Average reward: 233.85779
UPDATE: 139/1000
Average reward: 234.92603
UPDATE: 140/1000
Average reward: 236.13335
UPDATE: 141/1000
Average reward: 237.58974
UPDATE: 142/1000
Average reward: 239.11281
UPDATE: 143/1000
Average reward: 240.30138
UPDATE: 144/1000
Average reward: 241.70642
UPDATE: 145/1000
Average reward: 241.47974
UPDATE: 146/1000
Average reward: 241.39395
UPDATE: 147/1000
Average reward: 242.62773
UPDATE: 148/1000
Average reward: 243.97362
UPDATE: 149/1000
Average reward: 245.54778
UPDATE: 150/1000
Average reward: 247.11302
UPDATE: 151/1000
Average reward: 248.3104
UPDATE: 152/1000
Average reward: 249.46205
UPDATE: 153/1000
Average reward: 247.51968
UPDATE: 154/1000
Average reward: 245.78511
UPDATE: 155/1000
Average reward: 246.35237
UPDATE: 156/1000
Average reward: 247.77534
UPDATE: 157/1000
Average reward: 249.0758
UPDATE: 158/1000
Average reward: 250.20958
UPDATE: 159/1000
Average reward: 251.33578
UPDATE: 160/1000
Average reward: 252.38141
UPDATE: 161/1000
Average reward: 251.45071
UPDATE: 162/1000
Average reward: 250.18213
UPDATE: 163/1000
Average reward: 250.69537
UPDATE: 164/1000
Average reward: 251.75114
UPDATE: 165/1000
Average reward: 252.77284
UPDATE: 166/1000
Average reward: 253.73999
UPDATE: 167/1000
Average reward: 254.47212
UPDATE: 168/1000
Average reward: 255.25066
UPDATE: 169/1000
Average reward: 254.22885
UPDATE: 170/1000
Average reward: 252.99486
UPDATE: 171/1000
Average reward: 253.48753
UPDATE: 172/1000
Average reward: 254.08241
UPDATE: 173/1000
Average reward: 255.05997
UPDATE: 174/1000
Average reward: 255.84497
UPDATE: 175/1000
Average reward: 256.51044
UPDATE: 176/1000
Average reward: 257.16684
UPDATE: 177/1000
Average reward: 255.95706
UPDATE: 178/1000
Average reward: 255.0902
UPDATE: 179/1000
Average reward: 255.67934
UPDATE: 180/1000
Average reward: 256.43628
UPDATE: 181/1000
Average reward: 257.13943
UPDATE: 182/1000
Average reward: 257.7421
UPDATE: 183/1000
Average reward: 258.33392
UPDATE: 184/1000
Average reward: 258.82007
UPDATE: 185/1000
Average reward: 257.26703
UPDATE: 186/1000
Average reward: 256.20724
UPDATE: 187/1000
Average reward: 255.65881
UPDATE: 188/1000
Average reward: 256.35822
UPDATE: 189/1000
Average reward: 257.1002
UPDATE: 190/1000
Average reward: 257.86652
UPDATE: 191/1000
Average reward: 258.4741
UPDATE: 192/1000
Average reward: 259.00494
UPDATE: 193/1000
Average reward: 257.93552
UPDATE: 194/1000
Average reward: 257.33167
UPDATE: 195/1000
Average reward: 257.33395
UPDATE: 196/1000
Average reward: 257.65524
UPDATE: 197/1000
Average reward: 258.0801
UPDATE: 198/1000
Average reward: 258.4972
UPDATE: 199/1000
Average reward: 258.87976
UPDATE: 200/1000
Saving model...
=== Testing Agent: 1 Episodes ===
=== Observation Space ===