Codebase writes malformed learned*ft.json files #64

Open
yangengineering opened this issue Jul 4, 2024 · 8 comments

@yangengineering

Dear Orr,

Thanks again for open-sourcing your code. While using it, I ran into a problem with exemplar_replay_prev_file. After I run the following command:
PY_ARGS=${@:1}
python -u main_open_world.py \
    --output_dir "${EXP_DIR}/t2" --dataset TOWOD --PREV_INTRODUCED_CLS 20 --CUR_INTRODUCED_CLS 20 \
    --train_set 'owod_t2_train' --test_set 'owod_all_task_test' --epochs 51 \
    --model_type 'prob' --obj_loss_coef 8e-4 --obj_temp 1.3 --freeze_prob_model \
    --wandb_name "${WANDB_NAME}_t2" \
    --exemplar_replay_selection --exemplar_replay_max_length 1743 --exemplar_replay_dir ${WANDB_NAME} \
    --exemplar_replay_prev_file "learned_owod_t1_ft.txt" --exemplar_replay_cur_file "learned_owod_t2_ft.txt" \
    --pretrain "${EXP_DIR}/t1/checkpoint0040.pth" --lr 2e-5 \
    --resume ./exps/MOWODB/PROB/t2/checkpoint0050.pth \
    ${PY_ARGS}
some images cannot be found, as shown below (for example, 02):
5c7afff3b5c2f00b7b16e30a8f37ddd
I have to delete these image names from the file manually before the next step can run.
I don't know why this happens. I am looking forward to your help.

Best,
Zhenni Yang
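
As a stopgap for the manual cleanup described above, the filtering could be scripted. The following is a minimal sketch and not part of the repository. It assumes the replay file holds one image ID per line and that each ID maps to a file named `<image_dir>/<id>.jpg`. The function name `filter_replay_ids` and both layout details are assumptions for illustration and would need adapting to the actual dataset layout.

```python
from pathlib import Path


def filter_replay_ids(replay_file, image_dir, suffix=".jpg"):
    """Split the IDs listed in replay_file into (kept, dropped) lists.

    Assumes one image ID per line and images stored as
    <image_dir>/<id><suffix>. Both are illustrative assumptions,
    not the repository's documented format.
    """
    image_dir = Path(image_dir)
    kept, dropped = [], []
    for line in Path(replay_file).read_text().splitlines():
        image_id = line.strip()
        if not image_id:
            continue  # ignore blank lines
        if (image_dir / f"{image_id}{suffix}").is_file():
            kept.append(image_id)
        else:
            # No matching image on disk: likely a malformed or truncated ID.
            dropped.append(image_id)
    return kept, dropped
```

The `dropped` list can be inspected before anything is rewritten, which may also help show whether the bad entries are truncated IDs or IDs of images that are truly absent.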

@orrzohar
Owner

Hi @yangengineering,

I can't reproduce this bug.
What GPU and CUDA version are you using? Did you follow my installation instructions?

Best,
Orr

@yangengineering
Author

I followed the installation instructions and am using four 3090 GPUs. I noticed that someone reported the same problem in an earlier issue, but I don't know why it happens.

@yangengineering
Author

This problem is the same as the one in issue #13.

@orrzohar orrzohar reopened this Aug 11, 2024
@orrzohar
Owner

Hi @yangengineering,

Yes, this was an issue, but we debugged and fixed it in PR #15 (as indicated in issue #13).

When did you clone this repository? For now, I suggest you just delete the entries that are malformed. Looking at this now, the only thing I can think of is changing:

if args.exemplar_replay_selection:

to:

if args.exemplar_replay_selection and utils.is_main_process(): 

since multiple processes may be trying to write to the files at the same time, which could be causing this issue.

Best,
Orr
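
To illustrate the idea of letting only one process write the file, here is a minimal, self-contained sketch. It is not the repository's code: `write_ids_atomically`, its `is_main` flag, and the optional `barrier` callable are hypothetical stand-ins (in a distributed run they might correspond to `utils.is_main_process()` and a synchronization call such as `torch.distributed.barrier`). The data goes to a temporary file first and is then renamed with `os.replace`, so a reader should not see a half-written file.

```python
import os
import tempfile


def write_ids_atomically(path, image_ids, is_main=True, barrier=None):
    """Write one image ID per line, from the main process only.

    is_main and barrier are hypothetical hooks for the caller's
    distributed setup; they are not names from the repository.
    """
    if is_main:
        directory = os.path.dirname(os.path.abspath(path))
        os.makedirs(directory, exist_ok=True)
        # Write to a temp file in the same directory, then rename into place.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as handle:
                handle.write("".join(f"{image_id}\n" for image_id in image_ids))
            os.replace(tmp_path, path)
        except BaseException:
            # Remove the temp file if the write failed before the rename.
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise
    if barrier is not None:
        # Called by every process, so none of them reads the file too early.
        barrier()
```

Note that the barrier sits outside the `if is_main` branch. If a synchronization or collective call were placed inside a main-process-only branch, the other processes would never reach it, which is a common way for distributed jobs to stall.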

@orrzohar orrzohar self-assigned this Aug 11, 2024
@orrzohar
Owner

Also, please let me know whether this works so I can fix this issue permanently.

Best,

Orr

@orrzohar orrzohar changed the title from "A problem about learned_owod_t2_ft.txt" to "Codebase writes malformed learned*ft.json files" Aug 11, 2024
@yangengineering
Author

yangengineering commented Sep 11, 2024

@orrzohar
I have replaced 'if args.exemplar_replay_selection:' with 'if args.exemplar_replay_selection and utils.is_main_process():'.
However, this change caused a new problem.
image
After the first step finishes,
image
the file given by exemplar_replay_cur_file is not saved in exemplar_replay_dir, and the code stops here:
image
At the same time, the following errors are reported:
image
image

Best,
Zhenni Yang

@yangengineering
Author

@orrzohar
At the same time, I tried the M_benchmark again with the original code 'if args.exemplar_replay_selection:'. The result is shown below:
image
This is the same problem I reported on July 4.
However, M_benchmark_random and S_benchmark don't have this issue.

Best,
Zhenni Yang.

@orrzohar
Owner

Hi Zhenni,

Did you see the 11.jpg file? Does it exist, or is it not there at all?

Orr
