How to Create a Custom Description Dataset for HumanML3D? #16

Open
mustafizur-r opened this issue Aug 25, 2024 · 6 comments

Comments

@mustafizur-r

Hi,
Great work!
I'm currently working on a project where I need to generate a custom description dataset similar to the one used in HumanML3D. I noticed that you've made changes to your own combat descriptions, and I'm curious about the best practices and steps involved in creating such a dataset. Could you help me?

@fyyakaxyy
Owner

Thanks for your appreciation.
For the descriptions, our work can be summarized in three steps: constructing a vocabulary, processing with an LLM, and manually filtering the results. You can find the detailed process at this link: https://zhuanlan.zhihu.com/p/691984079 (see §2.2, "Production Process")

@mustafizur-r
Author

Thank you for sharing the link. Could you also share the code you used to create the dataset?

@fyyakaxyy
Owner

We created the dataset according to the processing flow of HumanML3D. For text processing, you can refer to: https://github.com/EricGuo5513/HumanML3D/blob/main/text_process.py

@mustafizur-r
Author

mustafizur-r commented Aug 27, 2024

Thank you for sending me the link to the HumanML3D text-processing code.

import codecs as cs
from os.path import join as pjoin

import pandas as pd
from tqdm import tqdm

# process_text() is defined earlier in HumanML3D's text_process.py


def process_humanml3d(corpus):
    text_save_path = './dataset/pose_data_raw/texts'
    desc_all = corpus
    for i in tqdm(range(len(desc_all))):
        caption = desc_all.iloc[i]['caption']
        start = desc_all.iloc[i]['from']
        end = desc_all.iloc[i]['to']
        name = desc_all.iloc[i]['new_joint_name']
        # Tag the caption, then join the results as "word/tag" pairs
        word_list, pose_list = process_text(caption)
        tokens = ' '.join(['%s/%s' % (w, p) for w, p in zip(word_list, pose_list)])
        # Append one "caption#tokens#start#end" line to the .txt file named after the motion file
        with cs.open(pjoin(text_save_path, name.replace('npy', 'txt')), 'a+') as f:
            f.write('%s#%s#%s#%s\n' % (caption, tokens, start, end))


if __name__ == "__main__":
    corpus = pd.read_csv('./dataset/kit_mocap_dataset/desc_final.csv')
    process_humanml3d(corpus)

How do I create desc_final.csv?
I am mainly unsure about the 'new_joint_name' column.
Could you help me? Could I see your version of the file?

@fyyakaxyy
Owner

We used Excel to process the text: the first column holds the animation ID and the second column holds the text annotation. We then converted the sheet into id.txt files.
My suggestion is to read the HumanML3D paper and code. They contain all the code and steps.
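
For readers following along, the two-column workflow described above could be sketched roughly as follows. This is not the code used for this project; it is a minimal illustration under several assumptions: the sheet has been exported from Excel as a headerless CSV (ID in column one, caption in column two), each output line follows the `caption#tokens#start#end` layout written by the script quoted earlier in this thread, and `0.0#0.0` is used for start and end on the assumption that every caption covers the whole clip. The names `write_annotation_files` and `naive_tag` are hypothetical, and `naive_tag` is only a placeholder that labels every word `X`; HumanML3D's text_process.py derives real tags with its own `process_text` function.

```python
import csv
from pathlib import Path


def naive_tag(caption):
    """Hypothetical stand-in tagger: lowercase each word and label it 'X'.

    Swap in a real tagger (for example HumanML3D's process_text) for actual use.
    """
    words = (w.strip('.,!?;:').lower() for w in caption.split())
    return [(w, 'X') for w in words if w]


def write_annotation_files(table_path, out_dir, tag=naive_tag):
    """Turn (animation ID, caption) rows into one <id>.txt file per animation."""
    captions = {}
    with open(table_path, newline='', encoding='utf-8') as f:
        for row in csv.reader(f):
            # Skip blank or incomplete rows
            if len(row) < 2 or not row[0].strip():
                continue
            captions.setdefault(row[0].strip(), []).append(row[1].strip())

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for anim_id, texts in captions.items():
        lines = []
        for caption in texts:
            tokens = ' '.join('%s/%s' % (w, p) for w, p in tag(caption))
            # Assumption: 0.0#0.0 means the caption applies to the full clip
            lines.append('%s#%s#0.0#0.0\n' % (caption, tokens))
        (out / (anim_id + '.txt')).write_text(''.join(lines), encoding='utf-8')
    return sorted(captions)
```

Calling `write_annotation_files('sheet.csv', './texts')` would then write one text file per animation ID, with one line per caption. Captions are grouped by ID before writing, so rerunning the function overwrites each file instead of appending duplicate lines.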

@mustafizur-r
Author

mustafizur-r commented Oct 17, 2024

Hi,
Could you tell me how to create an animation .npy file and pair it with a suitable text description for use in the HumanML3D dataset?
I want to make my own animation .npy files, add them to HumanML3D, and then use the resulting dataset with your project.
