Producing file formats of my data set #2

Open
ans92 opened this issue Jan 24, 2021 · 0 comments
Comments

ans92 commented Jan 24, 2021

Hi @jacoxu,
Thank you for the great code. First, do you have any Python code that I could use to prepare the following two files from my own data set?

  1. vocab_withIdx.dic
  2. vocab_emb_Word2vec_48.vec

When I looked at your raw title text files and vocab_withIdx.dic, I could not work out how the vocabulary was prepared. Did you apply any text preprocessing before converting the text into a vocabulary with indexes? I would be very grateful for your help.
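
To make the question concrete, here is roughly what I would try for the first file. This is only a sketch of my guess, not your method: the lowercasing, the regex tokenizer, the frequency ordering, the 1-based indexes, and the `word<TAB>index` line layout are all assumptions on my part, and `tokenize`, `build_vocab`, and `write_vocab` are names I made up.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase, keep runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(lines, min_count=1):
    # Count every token across all input lines.
    counts = Counter()
    for line in lines:
        counts.update(tokenize(line))
    # Most frequent words first; ties broken alphabetically so the
    # result is deterministic.
    words = sorted(
        (w for w, c in counts.items() if c >= min_count),
        key=lambda w: (-counts[w], w),
    )
    # Assumed convention: indexes start at 1.
    return {w: i for i, w in enumerate(words, start=1)}


def write_vocab(vocab, path):
    # Assumed layout: one "word<TAB>index" pair per line.
    with open(path, "w", encoding="utf-8") as f:
        for word, idx in sorted(vocab.items(), key=lambda kv: kv[1]):
            f.write(f"{word}\t{idx}\n")


if __name__ == "__main__":
    titles = ["The cat sat", "the cat ran"]
    vocab = build_vocab(titles)
    write_vocab(vocab, "my_vocab_withIdx.dic")
```

For the second file, I assume I would train Word2vec on the same tokenized text and write out one vector per vocabulary word, and I am guessing that the 48 in the file name is the vector dimension. Please correct me if your files were produced differently.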
