Simplify DNN dense corpus and interaction with python backend #88

Open
johann-petrak opened this issue Nov 22, 2018 · 0 comments
johann-petrak commented Nov 22, 2018

  • See also gate-lf-python-data#15 ("IMPORTANT! Refactor and simplify").
  • Keep the option to have many features, but make the simple one-feature approach easy.
  • Store dense corpus instances as maps, one per line, with standard keys for the label and possibly the features.
  • Unify this with the representation for unlabeled data (e.g. embedding creation or topic models) and for other kinds of supervised/unsupervised tasks, e.g. seq2seq or semantic similarity.
  • IMPORTANT: change the representation of sequences. Instead of one sequence of elements that each carry multiple features, have one sequence per feature. This makes it MUCH easier to create batches later.
  • Make it easy to switch between our output and the torchnlp library in the Python backend.
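
To make the proposed representation concrete, here is a rough sketch of what a map-per-line instance with one sequence per feature could look like, and why it helps with batching. Everything in it is an assumption for illustration: the keys `"features"` and `"label"`, the feature names `word` and `pos`, and the helper names are hypothetical and not part of the current corpus format or of gate-lf-python-data.

```python
import json


def transpose_instance(elements, label):
    """Turn a sequence of per-element feature maps into one sequence per feature.

    `elements` is a list of dicts that all share the same keys (hypothetical
    input shape). The keys "features" and "label" are illustrative only.
    """
    names = list(elements[0].keys()) if elements else []
    features = {name: [el[name] for el in elements] for name in names}
    return {"features": features, "label": label}


def to_line(instance):
    """Serialize one instance as a single JSON line (one map per line)."""
    return json.dumps(instance, sort_keys=True)


def pad_batch(instances, feature, pad=0):
    """Build a padded batch for one feature.

    Because each feature already is its own sequence, batching reduces to
    padding a list of lists to the longest length.
    """
    seqs = [inst["features"][feature] for inst in instances]
    maxlen = max(len(s) for s in seqs)
    return [s + [pad] * (maxlen - len(s)) for s in seqs]


# Two toy instances using already-indexed (integer) feature values.
inst_a = transpose_instance(
    [{"word": 3, "pos": 1}, {"word": 7, "pos": 2}], label=["B", "I"]
)
inst_b = transpose_instance([{"word": 5, "pos": 1}], label=["B"])

line_a = to_line(inst_a)
word_batch = pad_batch([inst_a, inst_b], "word")
```

With this layout, the simple one-feature case is just a `"features"` map with a single key, and an unlabeled instance could omit `"label"` or set it to null, so the same line format might cover both cases.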

This should become a project, possibly with several sub-issues.

@johann-petrak johann-petrak changed the title Rethink Pytorch DNN backend Simplify DNN dense corpus and interaction with python backend Nov 30, 2018