Testing on a Small New Dataset with zero (or almost no) Training Data #5
Comments
Hi @rudra0713,
I hope that helps. Let me know if you have any more questions.
Hi @v1nc3nt27
Hi @v1nc3nt27, waiting for your response.
Hi @rudra0713, the model won't check the label names; it only uses the index you set. Setting different indices for the same label should be fine if you use several different datasets. I did this during training as well and, theoretically, the model should have learned to generalize the embeddings in the transformer and base the class decision solely (or at least mostly) on the classification layer.
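To make the point about indices concrete, here is a minimal sketch of per-dataset label maps. It assumes plain Python dictionaries; the names `LABEL_MAPS`, `encode_labels`, `dataset_a`, `dataset_b`, and `my_dataset` are hypothetical and not part of the MT-DNN code. The label orders are the ones given in the question in this thread.

```python
# Hypothetical per-dataset label maps (illustrative names, not MT-DNN's API).
# Each dataset is free to assign its own index to the same label name.
LABEL_MAPS = {
    "dataset_a": {"agree": 0, "disagree": 1},
    "dataset_b": {"agree": 0, "disagree": 1, "balanced": 2},
    "my_dataset": {"disagree": 0, "agree": 1, "balance": 2},
}


def encode_labels(dataset_name, labels):
    """Convert label names to the integer indices used by one dataset."""
    label_map = LABEL_MAPS[dataset_name]
    return [label_map[label] for label in labels]


# The same label name ends up with a different index depending on the dataset.
encoded_a = encode_labels("dataset_a", ["agree", "disagree"])
encoded_mine = encode_labels("my_dataset", ["agree", "disagree", "balance"])
print(encoded_a)     # [0, 1]
print(encoded_mine)  # [1, 0, 2]
```

Only the integers reach the model, so the mapping just has to be consistent within each dataset.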
Hi,
I have a very small stance detection dataset (80-100 examples) with 3 stance classes: disagree (class 0), agree (class 1), and balance (class 2), and I want to test the MT-DNN model's performance on this dataset. I have 2 questions regarding this:
1. Since my dataset is very small, I do not want to split it to create training samples. Is there a way to just test the MT-DNN model (trained on 10 datasets) on my dataset? I don't think that will be possible because, without any training data, the MT-DNN model will not have a dataset-specific top layer. Is this correct?
2. Assuming that point 1 is valid, this is more of a theoretical question on MT-DNN. Let's say I do build a minimal training set from my dataset (with just 2 examples from each class). For simplicity, let's assume MT-DNN is only tuned on 2 datasets: the first dataset has 'agree' as class 0 and 'disagree' as class 1; the second dataset has 'agree' as class 0, 'disagree' as class 1, and 'balanced' as class 2. Since both of these datasets use different class indices from my dataset, will this cause any problem? For example, for a given sentence pair from my dataset, if the model decides that the stance label should be pro (agree), will it predict the class label to be 0 or 1?
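One way to think about the index mismatch in point 2: if predictions come from a classification layer trained with another dataset's label order, the predicted indices can be translated by label name before scoring. The sketch below assumes that 'balance' and 'balanced' refer to the same class. The names `SOURCE_HEAD_LABELS`, `TARGET_LABELS`, `remap_predictions`, and `accuracy` are hypothetical, and the predictions are made-up numbers for illustration, not model output.

```python
# Label order of the head that produced the predictions (second dataset in the
# question) and label order of the small test dataset. Both lists are
# assumptions taken from the question, with 'balance' treated as 'balanced'.
SOURCE_HEAD_LABELS = ["agree", "disagree", "balanced"]
TARGET_LABELS = ["disagree", "agree", "balanced"]


def remap_predictions(pred_indices, source_labels, target_labels):
    """Translate indices from the source head's order into the target order."""
    target_index = {name: i for i, name in enumerate(target_labels)}
    return [target_index[source_labels[p]] for p in pred_indices]


def accuracy(preds, gold):
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)


# Made-up predictions in the source head's index space.
source_preds = [0, 1, 2, 0]
# Gold labels in the test dataset's index space.
gold = [1, 0, 2, 0]

remapped = remap_predictions(source_preds, SOURCE_HEAD_LABELS, TARGET_LABELS)
print(remapped)                  # [1, 0, 2, 1]
print(accuracy(remapped, gold))  # 0.75
```

Without the remapping step, a predicted 'agree' (index 0 in the source order) would be scored as 'disagree' in the test dataset's order.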