
Testing on a Small New Dataset with zero (or almost no) Training Data #5

rudra0713 opened this issue Feb 10, 2022 · 4 comments

@rudra0713

Hi,
I have a very small stance detection dataset (80-100 examples) with 3 stance classes: disagree (class 0), agree (class 1), and balance (class 2), and I want to test the MT-DNN model's performance on it. I have 2 questions regarding this:

  1. Since my dataset is very small, I do not want to split it to create training samples. Is there a way to just test the MT-DNN model (trained on 10 datasets) on my dataset? I don't think that is possible, because without any training data the MT-DNN model will not have a dataset-specific top layer. Is this correct?

  2. Assuming that point 1 is valid, this is more of a theoretical question on MT-DNN. Let's say I do build a minimal training set from my dataset (with just 2 examples from each class). For simplicity, let's assume MT-DNN is only tuned on 2 datasets: the first dataset has 'agree' as class 0 and 'disagree' as class 1; the second dataset has 'agree' as class 0, 'disagree' as class 1, and 'balanced' as class 2. Since both of these datasets assign labels to indices differently than my dataset does, will this cause any problem? For example, for a given sentence pair from my dataset, if the model decides that the stance label should be 'agree', will it predict the class label to be 0 or 1?

@v1nc3nt27
Contributor

Hi @rudra0713,

  1. Yes, you're right there. The dataset-specific dense layers for the classification are not included.
  2. No, this won't be a problem. You will have to create a LabelMapper, where you basically decide whether (e.g.) your "agree" label ends up at position 0, 1, or 2 (based on the order in which you add the labels). The code will automatically create a classification layer based on the number of classes you pass there; see the sketch below.
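
To illustrate the idea, here is a minimal sketch (with made-up helper names, not the repo's actual LabelMapper API): the index a label receives is determined purely by the order in which you register it, and the size of the new classification layer follows from the number of registered labels.

```python
# Minimal sketch of the label-mapping idea (hypothetical helper, not this
# repo's actual LabelMapper class): indices simply follow insertion order.
label_map = {}

def add_label(label):
    """Assign the next free index to a label the first time it is seen."""
    if label not in label_map:
        label_map[label] = len(label_map)
    return label_map[label]

# For the stance dataset described above:
add_label("disagree")  # -> index 0
add_label("agree")     # -> index 1
add_label("balance")   # -> index 2

num_classes = len(label_map)  # the new classification layer is sized from this
print(label_map, num_classes)  # {'disagree': 0, 'agree': 1, 'balance': 2} 3
```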

I hope that helps. Let me know if you have any more questions.

@v1nc3nt27 v1nc3nt27 added the question Further information is requested label Feb 10, 2022
@v1nc3nt27 v1nc3nt27 self-assigned this Feb 10, 2022
@rudra0713
Author

rudra0713 commented Feb 11, 2022

Hi @v1nc3nt27

  1. Thanks for confirming.
  2. Regarding point 2 of your answer, I understand that the code will create a classification layer based on the number of classes in my dataset (3, in my example). I am trying to understand how the model would utilize the knowledge it learned from the other datasets. In my example, during training the model probably learned that for an instance of the class "agree", the probability for class 0 should be high, because 0 was the label for "agree" in the other two datasets. But in my dataset, a sample (let's say x) in the class "agree" has the label 1. So in this case, will the model still try to maximize the probability for class 0? Put another way: let's say the model correctly realizes that sample x is an instance of class "agree"; will it then check the LabelMapper, see that in my dataset "agree" is class 1, and increase the probability of class 1? (Also, please take into account that my training set is very, very small.)

@rudra0713
Author

Hi @v1nc3nt27, I am still waiting for your response.

@v1nc3nt27
Contributor

v1nc3nt27 commented Feb 15, 2022

Hi @rudra0713, the model won't check the label strings; it only uses the indices you set. Assigning different indices to the same label across different datasets should be fine. I have done this at training time as well, and, theoretically, the model should have learned to generalize the embeddings in the transformer and base the class decision solely (or at least mostly) on the classification layer.
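
Roughly speaking, the setup behaves like the sketch below (illustrative PyTorch only, not the actual classes in this repo): the transformer body is shared across all tasks, each task (including a new one like yours) gets its own linear classification head, and the loss only ever sees the integer indices that the task's label mapper produces, never the label strings.

```python
import torch
import torch.nn as nn

class MultiTaskClassifier(nn.Module):
    """Shared encoder plus one classification head per task (illustration only)."""

    def __init__(self, encoder, hidden_size, task_num_classes):
        super().__init__()
        self.encoder = encoder  # shared transformer body (e.g. BERT)
        # Each task gets its own head; a new 3-class head would be created
        # for the stance dataset discussed above.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden_size, n) for task, n in task_num_classes.items()}
        )

    def forward(self, task_name, features):
        pooled = self.encoder(features)       # shared representation
        return self.heads[task_name](pooled)  # logits in this task's own index space

# Toy usage: stand in a simple projection for the shared encoder.
encoder = nn.Linear(16, 8)
model = MultiTaskClassifier(encoder, hidden_size=8,
                            task_num_classes={"glue_task": 2, "stance": 3})
logits = model("stance", torch.randn(4, 16))  # shape (4, 3)
# Cross-entropy only compares against integer indices, so whichever index
# your mapper assigned to "agree" is the one that gets reinforced.
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1, 2, 1]))
```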
