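This project detects fake news in articles using supervised machine learning models: classical classifiers (such as logistic regression, naive Bayes, and linear SVMs) and neural models, namely Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and Transformers. The tables below report accuracy, precision, recall, and F1 for each model on each dataset.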

Kaggle dataset results:

Model | accuracy | precision | recall | f1 |
---|---|---|---|---|
PassiveAggressiveClassifier | 0.99443 | 0.99532 | 0.993 | 0.99416 |
LogisticRegression | 0.98664 | 0.98646 | 0.98554 | 0.986 |
MultinomialNB | 0.93964 | 0.94112 | 0.9319 | 0.93649 |
KNeighborsClassifier | 0.65612 | 0.95872 | 0.29244 | 0.44818 |
RandomForestClassifier | 0.9902 | 0.98883 | 0.99067 | 0.98975 |
LinearSVC | 0.99465 | 0.99579 | 0.993 | 0.9944 |
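
The four columns in these tables can be reproduced from a confusion matrix. The sketch below is a minimal, library-free illustration for the binary case. It assumes that label `1` means "fake" and that a ratio with an empty denominator is reported as `0.0`; neither convention is stated in this README, and the function name `binary_metrics` and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: an undefined ratio (empty denominator) becomes 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only, not taken from any dataset here.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

F1 is the harmonic mean of precision and recall, so a model with high precision but low recall still gets a low F1. The KNeighborsClassifier row above shows this: precision 0.95872 and recall 0.29244 give an F1 of 0.44818.
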

WELFake dataset results:

Model | accuracy | precision | recall | f1 |
---|---|---|---|---|
PassiveAggressiveClassifier | 0.95992 | 0.95307 | 0.96892 | 0.96093 |
LogisticRegression | 0.94397 | 0.93197 | 0.95992 | 0.94574 |
MultinomialNB | 0.86907 | 0.87051 | 0.87241 | 0.87146 |
KNeighborsClassifier | 0.6301 | 0.58095 | 0.97928 | 0.72927 |
RandomForestClassifier | 0.93329 | 0.91945 | 0.95229 | 0.93558 |
LinearSVC | 0.96117 | 0.95173 | 0.97301 | 0.96225 |
RNN | 0.98767 | 0.98829 | 0.98827 | 0.98828 |
CNN | 0.98706 | 0.98662 | 0.98652 | 0.98657 |
Transformer | 0.97152 | 0.97149 | 0.97149 | 0.97149 |

LIAR dataset results:

Model | accuracy | precision | recall | f1 |
---|---|---|---|---|
PassiveAggressiveClassifier | 0.23362 | 0.23245 | 0.23362 | 0.23303 |
LogisticRegression | 0.25019 | 0.24541 | 0.25019 | 0.24778 |
MultinomialNB | 0.23757 | 0.23089 | 0.23757 | 0.23418 |
KNeighborsClassifier | 0.21863 | 0.21483 | 0.21863 | 0.21671 |
RandomForestClassifier | 0.25651 | 0.26152 | 0.25651 | 0.25899 |
LinearSVC | 0.24546 | 0.24380 | 0.24546 | 0.24463 |
RNN | 0.24226 | 0.22991 | 0.24073 | 0.23519 |
CNN | 0.21354 | 0.20257 | 0.21231 | 0.20732 |
Transformer | 0.21078 | 0.16977 | 0.21073 | 0.18804 |
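
For the classical models in the table above, the recall column equals the accuracy column. That is what support-weighted averaging of per-class recall produces on a multi-class problem. The averaging mode is not stated in this README, so treat this as an inference rather than a documented setting. The sketch below demonstrates the identity on toy data; the helper names and the example labels are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with weights equal to class support."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    # recall_c = correct[c] / support[c], weight_c = support[c] / n
    return sum((support[c] / n) * (correct[c] / support[c]) for c in support)


# Toy multi-class labels for illustration only.
y_true = ["true", "false", "half-true", "false", "pants-fire", "true", "false"]
y_pred = ["true", "half-true", "half-true", "false", "false", "false", "false"]

acc = accuracy(y_true, y_pred)
rec = weighted_recall(y_true, y_pred)
print(acc, rec)
```

The support terms cancel, so the weighted sum reduces to the number of correct predictions divided by the number of samples, which is accuracy.
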

LIAR dataset with party and speaker results:

Model | accuracy | precision | recall | f1 |
---|---|---|---|---|
PassiveAggressiveClassifier | 0.22889 | 0.23023 | 0.22889 | 0.22956 |
LogisticRegression | 0.25572 | 0.26227 | 0.25572 | 0.25895 |
MultinomialNB | 0.25493 | 0.26958 | 0.25493 | 0.26205 |
KNeighborsClassifier | 0.22257 | 0.23144 | 0.22257 | 0.22692 |
RandomForestClassifier | 0.24625 | 0.25599 | 0.24625 | 0.25103 |
LinearSVC | 0.23994 | 0.24102 | 0.23994 | 0.24048 |

Binary LIAR dataset results:

Model | accuracy | precision | recall | f1 |
---|---|---|---|---|
PassiveAggressiveClassifier | 0.56196 | 0.60935 | 0.62045 | 0.61485 |
LogisticRegression | 0.61563 | 0.63182 | 0.76190 | 0.69079 |
MultinomialNB | 0.60142 | 0.60336 | 0.85434 | 0.70725 |
KNeighborsClassifier | 0.57064 | 0.60732 | 0.67367 | 0.63878 |
RandomForestClassifier | 0.60852 | 0.62415 | 0.76751 | 0.68844 |
LinearSVC | 0.60221 | 0.63636 | 0.68627 | 0.66038 |
RNN | 0.58626 | 0.56522 | 0.07052 | 0.12540 |
CNN | 0.52904 | 1.0 | 0.00181 | 0.00361 |
Transformer | 0.56380 | 0.0 | 0.0 | 0.0 |
Transformer-Pretrained | 0.50185 | 0.37963 | 0.22242 | 0.28050 |
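
Rows with recall at or near zero (CNN and Transformer above) are consistent with a model that predicts almost only one class. One way to check for this is to compare accuracy against a majority-class baseline and to look at the distribution of predicted labels. The sketch below assumes plain Python lists of labels; the function names and toy data are hypothetical and not part of this project's code.

```python
from collections import Counter


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most frequent true label."""
    counts = Counter(y_true)
    return counts.most_common(1)[0][1] / len(y_true)


def predicted_class_share(y_pred):
    """Share of predictions assigned to each label."""
    counts = Counter(y_pred)
    return {label: count / len(y_pred) for label, count in counts.items()}


# Toy data for illustration only: a model that collapsed to a single class.
y_true = [1, 1, 1, 0, 0, 1, 0, 1, 0, 1]
y_pred = [0] * len(y_true)

baseline = majority_baseline_accuracy(y_true)
shares = predicted_class_share(y_pred)
print(baseline, shares)
```

If a model's accuracy is no better than the baseline and nearly all of its predictions fall on one label, its precision, recall, and F1 for the other class carry little information.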