Financial-Sentiment-Distillation is a sentiment analysis project for financial news that uses a teacher-student knowledge distillation approach. The goal is to train a compact, efficient language model specialized in financial sentiment analysis by transferring domain-specific knowledge from a larger pre-trained model.
- Knowledge Distillation: Implements a teacher-student framework where a smaller student model learns from a larger, pre-trained teacher model.
- Domain-Specific Fine-Tuning: Focuses on financial news to enhance the model's understanding of domain-specific language.
- Data Augmentation: Employs techniques like synonym replacement to enrich the dataset and improve model generalization.
- Evaluation Metrics: Includes detailed performance evaluation using accuracy, classification reports, and confusion matrices.
- Visualization: Provides loss and accuracy plots to monitor training and validation performance.
The dataset used is a financial news dataset from Kaggle, consisting of labeled news articles categorized into positive, negative, and neutral sentiments.
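As a rough illustration, the labeled articles can be loaded and mapped to integer class ids before tokenization. The file and column names below are placeholders for whatever the Kaggle export actually uses, not the project's exact preprocessing:

```python
import pandas as pd

# Hypothetical filename and column names; adjust to the CSV downloaded from Kaggle.
df = pd.read_csv("financial_news.csv")

# Map the three sentiment categories to integer class ids.
label_map = {"negative": 0, "neutral": 1, "positive": 2}
df["label"] = df["sentiment"].map(label_map)

# Simple 80/20 train/validation split.
train_df = df.sample(frac=0.8, random_state=42)
val_df = df.drop(train_df.index)
```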
- Teacher Model: ProsusAI/finbert (a BERT-based model fine-tuned on financial texts).
- Student Model: distilbert-base-uncased, further fine-tuned using knowledge distillation.
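Both models are available on the Hugging Face Hub. A minimal loading sketch follows; the from_pt=True flag assumes only PyTorch weights are published for FinBERT and requires torch to be installed:

```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Teacher: FinBERT, already fine-tuned for three-way financial sentiment.
# from_pt=True converts the published PyTorch weights (an assumption; drop it
# if TensorFlow weights are available for the model).
teacher_tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
teacher = TFAutoModelForSequenceClassification.from_pretrained(
    "ProsusAI/finbert", from_pt=True
)

# Student tokenizer; the DistilBERT encoder itself is loaded in the
# model-building sketch further below.
student_tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
```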
The student model architecture includes:
- DistilBERT encoder
- Layer normalization
- Global average pooling
- Fully connected layers with dropout
- Softmax output layer for sentiment classification
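A minimal Keras sketch of this stack, assuming a TensorFlow setup (the layer sizes, dropout rate, and sequence length are illustrative choices, not necessarily those in train.py):

```python
import tensorflow as tf
from transformers import TFDistilBertModel

def build_student(num_classes=3, max_len=128):
    # DistilBERT encoder producing a sequence of token embeddings.
    encoder = TFDistilBertModel.from_pretrained("distilbert-base-uncased")

    input_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_ids")
    attention_mask = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="attention_mask")

    hidden = encoder(input_ids, attention_mask=attention_mask).last_hidden_state

    # Layer normalization followed by global average pooling over tokens.
    hidden = tf.keras.layers.LayerNormalization()(hidden)
    pooled = tf.keras.layers.GlobalAveragePooling1D()(hidden)

    # Fully connected layer with dropout.
    x = tf.keras.layers.Dense(256, activation="relu")(pooled)
    x = tf.keras.layers.Dropout(0.2)(x)

    # Softmax output layer over the three sentiment classes.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    return tf.keras.Model(inputs=[input_ids, attention_mask], outputs=outputs)
```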
The distillation process uses Kullback-Leibler (KL) divergence to align the student model's output distribution with the teacher's, combined with a traditional cross-entropy loss on the labeled data to maintain accuracy.
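A sketch of such a combined loss, assuming both models emit softmax probabilities; the temperature and the weighting factor alpha are illustrative values rather than the project's actual settings:

```python
import tensorflow as tf

def distillation_loss(y_true, student_probs, teacher_probs,
                      temperature=2.0, alpha=0.5):
    """Combine soft-target KL divergence with hard-label cross-entropy.

    alpha weights the distillation term; (1 - alpha) weights the supervised
    term. Temperature and alpha here are illustrative defaults.
    """
    # Soften both distributions with the temperature before comparing them.
    soft_teacher = tf.nn.softmax(tf.math.log(teacher_probs + 1e-8) / temperature)
    soft_student = tf.nn.softmax(tf.math.log(student_probs + 1e-8) / temperature)

    kl = tf.keras.losses.KLDivergence()(soft_teacher, soft_student)
    ce = tf.keras.losses.SparseCategoricalCrossentropy()(y_true, student_probs)

    return alpha * (temperature ** 2) * kl + (1.0 - alpha) * ce
```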
Synonym replacement is applied to the training data to introduce variability and improve model robustness. This technique helps the model generalize better to unseen data.
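One common way to implement synonym replacement is with WordNet via NLTK; the sketch below assumes that approach and is not necessarily the exact routine used in this repository:

```python
import random
from nltk.corpus import wordnet  # requires a one-time nltk.download("wordnet")

def synonym_replace(sentence, n=2):
    """Replace up to n words with a random WordNet synonym (illustrative sketch)."""
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if wordnet.synsets(w)]
    random.shuffle(candidates)
    replaced = 0
    for i in candidates:
        synonyms = {
            lemma.name().replace("_", " ")
            for syn in wordnet.synsets(words[i])
            for lemma in syn.lemmas()
        } - {words[i]}
        if synonyms:
            words[i] = random.choice(sorted(synonyms))
            replaced += 1
        if replaced >= n:
            break
    return " ".join(words)
```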
- Optimizer: Adam with a learning rate of 5e-5
- Loss Function: Custom distillation loss combining KL Divergence and cross-entropy
- Batch Size: 32
- Epochs: 20 (with early stopping and learning rate reduction callbacks)
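Put together, the training setup might look like the following Keras sketch. The patience values, and the choice of EarlyStopping and ReduceLROnPlateau as the specific callbacks, are assumptions consistent with the description above; since the distillation term depends on teacher probabilities, the loss function passed in would in practice wrap precomputed teacher outputs, which this sketch omits:

```python
import tensorflow as tf

def compile_and_train(model, x_train, y_train, x_val, y_val, loss_fn):
    # Adam with the 5e-5 learning rate noted above.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
        loss=loss_fn,
        metrics=["accuracy"],
    )
    callbacks = [
        # Stop once validation loss stops improving and keep the best weights
        # (patience values are assumptions).
        tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                         restore_best_weights=True),
        # Reduce the learning rate when validation loss plateaus.
        tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                             patience=2),
    ]
    return model.fit(x_train, y_train,
                     validation_data=(x_val, y_val),
                     epochs=20, batch_size=32, callbacks=callbacks)
```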
The model achieved an accuracy of approximately 76% on the validation set. Evaluation includes:
- Accuracy and loss plots
- Confusion matrix
- Classification report detailing precision, recall, and F1-scores
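These metrics can be produced with scikit-learn; a minimal sketch, assuming the validation inputs and integer labels are already prepared:

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

def evaluate(model, x_val, y_val, class_names=("negative", "neutral", "positive")):
    """Print accuracy, per-class precision/recall/F1, and the confusion matrix."""
    probs = model.predict(x_val)
    preds = np.argmax(probs, axis=1)

    print("Accuracy:", accuracy_score(y_val, preds))
    print(classification_report(y_val, preds, target_names=list(class_names)))
    print(confusion_matrix(y_val, preds))
```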
- Training and validation loss/accuracy curves
- Confusion matrix heatmap for visualizing model predictions
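A matplotlib/seaborn sketch of these plots, assuming a Keras History object returned by model.fit and a confusion matrix computed as in the evaluation sketch above:

```python
import matplotlib.pyplot as plt
import seaborn as sns

def plot_history(history):
    """Plot training/validation loss and accuracy curves."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(history.history["loss"], label="train loss")
    ax1.plot(history.history["val_loss"], label="val loss")
    ax1.legend()
    ax2.plot(history.history["accuracy"], label="train acc")
    ax2.plot(history.history["val_accuracy"], label="val acc")
    ax2.legend()
    plt.show()

def plot_confusion_matrix(cm, class_names=("negative", "neutral", "positive")):
    """Render the confusion matrix as a heatmap."""
    sns.heatmap(cm, annot=True, fmt="d",
                xticklabels=class_names, yticklabels=class_names)
    plt.xlabel("Predicted")
    plt.ylabel("True")
    plt.show()
```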
- Clone the repository:
git clone https://github.com/your-username/FinSentDistill.git
- Navigate to the project directory:
cd FinSentDistill
- Install dependencies:
pip install -r requirements.txt
- Prepare the dataset and place it in the project directory.
- Run the training script:
python train.py
- Evaluate the model:
python evaluate.py
- Visualize results:
python visualize.py
Contributions are welcome. Please fork the repository and submit a pull request.
This project is licensed under the MIT License.
- ProsusAI/finbert for the pre-trained teacher model
- Kaggle for the financial news dataset
- Hugging Face Transformers library for model implementations