GRPD is a robust phishing detection framework that leverages machine learning techniques, including adversarial training and multi-modal feature extraction. It analyzes diverse data sources such as URLs, HTML content, metadata, and user behavior, and trains on GAN-generated samples to resist adversarial attacks.
The main goal of this project is to develop a phishing website detection framework that withstands adversarial attacks, particularly those generated with Generative Adversarial Networks (GANs). By combining adversarial training with multi-modal feature extraction (URL analysis, HTML content inspection, and metadata interpretation), the project aims to improve the accuracy, reliability, and robustness of phishing detection against evolving cyber threats.
- Adversarial Training:
  - Integrates adversarial phishing websites generated by Generative Adversarial Networks (GANs) into the training data, making the model more robust against phishing attacks crafted to deceive standard detection models.
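The augmentation step described above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the function name `augment_with_adversarial`, the list-of-feature-vectors layout, and the convention that label `1` means phishing are all assumptions, and the GAN that produces `adversarial_features` is taken as given.

```python
import random


def augment_with_adversarial(features, labels, adversarial_features,
                             phishing_label=1, seed=0):
    """Return a shuffled training set that includes adversarial samples.

    features: list of feature vectors for the original samples
    labels: list of labels aligned with `features`
    adversarial_features: feature vectors of GAN-generated phishing samples
    phishing_label: label given to every adversarial sample (assumed to be 1)
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")

    # Pair each sample with its label so shuffling keeps them aligned.
    combined = list(zip(features, labels))
    # Adversarial samples are phishing pages by construction.
    combined += [(x, phishing_label) for x in adversarial_features]

    rng = random.Random(seed)  # fixed seed for reproducible shuffling
    rng.shuffle(combined)

    aug_features = [x for x, _ in combined]
    aug_labels = [y for _, y in combined]
    return aug_features, aug_labels
```

The detector is then trained on the returned lists in place of the original data, so that it sees evasive phishing samples during training rather than only at test time.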
- Multi-modal Feature Extraction:
  - URL-based features: Analyzes URL structure, length, and the presence of suspicious keywords.
  - HTML content analysis: Extracts phishing-related signatures from the webpage’s structure and content.
  - Metadata analysis: Uses metadata such as the domain expiration date, DNS records, and WHOIS data as additional signals for identifying phishing sites.
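As an illustration of the URL-based features, the sketch below derives a few simple signals from a URL using only the Python standard library. The feature names, the `SUSPICIOUS_KEYWORDS` list, and the function `extract_url_features` are hypothetical choices made for this example; the features the project actually uses may differ.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical keyword list, chosen only for illustration.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update")


def _is_ip_address(host):
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url):
    """Compute simple structural features from a single URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        "host_is_ip": _is_ip_address(host),
        # Number of distinct keywords from the list that appear in the URL.
        "num_suspicious_keywords": sum(kw in lowered for kw in SUSPICIOUS_KEYWORDS),
    }
```

Calling `extract_url_features("http://192.168.0.1/secure-login/verify.php")` returns a dictionary that can be turned into one row of a feature matrix, alongside the HTML and metadata features.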
- Evaluation Metrics:
  - Performance is measured with Accuracy, Precision, Recall, and F1-Score, together with adversarial robustness, to give a comprehensive picture of the model’s phishing detection capabilities.
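The standard metrics above follow directly from the confusion-matrix counts. The sketch below computes them in plain Python, assuming binary labels where `1` marks a phishing site; the function name `classification_metrics` is hypothetical. The source does not define how adversarial robustness is measured; one common choice, assumed here, is to report the same metrics on a held-out set of adversarial samples.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Running the function once on clean test data and once on adversarial test data, then comparing the two results, gives a rough view of how much performance degrades under attack.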
- Interactive Notebooks:
  - Google Colab or Jupyter notebooks are used to run experiments, tune models, and visualize results, making it easy to explore and modify the code.