PolyToxiQ: A WebApp for Polymer Toxicity Prediction using Transfer Learning from Tox21 Additives

kuennethgroup/PolyToxiQ

PolyToxiQ: A Polymer Toxicity Prediction Tool using PSMILES Strings

About This Project

This application predicts the toxicity level of a polymer from its PSMILES string representation, using transfer learning and fingerprints of Tox21 molecules.

Methodology

  • AutoGluon & Scikit-learn: We used AutoGluon's TabularPredictor to build a machine learning model that classifies polymers into three toxicity levels (High, Medium, and Low). The model was trained on a carefully curated Tox21 dataset (4974 entries) with known toxicity properties and level of concern (LoC).

  • Cosine Similarity: We calculate the cosine similarity between the polyBERT-generated fingerprint of the input polymer and the fingerprints in our reference database of Tox21 molecules. This metric measures how similar two structures are in the fingerprint vector space: a value of 1 means the two fingerprints point in the same direction (identical), and lower values indicate less similar structures.

  • Zero-Shot Transfer Learning: The AutoGluon model is pre-trained on the Tox21 molecule dataset and then applied to polymers without any polymer-specific training. This lets us make predictions for novel polymer structures that were not present in the training data.
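
The similarity lookup described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the project's actual code: the helper `most_similar`, the molecule names, and the 3-dimensional toy vectors are all hypothetical stand-ins for real fingerprints and the real Tox21 reference database.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def most_similar(query, reference):
    """Return (name, similarity) of the reference fingerprint closest to query.

    Hypothetical helper: `reference` maps a molecule name to its fingerprint.
    """
    scored = ((name, cosine_similarity(query, fp)) for name, fp in reference.items())
    return max(scored, key=lambda item: item[1])

# Toy 3-dimensional vectors standing in for real fingerprints (made-up values).
reference = {
    "molecule_a": [1.0, 0.0, 0.0],
    "molecule_b": [0.0, 1.0, 1.0],
}
query = [0.0, 2.0, 2.0]  # same direction as molecule_b, different length

best_name, best_score = most_similar(query, reference)
print(best_name, round(best_score, 6))
```

Because cosine similarity depends only on direction, the query matches `molecule_b` even though its vector is twice as long. In practice the vectors would be much higher-dimensional, and a vectorized library would be the natural choice over pure Python loops.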

Toxicity Classification Levels:

Polymers are classified into three concern levels depending on their toxicity properties, or Hazard Criteria (0 ≤ Hazard Criteria ≤ 8):

  • Persistent, bioaccumulative (BIOACCUM), carcinogenicity (CARCINOGEN), mutagenic (MUTA), reproductive toxicity (REPROTOX), specific target organ toxicity (STOT), endocrine-disrupting chemicals (EDC), and aquatic toxicity (AQUATOX)
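
One way to read the 0 to 8 range is as a count of how many of the eight criteria apply to a substance. The sketch below works under that assumption, which the text does not state explicitly. The key `PERSISTENT` is a hypothetical name (the list gives no abbreviation for it), and no mapping from the count to High, Medium, or Low is shown because the thresholds are not given here.

```python
# Criterion keys follow the abbreviations in the list above;
# "PERSISTENT" is an assumed name, since no abbreviation is given for it.
HAZARD_CRITERIA = (
    "PERSISTENT", "BIOACCUM", "CARCINOGEN", "MUTA",
    "REPROTOX", "STOT", "EDC", "AQUATOX",
)

def hazard_count(flags):
    """Count how many hazard criteria are flagged True (result is 0 to 8).

    Assumption: the Hazard Criteria value is a simple count of flagged criteria.
    Criteria missing from `flags` are treated as not flagged.
    """
    unknown = set(flags) - set(HAZARD_CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(1 for name in HAZARD_CRITERIA if flags.get(name, False))

# Made-up example record, not taken from the dataset.
example = {"CARCINOGEN": True, "MUTA": True, "AQUATOX": False}
print(hazard_count(example))
```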
