sdg-text

We use readily available natural-language data scraped from Wikipedia to predict localized indices (asset, sanitation, and women's education) relevant to the UN Sustainable Development Goals (SDGs). We explore how different text-embedding extraction methods and model architectures affect performance on this small-data task, comparing logistic regression models, feedforward DNNs, and NLP-CNNs. Using geolocated and extracted “relevant” sentence embeddings, we achieve ROC-AUC scores of 0.80 (logistic regression), 0.70 (logistic regression), and 0.81 (feedforward DNN) for asset, sanitation, and women's education index classification, respectively.
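
For readers unfamiliar with the ROC-AUC metric reported above, the following is a minimal sketch of how it can be computed from binary labels and classifier scores. It is not taken from this repository's evaluation code; the function name `roc_auc` and the toy inputs are hypothetical and chosen only for illustration. ROC-AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) ROC-AUC for binary labels in {0, 1}.

    Hypothetical helper for illustration; not part of this repository.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives a scale for reading the 0.70 to 0.81 values above. The pairwise loop is quadratic in the number of examples, which is acceptable for a small-data setting; a library routine would be a reasonable substitute for larger datasets.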