Static word embedding algorithms like word2vec and GloVe assign a single vector to each word, ignoring context. ELMo, a contextualized embedding model, addresses this by capturing word meaning in context using stacked Bi-LSTM layers. This README outlines the implementation and training of an ELMo architecture from scratch using PyTorch.
The ELMo architecture consists of stacked Bi-LSTM layers to generate contextualized word embeddings. Weights for combining word representations across layers are trained.
ELMo embeddings are learned through bidirectional language modeling on the given dataset's train split.
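The stacked Bi-LSTM encoder described above can be sketched as follows. All class names, vocabulary size, and dimensions here are illustrative placeholders, not the repository's actual hyperparameters; the embedding dimension is set to twice the LSTM hidden size so every layer's output has the same width and the layers can later be mixed.

```python
import torch
import torch.nn as nn

class ELMoEncoder(nn.Module):
    """Sketch of an ELMo-style encoder: embeddings + stacked Bi-LSTMs."""

    def __init__(self, vocab_size=1000, emb_dim=200, hidden_dim=100, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # Each Bi-LSTM layer consumes the previous layer's output
        # (emb_dim == 2 * hidden_dim, so all layer widths match).
        self.lstms = nn.ModuleList([
            nn.LSTM(emb_dim if i == 0 else 2 * hidden_dim,
                    hidden_dim, batch_first=True, bidirectional=True)
            for i in range(num_layers)
        ])

    def forward(self, token_ids):
        # Layer 0 is the non-contextual embedding; each Bi-LSTM adds
        # one contextual representation on top of it.
        layer_outputs = [self.embedding(token_ids)]
        x = layer_outputs[0]
        for lstm in self.lstms:
            x, _ = lstm(x)
            layer_outputs.append(x)
        return layer_outputs  # one (batch, seq, 2*hidden_dim) tensor per layer
```

During pretraining, a projection over these outputs would be trained to predict the next token (forward) and previous token (backward), i.e. the bidirectional language-modeling objective.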
- Trained model: `bilstm.pt`
- Download Model
Trained the ELMo architecture on a 4-way classification task using the AG News Classification Dataset (the same dataset used for the other word-embedding methods; see my Word_Vectorization repository for details).
Trained the λs for combining word representations across layers and kept the best-performing values.
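The trainable λs can be sketched as a scalar mix in the style of the ELMo paper: softmax-normalized per-layer weights plus a global scale γ. The class name and layer count are illustrative, not the repository's actual code.

```python
import torch
import torch.nn as nn

class ScalarMix(nn.Module):
    """Combine per-layer representations with trainable weights (λs)."""

    def __init__(self, num_layers=3):
        super().__init__()
        # One λ per layer, learned jointly with the classifier.
        self.lambdas = nn.Parameter(torch.zeros(num_layers))
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, layer_outputs):
        # Softmax keeps the mixture weights positive and summing to 1.
        weights = torch.softmax(self.lambdas, dim=0)
        mixed = sum(w * h for w, h in zip(weights, layer_outputs))
        return self.gamma * mixed
```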
- Model: `classifier_1.pt`
- Download Model
Randomly initialized and froze the λs.
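The frozen setting keeps the same mixing formula but draws the λs once at random and excludes them from the optimizer. A minimal standalone sketch (shapes and layer count are illustrative):

```python
import torch

# λs sampled once; plain tensors (requires_grad=False) are never updated,
# so the mixture weights stay fixed for the whole downstream training run.
lambdas = torch.rand(3)
weights = torch.softmax(lambdas, dim=0)

layers = [torch.randn(2, 5, 200) for _ in range(3)]
mixed = sum(w * h for w, h in zip(weights, layers))
```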
- Model: `classifier_2.pt`
- Download Model
Learned a function to combine word representations across layers.
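One way to learn such a combining function is a linear map over the concatenated layer outputs, which is strictly more expressive than scalar λs because it can mix individual dimensions. This is a hypothetical sketch, not necessarily the function used in the repository:

```python
import torch
import torch.nn as nn

class LearnedCombine(nn.Module):
    """Learned combination: project concatenated layers back to one vector."""

    def __init__(self, num_layers=3, dim=200):
        super().__init__()
        self.proj = nn.Linear(num_layers * dim, dim)

    def forward(self, layer_outputs):
        # (batch, seq, num_layers * dim) -> (batch, seq, dim)
        stacked = torch.cat(layer_outputs, dim=-1)
        return self.proj(stacked)
```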
- Model: `classifier_3.pt`
- Download Model
Performed a comprehensive analysis of ELMo's performance in pretraining and on the downstream task, compared against SVD and Word2Vec embeddings. Reported accuracy, F1 score, precision, recall, and confusion matrices for each setting.
```python
data = torch.load("<filename>")
```
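A self-contained round-trip example of the loading call, using a throwaway tensor and a temporary file (the repository's `.pt` files may instead hold full models or state dicts). `map_location="cpu"` lets GPU-trained checkpoints load on CPU-only machines:

```python
import os
import tempfile

import torch

# Save a dummy object, then restore it the same way the models above
# would be restored with torch.load.
path = os.path.join(tempfile.gettempdir(), "demo.pt")
torch.save(torch.arange(4), path)
data = torch.load(path, map_location="cpu")
```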
Note:
- While pretraining ELMo, only the first 10,000 sentences of `train.csv` were used.
- The downstream task also used only the first 10,000 training sentences.