Discourse Classification

Description

Discourse_classification is a Longformer model fine-tuned to identify discourse elements in student writing. It is trained as an NER-style token classifier on the PERSUADE dataset.

Paper - Identifying discourse elements in writing by fine-tuning BERT, LongFormer, and GPT-2 models for NER Token Classification

All Discourse Elements (NER Token Classifiers)

The model first divides the text into distinct discourse elements, then classifies each element as one of the following:

Lead: an introduction that begins with a statistic, a quotation, a description, or some other device to grab the reader’s attention and point toward the thesis
Position: an opinion or conclusion on the main question
Claim: a claim that supports the position
CounterClaim: a claim that refutes another claim or gives an opposing reason to the position
Rebuttal: a claim that refutes a counterclaim
Evidence: ideas or examples that support claims, counterclaims, or rebuttals
Concluding Statement: a concluding statement that restates the claims
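The two steps above (segment, then classify) can be pictured with a minimal sketch. It assumes one label per word and merges runs of consecutive words that share a label into a single element. The function name `group_elements`, the sample sentence, and the exact label strings are illustrative assumptions, not taken from the repository.

```python
from itertools import groupby

# Discourse types listed in this README. The exact label strings the model
# emits are an assumption here.
DISCOURSE_TYPES = [
    "Lead", "Position", "Claim", "CounterClaim",
    "Rebuttal", "Evidence", "Concluding Statement",
]


def group_elements(words, labels):
    """Merge runs of consecutive words that share a label into (label, text) pairs."""
    elements = []
    index = 0
    for label, run in groupby(labels):
        length = len(list(run))
        elements.append((label, " ".join(words[index:index + length])))
        index += length
    return elements


# Hypothetical per-word labels for a short text.
words = "Phones distract drivers . Studies show crashes rise .".split()
labels = ["Claim"] * 4 + ["Evidence"] * 5
elements = group_elements(words, labels)
for label, text in elements:
    print(f"{label}: {text}")
```

Adjacent elements of the same type would be merged by this sketch, so a real pipeline may need begin/inside markers to keep them apart.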

Example

Run in browser

Go to https://huggingface.co/brad1141/Longformer_v5
Under Hosted inference API, paste the text that you wish to evaluate

Run on computer

The following code is also available in the runModel.ipynb file.

from transformers import pipeline

model_checkpoint = "brad1141/Longformer_v5"
token_classifier = pipeline(
    "token-classification", model=model_checkpoint, aggregation_strategy="simple"
)

# code for evaluating a single text
predicts = token_classifier("YOUR TEXT HERE")
for p in predicts:
    print(p)
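With aggregation_strategy="simple", each prediction is a dictionary with the keys entity_group, score, word, start, and end. The sketch below shows one possible way to post-process such a list: it keeps predictions above a confidence threshold and collects the matching text by discourse type. The helper name `spans_by_type`, the threshold, and the sample predictions are hypothetical; the sample stands in for real pipeline output so the snippet runs without downloading the model. It also assumes that start and end character offsets are present.

```python
from collections import defaultdict


def spans_by_type(predictions, text, min_score=0.5):
    """Group predicted text spans by discourse type, dropping low-confidence ones."""
    grouped = defaultdict(list)
    for p in predictions:
        if p["score"] >= min_score:
            # Slice the original text by character offsets rather than relying
            # on the tokenizer-reconstructed "word" field.
            grouped[p["entity_group"]].append(text[p["start"]:p["end"]])
    return dict(grouped)


# Hand-written stand-in for token_classifier(sample_text) output.
sample_text = "Schools should ban phones. They distract students."
sample_predictions = [
    {"entity_group": "Position", "score": 0.91, "word": "", "start": 0, "end": 26},
    {"entity_group": "Claim", "score": 0.42, "word": "", "start": 27, "end": 50},
]

confident = spans_by_type(sample_predictions, sample_text)
everything = spans_by_type(sample_predictions, sample_text, min_score=0.0)
print(confident)
```

To use it with the real model, pass the output of token_classifier(...) and the same input text in place of the sample values.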

Kaggle Submission

To evaluate your own texts and write the results to a CSV file in the Kaggle submission format, run kaggleRun.ipynb and point it at the folder containing your texts.
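As a rough illustration of what such a conversion can involve, the sketch below maps a character span to whitespace-split word indices and writes one CSV row per predicted element. The column names id, class, and predictionstring, the helper names, and the essay ID are assumptions made for this example; check kaggleRun.ipynb for the format the notebook actually produces.

```python
import csv
import io


def char_span_to_word_indices(text, start, end):
    """Return indices of whitespace-split words that overlap [start, end)."""
    indices = []
    position = 0
    for i, word in enumerate(text.split()):
        word_start = text.index(word, position)
        word_end = word_start + len(word)
        if word_start < end and word_end > start:
            indices.append(i)
        position = word_end
    return indices


def write_submission(rows, handle):
    """Write (essay_id, label, word_indices) rows as CSV (assumed column names)."""
    writer = csv.writer(handle)
    writer.writerow(["id", "class", "predictionstring"])
    for essay_id, label, word_indices in rows:
        writer.writerow([essay_id, label, " ".join(map(str, word_indices))])


text = "Schools should ban phones. They distract students."
indices = char_span_to_word_indices(text, 27, 50)

# An in-memory buffer keeps the example side-effect free; pass an open file
# handle (opened with newline="") to write to disk instead.
buffer = io.StringIO()
write_submission([("essay_001", "Claim", indices)], buffer)
print(buffer.getvalue())
```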

