Text Classification for Records Management

By Jason Franks

Supervisors: Greg Rolan, Lan Du

This repository contains the source code for the paper Text Classification for Records Management.

All code was developed on Google Colab (https://colab.research.google.com/) and is intended to run there.

To run these experiments, you will need your data in a tab-separated .tsv file with two columns: 'label', containing the category name, and 'text', containing the raw text. Every category in the data file should have at least 10 records.
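The data requirements above can be checked before a notebook is launched. The following is a minimal sketch, not part of the repository: the function name `check_data_file` and the constant `MIN_RECORDS_PER_LABEL` are hypothetical, and it assumes the file is UTF-8 encoded with a header row naming the two columns.

```python
import csv
from collections import Counter

# Threshold taken from the readme; the constant name is hypothetical.
MIN_RECORDS_PER_LABEL = 10


def check_data_file(path, min_records=MIN_RECORDS_PER_LABEL):
    """Return {label: count} for every label with fewer than min_records rows.

    Assumes a UTF-8, tab-separated file whose header row contains
    the columns 'label' and 'text'.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE treats quote characters in the raw text as ordinary data.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        missing = {"label", "text"} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing column(s): {sorted(missing)}")
        counts = Counter(row["label"] for row in reader)
    return {label: n for label, n in counts.items() if n < min_records}
```

An empty dictionary means every category meets the minimum. Any entries returned are categories that may need more records, or may need to be merged or dropped, before the experiments are run.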

The notebooks are set up to load these data files from Google Drive. Each notebook must be given the path to mount (mount_path) and the name of the file containing your text data (data_file). The mount path must contain a folder named output, into which the notebooks will write output metrics.
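The expected folder layout can be verified once the drive is mounted. This is a rough sketch under stated assumptions rather than code from the notebooks: the helper `resolve_paths` is hypothetical, and it assumes the data file sits directly under mount_path, which the readme does not state explicitly.

```python
from pathlib import Path


def resolve_paths(mount_path, data_file):
    """Return (data_path, output_dir) after checking that both exist.

    Assumption: data_file is located directly under mount_path.
    The 'output' folder name comes from the readme.
    """
    root = Path(mount_path)
    data_path = root / data_file
    output_dir = root / "output"
    if not data_path.is_file():
        raise FileNotFoundError(f"data file not found: {data_path}")
    if not output_dir.is_dir():
        raise FileNotFoundError(f"output folder not found: {output_dir}")
    return data_path, output_dir
```

Running a check like this first surfaces a missing output folder immediately, rather than after a long training run has finished and the notebook tries to write its metrics.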

The notebooks will install any software that was missing from the Colab environment as of June 2020.

Repository contents:

EvaluationData
CNN_Experiments.ipynb
C_LSTMN_Experiments.ipynb
LSTM_Experiments.ipynb
TF_IDF_Experiments.ipynb
Transformer_Experiments.ipynb
readme.md
