News Categorizer

News Categorizer is a Docker-based web service that:

  • provides the category of a given news URL
  • provides the category of a given news body

Installation

Edit docker-compose.yml to match your server configuration, then run the following command:

docker-compose up

Dataset

The predictive model is trained on the BBC News dataset. We split BBC News Train.csv into 20% validation, 10% test, and the remaining 70% training data, using random seed 42.
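A minimal sketch of such a 70/20/10 split is shown here for illustration. It uses only the Python standard library, so it will not reproduce the exact rows of the original split, which may have been produced with a different tool. The function name `split_indices` and its parameters are assumptions, not part of this repository.

```python
import random


def split_indices(n_rows, val_frac=0.2, test_frac=0.1, seed=42):
    """Return (train, val, test) lists of row indices for a seeded split."""
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)

    n_val = round(n_rows * val_frac)
    n_test = round(n_rows * test_frac)

    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]  # everything left over
    return train, val, test


# Example with a hypothetical row count of 1000.
train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 700 200 100
```

The returned index lists can then be used to select rows from the loaded CSV.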

Technologies

The service uses BERT to predict the category of the given content. BERT is fine-tuned with the ktrain library in Colab. You may use the same scripts in Colab to train your own model. Make sure the resulting files are named model and model.preproc and placed under the model directory in the source code.

For URL-based detection, it uses a rule-based approach.
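The actual rules are not listed in this README. As a rough illustration of what a rule-based URL categorizer can look like, the sketch below matches URL path segments against keyword lists. The `URL_RULES` table, the keywords, and the function name `categorize_url` are all hypothetical and are not taken from the source code.

```python
from urllib.parse import urlparse

# Hypothetical keyword rules: category -> path segments that imply it.
URL_RULES = {
    "business": ("business", "economy", "markets"),
    "entertainment": ("entertainment", "arts", "culture"),
    "politics": ("politics", "election"),
    "sport": ("sport", "sports", "football"),
    "tech": ("tech", "technology"),
}


def categorize_url(url):
    """Return the first category whose keyword is a URL path segment, else None."""
    path = urlparse(url).path.lower()
    segments = [segment for segment in path.split("/") if segment]
    for category, keywords in URL_RULES.items():
        if any(segment in keywords for segment in segments):
            return category
    return None  # no rule matched


print(categorize_url("https://example.com/sport/football/123"))  # sport
print(categorize_url("https://example.com/weather/today"))       # None
```

When no rule matches, a service like this could fall back to fetching the page and classifying its body text with the model.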

License

MIT