At some point, I realized that most posts in the Telegram channels I subscribe to were uninformative or of little value to me. So I built a bot that filters out ads and categorizes the news and posts from the channels I follow.
An example configuration file can be found at config/example_config.yaml. Here's what it should look like:
telegram:
  api_id: 1821196
  api_hash: "your_api_hash_here" # https://my.telegram.org/auth
  session_name: "news_classifier"
bot_settings:
  model_path: "model"
  db_path: "messages.db"
  message_lifetime: 2 # Time in hours
  # Optional: Uncomment to exclude categories or channels
  # exclude_categories:
  #   - 1
  #   - 2
  # exclude_channels:
  #   - 1
  #   - 2
- Obtain your API ID and API Hash by logging into Telegram's Developer Portal.
- Replace your_api_hash_here with your actual API hash.
- The model_path should point to the folder where the model is located (downloadable from this link).
- The db_path is the database where the bot stores the messages.
- The message_lifetime is the time in hours that messages are kept in the database so that repeated messages can be detected.
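To illustrate how a message lifetime can be used to catch repeats, here is a minimal sketch built on Python's standard sqlite3 module. It is not the bot's actual implementation: the seen_messages table, the is_repeat helper, and the exact-match hashing of normalized text are all assumptions made for the example, and the real bot may compare messages differently.

```python
import hashlib
import sqlite3
import time

MESSAGE_LIFETIME_HOURS = 2  # mirrors message_lifetime in the config


def open_store(db_path=":memory:"):
    """Open the database and create the (hypothetical) seen_messages table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_messages ("
        "digest TEXT PRIMARY KEY, stored_at REAL NOT NULL)"
    )
    return conn


def is_repeat(conn, text, now=None, lifetime_hours=MESSAGE_LIFETIME_HOURS):
    """Return True if the same text was stored within the lifetime window."""
    now = time.time() if now is None else now
    cutoff = now - lifetime_hours * 3600
    # Drop entries older than the lifetime so they no longer count as repeats.
    conn.execute("DELETE FROM seen_messages WHERE stored_at < ?", (cutoff,))
    # Assumption: two messages are "the same" if their normalized text matches.
    digest = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT 1 FROM seen_messages WHERE digest = ?", (digest,)
    ).fetchone()
    if row is None:
        # First sighting inside the window: remember it.
        conn.execute(
            "INSERT INTO seen_messages (digest, stored_at) VALUES (?, ?)",
            (digest, now),
        )
    conn.commit()
    return row is not None
```

With a two-hour lifetime, a message seen again after one hour is reported as a repeat, while the same message arriving three hours after the first sighting is treated as new. A larger message_lifetime suppresses more reposts at the cost of a bigger database.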
The example_config.yaml file is just a template. Once you've filled it in with your details, rename it to config.yaml.
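Before starting the bot, it can help to check that config.yaml was actually filled in. The sketch below validates an already-parsed configuration dictionary. The validate_config function is hypothetical and not part of the project, and the grouping of keys under telegram and bot_settings is assumed from the example above.

```python
# Assumed layout: keys grouped under the two top-level sections of the example.
REQUIRED_KEYS = {
    "telegram": ("api_id", "api_hash", "session_name"),
    "bot_settings": ("model_path", "db_path", "message_lifetime"),
}
PLACEHOLDER_HASH = "your_api_hash_here"


def validate_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section) or {}
        for key in keys:
            if key not in values:
                problems.append(f"missing {section}.{key}")

    telegram = config.get("telegram") or {}
    if telegram.get("api_hash") == PLACEHOLDER_HASH:
        problems.append("telegram.api_hash still holds the template placeholder")

    lifetime = (config.get("bot_settings") or {}).get("message_lifetime")
    if lifetime is not None and (
        not isinstance(lifetime, (int, float)) or lifetime <= 0
    ):
        problems.append(
            "bot_settings.message_lifetime must be a positive number of hours"
        )
    return problems
```

The dictionary would typically come from a YAML parser, for example PyYAML's yaml.safe_load applied to the opened config.yaml. Returning a list of problems instead of raising on the first one lets you fix the whole file in a single pass.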
You can run the bot using Docker. First, build the image:
docker build -t telegram-news-classifier .
If the session file (news_classifier.session) is missing, the bot will require you to log in. To do this, run the following command:
docker run -i -t -v $(pwd):/app telegram-news-classifier --login
This command will initiate the login process, and you will be prompted to enter your phone number and the authentication code from Telegram. After the first login, the session file will be saved and used for future runs.
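If you script the startup, the decision between the two modes can be made by checking for the session file. This is a small sketch rather than the bot's own logic: needs_login is a hypothetical helper, and it assumes the session file is named after session_name with a .session suffix and lives in the directory mounted into the container.

```python
from pathlib import Path


def needs_login(session_name, workdir="."):
    """Return True when <session_name>.session is absent from workdir."""
    return not (Path(workdir) / f"{session_name}.session").is_file()


def bot_args(session_name, workdir="."):
    """Arguments to pass to the bot: --login only when no session exists yet."""
    return ["--login"] if needs_login(session_name, workdir) else []
```

For example, bot_args("news_classifier") yields ["--login"] on a fresh checkout and an empty list once the first login has saved the session file.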
If the session file is already present (created after the first login), you can run the bot without the --login flag:
docker run -d -v $(pwd):/app telegram-news-classifier
Alternatively, if you're using docker-compose, you can run the bot with:
docker-compose up --build -d
Alternatively, you can set up and run it manually using Python and Poetry:
- Install Poetry if you haven't already: Poetry installation guide.
- Clone the repository and navigate to the project folder.
- Download the model from this link.
- Install the dependencies by running:
  poetry install
  Additionally, you will need to install the language model for spaCy:
  poetry run python -m spacy download ru_core_news_sm
- To run the bot, use:
  poetry run python -m bot.main
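After a manual setup, a quick pre-flight check can confirm that spaCy and the Russian language model are importable before the bot starts. The missing_packages helper below is hypothetical and not part of the project. It relies on the fact that spaCy models installed via spacy download are regular Python packages.

```python
import importlib.util


def missing_packages(names=("spacy", "ru_core_news_sm")):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Run it inside the Poetry environment (for instance with poetry run python followed by the script's path) so that it inspects the same interpreter the bot will use.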
- Add a "merge" news function (combine news from different sources into the most detailed version).
- Add deletion of topics if changes have been made to the config.
This project is licensed under the MIT License. See LICENSE.md for the full text.