Django Techcrunch Scrapper

DjangoTechcrunchScrapper is a Django app to scrape Techcrunch.com website items . Scrapped Data are authors , categories , articles . Application development and testing with django v4.2

Quick start

Install all the packages and requirements with :
```
 pip install -r  requirements.txt
```
Install broker manager like rabbitmq or redis
Set specific and custome settings for you project in settings.py

Set specific and custome settings for you celery in Celery name space in settings.py

 # CELERY-SETCION
 CELERY_BROKER_URL = 'amqp://localhost:your port' (for rabbitmq)
 CELERY_TIMEZONE = 'Your timezone'
 CELERY_TASK_TIME_LIMIT = 60 * 60
 CELERY_RESULT_BACKEND = 'django-db'
 CELERY_TASK_SERIALIZER = 'json'
 CELERY_RESULT_SERIALIZER = 'json'

open terminal and make migrations for models :

 python manage.py makemigrations     
 python manage.py migrate

First of all set the celery beat schedule, go to celery.py and find schedule , change it by second to change schedule:

 app.conf.beat_schedule =
   {
        'every-day-start-daily-scrape': {
            'task': 'techcrunch.tasks.daily_scrape_task',
            'schedule': 86400,  # One day
        },
    }

Before all the things you should be logged in to use specific services , so at first:
```
   py manage.py createsuperuser
```
Then log in with url host:port/admin

After setting celery settings call celery-beat and celery-worker with each other in two cmd terminal:

 celery -A techcrunch_scrapper_with_django worker -l INFO -P eventlet
 celery -A techcrunch_scrapper_with_django beat --loglevel=INFO

Then at last run the django server and run the app :
```
python manage.py runserver
```

Links description :

admin/  => admin panel
manual_daily_search [name='manual_daily_search']  => manual daily scrapping with out celery beat
search_keyword [name='search_keyword']  => search by keyword  page
diagrams/<slug:model_name> [name='diagrams'] => draw diagrams :
diagrams/author => number of articles of each author
diagrams/category => number of articles of each category
diagrams/article => number of articles seach by keyword

The result of diagram generating , will be saved in basedirectory / exports ...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Quick start

Files

README.md

Latest commit

History

README.md

File metadata and controls

Quick start