Dark-Web-Spiders

This repo contains Scrapy spiders for dark web sites. They were used in practice to collect data.

What differentiates this from normal scrapers?

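On the dark web, many sites put CAPTCHAs in front of their content, which blocks automated spiders. Here the CAPTCHAs are solved manually in a browser, and the resulting session cookies are then passed to the spider so its requests are treated as an already-verified session.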
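As a minimal sketch of the cookie hand-off, assuming the cookies are copied from the browser as a single `Cookie` header string after the CAPTCHA is solved: the helper below turns that string into a dict. The function name `parse_cookie_header` and the sample cookie names are hypothetical and are not taken from the spiders in this repo.

```python
def parse_cookie_header(raw):
    """Convert a raw Cookie header string into a {name: value} dict."""
    cookies = {}
    for pair in raw.split(";"):
        # Skip empty or malformed fragments that have no name=value shape.
        if "=" not in pair:
            continue
        # Split on the first "=" only, since cookie values may contain "=".
        name, value = pair.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


# Hypothetical cookie string, as copied from the browser's developer tools.
cookies = parse_cookie_header("session=abc123; captcha_ok=1")
```

A dict in this form can be passed as the `cookies` argument of `scrapy.Request` when the spider issues its requests.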

To use these files:

Start a new Scrapy project.
Replace the generated project settings with the values from settings.py.
Run the title scraper (title_scraper_spider.py) first. Before running it, verify that the selectors match your target website, or write your own. Replace 'sample.website' with the real address and supply valid cookies.
Then, using the data scraped in the previous step, do the same for the post scraper (thread_scrape_spider.py).
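The hand-off between the two spiders could look roughly like the sketch below. It assumes the title scraper's output was exported as a JSON Lines feed and that each item stores the thread address under a `url` field; both the file format and the field name are assumptions, so adjust them to whatever your title scraper actually emits. `load_thread_urls` is a hypothetical helper, not part of this repo.

```python
import json


def load_thread_urls(path, field="url"):
    """Read a JSON Lines feed and return the unique thread URLs in order."""
    seen = set()
    urls = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the feed
            url = json.loads(line).get(field)
            # Keep only the first occurrence of each URL.
            if url and url not in seen:
                seen.add(url)
                urls.append(url)
    return urls
```

The returned list can then serve as the starting URLs for the post scraper, with the same cookies attached to each request.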
