ScrapeMEE is a powerful GUI-based web scraping tool designed to extract text, images, links, tables, and metadata from websites. It features dynamic content rendering and browser automation with Selenium, recursive scraping, and data export to multiple formats, plus a dark mode, proxy support, and user-agent rotation for flexible, responsible scraping.
- 🌐 GUI-Based Scraper – Built with Tkinter for an interactive experience.
- ⚡ Dynamic Content Extraction – Uses Selenium to render JavaScript-heavy pages before parsing (see the first sketch after this list).
- 🔍 Recursive Scraping – Scrape multiple pages up to a defined depth.
- 🛡️ Proxy & User-Agent Rotation – Rotates proxies and user agents to reduce detection and blocking (see the second sketch after this list).
- 📋 Extracts:
- Text Content
- Links & Social Media Links
- Images (Preview & Download)
- HTML Tables (Convert to CSV/Excel)
- Metadata (Title, Description, Keywords, etc.)
- Emails & Phone Numbers
- 🏷 Automated Form Submission – Interacts with login & input forms.
- 📜 Handles Pagination & Infinite Scroll – Collects data from multi-page websites.
- 🎭 Dark Mode Support – Toggle between light and dark themes.
- 📤 Export Options: JSON, CSV, Excel, PDF.
- 📑 GDPR Disclaimer – Reminds users of legal and ethical constraints before scraping.
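To make the dynamic-extraction feature above concrete, here is a minimal, hedged sketch of the general technique: Selenium renders a JavaScript-heavy page headlessly and BeautifulSoup extracts text, links, images, and e-mail addresses from the rendered HTML. The URL is a placeholder and the code is illustrative, not ScrapeMEE's actual implementation.

```python
import re

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")    # render pages without opening a browser window

driver = webdriver.Chrome(options=options)
driver.get("https://example.com")         # placeholder URL
html = driver.page_source                 # HTML after JavaScript has executed
driver.quit()

soup = BeautifulSoup(html, "html.parser")
text = soup.get_text(separator="\n", strip=True)                 # visible text
links = [a["href"] for a in soup.find_all("a", href=True)]       # every hyperlink
images = [img["src"] for img in soup.find_all("img", src=True)]  # image URLs
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)           # naive e-mail matcher

print(f"{len(links)} links, {len(images)} images, {len(emails)} e-mails")
```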
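Likewise, proxy and user-agent rotation typically boils down to varying request headers and proxy settings per request. The user-agent strings and proxy address below are placeholders, not values shipped with the tool.

```python
import random

import requests

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
]
PROXIES = [
    None,                                                                   # direct connection
    {"http": "http://127.0.0.1:8080", "https": "http://127.0.0.1:8080"},    # placeholder proxy
]

def fetch(url: str) -> str:
    """Fetch a page with a randomly chosen user agent and proxy."""
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    response = requests.get(url, headers=headers, proxies=random.choice(PROXIES), timeout=10)
    response.raise_for_status()
    return response.text
```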
- Python 3.x – Core programming language.
- Tkinter – GUI framework.
- BeautifulSoup – HTML parsing and data extraction.
- Requests & Selenium – Fetching pages and rendering JavaScript-driven (dynamic) content.
- Pandas – Data manipulation and table export to CSV/Excel (sketched after this list).
- Pillow (PIL) – Image handling & previews.
- PDFKit – Data export to PDF.
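To illustrate where Pandas fits in, the usual pattern for turning HTML tables into CSV/Excel looks roughly like this (a hedged sketch; the URL is a placeholder, and Excel export additionally needs openpyxl installed):

```python
import pandas as pd

tables = pd.read_html("https://example.com/stats")  # one DataFrame per <table> on the page
tables[0].to_csv("table_0.csv", index=False)        # CSV export
tables[0].to_excel("table_0.xlsx", index=False)     # Excel export (requires openpyxl)
```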
- Install Python 3.x on your system.
- Clone or download the repository.
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Install wkhtmltopdf (required only for PDF export):
  - Windows: download the installer from the official wkhtmltopdf site.
  - Linux:
    ```bash
    sudo apt install wkhtmltopdf
    ```
- Run the application:
  ```bash
  python ScrapeMEE.py
  ```
- Enter a website URL and click Scrape Website.
- View extracted text, images, links, and tables in the respective tabs.
- Export data to JSON, CSV, Excel, or PDF.
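Export largely comes down to the standard json module, Pandas, and pdfkit (a wrapper around wkhtmltopdf). The snippet below is a hedged sketch with made-up result data, not ScrapeMEE's own export code:

```python
import json

import pdfkit

results = {"title": "Example Domain", "links": ["https://example.com/about"]}  # placeholder data

# JSON export with the standard library.
with open("scrape.json", "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2)

# PDF export: pdfkit calls wkhtmltopdf; pass an explicit path if the binary is not on PATH,
# e.g. pdfkit.configuration(wkhtmltopdf=r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe").
config = pdfkit.configuration()
pdfkit.from_string("<h1>Scrape report</h1><p>1 link found.</p>", "scrape.pdf", configuration=config)
```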
- Ensure compliance with robots.txt and GDPR regulations before scraping; a simple robots.txt check is sketched below.
- Do NOT scrape personal or sensitive data without permission.
- This tool is for educational and research purposes only.
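As referenced above, a robots.txt check can be done with Python's standard library before any page is fetched. This is a minimal sketch with a placeholder site and an illustrative user-agent string:

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://example.com/robots.txt")  # placeholder site
rp.read()                                               # download and parse robots.txt

url = "https://example.com/some-page"
if rp.can_fetch("ScrapeMEE", url):                      # user-agent string is illustrative
    print("robots.txt allows fetching", url)
else:
    print("robots.txt disallows fetching", url)
```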
- 📌 Multi-threaded scraping for faster data retrieval (a minimal sketch follows below).
- 📌 Support for greater, configurable scraping depth.
- 📌 Browser automation for interactive sites.
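For the multi-threading item above, one plausible shape, sketched here with only the standard library plus Requests and placeholder URLs, would be:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

urls = [f"https://example.com/page/{i}" for i in range(1, 6)]  # placeholder URLs

def fetch(url: str) -> str:
    return requests.get(url, timeout=10).text

# Fetch pages concurrently; parsing can then run on the downloaded HTML.
with ThreadPoolExecutor(max_workers=4) as pool:
    pages = list(pool.map(fetch, urls))

print(f"fetched {len(pages)} pages")
```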
Developed by Aditya Barokar 🚀