This repository contains a PHP script for scraping restaurant details from the Michelin Guide website. The script is designed to extract data such as restaurant names, descriptions, images, addresses, and GPS coordinates for restaurants listed under the Bib Gourmand category in Taiwan.
Before running this script, you will need:
- PHP 7.4 or higher
- cURL support enabled in PHP
- The DOM extension for PHP, which also provides XPath support (DOMXPath)
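
The requirements above can be verified before the first run. The following is a minimal sketch and is not part of scraper.php; the function name `missingRequirements` is a hypothetical helper introduced here for illustration.

```php
<?php
// Hypothetical pre-flight check; not part of scraper.php.
function missingRequirements(): array
{
    $missing = [];

    // The README asks for PHP 7.4 or higher.
    if (version_compare(PHP_VERSION, '7.4.0', '<')) {
        $missing[] = 'PHP 7.4 or higher (found ' . PHP_VERSION . ')';
    }

    // DOMDocument and DOMXPath both ship with the "dom" extension.
    foreach (['curl', 'dom'] as $extension) {
        if (!extension_loaded($extension)) {
            $missing[] = "the {$extension} extension";
        }
    }

    return $missing;
}

$missing = missingRequirements();
if ($missing) {
    echo 'Missing: ' . implode(', ', $missing) . PHP_EOL;
} else {
    echo 'All requirements met.' . PHP_EOL;
}
```

Saving this under any file name and running it with the `php` command lists whatever is missing on the current machine.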
- Clone the repository:
- Navigate to the project directory:
Run the script from the command line or a web server that supports PHP.
To run it on a web server, upload the repository contents to the server's public directory and open scraper.php in your browser.
The script outputs data in XML format, with detailed information about each restaurant. The output can be viewed directly in a web browser when the script is served by a web server, or in the console when it is run from the command line.
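
To illustrate how output of this kind can be produced, here is a minimal sketch that serializes an array of restaurants to XML with `DOMDocument`. It is an assumption about the general approach, not the script's actual code: the function name `restaurantsToXml`, the element names (`restaurants`, `restaurant`, `name`, and so on), and the sample record are all hypothetical placeholders.

```php
<?php
// Hypothetical sketch; the element names are assumptions, not scraper.php's real schema.
function restaurantsToXml(array $restaurants): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $root = $doc->appendChild($doc->createElement('restaurants'));

    foreach ($restaurants as $restaurant) {
        $node = $root->appendChild($doc->createElement('restaurant'));
        foreach ($restaurant as $field => $value) {
            $child = $doc->createElement($field);
            // Text nodes are escaped on output, so "&" and "<" in the data are safe.
            $child->appendChild($doc->createTextNode((string) $value));
            $node->appendChild($child);
        }
    }

    return $doc->saveXML();
}

// Placeholder record, not real Michelin Guide data.
$xml = restaurantsToXml([
    [
        'name' => 'Example & Co.',
        'address' => '1 Sample Road, Taipei',
        'latitude' => '25.04',
        'longitude' => '121.53',
    ],
]);

// Send an XML content type only when served over HTTP, so browsers render the tree.
if (PHP_SAPI !== 'cli') {
    header('Content-Type: application/xml; charset=utf-8');
}
echo $xml;
```

Building the document through the DOM API rather than string concatenation keeps the output well-formed even when names or descriptions contain characters such as `&`.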
- Fetches the list of restaurants from the Bib Gourmand section of the Michelin Guide website.
- Extracts detailed information about each restaurant, including:
  - Name
  - Description
  - Images
  - Address
  - Timetable
  - GPS coordinates
- Outputs data in XML format for easy integration with other systems or for further processing.
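
The extraction step listed above can be sketched with `DOMDocument` and `DOMXPath`. This is an illustration under stated assumptions, not the script's implementation: the function name `extractRestaurants`, the `restaurant-card` class, and the `data-lat`/`data-lng` attributes are hypothetical, and the real Michelin Guide markup will differ. The sketch parses an HTML string; in practice that string would presumably come from a cURL request.

```php
<?php
// Hypothetical sketch; the selectors below are assumptions about the page markup.
function extractRestaurants(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix hints to libxml that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $restaurants = [];

    foreach ($xpath->query('//div[contains(@class, "restaurant-card")]') as $card) {
        $restaurants[] = [
            // string(...) yields the text of the first matching node, or '' if none.
            'name' => trim($xpath->evaluate('string(.//h3)', $card)),
            'latitude' => $card->getAttribute('data-lat'),
            'longitude' => $card->getAttribute('data-lng'),
        ];
    }

    return $restaurants;
}

// Placeholder markup, not copied from the Michelin Guide website.
$sampleHtml = '<div class="restaurant-card" data-lat="25.04" data-lng="121.53">'
    . '<h3> Sample Noodle House </h3></div>';

print_r(extractRestaurants($sampleHtml));
```

Keeping the parsing in a function that takes a string makes it possible to test the selectors against saved pages without sending requests to the website.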
Please ensure that your use of this script complies with the Michelin Guide website's Terms of Service and respects its robots.txt directives. This script is intended for educational purposes only. Users are responsible for ensuring that their use complies with the laws and website terms that apply to web scraping and data usage.
Contributions to this project are welcome. Please fork the repository and submit a pull request with your enhancements.
This project is released under the MIT License. See the LICENSE file for details.