amine-atyq/web-crawler-http

Web Crawler

Description

A simple web crawler that crawls a specified website and generates a report of internal links found during the crawling process.

Features

  • Crawls a website starting from a base URL
  • Normalizes and tracks unique URLs
  • Handles both absolute and relative links
  • Generates a report showing how many internal links point to each page
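
The features above can be sketched with Node's built-in URL class. This is a minimal illustration, not the repository's actual code: the function names (normalizeURL, resolveLink, countLinks, sortPages) and the choice to normalize a URL to "host + path without a trailing slash" are assumptions made for the example.

```javascript
// Hypothetical helpers: names and normalization rules are assumptions,
// not taken from the repository.

// Reduce a URL to "host/path" so variants of the same page compare equal.
// Protocol, query string, and fragment are dropped; a trailing slash is removed.
function normalizeURL(urlString) {
  const url = new URL(urlString);
  let path = url.pathname;
  if (path.endsWith('/')) {
    path = path.slice(0, -1);
  }
  return `${url.hostname}${path}`;
}

// Turn an href (absolute or relative) into an absolute URL string.
// Returns null when the href cannot be parsed.
function resolveLink(href, baseURL) {
  try {
    return new URL(href, baseURL).href;
  } catch (err) {
    return null;
  }
}

// Tally how often each internal page is linked.
// A link counts as internal when its hostname matches the base URL's hostname.
function countLinks(hrefs, baseURL) {
  const baseHost = new URL(baseURL).hostname;
  const pages = new Map();
  for (const href of hrefs) {
    const absolute = resolveLink(href, baseURL);
    if (absolute === null) continue; // skip unparseable links
    if (new URL(absolute).hostname !== baseHost) continue; // skip external links
    const key = normalizeURL(absolute);
    pages.set(key, (pages.get(key) || 0) + 1);
  }
  return pages;
}

// Order pages from most-linked to least-linked for the report.
function sortPages(pages) {
  return [...pages.entries()].sort((a, b) => b[1] - a[1]);
}

// Usage: print one report line per internal page.
const hrefs = ['/about', 'https://example.com/about/', 'contact', 'https://other.example.org/'];
for (const [page, count] of sortPages(countLinks(hrefs, 'https://example.com'))) {
  console.log(`Found ${count} internal links to ${page}`);
}
```

Keying the tally on the normalized form means https://example.com/about and https://example.com/about/ are counted as the same page. A real crawler would likely apply the same key to its set of already-visited pages.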

Prerequisites

  • Node.js
  • npm

Installation

  1. Clone the repository
  2. Run npm install to install dependencies

Usage

npm start [website_url]

Example

npm start https://example.com

Testing

Run tests using Jest:

npm test

Dependencies

  • JSDOM
  • Fetch API
  • Jest (for testing)
