About

SniperMAG is a product sniper. Basically, it notifies you when a product's price or availability changes, and when new products are added. It can be modified to work with any website.

Disclaimer

This tool was designed to be used on websites that allow other tools similar to this to be used. Do NOT use this tool on websites that forbid scraping, or on websites whose terms of service would be broken as a result of using this tool.

The authors of this project are NOT RESPONSIBLE in any way for your own actions, as stated in the license. This tool distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY.

Setting up

Run npm i
Rename the config.sample.json file to config.json
Enter your email details in the config file (the address is used to send the notifications via nodemailer)
Add products/search queries in the config (see the config docs)

Running

TL;DR:

If it's the first time using this project:

npm run populate
npm start

If you changed the config by adding or removing products/search queries:

npm run clean
npm run populate
npm start

Just run:

npm start

If you're running the script for the first time, or you're adding new products/search queries, it's recommended to run the populate command. This populates the database with the products, and doesn't send any notifications. Basically, it's like start, but it only runs once in order to populate the database, and also doesn't spam notifications.

npm run populate

In any other cases, just run the script normally:

npm start

All of the products are cached in an SQLite3 database. Even if you remove products/search queries from the config, those are still cached in the database. To remove them, simply run the clean command.

npm run clean

Config

There are 2 core elements:

product - has a name and a URL; the product URL will be used to extract the price and status of the product
search query - a search URL that is used to extract multiple search results (aka products); use this if you want to see when a new product becomes available, or to watch multiple products matching a search query; each resulting product will be watched for any changes to the price or the status, and the script will also look for new products that appear by periodically sending requests to the URL

If you don't feel like reading this whole section, see Common scenarios.

The config file is config.json. All of the options listed below should be present in the config. The 'default' values refer to what the sample config comes with; the script doesn't set any defaults.

database - the path to the SQLite database (default: data/data.db)
email
- notifications - whether or not to send email notifications; set this to false to disable all email notifications (default: true)
- host - email host
- port - email port (default: 465)
- secure - secure email (default: true)
- user - email username
- pass - email password
- name - email name, which will appear on the notification emails (default: SniperMAG)
- address - email address; this is probably the same as the username; this will appear on the notification emails
requestDelay - delay in ms between requests (default: 5000 aka 5 seconds)
snipeDelay - the script scrapes all products, and then waits for a bit; this is how long to wait in ms (default: 300000 aka 5 minutes)
maxRetries - if a request fails, this value says how many times to retry before giving up (default: 3)
products - an array of products, each one having:
- url - the URL of the product; the script sends a request here and receives a response (either HTML or JSON); this response is then scraped by a provider (more details here) in order to obtain the price and status
- name - the name of the product; should be unique, because it's used as the primary key in the SQLite database
- provider - the JS module that knows how to scrape the response (more details here)
- format - how to pass the response to the provider script
  - html if the response is an HTML page (default) - this loads the response with Cheerio and passes it to the provider
  - json if the response is JSON - this just sends the JSON response to the provider
- threshold - if the new product price is higher than this, the script won't send an email notification when the product becomes available or the price changes; if this is 0, the threshold will be ignored and the script will always send notifications (default: 0 aka no threshold)
- recipients - a list of strings, each one being an email address; the notification emails regarding this product will only be sent to these addresses
searches - an array of search queries, each one having:
- url - the URL of the search query; the script sends a request here and receives a response (either HTML or JSON); this response is then scraped by a provider (more details here) in order to obtain the results
- keywords - a list of strings, each one being a lowercase keyword; the products that result from this search must contain all keywords 9case insensitive); those that don't contain all of them will be ignored
- provider - the JS module that knows how to scrape the response for this search query, and also for each resulting product (more details here)
- searchFormat - how to pass the search response to the provider script
  - html if the response is an HTML page (default) - this loads the response with Cheerio and passes it to the provider
  - json if the response is JSON - this just sends the JSON response to the provider
- format - each product resulting from this search query will have this format passed to it; see the same field from the products config object; searchFormat applies only on the search query response, while this applies on the resulting products
  - example - the search URL returns JSON, but each product's URL is an HTML page -> in this case, set searchFormat to JSON, and format to HTML
- threshold - each product resulting from this search query will have this threshold passed to it; see the same field in the products config object
- recipients - each product resulting from this search query will have this list passed to it; see the same field in the products config object

Common scenarios

I just want to scrape a product by its URL that returns an HTML page

Add an entry to the products list, and set format: "html"

I'm scraping search results and also the resulting products, both using URLs that return HTML

Use "html" for both the searchFormat and format

The search URL returns JSON, but each product's URL returns HTML aka the product's page

Use searchFormat: "json" and format: "html" for the search query

The search URL returns JSON, just like each product's URL

Use "json" for both the searchFormat and format

How do I ignore the price of products? I want to always receive notifications, regardless of the price

Set threshold: 0

I (temporarily) don't want to receive any emails

Set notifications: false in the email object

Custom scraping

In order to scrape content from multiple websites, the script uses so-called providers. A provider is simply a JS module that exports 2 functions:

scrapeData - takes the HTML/JSON content of a product, and returns the price and status (available/unavailable) for that product
scrapeSearch - takes the HTML/JSON result from a search, and returns a list of products (name + URL)

To create a provider, simply copy the providers/sample.js file to providers/your_provider.js and change those 2 functions to suit your needs. Afterwards, for each product/search query that should use your provider, set provider: "your_provider" aka the provider filename without .js.

Providers must be in the providers dir

The script supports both HTML and JSON:

if the URL you use for a product returns HTML, then scrapeData will receive the HTML content in the form of Cheerio's $ object. You can use this to scrape the page by using selectors (refer to the Cheerio documentation).
if the URL returns JSON, each product using this provider should have format: "json" in the config. The scrapeData function will receive a JS object, parsed from the JSON response

The same goes for scrapeSearch; if you use a JSON URL for a search queury, set searchFormat: "json". You should set the format depending on what the resulting product URLs return.

For example, the sample provider (providers/sample.js) assumes the parameter to be $ (aka HTML content) for both scraping functions. Therefore, if you use this provider in products/search queries, those must have format: "html" (and searchFormat: "html" on top of that for search queries).

How it works

All products from the config are added to the database
Each search query from the config has one or more results
Each result is added to the database
Each product's page will be scraped; if anything changes, the script will send notifications and will update the entry in the database
Go back to step 2; if you're running the script with the populate command, the script stops here

If you remove products/seach queries from the config, keep in mind that those products will still be stored in the database (and, therefore, they will be watched). If you don't want this to happen, simply run a clean command followed by populate, and then just use the script normally with start.

License

This project is licensed under the MIT license.

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
db		db
img		img
providers		providers
views		views
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
common.js		common.js
config.sample.json		config.sample.json
eryn.js		eryn.js
index.js		index.js
mail.js		mail.js
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Disclaimer

Setting up

Running

Config

Common scenarios

I just want to scrape a product by its URL that returns an HTML page

I'm scraping search results and also the resulting products, both using URLs that return HTML

The search URL returns JSON, but each product's URL returns HTML aka the product's page

The search URL returns JSON, just like each product's URL

How do I ignore the price of products? I want to always receive notifications, regardless of the price

I (temporarily) don't want to receive any emails

Custom scraping

How it works

License

About

Releases

Languages

License

UnexomWid/SniperMAG

Folders and files

Latest commit

History

Repository files navigation

About

Disclaimer

Setting up

Running

Config

Common scenarios

I just want to scrape a product by its URL that returns an HTML page

I'm scraping search results and also the resulting products, both using URLs that return HTML

The search URL returns JSON, but each product's URL returns HTML aka the product's page

The search URL returns JSON, just like each product's URL

How do I ignore the price of products? I want to always receive notifications, regardless of the price

I (temporarily) don't want to receive any emails

Custom scraping

How it works

License

About

Resources

License

Stars

Watchers

Forks

Releases

Languages