How to avoid scraping the same data again and again
Table of Contents
- 🎯 Objective
- 🏗 Prerequisites
- 👩‍💻 Just tell me what to do
- 📦 Suggested node modules
- 🛣️ Related Theme and courses
Store deals and sales in a database with node.js to create, read, update or delete data...
- Be sure to have a clean working copy.
This means that you should not have any uncommitted local changes.
❯ cd /path/to/workspace/lego
❯ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
- Pull the master branch to update your local copy with the new remote changes
❯ git remote add upstream git@github.com:92bondstreet/lego.git
## or ❯ git remote add upstream https://github.com/92bondstreet/lego
❯ git fetch upstream
❯ git pull upstream master
- Create a free account on MongoDB Atlas, a Database as a Service (DBaaS) provider.
- Create a new project called Lego and deploy a cluster.
- Connect your node.js server script
const {MongoClient} = require('mongodb');
const MONGODB_URI = 'mongodb+srv://<user>:<password>@<cluster-url>?retryWrites=true&w=majority';
const MONGODB_DB_NAME = 'lego';
...
const client = await MongoClient.connect(MONGODB_URI, {'useNewUrlParser': true});
const db = client.db(MONGODB_DB_NAME);
...
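Opening a new connection for every query is wasteful, so it helps to connect once and share the result across the script. The sketch below memoizes the connection promise. It is only an illustration under stated assumptions: `createDbGetter` and the injected `connect` function are hypothetical names that are not part of the workshop code, and `connect` is assumed to resolve to a connected `MongoClient`.

```javascript
// Hypothetical helper (not part of the workshop code): memoize the
// connection promise so that every caller shares a single MongoClient.
function createDbGetter(connect, dbName) {
  let dbPromise = null;

  return function getDb() {
    if (!dbPromise) {
      // `connect` is assumed to return a promise of a connected MongoClient
      dbPromise = connect()
        .then(client => client.db(dbName))
        .catch(error => {
          // Forget the failed attempt so that the next call can retry
          dbPromise = null;
          throw error;
        });
    }

    return dbPromise;
  };
}

// Usage sketch, inside an async function, with the constants defined above:
// const getDb = createDbGetter(() => MongoClient.connect(MONGODB_URI), MONGODB_DB_NAME);
// const db = await getDb();
```

Injecting `connect` rather than calling `MongoClient.connect` directly keeps the helper easy to try out with a fake client.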
- Insert the deals and sales into this database
const deals = [];
...
const collection = db.collection('deals');
// insertMany returns a promise: await it to log the actual result
const result = await collection.insertMany(deals);
console.log(result);
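Running `insertMany` after every scrape stores the same deals again each time. One possible way around this is to upsert instead: `bulkWrite` with `updateOne` operations and `upsert: true` updates a document that already exists and inserts it otherwise. A minimal sketch, assuming every deal carries a unique `id` field (a hypothetical name; use whatever uniquely identifies a deal in your scraper output). `toUpsertOperations` is an illustrative helper, not part of the workshop code.

```javascript
// Hypothetical helper: turn scraped deals into bulkWrite operations.
// Assumes each deal has a unique `id` field (adapt it to your own data).
function toUpsertOperations(deals) {
  return deals.map(deal => ({
    updateOne: {
      filter: {id: deal.id}, // match the deal stored by a previous scrape
      update: {$set: deal},  // refresh its fields with the latest values
      upsert: true           // insert the deal if no document matched
    }
  }));
}

// Usage sketch, inside an async function, with `db` from the connection step:
// const collection = db.collection('deals');
// if (deals.length > 0) { // bulkWrite rejects an empty list of operations
//   const result = await collection.bulkWrite(toUpsertOperations(deals));
//   console.log(result.upsertedCount, result.modifiedCount);
// }
```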
- Create at least 6 methods to find deals and sales using MongoDB queries.
These 6 methods should:
- Find all best discount deals
- Find all most commented deals
- Find all deals sorted by price
- Find all deals sorted by date
- Find all sales for a given lego set id
- Find all sales scraped less than 3 weeks ago
- ...
const legoSetId = '42156';
...
const collection = db.collection('sales');
const sales = await collection.find({legoSetId}).toArray();
console.log(sales);
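The other methods can follow the same pattern: a filter passed to `find`, optionally chained with `sort`. Here is a sketch of three of them, assuming the documents carry `discount`, `price` and `scrapedAt` fields. These field names are hypothetical, and `scrapedAt` is assumed to be stored as a `Date`; adapt them to what your scraper actually saves. `queries` and `runQuery` are illustrative names as well.

```javascript
const THREE_WEEKS_MS = 3 * 7 * 24 * 60 * 60 * 1000;

// Hypothetical query specs: the field names (`discount`, `price`,
// `scrapedAt`) are assumptions about how the scraper stores its data.
const queries = {
  // Highest discount first
  bestDiscount: () => ({filter: {}, sort: {discount: -1}}),
  // Cheapest first
  sortedByPrice: () => ({filter: {}, sort: {price: 1}}),
  // Scraped less than 3 weeks before `now`, most recent first
  recentSales: (now = new Date()) => ({
    filter: {scrapedAt: {$gt: new Date(now.getTime() - THREE_WEEKS_MS)}},
    sort: {scrapedAt: -1}
  })
};

// Run one of the specs above against a collection
function runQuery(collection, {filter, sort}) {
  return collection.find(filter).sort(sort).toArray();
}

// Usage sketch, inside an async function:
// const deals = await runQuery(db.collection('deals'), queries.bestDiscount());
// console.log(deals);
```

Keeping the filter and sort as plain objects makes each query easy to inspect before it reaches the database.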
- Commit your changes
❯ cd /path/to/workspace/lego
❯ git add -A && git commit -m "feat(new-deals): insert all new deals"
(why follow a commit message convention?)
- Commit early, commit often
- Don't forget to push before the end of the workshop
❯ git push origin master
Note: if you get an authentication error, add your SSH key to your GitHub profile.
If you need some help with git commands, read git - the simple guide