Authors > Paige Gorry | Kate Dameron
II. Tutorial
GET /api/v1/characters - get all characters (default 20 per page)
GET /api/v1/characters?perPage=${num}&page=${num} - set the page size and page through the character list
GET /api/v1/characters/:id - get character by their id
GET /api/v1/characters/random?count=${num} - get a random character (default 1)
GET /api/v1/characters?name=${string} - get character by their name
GET /api/v1/characters?${query}=${string} - get character by a specific query string (see options below)
queries available: aliases, otherRelations, affiliation, occupation, residence, appearsInEpisodes, status, gender, eyeColor, born, hairColor, portrayedBy
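For example (the values here are illustrative, pulled from the sample data below; partial matches work because query values are compiled to regexes, as shown in the routes section):
GET /api/v1/characters?status=Alive
GET /api/v1/characters?portrayedBy=Millie
GET /api/v1/characters?perPage=5&page=2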
This is the shape the returned data will take (our Mongoose schema):
const characterSchema = new mongoose.Schema({
  name: String,
  photo: String,
  status: String,
  born: String,
  aliases: {
    type: [String],
    default: ['unknown']
  },
  otherRelations: {
    type: [String],
    default: ['unknown']
  },
  affiliation: {
    type: [String],
    default: ['unknown']
  },
  occupation: {
    type: [String],
    default: ['unknown']
  },
  residence: {
    type: [String],
    default: ['unknown']
  },
  gender: String,
  eyeColor: String,
  hairColor: String,
  portrayedBy: String,
  appearsInEpisodes: {
    type: [String],
    default: ['unknown']
  },
});
GET /api/v1/characters?name=Eleven
[
  {
    "_id": "5e77d8d2caf0952a9c8499d9",
    "aliases": [
      "El",
      "011",
      "Jane Ives",
      "The Weirdo",
      "Eleanor",
      "Shirley Temple",
      "Mage"
    ],
    "otherRelations": [
      "Mike Wheeler",
      "Dustin Henderson",
      "Lucas Sinclair",
      "Max Mayfield",
      "Will Byers",
      "Jonathan Byers",
      "Benny Hammond",
      "Martin Brenner"
    ],
    "affiliation": [
      "Hawkins National Laboratory",
      "Party",
      "Ives family",
      "Hopper family"
    ],
    "occupation": [
      "Lab test subject (formerly)"
    ],
    "residence": [
      "Hawkins, Indiana (1971 - 1985)",
      "Byers house (July 1985 - October 1985)",
      "Hopper cabin (December 1983 - July 1985)",
      "Wheeler basement (November 1983)",
      "Hawkins National Laboratory (1971 - November 1983)"
    ],
    "appearsInEpisodes": [
      "1",
      "2",
      "3",
      "4",
      "5",
      "6",
      "7",
      "8",
      "9",
      "10",
      "11",
      "12",
      "13",
      "15",
      "16",
      "17",
      "18",
      "19",
      "20",
      "21",
      "22",
      "23",
      "24",
      "25"
    ],
    "photo": "https://vignette.wikia.nocookie.net/strangerthings8338/images/f/f1/Eleven_S03_portrait.png/revision/latest/scale-to-width-down/286?cb=20190722075442",
    "name": "Eleven",
    "status": "Alive",
    "born": "1971",
    "gender": "Female",
    "eyeColor": "Brown",
    "hairColor": "Brown",
    "portrayedBy": "Millie Bobby Brown"
  }
]
This project uses Node.js, Express, Superagent, MongoDB, Mongoose, and node-html-parser, deployed to Heroku. This tutorial requires some familiarity with Node.js, Express, and MongoDB, but we have linked resources for you as well. Other technologies are available!
For this project you'll need to set up a Node.js server. We used Express.js, but you can use whatever you want! Check out this tutorial.
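If it helps to see the starting point, here is a minimal sketch of an Express server (the port and structure are placeholders, not our exact server.js):

// server.js (a minimal Express sketch; your middleware and routes will differ)
const express = require('express');
const app = express();

// parse JSON request bodies
app.use(express.json());

// mount your router(s) here, e.g. app.use('/api/v1/characters', router);

app.listen(process.env.PORT || 7890, () => {
  console.log(`listening on PORT ${process.env.PORT || 7890}`);
});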
Think about what you are scraping and how often that data changes.
If you are scraping data for a streaming service (Netflix, Hulu, etc.), think about how often shows are added to those sites. Do you have a schedule for how you want to maintain your API to keep it up to date? How will you maintain versioning on your API?
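For example, if your source updates weekly, you could re-run your scraper on a schedule. A package like node-cron (not part of this project's stack, just one option) can handle that:

const cron = require('node-cron');
const { scraper } = require('./scraper');

// Re-scrape every Sunday at midnight; tune the cron expression to
// match how often your source site actually changes.
cron.schedule('0 0 * * 0', () => {
  scraper().then(() => console.log('re-scraped!'));
});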
- You will need a parser package (there are hundreds available, so use whichever you like). For this project we used node-html-parser.
- You will also need to install an npm package to help with the request. We used superagent.
> npm install -D node-html-parser superagent
- Now make a scraper.js file at the root of your repo
- Add a function to make the initial request
const request = require('superagent');
const { parse } = require('node-html-parser');

const scraper = () => {
  return request
    .get('[your url here]') // replace with the page you are scraping
    .then(res => res.text)
    .then(parse)
    .then(console.log);
};

scraper();

module.exports = { scraper };
- Now you should be able to run node scraper.js and see some HTML data appear in your console
Open up your dev tools and inspect the elements that hold the data you need to scrape. In our case, the first thing we needed to do was grab each h2 header with the class pi-title. Here's a screenshot of what we started with:
- We can grab a series of elements, like the h2s, by using the querySelectorAll method on the HTML we get back from the parser. To do this we made a helper function. Be sure to look at the documentation for your npm parser to see what kinds of selectors are available.
const titlesList = html => html
  .querySelectorAll('h2.pi-title')
  .map(node => node.rawText);
- Add your helper function to your request function:
const scraper = () => {
  return request
    .get('[your url here]')
    .then(res => res.text)
    .then(parse)
    .then(titlesList)
    .then(console.log);
};
Run node scraper.js again.
At this point you should be able to see the data and start making decisions about how to grab different elements, run some clean-up functions, and piece it all together to match your db schema (see the sketch below).
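Here is a rough sketch of that step; the helper names and clean-up rules are hypothetical, not our exact code:

const request = require('superagent');
const { parse } = require('node-html-parser');

// Hypothetical clean-up helper: collapse stray whitespace in scraped text.
const clean = text => text.replace(/\s+/g, ' ').trim();

// Piece the scraped fields together into objects shaped like your schema.
const buildCharacters = html => html
  .querySelectorAll('h2.pi-title')
  .map(node => ({ name: clean(node.rawText) }));

const scrapeCharacters = () => request
  .get('[your url here]') // replace with the page you are scraping
  .then(res => res.text)
  .then(parse)
  .then(buildCharacters);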
We are using MongoDB with Mongoose ODM (object data modeling). The Mongoose documentation is top notch, whereas the MongoDB docs could use some work. Please reference their documentation for more information. You will need MongoDB set up on your computer to follow along. This is how we set up our database.
Your schema is how you want your data to look in your database. Basically, it is a blueprint for MongoDB. Since all of our data is information about the characters of Stranger Things, we need a character schema. For each key-value pair in our data, we need to specify its type. For example, here is a character we scraped:
{
  name: 'Eleven'
}
So we need to tell Mongoose that we are expecting all character's names to be a string.
Here is a short snippet:
// See full file in ./lib/models/Character.js
const mongoose = require('mongoose');
const characterSchema = new mongoose.Schema({
  name: String,
  aliases: {
    type: [String],
    default: ['unknown']
  },
});
module.exports = mongoose.model('Character', characterSchema);
You'll notice that 'aliases' has the type [String]; this specifies that aliases is an array of strings. The default value is for characters that do not have an 'aliases' field; default is an optional setting. (There is also an optional 'required' setting, which defaults to false.)
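For instance, a field can combine these settings (a minimal sketch, not taken from our model):

const mongoose = require('mongoose');

const exampleSchema = new mongoose.Schema({
  // required defaults to false; true rejects documents missing a name
  name: { type: String, required: true },
  // default fills in a value when the scraped data has no aliases
  aliases: { type: [String], default: ['unknown'] }
});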
- You are going to need to connect your application to your database. We also want to listen for open, close, and error events on our connection. Check out our connect.js file (a sketch follows this list); we import and call our event listeners in our server.js file as require('./lib/utils/connect.js')();
- Your local db name should remain private to you. Set up a .env file and store your MONGODB_URI= line there. See our .env.example file in the root of our project (an example follows this list). Don't forget to add .env to your .gitignore file! To access your environment variables, you need to run
> npm i dotenv
and add this to the top of your server.js file:
require('dotenv').config();
- Try running your server. Remember, first you will need to run gomongo. Then run your server.js file. You should be able to see the following log:
listening on PORT ${process.env.PORT}
Connection open on mongodb: ${process.env.MONGODB_URI}
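Here is a minimal sketch of what a connect.js like ours contains (the real file lives at ./lib/utils/connect.js; the options and log wording here are approximate):

const mongoose = require('mongoose');

module.exports = () => {
  mongoose.connect(process.env.MONGODB_URI, {
    useNewUrlParser: true,
    useUnifiedTopology: true
  });

  // log connection lifecycle events
  mongoose.connection.on('open', () => {
    console.log(`Connection open on mongodb: ${process.env.MONGODB_URI}`);
  });
  mongoose.connection.on('close', () => console.log('Connection closed'));
  mongoose.connection.on('error', err => console.log('Connection error:', err));
};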
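And here is the kind of .env file we mean (the values are placeholders; use your own local db name and port):

MONGODB_URI=mongodb://localhost:27017/your_db_name
PORT=7890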
In order to seed your database, you will need:
// access to your MONGODB_URI
require('dotenv').config();
// connection to your db
require('./lib/utils/connect')();
// your scraper function
const scrapeData = require('./scrapers/infoScraper');
// your mongoose schema
const Character = require('./lib/models/Character');
We set all of this up in its own file in the root of our application.
// ./seed.js
// don't forget to close the connection when finished!
const mongoose = require('mongoose');

scrapeData()
  .then(chars => Character.create(chars))
  .finally(() => mongoose.connection.close());
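Then run node seed.js once your scraper and model are in place, and your database should be populated.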
To check out your data, we used Robo3T, a free, open-source MongoDB GUI. Check out their website for documentation on how to download and set it up on your machine.
This section requires some familiarity with Express Router.
This section will just be a summary of the functionality of each of our routes.
Hot Tip: We recommend thinking about your users and your data. What data would your users want? If you have a lot of data, consider pagination as an option. Try bouncing ideas off other devs to come up with your routes.
.get('/:id', (req, res, next) => {
  Character
    .findById(req.params.id)
    .select('-__v')
    .then(character => res.send(character))
    .catch(next);
})
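For example, using the id from the sample data above:

// GET /api/v1/characters/5e77d8d2caf0952a9c8499d9
// -> the single Eleven document shown earlier, minus the __v field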
Our route for grabbing random characters looks very similar. (One gotcha: register /random before /:id, or Express will match 'random' as an id.)
.get('/random', (req, res, next) => {
  const { count = 1 } = req.query;
  Character
    .getRandom(+count)
    .then(character => res.send(character))
    .catch(next);
})
You'll notice a custom static called 'getRandom' being used. You can create your own static methods in your model schema. Check out the docs to learn more.
characterSchema.statics.getRandom = function(count) {
  return this.aggregate([
    { $sample: { size: count } },
    { $project: { __v: false } }
  ]);
};
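Usage then looks like any other model method:

// resolves to an array of 3 random character documents (without __v)
Character.getRandom(3)
  .then(characters => console.log(characters));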
We built our get-all-characters route to handle several features at once, including pagination and all of the queries listed above. Check out the source code:
.get('/', (req, res, next) => {
  const { page = 1, perPage = 20, ...search } = req.query;
  const query = Object.entries(search)
    .reduce((query, [key, value]) => {
      query[key] = new RegExp(value, 'gmi');
      return query;
    }, {});

  Character
    .find(query)
    .skip(+perPage * (+page - 1))
    .limit(+perPage)
    .lean()
    .select('-__v')
    .then(characters => res.send(characters))
    .catch(next);
});
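To trace through it with an illustrative request:

// GET /api/v1/characters?affiliation=party&perPage=5&page=2
// req.query -> { page: '2', perPage: '5', affiliation: 'party' }
// search    -> { affiliation: 'party' }
// query     -> { affiliation: /party/gmi }  (case-insensitive partial match)
// Mongo then skips the first 5 matches and returns the next 5, minus __v.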
We decided to deploy to Heroku! Here are some resources:
Take the time to document your application, either in a README or with a front end! Provide information on your routes and what type of data users will be accessing.
Share your APIs with us on Twitter! @katerj @paigeegorry