This project provides a Python script that uses the psycopg2 library to query the PostgreSQL database of a mock news website. The script produces a report that answers the following three questions:
- What are the three most popular articles of all time?
- Who are the most popular article authors of all time?
- On which days did more than 1% of requests lead to errors?
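
To make the third question concrete, here is a minimal sketch of the 1% threshold in plain Python. The helper name `error_days` and the sample counts are hypothetical and only illustrate the arithmetic; the script itself answers the question with SQL against the database.

```python
def error_days(daily_counts, threshold=0.01):
    """Return (day, error_rate) pairs whose error rate exceeds threshold.

    daily_counts maps a day label to (error_requests, total_requests).
    """
    result = []
    for day, (errors, total) in sorted(daily_counts.items()):
        rate = errors / float(total)  # float() avoids integer division on Python 2
        if rate > threshold:
            result.append((day, rate))
    return result


# Hypothetical counts, not taken from the real database.
sample = {
    "2000-01-01": (50, 10000),   # 0.5% of requests failed
    "2000-01-02": (250, 10000),  # 2.5% of requests failed
}
flagged = error_days(sample)
print(flagged)
```

Only the second day is reported, because 250 errors out of 10000 requests is above the 1% threshold while 50 out of 10000 is not.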
The script prints the answers to the terminal. `exampleOutput.txt` is a plain-text copy of what the program printed.
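
As a rough sketch of how a report like this can be produced with psycopg2, the code below runs one query and formats the rows. The table and column names (`articles.title`, `articles.slug`, `log.path`), the query text, and the helpers `fetch_rows` and `format_report` are assumptions for illustration and may differ from what `logAnalysis.py` actually does.

```python
# Assumed schema: articles(title, slug) and log(path), where log.path
# looks like '/article/<slug>'. Check this against your own database.
TOP_ARTICLES_SQL = """
    SELECT articles.title, count(*) AS views
    FROM articles
    JOIN log ON log.path = '/article/' || articles.slug
    GROUP BY articles.title
    ORDER BY views DESC
    LIMIT 3;
"""


def fetch_rows(sql, dbname="news"):
    """Run one query against the database and return all rows."""
    import psycopg2  # imported lazily so format_report works without it

    conn = psycopg2.connect(dbname=dbname)
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.close()


def format_report(heading, rows, unit):
    """Turn (name, count) rows into a small plain-text report section."""
    lines = [heading]
    for name, count in rows:
        lines.append('  "{}" - {} {}'.format(name, count, unit))
    return "\n".join(lines)
```

Inside the virtual machine, with the database loaded, this could be used as `print(format_report("Most popular articles", fetch_rows(TOP_ARTICLES_SQL), "views"))`.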
- Download and unzip this project into the working directory.
- In a terminal, run `vagrant init` in the working directory.
- Go to the linked page, copy the text, and save it as `Vagrantfile`.
- Replace the old `Vagrantfile` in your working directory with the new `Vagrantfile`.
- Run `vagrant up` in the terminal to start the virtual machine. The first boot takes a while.
- Once the virtual machine has finished booting, run `vagrant ssh` in the terminal to log in to it.
- Download and unzip `newsdata.sql` into your working directory.
- Run `cd /vagrant` to navigate to the Vagrant shared directory.
- To load the database, run `psql -d news -f newsdata.sql`.
To keep the SQL queries simple, I wrote a script, `create_views.sql`, that creates three views (a view in SQL is a named query that can be reused, much like a variable in Python). Run `psql -d news -f create_views.sql` in the terminal.
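
The idea of a view as a reusable name can be tried without PostgreSQL. The sketch below uses Python's built-in `sqlite3` module as a stand-in, with an invented `log` table and a hypothetical view called `path_views`; the three views in `create_views.sql` are not reproduced here.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log (path TEXT, status TEXT);
    INSERT INTO log VALUES ('/a', '200 OK');
    INSERT INTO log VALUES ('/a', '200 OK');
    INSERT INTO log VALUES ('/b', '404 NOT FOUND');

    -- The view names a query once so later queries can reuse it.
    CREATE VIEW path_views AS
        SELECT path, count(*) AS views FROM log GROUP BY path;
""")

# The final query stays short because the grouping lives in the view.
top = conn.execute(
    "SELECT path, views FROM path_views ORDER BY views DESC LIMIT 1"
).fetchall()
print(top)
conn.close()
```

The last query reads from `path_views` as if it were a table, which is what keeps the report queries short.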
The script `logAnalysis.py` runs under both Python 2 and Python 3. Run `python logAnalysis.py` or `python3 logAnalysis.py` to generate the report.
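
For readers wondering what running on both versions involves, here is a small sketch of two common habits: calling `print` with a single parenthesized argument and forcing float division. It shows the general technique with a hypothetical helper, `format_percent`, and is not necessarily how `logAnalysis.py` handles compatibility.

```python
import sys


def format_percent(errors, total):
    """Format an error rate as a percentage with two decimals."""
    # Multiplying by 100.0 first forces float division on Python 2,
    # where 25 / 1000 would otherwise evaluate to 0.
    rate = 100.0 * errors / total
    return "{:.2f}% errors".format(rate)


# print with one parenthesized argument behaves the same on Python 2 and 3.
print("Running on Python {}".format(sys.version_info[0]))
print(format_percent(25, 1000))
```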