Skip to content

Latest commit

 

History

History
269 lines (215 loc) · 10.6 KB

File metadata and controls

269 lines (215 loc) · 10.6 KB

Database views

-> View : result set of a query

Tables vs. views

Views have been described as "virtual tables". It's true that views are similar to tables in certain aspects, but there are key differences. In this exercise, you will organize these differences and similarities.

Instructions

Organize these characteristics into the category that they describe best.

  • Only Tables:
    • Part of the physical schema of a database.
  • Views & Tables
    • Contains rows and columns
    • Can be queried
    • Has access control
  • Only Views
    • Always defined by a query
    • Takes up less memory

Viewing views

Because views are very useful, it's common to end up with many of them in your database. It's important to keep track of them so that database users know what is available to them.

The goal of this exercise is to get familiar with viewing views within a database and interpreting their purpose. This is a skill needed when writing database documentation or organizing views.

Instructions 1/3

  • Query the information schema to get views.
  • Exclude system views in the results.
-- Get all non-systems views
SELECT * FROM information_schema.views
WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

Instructions 2/3

Question

What does view1 do? Possible Answers

  • Returns the content records with reviewids that have been viewed more than 4000 times.
  • Returns the content records that have reviews of more than 4000 characters.
  • Returns the first 4000 records in content. Instructions 3/3

Question

What does view2 do? Possible Answers

  • Returns 10 random reviews published in 2017.
  • Returns the top 10 lowest scored reviews published in 2017.
  • Returns the top 10 highest scored reviews published in 2017.

Creating and querying a view

Have you ever found yourself running the same query over and over again? Maybe, you used to keep a text copy of the query in your desktop notes app, but that was all before you knew about views!

In these Pitchfork reviews, we're particularly interested in high-scoring reviews and if there's a common thread between the works that get high scores. In this exercise, you'll make a view to help with this analysis so that we don't have to type out the same query often to get these high-scoring reviews.

Instructions 1/2

  • Create a view called high_scores that holds reviews with scores above a 9.
-- Create a view for reviews with a score above 9
CREATE VIEW high_scores AS
SELECT * FROM reviews
WHERE score > 9;

Instructions 2/2

  • Count the number of records in high_scores that are self-released in the label field of the labels table.
-- Count the number of self-released works in high_scores
SELECT COUNT(*) FROM high_scores
INNER JOIN labels AS l ON high_scores.reviewid = l.reviewid
WHERE l.label = 'self-released';

Managing views

#granting or revoking access to a view

GRANT privilege(s)orREVOKE privilege(s)

ON object

TO roleorFROM role

  • Privileges :
    • SELECT
    • INSERT
    • UPDATE
    • DELETE
  • Objects :
    • table, view ,schema ,etc
  • Roles :
    • db user or group of db users

#examples

GRANT UPDATE ON ratings TO PUBLIC;

REVOKE INSERT ON films FROM db_user;
--Updating a view eg:
UPDATE films set kind = 'Dramatic' WHERE kind = 'Drama';

--Inserting into a view eg:
INSERT INTO films (code, title, did, date_prod, kind)
   VALUES ('T601', 'Yojimbo', 106, '1961-06-16', *Drama');
   
--Dropping a view eg:
DROP VIEW view_name [ CASCADE| RESTRICT ];

--Redifining a view eg:
CREATE OR REPLACE VIEW viewname AS new_query;

--Altering a view eg:
ALTER VIEW [ IF EXISTS J name ALTER [ COLUMN ] column_name SET DEFAULT expression
ALTER VIEW [ IF EXISTS ] name ALTER [ COLUMN ] column_name DROP DEFAULT
ALTER VIEW [ IF EXISTS ] name owner T0 new_owner
ALTER VIEW [ IF EXISTS J name RENAME TO new name
ALTER VIEW [ IF EXISTS ] name SET SCHEMA new_schema

Creating a view from other views

Views can be created from queries that include other views. This is useful when you have a complex schema, potentially due to normalization, because it helps reduce the JOINS needed. The biggest concern is keeping track of dependencies, specifically how any modifying or dropping of a view may affect other views.

In the next few exercises, we'll continue using the Pitchfork reviews data. There are two views of interest in this exercise. top_15_2017 holds the top 15 highest scored reviews published in 2017 with columns reviewid,title, and score. artist_title returns a list of all reviewed titles and their respective artists with columns reviewid, title, and artist. From these views, we want to create a new view that gets the highest scoring artists of 2017.

Instructions 1/2

  • Create a view called top_artists_2017 with artist from artist_title.
  • To only return the highest scoring artists of 2017, join the views top_15_2017 and artist_title on reviewid.
  • Output top_artists_2017
-- Create a view with the top artists in 2017
CREATE VIEW top_artists_2017 AS
-- with only one column holding the artist field
SELECT at.artist FROM artist_title AS at
INNER JOIN top_15_2017 AS top15
ON at.reviewid = top15.reviewid;

-- Output the new view
SELECT * FROM top_artists_2017;

Instructions 2/2

Question

Which is the DROP command that would drop both top_15_2017 and top_artists_2017? Possible Answers

  • DROP VIEW top_15_2017 CASCADE;
  • DROP VIEW top_15_2017 RESTRICT;
  • DROP VIEW top_artists_2017 RESTRICT;
  • DROP VIEW top_artists_2017 CASCADE;

Granting and revoking access

Access control is a key aspect of database management. Not all database users have the same needs and goals, from analysts, clerks, data scientists, to data engineers. As a general rule of thumb, write access should never be the default and only be given when necessary.

In the case of our Pitchfork reviews, we don't want all database users to be able to write into the long_reviews view. Instead, the editor should be the only user able to edit this view.

Instructions

  • Revoke all database users' update and insert privileges on the long_reviews view.
  • Grant the editor user update and insert privileges on the long_reviews view.
-- Revoke everyone's update and insert privileges
REVOKE UPDATE, INSERT ON long_reviews FROM PUBLIC; 

-- Grant the editor update and insert privileges 
GRANT UPDATE, INSERT ON long_reviews TO editor; 

Updatable views

In a previous exercise, we've used the information_schema.views to get all the views in a database. If you take a closer look at this table, you will notice a column that indicates whether the view is updatable.

Which views are updatable?

SELECT * FROM information_schema.views
WHERE is_updatable='YES';

Instructions

  • long_reviews and top_25_2017
  • top_25_2017
  • long_reviews
  • top_25_2017 and artist_title

Redefining a view

Unlike inserting and updating, redefining a view doesn't mean modifying the actual data a view holds. Rather, it means modifying the underlying query that makes the view. In the last video, we learned of two ways to redefine a view: (1) CREATE OR REPLACE and (2) DROP then CREATE. CREATE OR REPLACE can only be used under certain conditions.

The artist_title view needs to be appended to include a column for the label field from the labels table.

Instructions 1/2

Question

Can the CREATE OR REPLACE statement be used to redefine the artist_title view?

  • Yes, as long as the label column comes at the end.

  • No, because the new query requires a JOIN with the labels table.

  • No, because a new column that did not exist previously is being added to the view.

  • Yes, as long as the label column has the same data type as the other columns in artist_title Instructions 2/2

  • Use CREATE OR REPLACE to redefine the artist_title view.

  • Respecting artist_title's original columns of reviewid, title, and artist, add a label column from the labels table.

  • Join the labels table using the reviewid field.

-- Redefine the artist_title view to have a label column
CREATE OR REPLACE VIEW artist_title AS
SELECT reviews.reviewid, reviews.title, artists.artist, labels.label
FROM reviews
INNER JOIN artists
ON artists.reviewid = reviews.reviewid
INNER JOIN labels
ON reviews.reviewid = labels.reviewid;

SELECT * FROM artist_title;

#Materialized views

materialized-views stores the query results not the query

#examples:

-- Implementing materialized views
CREATE MATERIALIZED VIEW my_mv AS SELECT FROM existing_table ;

REFRESH MATERIALIZED VIEW my_mv;

Materialized versus non-materialized

Materialized and non-materialized are two distinct categories of views. In this exercise, you will organize their differences and similarities.

Instructions

  • organize these characteristics into the category that they describe best.

  • Non-Materialized Views :

    • Always returns up-to-date data
    • Helps reduce the overhead of writing querles
  • Both :

    • Stores the query resut on disk
    • Better to use on write-ntenslve databases
  • Materialized Views :

    • Can be used In a data warehouse
    • Consumes more storage

Creating and refreshing a materialized view

The syntax for creating materialized and non-materialized views are quite similar because they are both defined by a query. One key difference is that we can refresh materialized views, while no such concept exists for non-materialized views. It's important to know how to refresh a materialized view, otherwise the view will remain a snapshot of the time the view was created.

In this exercise, you will create a materialized view from the table genres. A new record will then be inserted into genres. To make sure the view has the latest data, it will have to be refreshed.

Instructions

  • Create a materialized view called genre_count that holds the number of reviews for each genre.
  • Refresh genre_count so that the view is up-to-date.
-- Create a materialized view called genre_count 
CREATE MATERIALIZED VIEW genre_count AS
SELECT genre, COUNT(*) 
FROM genres
GROUP BY genre;

INSERT INTO genres
VALUES (50000, 'classical');

-- Refresh genre_count
REFRESH MATERIALIZED VIEW genre_count;

SELECT * FROM genre_count;

Managing materialized views

Why do companies use pipeline schedulers, such as Airflow and Luigi, to manage materialized views?

Possible Answers

  • To set up a data warehouse and make sure tables have the most up-to-date data.
  • To refresh materialized views with consideration to dependences between views.
  • To convert non-materialized views to materialized views.
  • To prevent the creation of new materialized views when there are too many dependencies.