Using this as a space to log and organize my SQL practice using the website DataLemur. Hope you can learn something from this space as well!
Being a data analyst means keeping my skills sharp. Herein, I log my SQL queries that I've been practicing while going through the exercises on DataLemur.com, a relatively new website where anyone can practice their SQL using questions from the prominent book "How to Ace the Data Science Interview". Some of these entries will have a accompanying LinkedIn posts and commentary, and I hope you follow those links and learn from the community commenting on those posts as you read along. Given this site has 50-70 questions and that I am doing these at my leisure, this repository will be updated as I go through them, ie. will be updated over time at my leisure.
Thanks for visiting my portfolio!
The questions represented herein are from "How to Ace the Data Science Interview". The reproduction of questions here is not a statement of ownership, creation, or done for or with the intent for profit. This is an educational exercise intended to demonstrate my commitment to learning more about the tools used in the analytics industry. The full question along with data dictionaries can be located on the Datalemur site and have not been consistently, if at all reproduced, here for the author's convenience.
If you have any questions or concerns feel free to get in touch with me on LinkedIn.
- Question Title | Difficulty | Company Question is From | Date Completed
- Question
- Code Answer
- Number of Tries
- Lessons learned, new skills practiced, and notable failures
- LinkedIn posts about respective question, if applicable
- Well Paid Employees | Easy | FAANG | 10/15/2024
- Discontinued Question due to Paywall - LinkedIn Power Creators (Part 1) | Easy | LinkedIn | 11/10/2022
- Laptop Vs Mobile Viewership | Easy | New York Times | 10/17/2022
- Data Science Skills | Easy | LinkedIn | 10/12/2022
- Pages with No Likes | Easy | Facebook | 9/13/2022
- Cities with Completed Trades | Easy | Robinhood | 9/12/2022
Note: Questions ordered by most recent date.
- Question: Companies often perform salary analyses to ensure fair compensation practices. One useful analysis is to check if there are any employees earning more than their direct managers. As a HR Analyst, you're asked to identify all employees who earn more than their direct managers. The result should include the employee's ID and name.
- Number of Tries: 2
- Lessons Learned: Paying attention to the fields neccessary is the first lesson here (see the commented out salary field which was not required in my first attempt), but the second lesson here is in the join. The join here is doing most of the work because it effectively lines up the employee and manager pay for comparison. This one is simple, but it is still very important to the question.
- LinkedIn Post: Not yet posted
Discontinued Question due to Paywall - LinkedIn Power Creators (Part 1) | Easy | LinkedIn | STARTED: 11/10/2022
- Question: Write a query to return the IDs of these LinkedIn power creators ordered by the IDs. If someone's LinkedIn page has more followers than the company they work for, we can safely assume that person is a power creator.
-- Write a query to return the IDs of these
-- LinkedIn power creators ordered by the IDs.
SELECT
profile_id, -- ID
personal_profiles.followers AS Personal_Followers, -- Persoanl followers count
company_pages.followers AS Company_Followers -- Company followers count
FROM personal_profiles
JOIN company_pages ON company_pages.company_id = personal_profiles.employer_ID -- Pulling together the two datasets of interest
GROUP BY personal_profiles.profile_id, personal_profiles.followers, company_pages.followers
HAVING personal_profiles.followers > company_pages.followers; -- Comparing personal and company followers
- Number of Tries:
- Lessons Learned:
- LinkedIn Post: N/A
- Question: Assume that you are given a table containing information on viewership by device type (where the three types are laptop, tablet, and phone). Define “mobile” as the sum of tablet and phone viewership numbers. Write a query to compare the viewership on laptops versus mobile devices. Output the total viewership for laptop and mobile devices in the format of "laptop_views" and "mobile_views".
- Number of Tries: 1
- Lessons Learned: Building a plan for your query is essential. This is one of the more difficult queries I've written using CASE statements, but the odd thing is that I felt really comfortable writing it! Took a few tries to get the syntax right, but I really enjoyed writing this.
- LinkedIn Post: Here
- Question: Given a table of candidates and their skills, you're tasked with finding the candidates best suited for an open Data Science job. You want to find candidates who are proficient in Python, Tableau, and PostgreSQL. Write a SQL query to list the candidates who possess all of the required skills for the job. Sort the the output by candidate ID in ascending order.
- Number of Tries: 2
- Lessons Learned: I learned about and practiced using the IN function with the WHERE subclause. The IN function is fairly new to me.
- LinkedIn Post: Here
- Question: Assume you are given the tables below about Facebook pages and page likes. Write a query to return the page IDs of all the Facebook pages that don't have any likes. The output should be in ascending order.
-- Code Answer Submitted
SELECT
pages.page_id
FROM pages
LEFT JOIN page_likes
ON pages.page_id = page_likes.page_id
GROUP{ BY pages.page_id
HAVING Count(page_likes.page_id) < 1;
- Number of Tries: 2
- Lessons Learned: HAVING vs WHERE
“A HAVING clause is like a WHERE clause, but applies only to groups as a whole (that is, to the rows in the result set representing groups), whereas the WHERE clause applies to individual rows. A query can contain both a WHERE clause and a HAVING clause.” 🔗 Microsoft Documentation Link 🔗
This one was more challenging for me since I am generally used to painting in broad strokes. Both clauses allow you to filter the broader dataset. The trick-for me at least is that they do the same thing BUT UNDER DIFFERENT circumstances! I tried to answer the question with both because I refused to accept the difference, but ultimately landed back where I started, ie. the difference between these functions is substantive and they are not interchangeable.
- LinkedIn Post: Here
- Question: You are given the tables below containing information on Robinhood trades and users. Write a query to list the top three cities that have the most completed trade orders in descending order.
-- Code Answer Submitted
SELECT
users.city,
COUNT(order_id) AS total_orders
FROM users
JOIN trades
ON users.user_id = trades.user_id
WHERE status = 'Completed'
GROUP BY city
ORDER BY total_orders DESC
LIMIT 3
- Number of Tries: 1
- Lessons Learned: SQL has an order of operations - I tend to familiarize myself with the data and then draft a query plan before testing code. Finally, I’ll fill in pieces as I go. But, this question required so many components that I had to double check my basics and I realized that the order of operations was far from a locked in skill for me.
- LinkedIn Post: Here
Template
- Question:
- Number of Tries:
- Lessons Learned:
- LinkedIn Post: Here
- Answer a question on Datalemur.com
- Copy the template from the appendix to the top of the questions answered section.
- Fill in the spaces of the template in the new entry as you go or after the question is answered on the platform.
- Create an image folder titled with the name of the question.
- Place whatever screenshots or code shots made using Carbon into the image folder to fill in the image space. This space can be purely functional (ie. easy screenshot and go) or a place for creativity (ie. using Carbon to create a beautiful representation of SQL). Pick based on time, energy, and taste.
- Update the Table of Contents to ensure question responses are easy to navigate.
- Do your best to talk about the work on LinkedIn or Medium.