Basic SQL Queries¶
Here are some of the exercises for which you can write SQL queries to self evaluate.
- Ensure that we have required database and user for retail data. We might provide the database as part of our labs. Here are the instructions to use
psql
for setting up the required database (if required) and tables.
psql -U postgres -h localhost -p 5432 -W
CREATE DATABASE itversity_retail_db;
CREATE USER itversity_retail_user WITH ENCRYPTED PASSWORD 'retail_password';
GRANT ALL ON DATABASE itversity_retail_db TO itversity_retail_user;
- Create Tables using the script provided. You can either use
psql
or SQL Workbench.
psql -U itversity_retail_user \
-h localhost \
-p 5432 \
-d itversity_retail_db \
-W
- You can drop the existing tables.
DROP TABLE IF EXISTS order_items CASCADE;
DROP TABLE IF EXISTS orders CASCADE;
DROP TABLE IF EXISTS customers CASCADE;
DROP TABLE IF EXISTS products CASCADE;
DROP TABLE IF EXISTS categories CASCADE;
DROP TABLE IF EXISTS departments CASCADE;
- Once the tables are dropped you can run below script to create the tables for the purpose of exercises.
\i /data/retail_db/create_db_tables_pg.sql
- Data shall be loaded using the script provided.
\i /data/retail_db/load_db_tables_pg.sql
- Run queries to validate we have data in all the 3 tables.
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
Exercise 1 – Customer order count¶
Get order count per customer for the month of 2014 January.
- Tables – orders and customers
- Data should be sorted in descending order by count and ascending order by customer id.
- Output should contain customer_id, customer_first_name, customer_last_name and customer_order_count.
In [3]:
%%sql
SELECT c.customer_id,
c.customer_fname,
c.customer_lname,
count(o.order_id) AS customer_order_count
FROM orders o JOIN customers c
ON o.order_customer_id = c.customer_id
WHERE to_char(o.order_date, 'yyyy-MM') = '2014-01'
GROUP BY c.customer_id
ORDER BY customer_order_count DESC, c.customer_id
LIMIT 10;
10 rows affected.
Out[3]:
customer_id | customer_fname | customer_lname | customer_order_count |
---|---|---|---|
8622 | Shirley | Smith | 5 |
9676 | Theresa | Smith | 5 |
7 | Melissa | Wilcox | 4 |
222 | Frank | Ruiz | 4 |
2444 | Kenneth | Smith | 4 |
2485 | Mary | Hernandez | 4 |
2555 | Mary | Long | 4 |
3128 | Karen | Turner | 4 |
3199 | Ashley | Hernandez | 4 |
3610 | Jordan | Smith | 4 |
Exercise 2 – Dormant Customers¶
Get the customer details who have not placed any order for the month of 2014 January.
- Tables – orders and customers
- Data should be sorted in ascending order by customer_id
- Output should contain all the fields from customers
In [7]:
%%sql
SELECT count(1) FROM customers
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 1 rows affected.
Out[7]:
count |
---|
12435 |
In [15]:
%%sql
select c.*
FROM customers c LEFT OUTER JOIN
(SELECT DISTINCT order_customer_id
FROM orders WHERE
to_char(order_date, 'yyyy-MM') = '2014-01'
) o
ON c.customer_id = o.order_customer_id
WHERE o.order_customer_id IS NULL
ORDER BY c.customer_id
LIMIT 10
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 10 rows affected.
Out[15]:
customer_id | customer_fname | customer_lname | customer_email | customer_password | customer_street | customer_city | customer_state | customer_zipcode |
---|---|---|---|---|---|---|---|---|
1 | Richard | Hernandez | XXXXXXXXX | XXXXXXXXX | 6303 Heather Plaza | Brownsville | TX | 78521 |
2 | Mary | Barrett | XXXXXXXXX | XXXXXXXXX | 9526 Noble Embers Ridge | Littleton | CO | 80126 |
3 | Ann | Smith | XXXXXXXXX | XXXXXXXXX | 3422 Blue Pioneer Bend | Caguas | PR | 00725 |
4 | Mary | Jones | XXXXXXXXX | XXXXXXXXX | 8324 Little Common | San Marcos | CA | 92069 |
5 | Robert | Hudson | XXXXXXXXX | XXXXXXXXX | 10 Crystal River Mall | Caguas | PR | 00725 |
6 | Mary | Smith | XXXXXXXXX | XXXXXXXXX | 3151 Sleepy Quail Promenade | Passaic | NJ | 07055 |
9 | Mary | Perez | XXXXXXXXX | XXXXXXXXX | 3616 Quaking Street | Caguas | PR | 00725 |
10 | Melissa | Smith | XXXXXXXXX | XXXXXXXXX | 8598 Harvest Beacon Plaza | Stafford | VA | 22554 |
11 | Mary | Huffman | XXXXXXXXX | XXXXXXXXX | 3169 Stony Woods | Caguas | PR | 00725 |
12 | Christopher | Smith | XXXXXXXXX | XXXXXXXXX | 5594 Jagged Embers By-pass | San Antonio | TX | 78227 |
In [14]:
%%sql
select count(1)
FROM customers c LEFT OUTER JOIN
(SELECT DISTINCT order_customer_id
FROM orders WHERE
to_char(order_date, 'yyyy-MM') = '2014-01'
) o
ON c.customer_id = o.order_customer_id
WHERE o.order_customer_id IS NULL
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 1 rows affected.
Out[14]:
count |
---|
7739 |
In [12]:
%%sql
SELECT c.*
FROM customers c LEFT OUTER JOIN ORDERS o
ON c.customer_id = o.order_customer_id
AND to_char(o.order_date, 'yyyy-MM') = '2014-01'
WHERE o.order_customer_id IS NULL
LIMIT 10;
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 10 rows affected.
Out[12]:
customer_id | customer_fname | customer_lname | customer_email | customer_password | customer_street | customer_city | customer_state | customer_zipcode |
---|---|---|---|---|---|---|---|---|
1 | Richard | Hernandez | XXXXXXXXX | XXXXXXXXX | 6303 Heather Plaza | Brownsville | TX | 78521 |
2 | Mary | Barrett | XXXXXXXXX | XXXXXXXXX | 9526 Noble Embers Ridge | Littleton | CO | 80126 |
3 | Ann | Smith | XXXXXXXXX | XXXXXXXXX | 3422 Blue Pioneer Bend | Caguas | PR | 00725 |
4 | Mary | Jones | XXXXXXXXX | XXXXXXXXX | 8324 Little Common | San Marcos | CA | 92069 |
5 | Robert | Hudson | XXXXXXXXX | XXXXXXXXX | 10 Crystal River Mall | Caguas | PR | 00725 |
6 | Mary | Smith | XXXXXXXXX | XXXXXXXXX | 3151 Sleepy Quail Promenade | Passaic | NJ | 07055 |
9 | Mary | Perez | XXXXXXXXX | XXXXXXXXX | 3616 Quaking Street | Caguas | PR | 00725 |
10 | Melissa | Smith | XXXXXXXXX | XXXXXXXXX | 8598 Harvest Beacon Plaza | Stafford | VA | 22554 |
11 | Mary | Huffman | XXXXXXXXX | XXXXXXXXX | 3169 Stony Woods | Caguas | PR | 00725 |
12 | Christopher | Smith | XXXXXXXXX | XXXXXXXXX | 5594 Jagged Embers By-pass | San Antonio | TX | 78227 |
In [13]:
%%sql
SELECT count(1)
FROM customers c LEFT OUTER JOIN ORDERS o
ON c.customer_id = o.order_customer_id
AND to_char(o.order_date, 'yyyy-MM') = '2014-01'
WHERE o.order_customer_id IS NULL
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 1 rows affected.
Out[13]:
count |
---|
7739 |
Exercise 3 – Revenue Per Customer¶
Get the revenue generated by each customer for the month of 2014 January
- Tables – orders, order_items and customers
- Data should be sorted in descending order by revenue and then ascending order by customer_id
- Output should contain customer_id, customer_first_name, customer_last_name, customer_revenue.
- If there are no orders placed by customer, then the corresponding revenue for a given customer should be 0.
- Consider only COMPLETE and CLOSED orders
Solution – Revenue Per Customer¶
In [17]:
%%sql
SELECT c.customer_id, c.customer_fname, c.customer_lname,
CASE
WHEN SUM(oi.order_item_subtotal::numeric) IS NULL
THEN 0
ELSE SUM(oi.order_item_subtotal::numeric)
END AS customer_revenue
FROM customers c
LEFT OUTER JOIN orders o
ON c.customer_id = o.order_customer_id
LEFT OUTER JOIN order_items oi
ON o.order_id = oi.order_item_order_id
WHERE o.order_status IN ('COMPLETE', 'CLOSED')
AND to_char(o.order_date, 'yyyy-MM') = '2014-01'
GROUP BY c.customer_id
ORDER BY customer_revenue DESC, c.customer_id
LIMIT 10;
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 10 rows affected.
Out[17]:
customer_id | customer_fname | customer_lname | customer_revenue |
---|---|---|---|
2555 | Mary | Long | 2954.63 |
3465 | Mary | Gardner | 2929.74 |
3710 | Ashley | Smith | 2739.82 |
1780 | Larry | Sharp | 2689.65 |
986 | Catherine | Hawkins | 2629.90 |
9676 | Theresa | Smith | 2599.84 |
1847 | Mary | Smith | 2589.87 |
11901 | Mary | Smith | 2469.87 |
4618 | Andrea | Smith | 2429.82 |
10896 | Victoria | Smith | 2419.78 |
Exercise 4 – Revenue Per Category¶
Get the revenue generated for each category for the month of 2014 January
- Tables – orders, order_items, products and categories
- Data should be sorted in ascending order by category_id.
- Output should contain all the fields from category along with the revenue as category_revenue.
- Consider only COMPLETE and CLOSED orders
In [19]:
%%sql
SELECT c.*,
round(sum(oi.order_item_subtotal)::numeric, 2) AS category_revenue
FROM categories c JOIN products p
ON c.category_id = p.product_category_id
JOIN order_items oi
ON p.product_id = oi.order_item_product_id
JOIN orders o
ON oi.order_item_order_id = o.order_id
WHERE to_char(o.order_date, 'yyyy-MM') = '2014-01'
AND o.order_status IN ('COMPLETE', 'CLOSED')
GROUP BY c.category_id
ORDER BY c.category_id
LIMIT 10;
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 10 rows affected.
Out[19]:
category_id | category_department_id | category_name | category_revenue |
---|---|---|---|
2 | 2 | Soccer | 1094.88 |
3 | 2 | Baseball & Softball | 3214.41 |
4 | 2 | Basketball | 1299.98 |
5 | 2 | Lacrosse | 1299.69 |
6 | 2 | Tennis & Racquet | 1124.75 |
7 | 2 | Hockey | 1433.00 |
9 | 3 | Cardio Equipment | 133156.77 |
10 | 3 | Strength Training | 3388.96 |
11 | 3 | Fitness Accessories | 1509.73 |
12 | 3 | Boxing & MMA | 3998.46 |
Exercise 5 – Product Count Per Department¶
Get the products for each department.
- Tables – departments, categories, products
- Data should be sorted in ascending order by department_id
- Output should contain all the fields from department and the product count as product_count
In [21]:
%%sql
SELECT d.*,
count(p.product_id) AS product_count
FROM departments d
JOIN categories c
ON d.department_id = c.category_department_id
JOIN products p
ON c.category_id = p.product_category_id
GROUP BY d.department_id
ORDER BY d.department_id;
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 6 rows affected.
Out[21]:
department_id | department_name | product_count |
---|---|---|
2 | Fitness | 168 |
3 | Footwear | 168 |
4 | Apparel | 140 |
5 | Golf | 120 |
6 | Outdoors | 336 |
7 | Fan Shop | 149 |