Solutions – Writing Basic SQL Queries using Postgresql

Basic SQL Queries

Here are some of the exercises for which you can write SQL queries to self evaluate.

  • Ensure that we have required database and user for retail data. We might provide the database as part of our labs. Here are the instructions to use psql for setting up the required database (if required) and tables.
psql -U postgres -h localhost -p 5432 -W
CREATE DATABASE itversity_retail_db;
CREATE USER itversity_retail_user WITH ENCRYPTED PASSWORD 'retail_password';
GRANT ALL ON DATABASE itversity_retail_db TO itversity_retail_user;
  • Create Tables using the script provided. You can either use psql or SQL Workbench.
psql -U itversity_retail_user \
  -h localhost \
  -p 5432 \
  -d itversity_retail_db \
  -W
  • You can drop the existing tables.
DROP TABLE IF EXISTS order_items CASCADE;
DROP TABLE IF EXISTS orders CASCADE;
DROP TABLE IF EXISTS customers CASCADE;
DROP TABLE IF EXISTS products CASCADE;
DROP TABLE IF EXISTS categories CASCADE;
DROP TABLE IF EXISTS departments CASCADE;
  • Once the tables are dropped you can run below script to create the tables for the purpose of exercises.
\i /data/retail_db/create_db_tables_pg.sql
  • Data shall be loaded using the script provided.
\i /data/retail_db/load_db_tables_pg.sql
  • Run queries to validate we have data in all the 3 tables.
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db

Exercise 1 – Customer order count

Get order count per customer for the month of 2014 January.

  • Tables – orders and customers
  • Data should be sorted in descending order by count and ascending order by customer id.
  • Output should contain customer_id, customer_first_name, customer_last_name and customer_order_count.
In [3]:
%%sql

SELECT c.customer_id, 
    c.customer_fname, 
    c.customer_lname,
    count(o.order_id) AS customer_order_count 
FROM orders o JOIN customers c 
    ON o.order_customer_id = c.customer_id
WHERE to_char(o.order_date, 'yyyy-MM') = '2014-01'
GROUP BY c.customer_id
ORDER BY customer_order_count DESC, c.customer_id
LIMIT 10;
10 rows affected.
Out[3]:
customer_id customer_fname customer_lname customer_order_count
8622 Shirley Smith 5
9676 Theresa Smith 5
7 Melissa Wilcox 4
222 Frank Ruiz 4
2444 Kenneth Smith 4
2485 Mary Hernandez 4
2555 Mary Long 4
3128 Karen Turner 4
3199 Ashley Hernandez 4
3610 Jordan Smith 4

Exercise 2 – Dormant Customers

Get the customer details who have not placed any order for the month of 2014 January.

  • Tables – orders and customers
  • Data should be sorted in ascending order by customer_id
  • Output should contain all the fields from customers
In [7]:
%%sql

SELECT count(1) FROM customers
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[7]:
count
12435
In [15]:
%%sql

select c.* 
FROM customers c LEFT OUTER JOIN 
    (SELECT DISTINCT order_customer_id 
     FROM orders WHERE
     to_char(order_date, 'yyyy-MM') = '2014-01'
    ) o
    ON c.customer_id = o.order_customer_id
WHERE  o.order_customer_id IS NULL 
ORDER BY c.customer_id
LIMIT 10
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
10 rows affected.
Out[15]:
customer_id customer_fname customer_lname customer_email customer_password customer_street customer_city customer_state customer_zipcode
1 Richard Hernandez XXXXXXXXX XXXXXXXXX 6303 Heather Plaza Brownsville TX 78521
2 Mary Barrett XXXXXXXXX XXXXXXXXX 9526 Noble Embers Ridge Littleton CO 80126
3 Ann Smith XXXXXXXXX XXXXXXXXX 3422 Blue Pioneer Bend Caguas PR 00725
4 Mary Jones XXXXXXXXX XXXXXXXXX 8324 Little Common San Marcos CA 92069
5 Robert Hudson XXXXXXXXX XXXXXXXXX 10 Crystal River Mall Caguas PR 00725
6 Mary Smith XXXXXXXXX XXXXXXXXX 3151 Sleepy Quail Promenade Passaic NJ 07055
9 Mary Perez XXXXXXXXX XXXXXXXXX 3616 Quaking Street Caguas PR 00725
10 Melissa Smith XXXXXXXXX XXXXXXXXX 8598 Harvest Beacon Plaza Stafford VA 22554
11 Mary Huffman XXXXXXXXX XXXXXXXXX 3169 Stony Woods Caguas PR 00725
12 Christopher Smith XXXXXXXXX XXXXXXXXX 5594 Jagged Embers By-pass San Antonio TX 78227
In [14]:
%%sql

select count(1)
FROM customers c LEFT OUTER JOIN 
    (SELECT DISTINCT order_customer_id 
     FROM orders WHERE
     to_char(order_date, 'yyyy-MM') = '2014-01'
    ) o
    ON c.customer_id = o.order_customer_id
WHERE  o.order_customer_id IS NULL 
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[14]:
count
7739
In [12]:
%%sql

SELECT c.*
FROM customers c LEFT OUTER JOIN ORDERS o
    ON c.customer_id = o.order_customer_id
        AND to_char(o.order_date, 'yyyy-MM') = '2014-01'
WHERE o.order_customer_id IS NULL
LIMIT 10;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
10 rows affected.
Out[12]:
customer_id customer_fname customer_lname customer_email customer_password customer_street customer_city customer_state customer_zipcode
1 Richard Hernandez XXXXXXXXX XXXXXXXXX 6303 Heather Plaza Brownsville TX 78521
2 Mary Barrett XXXXXXXXX XXXXXXXXX 9526 Noble Embers Ridge Littleton CO 80126
3 Ann Smith XXXXXXXXX XXXXXXXXX 3422 Blue Pioneer Bend Caguas PR 00725
4 Mary Jones XXXXXXXXX XXXXXXXXX 8324 Little Common San Marcos CA 92069
5 Robert Hudson XXXXXXXXX XXXXXXXXX 10 Crystal River Mall Caguas PR 00725
6 Mary Smith XXXXXXXXX XXXXXXXXX 3151 Sleepy Quail Promenade Passaic NJ 07055
9 Mary Perez XXXXXXXXX XXXXXXXXX 3616 Quaking Street Caguas PR 00725
10 Melissa Smith XXXXXXXXX XXXXXXXXX 8598 Harvest Beacon Plaza Stafford VA 22554
11 Mary Huffman XXXXXXXXX XXXXXXXXX 3169 Stony Woods Caguas PR 00725
12 Christopher Smith XXXXXXXXX XXXXXXXXX 5594 Jagged Embers By-pass San Antonio TX 78227
In [13]:
%%sql

SELECT count(1)
FROM customers c LEFT OUTER JOIN ORDERS o
    ON c.customer_id = o.order_customer_id
        AND to_char(o.order_date, 'yyyy-MM') = '2014-01'
WHERE o.order_customer_id IS NULL
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[13]:
count
7739

Exercise 3 – Revenue Per Customer

Get the revenue generated by each customer for the month of 2014 January

  • Tables – orders, order_items and customers
  • Data should be sorted in descending order by revenue and then ascending order by customer_id
  • Output should contain customer_id, customer_first_name, customer_last_name, customer_revenue.
  • If there are no orders placed by customer, then the corresponding revenue for a given customer should be 0.
  • Consider only COMPLETE and CLOSED orders

Solution – Revenue Per Customer

In [17]:
%%sql

SELECT c.customer_id, c.customer_fname, c.customer_lname, 
    CASE 
        WHEN SUM(oi.order_item_subtotal::numeric) IS NULL 
            THEN 0 
        ELSE SUM(oi.order_item_subtotal::numeric)
    END AS customer_revenue 
FROM customers c 
    LEFT OUTER JOIN orders o 
        ON c.customer_id = o.order_customer_id 
    LEFT OUTER JOIN order_items oi 
        ON o.order_id = oi.order_item_order_id
WHERE o.order_status IN ('COMPLETE', 'CLOSED') 
    AND to_char(o.order_date, 'yyyy-MM') = '2014-01'
GROUP BY c.customer_id
ORDER BY customer_revenue DESC, c.customer_id
LIMIT 10;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
10 rows affected.
Out[17]:
customer_id customer_fname customer_lname customer_revenue
2555 Mary Long 2954.63
3465 Mary Gardner 2929.74
3710 Ashley Smith 2739.82
1780 Larry Sharp 2689.65
986 Catherine Hawkins 2629.90
9676 Theresa Smith 2599.84
1847 Mary Smith 2589.87
11901 Mary Smith 2469.87
4618 Andrea Smith 2429.82
10896 Victoria Smith 2419.78

Exercise 4 – Revenue Per Category

Get the revenue generated for each category for the month of 2014 January

  • Tables – orders, order_items, products and categories
  • Data should be sorted in ascending order by category_id.
  • Output should contain all the fields from category along with the revenue as category_revenue.
  • Consider only COMPLETE and CLOSED orders
In [19]:
%%sql

SELECT c.*, 
    round(sum(oi.order_item_subtotal)::numeric, 2) AS category_revenue 
FROM categories c JOIN products p 
        ON c.category_id = p.product_category_id 
    JOIN order_items oi 
        ON p.product_id = oi.order_item_product_id 
    JOIN orders o
        ON oi.order_item_order_id = o.order_id
WHERE to_char(o.order_date, 'yyyy-MM') = '2014-01' 
    AND o.order_status IN ('COMPLETE', 'CLOSED')
GROUP BY c.category_id
ORDER BY c.category_id 
LIMIT 10;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
10 rows affected.
Out[19]:
category_id category_department_id category_name category_revenue
2 2 Soccer 1094.88
3 2 Baseball & Softball 3214.41
4 2 Basketball 1299.98
5 2 Lacrosse 1299.69
6 2 Tennis & Racquet 1124.75
7 2 Hockey 1433.00
9 3 Cardio Equipment 133156.77
10 3 Strength Training 3388.96
11 3 Fitness Accessories 1509.73
12 3 Boxing & MMA 3998.46

Exercise 5 – Product Count Per Department

Get the products for each department.

  • Tables – departments, categories, products
  • Data should be sorted in ascending order by department_id
  • Output should contain all the fields from department and the product count as product_count
In [21]:
%%sql

SELECT d.*, 
    count(p.product_id) AS product_count 
FROM departments d 
    JOIN categories c 
        ON d.department_id = c.category_department_id 
    JOIN products p
        ON c.category_id = p.product_category_id
GROUP BY d.department_id
ORDER BY d.department_id;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
6 rows affected.
Out[21]:
department_id department_name product_count
2 Fitness 168
3 Footwear 168
4 Apparel 140
5 Golf 120
6 Outdoors 336
7 Fan Shop 149

Share this post