Analytics Functions¶
Let us take care of the exercises related to analytics functions. We will be using HR database for the same.
- Get all the employees who is making more than average salary with in each department.
- Get cumulative salary for one of the department along with department name.
- Get top 3 paid employees with in each department by salary (use dense_rank)
- Get top 3 products sold in the month of 2014 January by revenue.
- Get top 3 products in each category sold in the month of 2014 January by revenue.
Prepare HR Database¶
Here are the steps to prepare HR database.
- Connect to HR DB using
psql
or SQL Workbench. Here is the samplepsql
command.
psql -h localhost \
-p 5432 \
-d itversity_hr_db \
-U itversity_hr_user \
-W
- Run scripts to create tables and load the data. You can also drop the tables if they already exists.
\i /data/hr_db/drop_tables_pg.sql
\i /data/hr_db/create_tables_pg.sql
\i /data/hr_db/load_data_pg.sql
- Validate to ensure that data is available in the tables by running these queries.
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_hr_user:hr_password@localhost:5432/itversity_hr_db
env: DATABASE_URL=postgresql://itversity_hr_user:hr_password@localhost:5432/itversity_hr_db
In [3]:
%sql SELECT * FROM employees LIMIT 10
10 rows affected.
Out[3]:
employee_id | first_name | last_name | phone_number | hire_date | job_id | salary | commission_pct | manager_id | department_id | |
---|---|---|---|---|---|---|---|---|---|---|
100 | Steven | King | SKING | 515.123.4567 | 1987-06-17 | AD_PRES | 24000.00 | None | None | 90 |
101 | Neena | Kochhar | NKOCHHAR | 515.123.4568 | 1989-09-21 | AD_VP | 17000.00 | None | 100 | 90 |
102 | Lex | De Haan | LDEHAAN | 515.123.4569 | 1993-01-13 | AD_VP | 17000.00 | None | 100 | 90 |
103 | Alexander | Hunold | AHUNOLD | 590.423.4567 | 1990-01-03 | IT_PROG | 9000.00 | None | 102 | 60 |
104 | Bruce | Ernst | BERNST | 590.423.4568 | 1991-05-21 | IT_PROG | 6000.00 | None | 103 | 60 |
105 | David | Austin | DAUSTIN | 590.423.4569 | 1997-06-25 | IT_PROG | 4800.00 | None | 103 | 60 |
106 | Valli | Pataballa | VPATABAL | 590.423.4560 | 1998-02-05 | IT_PROG | 4800.00 | None | 103 | 60 |
107 | Diana | Lorentz | DLORENTZ | 590.423.5567 | 1999-02-07 | IT_PROG | 4200.00 | None | 103 | 60 |
108 | Nancy | Greenberg | NGREENBE | 515.124.4569 | 1994-08-17 | FI_MGR | 12000.00 | None | 101 | 100 |
109 | Daniel | Faviet | DFAVIET | 515.124.4169 | 1994-08-16 | FI_ACCOUNT | 9000.00 | None | 108 | 100 |
In [4]:
%%sql
SELECT * FROM departments
ORDER BY manager_id NULLS LAST
LIMIT 10
* postgresql://itversity_hr_user:***@localhost:5432/itversity_hr_db 10 rows affected.
Out[4]:
department_id | department_name | manager_id | location_id |
---|---|---|---|
90 | Executive | 100 | 1700 |
60 | IT | 103 | 1400 |
100 | Finance | 108 | 1700 |
30 | Purchasing | 114 | 1700 |
50 | Shipping | 121 | 1500 |
80 | Sales | 145 | 2500 |
10 | Administration | 200 | 1700 |
20 | Marketing | 201 | 1800 |
40 | Human Resources | 203 | 2400 |
70 | Public Relations | 204 | 2700 |
Prepare Retail Database¶
Make sure to drop and recreate the tables before taking care of the exercises related to retail database.
- Ensure that we have required database and user for retail data. We might provide the database as part of our labs. Here are the instructions to use
psql
for setting up the required database (if required) and tables.
psql -U postgres -h localhost -p 5432 -W
CREATE DATABASE itversity_retail_db;
CREATE USER itversity_retail_user WITH ENCRYPTED PASSWORD 'retail_password';
GRANT ALL ON DATABASE itversity_retail_db TO itversity_retail_user;
- Create Tables using the script provided. You can either use
psql
or SQL Workbench.
psql -U itversity_retail_user \
-h localhost \
-p 5432 \
-d itversity_retail_db \
-W
- You can drop the existing tables.
DROP TABLE IF EXISTS order_items CASCADE;
DROP TABLE IF EXISTS orders CASCADE;
DROP TABLE IF EXISTS customers CASCADE;
DROP TABLE IF EXISTS products CASCADE;
DROP TABLE IF EXISTS categories CASCADE;
DROP TABLE IF EXISTS departments CASCADE;
- Once the tables are dropped you can run below script to create the tables for the purpose of exercises.
\i /data/retail_db/create_db_tables_pg.sql
- Data shall be loaded using the script provided.
\i /data/retail_db/load_db_tables_pg.sql
- Run queries to validate we have data in all the 3 tables.
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
In [3]:
%sql SELECT * FROM orders LIMIT 10
10 rows affected.
Out[3]:
order_id | order_date | order_customer_id | order_status |
---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | CLOSED |
2 | 2013-07-25 00:00:00 | 256 | PENDING_PAYMENT |
3 | 2013-07-25 00:00:00 | 12111 | COMPLETE |
4 | 2013-07-25 00:00:00 | 8827 | CLOSED |
5 | 2013-07-25 00:00:00 | 11318 | COMPLETE |
6 | 2013-07-25 00:00:00 | 7130 | COMPLETE |
7 | 2013-07-25 00:00:00 | 4530 | COMPLETE |
8 | 2013-07-25 00:00:00 | 2911 | PROCESSING |
9 | 2013-07-25 00:00:00 | 5657 | PENDING_PAYMENT |
10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT |
In [4]:
%sql SELECT count(1) FROM orders
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 1 rows affected.
Out[4]:
count |
---|
68883 |
In [5]:
%sql SELECT * FROM order_items LIMIT 10
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 10 rows affected.
Out[5]:
order_item_id | order_item_order_id | order_item_product_id | order_item_quantity | order_item_subtotal | order_item_product_price |
---|---|---|---|---|---|
1 | 1 | 957 | 1 | 299.98 | 299.98 |
2 | 2 | 1073 | 1 | 199.99 | 199.99 |
3 | 2 | 502 | 5 | 250.0 | 50.0 |
4 | 2 | 403 | 1 | 129.99 | 129.99 |
5 | 4 | 897 | 2 | 49.98 | 24.99 |
6 | 4 | 365 | 5 | 299.95 | 59.99 |
7 | 4 | 502 | 3 | 150.0 | 50.0 |
8 | 4 | 1014 | 4 | 199.92 | 49.98 |
9 | 5 | 957 | 1 | 299.98 | 299.98 |
10 | 5 | 365 | 5 | 299.95 | 59.99 |
In [6]:
%sql SELECT count(1) FROM order_items
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 1 rows affected.
Out[6]:
count |
---|
172198 |
In [7]:
%sql SELECT * FROM products LIMIT 10
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 10 rows affected.
Out[7]:
product_id | product_category_id | product_name | product_description | product_price | product_image |
---|---|---|---|---|---|
1 | 2 | Quest Q64 10 FT. x 10 FT. Slant Leg Instant U | 59.98 | http://images.acmesports.sports/Quest+Q64+10+FT.+x+10+FT.+Slant+Leg+Instant+Up+Canopy | |
2 | 2 | Under Armour Men’s Highlight MC Football Clea | 129.99 | http://images.acmesports.sports/Under+Armour+Men%27s+Highlight+MC+Football+Cleat | |
3 | 2 | Under Armour Men’s Renegade D Mid Football Cl | 89.99 | http://images.acmesports.sports/Under+Armour+Men%27s+Renegade+D+Mid+Football+Cleat | |
4 | 2 | Under Armour Men’s Renegade D Mid Football Cl | 89.99 | http://images.acmesports.sports/Under+Armour+Men%27s+Renegade+D+Mid+Football+Cleat | |
5 | 2 | Riddell Youth Revolution Speed Custom Footbal | 199.99 | http://images.acmesports.sports/Riddell+Youth+Revolution+Speed+Custom+Football+Helmet | |
6 | 2 | Jordan Men’s VI Retro TD Football Cleat | 134.99 | http://images.acmesports.sports/Jordan+Men%27s+VI+Retro+TD+Football+Cleat | |
7 | 2 | Schutt Youth Recruit Hybrid Custom Football H | 99.99 | http://images.acmesports.sports/Schutt+Youth+Recruit+Hybrid+Custom+Football+Helmet+2014 | |
8 | 2 | Nike Men’s Vapor Carbon Elite TD Football Cle | 129.99 | http://images.acmesports.sports/Nike+Men%27s+Vapor+Carbon+Elite+TD+Football+Cleat | |
9 | 2 | Nike Adult Vapor Jet 3.0 Receiver Gloves | 50.0 | http://images.acmesports.sports/Nike+Adult+Vapor+Jet+3.0+Receiver+Gloves | |
10 | 2 | Under Armour Men’s Highlight MC Football Clea | 129.99 | http://images.acmesports.sports/Under+Armour+Men%27s+Highlight+MC+Football+Cleat |
In [8]:
%sql SELECT count(1) FROM products
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 1 rows affected.
Out[8]:
count |
---|
1345 |
Exercise 1¶
Get all the employees who is making more than average salary with in each department.
- Use HR database employees and department tables for this problem.
- Compute average salary expense for each department and get those employee details who are making more salary than average salary.
- Make sure average salary expense per department is rounded off to 2 decimals.
- Output should contain employee_id, department_name, salary and avg_salary_expense (derived field).
- Data should be sorted in ascending order by department_id and descending order by salary.
employee_id | department_name | salary | avg_salary_expense |
---|---|---|---|
201 | Marketing | 13000.00 | 9500.00 |
114 | Purchasing | 11000.00 | 4150.00 |
121 | Shipping | 8200.00 | 3475.56 |
120 | Shipping | 8000.00 | 3475.56 |
122 | Shipping | 7900.00 | 3475.56 |
123 | Shipping | 6500.00 | 3475.56 |
124 | Shipping | 5800.00 | 3475.56 |
184 | Shipping | 4200.00 | 3475.56 |
185 | Shipping | 4100.00 | 3475.56 |
192 | Shipping | 4000.00 | 3475.56 |
193 | Shipping | 3900.00 | 3475.56 |
188 | Shipping | 3800.00 | 3475.56 |
137 | Shipping | 3600.00 | 3475.56 |
189 | Shipping | 3600.00 | 3475.56 |
141 | Shipping | 3500.00 | 3475.56 |
103 | IT | 9000.00 | 5760.00 |
104 | IT | 6000.00 | 5760.00 |
145 | Sales | 14000.00 | 8955.88 |
146 | Sales | 13500.00 | 8955.88 |
147 | Sales | 12000.00 | 8955.88 |
168 | Sales | 11500.00 | 8955.88 |
148 | Sales | 11000.00 | 8955.88 |
174 | Sales | 11000.00 | 8955.88 |
149 | Sales | 10500.00 | 8955.88 |
162 | Sales | 10500.00 | 8955.88 |
156 | Sales | 10000.00 | 8955.88 |
150 | Sales | 10000.00 | 8955.88 |
169 | Sales | 10000.00 | 8955.88 |
170 | Sales | 9600.00 | 8955.88 |
163 | Sales | 9500.00 | 8955.88 |
151 | Sales | 9500.00 | 8955.88 |
157 | Sales | 9500.00 | 8955.88 |
158 | Sales | 9000.00 | 8955.88 |
152 | Sales | 9000.00 | 8955.88 |
100 | Executive | 24000.00 | 19333.33 |
108 | Finance | 12000.00 | 8600.00 |
109 | Finance | 9000.00 | 8600.00 |
205 | Accounting | 12000.00 | 10150.00 |
Solution¶
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_hr_user:hr_password@localhost:5432/itversity_hr_db
env: DATABASE_URL=postgresql://itversity_hr_user:hr_password@localhost:5432/itversity_hr_db
In [13]:
%%sql
SELECT e.employee_id,
e.department_name,
e.salary,
e.avg_salary_expense
FROM (
SELECT e.employee_id,
e.department_id,
d.department_name,
e.salary,
round(avg(e.salary) OVER (PARTITION BY e.department_id), 2) AS avg_salary_expense
FROM employees AS e
JOIN departments AS d
ON e.department_id = d.department_id
) AS e
WHERE e.salary > e.avg_salary_expense
ORDER BY e.department_id,
e.salary DESC
* postgresql://itversity_hr_user:***@localhost:5432/itversity_hr_db 38 rows affected.
Out[13]:
employee_id | department_name | salary | avg_salary_expense |
---|---|---|---|
201 | Marketing | 13000.00 | 9500.00 |
114 | Purchasing | 11000.00 | 4150.00 |
121 | Shipping | 8200.00 | 3475.56 |
120 | Shipping | 8000.00 | 3475.56 |
122 | Shipping | 7900.00 | 3475.56 |
123 | Shipping | 6500.00 | 3475.56 |
124 | Shipping | 5800.00 | 3475.56 |
184 | Shipping | 4200.00 | 3475.56 |
185 | Shipping | 4100.00 | 3475.56 |
192 | Shipping | 4000.00 | 3475.56 |
193 | Shipping | 3900.00 | 3475.56 |
188 | Shipping | 3800.00 | 3475.56 |
189 | Shipping | 3600.00 | 3475.56 |
137 | Shipping | 3600.00 | 3475.56 |
141 | Shipping | 3500.00 | 3475.56 |
103 | IT | 9000.00 | 5760.00 |
104 | IT | 6000.00 | 5760.00 |
145 | Sales | 14000.00 | 8955.88 |
146 | Sales | 13500.00 | 8955.88 |
147 | Sales | 12000.00 | 8955.88 |
168 | Sales | 11500.00 | 8955.88 |
174 | Sales | 11000.00 | 8955.88 |
148 | Sales | 11000.00 | 8955.88 |
149 | Sales | 10500.00 | 8955.88 |
162 | Sales | 10500.00 | 8955.88 |
156 | Sales | 10000.00 | 8955.88 |
150 | Sales | 10000.00 | 8955.88 |
169 | Sales | 10000.00 | 8955.88 |
170 | Sales | 9600.00 | 8955.88 |
163 | Sales | 9500.00 | 8955.88 |
157 | Sales | 9500.00 | 8955.88 |
151 | Sales | 9500.00 | 8955.88 |
158 | Sales | 9000.00 | 8955.88 |
152 | Sales | 9000.00 | 8955.88 |
100 | Executive | 24000.00 | 19333.33 |
108 | Finance | 12000.00 | 8600.00 |
109 | Finance | 9000.00 | 8600.00 |
205 | Accounting | 12000.00 | 10150.00 |
Exercise 2¶
Get cumulative salary with in each department for Finance and IT department along with department name.
- Use HR database employees and department tables for this problem.
- Compute cumulative salary expense for Finance as well as IT departments with in respective departments.
- Make sure cumulative salary expense per department is rounded off to 2 decimals.
- Output should contain employee_id, department_name, salary and cum_salary_expense (derived field).
- Data should be sorted in ascending order by department_name and then salary.
employee_id | department_name | salary | cum_salary_expense |
---|---|---|---|
113 | Finance | 6900.00 | 6900.00 |
111 | Finance | 7700.00 | 14600.00 |
112 | Finance | 7800.00 | 22400.00 |
110 | Finance | 8200.00 | 30600.00 |
109 | Finance | 9000.00 | 39600.00 |
108 | Finance | 12000.00 | 51600.00 |
107 | IT | 4200.00 | 4200.00 |
106 | IT | 4800.00 | 9000.00 |
105 | IT | 4800.00 | 13800.00 |
104 | IT | 6000.00 | 19800.00 |
103 | IT | 9000.00 | 28800.00 |
In [8]:
%%sql
SELECT e.employee_id,
d.department_name,
e.salary,
round(sum(e.salary) OVER (
PARTITION BY e.department_id
ORDER BY salary
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
), 2) AS cum_salary_expense
FROM employees AS e
JOIN departments AS d
ON e.department_id = d.department_id
WHERE d.department_name IN ('Finance', 'IT')
ORDER BY d.department_name,
e.salarydz
* postgresql://itversity_hr_user:***@localhost:5432/itversity_hr_db 11 rows affected.
Out[8]:
employee_id | department_name | salary | cum_salary_expense |
---|---|---|---|
113 | Finance | 6900.00 | 6900.00 |
111 | Finance | 7700.00 | 14600.00 |
112 | Finance | 7800.00 | 22400.00 |
110 | Finance | 8200.00 | 30600.00 |
109 | Finance | 9000.00 | 39600.00 |
108 | Finance | 12000.00 | 51600.00 |
107 | IT | 4200.00 | 4200.00 |
106 | IT | 4800.00 | 13800.00 |
105 | IT | 4800.00 | 9000.00 |
104 | IT | 6000.00 | 19800.00 |
103 | IT | 9000.00 | 28800.00 |
Exercise 3¶
Get top 3 paid employees with in each department by salary (use dense_rank)
- Use HR database employees and department tables for this problem.
- Highest paid employee should be ranked first.
- Output should contain employee_id, department_id, department_name, salary and employee_rank (derived field).
- Data should be sorted in ascending order by department_id in ascending order and then salary in descending order.
employee_id | department_id | department_name | salary | employee_rank |
---|---|---|---|---|
200 | 10 | Administration | 4400.00 | 1 |
201 | 20 | Marketing | 13000.00 | 1 |
202 | 20 | Marketing | 6000.00 | 2 |
114 | 30 | Purchasing | 11000.00 | 1 |
115 | 30 | Purchasing | 3100.00 | 2 |
116 | 30 | Purchasing | 2900.00 | 3 |
203 | 40 | Human Resources | 6500.00 | 1 |
121 | 50 | Shipping | 8200.00 | 1 |
120 | 50 | Shipping | 8000.00 | 2 |
122 | 50 | Shipping | 7900.00 | 3 |
103 | 60 | IT | 9000.00 | 1 |
104 | 60 | IT | 6000.00 | 2 |
105 | 60 | IT | 4800.00 | 3 |
106 | 60 | IT | 4800.00 | 3 |
204 | 70 | Public Relations | 10000.00 | 1 |
145 | 80 | Sales | 14000.00 | 1 |
146 | 80 | Sales | 13500.00 | 2 |
147 | 80 | Sales | 12000.00 | 3 |
100 | 90 | Executive | 24000.00 | 1 |
101 | 90 | Executive | 17000.00 | 2 |
102 | 90 | Executive | 17000.00 | 2 |
108 | 100 | Finance | 12000.00 | 1 |
109 | 100 | Finance | 9000.00 | 2 |
110 | 100 | Finance | 8200.00 | 3 |
205 | 110 | Accounting | 12000.00 | 1 |
206 | 110 | Accounting | 8300.00 | 2 |
In [12]:
%%sql
SELECT *
FROM (
SELECT e.employee_id,
e.department_id,
d.department_name,
e.salary,
dense_rank() OVER (
PARTITION BY e.department_id
ORDER BY e.salary DESC
) AS employee_rank
FROM employees AS e
JOIN departments AS d
ON e.department_id = d.department_id
) AS e
WHERE e.employee_rank <= 3
ORDER BY e.department_id,
e.salary DESC
* postgresql://itversity_hr_user:***@localhost:5432/itversity_hr_db 26 rows affected.
Out[12]:
employee_id | department_id | department_name | salary | employee_rank |
---|---|---|---|---|
200 | 10 | Administration | 4400.00 | 1 |
201 | 20 | Marketing | 13000.00 | 1 |
202 | 20 | Marketing | 6000.00 | 2 |
114 | 30 | Purchasing | 11000.00 | 1 |
115 | 30 | Purchasing | 3100.00 | 2 |
116 | 30 | Purchasing | 2900.00 | 3 |
203 | 40 | Human Resources | 6500.00 | 1 |
121 | 50 | Shipping | 8200.00 | 1 |
120 | 50 | Shipping | 8000.00 | 2 |
122 | 50 | Shipping | 7900.00 | 3 |
103 | 60 | IT | 9000.00 | 1 |
104 | 60 | IT | 6000.00 | 2 |
105 | 60 | IT | 4800.00 | 3 |
106 | 60 | IT | 4800.00 | 3 |
204 | 70 | Public Relations | 10000.00 | 1 |
145 | 80 | Sales | 14000.00 | 1 |
146 | 80 | Sales | 13500.00 | 2 |
147 | 80 | Sales | 12000.00 | 3 |
100 | 90 | Executive | 24000.00 | 1 |
101 | 90 | Executive | 17000.00 | 2 |
102 | 90 | Executive | 17000.00 | 2 |
108 | 100 | Finance | 12000.00 | 1 |
109 | 100 | Finance | 9000.00 | 2 |
110 | 100 | Finance | 8200.00 | 3 |
205 | 110 | Accounting | 12000.00 | 1 |
206 | 110 | Accounting | 8300.00 | 2 |
Exercise 4¶
Get top 3 products sold in the month of 2014 January by revenue.
- Use retail database tables such as orders, order_items and products.
- Highest revenue generating product should come at top.
- Consider only those orders which are either in COMPLETE or CLOSED status.
- Output should contain product_id, product_name, revenue, product_rank. revenue and product_rank are derived fields.
- Data should be sorted in descending order by revenue.
product_id | product_name | revenue | product_rank |
---|---|---|---|
1004 | Field & Stream Sportsman 16 Gun Fire Safe | 250787.46 | 1 |
365 | Perfect Fitness Perfect Rip Deck | 151474.75 | 2 |
957 | Diamondback Women’s Serene Classic Comfort Bi | 148190.12 | 3 |
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
In [9]:
%%sql
WITH product_revenue AS (
SELECT p.product_id,
p.product_name,
round(sum(oi.order_item_subtotal)::numeric, 2) AS revenue
FROM orders AS o
JOIN order_items oi
ON o.order_id = oi.order_item_order_id
JOIN products AS p
ON p.product_id = oi.order_item_product_id
WHERE to_char(o.order_date, 'yyyy-MM') = '2014-01'
AND o.order_status IN ('COMPLETE', 'CLOSED')
GROUP BY p.product_id,
p.product_name
) SELECT * FROM (
SELECT pr.*,
dense_rank() OVER (
ORDER BY revenue DESC
) AS product_rank
FROM product_revenue AS pr
) AS q
WHERE product_rank <= 3
ORDER BY revenue DESC
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 3 rows affected.
Out[9]:
product_id | product_name | revenue | product_rank |
---|---|---|---|
1004 | Field & Stream Sportsman 16 Gun Fire Safe | 250787.46 | 1 |
365 | Perfect Fitness Perfect Rip Deck | 151474.75 | 2 |
957 | Diamondback Women’s Serene Classic Comfort Bi | 148190.12 | 3 |
Exercise 5¶
Get top 3 products sold in the month of 2014 January under selected categories by revenue. The categories are Cardio Equipment and Strength Training.
- Use retail database tables such as orders, order_items, products as well as categories.
- Highest revenue generating product should come at top.
- Consider only those orders which are either in COMPLETE or CLOSED status.
- Output should contain category_id, category_name, product_id, product_name, revenue, product_rank. revenue and product_rank are derived fields.
- Data should be sorted in ascending order by category_id and descending order by revenue.
category_id | category_name | product_id | product_name | revenue | product_rank |
---|---|---|---|---|---|
9 | Cardio Equipment | 191 | Nike Men’s Free 5.0+ Running Shoe | 132286.77 | 1 |
9 | Cardio Equipment | 172 | Nike Women’s Tempo Shorts | 870.00 | 2 |
10 | Strength Training | 208 | SOLE E35 Elliptical | 1999.99 | 1 |
10 | Strength Training | 203 | GoPro HERO3+ Black Edition Camera | 1199.97 | 2 |
10 | Strength Training | 216 | Yakima DoubleDown Ace Hitch Mount 4-Bike Rack | 189.00 | 3 |
In [12]:
%%sql
WITH product_revenue AS (
SELECT c.category_id,
c.category_name,
p.product_id,
p.product_name,
round(sum(oi.order_item_subtotal)::numeric, 2) AS revenue
FROM orders AS o
JOIN order_items oi
ON o.order_id = oi.order_item_order_id
JOIN products AS p
ON p.product_id = oi.order_item_product_id
JOIN categories AS c
ON c.category_id = p.product_category_id
WHERE to_char(o.order_date, 'yyyy-MM') = '2014-01'
AND o.order_status IN ('COMPLETE', 'CLOSED')
AND c.category_name IN ('Cardio Equipment', 'Strength Training')
GROUP BY c.category_id,
c.category_name,
p.product_id,
p.product_name
) SELECT * FROM (
SELECT pr.*,
dense_rank() OVER (
PARTITION BY category_id
ORDER BY revenue DESC
) AS product_rank
FROM product_revenue AS pr
) AS q
WHERE product_rank <= 3
ORDER BY category_id,
revenue DESC
* postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db 5 rows affected.
Out[12]:
category_id | category_name | product_id | product_name | revenue | product_rank |
---|---|---|---|---|---|
9 | Cardio Equipment | 191 | Nike Men’s Free 5.0+ Running Shoe | 132286.77 | 1 |
9 | Cardio Equipment | 172 | Nike Women’s Tempo Shorts | 870.00 | 2 |
10 | Strength Training | 208 | SOLE E35 Elliptical | 1999.99 | 1 |
10 | Strength Training | 203 | GoPro HERO3+ Black Edition Camera | 1199.97 | 2 |
10 | Strength Training | 216 | Yakima DoubleDown Ace Hitch Mount 4-Bike Rack | 189.00 | 3 |