Using CASE and WHEN¶
At times we might have to select values from multiple columns conditionally.
- We can use
CASE
andWHEN
for that. - Let us implement this conditional logic to come up with derived order_status.
- If order_status is COMPLETE or CLOSED, set COMPLETED
- If order_status have PENDING in it, then we will say PENDING
- If order_status have PROCESSING or PAYMENT_REVIEW in it, then we will say PENDING
- We will set all others as OTHER
- We can also have
ELSE
as part ofCASE
andWHEN
.
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
In [3]:
%sql DROP TABLE IF EXISTS sales
Done.
Out[3]:
[]
In [4]:
%%sql
CREATE TABLE IF NOT EXISTS sales(
sales_person_id INT,
sales_amount FLOAT,
commission_pct INT
)
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db Done.
Out[4]:
[]
In [5]:
%%sql
INSERT INTO sales VALUES
(1, 1000, 10),
(2, 1500, 8),
(3, 500, NULL),
(4, 800, 5),
(5, 250, NULL)
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 5 rows affected.
Out[5]:
[]
In [6]:
%%sql
SELECT * FROM sales
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 5 rows affected.
Out[6]:
sales_person_id | sales_amount | commission_pct |
---|---|---|
1 | 1000.0 | 10 |
2 | 1500.0 | 8 |
3 | 500.0 | None |
4 | 800.0 | 5 |
5 | 250.0 | None |
In [7]:
%%sql
SELECT s.*,
CASE WHEN commission_pct IS NOT NULL
THEN round((sales_amount * commission_pct / 100)::numeric, 2)
ELSE 0
END AS commission_amount
FROM sales s
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 5 rows affected.
Out[7]:
sales_person_id | sales_amount | commission_pct | commission_amount |
---|---|---|---|
1 | 1000.0 | 10 | 100.00 |
2 | 1500.0 | 8 | 120.00 |
3 | 500.0 | None | 0 |
4 | 800.0 | 5 | 40.00 |
5 | 250.0 | None | 0 |
In [8]:
%%sql
SELECT DISTINCT order_status FROM orders
ORDER BY order_status
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 9 rows affected.
Out[8]:
order_status |
---|
CANCELED |
CLOSED |
COMPLETE |
ON_HOLD |
PAYMENT_REVIEW |
PENDING |
PENDING_PAYMENT |
PROCESSING |
SUSPECTED_FRAUD |
In [9]:
%%sql
SELECT o.*,
CASE WHEN order_status IN ('COMPLETE', 'CLOSED') THEN 'COMPLETED'
END AS updated_order_status
FROM orders o
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
Out[9]:
order_id | order_date | order_customer_id | order_status | updated_order_status |
---|---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | CLOSED | COMPLETED |
2 | 2013-07-25 00:00:00 | 256 | PENDING_PAYMENT | None |
3 | 2013-07-25 00:00:00 | 12111 | COMPLETE | COMPLETED |
4 | 2013-07-25 00:00:00 | 8827 | CLOSED | COMPLETED |
5 | 2013-07-25 00:00:00 | 11318 | COMPLETE | COMPLETED |
6 | 2013-07-25 00:00:00 | 7130 | COMPLETE | COMPLETED |
7 | 2013-07-25 00:00:00 | 4530 | COMPLETE | COMPLETED |
8 | 2013-07-25 00:00:00 | 2911 | PROCESSING | None |
9 | 2013-07-25 00:00:00 | 5657 | PENDING_PAYMENT | None |
10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT | None |
In [10]:
%%sql
SELECT o.*,
CASE WHEN order_status IN ('COMPLETE', 'CLOSED') THEN 'COMPLETED'
ELSE order_status
END AS updated_order_status
FROM orders o
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
Out[10]:
order_id | order_date | order_customer_id | order_status | updated_order_status |
---|---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | CLOSED | COMPLETED |
2 | 2013-07-25 00:00:00 | 256 | PENDING_PAYMENT | PENDING_PAYMENT |
3 | 2013-07-25 00:00:00 | 12111 | COMPLETE | COMPLETED |
4 | 2013-07-25 00:00:00 | 8827 | CLOSED | COMPLETED |
5 | 2013-07-25 00:00:00 | 11318 | COMPLETE | COMPLETED |
6 | 2013-07-25 00:00:00 | 7130 | COMPLETE | COMPLETED |
7 | 2013-07-25 00:00:00 | 4530 | COMPLETE | COMPLETED |
8 | 2013-07-25 00:00:00 | 2911 | PROCESSING | PROCESSING |
9 | 2013-07-25 00:00:00 | 5657 | PENDING_PAYMENT | PENDING_PAYMENT |
10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT | PENDING_PAYMENT |
In [11]:
%%sql
SELECT o.*,
CASE
WHEN order_status IN ('COMPLETE', 'CLOSED') THEN 'COMPLETED'
WHEN order_status ~ 'PENDING' THEN 'PENDING'
ELSE 'OTHER'
END AS updated_order_status
FROM orders o
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
Out[11]:
order_id | order_date | order_customer_id | order_status | updated_order_status |
---|---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | CLOSED | COMPLETED |
2 | 2013-07-25 00:00:00 | 256 | PENDING_PAYMENT | PENDING |
3 | 2013-07-25 00:00:00 | 12111 | COMPLETE | COMPLETED |
4 | 2013-07-25 00:00:00 | 8827 | CLOSED | COMPLETED |
5 | 2013-07-25 00:00:00 | 11318 | COMPLETE | COMPLETED |
6 | 2013-07-25 00:00:00 | 7130 | COMPLETE | COMPLETED |
7 | 2013-07-25 00:00:00 | 4530 | COMPLETE | COMPLETED |
8 | 2013-07-25 00:00:00 | 2911 | PROCESSING | OTHER |
9 | 2013-07-25 00:00:00 | 5657 | PENDING_PAYMENT | PENDING |
10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT | PENDING |
In [12]:
%%sql
SELECT o.*,
CASE
WHEN order_status IN ('COMPLETE', 'CLOSED') THEN 'COMPLETED'
WHEN order_status LIKE '%PENDING%' OR order_status IN ('PROCESSING', 'PAYMENT_REVIEW')
THEN 'PENDING'
ELSE 'OTHER'
END AS updated_order_status
FROM orders o
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
Out[12]:
order_id | order_date | order_customer_id | order_status | updated_order_status |
---|---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | CLOSED | COMPLETED |
2 | 2013-07-25 00:00:00 | 256 | PENDING_PAYMENT | PENDING |
3 | 2013-07-25 00:00:00 | 12111 | COMPLETE | COMPLETED |
4 | 2013-07-25 00:00:00 | 8827 | CLOSED | COMPLETED |
5 | 2013-07-25 00:00:00 | 11318 | COMPLETE | COMPLETED |
6 | 2013-07-25 00:00:00 | 7130 | COMPLETE | COMPLETED |
7 | 2013-07-25 00:00:00 | 4530 | COMPLETE | COMPLETED |
8 | 2013-07-25 00:00:00 | 2911 | PROCESSING | PENDING |
9 | 2013-07-25 00:00:00 | 5657 | PENDING_PAYMENT | PENDING |
10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT | PENDING |
In [13]:
%%sql
SELECT DISTINCT order_status,
CASE
WHEN order_status IN ('COMPLETE', 'CLOSED') THEN 'COMPLETED'
WHEN order_status LIKE '%PENDING%' OR order_status IN ('PROCESSING', 'PAYMENT_REVIEW')
THEN 'PENDING'
ELSE 'OTHER'
END AS updated_order_status
FROM orders
ORDER BY updated_order_status
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 9 rows affected.
Out[13]:
order_status | updated_order_status |
---|---|
CLOSED | COMPLETED |
COMPLETE | COMPLETED |
SUSPECTED_FRAUD | OTHER |
CANCELED | OTHER |
ON_HOLD | OTHER |
PAYMENT_REVIEW | PENDING |
PENDING_PAYMENT | PENDING |
PENDING | PENDING |
PROCESSING | PENDING |