Let us get an overview of Analytics or Windowing Functions as part of SQL.
Aggregate Functions (sum
, min
, max
, avg
)
Window Functions (lead
, lag
, first_value
, last_value
)
Rank Functions (rank
, dense_rank
, row_number
 etc)
For all the functions when used as part of Analytic or Windowing functions we use OVER
 clause.
For aggregate functions we typically use PARTITION BY
For global ranking and windowing functions we can use ORDER BY sort_column
 and for ranking and windowing with in a partition or group we can use PARTITION BY partition_column ORDER BY sort_column
.
Here is how the syntax will look like.
Aggregate -Â func()Â OVERÂ (PARTITIONÂ BYÂ partition_column)
Global Rank - func() OVER (ORDER BY sort_column DESC)
Rank in a partition - func() OVER (PARTITION BY partition_column ORDER BY sort_column DESC)
We can also get cumulative or moving metrics by adding ROWS BETWEEN
 clause. We will see details later.
Let us create couple of tables which will be used for the demonstrations of Windowing and Ranking functions.
We have ORDERS and ORDER_ITEMS tables in our retail database.
Let us take care of computing daily revenue as well as daily product revenue.
As we will be using same data set several times, let us create the tables to pre compute the data.
daily_revenue will have the order_date and revenue, where data is aggregated using order_date as partition key.
daily_product_revenue will have order_date, order_item_product_id and revenue. In this case data is aggregated using order_date and order_item_product_id as partition keys.
order_date | revenue |
---|---|
2013-07-25 00:00:00 | 31547.23 |
2013-07-26 00:00:00 | 54713.23 |
2013-07-27 00:00:00 | 48411.48 |
2013-07-28 00:00:00 | 35672.03 |
2013-07-29 00:00:00 | 54579.70 |
2013-07-30 00:00:00 | 49329.29 |
2013-07-31 00:00:00 | 59212.49 |
2013-08-01 00:00:00 | 49160.08 |
2013-08-02 00:00:00 | 50688.58 |
2013-08-03 00:00:00 | 43416.74 |
%%sql
CREATE TABLE daily_product_revenue
AS
SELECT o.order_date,
oi.order_item_product_id,
round(sum(oi.order_item_subtotal)::numeric, 2) AS revenue
FROM orders o JOIN order_items oi
ON o.order_id = oi.order_item_order_id
WHERE o.order_status IN ('COMPLETE', 'CLOSED')
GROUP BY o.order_date, oi.order_item_product_id
order_date | order_item_product_id | revenue |
---|---|---|
2013-07-25 00:00:00 | 1004 | 5599.72 |
2013-07-25 00:00:00 | 191 | 5099.49 |
2013-07-25 00:00:00 | 957 | 4499.70 |
2013-07-25 00:00:00 | 365 | 3359.44 |
2013-07-25 00:00:00 | 1073 | 2999.85 |
2013-07-25 00:00:00 | 1014 | 2798.88 |
2013-07-25 00:00:00 | 403 | 1949.85 |
2013-07-25 00:00:00 | 502 | 1650.00 |
2013-07-25 00:00:00 | 627 | 1079.73 |
2013-07-25 00:00:00 | 226 | 599.99 |