Pivoting Rows into Columns

Topic
Materials

Let us understand how we can pivot rows into columns in Postgres.

  • Actual results

order_date

order_status

count

2013-07-25 00:00:00

CANCELED

1

2013-07-25 00:00:00

CLOSED

20

2013-07-25 00:00:00

COMPLETE

42

2013-07-25 00:00:00

ON_HOLD

5

2013-07-25 00:00:00

PAYMENT_REVIEW

3

2013-07-25 00:00:00

PENDING

13

2013-07-25 00:00:00

PENDING_PAYMENT

41

2013-07-25 00:00:00

PROCESSING

16

2013-07-25 00:00:00

SUSPECTED_FRAUD

2

2013-07-26 00:00:00

CANCELED

3

2013-07-26 00:00:00

CLOSED

29

2013-07-26 00:00:00

COMPLETE

87

2013-07-26 00:00:00

ON_HOLD

19

2013-07-26 00:00:00

PAYMENT_REVIEW

6

2013-07-26 00:00:00

PENDING

31

2013-07-26 00:00:00

PENDING_PAYMENT

59

2013-07-26 00:00:00

PROCESSING

30

2013-07-26 00:00:00

SUSPECTED_FRAUD

5

  • Pivoted results

order_date

CANCELED

CLOSED

COMPLETE

ON_HOLD

PAYMENT_REVIEW

PENDING

PENDING_PAYMENT

PROCESSING

SUSPECTED_FRAUD

2013-07-25

1

20

42

5

3

13

41

16

2

2013-07-26

3

29

87

19

6

31

59

30

5

  • We need to use crosstab as part of FROM clause to pivot the data. We need to pass the main query to crosstab function.

  • We need to install tablefunc as Postgres superuser to expose functions like crosstab - CREATE EXTENSION tablefunc;

Note

If you are using environment provided by us, you don’t need to install tablefunc. If you are using your own environment run this command by logging in as superuser into postgres server to install tablefunc.

CREATE EXTENSION tablefunc;

However, in some cases you might have to run scripts in postgres. Follow official instructions by searching around.

%load_ext sql

Copy to clipboard

%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db

Copy to clipboard

%%sql

SELECT order_date,
    order_status,
    count(1)
FROM orders
GROUP BY order_date,
    order_status
ORDER BY order_date,
    order_status
LIMIT 18

Copy to clipboard

%%sql

SELECT * FROM crosstab(
    'SELECT order_date,
        order_status,
        count(1) AS order_count
    FROM orders
    GROUP BY order_date,
        order_status',
    'SELECT DISTINCT order_status FROM orders ORDER BY 1'
) AS (
    order_date DATE,
    "CANCELED" INT,
    "CLOSED" INT,
    "COMPLETE" INT,
    "ON_HOLD" INT,
    "PAYMENT_REVIEW" INT,
    "PENDING" INT,
    "PENDING_PAYMENT" INT,
    "PROCESSING" INT,
    "SUSPECTED_FRAUD" INT
)
LIMIT 10