Selecting or Projecting Data¶
Let us understand different aspects of projecting data. We primarily using SELECT
to project the data.
- We can project all columns using
*
or some columns using column names. - We can provide aliases to a column or expression using
AS
inSELECT
clause. DISTINCT
can be used to get the distinct records from selected columns. We can also useDISTINCT *
to get unique records using all the columns.- As part of
SELECT
clause we can have aggregate functions such ascount
,sum
etc.
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@pg.itversity.com:5432/itversity_retail_db
In [3]:
%sql SELECT * FROM orders LIMIT 10
10 rows affected.
Out[3]:
order_id | order_date | order_customer_id | order_status |
---|---|---|---|
1 | 2013-07-25 00:00:00 | 11599 | CLOSED |
2 | 2013-07-25 00:00:00 | 256 | PENDING_PAYMENT |
3 | 2013-07-25 00:00:00 | 12111 | COMPLETE |
4 | 2013-07-25 00:00:00 | 8827 | CLOSED |
5 | 2013-07-25 00:00:00 | 11318 | COMPLETE |
6 | 2013-07-25 00:00:00 | 7130 | COMPLETE |
7 | 2013-07-25 00:00:00 | 4530 | COMPLETE |
8 | 2013-07-25 00:00:00 | 2911 | PROCESSING |
9 | 2013-07-25 00:00:00 | 5657 | PENDING_PAYMENT |
10 | 2013-07-25 00:00:00 | 5648 | PENDING_PAYMENT |
In [4]:
%%sql
SELECT * FROM information_schema.columns
WHERE table_catalog = 'itversity_retail_db'
AND table_name = 'orders'
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 4 rows affected.
Out[4]:
table_catalog | table_schema | table_name | column_name | ordinal_position | column_default | is_nullable | data_type | character_maximum_length | character_octet_length | numeric_precision | numeric_precision_radix | numeric_scale | datetime_precision | interval_type | interval_precision | character_set_catalog | character_set_schema | character_set_name | collation_catalog | collation_schema | collation_name | domain_catalog | domain_schema | domain_name | udt_catalog | udt_schema | udt_name | scope_catalog | scope_schema | scope_name | maximum_cardinality | dtd_identifier | is_self_referencing | is_identity | identity_generation | identity_start | identity_increment | identity_maximum | identity_minimum | identity_cycle | is_generated | generation_expression | is_updatable |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
itversity_retail_db | public | orders | order_id | 1 | None | NO | integer | None | None | 32 | 2 | 0 | None | None | None | None | None | None | None | None | None | None | None | None | itversity_retail_db | pg_catalog | int4 | None | None | None | None | 1 | NO | NO | None | None | None | None | None | NO | NEVER | None | YES |
itversity_retail_db | public | orders | order_date | 2 | None | NO | timestamp without time zone | None | None | None | None | None | 6 | None | None | None | None | None | None | None | None | None | None | None | itversity_retail_db | pg_catalog | timestamp | None | None | None | None | 2 | NO | NO | None | None | None | None | None | NO | NEVER | None | YES |
itversity_retail_db | public | orders | order_customer_id | 3 | None | NO | integer | None | None | 32 | 2 | 0 | None | None | None | None | None | None | None | None | None | None | None | None | itversity_retail_db | pg_catalog | int4 | None | None | None | None | 3 | NO | NO | None | None | None | None | None | NO | NEVER | None | YES |
itversity_retail_db | public | orders | order_status | 4 | None | NO | character varying | 45 | 180 | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | itversity_retail_db | pg_catalog | varchar | None | None | None | None | 4 | NO | NO | None | None | None | None | None | NO | NEVER | None | YES |
In [5]:
%%sql
SELECT order_customer_id, order_date, order_status
FROM orders
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
Out[5]:
order_customer_id | order_date | order_status |
---|---|---|
11599 | 2013-07-25 00:00:00 | CLOSED |
256 | 2013-07-25 00:00:00 | PENDING_PAYMENT |
12111 | 2013-07-25 00:00:00 | COMPLETE |
8827 | 2013-07-25 00:00:00 | CLOSED |
11318 | 2013-07-25 00:00:00 | COMPLETE |
7130 | 2013-07-25 00:00:00 | COMPLETE |
4530 | 2013-07-25 00:00:00 | COMPLETE |
2911 | 2013-07-25 00:00:00 | PROCESSING |
5657 | 2013-07-25 00:00:00 | PENDING_PAYMENT |
5648 | 2013-07-25 00:00:00 | PENDING_PAYMENT |
In [6]:
%%sql
SELECT order_customer_id,
to_char(order_date, 'yyyy-MM'),
order_status
FROM orders
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
Out[6]:
order_customer_id | to_char | order_status |
---|---|---|
11599 | 2013-07 | CLOSED |
256 | 2013-07 | PENDING_PAYMENT |
12111 | 2013-07 | COMPLETE |
8827 | 2013-07 | CLOSED |
11318 | 2013-07 | COMPLETE |
7130 | 2013-07 | COMPLETE |
4530 | 2013-07 | COMPLETE |
2911 | 2013-07 | PROCESSING |
5657 | 2013-07 | PENDING_PAYMENT |
5648 | 2013-07 | PENDING_PAYMENT |
In [7]:
%%sql
SELECT order_customer_id,
to_char(order_date, 'yyyy-MM') AS order_month,
order_status
FROM orders
LIMIT 10
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 10 rows affected.
Out[7]:
order_customer_id | order_month | order_status |
---|---|---|
11599 | 2013-07 | CLOSED |
256 | 2013-07 | PENDING_PAYMENT |
12111 | 2013-07 | COMPLETE |
8827 | 2013-07 | CLOSED |
11318 | 2013-07 | COMPLETE |
7130 | 2013-07 | COMPLETE |
4530 | 2013-07 | COMPLETE |
2911 | 2013-07 | PROCESSING |
5657 | 2013-07 | PENDING_PAYMENT |
5648 | 2013-07 | PENDING_PAYMENT |
In [8]:
%%sql
SELECT DISTINCT to_char(order_date, 'yyyy-MM') AS order_month
FROM orders
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 13 rows affected.
Out[8]:
order_month |
---|
2014-01 |
2014-05 |
2013-12 |
2013-11 |
2014-04 |
2014-07 |
2014-03 |
2013-08 |
2013-10 |
2013-07 |
2014-02 |
2013-09 |
2014-06 |
In [9]:
%sql SELECT count(1) FROM orders
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
Out[9]:
count |
---|
68883 |
In [10]:
%%sql
SELECT count(DISTINCT to_char(order_date, 'yyyy-MM')) AS distinct_month_count
FROM orders
* postgresql://itversity_retail_user:***@pg.itversity.com:5432/itversity_retail_db 1 rows affected.
Out[10]:
distinct_month_count |
---|
13 |