Selecting or Projecting Data

Let us understand different aspects of projecting data. We primarily using SELECT to project the data.

  • We can project all columns using * or some columns using column names.

  • We can provide aliases to a column or expression using AS in SELECT clause.

  • DISTINCT can be used to get the distinct records from selected columns. We can also use DISTINCT * to get unique records using all the columns.

  • As part of SELECT clause we can have aggregate functions such as count, sum etc.

%load_ext sql

Copy to clipboard

The sql extension is already loaded. To reload it, use:
  %reload_ext sql

Copy to clipboard

%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db

Copy to clipboard

env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db

Copy to clipboard

%sql SELECT * FROM orders LIMIT 10

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
10 rows affected.

Copy to clipboard

order_id order_date order_customer_id order_status
1 2013-07-25 00:00:00 11599 CLOSED
2 2013-07-25 00:00:00 256 PENDING_PAYMENT
3 2013-07-25 00:00:00 12111 COMPLETE
4 2013-07-25 00:00:00 8827 CLOSED
5 2013-07-25 00:00:00 11318 COMPLETE
6 2013-07-25 00:00:00 7130 COMPLETE
7 2013-07-25 00:00:00 4530 COMPLETE
8 2013-07-25 00:00:00 2911 PROCESSING
9 2013-07-25 00:00:00 5657 PENDING_PAYMENT
10 2013-07-25 00:00:00 5648 PENDING_PAYMENT
%%sql 

SELECT * FROM information_schema.columns 
WHERE table_catalog = 'itversity_retail_db' 
    AND table_name = 'orders'

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
4 rows affected.

Copy to clipboard

table_catalog table_schema table_name column_name ordinal_position column_default is_nullable data_type character_maximum_length character_octet_length numeric_precision numeric_precision_radix numeric_scale datetime_precision interval_type interval_precision character_set_catalog character_set_schema character_set_name collation_catalog collation_schema collation_name domain_catalog domain_schema domain_name udt_catalog udt_schema udt_name scope_catalog scope_schema scope_name maximum_cardinality dtd_identifier is_self_referencing is_identity identity_generation identity_start identity_increment identity_maximum identity_minimum identity_cycle is_generated generation_expression is_updatable
itversity_retail_db public orders order_id 1 None NO integer None None 32 2 0 None None None None None None None None None None None None itversity_retail_db pg_catalog int4 None None None None 1 NO NO None None None None None NO NEVER None YES
itversity_retail_db public orders order_date 2 None NO timestamp without time zone None None None None None 6 None None None None None None None None None None None itversity_retail_db pg_catalog timestamp None None None None 2 NO NO None None None None None NO NEVER None YES
itversity_retail_db public orders order_customer_id 3 None NO integer None None 32 2 0 None None None None None None None None None None None None itversity_retail_db pg_catalog int4 None None None None 3 NO NO None None None None None NO NEVER None YES
itversity_retail_db public orders order_status 4 None NO character varying 45 180 None None None None None None None None None None None None None None None itversity_retail_db pg_catalog varchar None None None None 4 NO NO None None None None None NO NEVER None YES
%%sql 

SELECT order_customer_id, order_date, order_status 
FROM orders 
LIMIT 10

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
10 rows affected.

Copy to clipboard

order_customer_id order_date order_status
11599 2013-07-25 00:00:00 CLOSED
256 2013-07-25 00:00:00 PENDING_PAYMENT
12111 2013-07-25 00:00:00 COMPLETE
8827 2013-07-25 00:00:00 CLOSED
11318 2013-07-25 00:00:00 COMPLETE
7130 2013-07-25 00:00:00 COMPLETE
4530 2013-07-25 00:00:00 COMPLETE
2911 2013-07-25 00:00:00 PROCESSING
5657 2013-07-25 00:00:00 PENDING_PAYMENT
5648 2013-07-25 00:00:00 PENDING_PAYMENT
%%sql 

SELECT order_customer_id, 
    to_char(order_date, 'yyyy-MM'), 
    order_status 
FROM orders 
LIMIT 10

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
10 rows affected.

Copy to clipboard

order_customer_id to_char order_status
11599 2013-07 CLOSED
256 2013-07 PENDING_PAYMENT
12111 2013-07 COMPLETE
8827 2013-07 CLOSED
11318 2013-07 COMPLETE
7130 2013-07 COMPLETE
4530 2013-07 COMPLETE
2911 2013-07 PROCESSING
5657 2013-07 PENDING_PAYMENT
5648 2013-07 PENDING_PAYMENT
%%sql 

SELECT order_customer_id, 
    to_char(order_date, 'yyyy-MM') AS order_month, 
    order_status 
FROM orders 
LIMIT 10

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
10 rows affected.

Copy to clipboard

order_customer_id order_month order_status
11599 2013-07 CLOSED
256 2013-07 PENDING_PAYMENT
12111 2013-07 COMPLETE
8827 2013-07 CLOSED
11318 2013-07 COMPLETE
7130 2013-07 COMPLETE
4530 2013-07 COMPLETE
2911 2013-07 PROCESSING
5657 2013-07 PENDING_PAYMENT
5648 2013-07 PENDING_PAYMENT
%%sql 

SELECT DISTINCT to_char(order_date, 'yyyy-MM') AS order_month 
FROM orders

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
13 rows affected.

Copy to clipboard

order_month
2014-01
2014-05
2013-12
2013-11
2014-04
2014-07
2014-03
2013-08
2013-10
2013-07
2014-02
2013-09
2014-06
%sql SELECT count(1) FROM orders

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.

Copy to clipboard

count
68883
%%sql 

SELECT count(DISTINCT to_char(order_date, 'yyyy-MM')) AS distinct_month_count 
FROM orders

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.

Copy to clipboard

distinct_month_count
13