Exercises – Pre-Defined Functions

Here are the exercises to ensure our understanding related to Pre-Defined Functions.

  • We will use users table as well as other tables we got as part of retail database.

  • Information will be provided with each exercise.

%load_ext sql

Copy to clipboard

The sql extension is already loaded. To reload it, use:
  %reload_ext sql

Copy to clipboard

%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db

Copy to clipboard

env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db

Copy to clipboard

%%sql

DROP TABLE IF EXISTS users

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.

Copy to clipboard

[]

Copy to clipboard

%%sql

CREATE TABLE users(
    user_id SERIAL PRIMARY KEY,
    user_first_name VARCHAR(30),
    user_last_name VARCHAR(30),
    user_email_id VARCHAR(50),
    user_gender VARCHAR(1),
    user_unique_id VARCHAR(15),
    user_phone_no VARCHAR(20),
    user_dob DATE,
    created_ts TIMESTAMP
)

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.

Copy to clipboard

[]

Copy to clipboard

%%sql

insert into users (
    user_first_name, user_last_name, user_email_id, user_gender, 
    user_unique_id, user_phone_no, user_dob, created_ts
) VALUES
    ('Giuseppe', 'Bode', 'gbode0@imgur.com', 'M', '88833-8759', 
     '+86 (764) 443-1967', '1973-05-31', '2018-04-15 12:13:38'),
    ('Lexy', 'Gisbey', 'lgisbey1@mail.ru', 'F', '262501-029', 
     '+86 (751) 160-3742', '2003-05-31', '2020-12-29 06:44:09'),
    ('Karel', 'Claringbold', 'kclaringbold2@yale.edu', 'F', '391-33-2823', 
     '+62 (445) 471-2682', '1985-11-28', '2018-11-19 00:04:08'),
    ('Marv', 'Tanswill', 'mtanswill3@dedecms.com', 'F', '1195413-80', 
     '+62 (497) 736-6802', '1998-05-24', '2018-11-19 16:29:43'),
    ('Gertie', 'Espinoza', 'gespinoza4@nationalgeographic.com', 'M', '471-24-6869', 
     '+249 (687) 506-2960', '1997-10-30', '2020-01-25 21:31:10'),
    ('Saleem', 'Danneil', 'sdanneil5@guardian.co.uk', 'F', '192374-933', 
     '+63 (810) 321-0331', '1992-03-08', '2020-11-07 19:01:14'),
    ('Rickert', 'O''Shiels', 'roshiels6@wikispaces.com', 'M', '749-27-47-52', 
     '+86 (184) 759-3933', '1972-11-01', '2018-03-20 10:53:24'),
    ('Cybil', 'Lissimore', 'clissimore7@pinterest.com', 'M', '461-75-4198', 
     '+54 (613) 939-6976', '1978-03-03', '2019-12-09 14:08:30'),
    ('Melita', 'Rimington', 'mrimington8@mozilla.org', 'F', '892-36-676-2', 
     '+48 (322) 829-8638', '1995-12-15', '2018-04-03 04:21:33'),
    ('Benetta', 'Nana', 'bnana9@google.com', 'M', '197-54-1646', 
     '+420 (934) 611-0020', '1971-12-07', '2018-10-17 21:02:51'),
    ('Gregorius', 'Gullane', 'ggullanea@prnewswire.com', 'F', '232-55-52-58', 
     '+62 (780) 859-1578', '1973-09-18', '2020-01-14 23:38:53'),
    ('Una', 'Glayzer', 'uglayzerb@pinterest.com', 'M', '898-84-336-6', 
     '+380 (840) 437-3981', '1983-05-26', '2019-09-17 03:24:21'),
    ('Jamie', 'Vosper', 'jvosperc@umich.edu', 'M', '247-95-68-44', 
     '+81 (205) 723-1942', '1972-03-18', '2020-07-23 16:39:33'),
    ('Calley', 'Tilson', 'ctilsond@issuu.com', 'F', '415-48-894-3', 
     '+229 (698) 777-4904', '1987-06-12', '2020-06-05 12:10:50'),
    ('Peadar', 'Gregorowicz', 'pgregorowicze@omniture.com', 'M', '403-39-5-869', 
     '+7 (267) 853-3262', '1996-09-21', '2018-05-29 23:51:31'),
    ('Jeanie', 'Webling', 'jweblingf@booking.com', 'F', '399-83-05-03', 
     '+351 (684) 413-0550', '1994-12-27', '2018-02-09 01:31:11'),
    ('Yankee', 'Jelf', 'yjelfg@wufoo.com', 'F', '607-99-0411', 
     '+1 (864) 112-7432', '1988-11-13', '2019-09-16 16:09:12'),
    ('Blair', 'Aumerle', 'baumerleh@toplist.cz', 'F', '430-01-578-5', 
     '+7 (393) 232-1860', '1979-11-09', '2018-10-28 19:25:35'),
    ('Pavlov', 'Steljes', 'psteljesi@macromedia.com', 'F', '571-09-6181', 
     '+598 (877) 881-3236', '1991-06-24', '2020-09-18 05:34:31'),
    ('Darn', 'Hadeke', 'dhadekej@last.fm', 'M', '478-32-02-87', 
     '+370 (347) 110-4270', '1984-09-04', '2018-02-10 12:56:00'),
    ('Wendell', 'Spanton', 'wspantonk@de.vu', 'F', null, 
     '+84 (301) 762-1316', '1973-07-24', '2018-01-30 01:20:11'),
    ('Carlo', 'Yearby', 'cyearbyl@comcast.net', 'F', null, 
     '+55 (288) 623-4067', '1974-11-11', '2018-06-24 03:18:40'),
    ('Sheila', 'Evitts', 'sevittsm@webmd.com', null, '830-40-5287',
     null, '1977-03-01', '2020-07-20 09:59:41'),
    ('Sianna', 'Lowdham', 'slowdhamn@stanford.edu', null, '778-0845', 
     null, '1985-12-23', '2018-06-29 02:42:49'),
    ('Phylys', 'Aslie', 'paslieo@qq.com', 'M', '368-44-4478', 
     '+86 (765) 152-8654', '1984-03-22', '2019-10-01 01:34:28')

Copy to clipboard

 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
25 rows affected.

Copy to clipboard

[]

Copy to clipboard

Exercise 1

Get all the number of users created per year.

  • Use users table for this exercise.

  • Output should contain 4 digit year and count.

  • Use date specific functions to get the year using created_ts.

  • Make sure you define aliases to the columns as created_year and user_count respectively.

  • Data should be sorted in ascending order by created_year.

  • When you run the query using Jupyter environment, it might have decimals for integers. Hence you can display results even with decimal points.

  • Here is the sample output.

created_year

user_count

2018

13

2019

4

2020

8

Exercise 2

Get the day name of the birth days for all the users born in the month of June.

  • Use users table for this exercise.

  • Output should contain user_id, user_dob, user_email_id and user_day_of_birth.

  • Use date specific functions to get the month using user_dob.

  • user_day_of_birth should be full day with first character in upper case such as Tuesday

  • Data should be sorted by day with in the month of May.

user_id

user_dob

user_email_id

user_day_of_birth

4

1998-05-24

mtanswill3@dedecms.com

Sunday

12

1983-05-26

uglayzerb@pinterest.com

Thursday

1

1973-05-31

gbode0@imgur.com

Thursday

2

2003-05-31

lgisbey1@mail.ru

Saturday

Exercise 3

Get the names and email ids of users added in year 2019.

  • Use users table for this exercise.

  • Output should contain user_id, user_name, user_email_id, created_ts, created_year.

  • Use date specific functions to get the year using created_ts.

  • user_name is a derived column by concatenating user_first_name and user_last_name with space in between.

  • user_name should have values in upper case.

  • Data should be sorted in ascending order by user_name

user_id

user_name

user_email_id

created_ts

created_year

8

CYBIL LISSIMORE

clissimore7@pinterest.com

2019-12-09 14:08:30

2019.0

25

PHYLYS ASLIE

paslieo@qq.com

2019-10-01 01:34:28

2019.0

12

UNA GLAYZER

uglayzerb@pinterest.com

2019-09-17 03:24:21

2019.0

17

YANKEE JELF

yjelfg@wufoo.com

2019-09-16 16:09:12

2019.0

Exercise 4

Get the number of users by gender.

  • Use users table for this exercise.

  • Output should contain gender and user_count.

  • For males the output should display Male and for females the output should display Female.

  • If gender is not specified, then it should display Not Specified.

  • Data should be sorted in descending order by user_count.

user_gender

user_count

Female

13

Male

10

Not Specified

2

Exercise 5

Get last 4 digits of unique ids.

  • Use users table for this exercise.

  • Output should contain user_id, user_unique_id and user_unique_id_last4

  • Unique ids are either null or not null.

  • Unique ids contain numbers and hyphens and are of different length.

  • We need to get last 4 digits discarding hyphens only when the number of digits are at least 9.

  • If unique id is null, then you should dispaly Not Specified.

  • After discarding hyphens, if unique id have less than 9 digits then you should display Invalid Unique Id.

  • Data should be sorted by user_id. You might see None or null for those user ids where there is no unique id for user_unique_id

user_id

user_unique_id

user_unique_id_last4

1

88833-8759

8759

2

262501-029

1029

3

391-33-2823

2823

4

1195413-80

1380

5

471-24-6869

6869

6

192374-933

4933

7

749-27-47-52

4752

8

461-75-4198

4198

9

892-36-676-2

6762

10

197-54-1646

1646

11

232-55-52-58

5258

12

898-84-336-6

3366

13

247-95-68-44

6844

14

415-48-894-3

8943

15

403-39-5-869

5869

16

399-83-05-03

0503

17

607-99-0411

0411

18

430-01-578-5

5785

19

571-09-6181

6181

20

478-32-02-87

0287

21

Not Specified

22

Not Specified

23

830-40-5287

5287

24

778-0845

Invalid Unique Id

25

368-44-4478

4478

Exercise 6

Get the count of users based up on country code.

  • Use users table for this exercise.

  • Output should contain country code and count.

  • There should be no + in the country code. It should only contain digits.

  • Data should be sorted as numbers by country code.

  • We should discard user_phone_no with null values.

  • Here is the desired output:

country_code

user_count

1

1

7

2

48

1

54

1

55

1

62

3

63

1

81

1

84

1

86

4

229

1

249

1

351

1

370

1

380

1

420

1

598

1

Exercise 7

Let us validate if we have invalid order_item_subtotal as part of order_items table.

  • order_items table have 6 fields.

    • order_item_id

    • order_item_order_id

    • order_item_product_id

    • order_item_quantity

    • order_item_subtotal

    • order_item_product_price

  • order_item_subtotal is nothing but product of order_item_quantity and order_item_product_price. It means order_item_subtotal is compute by multiplying order_item_quantity and order_item_product_price for each item.

  • You need to get the count of order_items where order_item_subtotal is not equal to the product of order_item_quantity and order_item_product_price.

  • There can be issues related to rounding off. Make sure it is taken care using appropriate function.

  • Output should be 0 as there are no such records.

count

0

Exercise 8

Get number of orders placed on weekdays and weekends in the month of January 2014.

  • orders have 4 fields

    • order_id

    • order_date

    • order_customer_id

    • order_status

  • Use order date to determine the day on which orders are placed.

  • Output should contain 2 columns - day_type and order_count.

  • day_type should have 2 values Week days and Weekend days.

  • Here is the desired output.

day_type

order_count

Weekend days

1505

Week days

4403