Solutions – Managing Database Objects in Postgresql

Managing Database Objects

This exercise is primarily to assess your capabilities related to put all important DDL concepts in practice by coming up with solution for a typical data migration problem from one database (mysql) to another (postgres).

  • Here are the high level steps for database migration from one type of database to another type of database.
    • Extract DDL Statements from source database (MySQL).
    • Extract the data in the form of delimited files and ship them to target database.
    • Refactor scripts as per target database (Postgres).
    • Create tables in the target database.
    • Execute pre-migration steps (disable constraints, drop indexes etc).
    • Load the data using native utilities.
    • Execute post-migration steps (enable constraints, create or rebuild indexes, reset sequences etc).
    • Sanity checks with basic queries.
    • Make sure all the impacted applications are validated thoroughly.
  • We have scripts and data set available in our GitHub repository. If you are using our environment the repository is already cloned under /data/retail_db.
  • It have scripts to create tables with primary keys. Those scripts are generated from MySQL tables and refactored for Postgres.
    • Script to create tables: create_db_tables_pg.sql
    • Load data into tables: load_db_tables_pg.sql
  • Here are the steps you need to perform to take care of this exercise.
    • Create tables
    • Load data
    • All the tables have surrogate primary keys. Here are the details.
      • orders.order_id
      • order_items.order_item_id
      • customers.customer_id
      • products.product_id
      • categories.category_id
      • departments.department_id
    • Get the maximum value from all surrogate primary key fields.
    • Create sequences for all surrogate primary key fields using maximum value. Make sure to use standard naming conventions for sequences.
    • Ensure sequences are mapped to the surrogate primary key fields.
    • Create foreign key constraints based up on this information.
      • orders.order_customer_id to customers.customer_id
      • order_items.order_item_order_id to orders.order_id
      • order_items.order_item_product_id to products.product_id
    • Insert few records in departments to ensure that sequence generated numbers are used for department_id.
  • Here are the commands to launch psql and run scripts to create tables as well as load data into tables.
psql -U itversity_retail_user \
  -h localhost \
  -p 5432 \
  -d itversity_retail_db \
  -W
DROP TABLE IF EXISTS order_items CASCADE;
DROP TABLE IF EXISTS orders CASCADE;
DROP TABLE IF EXISTS customers CASCADE;
DROP TABLE IF EXISTS products CASCADE;
DROP TABLE IF EXISTS categories CASCADE;
DROP TABLE IF EXISTS departments CASCADE;

\i /data/retail_db/create_db_tables_pg.sql

\i /data/retail_db/load_db_tables_pg.sql
  • We use this approach of creating tables, loading data and then adding constraints as well as resetting sequences for large volume data migrations from one database to another database.
  • Here are the commands or queries you need to come up with to solve this problem.
In [1]:
%load_ext sql
In [2]:
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db

Exercise 1

Queries to get maximum values from surrogate primary keys.

In [4]:
%%sql

SELECT max(category_id) FROM categories;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[4]:
max
58
In [5]:
%%sql

SELECT max(customer_id) FROM customers;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[5]:
max
12435
In [6]:
%%sql

SELECT max(department_id) FROM departments;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[6]:
max
7
In [7]:
%%sql

SELECT max(order_item_id) FROM order_items;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[7]:
max
172198
In [8]:
%%sql

SELECT max(order_id) FROM orders;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[8]:
max
68883
In [9]:
%%sql

SELECT max(product_id) FROM products;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
1 rows affected.
Out[9]:
max
1345

Exercise 2

Commands to add sequences with START WITH pointing to the maximum value for the corresponding surrogate primary key fields. Make sure to use meaningful names to sequences TABLENAME_SURROGATEFIELD_seq (example: users_user_id_seq for users.user_id)

In [10]:
%%sql

CREATE SEQUENCE categories_category_id_seq
START WITH 59
INCREMENT BY 1;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[10]:
[]
In [11]:
%%sql

CREATE SEQUENCE customers_customer_id_seq
START WITH 12436
INCREMENT BY 1;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[11]:
[]
In [12]:
%%sql

CREATE SEQUENCE departments_department_id_seq
START WITH 8
INCREMENT BY 1;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[12]:
[]
In [13]:
%%sql

CREATE SEQUENCE orders_order_id_seq
START WITH 68884
INCREMENT BY 1;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[13]:
[]
In [15]:
%%sql

CREATE SEQUENCE order_items_order_item_id_seq
START WITH 172199
INCREMENT BY 1;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[15]:
[]
In [17]:
%%sql

CREATE SEQUENCE products_product_id_seq
START WITH 1346
INCREMENT BY 1;
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[17]:
[]

Exercise 3

Commands to alter sequences to bind them to corresponding surrogate primary key fields.

In [18]:
%%sql

ALTER TABLE categories 
    ALTER COLUMN category_id 
        SET DEFAULT nextval('categories_category_id_seq');
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[18]:
[]
In [19]:
%%sql

ALTER TABLE customers 
    ALTER COLUMN customer_id 
        SET DEFAULT nextval('customers_customer_id_seq');
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[19]:
[]
In [20]:
%%sql

ALTER TABLE departments 
    ALTER COLUMN department_id
        SET DEFAULT nextval('departments_department_id_seq');
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[20]:
[]
In [21]:
%%sql

ALTER TABLE orders 
    ALTER COLUMN order_id 
        SET DEFAULT nextval('orders_order_id_seq');
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[21]:
[]
In [22]:
%%sql

ALTER TABLE order_items 
    ALTER COLUMN order_item_id 
        SET DEFAULT nextval('order_items_order_item_id_seq');
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[22]:
[]
In [23]:
%%sql

ALTER TABLE products 
    ALTER COLUMN product_id 
        SET DEFAULT nextval('products_product_id_seq');
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[23]:
[]

Exercise 4

Add Foreign Key constraints to the tables.

  • Validate if the tables have data violataing foreign key constraints (Hint: You can use left outer join to find rows in child table but not in parent table)
  • Alter tables to add foreign keys as specified.
  • Here are the relationships for your reference.
    • orders.order_customer_id to customers.customer_id
    • order_items.order_item_order_id to orders.order_id
    • order_items.order_item_product_id to products.product_id
  • Solution should contain the following:
    • Commands to add foreign keys to the tables.
In [24]:
%%sql

ALTER TABLE orders
ADD CONSTRAINT orders_customers_fk 
    FOREIGN KEY (order_customer_id)
    REFERENCES customers(customer_id);
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[24]:
[]
In [25]:
%%sql

ALTER TABLE order_items
ADD CONSTRAINT order_items_orders_fk 
    FOREIGN KEY (order_item_order_id)
    REFERENCES orders(order_id);
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[25]:
[]
In [26]:
%%sql

ALTER TABLE order_items
ADD CONSTRAINT order_items_products_fk 
    FOREIGN KEY (order_item_product_id)
    REFERENCES products(product_id);
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
Done.
Out[26]:
[]

Exercise 5

Queries to validate whether constraints are created or not. You can come up with queries against information_schema tables such as columns, sequences etc.

In [35]:
%%sql

SELECT * FROM information_schema.sequences LIMIT 10
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
8 rows affected.
Out[35]:
sequence_catalog sequence_schema sequence_name data_type numeric_precision numeric_precision_radix numeric_scale start_value minimum_value maximum_value increment cycle_option
itversity_retail_db public users_user_id_seq integer 32 2 0 1 1 2147483647 1 NO
itversity_retail_db public user_logins_user_login_id_seq integer 32 2 0 1 1 2147483647 1 NO
itversity_retail_db public categories_category_id_seq bigint 64 2 0 59 1 9223372036854775807 1 NO
itversity_retail_db public customers_customer_id_seq bigint 64 2 0 12436 1 9223372036854775807 1 NO
itversity_retail_db public departments_department_id_seq bigint 64 2 0 8 1 9223372036854775807 1 NO
itversity_retail_db public orders_order_id_seq bigint 64 2 0 68884 1 9223372036854775807 1 NO
itversity_retail_db public order_items_order_item_id_seq bigint 64 2 0 172199 1 9223372036854775807 1 NO
itversity_retail_db public products_product_id_seq bigint 64 2 0 1346 1 9223372036854775807 1 NO
In [36]:
%%sql

SELECT * FROM information_schema.columns
WHERE table_name = 'orders';
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
4 rows affected.
Out[36]:
table_catalog table_schema table_name column_name ordinal_position column_default is_nullable data_type character_maximum_length character_octet_length numeric_precision numeric_precision_radix numeric_scale datetime_precision interval_type interval_precision character_set_catalog character_set_schema character_set_name collation_catalog collation_schema collation_name domain_catalog domain_schema domain_name udt_catalog udt_schema udt_name scope_catalog scope_schema scope_name maximum_cardinality dtd_identifier is_self_referencing is_identity identity_generation identity_start identity_increment identity_maximum identity_minimum identity_cycle is_generated generation_expression is_updatable
itversity_retail_db public orders order_id 1 nextval(‘orders_order_id_seq’::regclass) NO integer None None 32 2 0 None None None None None None None None None None None None itversity_retail_db pg_catalog int4 None None None None 1 NO NO None None None None None NO NEVER None YES
itversity_retail_db public orders order_date 2 None NO timestamp without time zone None None None None None 6 None None None None None None None None None None None itversity_retail_db pg_catalog timestamp None None None None 2 NO NO None None None None None NO NEVER None YES
itversity_retail_db public orders order_customer_id 3 None NO integer None None 32 2 0 None None None None None None None None None None None None itversity_retail_db pg_catalog int4 None None None None 3 NO NO None None None None None NO NEVER None YES
itversity_retail_db public orders order_status 4 None NO character varying 45 180 None None None None None None None None None None None None None None None itversity_retail_db pg_catalog varchar None None None None 4 NO NO None None None None None NO NEVER None YES
In [39]:
%%sql

SELECT * FROM information_schema.table_constraints
WHERE table_name = 'orders'
 * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db
6 rows affected.
Out[39]:
constraint_catalog constraint_schema constraint_name table_catalog table_schema table_name constraint_type is_deferrable initially_deferred enforced
itversity_retail_db public orders_pkey itversity_retail_db public orders PRIMARY KEY NO NO YES
itversity_retail_db public orders_customers_fk itversity_retail_db public orders FOREIGN KEY NO NO YES
itversity_retail_db public 2200_25581_1_not_null itversity_retail_db public orders CHECK NO NO YES
itversity_retail_db public 2200_25581_2_not_null itversity_retail_db public orders CHECK NO NO YES
itversity_retail_db public 2200_25581_3_not_null itversity_retail_db public orders CHECK NO NO YES
itversity_retail_db public 2200_25581_4_not_null itversity_retail_db public orders CHECK NO NO YES
In [ ]:
 

Share this post