Exercises – Managing Database Objects

This exercise is primarily to assess your capabilities related to put all important DDL concepts in practice by coming up with solution for a typical data migration problem from one database (mysql) to another (postgres).

  • Here are the high level steps for database migration from one type of database to another type of database.

    • Extract DDL Statements from source database (MySQL).

    • Extract the data in the form of delimited files and ship them to target database.

    • Refactor scripts as per target database (Postgres).

    • Create tables in the target database.

    • Execute pre-migration steps (disable constraints, drop indexes etc).

    • Load the data using native utilities.

    • Execute post-migration steps (enable constraints, create or rebuild indexes, reset sequences etc).

    • Sanity checks with basic queries.

    • Make sure all the impacted applications are validated thoroughly.

  • We have scripts and data set available in our GitHub repository. If you are using our environment the repository is already cloned under /data/retail_db.

  • It have scripts to create tables with primary keys. Those scripts are generated from MySQL tables and refactored for Postgres.

    • Script to create tables: create_db_tables_pg.sql

    • Load data into tables: load_db_tables_pg.sql

  • Here are the steps you need to perform to take care of this exercise.

    • Create tables

    • Load data

    • All the tables have surrogate primary keys. Here are the details.

      • orders.order_id

      • order_items.order_item_id

      • customers.customer_id

      • products.product_id

      • categories.category_id

      • departments.department_id

    • Get the maximum value from all surrogate primary key fields.

    • Create sequences for all surrogate primary key fields using maximum value. Make sure to use standard naming conventions for sequences.

    • Ensure sequences are mapped to the surrogate primary key fields.

    • Create foreign key constraints based up on this information.

      • orders.order_customer_id to customers.customer_id

      • order_items.order_item_order_id to orders.order_id

      • order_items.order_item_product_id to products.product_id

      • products.product_category_id to categories.category_id

      • categories.category_department_id to departments.department_id

    • Insert few records in departments to ensure that sequence generated numbers are used for department_id.

  • Here are the commands to launch psql and run scripts to create tables as well as load data into tables.

psql -U itversity_retail_user 
  -h localhost 
  -p 5432 
  -d itversity_retail_db 
  -W

i /data/retail_db/create_db_tables_pg.sql

i /data/retail_db/load_db_tables_pg.sql

Copy to clipboard

  • We use this approach of creating tables, loading data and then adding constraints as well as resetting sequences for large volume data migrations from one database to another database.

  • Here are the commands or queries you need to come up with to solve this problem.

Exercise 1

Queries to get maximum values from surrogate primary keys.

Exercise 2

Commands to add sequences with START WITH pointing to the maximum value for the corresponding surrogate primary key fields. Make sure to use meaningful names to sequences TABLENAME_SURROGATEFIELD_seq (example: users_user_id_seq for users.user_id)

Exercise 3

Commands to alter sequences to bind them to corresponding surrogate primary key fields.

Exercise 4

Add Foreign Key constraints to the tables.

  • Validate if the tables have data violataing foreign key constraints (Hint: You can use left outer join to find rows in child table but not in parent table)

  • Alter tables to add foreign keys as specified.

  • Here are the relationships for your reference.

    • orders.order_customer_id to customers.customer_id

    • order_items.order_item_order_id to orders.order_id

    • order_items.order_item_product_id to products.product_id

    • products.product_category_id to categories.category_id

    • categories.category_department_id to departments.department_id

  • Solution should contain the following:

    • Commands to add foreign keys to the tables.

Exercise 5

Queries to validate whether constraints are created or not. You can come up with queries against information_schema tables such as columnssequences etc.