Exercises – Managing Database Objects using Postgresql

Exercises – Managing Database Objects

This exercise is primarily to assess your capabilities related to put all important DDL concepts in practice by coming up with solution for a typical data migration problem from one database (mysql) to another (postgres).

  • Here are the high level steps for database migration from one type of database to another type of database.
    • Extract DDL Statements from source database (MySQL).
    • Extract the data in the form of delimited files and ship them to target database.
    • Refactor scripts as per target database (Postgres).
    • Create tables in the target database.
    • Execute pre-migration steps (disable constraints, drop indexes etc).
    • Load the data using native utilities.
    • Execute post-migration steps (enable constraints, create or rebuild indexes, reset sequences etc).
    • Sanity checks with basic queries.
    • Make sure all the impacted applications are validated thoroughly.
  • We have scripts and data set available in our GitHub repository. If you are using our environment the repository is already cloned under /data/retail_db.
  • It have scripts to create tables with primary keys. Those scripts are generated from MySQL tables and refactored for Postgres.
    • Script to create tables: create_db_tables_pg.sql
    • Load data into tables: load_db_tables_pg.sql
  • Here are the steps you need to perform to take care of this exercise.
    • Create tables
    • Load data
    • All the tables have surrogate primary keys. Here are the details.
      • orders.order_id
      • order_items.order_item_id
      • customers.customer_id
      • products.product_id
      • categories.category_id
      • departments.department_id
    • Get the maximum value from all surrogate primary key fields.
    • Create sequences for all surrogate primary key fields using maximum value. Make sure to use standard naming conventions for sequences.
    • Ensure sequences are mapped to the surrogate primary key fields.
    • Create foreign key constraints based up on this information.
      • orders.order_customer_id to customers.customer_id
      • order_items.order_item_order_id to orders.order_id
      • order_items.order_item_product_id to products.product_id
      • products.product_category_id to categories.category_id
      • categories.category_department_id to departments.department_id
    • Insert few records in departments to ensure that sequence generated numbers are used for department_id.
  • Here are the commands to launch psql and run scripts to create tables as well as load data into tables.
psql -U itversity_retail_user \
  -h localhost \
  -p 5432 \
  -d itversity_retail_db \
  -W

\i /data/retail_db/create_db_tables_pg.sql

\i /data/retail_db/load_db_tables_pg.sql
  • We use this approach of creating tables, loading data and then adding constraints as well as resetting sequences for large volume data migrations from one database to another database.
  • Here are the commands or queries you need to come up with to solve this problem.

Exercise 1

Queries to get maximum values from surrogate primary keys.

Exercise 2

Commands to add sequences with START WITH pointing to the maximum value for the corresponding surrogate primary key fields. Make sure to use meaningful names to sequences TABLENAME_SURROGATEFIELD_seq (example: users_user_id_seq for users.user_id)

Exercise 3

Commands to alter sequences to bind them to corresponding surrogate primary key fields.

Exercise 4

Add Foreign Key constraints to the tables.

  • Validate if the tables have data violataing foreign key constraints (Hint: You can use left outer join to find rows in child table but not in parent table)
  • Alter tables to add foreign keys as specified.
  • Here are the relationships for your reference.
    • orders.order_customer_id to customers.customer_id
    • order_items.order_item_order_id to orders.order_id
    • order_items.order_item_product_id to products.product_id
    • products.product_category_id to categories.category_id
    • categories.category_department_id to departments.department_id
    • There might be data in child table, which might not be in parent table. You should figure out as part of the validations.
    • If there are foreign key violations, make sure to update the data in the child table’s foreign key column to null values.
  • Solution should contain the following:
    • Commands to add foreign keys to the tables.

Exercise 5

Queries to validate whether constraints are created or not. You can come up with queries against information_schema tables such as columns, sequences etc.

Share this post