Section 7:72. Resources and Exercises

Let us use NYSE data and see how we can create tables in Hive.

  1. Data Location (Local): /data/nyse-all/nyse-data
  2. Create a database with the name – YOUR_OS_USER_NAME_nyse
  3. Table Name: nyse_eod
  4. File Format: TEXTFILE (default)
  5. Review the files by running Linux commands before using the data sets. The data is compressed, and we can load the files as is.
  6. Copy one of the zip files to your home directory and preview the data. There should be 7 fields; you need to determine the delimiter.
  7. Field Names: stockticker, tradedate, openprice, highprice, lowprice, closeprice, volume
  8. Determine the correct data types based on the values.
  9. Create a managed table with the default Hive delimiter.
  10. As the delimiters in the data and the table are not the same, you need to figure out how to get the data into the target table.
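
The preview steps above (finding the delimiter and picking data types) can also be sketched in Python. This is only a rough aid under stated assumptions, not part of the exercise solution: the helper names `guess_delimiter` and `guess_hive_type` are hypothetical, the candidate delimiter list is a guess, and the two sample rows are made up for illustration, not taken from the NYSE files.

```python
# Candidate delimiters to try (an assumption); "\x01" (Ctrl-A) is Hive's default field delimiter.
CANDIDATES = [",", "\t", "|", ";", "\x01"]


def guess_delimiter(lines, expected_fields=7):
    """Return the first candidate that splits every line into expected_fields parts, else None."""
    if not lines:
        return None
    for delim in CANDIDATES:
        if all(len(line.rstrip("\n").split(delim)) == expected_fields for line in lines):
            return delim
    return None


def guess_hive_type(values):
    """Rough type guess for one column: BIGINT if all values parse as int,
    DOUBLE if all parse as float, otherwise STRING."""
    try:
        for value in values:
            int(value)
        return "BIGINT"
    except ValueError:
        pass
    try:
        for value in values:
            float(value)
        return "DOUBLE"
    except ValueError:
        return "STRING"


# Made-up sample rows (hypothetical values, 7 fields each).
sample = [
    "ABC,20200102,10.5,11.0,10.1,10.8,12000",
    "XYZ,20200102,5.25,5.5,5.0,5.4,3400",
]

delimiter = guess_delimiter(sample)
columns = list(zip(*(row.split(delimiter) for row in sample)))
types = [guess_hive_type(column) for column in columns]
print(repr(delimiter), types)
```

In practice the lines would come from the first few rows of a decompressed data file rather than a hard-coded list. The guessed types are only a starting point: a date stored as digits is reported as BIGINT here, and you may still prefer INT, FLOAT, or STRING for some columns after looking at the actual values.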

Validation

Run the following queries to ensure that you will be able to read the data.
