Month: February 2023

Section 7:64. Loading the Data into Hive Tables – Overwrite vs Append

When loading data into a Hive table with LOAD DATA INPATH or LOAD DATA LOCAL INPATH, you can either overwrite the table's existing data or append to it, as with INSERT statements. To overwrite the existing data, add the OVERWRITE keyword to the LOAD DATA statement. For example: LOAD …
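As a minimal sketch of the two forms described above, the Python helper below assembles a LOAD DATA statement as a string. The helper name build_load_statement, the path /tmp/orders and the table name orders are hypothetical and chosen only for illustration; the generated text follows the standard HiveQL form LOAD DATA [LOCAL] INPATH '...' [OVERWRITE] INTO TABLE ....

```python
def build_load_statement(path, table, local=False, overwrite=False):
    """Assemble a HiveQL LOAD DATA statement as a plain string.

    path and table are placeholders supplied by the caller; this helper
    only builds the text and does not connect to Hive or validate input.
    """
    parts = ["LOAD DATA"]
    if local:
        # LOCAL reads from the client's local file system instead of HDFS.
        parts.append("LOCAL")
    parts.append(f"INPATH '{path}'")
    if overwrite:
        # OVERWRITE replaces the table's existing data; without it,
        # the loaded files are added to what is already there.
        parts.append("OVERWRITE")
    parts.append(f"INTO TABLE {table}")
    return " ".join(parts)


# Hypothetical path and table name, for illustration only.
append_stmt = build_load_statement("/tmp/orders", "orders")
overwrite_stmt = build_load_statement("/tmp/orders", "orders", overwrite=True)
print(append_stmt)
print(overwrite_stmt)
```

The only difference between the two printed statements is the OVERWRITE keyword, which is the whole of the append-versus-overwrite choice at load time.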


Section 7:62. Loading the Data into Hive Table from Local file system

To load data from a local file system into a Hive table, you can use the following steps:
1. Create an external table in Hive with the appropriate schema to match your data. You can do this using the CREATE EXTERNAL TABLE statement.
2. Move your data files to a location that can be accessed by the …


Exercise 05 – Develop word count program

Details – Duration 20 minutes
- Data is available in HDFS /public/randomtextwriter
- Get word count for the input data using space as delimiter
- Number of executors should be 10
- Executor memory should be 3 GB
- Executor cores should be 20 in total (2 per executor)
- Number of output files should be 8
- Target Directory: /user/<YOUR_USER_ID>/solutions/solution05/wordcount
- Target …
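The requirements above can be sketched in plain Python. This is not the exercise solution or a Spark job: count_words only mirrors the counting logic locally, and spark_submit_args only assembles a command line. The function names, the script name wordcount.py, the choice of --master yarn and the decision to drop empty tokens are assumptions made for illustration; --num-executors, --executor-memory and --executor-cores are standard spark-submit options.

```python
from collections import Counter


def count_words(lines):
    """Count words in an iterable of lines, splitting on a single space."""
    counts = Counter()
    for line in lines:
        # Split on " " as the exercise asks. Dropping the empty tokens that
        # consecutive spaces produce is an assumption, not a stated rule.
        counts.update(word for word in line.split(" ") if word)
    return counts


def spark_submit_args(script, num_executors=10, executor_memory="3G",
                      executor_cores=2):
    """Build a spark-submit argument list for the stated resources.

    10 executors with 2 cores each gives the 20 cores in total that the
    exercise requires. Running on YARN is an assumption.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,
        "--executor-cores", str(executor_cores),
        script,
    ]


# Hypothetical script name, for illustration only.
args = spark_submit_args("wordcount.py")
sample_counts = count_words(["a b a", "b  c"])
print(" ".join(args))
print(sample_counts)
```

In an actual Spark job, the number of output files would usually follow from the number of partitions at the final stage (8 here), for example by passing a partition count to the aggregation step or coalescing before the write.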
