Durga Gadiraju

Exercise 05 – Develop word count program

Details – Duration 20 minutes

- Data is available in HDFS at /public/randomtextwriter
- Get word count for the input data using space as delimiter
- Number of executors should be 10
- Executor memory should be 3 GB
- Executor cores should be 20 in total (2 per executor)
- Number of output files should be 8
- Target Directory: /user/<YOUR_USER_ID>/solutions/solution05/wordcount
- Target …
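One possible way to approach these requirements is sketched below in Python. It is not the exercise's official solution. It assumes PySpark running on YARN, assumes the input can be read as plain text lines, and uses a tab-separated output format purely for illustration, since the excerpt cuts off before the remaining target details. The script name `wordcount.py` and all function names are hypothetical. `count_words_local` mirrors the same split-and-count logic without a cluster so the approach can be checked on a small sample.

```python
from collections import Counter
from operator import add

INPUT_PATH = "/public/randomtextwriter"  # input path from the exercise
NUM_OUTPUT_FILES = 8                     # required number of output files


def submit_command(script):
    """Build a spark-submit argument list with the exercise's resource settings.

    10 executors x 2 cores each = 20 cores in total.
    '--master yarn' is an assumption about the cluster manager.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        "--num-executors", "10",
        "--executor-memory", "3G",
        "--executor-cores", "2",
        script,
    ]


def count_words_local(lines):
    """Count words in an iterable of lines, splitting on a single space."""
    counts = Counter()
    for line in lines:
        counts.update(line.split(" "))
    return dict(counts)


def count_words_spark(sc, input_path, output_path):
    """Same logic on Spark; 'sc' is an existing SparkContext.

    Passing NUM_OUTPUT_FILES to reduceByKey sets the number of
    partitions, so saveAsTextFile writes that many part files.
    """
    counts = (
        sc.textFile(input_path)
        .flatMap(lambda line: line.split(" "))
        .map(lambda word: (word, 1))
        .reduceByKey(add, NUM_OUTPUT_FILES)
    )
    # Tab-separated "word<TAB>count" lines: an assumed output format.
    counts.map(lambda kv: kv[0] + "\t" + str(kv[1])).saveAsTextFile(output_path)
```

On a cluster, `count_words_spark` would be called with `INPUT_PATH` and the target directory from the exercise (with your own user ID substituted), and the script launched with the arguments returned by `submit_command("wordcount.py")`.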


Section 6:57. Overview of beeline – Alternative to Hive CLI

Beeline is a command-line interface (CLI) for Apache Hive, an open-source data warehouse system built on top of Hadoop. It is a JDBC-based alternative to the original Hive CLI: instead of running Hive directly, it connects to HiveServer2, which gives it a more flexible interface than the older client. One of the …
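As a rough illustration of how a Beeline session is typically started, the sketch below builds the command line in Python. The host name, user, and query are hypothetical placeholders, and port 10000 is assumed as the usual HiveServer2 default. The `-u`, `-n`, and `-e` options pass the JDBC URL, the user name, and a query to run, respectively.

```python
def beeline_command(host, database="default", user=None, query=None, port=10000):
    """Build a beeline argument list for a HiveServer2 JDBC connection.

    All argument values are illustrative; adjust them for a real cluster.
    """
    url = f"jdbc:hive2://{host}:{port}/{database}"
    cmd = ["beeline", "-u", url]
    if user:
        cmd += ["-n", user]   # connect as this user
    if query:
        cmd += ["-e", query]  # run one query and exit
    return cmd


# Hypothetical host and user, for illustration only.
example = beeline_command("hs2.example.com", user="alice", query="SHOW DATABASES;")
```

The resulting list can be passed to `subprocess.run` on a machine where `beeline` is installed.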
