Durga Gadiraju

Section.12.136. Overview of Analytics or Windowing Functions

Windowing functions in Hive are defined using the OVER() clause, which specifies the window, or subset of rows, over which the function operates. The window frame can be bounded with ROWS BETWEEN or RANGE BETWEEN, using endpoints such as UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING. Commonly used analytics or windowing functions in Hive include: ROW_NUMBER() …
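
As a rough illustration of the OVER() clause, here is a minimal sketch that runs a windowed query from Python. It assumes SQLite 3.25 or later as a stand-in for Hive (the same OVER syntax is accepted by both), and the emp table and its rows are made up for the example.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Hive; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("a", "IT", 300), ("b", "IT", 200), ("c", "HR", 150), ("d", "HR", 100)],
)

# ROW_NUMBER() ranks rows inside each dept; SUM() uses an explicit frame
# running from the first row of the partition up to the current row.
query = """
SELECT name, dept, salary,
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn,
       SUM(salary) OVER (
           PARTITION BY dept ORDER BY salary DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM emp
ORDER BY dept, rn
"""
window_rows = conn.execute(query).fetchall()
for row in window_rows:
    print(row)
```

Each output row keeps its own detail columns and gains the rank and running total for its department, which is the main difference from a GROUP BY aggregate.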

Section.12.135. Prepare-hr-database-with-employees-table

As part of this topic we see how to create a database and then the table employees to write queries against it. We will primarily use employees for Windowing Functions in Hive.
• Create database YOUR-OS-USERNAME-hr
• Create table employees using it as delimiter
• Load data from /data/hr_db/employees into newly created table
• Validate by running …

Section 10:119.Sorting Data within Groups Using DISTRIBUTE BY and SORT BY

In Hive, DISTRIBUTE BY and SORT BY together sort data within each group based on one or more columns: DISTRIBUTE BY routes rows with the same key to the same reducer, and SORT BY orders the rows within each reducer. This is useful when you want to group your data and sort it within each group to get more meaningful insights. Use the SELECT statement with DISTRIBUTE …
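
The effect of DISTRIBUTE BY followed by SORT BY can be sketched in plain Python. This is only a simulation of the idea, not Hive's actual partitioner: the function name, the row layout (department_id, name, salary), and the use of Python's hash() with two reducers are all assumptions made for illustration.

```python
from collections import defaultdict

def distribute_and_sort(rows, distribute_key, sort_key, num_reducers=2):
    """Simulate DISTRIBUTE BY (hash partitioning) then SORT BY (per-reducer sort)."""
    reducers = defaultdict(list)
    for row in rows:
        # Rows with the same key always land on the same simulated reducer.
        reducers[hash(distribute_key(row)) % num_reducers].append(row)
    # Each reducer sorts only its own rows; there is no global ordering.
    return {rid: sorted(part, key=sort_key) for rid, part in reducers.items()}

# Hypothetical rows: (department_id, name, salary)
rows = [(1, "a", 100), (2, "b", 300), (1, "c", 250), (3, "d", 50), (2, "e", 120)]

# Roughly corresponds to (column names are assumptions):
#   SELECT * FROM employees
#   DISTRIBUTE BY department_id SORT BY department_id, salary DESC
result = distribute_and_sort(
    rows,
    distribute_key=lambda r: r[0],
    sort_key=lambda r: (r[0], -r[2]),
)
for reducer_id, part in sorted(result.items()):
    print(reducer_id, part)
```

Note that each reducer's output is sorted on its own, so the combined result is ordered within groups but not across the whole dataset.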

Section 10:117.Performing Basic Aggregations Sum, Min, Max Using Group by

The HAVING clause in Hive filters the results of a query based on a condition applied to an aggregate function result. It is used with the GROUP BY clause and filters the groups based on the condition specified after the HAVING keyword. To perform basic aggregations using …
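
A small sketch of HAVING filtering grouped results, run from Python. SQLite is assumed here as a stand-in for Hive, and the orders table and its values are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Hive; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("x", 50), ("x", 70), ("y", 20), ("z", 200)],
)

# WHERE would filter individual rows before grouping;
# HAVING filters whole groups after SUM(amount) is computed.
having_rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 100
    ORDER BY customer
    """
).fetchall()
print(having_rows)
```

Customer y is dropped because its group total does not satisfy the HAVING condition, even though the row itself was never filtered out.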

Section 10:116.Performing Basic Aggregations Sum, Min, Max Using Group by

Performing basic aggregations using GROUP BY in Hive is a common task in data analysis and data processing. Hive is a data warehousing tool that provides a SQL-like interface to query and analyze large datasets stored in the Hadoop Distributed File System (HDFS). To perform basic aggregations using GROUP BY in Hive, you can follow these …
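
For a concrete picture of SUM, MIN, and MAX with GROUP BY, the following sketch runs one grouped query from Python. It assumes SQLite as a stand-in for Hive, and the emp table and its rows are invented for the example.

```python
import sqlite3

# Assumption: SQLite stands in for Hive; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("IT", 300), ("IT", 200), ("HR", 150), ("HR", 100)],
)

# One output row per dept, with each aggregate computed over that dept's rows.
group_rows = conn.execute(
    """
    SELECT dept, SUM(salary), MIN(salary), MAX(salary)
    FROM emp
    GROUP BY dept
    ORDER BY dept
    """
).fetchall()
print(group_rows)
```

Every column in the SELECT list is either the grouping column or an aggregate, which is the shape a GROUP BY query is expected to have.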
