As part of this class we have covered:
- Operations on Set
- Understanding reduce
- Aggregate functions such as sum, min, and max
- Revisiting groupBy
- Sorting data using sorted and sortBy
myReduce using loops
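The notes name myReduce but do not show its body, so the following is only a minimal sketch of how a reduce could be written with a loop. The signature (a List[Int] and a two-argument function) is an assumption made for illustration.

```scala
// Hypothetical myReduce: combines the elements pairwise, left to right,
// the way reduce does, but using an explicit loop and a mutable accumulator.
def myReduce(l: List[Int], f: (Int, Int) => Int): Int = {
  require(l.nonEmpty, "cannot reduce an empty list")
  var result = l.head           // start with the first element
  for (e <- l.tail) {           // fold every remaining element into the result
    result = f(result, e)
  }
  result
}

val nums = List(1, 2, 5, 6, 2, 3, 1)
val total = myReduce(nums, (a, b) => a + b)
val smallest = myReduce(nums, (a, b) => if (a < b) a else b)
```

Passing a different function changes the aggregation: addition gives a sum, and keeping the smaller of the two values gives a minimum.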
Sorting Data using sorted
- sorted sorts the data in the natural order of the elements in the collection
- An implicit Ordering must be available for the element type of the collection
val l = List(1, 2, 5, 6, 2, 3, 1)
l.sorted
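As a small illustration of the role Ordering plays, the sketch below sorts the same list twice: once with the implicit Ordering[Int] and once with an explicitly passed reversed ordering. The names ascending and descending are only for this example.

```scala
val l = List(1, 2, 5, 6, 2, 3, 1)

// Uses the implicit Ordering[Int] that Scala provides for Int
val ascending = l.sorted                          // List(1, 1, 2, 2, 3, 5, 6)

// Passing an Ordering explicitly changes the sort order
val descending = l.sorted(Ordering[Int].reverse)  // List(6, 5, 3, 2, 2, 1, 1)
```

sorted returns a new list; the original list l is left unchanged.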
Sorting Data using sortBy
Problem Statement: Sort orders data by order customer id (the 3rd field in each order record)
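A possible approach to this problem is sketched below. The records are made-up samples, and the comma-separated layout (order id, order date, order customer id, order status) is an assumption; only the position of the customer id comes from the problem statement. In class the records would be read from the orders file instead.

```scala
// Made-up sample records; assumed layout:
// order_id,order_date,order_customer_id,order_status
val orders = List(
  "1,2013-07-25 00:00:00.0,11599,CLOSED",
  "2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT",
  "3,2013-07-25 00:00:00.0,12111,COMPLETE",
  "4,2013-07-25 00:00:00.0,8827,CLOSED"
)

// sortBy takes a function that extracts the sort key from each element.
// The customer id is the 3rd field, so index 2; convert it to Int so the
// sort is numeric rather than alphabetical.
val sortedByCustomer = orders.sortBy(o => o.split(",")(2).toInt)
```

Without the toInt conversion the keys would be compared as strings, so "11599" would sort before "256".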
Exercises
- Sort data by product price in descending order
- Location: /data/retail_db/products/part-00000
- Price is the 5th element in each record
- Exclude the record with product_id 685
- Sort data by product category id in ascending order
- Location: /data/retail_db/products/part-00000
- Category id is the second element
- Sort data in ascending order by category id and in descending order by product price
- Location: /data/retail_db/products/part-00000
- Category id is the second element and product price is the 5th element
- Exclude the record with product_id 685
- Compute order revenue for each order id and sort data in descending order by order revenue
- Location: /data/retail_db/order_items/part-00000
- Order id is the second element and order item subtotal is the 5th element
- First compute revenue for each order id and then sort the data in descending order by revenue
- Output should have only order_id and computed revenue
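The two less obvious patterns in these exercises, sorting on a composite key and sorting an aggregated result, could be sketched as follows. All records are made-up samples, and the fields other than the ones named in the exercises are assumptions. Prices are parsed as BigDecimal here, which is a choice made for this sketch, not something the exercises require.

```scala
// Made-up product records; category id is the 2nd field, price the 5th
val products = List(
  "1,2,Product A,,59.98,img_a",
  "2,2,Product B,,129.99,img_b",
  "3,3,Product C,,89.99,img_c",
  "4,3,Product D,,199.99,img_d"
)

// Composite key: a tuple sorts by its first element, then its second.
// Negating the price turns the ascending sort into descending for that part.
val byCategoryThenPrice = products
  .filter(p => p.split(",")(0).toInt != 685)   // no-op on this sample
  .sortBy { p =>
    val f = p.split(",")
    (f(1).toInt, -BigDecimal(f(4)))
  }

// Made-up order_items records; order id is the 2nd field, subtotal the 5th
val orderItems = List(
  "1,1,957,1,299.98,299.98",
  "2,2,1073,1,199.99,199.99",
  "3,2,502,5,250.00,50.00",
  "4,2,403,1,129.99,129.99",
  "5,4,897,2,49.98,24.99"
)

// Step 1: extract (order_id, subtotal); step 2: group by order id and sum;
// step 3: sort the (order_id, revenue) pairs by revenue, descending.
val revenuePerOrder = orderItems
  .map { r => val f = r.split(","); (f(1).toInt, BigDecimal(f(4))) }
  .groupBy(_._1)
  .map { case (orderId, items) => (orderId, items.map(_._2).sum) }
  .toList
  .sortBy { case (_, revenue) => -revenue }
```

The result of groupBy is a Map, which has no defined order, so it is converted to a List before sorting.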