Here we compute revenue per order using DataFrame operations: for each order in order_items, we sum order_item_subtotal across its items to get the order-level revenue.
- In Spark 2.0, we no longer need to create a sqlContext explicitly; SparkSession is the single entry point of the program.
- First, create the SparkSession object:
import org.apache.spark.sql.SparkSession
val spark = SparkSession
  .builder()
  .appName("SparkSQLExample")
  .master("local")
  .getOrCreate()
- Import the implicits, which enable the $"column" syntax: import spark.implicits._
- Read the JSON file:
- val order_items = spark.read.json("/file/path")
- Group by order id and sum the subtotals:
- order_items.groupBy($"order_item_order_id").sum("order_item_subtotal").show
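The steps above can be put together as a small standalone program. This is only a sketch under a few assumptions: the object name RevenuePerOrder, the helper revenuePerOrder, the output column name order_revenue, and the in-memory sample rows are all hypothetical and stand in for the real JSON file, which would be loaded with spark.read.json instead. Rounding to two decimals is also an added choice, not part of the original steps.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, round, sum}

object RevenuePerOrder {

  // Sum order_item_subtotal per order_item_order_id.
  // The alias "order_revenue" is a hypothetical name chosen for readability.
  def revenuePerOrder(orderItems: DataFrame): DataFrame =
    orderItems
      .groupBy(col("order_item_order_id"))
      .agg(round(sum(col("order_item_subtotal")), 2).alias("order_revenue"))
      .orderBy(col("order_item_order_id"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("SparkSQLExample")
      .master("local")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows (order id, item subtotal) used in place of the JSON file.
    val orderItems = Seq(
      (1, 299.98),
      (2, 199.99),
      (2, 250.0),
      (2, 129.99),
      (4, 49.98)
    ).toDF("order_item_order_id", "order_item_subtotal")

    revenuePerOrder(orderItems).show()

    spark.stop()
  }
}
```

Using agg with an alias instead of the bare sum("...") shortcut gives the result column a readable name rather than the default sum(order_item_subtotal); the grouping logic is otherwise the same.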