Running an application on a cluster using Spark 2

Follow these steps to run the application on the cluster:

  • Go to the root directory of the project.
  • Run sbt package. This compiles the project, including any new classes, and packages it into a jar file.
  • Use scp to copy the jar file to the remote cluster.
  • After copying the file, use ssh to connect to the cluster.
  • Run export SPARK_MAJOR_VERSION=2 (no spaces around the = sign), then enter the spark-submit command:

spark-submit --master yarn \
--deploy-mode client \
--conf spark.ui.port=12890 \
--class retail_db_df.GetRevenuePerOrderDF \
spark2demo_2.11-0.1.jar yarn-client retail_db_json/order_items user/itversity/GetRevenuePerOrderDf

  • Verify the data in HDFS.
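
The build, copy, and verification steps above could be scripted roughly as follows. This is a minimal sketch, not part of the original workflow. The project directory and remote login are hypothetical placeholders. The jar location under target/scala-2.11/ assumes sbt's default output directory for a Scala 2.11 build. The script only prints the commands unless DRY_RUN is set to 0, and the spark-submit step itself is still run by hand in the ssh session.

```shell
#!/usr/bin/env bash
# Sketch of the build, copy, and verify steps. Values marked "hypothetical"
# are placeholders, not values taken from this guide.
set -eu

PROJECT_DIR="${PROJECT_DIR:-./spark2demo}"        # hypothetical project path
REMOTE="${REMOTE:-user@gateway.example.com}"      # hypothetical cluster login
JAR="spark2demo_2.11-0.1.jar"                     # jar name from the spark-submit command
OUTPUT_DIR="user/itversity/GetRevenuePerOrderDf"  # output path from the spark-submit command
DRY_RUN="${DRY_RUN:-1}"                           # 1 = only print the commands

# Print a command; execute it only when DRY_RUN is 0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

deploy_steps() {
  run cd "$PROJECT_DIR"                         # go to the project
  run sbt package                               # compile and build the jar
  run scp "target/scala-2.11/$JAR" "$REMOTE:"   # copy the jar to the remote home directory
  run ssh "$REMOTE" "hdfs dfs -ls $OUTPUT_DIR"  # after spark-submit: list the output in HDFS
}

deploy_steps
```

Run it once with the default DRY_RUN=1 to review the commands, then again with DRY_RUN=0 and your own PROJECT_DIR and REMOTE values. The final listing is only meaningful after the spark-submit job has finished.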