Run Sample Oozie job

Here we will see how to run the default Map Reduce example job that ships with Oozie.

  • We can check the status of Oozie Server by running this command – oozie admin -oozie http://bigdataserver-3:11000/oozie -status
  • Oozie has several sub-commands for different purposes – job, admin, etc.
  • Create directory oozie_demo under home directory – /home/itversity
  • Copy the Oozie examples tar file that Cloudera provides by default to oozie_demo under the home directory – /home/itversity/oozie_demo.
  • Untar the examples tar file to get the sample Oozie job files.
  • Update the job.properties file with the NameNode and Resource Manager values, along with examplesRoot.
    • Get the nameNode URL from the fs.defaultFS property value in /etc/hadoop/conf/core-site.xml
    • Get the jobTracker URL from the yarn.resourcemanager.address property value in /etc/hadoop/conf/yarn-site.xml
    • Update job.properties at /home/itversity/oozie_demo/examples/apps/map-reduce
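
The lookup-and-update steps above can be sketched as two small shell helpers. This is only a sketch under stated assumptions: the function names get_hadoop_property and set_job_property are hypothetical, the XML lookup assumes each <value> sits on the same line as its <name> or on the line right after it, and the examplesRoot value in the usage comments is an assumption based on copying the whole oozie_demo directory to HDFS.

```shell
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = path to the XML file, $2 = property name (hypothetical helper).
# Assumes <value> is on the <name> line or on the line right after it.
get_hadoop_property() {
  grep -F -A1 "<name>$2</name>" "$1" \
    | sed -n 's|.*<value>\(.*\)</value>.*|\1|p' \
    | head -n 1
}

# Replace the value of an existing key in a properties file.
# $1 = properties file, $2 = key, $3 = new value (hypothetical helper).
# Assumes the value contains no "|" or "&" characters.
set_job_property() {
  sed "s|^$2=.*|$2=$3|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example usage with the paths from the steps above:
#   props=/home/itversity/oozie_demo/examples/apps/map-reduce/job.properties
#   set_job_property "$props" nameNode \
#     "$(get_hadoop_property /etc/hadoop/conf/core-site.xml fs.defaultFS)"
#   set_job_property "$props" jobTracker \
#     "$(get_hadoop_property /etc/hadoop/conf/yarn-site.xml yarn.resourcemanager.address)"
#   set_job_property "$props" examplesRoot oozie_demo/examples   # assumed value
```

Editing job.properties by hand works just as well; the helpers only make the step repeatable. Review the file after running them, since a key that is missing from the file is left untouched rather than added.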

Note: Make sure the user has an HDFS home directory (/user/<user-name>) before proceeding to the next steps.

  • Copy oozie_demo (which contains the examples directory) from /home/itversity to the user's HDFS home directory – hadoop fs -put oozie_demo /user/itversity
  • Run the Oozie job and note the job id
  • Use the job id to get the job status
  • A status of SUCCEEDED means the job completed successfully. If not, troubleshoot the Map Reduce job.
  • Validate the output data in the directory defined by the mapred.output.dir property in workflow.xml
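
As a rough sketch of the run and status steps above, the following wraps the oozie job sub-command in a function. The helper names (extract_job_id, extract_status, run_sample_job) are hypothetical, and the parsing assumes that the submit command prints a line starting with "job: " and that the -info output has a "Status : ..." line; check the actual output on your cluster before relying on it.

```shell
# Oozie server URL, taken from the status check earlier in this post.
OOZIE_URL=http://bigdataserver-3:11000/oozie

# Pull the job id out of the "job: <id>" line printed on submission.
extract_job_id() {
  sed -n 's/^job: *//p' | head -n 1
}

# Pull the status out of the "Status : <value>" line of the -info output.
extract_status() {
  awk -F' *: *' '$1 == "Status" {print $2; exit}'
}

# Submit the workflow described by a local job.properties file ($1),
# then print the job id and its current status.
run_sample_job() {
  job_id=$(oozie job -oozie "$OOZIE_URL" -config "$1" -run | extract_job_id)
  echo "Submitted job: $job_id"
  oozie job -oozie "$OOZIE_URL" -info "$job_id" | extract_status
}

# Example usage:
#   run_sample_job /home/itversity/oozie_demo/examples/apps/map-reduce/job.properties
```

The status printed right after submission is often still RUNNING, so the -info command may need to be repeated until the workflow finishes.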

https://gist.github.com/dgadiraju/4f5f068023fc432cfcc8df97874cc678

  • Now let us understand what happens when an Oozie job is submitted.
    • One or more Map Reduce jobs are created to run the Oozie Workflow itself
    • In addition to those, we will also see Map Reduce jobs for the actions the workflow submits.
  • To troubleshoot any issue, we need to look at both the Map Reduce jobs associated with the Oozie Workflow and those associated with its actions.
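
One possible way to find both kinds of Map Reduce jobs is to filter the YARN application list by the Oozie job id. The sketch below assumes that the YARN application names created by Oozie include the workflow job id and that the application id is the first column of the listing; the helper name apps_for_workflow is hypothetical.

```shell
# Read `yarn application -list` output on stdin and print the application
# ids of the rows that mention the given Oozie workflow job id ($1).
apps_for_workflow() {
  grep -F "$1" | awk '{print $1}'
}

# Example usage (replace <oozie-job-id> and <application-id> with real values):
#   yarn application -list -appStates ALL | apps_for_workflow <oozie-job-id>
#   yarn logs -applicationId <application-id>
```

If the filter returns nothing, fall back to scanning the full application list by submission time, since the naming assumption may not hold on every distribution.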
