Run Sample Oozie job

Here we will see how to run the default Map Reduce example job using Oozie.

We can check the status of the Oozie server by running this command – oozie admin -oozie http://bigdataserver-3:11000/oozie -status
Oozie has several sub-commands for different purposes, such as job and admin.
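
The status check above can be wrapped in a small script. This is a minimal sketch, assuming the Oozie client is installed on the node where you run it. The function name check_oozie_status is our own, not part of Oozie. A healthy server normally reports "System mode: NORMAL".

```shell
# Oozie URL taken from the step above; adjust the host for your cluster.
OOZIE_URL="http://bigdataserver-3:11000/oozie"

# Hypothetical helper: print the server status, or a hint if the
# Oozie client is not installed on this machine.
check_oozie_status() {
  if command -v oozie >/dev/null 2>&1; then
    oozie admin -oozie "$OOZIE_URL" -status
  else
    echo "oozie client not found on PATH"
  fi
}

check_oozie_status

# The Oozie CLI reads OOZIE_URL from the environment as the default
# for -oozie, so exporting it lets later commands omit that option.
export OOZIE_URL
```

With OOZIE_URL exported, a later call such as oozie admin -status should reach the same server without repeating the URL.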
Create a directory named oozie_demo under the home directory – /home/itversity
Copy the Oozie examples tar file that Cloudera provides by default to oozie_demo under the home directory – /home/itversity/oozie_demo.
Untar the examples tar file to get the sample Oozie job files.
Update the job.properties file with the NameNode and Resource Manager values, along with examplesRoot.
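
The directory setup above could look like the following sketch. The tarball path is an assumption, because its location varies by Cloudera version and install type, so locate yours first. The names EXAMPLES_TARBALL, DEMO_DIR and setup_oozie_demo are our own.

```shell
# Assumption: path of the examples tarball. Locate yours with
#   find / -name 'oozie-examples.tar.gz' 2>/dev/null
EXAMPLES_TARBALL="${EXAMPLES_TARBALL:-/usr/share/doc/oozie/oozie-examples.tar.gz}"
DEMO_DIR="${DEMO_DIR:-$HOME/oozie_demo}"

# Hypothetical helper: create the demo directory, copy the tarball
# into it, and unpack it there.
# $1 = tarball path, $2 = demo directory
setup_oozie_demo() {
  mkdir -p "$2" || return 1
  cp "$1" "$2/" || return 1
  tar -xzf "$2/$(basename "$1")" -C "$2"
}

if [ -f "$EXAMPLES_TARBALL" ]; then
  setup_oozie_demo "$EXAMPLES_TARBALL" "$DEMO_DIR"
  # The map-reduce example should now be visible.
  ls "$DEMO_DIR/examples/apps/map-reduce"
else
  echo "Tarball not found at $EXAMPLES_TARBALL; set EXAMPLES_TARBALL first"
fi
```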
- Get the nameNode URL from /etc/hadoop/conf/core-site.xml by copying the value of the property fs.defaultFS
- Get the jobTracker URL from /etc/hadoop/conf/yarn-site.xml by copying the value of the property yarn.resourcemanager.address
- Update job.properties, located in /home/itversity/oozie_demo/examples/apps/map-reduce
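
Instead of copying the values by hand, the lookup and update can be scripted. This is a rough sketch that assumes the usual Hadoop layout, where each property has a name element followed by a value element. The helpers get_hadoop_property and set_property are our own names. examplesRoot is left for you to set, since its value depends on where the examples end up in HDFS.

```shell
CORE_SITE=/etc/hadoop/conf/core-site.xml
YARN_SITE=/etc/hadoop/conf/yarn-site.xml
JOB_PROPS=/home/itversity/oozie_demo/examples/apps/map-reduce/job.properties

# Hypothetical helper: print the <value> that follows <name>$2</name> in file $1.
get_hadoop_property() {
  grep -F -A 2 "<name>$2</name>" "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
    | head -n 1
}

# Hypothetical helper: replace the line "key=..." in properties file $1.
# $2 = key, $3 = new value (must not contain the | character)
set_property() {
  sed "s|^$2=.*|$2=$3|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

if [ -f "$CORE_SITE" ] && [ -f "$YARN_SITE" ] && [ -f "$JOB_PROPS" ]; then
  set_property "$JOB_PROPS" nameNode \
    "$(get_hadoop_property "$CORE_SITE" fs.defaultFS)"
  set_property "$JOB_PROPS" jobTracker \
    "$(get_hadoop_property "$YARN_SITE" yarn.resourcemanager.address)"
  # Review the result before submitting the job.
  grep -E '^(nameNode|jobTracker|examplesRoot)=' "$JOB_PROPS"
fi
```

Editing job.properties in a text editor works just as well. The script only saves retyping host names and ports.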

Note: Make sure the user has an HDFS home directory (/user/<user-name>) before proceeding to the next steps.
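
One way to check the note above is sketched here. It assumes the hadoop client is on the PATH and that the HDFS superuser account is named hdfs, which is common but not guaranteed. HDFS_USER and hdfs_home_dir are our own names.

```shell
# Assumption: the user from this walkthrough.
HDFS_USER="${HDFS_USER:-itversity}"

# Hypothetical helper: build the HDFS home directory path for a user.
hdfs_home_dir() {
  printf '/user/%s\n' "$1"
}

if command -v hadoop >/dev/null 2>&1; then
  if hadoop fs -test -d "$(hdfs_home_dir "$HDFS_USER")"; then
    echo "HDFS home directory exists"
  else
    # Only print the commands; creating the directory needs superuser rights.
    echo "Missing. An HDFS superuser could create it, for example:"
    echo "  sudo -u hdfs hadoop fs -mkdir -p $(hdfs_home_dir "$HDFS_USER")"
    echo "  sudo -u hdfs hadoop fs -chown $HDFS_USER $(hdfs_home_dir "$HDFS_USER")"
  fi
else
  echo "hadoop client not found on PATH"
fi
```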

Copy the oozie_demo directory, which contains the examples, from /home/itversity to the user's HDFS home directory – hadoop fs -put oozie_demo /user/itversity
Run the Oozie job and note the job id that is returned.
Using the job id, we can get the job status.
A status of SUCCEEDED means the job completed successfully. If not, troubleshoot the map-reduce job.
Validate the output data in the directory defined in workflow.xml by the property mapred.output.dir.
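
The upload, run and status steps above can be sketched as one script. It assumes the oozie and hadoop clients are both available on the node, and that the run command prints a line of the form "job: <job-id>". extract_job_id and run_demo are our own helper names.

```shell
OOZIE_URL="${OOZIE_URL:-http://bigdataserver-3:11000/oozie}"
JOB_PROPS=/home/itversity/oozie_demo/examples/apps/map-reduce/job.properties

# "oozie job -run" prints a line like "job: <job-id>"; keep only the id.
extract_job_id() {
  sed -n 's/^job: *//p'
}

# Hypothetical helper that chains the steps from this walkthrough.
run_demo() {
  # Upload the local demo directory to the HDFS home directory.
  (cd /home/itversity && hadoop fs -put oozie_demo /user/itversity) || return 1
  # Submit and start the workflow, capturing the job id.
  job_id=$(oozie job -oozie "$OOZIE_URL" -config "$JOB_PROPS" -run | extract_job_id)
  [ -n "$job_id" ] || return 1
  echo "Submitted: $job_id"
  # Show the workflow status and the status of each action.
  oozie job -oozie "$OOZIE_URL" -info "$job_id"
  # Afterwards, list the directory named by mapred.output.dir in
  # workflow.xml, for example: hadoop fs -ls <output-dir>
}

if command -v oozie >/dev/null 2>&1 && command -v hadoop >/dev/null 2>&1; then
  run_demo
else
  echo "oozie/hadoop clients not found; run this on a gateway node"
fi
```

The job may still be RUNNING when the -info command returns, so repeat that command with the same job id until the status settles.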

Now let us understand what happens when an Oozie job is submitted.
- One or more Map Reduce jobs will be created to run the Oozie Workflow
- On top of the Map Reduce jobs that run the Oozie Workflow, we will also see Map Reduce jobs for the actions submitted.
To troubleshoot any issue, we need to look at both the Map Reduce jobs associated with the Oozie Workflow and those associated with the actions.
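
A possible starting point for that troubleshooting is sketched below. It assumes log aggregation is enabled on the cluster, so that yarn logs can fetch container logs. It also assumes you copy a Map Reduce job id from the Ext ID column of the -info output. The helpers job_to_app_id and troubleshoot are our own names.

```shell
OOZIE_URL="${OOZIE_URL:-http://bigdataserver-3:11000/oozie}"

# A Map Reduce job id (job_<timestamp>_<seq>) and its YARN application id
# (application_<timestamp>_<seq>) differ only in the prefix.
job_to_app_id() {
  printf '%s\n' "$1" | sed 's/^job_/application_/'
}

# Hypothetical helper.
# $1 = Oozie workflow job id, $2 = Map Reduce job id from the -info output
troubleshoot() {
  oozie job -oozie "$OOZIE_URL" -info "$1"            # workflow and action status
  oozie job -oozie "$OOZIE_URL" -log "$1"             # Oozie log for the workflow
  yarn logs -applicationId "$(job_to_app_id "$2")"    # container logs from YARN
}

# Example conversion with a made-up id:
job_to_app_id job_1500000000000_0042
# prints application_1500000000000_0042
```

Running troubleshoot once per Map Reduce job id covers both the jobs tied to the workflow and the jobs tied to its actions.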
