Durga Gadiraju

Run Pig Job

Let us see how we can validate a Pig job on our cluster. Pig uses HDFS as its file system and MapReduce to process the data. Ensure you have data to validate (in our case, the data is in the local file system under /home/itversity/data). Let us copy the data to HDFS. Create the directory /user/itversity/data. Copy whole directory on …
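The copy step described above could be scripted roughly as follows. This is a minimal sketch, not the post's own commands: it uses the two paths named in the text, while the helper name `build_commands` and the choice to copy each top-level entry of the local directory into the HDFS directory are assumptions made for illustration. It only runs the commands when the `hdfs` CLI is on the PATH; otherwise it prints them.

```python
import glob
import os
import shutil
import subprocess

LOCAL_DATA = "/home/itversity/data"  # local directory named in the text
HDFS_DATA = "/user/itversity/data"   # HDFS directory named in the text


def build_commands(local_dir, hdfs_dir):
    """Build hdfs CLI commands: create hdfs_dir, copy local entries, list the result."""
    # Assumption: copy each top-level entry of local_dir into hdfs_dir.
    sources = sorted(glob.glob(os.path.join(local_dir, "*")))
    commands = [["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]]
    if sources:
        commands.append(["hdfs", "dfs", "-put"] + sources + [hdfs_dir])
    commands.append(["hdfs", "dfs", "-ls", hdfs_dir])
    return commands


if __name__ == "__main__":
    commands = build_commands(LOCAL_DATA, HDFS_DATA)
    if shutil.which("hdfs") is None:
        # No Hadoop client on this machine: show what would be run.
        for cmd in commands:
            print(" ".join(cmd))
    else:
        for cmd in commands:
            subprocess.run(cmd, check=True)
```

Building the commands separately from running them makes it easy to review what will be executed before touching the cluster.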


Review Important Properties

Let us review the property files as well as the important properties for all 4 services: Oozie, Pig, Sqoop and Hue.

Property Files – Standard Locations

Oozie – /etc/oozie/conf
Sqoop – /etc/sqoop/conf
Pig – /etc/pig/conf
Hue – /etc/hue/conf

However, with Cloudera these locations hold only templates; the actual run time property files will be under /var/run/cloudera-scm-agent/process. With …
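As a quick way to check these locations on a node, the sketch below lists the files found in each standard configuration directory and reports whether the Cloudera run time directory exists. The paths come from the text; the helper name `list_conf_files` is hypothetical, and the script assumes the current user can read the /etc directories (the Cloudera process directory typically needs elevated permissions, so it is only tested for existence).

```python
import os

# Standard locations named in the text.
STANDARD_CONF_DIRS = {
    "Oozie": "/etc/oozie/conf",
    "Sqoop": "/etc/sqoop/conf",
    "Pig": "/etc/pig/conf",
    "Hue": "/etc/hue/conf",
}
CLOUDERA_PROCESS_DIR = "/var/run/cloudera-scm-agent/process"


def list_conf_files(conf_dirs):
    """Map each service to the sorted file names in its conf dir (empty list if missing)."""
    result = {}
    for service, path in conf_dirs.items():
        result[service] = sorted(os.listdir(path)) if os.path.isdir(path) else []
    return result


if __name__ == "__main__":
    for service, files in list_conf_files(STANDARD_CONF_DIRS).items():
        print(service, STANDARD_CONF_DIRS[service], files)
    print("Cloudera run time directory present:", os.path.isdir(CLOUDERA_PROCESS_DIR))
```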
