Introduction

So far, we have set up ZooKeeper and HDFS. Now we will see how to set up YARN, which stands for Yet Another Resource Negotiator. YARN (MRv2) introduces new daemons that are responsible for resource management and for job scheduling and monitoring. We will cover the following topics:

  • Set Up YARN + MR2
  • Run a Simple MapReduce Job
  • Components of YARN and MR2
  • Configuration Files and Important Properties
  • Review Web UIs and log files
  • YARN and MR2 CLI
  • YARN Application Life Cycle
  • MapReduce Job Execution Life Cycle

Cluster Topology

We are setting up the cluster on 7+1 nodes: we start with 7 nodes and will add one more node later.

  • Gateway(s) and Management Service
    • bigdataserver-1
  • Masters
    • bigdataserver-2 – ZooKeeper, NameNode
    • bigdataserver-3 – ZooKeeper, Secondary NameNode
    • bigdataserver-4 – ZooKeeper, ResourceManager, JobHistory Server
  • Slaves or Worker Nodes
    • bigdataserver-5 – DataNode, NodeManager
    • bigdataserver-6 – DataNode, NodeManager
    • bigdataserver-7 – DataNode, NodeManager
  • We will create a host group named yarn so that we can use Ansible to run commands on all nodes where YARN is running.
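
As a rough sketch, the yarn host group for the topology above might look like the following Ansible inventory fragment. It assumes that the short hostnames listed above resolve as written (your environment may need fully qualified names) and that only the nodes running a YARN daemon belong in the group. Whether the gateway bigdataserver-1 should also be included is not stated here, so it is left out.

```ini
# Hypothetical inventory fragment: the group name "yarn" comes from the text,
# the membership is inferred from the topology list above.
[yarn]
# Resource Manager and Job History Server
bigdataserver-4
# Node Managers
bigdataserver-5
bigdataserver-6
bigdataserver-7
```

If this fragment were saved in an inventory file (say, a file named hosts, which is an illustrative name), a quick connectivity check against the group could be run with `ansible yarn -i hosts -m ping`.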