Review – YARN Resource Manager HA

Let us review the main YARN Resource Manager HA components and some of the important properties.

  • HA concepts for YARN are similar to those for HDFS.
  • YARN Resource Manager HA is not very common.
  • As part of the YARN architecture, the Resource Manager takes care of resource management and job scheduling.
  • The job execution life cycle is managed by a per-job Application Master.
  • Even a single Resource Manager is highly reliable. However, some very large clusters are configured with Resource Manager HA.
  • Automatic Failover
    • There is no separate failover controller daemon (unlike ZKFC in HDFS), so failover relies on leader election alone.
    • Leader election is done through ZooKeeper, using an elector embedded in the Resource Manager.
  • When Failover occurs
    • The standby Resource Manager becomes active.
    • In-flight work of running tasks is lost, so those tasks have to be restarted from scratch.
    • For every running job, the Application Master and its running tasks are restarted; completed tasks are not rerun.
    • For example, if a job has 100 tasks, of which 90 are completed and 5 are running when the failover occurs, only those 5 are restarted.
    • This is possible for MapReduce because the MapReduce Application Master does checkpointing, recording which tasks have already completed.
  • Let us also review some important properties by connecting to the Gateway Node.
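When reviewing properties on the gateway node, the HA-related settings live in yarn-site.xml. The following is a minimal sketch of how they could be pulled out with Python. The property names are standard YARN settings, but the sample values (cluster id, rm1/rm2, the example.com hosts) are hypothetical, and the file location on a real cluster (often /etc/hadoop/conf/yarn-site.xml) is an assumption that depends on the distribution.

```python
import xml.etree.ElementTree as ET

# Name prefixes of yarn-site.xml properties related to Resource Manager HA
# and Resource Manager state recovery.
HA_PREFIXES = (
    "yarn.resourcemanager.ha.",
    "yarn.resourcemanager.recovery.",
    "yarn.resourcemanager.cluster-id",
    "yarn.resourcemanager.hostname",
    "yarn.resourcemanager.store.class",
    "yarn.resourcemanager.zk-address",
)


def ha_properties(xml_text):
    """Return {name: value} for the HA-related properties in a Hadoop config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.startswith(HA_PREFIXES):
            props[name] = value
    return props


# Hypothetical sample; on a real gateway node, read the file contents instead.
SAMPLE = """<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>demo-cluster</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>rm1.example.com</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>rm2.example.com</value></property>
  <property><name>yarn.resourcemanager.recovery.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>zk1.example.com:2181</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>8192</value></property>
</configuration>"""

if __name__ == "__main__":
    for name, value in sorted(ha_properties(SAMPLE).items()):
        print(f"{name} = {value}")
```

To find out which Resource Manager is currently active, `yarn rmadmin -getServiceState rm1` can be run from the gateway node, where rm1 stands for one of the ids listed in yarn.resourcemanager.ha.rm-ids.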
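The 100-task example under "When Failover occurs" can be sketched as a small model. This is only an illustration of the rule described there, not YARN or MapReduce code: the state labels COMPLETED, RUNNING and PENDING and the function name are hypothetical, and the sketch assumes the 5 tasks that are neither completed nor running have simply not started yet.

```python
from collections import Counter


def tasks_after_failover(task_states):
    """Apply the recovery rule to a {task_id: state} map.

    Completed tasks are kept because the Application Master recorded them
    before the failover. Running tasks lose their in-flight work and must
    start again, so they go back to PENDING. Pending tasks are unchanged.
    Returns the new state map and the list of task ids that are restarted.
    """
    recovered = {}
    restarted = []
    for task_id, state in task_states.items():
        if state == "RUNNING":
            recovered[task_id] = "PENDING"
            restarted.append(task_id)
        else:
            recovered[task_id] = state
    return recovered, restarted


# The example from the text: 100 tasks, 90 completed, 5 running, 5 not started.
states = {}
for task_id in range(100):
    if task_id < 90:
        states[task_id] = "COMPLETED"
    elif task_id < 95:
        states[task_id] = "RUNNING"
    else:
        states[task_id] = "PENDING"

recovered, restarted = tasks_after_failover(states)
print(len(restarted))               # 5
print(Counter(recovered.values()))  # 90 COMPLETED, 10 PENDING
```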
