Let us review some of the important properties and components of YARN Resource Manager HA.
- HA concepts for YARN are similar to those for HDFS.
- YARN Resource Manager HA is not very common.
- As part of YARN architecture, Resource Manager takes care of Resource Management and Job Scheduling.
- The job execution life cycle is managed by a per-job Application Master.
- Even a single Resource Manager is highly reliable. However, some very large clusters are configured with Resource Manager HA.
- Automatic Failover
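- There is no separate failover controller process (unlike the ZKFC in HDFS), so automatic failover relies on leader election alone.
- Leader election is done through ZooKeeper, using an elector embedded in the Resource Manager.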
- When Failover occurs
- In-flight work of running tasks is lost, so those tasks have to be restarted from scratch.
- The standby Resource Manager becomes active.
- For all running jobs, the Application Masters as well as the in-flight tasks are restarted.
- For example, if a job has 100 tasks, 90 of which have completed and 5 of which are running when the failover occurs, only those 5 are restarted.
- This is possible for MapReduce because the MapReduce Application Master does checkpointing.
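
The 100-task example above can be worked through with a small toy model. This is only a sketch of the counting, not how YARN or MapReduce tracks tasks internally: the state names, the `tasks_to_rerun` helper, and the assumption that the remaining 5 tasks had not started yet are all illustrative.

```python
from collections import Counter


def tasks_to_rerun(task_states):
    """Return ids of tasks whose in-flight work is lost by a failover.

    Assumption for this sketch: completed tasks are recovered from the
    Application Master's checkpoint, so only RUNNING tasks are restarted.
    """
    return [tid for tid, state in task_states.items() if state == "RUNNING"]


# Hypothetical job: 100 tasks, 90 completed, 5 running, 5 not yet started.
task_states = {}
for tid in range(100):
    if tid < 90:
        task_states[tid] = "COMPLETED"
    elif tid < 95:
        task_states[tid] = "RUNNING"
    else:
        task_states[tid] = "PENDING"

rerun = tasks_to_rerun(task_states)
counts = Counter(task_states.values())

print(counts["COMPLETED"], counts["RUNNING"], counts["PENDING"])
print("tasks restarted after failover:", len(rerun))
```

In this model the 90 completed tasks are left alone, the 5 running tasks are restarted, and the 5 pending tasks simply run for the first time once the new active Resource Manager schedules them.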
- Let us also review some important properties by connecting to the Gateway Node.
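
As a rough illustration of which properties to look for, the sketch below parses a small, made-up excerpt of `yarn-site.xml` and summarizes the HA-related settings. The property names are standard YARN ones, but the host names, ids, and the `read_properties` and `ha_summary` helpers are assumptions for this example. Newer Hadoop releases may use `hadoop.zk.address` instead of `yarn.resourcemanager.zk-address`.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of yarn-site.xml; host names and ids are made up.
SAMPLE_YARN_SITE = """<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>demo-cluster</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>master1.example.com</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>master2.example.com</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>zk1.example.com:2181,zk2.example.com:2181</value></property>
  <property><name>yarn.resourcemanager.recovery.enabled</name><value>true</value></property>
</configuration>"""


def read_properties(xml_text):
    """Turn Hadoop-style <property> elements into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def ha_summary(props):
    """Collect the settings that matter most for Resource Manager HA."""
    raw_ids = props.get("yarn.resourcemanager.ha.rm-ids", "")
    rm_ids = [r.strip() for r in raw_ids.split(",") if r.strip()]
    return {
        "ha_enabled": props.get("yarn.resourcemanager.ha.enabled", "false") == "true",
        "cluster_id": props.get("yarn.resourcemanager.cluster-id"),
        "rm_hosts": {rid: props.get(f"yarn.resourcemanager.hostname.{rid}") for rid in rm_ids},
        "zk_quorum": props.get("yarn.resourcemanager.zk-address"),
        "recovery_enabled": props.get("yarn.resourcemanager.recovery.enabled", "false") == "true",
    }


summary = ha_summary(read_properties(SAMPLE_YARN_SITE))
for key, value in summary.items():
    print(key, "=", value)
```

On an actual gateway node, the same helpers could be pointed at the contents of the cluster's own `yarn-site.xml` (often under `/etc/hadoop/conf`, though the location depends on the distribution). The current state of each Resource Manager can be checked with `yarn rmadmin -getServiceState <rm-id>`.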