Now let us understand the key concepts behind HDFS Namenode High Availability (HA). Let us recap the HDFS components before we get into HA.
- HDFS has a Namenode, a Secondary Namenode, and Datanodes.
- When data is saved as files in HDFS, each file is divided into blocks and the blocks are stored on Datanodes.
- The mapping between a file, its block IDs (or names), and the block locations is called metadata.
- This metadata is held in the Namenode's memory and is also recorded in the Namenode's edit logs.
- To keep the edit logs from growing indefinitely, snapshots of the metadata are taken periodically. Each snapshot is called an FSImage and is built from the last FSImage and the edit logs.
- This process of creating a new FSImage from the last FSImage and the edit logs is called checkpointing.
- If the Namenode goes down, it can take several minutes to load the FSImage and replay the edit logs to rebuild the metadata in memory. During this recovery process the entire cluster is unusable.
- If there is a hardware failure, migrating the Namenode to another node is also time consuming. It also involves changing the Namenode URI in multiple locations.
- We can overcome these issues by configuring High Availability for the Namenode.
- In an HA configuration, instead of a Namenode and a Secondary Namenode we have an Active Namenode and a Passive (Standby) Namenode. Hence, it is also called an Active-Passive configuration. Here are the issues HA addresses:
  - Manual intervention in case of an unplanned outage
  - Planned upgrades where the Namenode needs to be brought down
- An HA configuration provides transparent and fast failover of the Namenode.
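
To make the recap of metadata, edit logs, FSImage, and checkpointing more concrete, here is a minimal Python sketch. It is a toy model only, not how HDFS is actually implemented: the class `ToyNamenode`, its methods, and the sample paths, block IDs, and Datanode names are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNamenode:
    """Toy model of Namenode metadata (hypothetical, not the real HDFS code)."""
    fsimage: dict = field(default_factory=dict)    # last checkpointed snapshot
    edit_log: list = field(default_factory=list)   # operations since that snapshot
    namespace: dict = field(default_factory=dict)  # live metadata held in memory

    def add_file(self, path, block_locations):
        # Log the change first, then apply it to the in-memory metadata.
        self.edit_log.append(("add", path, block_locations))
        self.namespace[path] = block_locations

    def delete_file(self, path):
        self.edit_log.append(("delete", path, None))
        self.namespace.pop(path, None)

    @staticmethod
    def replay(fsimage, edit_log):
        # Rebuild metadata from a snapshot plus the edits logged after it.
        namespace = dict(fsimage)
        for op, path, blocks in edit_log:
            if op == "add":
                namespace[path] = blocks
            elif op == "delete":
                namespace.pop(path, None)
        return namespace

    def checkpoint(self):
        # New FSImage = last FSImage + edit log; the edit log is then emptied.
        self.fsimage = self.replay(self.fsimage, self.edit_log)
        self.edit_log = []

    def restart(self):
        # After a restart, in-memory metadata must be rebuilt before serving.
        self.namespace = self.replay(self.fsimage, self.edit_log)


nn = ToyNamenode()
# file -> {block id -> Datanodes holding a replica}
nn.add_file("/data/a.txt", {"blk_1": ["dn1", "dn2"], "blk_2": ["dn2", "dn3"]})
nn.checkpoint()
nn.add_file("/data/b.txt", {"blk_3": ["dn1", "dn3"]})
nn.delete_file("/data/a.txt")

nn.namespace = {}  # simulate losing the in-memory metadata in a crash
nn.restart()
print(nn.namespace)  # {'/data/b.txt': {'blk_3': ['dn1', 'dn3']}}
```

In this sketch, `restart()` has to replay everything logged since the last checkpoint before the metadata is usable again. On a real cluster with a large namespace that replay is what takes several minutes, and it is the wait that a Passive Namenode, which keeps its own copy of the metadata up to date, is meant to avoid.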