High Availability – Implications

Enabling High Availability does not significantly change the syntax of commands or the way applications are run, but there are some differences that both Administrators and Developers should be aware of.

  • In a typical cluster (without High Availability), clients connect to HDFS or YARN using the URI of a specific server.
  • For example, the Namenode URI originally looks like this: hdfs://bigdataserver-2.c.cellular-axon-219405.internal:8020
  • With High Availability enabled, instead of passing the host name and port number, we need to pass the nameservice: hdfs://nameservice1
  • With respect to the Resource Manager, no matter which server URI you use, you will be automatically redirected to the Active Resource Manager.
  • If High Availability is enabled after the cluster has been running in production for some time, every place where the Namenode URI is hardcoded has to be changed or refactored.
  • Also, refer to the official documentation and test thoroughly before enabling High Availability on a live production cluster.
  • The Namenode Web UI and the Resource Manager Web UI should be accessed through the host currently running the Active Namenode or the Active Resource Manager.
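
The refactoring of hardcoded Namenode URIs mentioned above can be sketched with a small helper. This is only an illustration, not an official migration tool: the function name to_nameservice_uri and the path /user/demo/data are hypothetical, and it assumes the nameservice is called nameservice1 as in the example above.

```python
import re

# Matches an HDFS URI prefix of the form hdfs://host:port.
# A nameservice URI such as hdfs://nameservice1 has no port, so it is left alone.
_HOST_PORT_URI = re.compile(r"hdfs://[^/\s:]+:\d+")


def to_nameservice_uri(text, nameservice="nameservice1"):
    """Replace every hdfs://host:port prefix in text with hdfs://<nameservice>.

    The default nameservice name is an assumption; use the one configured
    for your own cluster.
    """
    return _HOST_PORT_URI.sub(f"hdfs://{nameservice}", text)


# Hypothetical path on the cluster used in this post.
old_uri = "hdfs://bigdataserver-2.c.cellular-axon-219405.internal:8020/user/demo/data"
print(to_nameservice_uri(old_uri))  # hdfs://nameservice1/user/demo/data
```

A helper like this can be run over configuration files or job parameters to find and rewrite old URIs, but each change should still be reviewed and tested before it reaches production.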

By this time you should have set up Cloudera Manager, installed Cloudera Distribution of Hadoop, and configured services such as Zookeeper, HDFS, and YARN, along with High Availability for both the HDFS Namenode and the YARN Resource Manager.

Make sure to stop the services in Cloudera Manager and also shut down the servers provisioned from GCP or AWS to conserve credits and control costs.
