Durga Gadiraju

Review Properties – HDFS Namenode HA

Now let us review the core-site.xml and hdfs-site.xml properties to understand how the configuration is defined. In core-site.xml, fs.defaultFS points to nameservice1 instead of a static IP address and port number (e.g., hdfs://nameservice1). In hdfs-site.xml, we will see a new property named dfs.nameservices with the value nameservice1. For both Namenodes, we will have a logical name defined under …
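To make the relationship between these properties concrete, here is a minimal sketch in Python that parses two sample configuration fragments and looks up the Namenode addresses behind the logical name. Only fs.defaultFS, dfs.nameservices and the value nameservice1 come from the text above. The properties dfs.ha.namenodes.nameservice1 and dfs.namenode.rpc-address.nameservice1.<id> are the standard Hadoop HA property names, but whether the full post uses them is an assumption. The Namenode IDs (nn1, nn2), the host names and the helper functions are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Sample core-site.xml fragment: fs.defaultFS uses the logical nameservice.
CORE_SITE = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://nameservice1</value></property>
</configuration>"""

# Sample hdfs-site.xml fragment. nn1/nn2 and the host names are hypothetical.
HDFS_SITE = """<configuration>
  <property><name>dfs.nameservices</name><value>nameservice1</value></property>
  <property><name>dfs.ha.namenodes.nameservice1</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.nameservice1.nn1</name><value>host1.example.com:8020</value></property>
  <property><name>dfs.namenode.rpc-address.nameservice1.nn2</name><value>host2.example.com:8020</value></property>
</configuration>"""


def load_properties(xml_text):
    """Return a {name: value} dict from a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def namenode_addresses(core, hdfs):
    """Map each Namenode ID of the default nameservice to its RPC address."""
    authority = urlparse(core["fs.defaultFS"]).netloc
    services = hdfs.get("dfs.nameservices", "").split(",")
    if authority not in services:
        return {}  # fs.defaultFS is not a logical nameservice, so no HA lookup
    ids = hdfs[f"dfs.ha.namenodes.{authority}"].split(",")
    return {nn: hdfs[f"dfs.namenode.rpc-address.{authority}.{nn}"] for nn in ids}


if __name__ == "__main__":
    core = load_properties(CORE_SITE)
    hdfs = load_properties(HDFS_SITE)
    for nn_id, address in namenode_addresses(core, hdfs).items():
        print(nn_id, address)
```

The sketch only illustrates how the property names chain together. On a real cluster the HDFS client performs this lookup itself, so there is no need to parse the files by hand.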

Introduction

As part of this section, we will see how to enable HDFS Namenode High Availability as well as YARN Resource Manager High Availability while exploring the key concepts.

High Availability – Overview
Configure HDFS Namenode HA
Review Properties – HDFS Namenode HA
HDFS Namenode HA – Key Concepts
Configure YARN Resource Manager HA
Review – YARN …

Map Reduce Job Execution Life Cycle

Now let us talk about the Map Reduce job execution life cycle. While YARN is a resource management framework, Map Reduce is a distributed data processing framework. On the gateway node, we can submit Map Reduce jobs using the hadoop jar command. https://gist.github.com/dgadiraju/0d3df07693e78d07164af0c14493707d A JVM will be launched on the gateway node. It will talk to the Resource Manager and get …
