EMR cluster startup cycle and connecting to master node

Understanding startup cycle

The video above walks through the startup cycle of an EMR cluster.

  • The Connections and Master Public DNS fields are shown only once the cluster state changes from provisioning to a later state.
  • Each cluster has a unique ID.
  • Auto-terminate is used more often in step execution mode than in cluster mode.
  • Termination protection: when it is enabled, attempts to terminate the cluster are rejected.
  • Network and hardware shows the subnet ID and the master and core instances, whose states change as the cluster starts up.
  • Configuration details lists the release label, Hadoop distribution, applications, and log URI.
  • Security and access lists the key name, EC2 instance profile, and EMR role.
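
The fields listed above correspond to keys in the response of the EMR DescribeCluster API. As a rough sketch, assuming a response dictionary shaped like the one returned by boto3's `describe_cluster`, the hypothetical helper `summarize_cluster` below gathers them in one place. Every value in `sample` is invented for illustration, and treating RUNNING and WAITING as the "ready for SSH" states is an assumption of this sketch, not something the console states.

```python
# States in which the cluster has finished starting up.
# Treating these as "ready for SSH" is an assumption of this sketch.
READY_STATES = {"RUNNING", "WAITING"}


def summarize_cluster(response):
    """Collect the console's summary fields from a describe_cluster-style dict.

    `response` is assumed to follow the shape of boto3's
    emr.describe_cluster(ClusterId=...) result. Missing keys become None.
    """
    cluster = response.get("Cluster", {})
    ec2 = cluster.get("Ec2InstanceAttributes", {})
    state = cluster.get("Status", {}).get("State")
    dns = cluster.get("MasterPublicDnsName")
    return {
        "id": cluster.get("Id"),
        "state": state,
        "master_public_dns": dns,
        # The DNS name is only present once the master node has been provisioned.
        "ssh_ready": state in READY_STATES and dns is not None,
        "auto_terminate": cluster.get("AutoTerminate"),
        "termination_protected": cluster.get("TerminationProtected"),
        "release_label": cluster.get("ReleaseLabel"),
        "applications": [app.get("Name") for app in cluster.get("Applications", [])],
        "log_uri": cluster.get("LogUri"),
        "key_name": ec2.get("Ec2KeyName"),
        "subnet_id": ec2.get("Ec2SubnetId"),
        "ec2_instance_profile": ec2.get("IamInstanceProfile"),
        "emr_role": cluster.get("ServiceRole"),
    }


# Hypothetical sample data; none of these identifiers refer to a real cluster.
sample = {
    "Cluster": {
        "Id": "j-EXAMPLE12345",
        "Status": {"State": "WAITING"},
        "MasterPublicDnsName": "ec2-0-0-0-0.compute-1.amazonaws.com",
        "AutoTerminate": False,
        "TerminationProtected": True,
        "ReleaseLabel": "emr-5.30.0",
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "LogUri": "s3://example-bucket/logs/",
        "ServiceRole": "EMR_DefaultRole",
        "Ec2InstanceAttributes": {
            "Ec2KeyName": "example-key",
            "Ec2SubnetId": "subnet-00000000",
            "IamInstanceProfile": "EMR_EC2_DefaultRole",
        },
    }
}

summary = summarize_cluster(sample)
print(summary["state"], summary["ssh_ready"])
```

With a real cluster, the dictionary would come from a boto3 EMR client call rather than from `sample`; a cluster that is still starting has no Master Public DNS yet, so `ssh_ready` stays false.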

Connecting to master node using SSH


The video above shows how to connect to the master node using SSH.

  • In the Summary section, next to Master Public DNS, there is an SSH link.
  • Clicking SSH shows the instructions for connecting to the master node using SSH.
  • On Windows, we use Cygwin, which requires the .pem key file to connect.
  • On Mac/Linux, SSH is already available in the terminal, so no extra setup is needed.
  • If the connection fails, go to the master node's security group.
  • Check the inbound rules; if SSH is missing, add a rule for SSH and save.
  • The port is now open in the security group, and because the master is associated with that security group, we are able to connect using SSH.
  • Once connected, we can launch Spark.
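
The connection steps above come down to two pieces: an inbound rule that opens TCP port 22 on the master's security group, and an `ssh` command that uses the .pem key. A minimal sketch of both follows. The helper names, key path, DNS name, and CIDR range are hypothetical placeholders; `hadoop` is the default login user on EMR master nodes. The rule dictionary follows the `IpPermissions` shape accepted by the EC2 AuthorizeSecurityGroupIngress API, but nothing here calls AWS.

```python
import shlex

SSH_PORT = 22  # standard SSH port


def build_ssh_command(key_path, master_public_dns, user="hadoop"):
    """Return the ssh command (as an argument list) for the master node."""
    return ["ssh", "-i", key_path, f"{user}@{master_public_dns}"]


def build_ssh_ingress_rule(cidr_ip):
    """Return an inbound rule that allows SSH from the given CIDR range."""
    return {
        "IpProtocol": "tcp",
        "FromPort": SSH_PORT,
        "ToPort": SSH_PORT,
        "IpRanges": [{"CidrIp": cidr_ip}],
    }


# Placeholder values for illustration only.
command = build_ssh_command(
    "example-key.pem", "ec2-0-0-0-0.compute-1.amazonaws.com"
)
rule = build_ssh_ingress_rule("203.0.113.10/32")

# shlex.join (Python 3.8+) quotes arguments safely for pasting into a shell.
print(shlex.join(command))
print(rule)
```

Restricting the rule to a single address range, rather than opening port 22 to everyone, is a design choice of this sketch. After logging in, Spark can typically be started on the master node with `spark-shell` or `pyspark`, provided Spark was selected among the cluster's applications.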