Month: February 2023

Install Cloudera Manager

Cloudera Manager is the management tool used to set up and manage clusters. It provides a wizard to set up a cluster as well as to configure alerts in case of any issues.
Install Cloudera Manager – sudo yum -y install cloudera-manager-server
Stop Cloudera Manager – sudo systemctl stop cloudera-scm-server
Set up the scm database – sudo /usr/share/cmf/schema/scm_prepare_database.sh mysql -h localhost scm root …

Setup Pre-requisites

Let us set up prerequisites such as the JDK and a MySQL database so that we can configure Cloudera Manager and other CDH components using an external database.
Install the Java SDK that comes with Cloudera – sudo yum install oracle-j2sdk1.7 -y
Install MySQL (MariaDB) – https://gist.github.com/dgadiraju/c35eb63a77438b5bacacdd43ca14ba74
Set up the MySQL connector – sudo yum -y install mysql-connector-java
We need to install mysql-connector-java on other nodes …

Collections – Seq, Set and Map

As part of this topic we will look at one of the important topics for building Spark applications using Scala, i.e. collections. Scala collections are categorized into 3 types: Seq, Set and Map. Seq: a sequence has a length, and its elements can be accessed by index (e.g. scala.Array, scala.collection.immutable.List etc.). Classes are divided into Seq classes and Buffer …
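The three collection types named above can be sketched as follows. This is only a minimal illustration, not code from the post: the object name `CollectionsSketch` and all sample values are assumptions made for the example.

```scala
object CollectionsSketch {
  // Seq: ordered, has a length, elements are accessed by zero-based index.
  val orderIds: Seq[Int] = List(10, 20, 30, 20)

  // Set: holds each distinct element only once.
  val uniqueIds: Set[Int] = orderIds.toSet

  // Map: key-value pairs, looked up by key.
  // The status strings are hypothetical sample data.
  val statusById: Map[Int, String] = Map(10 -> "COMPLETE", 20 -> "PENDING")

  def main(args: Array[String]): Unit = {
    println(orderIds.length)                      // 4
    println(orderIds(1))                          // 20
    println(uniqueIds.size)                       // 3
    println(statusById(20))                       // PENDING
    println(statusById.getOrElse(99, "UNKNOWN"))  // UNKNOWN
  }
}
```

Using `getOrElse` for the last lookup avoids the `NoSuchElementException` that `statusById(99)` would throw for a missing key.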

Object Oriented Concepts – Case Classes

As part of this topic, we will cover case classes in Scala. Case classes can be pattern matched, automatically define hashCode and equals, and automatically define getter methods for the constructor arguments (and setter methods as well if we use var in the argument). Example: case class Order(var orderId: Int, var orderDate: String, var orderCustomerId: Int, var orderStatus: String) { println("I am inside …
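The properties listed above can be tried out with a small sketch. It reuses the `Order` signature from the excerpt but leaves out the class body, which is truncated there. The wrapper object `CaseClassSketch`, the helper `describe`, and all sample values are hypothetical and only serve the illustration.

```scala
object CaseClassSketch {
  // Same constructor as in the excerpt; the original class body is not reproduced.
  case class Order(var orderId: Int, var orderDate: String,
                   var orderCustomerId: Int, var orderStatus: String)

  // Hypothetical helper showing that case classes can be pattern matched.
  def describe(order: Order): String = order match {
    case Order(_, _, _, "COMPLETE") => "done"
    case Order(id, _, _, status)    => s"order $id is $status"
  }

  def main(args: Array[String]): Unit = {
    // Sample values are made up for the example.
    val a = Order(1, "2013-07-25", 11599, "CLOSED")
    val b = Order(1, "2013-07-25", 11599, "CLOSED")

    println(a == b)                    // true: equals compares field values
    println(a.hashCode == b.hashCode)  // true: hashCode is derived from the fields
    println(a.orderStatus)             // CLOSED: generated getter

    a.orderStatus = "COMPLETE"         // generated setter, available because of var
    println(describe(a))               // done
    println(describe(b))               // order 1 is CLOSED
  }
}
```

Declaring the arguments without `var` would keep the getters but drop the setters, so the assignment to `a.orderStatus` would no longer compile.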
