Day: February 8, 2023

Setup Ubuntu using Windows Subsystem

A Linux-based environment works better for setting up Big Data clusters. It provides the following capabilities: connecting to remote servers using ssh; copying files using scp or rsync; and enabling a proxy such as sshuttle, which uses ssh authentication to reach web applications running on servers behind a firewall without opening ports to the public. Setup Process …

Configure Hadoop Ecosystem components – Oozie, Pig, Sqoop and Hue

As part of this section, we will see how to set up the Pig and Oozie components and cover some of the key concepts related to each service. Topics: Setup Oozie, Pig, Sqoop and Hue; Review Important Properties; Schedule an Oozie workflow; Run Pig Job; Validate Sqoop; Overview of Hue. Cluster Topology: We are setting up the cluster …

Introduction

As part of this section, we will talk about setting up the tools needed to create virtual machines on Google Cloud Platform (GCP). Topics: Setup Ubuntu using Windows Subsystem; Sign up for GCP; Create template for Big Data Server; Provision Servers for Big Data Cluster; Review Concepts; Setting up gcloud; Setup ansible on first server; Format JBOD …
