Month: February 2023

Introduction and Setting up Python

Agenda: Introduction; Setup; Python REPL; Basic Programming Constructs; Functions and Lambda Functions; Collections (List, Set, Dict); Basic Map Reduce operations; Basic I/O operations.

Introduction: Python is an interpreter-based programming language. Adoption of Python is very high in the Data Engineering and Data Science fields. Spark APIs are well integrated with Python. Highly relevant for Cloudera and …
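
As a rough illustration of the lambda functions, collections, and basic map reduce operations named in the agenda, here is a minimal sketch using only the Python standard library. The sample data and variable names are hypothetical and do not come from the post itself.

```python
from functools import reduce

# Hypothetical sample data: order amounts held in a list
order_amounts = [120, 35, 250, 80, 15]

# filter: keep only the orders of 50 or more
large_orders = list(filter(lambda amount: amount >= 50, order_amounts))

# map: apply a transformation (here, doubling) to every remaining element
doubled = list(map(lambda amount: amount * 2, large_orders))

# reduce: fold the list into a single value (here, the sum)
total = reduce(lambda acc, amount: acc + amount, doubled, 0)

print(large_orders)  # [120, 250, 80]
print(doubled)       # [240, 500, 160]
print(total)         # 900
```

The same filter, map, reduce pattern carries over to Spark's Python APIs, which is one reason these basics are typically covered first.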


Spark Framework

Let us understand the execution modes as well as the different components of the Spark Framework. Also, we will recap some important aspects of YARN.

Execution Modes: Spark supports the following execution modes: Local (for development), Standalone (for development), Mesos, and YARN. As our cluster uses YARN, let us recap some important aspects of …
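
In practice, the execution mode is usually selected through the master URL passed to `spark-submit --master`. The sketch below only builds the command line for each mode and does not launch Spark. The host names, ports, application file name, and the helper `spark_submit_command` are placeholders assumed for illustration, not values from the post.

```python
# Master URL formats accepted by spark-submit's --master option.
# Host names and ports are placeholders (assumptions for illustration).
MASTER_URLS = {
    "local": "local[*]",                       # run in-process using all cores
    "standalone": "spark://master-host:7077",  # Spark standalone cluster
    "mesos": "mesos://mesos-host:5050",        # Mesos cluster
    "yarn": "yarn",                            # YARN; cluster found via Hadoop config
}


def spark_submit_command(mode, app="app.py"):
    """Build a spark-submit argument list for the given execution mode."""
    return ["spark-submit", "--master", MASTER_URLS[mode], app]


print(" ".join(spark_submit_command("yarn")))
# spark-submit --master yarn app.py
```

With YARN, the master URL carries no host or port because the cluster location is read from the Hadoop configuration directory on the submitting machine.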
