Spark and Kafka Workshop – July 2019

In this workshop we will focus on Spark and Kafka, using Scala as the programming language.

  • Getting Started
  • Fundamentals of Programming – Using Scala
  • Big Data ecosystem – Overview
  • Apache Spark 2 – Architecture and Core APIs
  • Apache Spark 2 – Data Frames and Spark SQL
  • Apache Spark 2 – Building Streaming Pipelines

Getting Started

From a data engineering perspective, data processing skills are very important. Even today, SQL is the most popular way of processing data. Hence we will start with SQL and get an overview of the Big Data ecosystem as part of this session.

  • Setting up the Environment
  • Revision of SQL

Fundamentals of Programming – Using Scala

As part of this module we will learn the basics of programming, using Scala as the programming language.

  • Overview of Scala REPL
  • Declaring Variables
  • Functions and Operators
  • User Defined Functions
  • Object Oriented Concepts – Overview
  • Collections and Tuples
  • Development Life Cycle
  • and more

Apache Spark 2 – Architecture and Core APIs

As part of this module we will go through the Core APIs of Spark.

  • Apache Spark Official Documentation
  • Creating Resilient Distributed Datasets (RDDs)
  • Data Processing using Transformations and Actions
  • Understanding Execution Life Cycle
  • and more

Apache Spark 2 – Data Frames and Spark SQL

Data Frames and Spark SQL have become a core module of Apache Spark. Most new applications are developed using Data Frames and Spark SQL.

  • Creating Data Frames from Files and Databases
  • Pre-Defined Functions
  • Basic Transformations using Data Frame APIs
  • Windowing Functions using Data Frame APIs
  • Basic Transformations using Spark SQL
  • Windowing Functions using Spark SQL

Apache Spark 2 – Building Streaming Pipelines

As part of this module we will see how to build streaming pipelines using Kafka and Spark Structured Streaming.

  • Getting Started with Kafka
  • Overview of Kafka Producer and Consumer APIs
  • Getting Started with Spark Structured Streaming
  • End-to-End Streaming Pipeline using Kafka Connect, Kafka and Spark Structured Streaming

 
