Spark Modules

In earlier versions of Spark, the core API sat at the bottom of the stack, and all the higher-level modules were built on top of it. Examples of core APIs are map, reduce, join, groupByKey, etc. But with Spark 2, Data Frames and Spark SQL have become the core module.
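
To make the shift concrete, here is a minimal Scala sketch, assuming a local SparkSession, that computes a word count first with the core RDD API and then with the Data Frame and Spark SQL interface. Names such as WordCountComparison are illustrative only:

```scala
import org.apache.spark.sql.SparkSession

object WordCountComparison {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("WordCountComparison")
      .master("local[*]") // assumption: a local run for illustration
      .getOrCreate()
    import spark.implicits._

    val lines = Seq("spark core api", "spark sql api")

    // Core API style: RDD transformations (flatMap, map) plus reduceByKey
    val rddCounts = spark.sparkContext.parallelize(lines)
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    rddCounts.collect().foreach(println)

    // Spark 2 style: the same logic as a Data Frame, queried through Spark SQL
    val words = lines.toDF("line")
      .selectExpr("explode(split(line, ' ')) AS word")
    words.createOrReplaceTempView("words")
    spark.sql("SELECT word, count(*) AS cnt FROM words GROUP BY word").show()

    spark.stop()
  }
}
```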

  • Core – Transformations and Actions – APIs such as map, reduce, join, filter, etc. They typically work on RDDs (contrasted with Data Frames in the sketch above)
  • Spark SQL and Data Frames – APIs and a Spark SQL interface for batch processing on top of Data Frames or Data Sets (the typed Data Set API is not available for Python)
  • Structured Streaming – APIs and a Spark SQL interface for stream processing on top of Data Frames (see the first sketch after this list)
  • Machine Learning Pipelines – APIs to chain data preparation steps and Machine Learning algorithms into pipelines on top of Data Frames (see the second sketch after this list)
  • GraphX – APIs for graph processing (see the third sketch after this list)

We can build applications using different programming languages such as Scala, Python, Java, and R, leveraging the Spark APIs of the above-mentioned modules.
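
The Structured Streaming sketch below treats a socket text stream as an unbounded Data Frame and maintains a running word count. The host and port are hypothetical placeholders:

```scala
import org.apache.spark.sql.SparkSession

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("StreamingWordCount")
      .master("local[*]")
      .getOrCreate()

    // Read a socket text stream as an unbounded Data Frame
    // (localhost:9999 is a placeholder source for illustration)
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()

    // Split each incoming line into words and keep a running count
    val counts = lines
      .selectExpr("explode(split(value, ' ')) AS word")
      .groupBy("word")
      .count()

    // Continuously print the updated counts to the console
    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```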
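The Machine Learning Pipelines sketch below chains a tokenizer, a feature hasher, and logistic regression into one pipeline over a toy Data Frame. The training data and stage settings are illustrative only:

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.sql.SparkSession

object PipelineSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PipelineSketch")
      .master("local[*]")
      .getOrCreate()

    // Toy training data: (text, label) — made up for illustration
    val training = spark.createDataFrame(Seq(
      ("spark is fast", 1.0),
      ("hadoop map reduce", 0.0)
    )).toDF("text", "label")

    // Chain feature extraction and a learning algorithm into one pipeline
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
    val lr = new LogisticRegression().setMaxIter(10)
    val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))

    // Fitting the pipeline runs every stage over the Data Frame
    val model = pipeline.fit(training)
    model.transform(training).select("text", "prediction").show()

    spark.stop()
  }
}
```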
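Finally, a GraphX sketch. Note that GraphX works on RDDs rather than Data Frames; the tiny graph and the PageRank tolerance below are made up for illustration:

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.sql.SparkSession

object GraphSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("GraphSketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // A tiny directed graph: vertices carry names, edges carry a relationship
    val vertices = sc.parallelize(Seq(
      (1L, "alice"), (2L, "bob"), (3L, "carol")
    ))
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, "follows"),
      Edge(2L, 3L, "follows")
    ))
    val graph = Graph(vertices, edges)

    // Rank vertices with the built-in PageRank algorithm, then attach names
    graph.pageRank(tol = 0.001).vertices
      .join(vertices)
      .collect()
      .foreach { case (_, (rank, name)) => println(s"$name -> $rank") }

    spark.stop()
  }
}
```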
