Let us review the details of setting up the Spark environment. We covered how to set up the environment earlier.
• Pre-requisites
- 64-bit computer
- At least 4 GB RAM and enough storage
- 64-bit operating system – Windows 10, Linux, macOS, etc.
- We recommend Ubuntu on top of Windows 10. You can set it up either with Windows Subsystem for Linux or in a virtual machine.
• Setup Process
- Go to https://spark.apache.org
- Download the tarball of your choice
- Uncompress and untar in your favorite location
- Make sure to set up environment variables (typically by adding Spark's bin directory to PATH) so that you can run commands such as spark-shell, pyspark, and spark-submit from anywhere
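As a quick check of the last step, the sketch below reports which of the Spark launcher commands are reachable through PATH. It is a minimal Python example and not part of the Spark distribution: the helper name missing_spark_commands is hypothetical, and it assumes the bin directory of the extracted tarball is what was added to PATH.

```python
import os
import shutil

# Launcher commands named in the setup steps above.
SPARK_COMMANDS = ("spark-shell", "pyspark", "spark-submit")


def missing_spark_commands(search_path=None):
    """Return the Spark commands that cannot be found on the search path.

    search_path defaults to the current PATH environment variable.
    (Hypothetical helper, written for illustration only.)
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    return [
        cmd for cmd in SPARK_COMMANDS
        if shutil.which(cmd, path=search_path) is None
    ]


if __name__ == "__main__":
    missing = missing_spark_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All Spark commands are available on PATH.")
```

If any command is reported as missing, open a new terminal after editing the environment variables and run the check again, since changes usually apply only to newly started shells.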
• Understand the Spark layout
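One way to get familiar with the layout is to list the top-level directories of the extracted distribution. The sketch below is a minimal Python example: spark_layout is a hypothetical helper, and it assumes the SPARK_HOME environment variable points at the directory where the tarball was extracted. A binary distribution typically contains directories such as bin, conf, jars and sbin, but the exact contents depend on the tarball you downloaded.

```python
import os


def spark_layout(spark_home):
    """Return the sorted names of the top-level directories under spark_home.

    (Hypothetical helper, written for illustration only.)
    """
    return sorted(
        entry.name for entry in os.scandir(spark_home) if entry.is_dir()
    )


if __name__ == "__main__":
    # Assumption: SPARK_HOME was set to the extracted Spark directory.
    home = os.environ.get("SPARK_HOME")
    if home and os.path.isdir(home):
        for name in spark_layout(home):
            print(name)
    else:
        print("SPARK_HOME is not set or does not point to a directory.")
```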