Let us understand the Execution Life Cycle of Spark. You can review this using the Spark Official Documentation.
- We submit the job from the client machine. The JVM launched there typically acts as the Driver Program (in client deploy mode).
- The Driver talks to the YARN Resource Manager, which creates the Application Master.
- The Application Master talks to the Worker Nodes, on which Node Managers are running, and provisions resources based on the allocation settings. Allocation can be either static or dynamic (see the sketches after this list).
- These resources are nothing but Executors. From YARN's perspective, they are Containers.
- An Executor is nothing but a JVM that can run multiple concurrent task threads until the job completes.
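The gist linked below contains a complete spark-submit example. As a companion, here is a minimal sketch of a submission with static allocation, assuming a YARN cluster and a hypothetical application script named `app.py`; the flag values are illustrative only:

```sh
# Submit a Spark application to YARN with statically allocated resources.
# app.py is a hypothetical application; the numbers below are illustrative.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --num-executors 4 \
  --executor-memory 2G \
  --executor-cores 2 \
  app.py
```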
https://gist.github.com/dgadiraju/65128e88405c9b80e8bc34d3e878c6c3#file-01sparksubmitexample-sh
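With dynamic allocation, Spark grows and shrinks the number of Executors based on the workload instead of honoring a fixed `--num-executors` value. A minimal sketch, again using the hypothetical `app.py` and assuming the external shuffle service is available on the Node Managers (typically required for dynamic allocation on YARN in older Spark versions):

```sh
# Submit with dynamic allocation; the executor count varies between the
# configured min and max bounds. The bounds here are illustrative assumptions.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=2 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  --conf spark.shuffle.service.enabled=true \
  app.py
```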