Let us understand the execution life cycle of a Spark application running on YARN. You can review this in the official Spark documentation.
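- We submit the job from the client. The JVM launched on the client typically acts as the Driver Program (this is client deploy mode; in cluster deploy mode the driver runs inside the Application Master instead).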
- The submission goes to the Resource Manager, which launches the Application Master for the application.
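- The Application Master requests resources from the Resource Manager based on the allocation settings, and the Node Managers running on the Worker Nodes launch them. Allocation can be either static (a fixed number of Executors) or dynamic (the number of Executors grows and shrinks with the workload).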
- These resources are nothing but Executors. From the YARN perspective, they are Containers.
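- Each Executor is nothing but a JVM that can run multiple tasks concurrently as threads. It stays up until the job is complete (or, with dynamic allocation, until it has been idle long enough to be released).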
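
The allocation settings mentioned above can be sketched as `spark-submit` options. The snippet below is a minimal illustration, not a definitive setup: the helper names `spark_submit_args` and `max_concurrent_tasks`, the application file `app.py`, and all default values are hypothetical. Only the configuration keys (`spark.executor.instances`, `spark.executor.cores`, `spark.executor.memory`, `spark.dynamicAllocation.*`, `spark.shuffle.service.enabled`) are standard Spark properties. It only builds the command line; it does not contact a cluster.

```python
from typing import List, Optional


def spark_submit_args(app: str,
                      executor_cores: int = 2,
                      executor_memory: str = "2g",
                      num_executors: Optional[int] = None,
                      min_executors: int = 1,
                      max_executors: int = 10) -> List[str]:
    """Build a spark-submit command for YARN in client deploy mode.

    Hypothetical helper: static allocation when num_executors is given,
    dynamic allocation otherwise.
    """
    args = [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "client",  # the driver runs in the client JVM
        "--conf", f"spark.executor.cores={executor_cores}",
        "--conf", f"spark.executor.memory={executor_memory}",
    ]
    if num_executors is not None:
        # Static allocation: a fixed number of Executors (YARN containers).
        args += [
            "--conf", "spark.dynamicAllocation.enabled=false",
            "--conf", f"spark.executor.instances={num_executors}",
        ]
    else:
        # Dynamic allocation: Executors are added and released with the load.
        # On YARN this is commonly paired with the external shuffle service.
        args += [
            "--conf", "spark.dynamicAllocation.enabled=true",
            "--conf", "spark.shuffle.service.enabled=true",
            "--conf", f"spark.dynamicAllocation.minExecutors={min_executors}",
            "--conf", f"spark.dynamicAllocation.maxExecutors={max_executors}",
        ]
    return args + [app]


def max_concurrent_tasks(executors: int,
                         cores_per_executor: int,
                         cpus_per_task: int = 1) -> int:
    """Upper bound on tasks running at once: one thread per task slot."""
    return executors * (cores_per_executor // cpus_per_task)


# "app.py" is a placeholder application name.
static_cmd = spark_submit_args("app.py", num_executors=4)
dynamic_cmd = spark_submit_args("app.py")

print(" ".join(static_cmd))
print(" ".join(dynamic_cmd))
print(max_concurrent_tasks(executors=4, cores_per_executor=2))
```

With static allocation, the number of Executors and the cores per Executor together bound how many tasks can run at the same time, since each task occupies one thread in an Executor JVM (assuming one CPU per task).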