
Apache Spark Cluster Architecture

How is a job executed on a Spark cluster?
  1. When the driver submits a job, it sends the request to the YARN ResourceManager.
  2. The YARN ResourceManager checks for data locality and finds the best available worker nodes for task scheduling.
  3. The job is then split into stages, and each stage is split into tasks based on data locality and available resources.
  4. Prior to task execution, the driver sends the necessary job details to each node.
  5. The driver keeps track of the currently executing tasks and updates the job status, which can be monitored in the master node's web UI.
  6. Once the job is completed, all the nodes send their aggregated values back to the driver (a sketch walking through these steps follows the list).
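
To make these steps concrete, here is a minimal word-count sketch in Scala. The class name, the input path (hdfs:///data/input.txt), and the spark-submit invocation in the comments are illustrative assumptions, not details from this page.

```scala
import org.apache.spark.sql.SparkSession

// A minimal sketch of a Spark job, assuming it is packaged into a jar
// and submitted to a YARN cluster, e.g. (hypothetical invocation):
//   spark-submit --master yarn --deploy-mode cluster --class WordCount app.jar
object WordCount {
  def main(args: Array[String]): Unit = {
    // The driver creates the SparkSession; on YARN, the ResourceManager
    // allocates containers for the executors (steps 1-2).
    val spark = SparkSession.builder()
      .appName("WordCount")
      .getOrCreate()
    val sc = spark.sparkContext

    // The narrow transformations (flatMap, map) stay in one stage; the
    // shuffle required by reduceByKey introduces a stage boundary, and
    // each stage is split into one task per partition (step 3).
    val counts = sc.textFile("hdfs:///data/input.txt") // hypothetical input path
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // The collect() action triggers execution: the driver ships task
    // details to the executors (step 4), tracks running tasks in the
    // web UI (step 5), and receives the aggregated results (step 6).
    counts.collect().foreach { case (word, n) => println(s"$word: $n") }

    spark.stop()
  }
}
```

Note that collect() pulls all results to the driver, which mirrors step 6 but is only safe for small aggregated outputs; larger results would typically be written back to HDFS instead.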
