Day: February 12, 2023

Run Sample Job

First, let us run jobs without specifying any queue and see what happens. Submit the long-running job to the production queue using the command below. This job requires 279 containers.
https://gist.github.com/dgadiraju/f6aa41383703a8b6bb3bb902a8726581
Submit the other job, which requires only 18 containers to process the data.
https://gist.github.com/dgadiraju/85c9d016bca952d4b1d5d52a828bff2f
Submitting jobs and Validating Capacity Scheduler
Let us submit …

Introduction to Capacity Scheduler

The Capacity Scheduler applies FIFO scheduling within each queue. Unlike the FIFO Scheduler, the Capacity Scheduler has multiple queues, and users can submit jobs to a particular queue. By default, the Hortonworks Hadoop distribution uses the Capacity Scheduler. The configuration files related to the Capacity Scheduler are yarn-site.xml and capacity-scheduler.xml. For setting up queues in the Capacity Scheduler, you need to make changes …
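
As a rough illustration of what queue setup in capacity-scheduler.xml can look like, here is a minimal sketch. The queue names (production, test, qa) and the capacity percentages are assumptions chosen for the example, not values taken from the post; the capacities of sibling queues are expected to add up to 100.

```xml
<!-- capacity-scheduler.xml: illustrative sketch only; queue names and percentages are assumed -->
<configuration>
  <!-- Child queues defined directly under the root queue -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>production,test,qa</value>
  </property>
  <!-- Share of cluster capacity per queue, in percent (60 + 20 + 20 = 100) -->
  <property>
    <name>yarn.scheduler.capacity.root.production.capacity</name>
    <value>60</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.test.capacity</name>
    <value>20</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.qa.capacity</name>
    <value>20</value>
  </property>
</configuration>
```

In yarn-site.xml, the scheduler is selected by setting yarn.resourcemanager.scheduler.class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. Queue changes are typically picked up by running yarn rmadmin -refreshQueues.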

More Control Parameters

Let us review some of the properties related to the Fair Scheduler. The following properties in yarn-site.xml can be used to override the behavior of the Fair Scheduler: yarn.scheduler.fair.user-as-default-queue, yarn.scheduler.fair.preemption, and yarn.scheduler.fair.preemption.cluster-utilization-threshold. There are also several properties that can be defined in the allocation file (fair-scheduler.xml). We can review properties from this URL. Queue …
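
The three yarn-site.xml properties named above could be set along the following lines. This is a hedged sketch: the values are illustrative assumptions, not the post's settings. In stock Hadoop, user-as-default-queue defaults to true, preemption defaults to false, and the utilization threshold defaults to 0.8.

```xml
<!-- yarn-site.xml: illustrative sketch only; the values shown are assumptions -->
<configuration>
  <!-- Select the Fair Scheduler as the ResourceManager scheduler -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <!-- false: jobs submitted without a queue go to the shared "default" queue
       instead of a queue named after the submitting user -->
  <property>
    <name>yarn.scheduler.fair.user-as-default-queue</name>
    <value>false</value>
  </property>
  <!-- Allow the scheduler to preempt containers from queues running over their fair share -->
  <property>
    <name>yarn.scheduler.fair.preemption</name>
    <value>true</value>
  </property>
  <!-- Preemption starts only once cluster utilization exceeds this fraction -->
  <property>
    <name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
    <value>0.8</value>
  </property>
</configuration>
```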

Submitting jobs and Validating Fair Scheduler

Let us submit a few jobs and see how resources are allocated using the Fair Scheduler. Submit the long-running job to the production queue. Submit the other jobs to the test and production queues. Submit another job to the qa queue.
https://gist.github.com/dgadiraju/479059c8e3d7fd1be7d5b7672a76f447
Now the production queue has two apps. Since it is a fair scheduler …
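
For context, an allocation file that defines the three queues mentioned here (production, test, qa) might look like the sketch below. The weights and the scheduling policy are assumptions made for illustration; the post's actual allocation file is not shown in this excerpt.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: illustrative sketch only; weights and policy are assumed -->
<allocations>
  <!-- A higher weight gives production a larger share when queues compete for resources -->
  <queue name="production">
    <weight>3.0</weight>
    <!-- "fair" splits the queue's resources evenly among its running apps -->
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
  <queue name="test">
    <weight>1.0</weight>
  </queue>
  <queue name="qa">
    <weight>1.0</weight>
  </queue>
</allocations>
```

With the fair policy inside a queue, two apps running in production would each tend toward half of that queue's share, which is the kind of behavior the submission steps above are meant to make visible.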

Configure Fair Scheduler – Running jobs without Specifying Queue

First, let us run jobs without specifying any queue and see what happens. Submit the long-running job to the production queue using the command below. This job requires 279 containers.
https://gist.github.com/dgadiraju/f6aa41383703a8b6bb3bb902a8726581
Submit the other job, which requires only 18 containers to process the data.
https://gist.github.com/dgadiraju/85c9d016bca952d4b1d5d52a828bff2f