[ 
https://issues.apache.org/jira/browse/YARN-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620841#comment-17620841
 ] 

Prabhu Joseph commented on YARN-11352:
--------------------------------------

Thanks [~SanjayKumarSahu] for reporting the issue. Currently, the Tez split 
calculation is based on AMRMClient#getAvailableResources, which returns the 
headroom, i.e. the resources still available to the application under the 
configured queue, user, and partition limits. Basing the split calculation on 
the total YARN cluster resource would lead to high task parallelism, with the 
Tez job waiting for resources that belong to other queues, users, or 
partitions and that it will never get.

 
{code:java}
  /**
   * Get the currently available resources in the cluster.
   * A valid value is available after a call to allocate has been made
   * @return Currently available resources
   */
  public abstract Resource getAvailableResources();
{code}
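
To make the concern concrete, here is a minimal, self-contained sketch of how the choice of input changes the resulting parallelism. It is not Tez's actual split-grouping logic: the class name, the helper maxConcurrentTasks, and all the numbers are hypothetical and chosen only for illustration. In a real application master, the headroom value would come from AMRMClient#getAvailableResources after an allocate call.

```java
public final class SplitParallelismSketch {

    /**
     * Hypothetical helper: how many task containers of the given size
     * fit into the given amount of memory.
     */
    static int maxConcurrentTasks(long availableMemoryMb, long taskMemoryMb) {
        if (taskMemoryMb <= 0) {
            throw new IllegalArgumentException("taskMemoryMb must be positive");
        }
        return (int) Math.max(0L, availableMemoryMb / taskMemoryMb);
    }

    public static void main(String[] args) {
        // Assumed example values, not taken from any real cluster.
        long clusterTotalMb = 1000L * 1024;  // total memory of the whole cluster
        long queueHeadroomMb = 100L * 1024;  // headroom left under queue/user/partition limits
        long taskMemoryMb = 4L * 1024;       // memory requested per task container

        int fromHeadroom = maxConcurrentTasks(queueHeadroomMb, taskMemoryMb);
        int fromClusterTotal = maxConcurrentTasks(clusterTotalMb, taskMemoryMb);

        // Tasks sized from the cluster total but limited by the headroom
        // cannot all run at once; the excess has to wait.
        System.out.println("tasks sized from headroom:      " + fromHeadroom);
        System.out.println("tasks sized from cluster total: " + fromClusterTotal);
        System.out.println("tasks that would have to wait:  "
                + Math.max(0, fromClusterTotal - fromHeadroom));
    }
}
```

With these assumed numbers, sizing from the cluster total asks for ten times as many concurrent tasks as the queue can actually run, which is the waiting behaviour described above.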

> Support new API to get the total resource available in Yarn
> -----------------------------------------------------------
>
>                 Key: YARN-11352
>                 URL: https://issues.apache.org/jira/browse/YARN-11352
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacity scheduler, resourcemanager, yarn
>    Affects Versions: 3.4.0
>            Reporter: Sanjay Kumar Sahu
>            Priority: Major
>
> Hive needs the total resources available in YARN through the AMRMClient 
> interface. This helps Hive decide the split count (Fix the split calculation 
> logic for Hive on Tez/LLAP in  clusters).
>  
> The improvement is identified as a problem in split calculation.
>  



