[
https://issues.apache.org/jira/browse/YARN-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620841#comment-17620841
]
Prabhu Joseph commented on YARN-11352:
--------------------------------------
Thanks [~SanjayKumarSahu] for reporting the issue. Currently, the Tez split
calculation is based on AMRMClient#getAvailableResources, which returns the
headroom: the resources the application can still obtain given the configured
queue, job user, and partition limits. Basing the split calculation on the
total YARN cluster resources instead would lead to higher task parallelism
than the job can actually run, leaving the Tez job waiting for resources that
belong to other queues, users, or partitions and that it won't get.
{code:java}
/**
 * Get the currently available resources in the cluster.
 * A valid value is available after a call to allocate has been made
 * @return Currently available resources
 */
public abstract Resource getAvailableResources();
{code}
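To illustrate the concern, here is a minimal, self-contained sketch of how the choice of input changes the computed parallelism. It does not use the Hadoop or Tez classes and does not reproduce the actual Tez split grouping logic; SimpleResource, HeadroomSplitSketch, parallelism, and all the numbers are hypothetical and chosen only for illustration.

```java
/** Hypothetical stand-in for a YARN Resource: memory in MB plus virtual cores. */
final class SimpleResource {
    final long memoryMb;
    final int vcores;

    SimpleResource(long memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }
}

/** Illustrative only: compares parallelism derived from headroom vs. cluster total. */
final class HeadroomSplitSketch {

    /**
     * Number of task containers of size {@code task} that fit into
     * {@code available}, limited by the scarcer of memory and vcores.
     */
    static long parallelism(SimpleResource available, SimpleResource task) {
        if (task.memoryMb <= 0 || task.vcores <= 0) {
            throw new IllegalArgumentException("task size must be positive");
        }
        long byMemory = available.memoryMb / task.memoryMb;
        long byCores = available.vcores / task.vcores;
        return Math.max(0L, Math.min(byMemory, byCores));
    }

    public static void main(String[] args) {
        // Assumed container size for one task.
        SimpleResource task = new SimpleResource(4096, 1);
        // Assumed total capacity of the whole cluster.
        SimpleResource clusterTotal = new SimpleResource(1_024_000, 250);
        // Assumed headroom: what the queue/user/partition limits still allow.
        SimpleResource headroom = new SimpleResource(204_800, 50);

        System.out.println("parallelism from headroom:      " + parallelism(headroom, task));
        System.out.println("parallelism from cluster total: " + parallelism(clusterTotal, task));
    }
}
```

With these assumed numbers, sizing by the cluster total plans for five times as many concurrent tasks as the limits would ever allow the job to run, so the extra tasks would simply wait.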
> Support new API to get the total resource available in Yarn
> -----------------------------------------------------------
>
> Key: YARN-11352
> URL: https://issues.apache.org/jira/browse/YARN-11352
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: capacity scheduler, resourcemanager, yarn
> Affects Versions: 3.4.0
> Reporter: Sanjay Kumar Sahu
> Priority: Major
>
> Hive needs the total resources available in YARN through the AMRMClient
> interface. This helps Hive decide the split count (fixing the split
> calculation logic for Hive on Tez/LLAP in clusters).
>
> The need for this improvement was identified from a problem in split
> calculation.
>