[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907944#comment-16907944
 ] 

Szilard Nemeth commented on YARN-9100:
--------------------------------------

Hi [~pbacsko]!
About to review the latest patch.
Agree to remove the RetryCommand from this patch.

Quoting from the description: 
{quote}
Some more words about the new class RetryCommand: 
There are some similar waiting loops in the code in: AMRMClient, 
AMRMClientAsync and even in GenericTestUtils (see waitFor method). RetryCommand 
could be a future replacement of these duplicated code, as it gives a solution 
to this waiting loop problem in a generic way.
The only downside of the usage of RetryCommand in GpuResourceAllocator 
(startGpuAssignmentLoop) is the ugly exception handling part, but that's solely 
because how Java deals with checked exceptions vs. lambdas. If there's a 
cleaner way to solve the exception handling, I'm open for any suggestions.
{quote}

What about filing a follow-up jira to consolidate the retry logic of 
GpuResourceAllocator and several other classes like AMRMClient, 
AMRMClientAsync, GenericTestUtils, etc?
I think what we had in patch003 for the RetryCommand can be a good starting 
point to extract the retry behaviour into one common class from the mentioned 
classes.
What do you think? 


> Add tests for GpuResourceAllocator and do minor code cleanup
> ------------------------------------------------------------
>
>                 Key: YARN-9100
>                 URL: https://issues.apache.org/jira/browse/YARN-9100
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy, these are returning copies of internal 
> data structures. The word "copy" is just a noise in their name, so they have 
> been renamed. Additionally, the copied data structures modified to be 
> immutable.
> - The waiting loop in method assignGpus were decoupled into a new class, 
> RetryCommand. 
> Some more words about the new class RetryCommand: 
> There are some similar waiting loops in the code in: AMRMClient, 
> AMRMClientAsync and even in GenericTestUtils (see waitFor method). 
> RetryCommand could be a future replacement of these duplicated code, as it 
> gives a solution to this waiting loop problem in a generic way.
> The only downside of the usage of RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling part, but that's 
> solely because how Java deals with checked exceptions vs. lambdas. If there's 
> a cleaner way to solve the exception handling, I'm open for any suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to