[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730449#comment-15730449
 ] 

Miklos Szegedi commented on YARN-5600:
--------------------------------------

Thank you, [~templedf]! I addressed most issues. Here are my replies on the 
others.
{quote}
In serviceInit(), you're missing the space before the brace on both your if 
statements. Also, do you think you can find a sane way to combine them or at 
least reuse the error message?
{quote}
I think it might even be more readable, when it is expanded like this.
{quote}
I'm still hung up on +999 / 1000 in ResourceLocalizationService. How about 
Math.ceil(... / 1000.0)?
{quote}
I think we are talking about personal preferences at this point. I think that 
converting to float and back is a bit overkill. This pattern is a known one for 
rounding an integer up to the nearest multiple of a number without casting to 
float.      
{quote}
In TestContainerManager, I'm worried that the new tests are pretty heavily 
timing dependent. At the very least, the second wait should be for the full 
timeout period. Even better would be to validate the internal state is as 
expected, e.g. that the deletion thread is set to execute after the expected 
amount of time.
{quote}
The two waits add up to the total wait time. This is how it is designed to be 
consistent. Waiting half more is unnecessary, since 
{code}waitForApplicationDirDeleted{code} does the same. It verifies the 
internal state, whether the file was deleted. I see this pattern in other 
places in the tests as well. However, to make it more robust, I increased this 
timeout. The third test, {code}testDelayedKeep(){code} may cause false 
positives in case of a busy server, but any errors will be caught by "normal" 
test machines.
{quote}
In TestDeletionService.testCustomDisableDelete(), do you need to set 
DEBUG_NM_MAX_PER_APPLICATION_DELETE_DELAY_SEC? Also, you're missing an assert 
message in that test method further down.
{quote}
Are we looking at the same patch? I set it here:
{code}
204         conf.setInt(YarnConfiguration.DEBUG_NM_DELETE_DELAY_SEC, 
almostForever);
205         
conf.setInt(YarnConfiguration.DEBUG_NM_MAX_PER_APPLICATION_DELETE_DELAY_SEC,
206             Integer.MAX_VALUE);
{code}
{quote}
testCustomRetentionPolicy() should probably also test what happens if no 
container override is given, if the container override is 0, and if it's 2.
{quote}
testEffectiveDelay() is testing these cases already.
What do you think?

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-5600
>                 URL: https://issues.apache.org/jira/browse/YARN-5600
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Daniel Templeton
>            Assignee: Miklos Szegedi
>              Labels: oct16-medium
>         Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch, 
> YARN-5600.008.patch, YARN-5600.009.patch, YARN-5600.010.patch, 
> YARN-5600.011.patch, YARN-5600.012.patch, YARN-5600.013.patch, 
> YARN-5600.014.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to