[
https://issues.apache.org/jira/browse/YARN-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902101#comment-15902101
]
Junping Du edited comment on YARN-5311 at 3/8/17 10:41 PM:
-----------------------------------------------------------
Sorry for coming to this late; reviewing documentation is never easy work.
Thanks [~elek] for the patch. Some comments so far:
1. In the overview, we should explain some high-level use cases, such as
elasticity for YARN nodes in public cloud infrastructure. We should also
mention client-side and server-side timeout tracking and how they differ from
the perspective of IT operations.
2. As far as I remember, initially we do not support specifying a timeout value
in the exclude file for client-side timeout tracking. It seems YARN-4676 only
supports that for server-side tracking. We should mention that explicitly.
3. Also, for the exclude file, we should mention that we currently support only
plain text (no timeout value) and XML. However, we plan to support a JSON
format in the future; please refer to YARN-5536 for more details.
4. We should mention the behavior when the RM is restarted or fails over: the
decommissioning node will get decommissioned after the RM comes back, as no
timeout value is preserved so far. We should enhance this later, once YARN-5464
is fixed. For now we can just mention the current behavior as a NOTE and
update it once we have a better solution.
Some NITs:
bq. (Note: It isn't needed to restart resourcemanager in case of changing the
exclude-path as it's reread at every `refresNodes` command)
We should make it more readable, something like: "It is unnecessary to restart
the RM when changing the exclude-path, as this config is read again on every
'refreshNodes' command"
bq. +* WAIT_CONTAINER --- wait for running containers to complete.
Capitalize the "w" in "wait", as in the other items.
bq. +* WAIT_APP --- wait for running application to complete (after all
containers complete)
Same comment as above.
> Document graceful decommission CLI and usage
> --------------------------------------------
>
> Key: YARN-5311
> URL: https://issues.apache.org/jira/browse/YARN-5311
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: documentation
> Affects Versions: 2.9.0
> Reporter: Junping Du
> Assignee: Elek, Marton
> Attachments: YARN-5311.001.patch, YARN-5311.002.patch,
> YARN-5311.003.patch
>
>