[
https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573946#comment-16573946
]
Zian Chen edited comment on YARN-8523 at 8/8/18 10:07 PM:
----------------------------------------------------------
[~eyang], thanks for raising this feature. This is very useful for live
debugging and container diagnosis. We can add a series of interactive commands
to let users debug more effectively, such as tail -f on the container log,
container resource usage, etc.
To handle the NodeManager restart scenario, we can register an event listener
on the NodeManager web socket that listens for restart or shutdown signals and
responds in the xterm.js terminal accordingly (for example, by printing an NM
restart/shutdown message to the user), then retries the reconnect several
times after the typical NM restart interval.
Again, if the NM hits an unexpected issue and cannot resume its service, that
is something we cannot solve from the interactive docker shell itself. We
should just give the user a reasonable alert message describing the current
situation (for example, "retry failed with timeout, please check the NM log
for more information").
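The reconnect behaviour described above could look roughly like the sketch below. It is only an illustration under assumptions, not the proposed implementation: `connect` and `notify` are hypothetical callbacks standing in for "reopen the NM web socket" and "print a line in the terminal", and the retry count and restart interval are made-up defaults.

```python
import time
from typing import Callable


def reconnect_with_retries(
    connect: Callable[[], bool],       # hypothetical: reopen the NM web socket
    notify: Callable[[str], None],     # hypothetical: print a line in the terminal
    max_retries: int = 5,              # assumed default, not from the issue
    restart_interval_s: float = 10.0,  # assumed typical NM restart interval
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to re-establish a dropped shell session a bounded number of times."""
    notify("NodeManager restarted or shut down; trying to reconnect...")
    for attempt in range(1, max_retries + 1):
        # Wait roughly one restart interval before each attempt.
        sleep(restart_interval_s)
        try:
            if connect():
                notify(f"Reconnected on attempt {attempt}.")
                return True
        except OSError:
            # Treat a connection error the same as a failed attempt.
            pass
        notify(f"Reconnect attempt {attempt} of {max_retries} failed.")
    # The NM did not come back: tell the user where to look next.
    notify("Retry failed with timeout; please check the NM log for more information.")
    return False
```

Bounding the retries keeps the terminal from hanging forever when the NM cannot resume service, and the final message points the user at the NM log instead of trying to recover from the shell itself.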
I think passing commands through the NM web socket and reusing the
container-executor security check would be a good prototype to build first,
without taking on the burden of handling a root daemon by carving out another
secure channel.
> Interactive docker shell
> ------------------------
>
> Key: YARN-8523
> URL: https://issues.apache.org/jira/browse/YARN-8523
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Priority: Major
> Labels: Docker
>
> Some applications might require interactive execution of unix commands to
> carry out operations. Container-executor can interface with docker exec to
> debug or analyze docker containers while the application is running. It
> would be nice to support an API that invokes docker exec to run unix
> commands and reports the output back to the application master. The
> application master can distribute the commands and aggregate their output
> for recording in the application master log file.