[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760933#comment-13760933 ]
Vinod Kumar Vavilapalli commented on YARN-867:
----------------------------------------------
bq. Vinod Kumar Vavilapalli, are we making the call that an issue in service
handling is considered a container failure? For the MR AM, it may be critical
for the shuffle to work but this is not necessarily true for all applications
and all services that they interact with.
Yeah, I think that is a reasonable assumption for now. I haven't seen any
aux-services other than shuffle. In the future, we could make this specifiable
per container, along with the concept of optional aux-services for containers
(today every aux-service is implicitly required). And we can do that in a
compatible manner.
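A minimal sketch of that idea, in plain Java with no Hadoop dependencies. Every
name in it (AuxServiceIsolationSketch, AuxService, Registration, Outcome,
handleAppInit) is hypothetical and is not one of the NodeManager's actual
classes. The required/optional flag only illustrates the possible future
extension; today everything would be registered as required.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these types only imitate the shape of the problem and
// are not the NodeManager's real classes.
final class AuxServiceIsolationSketch {

    /** Stand-in for an auxiliary service such as the shuffle handler. */
    interface AuxService {
        void initializeApplication(String appId, byte[] serviceData) throws Exception;
    }

    /** What the caller should do with the container after the event. */
    enum Outcome { OK, CONTAINER_FAILED }

    /** A registered service plus a flag saying whether containers depend on it. */
    private static final class Registration {
        final AuxService service;
        final boolean required;

        Registration(AuxService service, boolean required) {
            this.service = service;
            this.required = required;
        }
    }

    private final Map<String, Registration> services = new LinkedHashMap<>();
    final List<String> diagnostics = new ArrayList<>();

    void register(String name, AuxService service, boolean required) {
        services.put(name, new Registration(service, required));
    }

    /**
     * Delivers the application-init event to every registered service.
     * Any Exception (not only IOException) is caught here, so bad service
     * data cannot propagate up into the thread that dispatches events.
     * Errors such as OutOfMemoryError are deliberately not caught.
     */
    Outcome handleAppInit(String appId, byte[] serviceData) {
        Outcome outcome = Outcome.OK;
        for (Map.Entry<String, Registration> entry : services.entrySet()) {
            Registration reg = entry.getValue();
            try {
                reg.service.initializeApplication(appId, serviceData);
            } catch (Exception e) {
                diagnostics.add(entry.getKey() + " failed for " + appId + ": " + e);
                // A required service failing is treated as a container failure;
                // an optional one is only recorded.
                if (reg.required) {
                    outcome = Outcome.CONTAINER_FAILED;
                }
            }
        }
        return outcome;
    }

    public static void main(String[] args) {
        AuxServiceIsolationSketch nm = new AuxServiceIsolationSketch();
        // Simulates a service that chokes on malformed data with a RuntimeException.
        nm.register("shuffle", (app, data) -> {
            throw new IllegalArgumentException("bad data");
        }, true);
        Outcome outcome = nm.handleAppInit("app_1", new byte[0]);
        System.out.println(outcome + " " + nm.diagnostics);
    }
}
```

The point of the sketch is that the failure surfaces as a per-container outcome
plus a diagnostic message, rather than as an uncaught exception in the caller.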
> Isolation of failures in aux services
> --------------------------------------
>
> Key: YARN-867
> URL: https://issues.apache.org/jira/browse/YARN-867
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Hitesh Shah
> Assignee: Xuan Gong
> Priority: Critical
> Attachments: YARN-867.1.sampleCode.patch, YARN-867.sampleCode.2.patch
>
>
> Today, a malicious application can bring down the NM by sending bad data to a
> service. For example, sending data to the ShuffleService such that it results
> in any non-IOException will cause the NM's async dispatcher to exit, as the
> service's INIT APP event is not handled properly.