[ https://issues.apache.org/jira/browse/YARN-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699314#comment-13699314 ]
Steve Loughran commented on YARN-896:
-------------------------------------

Based on our work on Hoya (HBase on YARN):

* we need a restarted AM to be given the existing set of containers from its previous instance. The use case is that region servers should stay up while the AM and master are restarted.
* maybe: be able to warn YARN that the services will be long-lived. That could be used in scheduling and placement.
* anti-affinity is needed to declare that different container instances SHOULD be deployed on different nodes (use case: region servers). If failure domains are supported in the topology, anti-affinity should use them. I don't know if we'd want best-effort vs absolute requirements.
* add the ability to increase the requirements of running containers, e.g. say "this service is using more RAM than expected, reduce the amount available to others".
* maybe: the ability to send kill signals to container processes, to do a graceful kill before escalating. This is of limited value if an extra process (such as {{bin/hbase}}) intervenes in the startup process.

There's also long-lived service discovery, a topic for another JIRA.

> Roll up for long lived YARN
> ---------------------------
>
>                 Key: YARN-896
>                 URL: https://issues.apache.org/jira/browse/YARN-896
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Robert Joseph Evans
>
> YARN is intended to be general purpose, but it is missing some features to be
> able to truly support long lived applications and long lived containers.
> This ticket is intended to
> # discuss what is needed to support long lived processes
> # track the resulting JIRA.