[ https://issues.apache.org/jira/browse/YARN-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755095#comment-13755095 ]
Bikas Saha commented on YARN-1127:
----------------------------------

How is this different from YARN-957?

> reservation exchange and excess reservation is not working for capacity scheduler
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-1127
>                 URL: https://issues.apache.org/jira/browse/YARN-1127
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.1.1-beta
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>            Priority: Blocker
>
> I have two node managers:
> * one with 1024 MB of memory (nm1)
> * one with 2048 MB of memory (nm2)
>
> I am submitting a simple MapReduce application with one mapper and one reducer, each requesting 1024 MB. The steps to reproduce are:
> * Stop nm2 (2048 MB). This ensures that nm2's heartbeat does not reach the RM first.
> * Submit the application. As soon as the RM receives nm1's heartbeat, it tries to reserve 2048 MB for the AM container on nm1, even though nm1 has only 1024 MB of memory.
> * Restart nm2 with 2048 MB of memory.
>
> The application then hangs forever. There are two potential issues here:
> * Say 2048 MB is reserved on nm1, but nm2 comes back with 2048 MB of available memory. If the original request was made without any locality constraint, the scheduler should unreserve the memory on nm1 and allocate the requested 2048 MB container on nm2.
> * We support a notion where, say, we have 5 nodes of 8 GB each and 4 AMs of 2 GB each, with each AM requesting 8 GB. To avoid deadlock, each AM makes an extra reservation; by doing this we should never hit the deadlock situation.
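The unreserve-and-reallocate behaviour described in the first bullet can be sketched as follows. This is a minimal illustration only, under the assumption of a request with no locality constraint; the class, field, and method names here are hypothetical and are not the actual CapacityScheduler API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node model: available memory plus an outstanding reservation.
class Node {
    final String name;
    int availableMb;
    int reservedMb;   // memory reserved on this node but not yet allocated

    Node(String name, int availableMb) {
        this.name = name;
        this.availableMb = availableMb;
    }
}

class SimpleScheduler {
    // Place a request that has no locality constraint. If some node holds a
    // reservation it can never satisfy while another node has enough free
    // memory, drop the stale reservation and allocate on the capable node.
    static String allocate(List<Node> nodes, int requestMb) {
        for (Node n : nodes) {
            if (n.availableMb >= requestMb) {
                // A capable node exists: release stale reservations elsewhere.
                for (Node other : nodes) {
                    if (other != n && other.reservedMb > 0
                            && other.availableMb < other.reservedMb) {
                        other.reservedMb = 0;   // unreserve on the small node
                    }
                }
                n.availableMb -= requestMb;     // allocate here
                return n.name;
            }
        }
        return null;   // nothing fits yet; keep the reservation and wait
    }
}
```

With the scenario from the report (2048 MB reserved on a 1024 MB nm1, then nm2 returning with 2048 MB free), this sketch unreserves on nm1 and places the AM container on nm2 instead of hanging.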