[
https://issues.apache.org/jira/browse/YARN-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755090#comment-13755090
]
Bikas Saha commented on YARN-1127:
----------------------------------
Isn't this similar to a JIRA you already opened? The issue is that the
scheduler places a reservation on a node whose total capacity is smaller than
the size of the reserved resource. In this case, nm1 has capacity=1024 but the
scheduler is placing a reservation of 2048 on it, which can never be
satisfied. So it does not make sense to make that reservation at all.
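
To make the point concrete, here is a minimal sketch of the check I have in mind. It is not the actual CapacityScheduler code: the Node class, the Decision enum and the decide() method are hypothetical names made up for illustration, and memory is the only resource considered.

```java
// Hypothetical sketch; these are NOT the real CapacityScheduler classes.
public class ReservationCheck {

    /** Minimal stand-in for a node manager as seen by the scheduler (memory in MB). */
    static final class Node {
        final String name;
        final int totalMb;
        int usedMb;

        Node(String name, int totalMb) {
            this.name = name;
            this.totalMb = totalMb;
        }

        int availableMb() {
            return totalMb - usedMb;
        }
    }

    enum Decision { ALLOCATE, RESERVE, SKIP }

    /** Decide what to do with a request of requestMb when this node heartbeats. */
    static Decision decide(Node node, int requestMb) {
        // The request exceeds the node's total capacity, so a reservation
        // here could never be satisfied. Do not reserve at all.
        if (requestMb > node.totalMb) {
            return Decision.SKIP;
        }
        // The request fits into the currently free memory.
        if (requestMb <= node.availableMb()) {
            return Decision.ALLOCATE;
        }
        // The node is big enough but busy right now, so reserving makes sense.
        return Decision.RESERVE;
    }

    public static void main(String[] args) {
        Node nm1 = new Node("nm1", 1024);
        Node nm2 = new Node("nm2", 2048);
        System.out.println(nm1.name + ": " + decide(nm1, 2048));
        System.out.println(nm2.name + ": " + decide(nm2, 2048));
    }
}
```

With the numbers from this issue, the 2048 request is skipped on nm1 and can be allocated on nm2 as soon as nm2 heartbeats, so the application would not be stuck behind a reservation that can never be fulfilled.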
> reservation exchange and excess reservation is not working for capacity
> scheduler
> ---------------------------------------------------------------------------------
>
> Key: YARN-1127
> URL: https://issues.apache.org/jira/browse/YARN-1127
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.1.1-beta
> Reporter: Omkar Vinit Joshi
> Assignee: Omkar Vinit Joshi
> Priority: Blocker
>
> I have 2 node managers:
> * one with 1024 MB of memory (nm1)
> * a second with 2048 MB of memory (nm2)
> I am submitting a simple MapReduce application with 1 mapper and 1 reducer
> with 1024 MB each. The steps to reproduce this are:
> * Stop nm2 (the one with 2048 MB of memory). I am doing this to make sure
> that this node's heartbeat doesn't reach the RM first.
> * Now submit the application. As soon as the RM receives the first node's
> (nm1) heartbeat, it will try to reserve memory for the AM container
> (2048 MB). However, nm1 has only 1024 MB of memory.
> * Now start nm2 with 2048 MB of memory.
> The application hangs forever. This points to two potential issues:
> * Say 2048 MB is reserved on nm1 but nm2 comes back with 2048 MB of available
> memory. In this case, if the original request was made without any locality
> constraint, the scheduler should unreserve the memory on nm1 and allocate the
> requested 2048 MB container on nm2.
> * We support a notion where, say, we have 5 nodes and 4 AMs; all node
> managers have 8 GB each and the AMs take 2 GB each. Each AM is requesting
> 8 GB. Now, to avoid deadlock, an AM will make an extra reservation. By doing
> this we would never hit the deadlock situation.