[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177390#comment-16177390 ]

Wangda Tan commented on YARN-4511:
----------------------------------

Thanks [~haibo.chen],

1) SchedulerNode#resetAllocationThisHeartbeat: I'm not sure why we do this, since allowedResourceForOverAllocation is not used by anyone. Secondly, since allocation can be decoupled from the heartbeat, this looks confusing as well. Could you add more context here?

2) Changes to pass RMContext to ContainerUpdateContext: can we get the SchedulerNodes inside SchedulerAppAttempt#pullNewlyUpdatedContainers and pass them to swapContainer?

3) ContainerUpdateContext#swapContainer:
- In what cases will oldNode and newNode be null, and should we throw an exception when this happens? The existing logic could cause nodeUpdate to be called on one node but not on the other.
- Related: {{SchedulerNode#containerUpdated}}. This part of the logic looks confusing: since old containers will be finished inside {{pullNewlyUpdatedContainers}}, do we really need this method? I would like to hear thoughts from [~asuresh] on this part as well.

4) SchedulerNode:
- Instead of having two separate Map<ContainerId, ContainerInfo>, could we keep just one? It would make logic such as {{getContainer(ContainerId)}} simpler as well. We can get a container's executionType from RMContainer in any case.
- Why is {{containerLaunched}} added? Should we just increase the allocated opportunistic/guaranteed resource?

Not related to this patch but also important:

1) I think one of ResourceThresholds and OverAllocationInfo should be removed; they largely duplicate each other. We should try to avoid unnecessary PB records.

2) Should we consider all resource types for configurations / internal calculation? My expectation is that if we want to add overallocation for a different resource, such as disk, we shouldn't have to change all of these places.
So it's probably better to convert configs / fields from individual resource types to a vector and avoid logic like:
{code}
ResourceThresholds thresholds =
    overAllocationInfo.getOverAllocationThresholds();
Resource overAllocationThreshold = Resources.createResource(
    (long) (capacity.getMemorySize() * thresholds.getMemoryThreshold()),
    (int) (capacity.getVirtualCores() * thresholds.getCpuThreshold()));
{code}

> Common scheduler changes supporting scheduler-specific implementations
> ----------------------------------------------------------------------
>
>                 Key: YARN-4511
>                 URL: https://issues.apache.org/jira/browse/YARN-4511
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Haibo Chen
>         Attachments: YARN-4511-YARN-1011.00.patch, YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch
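
For illustration only, the vector-style calculation suggested in the comment above could look roughly like the sketch below. It is not taken from the patch and does not use the real YARN classes: the class OverAllocationSketch, the method scaleByThresholds, and the resource names used as map keys are all hypothetical, and skipping resources that have no configured threshold is an assumption made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a vector-based threshold calculation.
// Plain maps keyed by resource name replace YARN's Resource and
// ResourceThresholds so the sketch compiles on its own.
final class OverAllocationSketch {
    private OverAllocationSketch() {}

    /**
     * Multiplies each resource's capacity by its threshold.
     * Assumption: a resource with no configured threshold is skipped,
     * i.e. it gets no entry in the result.
     */
    static Map<String, Long> scaleByThresholds(
            Map<String, Long> capacity, Map<String, Double> thresholds) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : capacity.entrySet()) {
            Double threshold = thresholds.get(entry.getKey());
            if (threshold == null) {
                continue; // no threshold configured for this resource type
            }
            // Same arithmetic as the per-resource code, done once per type.
            result.put(entry.getKey(), (long) (entry.getValue() * threshold));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> capacity = new LinkedHashMap<>();
        capacity.put("memory-mb", 8192L);
        capacity.put("vcores", 8L);
        capacity.put("disk-mb", 100000L); // a new type needs no code change

        Map<String, Double> thresholds = new LinkedHashMap<>();
        thresholds.put("memory-mb", 0.75);
        thresholds.put("vcores", 0.5);
        thresholds.put("disk-mb", 0.5);

        System.out.println(scaleByThresholds(capacity, thresholds));
    }
}
```

With this shape, adding a resource type such as disk only means adding entries to the capacity and threshold maps; the loop itself stays unchanged, which is the property the comment asks for.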