[ https://issues.apache.org/jira/browse/YARN-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087782#comment-16087782 ]
Wangda Tan commented on YARN-6808:
----------------------------------

Thanks [~asuresh] for the additional explanations.

Frankly speaking, I'm a little bit worried about this direction. Currently, opportunistic container allocation is too simple to match what the existing schedulers can do, for example delayed scheduling and user limits. You might say that there's no guarantee for opportunistic containers, so we don't have to follow all of them. This is true in the context of YARN-2877, which targets short-lived containers that don't need any guarantees. However, if we make all containers beyond a queue's configured capacity opportunistic, it is a big deal. I'm not sure if you have any plans regarding this.

One related topic is YARN-1011: opportunistic containers will be used when a node is overcommitted. YARN-1013/YARN-1015 were opened to track the changes on the scheduler side. To me it's better to reuse existing scheduler code instead of duplicating code to achieve this. I'm not sure whether there are any concrete plans for this. cc: [~haibo.chen], [~kasha].

If, as you said, this approach is meant to replace existing preemption, I think this patch is far from that goal. Even with the latest patch, it can only mark which containers were allocated below the queue limit and which were allocated above it (which could be wrong if the queue's configured capacity changes). In that case, I don't see how it can support existing preemption features such as preemption of large containers, intra-queue preemption for app priority / user limit, etc.

bq. Essentially, the extra delay will look just like localization delay to the AM. We have verified this is fine for MapReduce and Spark.

To me this is too simple to solve the problem.
Yes, existing MR/Spark can tolerate delays while containers are localizing, but in most cases localization happens only once on each node, because the files that need to be localized are mostly common across containers from the same app (or the same set of tasks, such as mappers). It looks like, in a large cluster, delays need to be expected for every OC launch. This is not acceptable for SLA-sensitive jobs.

I don't want to stop or slow down this innovation, but from what I can see, the existing assumptions and plans need a lot of time to be verified. In this case, as with YARN-1011, do you think it would be better to create an umbrella JIRA (use opportunistic containers to do preemption) and move the related work to a branch?

> Allow Schedulers to return OPPORTUNISTIC containers when queues go over configured capacity
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-6808
>                 URL: https://issues.apache.org/jira/browse/YARN-6808
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-6808.001.patch, YARN-6808.002.patch
>
>
> This is based on discussions with [~kasha] and [~kkaranasos].
> Currently, when a Queue goes over capacity, apps on starved queues must wait
> either for containers to complete or for them to be pre-empted by the
> scheduler to get resources.
> This JIRA proposes to allow Schedulers to:
> # Allocate all containers over the configured queue capacity/weight as
> OPPORTUNISTIC.
> # Auto-promote running OPPORTUNISTIC containers of apps as and when their
> GUARANTEED containers complete.