[ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730055#comment-15730055 ]
Wangda Tan commented on YARN-5889:
----------------------------------

Thanks [~sunilg] for working on the patch, and [~jlowe] and [~eepayne] for the suggestions.

Personally, I think the title and description are a little confusing. First of all, I think the most important target of this JIRA is not improving performance. It is to make user-limit preemption correct.

Currently we compute a single user-limit value for each leaf queue. This is enough for allocation but not enough for preemption.

Here is an example. A queue has cap=max-cap=100, min-user-limit-percent=50, user-limit-factor=1. At time T, there are 2 users using resources:
{code}
u1.used = 75, u2.used = 25
{code}

Only u2 is an active user. According to the existing user-limit computation:
{code}
user_limit = round_up(
    min(
        max(current_capacity / #active_user, current_capacity * user_limit_percent),
        queue_capacity * user_limit_factor),
    minimum_allocation)
{code}

The computed user-limit is 100, which is more than any user's usage, so nothing will be preempted.

We can give many other examples, like:
{code}
minimum-user-limit-percent = 33
3 users: u1.used = 50, u2.used = 20, u3.used = 30
u2/u3 are active users
{code}

The computed user-limit is 50, and no user is above it, so preemption cannot kick in. This problem can happen whenever #active-user < #total-user.

The problem is that at the allocation stage we only need to check active users, but in preemption we need to preempt resources from non-active users.

To solve the problem, we need to compute the user limit considering non-active users. If a non-active user uses less than minimum-user-limit, we can continue to distribute its available quota to other active users; on the other hand, if a non-active user uses more than minimum-user-limit, we can also take resources from that user.

This computation is more expensive: it should be O(N), where N is the number of applications in the queue. That is why we need an async thread to do all of this: we cannot put an O(N) computation on the allocation thread.
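To make the two examples above easier to check, here is a minimal Python sketch of the existing formula. It is not the CapacityScheduler code: the function and parameter names are made up for illustration, `minimum_allocation = 1` is an assumption (the examples do not state it), and `current_capacity` is taken to be 100 in both cases.

```python
import math


def user_limit(current_capacity, active_users, min_user_limit_percent,
               queue_capacity, user_limit_factor, minimum_allocation):
    """Illustrative version of the existing per-queue user-limit formula."""
    # Share per active user, floored by the configured minimum percentage.
    share = max(current_capacity / active_users,
                current_capacity * min_user_limit_percent / 100)
    # Cap by queue capacity scaled by user-limit-factor.
    capped = min(share, queue_capacity * user_limit_factor)
    # Round up to a multiple of the minimum allocation.
    return math.ceil(capped / minimum_allocation) * minimum_allocation


def over_limit(used, limit):
    """Hypothetical helper: users above the limit and by how much."""
    return {user: amount - limit for user, amount in used.items() if amount > limit}


# Example 1: only u2 is active, so #active_user = 1 (minimum_allocation = 1 is assumed).
limit1 = user_limit(100, 1, 50, 100, 1, 1)
print(limit1, over_limit({"u1": 75, "u2": 25}, limit1))

# Example 2: u2 and u3 are active, so #active_user = 2.
limit2 = user_limit(100, 2, 33, 100, 1, 1)
print(limit2, over_limit({"u1": 50, "u2": 20, "u3": 30}, limit2))
```

In both cases `over_limit` is empty, which is why nothing gets preempted. Purely for contrast, plugging the total user count (2) into example 1 gives a limit of 50, at which u1 would be 25 over; that substitution only illustrates why non-active users matter and is not the computation proposed in the patch.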
To me, the things that the computation of the (actual) user-limit and fair share (FS) have in common are:
- Both are too expensive to compute when checking every application.
- Both are instantaneous limits, and users are not expected to understand the computed instantaneous value. The limit and the usage can keep changing, but they will converge to a balance over a period of time.

I haven't checked the patch implementation yet. Please let us know your thoughts about the overall points. I also don't want this change to block the user-limit preemption effort, so it would be helpful if you could share ideas about how we can achieve user-limit preemption without the async thread approach.

Thanks,

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch,
> YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle with
> a write lock. To improve performance, this ticket focuses on moving
> user-limit calculation out of the heartbeat allocation flow.