[
https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291614#comment-17291614
]
Gergely Pollak commented on YARN-10652:
---------------------------------------
The problem here is consistency. User names are used as queue path parts in
multiple places; mapping rules are one example. If we don't handle user names
with dots properly everywhere, then we cannot say we support user names with
dots. We need to handle them in each and every case, which means we need a
more centralized, or at least consistent, solution. Simply replacing the dots
with underscores for this property won't solve the issue CS-wide (Capacity
Scheduler-wide). A CS-wide solution might consist of multiple separate smaller
fixes like this one, but the effort needs to be coordinated centrally.
For example, as soon as someone introduces a rule like root.user.%user, we
already have an issue with user names with dots, and if we handle them
differently there than in your solution, we might get inconsistent
configuration and behavior.
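To illustrate the mapping rule concern, here is a minimal sketch. It assumes that queue paths are split on dots and that %user is substituted as plain text; the class and both helper methods are hypothetical and are not the Capacity Scheduler's actual code.

```java
import java.util.Arrays;
import java.util.List;

public class QueuePathSplitDemo {
    // Hypothetical helper: substitutes the user name into a mapping-rule template.
    static String expand(String template, String user) {
        return template.replace("%user", user);
    }

    // Hypothetical helper: splits a queue path into its dot-separated parts.
    static List<String> parts(String queuePath) {
        return Arrays.asList(queuePath.split("\\."));
    }

    public static void main(String[] args) {
        // A user name without dots yields the expected three path parts.
        System.out.println(parts(expand("root.user.%user", "alice")));
        // A user name with a dot yields four parts, so "lastname" looks like
        // a child queue of "firstname".
        System.out.println(parts(expand("root.user.%user", "firstname.lastname")));
    }
}
```

Under these assumptions the dotted user name changes the depth of the queue path, which is why every place that builds a path from a user name would need the same escaping.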
We also need to check other places where user names (and group names) with
dots can cause issues.
I also prefer the FS (Fair Scheduler) solution of substituting '_dot_' rather
than a simple '_', since there is a much smaller chance of user name collision
this way.
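The collision argument can be sketched as follows. The class and method names are hypothetical, and the replacement logic is only an illustration of the two substitution schemes, not the code used by either scheduler.

```java
public class DotSubstitutionDemo {
    // Replaces each dot with a single underscore.
    static String withUnderscore(String user) {
        return user.replace(".", "_");
    }

    // Replaces each dot with the longer token "_dot_".
    static String withDotToken(String user) {
        return user.replace(".", "_dot_");
    }

    public static void main(String[] args) {
        String dotted = "first.last";
        String underscored = "first_last";

        // Both names map to "first_last", so two distinct users collide.
        System.out.println(withUnderscore(dotted).equals(withUnderscore(underscored)));

        // "first_dot_last" and "first_last" stay distinct.
        System.out.println(withDotToken(dotted).equals(withDotToken(underscored)));
    }
}
```

A collision is still possible with '_dot_' if a user is literally named first_dot_last, but such names are far less likely than names containing a plain underscore.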
> Capacity Scheduler fails to handle user weights for a user that has a "."
> (dot) in it
> -------------------------------------------------------------------------------------
>
> Key: YARN-10652
> URL: https://issues.apache.org/jira/browse/YARN-10652
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 3.3.0
> Reporter: Siddharth Ahuja
> Assignee: Siddharth Ahuja
> Priority: Major
> Attachments: Correct user weight of 0.76 picked up for the user with
> a dot after the patch.png, Incorrect default user weight of 1.0 being picked
> for the user with a dot before the patch.png, YARN-10652.001.patch
>
>
> AD usernames can have a "." (dot) in them i.e. they can be of the format ->
> {{firstname.lastname}}. However, if you specify a username with this format
> against the Capacity Scheduler setting ->
> {{yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight}},
> it fails to be applied and is instead assigned the default of 1.0f weight.
> This renders the user weight feature (being used as a means of setting user
> priorities for a queue) unusable for such users.
> This limitation comes from [1]. From [1], only word characters (A word
> character: [a-zA-Z_0-9]) (see [2]) are permissible at the moment which is no
> good for AD names that contain a "." (dot).
> Similar discussion has been had in a few HADOOP jiras e.g. HADOOP-7050 and
> HADOOP-15395 and the outcome was to use non-whitespace characters i.e.
> instead of {{\w+}}, use {{\S+}}.
> We could go down a similar path and unblock this feature for AD usernames
> with a "." (dot) in them.
> [1]
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1953
> [2]
> https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html
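The difference between {{\w+}} and {{\S+}} described in the issue can be sketched with java.util.regex. The patterns below are simplified, hypothetical stand-ins for the one referenced at [1]; only the character class for the user name part is taken from the description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserWeightPatternDemo {
    // Hypothetical, simplified property prefix; not the pattern from [1].
    static final String PREFIX =
        "yarn\\.scheduler\\.capacity\\.root\\.default\\.user-settings\\.";

    // User name restricted to word characters [a-zA-Z_0-9].
    static final Pattern WORD = Pattern.compile(PREFIX + "(\\w+)\\.weight");

    // User name allowed to contain any non-whitespace character.
    static final Pattern NON_WHITESPACE = Pattern.compile(PREFIX + "(\\S+)\\.weight");

    // Returns the captured user name, or null if the key does not match.
    static String extractUser(Pattern pattern, String key) {
        Matcher m = pattern.matcher(key);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String key = "yarn.scheduler.capacity.root.default.user-settings."
            + "firstname.lastname.weight";
        // No match: the dot in the user name is not a word character.
        System.out.println(extractUser(WORD, key));
        // Matches and captures the full dotted user name.
        System.out.println(extractUser(NON_WHITESPACE, key));
    }
}
```

Note that {{\S+}} also accepts dots that are meant as path separators, which ties into the consistency concern raised in the comment above.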