[ 
https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294833#comment-17294833
 ] 

Szilard Nemeth edited comment on YARN-10652 at 3/4/21, 9:32 AM:
----------------------------------------------------------------

Hi [~sahuja]

First of all, thanks for working on this.
I can't add much; [~pbacsko] and [~shuzirra] have pretty much summarized my 
concerns.
Still, I would like to state my own opinion.
There will be some repetition of previous comments, so bear with me.

1. As Gergo said, we need to keep consistency. It's one thing that usernames 
with dots are kind of supported, but are they really supported in all parts of 
the system? Definitely not for placement rules, as the rule Gergo mentioned 
("root.user.%user") could easily cause an issue. It's okay that some customers 
don't want to use placement rules and that your change is not strictly related 
to placement rules. But if we are encouraging the use of usernames with dots 
across the codebase, we need to handle these usernames in all aspects of the 
system. What if some customers are using usernames with dots and placement 
rules? There we have a problem, so we need a more complete solution.

2. Support for usernames with dots: Was this documented anywhere, or can this 
fact only be dug up from the codebase?

3. We also understand that this is a queue setting, that usernames are stored 
in the config objects, and that you are just retrieving them with that regex. 
The problem here is that calling this "supported" is an overstatement, as ACL 
handling and placement rules could be problematic areas.

4.  
{quote}
But we are supporting usernames with dots today. Users with dots in their 
usernames can submit jobs to the cluster having CS with no issues today (again, 
I am not talking about queue placement with queues with dots here). There are 
no errors reported when users with dots are supplied against 
"yarn.scheduler.capacity.<queue-path>.user-settings.<user-name>.weight setting" 
and in fact, there should NOT be any errors when it is done so. These are 
real-world usernames and we will have to accept them from any interface, 
whether it be UI or CLI or anything. 
{quote}

My answer to this is given in point 3.

5. [~wilfreds] I don't agree with this:
{quote}
If you want to solve the generic dot issue for user based placement then that 
is outside of this change. 
{quote}
Why would we allow usernames with dots in more and more places in the code and 
forget about the generic solution? That doesn't make sense to me; it just leads 
to developer and user confusion, IMHO.
TBH, it's too confusing as it is now. As [~sahuja] said, users can submit jobs 
without a problem. Then someone defines a simple username-based placement rule 
and things stop working? That's just not consistent and not acceptable 
from the user's point of view.
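
To illustrate the concern from point 1, here is a minimal Java sketch. It is not the actual placement rule code: {{resolveQueuePath}} is a hypothetical helper, and it assumes only that "%user" is substituted into the rule and that the resulting queue path is split on dots.

```java
import java.util.Arrays;
import java.util.List;

final class PlacementRuleDotDemo {

  // Hypothetical helper (not a YARN API): substitutes the username into the
  // rule and splits the resulting queue path into its dot-separated parts.
  static List<String> resolveQueuePath(String rule, String user) {
    String path = rule.replace("%user", user);
    return Arrays.asList(path.split("\\."));
  }

  public static void main(String[] args) {
    String rule = "root.user.%user";

    // A username without dots yields the expected three-level path.
    System.out.println(resolveQueuePath(rule, "alice"));
    // prints [root, user, alice]

    // A username with a dot is indistinguishable from an extra queue level.
    System.out.println(resolveQueuePath(rule, "firstname.lastname"));
    // prints [root, user, firstname, lastname]
  }
}
```

Under these assumptions, the dotted username produces a four-level path, so the last queue name no longer equals the username.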




> Capacity Scheduler fails to handle user weights for a user that has a "." 
> (dot) in it
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-10652
>                 URL: https://issues.apache.org/jira/browse/YARN-10652
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.3.0
>            Reporter: Siddharth Ahuja
>            Assignee: Siddharth Ahuja
>            Priority: Major
>         Attachments: Correct user weight of 0.76 picked up for the user with 
> a dot after the patch.png, Incorrect default user weight of 1.0 being picked 
> for the user with a dot before the patch.png, YARN-10652.001.patch
>
>
> AD usernames can contain a "." (dot), e.g. they can have the format 
> {{firstname.lastname}}. However, if you specify such a username in the 
> Capacity Scheduler setting 
> {{yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight}},
> the weight is not applied and the user is instead assigned the default weight 
> of 1.0f. This renders the user weight feature (used as a means of setting 
> user priorities for a queue) unusable for such users.
> This limitation comes from [1]: at the moment only word characters 
> ([a-zA-Z_0-9], see [2]) are permitted, which rules out AD names that contain 
> a "." (dot).
> A similar discussion took place in a few HADOOP jiras, e.g. HADOOP-7050 and 
> HADOOP-15395, and the outcome was to allow non-whitespace characters, i.e. 
> to use {{\S+}} instead of {{\w+}}.
> We could go down a similar path and unblock this feature for AD usernames 
> with a "." (dot) in them.
> [1] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1953
> [2] 
> https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html
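
The difference between {{\w+}} and {{\S+}} described above can be sketched in Java. This is a minimal illustration under stated assumptions, not the code at [1]: the class name, the {{PREFIX}} and {{SUFFIX}} constants, and the {{extractUser}} helper are hypothetical and only model the shape of the setting quoted in the description.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UserWeightKeyDemo {

  // Hypothetical constants modeled on the setting quoted in the description.
  static final String PREFIX =
      "yarn.scheduler.capacity.root.default.user-settings.";
  static final String SUFFIX = ".weight";

  // Hypothetical helper: extracts the username from a user-weight key, using
  // the given regex fragment for the username part of the key.
  static Optional<String> extractUser(String key, String userRegex) {
    Pattern p = Pattern.compile(
        Pattern.quote(PREFIX) + "(" + userRegex + ")" + Pattern.quote(SUFFIX));
    Matcher m = p.matcher(key);
    return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
  }

  public static void main(String[] args) {
    String key = PREFIX + "firstname.lastname" + SUFFIX;

    // \w+ cannot cross the dot inside the username, so the key is not matched.
    System.out.println(extractUser(key, "\\w+"));
    // prints Optional.empty

    // \S+ accepts any non-whitespace character, including the dot.
    System.out.println(extractUser(key, "\\S+"));
    // prints Optional[firstname.lastname]
  }
}
```

In this sketch, a key that is not matched yields no username, which corresponds to the user falling back to the default weight.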


