[ 
https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291334#comment-17291334
 ] 

Siddharth Ahuja edited comment on YARN-10652 at 2/26/21, 1:48 AM:
------------------------------------------------------------------

Tested the fix on trunk on a single-node cluster with the following steps:

* On macOS, create a _Standard_ user whose username contains a "." (dot), e.g. 
{{firstname.lastname}}, as per [1].

* Set up the single-node cluster for trunk and grant rwx permissions on the 
HDFS /tmp folder to all users so that the new user can write there; otherwise, 
job submissions will fail:

{code}
admin@mac hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -chmod -R a+rwx /tmp
2021-02-26 12:21:54,034 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
{code}

* Add the following setting to 
{{hadoop-3.4.0-SNAPSHOT/etc/hadoop/capacity-scheduler.xml}} to set a weight 
for the {{firstname.lastname}} user:

{code}
  <property>
    <name>yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight</name>
    <value>0.76</value>
  </property>
{code}

* Ensure HDFS & RM services are running on the single node cluster and run the 
sleep job as the new user:

{code}
admin@sahuja-MBP16 hadoop-3.4.0-SNAPSHOT % sudo -u firstname.lastname bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.4.0-SNAPSHOT-tests.jar sleep -m 1 -mt 600000
<Enter admin password>

2021-02-26 12:21:57,305 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-02-26 12:21:57,989 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
2021-02-26 12:21:58,101 INFO client.AHSProxy: Connecting to Application History server at /0.0.0.0:10200
2021-02-26 12:21:58,581 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/firstname.lastname/.staging/job_1614302477419_0001
2021-02-26 12:21:59,343 INFO mapreduce.JobSubmitter: number of splits:1
2021-02-26 12:21:59,440 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1614302477419_0001
2021-02-26 12:21:59,440 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-02-26 12:21:59,573 INFO conf.Configuration: resource-types.xml not found
2021-02-26 12:21:59,574 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-02-26 12:21:59,987 INFO impl.YarnClientImpl: Submitted application application_1614302477419_0001
2021-02-26 12:22:00,025 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1614302477419_0001/
2021-02-26 12:22:00,025 INFO mapreduce.Job: Running job: job_1614302477419_0001
2021-02-26 12:22:07,128 INFO mapreduce.Job: Job job_1614302477419_0001 running in uber mode : false
2021-02-26 12:22:07,130 INFO mapreduce.Job:  map 0% reduce 0%
...
{code}

* Check the "_Active Users Info_" section after expanding the {{root.default}} 
queue on the RM Scheduler page at http://localhost:8088/cluster/scheduler. The 
weight shown for the {{firstname.lastname}} user should be 0.76 instead of the 
default 1.0. Confirmed this to be working after the change.

[1] https://support.apple.com/en-au/guide/mac-help/mtusr001/mac

The JUnit test has also been updated to verify that the weights for usernames 
containing a dot are applied. While there, I also fixed the test so that the 
{{assertEquals}} overload with float arguments is picked up instead of the 
double one, by appending the suffix "f" to the literal values, and removed the 
unnecessary unboxing using {{floatValue()}}.
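
As a rough illustration of the overload point above, here is a minimal sketch. The two {{pickedOverload}} methods are hypothetical stand-ins that only mirror the (double, double, double) and (float, float, float) parameter shapes of {{assertEquals}}; they are not JUnit code, and the class name is made up for this example.

```java
public class FloatOverloadDemo {
    // Hypothetical stand-in for the double-based overload shape.
    public static String pickedOverload(double expected, double actual, double delta) {
        return "double";
    }

    // Hypothetical stand-in for the float-based overload shape.
    public static String pickedOverload(float expected, float actual, float delta) {
        return "float";
    }

    public static void main(String[] args) {
        Float weight = 0.76f; // boxed value, as a getter returning Float might give

        // Literals without the "f" suffix are doubles, so only the double
        // overload is applicable, even with an explicit floatValue() call.
        System.out.println(pickedOverload(0.76, weight.floatValue(), 0.0));

        // With "f" suffixes the float overload is the most specific match,
        // and the boxed Float is unboxed automatically.
        System.out.println(pickedOverload(0.76f, weight, 0.0f));
    }
}
```

One reason the overload can matter: 0.76f widened to double is not equal to the double literal 0.76, so a comparison routed through the double overload with a zero delta would compare two different values.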


> Capacity Scheduler fails to handle user weights for a user that has a "." 
> (dot) in it
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-10652
>                 URL: https://issues.apache.org/jira/browse/YARN-10652
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.3.0
>            Reporter: Siddharth Ahuja
>            Assignee: Siddharth Ahuja
>            Priority: Major
>         Attachments: Correct user weight of 0.76 picked up for the user with 
> a dot after the patch.png, Incorrect default user weight of 1.0 being picked 
> for the user with a dot before the patch.png, YARN-10652.001.patch
>
>
> AD usernames can contain a "." (dot), e.g. they can be of the format 
> {{firstname.lastname}}. However, if you specify a username of this format 
> in the Capacity Scheduler setting 
> {{yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight}},
>  the weight is not applied and the user is instead assigned the default 
> weight of 1.0f. This renders the user weight feature (used as a means of 
> setting user priorities for a queue) unusable for such users.
> This limitation comes from [1]: only word characters ([a-zA-Z_0-9], see [2]) 
> are currently permitted, which does not work for AD names that contain a 
> "." (dot).
> A similar discussion took place in a few HADOOP jiras, e.g. HADOOP-7050 and 
> HADOOP-15395, and the outcome was to allow non-whitespace characters, i.e. 
> to use {{\S+}} instead of {{\w+}}.
> We could go down a similar path and unblock this feature for AD usernames 
> with a "." (dot) in them.
> [1] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1953
> [2] 
> https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html
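
The difference between {{\w+}} and {{\S+}} described above can be sketched as follows. This is a simplified illustration only: the prefix constant, the two patterns and the {{extractUser}} helper are hypothetical and do not reproduce the actual code in CapacitySchedulerConfiguration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserWeightPatternDemo {
    // Assumed key prefix for the root.default queue, taken from the setting above.
    public static final String PREFIX =
        "yarn.scheduler.capacity.root.default.user-settings.";

    // Hypothetical patterns: the username is captured in group 1.
    public static final Pattern WORD_ONLY = Pattern.compile("(\\w+)\\.weight");
    public static final Pattern NON_WHITESPACE = Pattern.compile("(\\S+)\\.weight");

    // Returns the username found in the key, or null if the key does not match.
    public static String extractUser(Pattern pattern, String key) {
        if (!key.startsWith(PREFIX)) {
            return null;
        }
        Matcher m = pattern.matcher(key.substring(PREFIX.length()));
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String key = PREFIX + "firstname.lastname.weight";
        // \w+ cannot span the dot, so the whole key fails to match.
        System.out.println(extractUser(WORD_ONLY, key));
        // \S+ accepts the dot and captures the full username.
        System.out.println(extractUser(NON_WHITESPACE, key));
    }
}
```

In this sketch a key that fails to match yields no username at all, which is consistent with the observed behaviour of the user falling back to the default weight.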



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
