[ 
https://issues.apache.org/jira/browse/YARN-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344906#comment-16344906
 ] 

Szilard Nemeth commented on YARN-7528:
--------------------------------------

Hey [~templedf]!

No problem.
 I tried to reproduce again with the following config: 
 node-resources.xml:
{code:xml}
<configuration>
  <property>
    <name>yarn.nodemanager.resource-type.gpu</name>
    <value>100</value>
  </property>
</configuration>
{code}
resource-types.xml:
{code:xml}
<configuration>
  <property>
    <name>yarn.resource-types</name>
    <value>gpu</value>
  </property>
  <property>
    <name>yarn.resource-types.gpu.units</name>
    <value>m</value>
  </property>
</configuration>
{code}
and the command I ran was:
{code:java}
/Users/szilardnemeth/development/apache/hadoop-maven//hadoop-dist//target/hadoop-3.1.0-SNAPSHOT/bin/hadoop jar \
  /Users/szilardnemeth/development/apache/hadoop-maven//hadoop-dist//target/hadoop-3.1.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0-SNAPSHOT.jar \
  pi -Dmapreduce.framework.name=yarn -Dmapreduce.map.resource.gpu=5k 10 100
{code}
I also tried the same configuration with gpu.units removed from 
resource-types.xml, and the result is the same: the issue is not reproducible.

 

I have a log statement in place in the code for every conversion, so I grepped 
the log for those conversions:

{{grep converting -i /tmp/rm.log | grep "''"}}

Results (just the largest converted numbers, since there are a lot of matching 
lines):
{code:java}
2018-01-30 11:51:01,317 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'm'
2018-01-30 11:51:43,610 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'm'
2018-01-30 11:51:54,681 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:51:55,685 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:51:56,691 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:51:57,698 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:51:58,707 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:51:59,709 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:52:04,471 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:52:07,780 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'm'
2018-01-30 11:52:11,560 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:52:15,688 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
2018-01-30 11:52:17,847 INFO  [SchedulerEventDispatcher:Event Processor] 
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) - 
***Converting 100 from '' to 'k'
{code}
Given these outputs, the only way I can imagine this exception occurring is if 
the value of a custom resource type is somehow Long.MAX_VALUE.
 Can you think of a scenario in which a custom resource type's value is set to 
its default (Long.MAX_VALUE)?
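
To illustrate the suspicion above, here is a minimal, self-contained sketch of why a Long.MAX_VALUE default would overflow when converted from '' to 'm', while the values seen in the log (100) would not. This is not the actual UnitsConversionUtil code: the class name, the exponent table and the use of Math.multiplyExact are assumptions made purely for illustration.

```java
public class UnitOverflowSketch {

    // Hypothetical subset of unit prefixes, expressed as powers of ten.
    // The real YARN unit table is larger; this is an assumption for the sketch.
    static int exponent(String unit) {
        switch (unit) {
            case "m": return -3; // milli
            case "":  return 0;  // no unit
            case "k": return 3;  // kilo
            default:
                throw new IllegalArgumentException("Unknown unit: '" + unit + "'");
        }
    }

    // Converts value from fromUnit to toUnit. Scaling up uses multiplyExact,
    // so an overflow raises ArithmeticException instead of silently wrapping.
    // Scaling down uses integer division and therefore truncates.
    static long convert(String fromUnit, String toUnit, long value) {
        int shift = exponent(fromUnit) - exponent(toUnit);
        long result = value;
        for (int i = 0; i < Math.abs(shift); i++) {
            result = shift > 0 ? Math.multiplyExact(result, 10L) : result / 10L;
        }
        return result;
    }

    public static void main(String[] args) {
        // The case seen in the log: a small value converts without problems.
        System.out.println(convert("", "m", 100L));

        // The suspected case: the default value cannot be scaled up by 1000.
        try {
            convert("", "m", Long.MAX_VALUE);
        } catch (ArithmeticException e) {
            System.out.println("Overflow while converting Long.MAX_VALUE from '' to 'm'");
        }
    }
}
```

In this sketch, converting to a smaller unit multiplies the value, so only values above Long.MAX_VALUE / 1000 can overflow on a '' to 'm' conversion. That is consistent with the conversions of 100 in the log completing normally.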

Thanks!

> Resource types that use units need to be defined at RM level and NM level or 
> when using small units you will overflow max_allocation calculation
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7528
>                 URL: https://issues.apache.org/jira/browse/YARN-7528
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: documentation, resourcemanager
>    Affects Versions: 3.0.0
>            Reporter: Grant Sohn
>            Assignee: Szilard Nemeth
>            Priority: Major
>
> When the unit is not defined in the RM, the LONG_MAX default will overflow in 
> the conversion step.


