[
https://issues.apache.org/jira/browse/YARN-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344906#comment-16344906
]
Szilard Nemeth commented on YARN-7528:
--------------------------------------
Hey [~templedf]!
No problem.
I tried to reproduce the issue again with the following config:
node-resources.xml:
{code:xml}
<configuration>
<property>
<name>yarn.nodemanager.resource-type.gpu</name>
<value>100</value>
</property>
</configuration>
{code}
resource-types.xml:
{code:xml}
<configuration>
<property>
<name>yarn.resource-types</name>
<value>gpu</value>
</property>
<property>
<name>yarn.resource-types.gpu.units</name>
<value>m</value>
</property>
</configuration>
{code}
and the command I ran was:
{code:java}
/Users/szilardnemeth/development/apache/hadoop-maven//hadoop-dist//target/hadoop-3.1.0-SNAPSHOT/bin/hadoop \
jar \
/Users/szilardnemeth/development/apache/hadoop-maven//hadoop-dist//target/hadoop-3.1.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0-SNAPSHOT.jar \
pi -Dmapreduce.framework.name=yarn -Dmapreduce.map.resource.gpu=5k 10 100
{code}
I also tried the same configuration with gpu.units removed from
resource-types.xml, and the result is the same: the issue is not reproducible.
I have a log statement in place in the code for every conversion, so I grepped
the log for those conversions:
{{grep converting -i /tmp/rm.log | grep "''"}}
Results (just the largest converted numbers, since there are a lot of matching
lines):
{code:java}
2018-01-30 11:51:01,317 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'm'
2018-01-30 11:51:43,610 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'm'
2018-01-30 11:51:54,681 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:51:55,685 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:51:56,691 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:51:57,698 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:51:58,707 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:51:59,709 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:52:04,471 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:52:07,780 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'm'
2018-01-30 11:52:11,560 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:52:15,688 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
2018-01-30 11:52:17,847 INFO [SchedulerEventDispatcher:Event Processor]
util.UnitsConversionUtil (UnitsConversionUtil.java:convert(134)) -
***Converting 100 from '' to 'k'
{code}
Given these outputs, I can only imagine this exception occurring when the
value of a custom resource type is somehow Long.MAX_VALUE.
Can you think of a scenario in which a custom resource type's value is left at its
default (Long.MAX_VALUE)?
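To illustrate the suspicion, here is a minimal sketch of why a Long.MAX_VALUE default would overflow when converted from '' to 'm', while a value like 100 converts fine. It assumes 'm' means milli, so the conversion multiplies by 1000. The class and the toMilli helper are hypothetical and are not the actual UnitsConversionUtil code.

```java
public class UnitOverflowSketch {
    // Hypothetical helper: converts a unitless value ('') to milli-units ('m').
    // Assumption: the conversion is a plain multiplication by 1000.
    // Math.multiplyExact throws ArithmeticException instead of silently wrapping.
    static long toMilli(long value) {
        return Math.multiplyExact(value, 1000L);
    }

    public static void main(String[] args) {
        // The value seen in the log above converts without any problem.
        System.out.println("100 '' -> " + toMilli(100L) + " 'm'");

        // A value left at Long.MAX_VALUE cannot be represented in milli-units.
        try {
            toMilli(Long.MAX_VALUE);
        } catch (ArithmeticException e) {
            System.out.println("Long.MAX_VALUE '' -> 'm' overflows");
        }
    }
}
```

With plain multiplication instead of multiplyExact, the second conversion would wrap around to a meaningless value rather than fail, which would be harder to spot in the logs.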
Thanks!
> Resource types that use units need to be defined at RM level and NM level or
> when using small units you will overflow max_allocation calculation
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-7528
> URL: https://issues.apache.org/jira/browse/YARN-7528
> Project: Hadoop YARN
> Issue Type: Bug
> Components: documentation, resourcemanager
> Affects Versions: 3.0.0
> Reporter: Grant Sohn
> Assignee: Szilard Nemeth
> Priority: Major
>
> When the unit is not defined in the RM, the LONG_MAX default will overflow in
> the conversion step.