[ 
https://issues.apache.org/jira/browse/YARN-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236807#comment-16236807
 ] 

Jonathan Hung commented on YARN-7427:
-------------------------------------

So the issue appears only when a queue's label has not been used yet (e.g. on 
startup). This method:
{noformat}
  public PartitionResourcesInfo getPartitionResourceUsageInfo(
      String partitionName) {
    for (PartitionResourcesInfo partitionResourceUsageInfo :
        resourceUsagesByPartition) {
      if (partitionResourceUsageInfo.getPartitionName().equals(partitionName)) {
        return partitionResourceUsageInfo;
      }
    }
    return new PartitionResourcesInfo();
  }
{noformat}
returns an empty PartitionResourcesInfo if partitionName is not found in 
resourceUsagesByPartition. resourceUsagesByPartition is populated by the 
ResourcesInfo constructor (via resourceUsage.getNodePartitionsSet()), but an 
AbstractCSQueue's queueUsage map only contains a label if this queue has 
allocated/reserved/set AM limit/etc. for this label.
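
The fall-through described above can be sketched in isolation. This is a minimal sketch, not the YARN code: PartitionInfoSketch and ResourcesInfoSketch are hypothetical stand-ins, and only the "label not found, return an empty object" behaviour is modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for PartitionResourcesInfo.
class PartitionInfoSketch {
  private final String partitionName;

  // No-arg constructor: every field is left unset (null) in this sketch.
  PartitionInfoSketch() {
    this(null);
  }

  PartitionInfoSketch(String partitionName) {
    this.partitionName = partitionName;
  }

  String getPartitionName() {
    return partitionName;
  }
}

// Hypothetical, simplified stand-in for ResourcesInfo.
class ResourcesInfoSketch {
  private final List<PartitionInfoSketch> usagesByPartition = new ArrayList<>();

  void add(PartitionInfoSketch info) {
    usagesByPartition.add(info);
  }

  PartitionInfoSketch getPartitionResourceUsageInfo(String partitionName) {
    for (PartitionInfoSketch info : usagesByPartition) {
      if (info.getPartitionName().equals(partitionName)) {
        return info;
      }
    }
    // The label was never recorded for this queue: the caller gets an
    // empty object whose fields were never populated.
    return new PartitionInfoSketch();
  }
}
```

A lookup for a label the queue has never used therefore succeeds, but hands back an object that later code cannot safely render.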

I couldn't repro this issue with one queue. It seems that when 
AbstractCSQueue#getNodeLabelsForQueue is called in 
LeafQueue#activateApplications (which happens when node registration causes 
LeafQueue#updateClusterResource to be called), it grabs the labels from the 
AbstractCSQueue's queueCapacities object. With a single queue, "x" should be 
present there, since that queue should be configured with capacity 100 for 
"x". Since "x" is in the queueCapacities object, activateApplications calls 
setAMLimit on this queue's queueUsage for "x". But if two queues are 
configured and one of them has no capacity for this label, the label is not 
in that queue's queueCapacities object, and thus not in its queueUsage map 
either.

The exception is thrown here in CapacitySchedulerPage:
{noformat}
  __("Used Resources:", resourceUsages.getUsed().toString()).
{noformat}
ResourceInfo.resources is null when the object is created by the empty 
ResourceInfo constructor, so I think the easiest solution is to just change 
the ResourceInfo constructor.
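
To illustrate the failure and one possible shape of such a constructor change, here is a minimal sketch. It assumes the no-argument constructor leaves the resources field null and that toString() dereferences it. All class names are hypothetical stand-ins, and the attached patch may do something different.

```java
// Hypothetical, simplified stand-in for the YARN Resource record.
class ResourceSketch {
  final long memory;
  final int vCores;

  ResourceSketch(long memory, int vCores) {
    this.memory = memory;
    this.vCores = vCores;
  }

  @Override
  public String toString() {
    return "<memory:" + memory + ", vCores:" + vCores + ">";
  }
}

// Mirrors the failure mode: the no-arg constructor leaves `resources` null.
class BrokenResourceInfoSketch {
  private ResourceSketch resources;

  BrokenResourceInfoSketch() {
  }

  @Override
  public String toString() {
    // Throws NullPointerException when resources was never set.
    return resources.toString();
  }
}

// One possible fix (an assumption, not necessarily what the patch does):
// default to a zero-valued resource instead of null.
class FixedResourceInfoSketch {
  private final ResourceSketch resources;

  FixedResourceInfoSketch() {
    this(new ResourceSketch(0, 0));
  }

  FixedResourceInfoSketch(ResourceSketch resources) {
    this.resources = resources;
  }

  @Override
  public String toString() {
    return resources.toString();
  }
}
```

With a non-null default, the rendering code can call toString() on a never-used label and show zero usage rather than failing the whole page.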

Attached the 001 patch for this.


> NullPointerException in ResourceInfo when queue has not used label
> ------------------------------------------------------------------
>
>                 Key: YARN-7427
>                 URL: https://issues.apache.org/jira/browse/YARN-7427
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jonathan Hung
>            Priority: Major
>         Attachments: YARN-7427.001.patch
>
>
> {noformat}Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.ResourceInfo.toString(ResourceInfo.java:65)
>         at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.renderQueueCapacityInfo(CapacitySchedulerPage.java:164)
>         at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.renderLeafQueueInfoWithPartition(CapacitySchedulerPage.java:107)
>         at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.render(CapacitySchedulerPage.java:96)
>         at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
>         at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
>         at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
>         at org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43)
>         at org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117)
>         at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$LI.__(Hamlet.java:7709)
>         at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$QueueBlock.render(CapacitySchedulerPage.java:301)
>         at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
>         at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
>         at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
>         at org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43)
>         at org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117)
>         at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$LI.__(Hamlet.java:7709)
>         at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$QueuesBlock.render(CapacitySchedulerPage.java:470)
>         at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
>         at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
>         at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
>         at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
>         at org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117)
>         at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848)
>         at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
>         at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
>         at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
>         at org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.scheduler(RmController.java:86)
>         ... 56 more{noformat}
> For example, configure: {noformat}  <property>
>     <name>yarn.scheduler.capacity.root.queues</name>
>     <value>default,a</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.accessible-node-labels</name>
>     <value>x</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.default.accessible-node-labels</name>
>     <value>x</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.default.accessible-node-labels.x.maximum-capacity</name>
>     <value>100</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.default.accessible-node-labels.x.capacity</name>
>     <value>100</value>
>   </property>{noformat}
> , then the above exception is thrown when refreshing the scheduler UI 
> (/cluster/scheduler)


