[ 
https://issues.apache.org/jira/browse/YARN-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated YARN-10442:
------------------------------------------
    Attachment: YARN-10442.002.patch

> RM should make sure node label file highly available
> ----------------------------------------------------
>
>                 Key: YARN-10442
>                 URL: https://issues.apache.org/jira/browse/YARN-10442
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.1.1
>            Reporter: Surendra Singh Lilhore
>            Assignee: Surendra Singh Lilhore
>            Priority: Major
>         Attachments: YARN-10442.001.patch, YARN-10442.002.patch
>
>
> The RM in one of my clusters failed to transition to Active because blocks of 
> the node label file were missing. I think the RM should make sure important 
> files are highly available. 
> {noformat}
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Could not obtain block: BP-2121803626-10.0.0.22-1597301807397:blk_1073832522_91774 file=/yarn/node-labels/nodelabel.mirror
>       at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:238)
>       at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>       at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>       at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>       at org.apache.hadoop.yarn.proto.YarnServerResourceManagerServiceProtos$AddToClusterNodeLabelsRequestProto.parseDelimitedFrom(YarnServerResourceManagerServiceProtos.java:7493)
>       at org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.loadFromMirror(FileSystemNodeLabelsStore.java:168)
>       at org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.recover(FileSystemNodeLabelsStore.java:205)
>       at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.initNodeLabelStore(CommonNodeLabelsManager.java:254)
>       at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStart(CommonNodeLabelsManager.java:268)
>       at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
