Surendra Singh Lilhore created YARN-10442:
---------------------------------------------

             Summary: RM should make sure node label file highly available
                 Key: YARN-10442
                 URL: https://issues.apache.org/jira/browse/YARN-10442
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 3.1.1
            Reporter: Surendra Singh Lilhore
            Assignee: Surendra Singh Lilhore


The RM in one of my clusters failed to transition to Active because blocks of 
the node label file are missing. I think the RM should make sure important 
files like this one are highly available. 
{code:java}
Caused by: com.google.protobuf.InvalidProtocolBufferException: Could not obtain block: BP-2121803626-10.0.0.22-1597301807397:blk_1073832522_91774 file=/yarn/node-labels/nodelabel.mirror
    at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:238)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
    at org.apache.hadoop.yarn.proto.YarnServerResourceManagerServiceProtos$AddToClusterNodeLabelsRequestProto.parseDelimitedFrom(YarnServerResourceManagerServiceProtos.java:7493)
    at org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.loadFromMirror(FileSystemNodeLabelsStore.java:168)
    at org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.recover(FileSystemNodeLabelsStore.java:205)
    at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.initNodeLabelStore(CommonNodeLabelsManager.java:254)
    at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStart(CommonNodeLabelsManager.java:268)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{code}
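
One possible reading of "highly available" here is that the RM raises the replication factor of the files in its node label store directory. The block below is only a minimal sketch of that idea, not the actual fix. ReplicatedFs, InMemoryFs and ensureMinReplication are hypothetical names introduced for illustration, and the minimum of 5 replicas is an arbitrary assumption. Against a real cluster the equivalent call would presumably be Hadoop's FileSystem.setReplication(Path, short). Note that raising replication only protects against future block loss; it cannot bring back blocks that are already missing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: make sure files in a store directory have at least a minimum replication. */
public class NodeLabelReplicationSketch {

    /** Hypothetical stand-in for the small part of a file system API this sketch needs. */
    interface ReplicatedFs {
        /** Returns file path -> current replication for files directly under dir. */
        Map<String, Short> listReplication(String dir);

        /** Sets the replication of one file; returns false if the file does not exist. */
        boolean setReplication(String file, short replication);
    }

    /** In-memory fake so the sketch runs without a cluster. */
    static class InMemoryFs implements ReplicatedFs {
        private final Map<String, Short> files = new LinkedHashMap<>();

        void put(String file, short replication) {
            files.put(file, replication);
        }

        @Override
        public Map<String, Short> listReplication(String dir) {
            // Return a copy so callers can update files while iterating the result.
            Map<String, Short> out = new LinkedHashMap<>();
            for (Map.Entry<String, Short> e : files.entrySet()) {
                if (e.getKey().startsWith(dir + "/")) {
                    out.put(e.getKey(), e.getValue());
                }
            }
            return out;
        }

        @Override
        public boolean setReplication(String file, short replication) {
            if (!files.containsKey(file)) {
                return false;
            }
            files.put(file, replication);
            return true;
        }
    }

    /** Raises every file under dir to at least minReplication; returns how many files changed. */
    static int ensureMinReplication(ReplicatedFs fs, String dir, short minReplication) {
        int changed = 0;
        for (Map.Entry<String, Short> e : fs.listReplication(dir).entrySet()) {
            if (e.getValue() < minReplication && fs.setReplication(e.getKey(), minReplication)) {
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        InMemoryFs fs = new InMemoryFs();
        fs.put("/yarn/node-labels/nodelabel.mirror", (short) 3); // path taken from the trace above
        fs.put("/tmp/unrelated", (short) 1);                     // outside the store directory
        int changed = ensureMinReplication(fs, "/yarn/node-labels", (short) 5);
        System.out.println("files updated: " + changed);
    }
}
```

Calling a check like this before the store is recovered (and again after the mirror file is rewritten) would keep the policy in one place; whether the minimum should be configurable is left open.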


