[ 
https://issues.apache.org/jira/browse/YARN-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-8310:
--------------------------------
    Attachment: YARN-8310.003.patch
                YARN-8310.branch-2.003.patch

> Handle old NMTokenIdentifier, AMRMTokenIdentifier, and 
> ContainerTokenIdentifier formats
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-8310
>                 URL: https://issues.apache.org/jira/browse/YARN-8310
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>            Priority: Major
>         Attachments: YARN-8310.001.patch, YARN-8310.002.patch, 
> YARN-8310.003.patch, YARN-8310.branch-2.001.patch, 
> YARN-8310.branch-2.002.patch, YARN-8310.branch-2.003.patch
>
>
> In some recent upgrade testing, we saw this error causing the NodeManager to 
> fail to startup afterwards:
> {noformat}
> org.apache.hadoop.service.ServiceStateException: 
> com.google.protobuf.InvalidProtocolBufferException: Protocol message 
> contained an invalid tag (zero).
>       at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>       at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
>       at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:441)
>       at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:834)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:895)
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol 
> message contained an invalid tag (zero).
>       at 
> com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
>       at 
> com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
>       at 
> org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.<init>(YarnSecurityTokenProtos.java:1860)
>       at 
> org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.<init>(YarnSecurityTokenProtos.java:1824)
>       at 
> org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2016)
>       at 
> org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2011)
>       at 
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
>       at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
>       at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
>       at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
>       at 
> org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.parseFrom(YarnSecurityTokenProtos.java:2686)
>       at 
> org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:254)
>       at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:177)
>       at 
> org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:322)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:455)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:373)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316)
>       at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>       ... 5 more
> {noformat}
> The NodeManager fails because it's trying to read a 
> {{ContainerTokenIdentifier}} in the "old" format before we changed them to 
> protobufs (YARN-668).  This is very similar to YARN-5594 where we ran into a 
> similar problem with the ResourceManager and RM Delegation Tokens.
> To provide a better experience, we should make the code able to read the old 
> format if it's unable to read it using the new format.  We didn't run into 
> any errors with the other two types of tokens that YARN-668 incompatibly 
> changed (NMTokenIdentifier and AMRMTokenIdentifier), but we may as well fix 
> those while we're at it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to