[ https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444483#comment-16444483 ]
Eric Yang commented on YARN-8108:
---------------------------------

There are 3 registrations of RMAuthenticationFilter, one each for the cluster, logs, and static contexts. There is 1 registration of SpnegoFilter, covering logs and static. When the embedded proxyserver is enabled, the /ws, /proxy, and /app path specs are blanketed by the default SpnegoFilter (because the proxyserver filter is initialized after the cluster context). I tried reducing RMAuthenticationFilter to a single registration and discovered that /proxy and /cluster still conflict. I tried disabling SpnegoFilter, and then /proxy became insecure.

In YARN-1553, webproxy was converted to use HttpServer2.Builder. This change picked up the webproxy initSpnego filter. In YARN-1482, the code was modified to allow WebAppProxy to run in the RM. HADOOP-10075 and HADOOP-10703 were written to apply handlers > context > filter > servlet globally. For existing code that has no handler applied, defineFilter applies to the context. This all works fine if there is only one context enclosing all Hadoop logic. However, if multiple webapps are put on the same server, older code that predates HADOOP-10703 may have initialized separate contexts with separate AuthenticationFilters, which results in the "Request is a replay" error.

A couple of problems with how Hadoop code misuses web applications:
- Servlet code is set up as contexts. The contexts should be RM, webproxy, and timelineserver. Today they are cluster, logs, and static.
- YARN servlet logic is written as Filters.
- Handler logic is written as Filters.
- Filters are applied to a wildcard path on the assumption that they apply globally, which may not be true.

This problem appears to have existed since the creation of HttpServer2 and has become more complex to manage with the Jetty 9 upgrade. A few JIRAs, such as YARN-2397, are attempts to band-aid the general code reuse and the specialized AuthenticationFilter. Each piece evolved independently, and there were no conflicts until everything was put into the RM.

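As a rough illustration of the replay error described above, here is a toy model. It is not Hadoop, Jetty, or JGSS code: every class name in it (ReplayCache, ToyAuthFilter, ReplayDemo) is hypothetical, and it assumes that both filters end up checking the same per-process replay cache. It only shows that a token validated once passes, while the same token validated by two stacked authentication filters fails on the second check.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these classes exist in Hadoop, Jetty, or JGSS.
class ReplayCache {
    private final Set<String> seen = new HashSet<>();

    // True the first time an authenticator is presented, false on any repeat.
    boolean acceptOnce(String authenticator) {
        return seen.add(authenticator);
    }
}

// Stands in for a SPNEGO-style filter that validates the Negotiate token.
class ToyAuthFilter {
    private final String name;
    private final ReplayCache cache;

    ToyAuthFilter(String name, ReplayCache cache) {
        this.name = name;
        this.cache = cache;
    }

    void authenticate(String authenticator) {
        if (!cache.acceptOnce(authenticator)) {
            throw new IllegalStateException(name + ": Request is a replay (34)");
        }
    }
}

class ReplayDemo {
    // Passes one request's token through every filter in the chain, in order.
    static String handle(List<ToyAuthFilter> chain, String authenticator) {
        try {
            for (ToyAuthFilter filter : chain) {
                filter.authenticate(authenticator);
            }
            return "200 OK";
        } catch (IllegalStateException e) {
            return "403 " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        ReplayCache cache = new ReplayCache(); // assumed shared per process
        ToyAuthFilter rm = new ToyAuthFilter("RMAuthenticationFilter", cache);
        ToyAuthFilter spnego = new ToyAuthFilter("SpnegoFilter", cache);

        // One filter on the path: the token is validated once and accepted.
        System.out.println(handle(Arrays.asList(rm), "token-1"));
        // Two filters on the same path: the second validation is rejected.
        System.out.println(handle(Arrays.asList(rm, spnego), "token-2"));
    }
}
```

In this sketch the order of the two filters does not matter: whichever one runs second sees a token that has already been consumed.
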
One possible solution is to rewrite RMAuthenticationFilter and AuthenticationFilter to become an AuthenticationHandler that is applied globally. This would be the same as turning RMAuthenticationFilter into a global Filter. I am not 100% sure whether RMAuthenticationFilter should oversee all Kerberos login activity when the proxyserver is in embedded mode, but I am more inclined to make it so after analyzing the code. I am posting here to see if anyone more familiar with the YARN code base can shed some light on the best approach to address this issue.

> RM metrics rest API throws GSSException in kerberized environment
> -----------------------------------------------------------------
>
> Key: YARN-8108
> URL: https://issues.apache.org/jira/browse/YARN-8108
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Kshitij Badani
> Priority: Major
> Attachments: YARN-8108.001.patch
>
>
> Test is trying to pull up metrics data from SHS after kiniting as 'test_user'
> It is throwing GSSException as follows
> {code:java}
> b2b460b80713|RUNNING: curl --silent -k -X GET -D /hwqe/hadoopqe/artifacts/tmp-94845 --negotiate -u : http://rm_host:8088/proxy/application_1518674952153_0070/metrics/json
> 2018-02-15 07:15:48,757|INFO|MainThread|machine.py:194 - run()||GUID=fc5a3266-28f8-4eed-bae2-b2b460b80713|Exit Code: 0
> 2018-02-15 07:15:48,758|INFO|MainThread|spark.py:1757 - getMetricsJsonData()|metrics:
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
> <title>Error 403 GSSException: Failure unspecified at GSS-API level (Mechanism level: Request is a replay (34))</title>
> </head>
> <body><h2>HTTP ERROR 403</h2>
> <p>Problem accessing /proxy/application_1518674952153_0070/metrics/json.
> Reason:
> <pre> GSSException: Failure unspecified at GSS-API level (Mechanism level: Request is a replay (34))</pre></p>
> </body>
> </html>
> {code}
> Rootcausing: proxyserver on RM can't be supported for Kerberos enabled cluster because AuthenticationFilter is applied twice in Hadoop code (once in httpServer2 for RM, and another instance from AmFilterInitializer for proxy server). This will require code changes to hadoop-yarn-server-web-proxy project

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org