[ 
https://issues.apache.org/jira/browse/YARN-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204208#comment-16204208
 ] 

Allen Wittenauer commented on YARN-7326:
----------------------------------------

I haven't looked too hard at the current state (so I apologize if I've missed 
something), but to me there are three showstopper issues:

1) Obviously the RegistryDNS 100% cpu issue.  [I'm truly surprised that no one 
else had noticed its awful performance characteristics.]

2) Banish the separate API server, now that YARN-6626 has been committed.  It's 
confusing and greatly increases the operating costs (and worse, potential 
security exposure) for little-to-no real benefit vs just using the REST API 
from the RM.  So just remove it from the docs and the yarn command.

3) Integrate the yarn service commands into yarn application as mentioned by 
Eric Yang.

Things I'd really like to see, but wouldn't block the merge for:

1) Actually integrate the docs with the rest of yarn-site.  I'm not sure what 
benefit there is in having a separate documentation section, especially given 
showstopper #2 above and that the RegistryDNS server could be used 
independently of the REST API.

2) A more complex example that doesn't use Docker.  This is important given 
that the docker bits in YARN have some significant security problems.  A lot of 
sites probably can't or won't enable the Docker subsystem for quite a while as 
a result.

3) Slider migration guide.

> Some issues in RegistryDNS
> --------------------------
>
>                 Key: YARN-7326
>                 URL: https://issues.apache.org/jira/browse/YARN-7326
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Jian He
>
> [~aw] helped to identify these issues: 
> Now some general bad news, not related to this patch:
> Ran a few queries, but this one is a bit concerning:
> {code}
> root@ubuntu:/hadoop/logs# dig @localhost -p 54 .
> ;; Warning: query response not set
> ; <<>> DiG 9.10.3-P4-Ubuntu <<>> @localhost -p 54 .
> ; (2 servers found)
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOTAUTH, id: 47794
> ;; flags: rd ad; QUERY: 0, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
> ;; WARNING: recursion requested but not available
> ;; Query time: 0 msec
> ;; SERVER: 127.0.0.1#54(127.0.0.1)
> ;; WHEN: Thu Oct 12 16:04:54 PDT 2017
> ;; MSG SIZE  rcvd: 12
> root@ubuntu:/hadoop/logs# dig @localhost -p 54 axfr .
> ;; Connection to ::1#54(::1) for . failed: connection refused.
> ;; communications error to 127.0.0.1#54: end of file
> root@ubuntu:/hadoop/logs# 
> {code}
> It looks like it effectively fails when asked about a root zone, which is bad.
> It's also kind of interesting in what it does and doesn't log. Probably 
> should be configured to rotate logs based on size not date.
> The real showstopper though: RegistryDNS basically eats a core. It is running 
> with 100% cpu utilization with and without jsvc. On my laptop, this is 
> triggering my fan.
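
For readers who want to see why dig warns "query response not set" on the 
first query above, here is a minimal Python sketch that decodes a 12-byte DNS 
header.  The sample bytes are not a captured packet: they are rebuilt from the 
values dig printed (id 47794, flags rd and ad, status NOTAUTH, all counts 0, 
MSG SIZE 12), assuming dig listed every flag that was set.  The function name 
parse_dns_header is made up for illustration.

```python
import struct

# Subset of DNS response codes; 9 is NOTAUTH (RFC 2136).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN",
          4: "NOTIMP", 5: "REFUSED", 9: "NOTAUTH"}


def parse_dns_header(data):
    """Decode the fixed 12-byte DNS header into a dict (hypothetical helper)."""
    if len(data) < 12:
        raise ValueError("a DNS header needs at least 12 bytes")
    # Six unsigned 16-bit fields in network byte order.
    msg_id, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    rcode = flags & 0x000F
    return {
        "id": msg_id,
        "qr": bool(flags & 0x8000),   # 1 = response, 0 = query
        "opcode": (flags >> 11) & 0xF,
        "rd": bool(flags & 0x0100),   # recursion desired
        "ra": bool(flags & 0x0080),   # recursion available
        "ad": bool(flags & 0x0020),   # authentic data
        "rcode": RCODES.get(rcode, str(rcode)),
        "counts": (qd, an, ns, ar),   # QUERY, ANSWER, AUTHORITY, ADDITIONAL
    }


# Header rebuilt from the dig output (an assumption, not a capture):
# rd (0x0100) + ad (0x0020) + rcode 9, with the QR bit left clear.
reply = struct.pack("!6H", 47794, 0x0129, 0, 0, 0, 0)
header = parse_dns_header(reply)
print(header)
```

Under those assumptions the QR bit decodes as unset, meaning the reply still 
looks like a query, and the question count is 0, so the server did not echo 
the question section back either.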



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
