[ https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171443#comment-14171443 ]

Jason Lowe commented on YARN-2314:
----------------------------------

So Tez will automatically benefit on large clusters because the default is to not use the cache. However if we've found empirically that Tez needs the proxy cache to perform well then this patch would be a performance hit for Tez by default on clusters where the cache issues weren't a problem.

I wasn't sure which default benefit you were referring to above (running faster because cache is enabled or working on a large cluster because cache is disabled). If Tez shows significant improvements with this cache turned on then I could see an argument to have the cache on by default since small clusters are common and large clusters are rare.

> ContainerManagementProtocolProxy can create thousands of threads for a large cluster
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-2314
>                 URL: https://issues.apache.org/jira/browse/YARN-2314
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-2314.patch, disable-cm-proxy-cache.patch, nmproxycachefix.prototype.patch
>
>
> ContainerManagementProtocolProxy has a cache of NM proxies, and the size of this cache is configurable. However the cache can grow far beyond the configured size when running on a large cluster and blow AM address/container limits. More details in the first comment.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)