[ https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901473#comment-16901473 ]
Jim Brennan commented on YARN-9527:
-----------------------------------

We have been running with this patch on one of our large research clusters for about a month. I scanned for this issue again today and there were no instances of it. That is not definitive, but it is a good sign. We also have not had any new problems reported as a result of this change. I will continue to monitor our clusters for this.

[~ebadger], did you want to see if we can get some other reviewers for this patch?

> Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file
> -------------------------------------------------------------------------
>
> Key: YARN-9527
> URL: https://issues.apache.org/jira/browse/YARN-9527
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.8.5, 3.1.2
> Reporter: Jim Brennan
> Assignee: Jim Brennan
> Priority: Major
> Attachments: YARN-9527.001.patch, YARN-9527.002.patch,
> YARN-9527.003.patch, YARN-9527.004.patch
>
>
> A rogue ContainerLocalizer can get stuck in a loop continuously downloading
> the same file while generating an "Invalid event: LOCALIZED at LOCALIZED"
> exception on each iteration. Sometimes this continues long enough that it
> fills up a disk or depletes available inodes for the filesystem.