[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184319#comment-16184319 ]
Jason Lowe commented on YARN-7244:
----------------------------------

bq. rather a new api as you mentioned in LocalDirAllocator named getLocalDirsForRead could enough to get valid dirs as it pulls all configured NM_LOCAL_DIRS and validates same.

LocalDirAllocator should not have the new API, IMHO. That class is in hadoop-common and shouldn't be involved in solving this nodemanager-specific problem.

I'm thinking we go with the pull approach with something like the following. Note that I'm not stuck on the specific names of the new interfaces/classes; they're just examples for reference.

# In AuxiliaryService, add new methods to get and set the API object used to interact with the NM's local dirs management, e.g.:
{code}
public AuxiliaryLocalPathHandler getAuxiliaryLocalPathHandler();
public void setAuxiliaryLocalPathHandler(AuxiliaryLocalPathHandler handler);
{code}
# The new AuxiliaryLocalPathHandler object would be in hadoop-yarn-api and look something like this:
{code}
public interface AuxiliaryLocalPathHandler {
  Path getLocalPathForRead(String path);
  Path getLocalPathForWrite(String path);
  Path getLocalPathForWrite(String path, long size);
}
{code}
# AuxiliaryService would implement a LocalDirsHandler that maps the AuxiliaryLocalPathHandler calls to the NM's LocalDirsHandlerService.
# The ShuffleHandler can leverage the new AuxiliaryLocalPathHandler to find shuffle input files rather than manage its own LocalDirAllocator.

> ShuffleHandler is not aware of disks that are added
> ---------------------------------------------------
>
>                 Key: YARN-7244
>                 URL: https://issues.apache.org/jira/browse/YARN-7244
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: YARN-7244.001.patch, YARN-7244.002.patch
>
>
> The ShuffleHandler permanently remembers the list of "good" disks on NM
> startup. If disks later are added to the node then map tasks will start using
> them but the ShuffleHandler will not be aware of them.
> The end result is that the data cannot be shuffled from the node, leading to
> fetch failures and re-runs of the map tasks.
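
For reference, the pull approach proposed in the comment could be sketched roughly as follows. This is only an illustration under several assumptions, not the actual patch: java.nio.file.Path stands in for Hadoop's Path so the example is self-contained, the parameter names and the IOException declarations are guesses, and every class prefixed with Sketch is hypothetical (SketchLocalDirs plays the role of the NM's LocalDirsHandlerService).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Interface shape taken from the comment. Assumptions: java.nio.file.Path
// replaces Hadoop's Path, and the parameter names and IOException are guesses.
interface AuxiliaryLocalPathHandler {
  Path getLocalPathForRead(String path) throws IOException;
  Path getLocalPathForWrite(String path) throws IOException;
  Path getLocalPathForWrite(String path, long size) throws IOException;
}

// Hypothetical stand-in for the NM's LocalDirsHandlerService:
// it owns the live list of good local dirs, which can grow at runtime.
final class SketchLocalDirs {
  private final List<Path> goodDirs = new CopyOnWriteArrayList<>();
  void addGoodDir(Path dir) { goodDirs.add(dir); }
  List<Path> getGoodDirs() { return goodDirs; }
}

// Hypothetical adapter: every call consults the current dir list,
// so nothing is remembered from startup.
final class SketchPathHandler implements AuxiliaryLocalPathHandler {
  private final SketchLocalDirs dirs;
  SketchPathHandler(SketchLocalDirs dirs) { this.dirs = dirs; }

  @Override
  public Path getLocalPathForRead(String path) throws IOException {
    for (Path dir : dirs.getGoodDirs()) {
      Path candidate = dir.resolve(path);
      if (Files.exists(candidate)) {
        return candidate;
      }
    }
    throw new IOException("Could not find " + path + " in any good local dir");
  }

  @Override
  public Path getLocalPathForWrite(String path) throws IOException {
    // In this sketch, 0 means "no size requirement".
    return getLocalPathForWrite(path, 0L);
  }

  @Override
  public Path getLocalPathForWrite(String path, long size) throws IOException {
    for (Path dir : dirs.getGoodDirs()) {
      if (Files.getFileStore(dir).getUsableSpace() >= size) {
        return dir.resolve(path);
      }
    }
    throw new IOException("No good local dir has " + size + " bytes free for " + path);
  }
}

// Hypothetical base class carrying the proposed getter/setter pair.
abstract class SketchAuxiliaryService {
  private AuxiliaryLocalPathHandler handler;
  public AuxiliaryLocalPathHandler getAuxiliaryLocalPathHandler() { return handler; }
  public void setAuxiliaryLocalPathHandler(AuxiliaryLocalPathHandler handler) {
    this.handler = handler;
  }
}

// Hypothetical shuffle service: it asks the handler on each lookup
// instead of managing its own allocator.
final class SketchShuffleHandler extends SketchAuxiliaryService {
  Path findMapOutput(String relativePath) throws IOException {
    return getAuxiliaryLocalPathHandler().getLocalPathForRead(relativePath);
  }
}
```

Because the handler reads the live dir list on every call, a file written to a dir that was added after the service was wired up becomes visible without restarting anything, which is the behavior the issue description says is currently missing.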