[
https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182678#comment-16182678
]
Jason Lowe commented on YARN-7244:
----------------------------------
Thanks for the patch!
The core issue here is that the NM is handing out directories to tasks that the
shuffle manager is unaware of. This filtering-or-not approach doesn't
completely solve the issue, since the ShuffleHandler will still attempt to
visit disks that the NM has already determined are bad. That could cause
performance problems if the ShuffleHandler tries to read a particularly
problematic disk over and over as it searches for outputs to shuffle for every
shuffle request.
It would be better if the NM could convey to aux services what directories
are in use. Then the ShuffleHandler and NM would be in sync with respect to
what disks should or should not be used.
bq. Another way to handle this would have been to change the AuxiliaryServices
to pass the NMContext or the LocalDirAllocator from the NM.
That would be nice, as there are probably other things in the NMContext that
aux services may want to know about. However, we could always go with a much
more direct route. We could add an API to AuxiliaryService that sets a
callback object that can be used to retrieve the current list of paths that
are good for reading or writing, or we could add an API to AuxiliaryService
that the NM can call to update that service on the list of paths good for
reading and writing (i.e., either a 'pull' or 'push' model for exposing the
current good directories to aux services).
The 'pull' model requires an interface or abstract class in yarn-api that
defines the API aux services can call to retrieve the directories, and we would
put the actual implementation of that interface in yarn-server-nodemanager.
Ideally the interface would look a lot like the existing getLocalDirsForRead(),
getLocalDirsForWrite(), etc. of the LocalDirsHandlerService so it's an easy
pass-through to implement on the nodemanager side.
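A minimal sketch of what the 'pull' model might look like, assuming nothing
about the real Hadoop classes. LocalDirsProvider, SketchNodeManagerDirs and
PullingShuffleService are hypothetical names invented for illustration; only
the two method names are borrowed from the comment above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface that would live in yarn-api (name is an assumption).
interface LocalDirsProvider {
    List<String> getLocalDirsForRead();
    List<String> getLocalDirsForWrite();
}

// Hypothetical stand-in for the nodemanager-side implementation. In the real
// NM this would presumably pass through to LocalDirsHandlerService; here it
// just holds a list that a disk health check could replace.
class SketchNodeManagerDirs implements LocalDirsProvider {
    private final List<String> goodDirs = new ArrayList<>();

    synchronized void setGoodDirs(List<String> dirs) {
        goodDirs.clear();
        goodDirs.addAll(dirs);
    }

    @Override
    public synchronized List<String> getLocalDirsForRead() {
        // Return a snapshot so callers never see a list mid-update.
        return Collections.unmodifiableList(new ArrayList<>(goodDirs));
    }

    @Override
    public synchronized List<String> getLocalDirsForWrite() {
        return Collections.unmodifiableList(new ArrayList<>(goodDirs));
    }
}

// Hypothetical aux service that asks for the current dirs on every request
// instead of remembering the list it saw at startup.
class PullingShuffleService {
    private final LocalDirsProvider provider;

    PullingShuffleService(LocalDirsProvider provider) {
        this.provider = provider;
    }

    List<String> dirsToSearch() {
        return provider.getLocalDirsForRead();
    }
}
```

With this shape the aux service keeps no directory state of its own, so a disk
added or removed on the NM side is visible on the next call.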
The 'push' model requires adding a listener interface to
LocalDirsHandlerService so we know when a disk is added or removed and can
call back into each aux service to update them on the current list of dirs for
reading and writing.
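A minimal sketch of what the 'push' model might look like, again assuming
nothing about the real Hadoop classes. LocalDirsListener, SketchDirsHandler
and PushedShuffleService are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener interface (name and signature are assumptions).
interface LocalDirsListener {
    void onLocalDirsChanged(List<String> readDirs, List<String> writeDirs);
}

// Hypothetical stand-in for the dirs handler on the NM side. It notifies
// every registered listener whenever the set of usable disks changes.
class SketchDirsHandler {
    private final List<LocalDirsListener> listeners =
        new CopyOnWriteArrayList<>();

    void addListener(LocalDirsListener listener) {
        listeners.add(listener);
    }

    void disksChanged(List<String> readDirs, List<String> writeDirs) {
        for (LocalDirsListener listener : listeners) {
            // Hand each listener its own immutable copy of the lists.
            listener.onLocalDirsChanged(
                Collections.unmodifiableList(new ArrayList<>(readDirs)),
                Collections.unmodifiableList(new ArrayList<>(writeDirs)));
        }
    }
}

// Hypothetical aux service that stores the most recent list pushed to it.
class PushedShuffleService implements LocalDirsListener {
    private volatile List<String> readDirs = Collections.emptyList();

    @Override
    public void onLocalDirsChanged(List<String> newReadDirs,
                                   List<String> newWriteDirs) {
        // A shuffle service only reads map outputs, so it keeps readDirs.
        this.readDirs = newReadDirs;
    }

    List<String> dirsToSearch() {
        return readDirs;
    }
}
```

Here the aux service holds its own copy of the list, so lookups need no call
into the NM, at the cost of relying on every disk change being delivered.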
I haven't had a lot of time to figure out which would be better in practice
in terms of ease of use and performance, but I think I'd rather see the aux
services be more in sync with the rest of the NM with respect to the local
dirs being actively used.
> ShuffleHandler is not aware of disks that are added
> ---------------------------------------------------
>
> Key: YARN-7244
> URL: https://issues.apache.org/jira/browse/YARN-7244
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: YARN-7244.001.patch, YARN-7244.002.patch
>
>
> The ShuffleHandler permanently remembers the list of "good" disks on NM
> startup. If disks later are added to the node then map tasks will start using
> them but the ShuffleHandler will not be aware of them. The end result is that
> the data cannot be shuffled from the node leading to fetch failures and
> re-runs of the map tasks.