[ 
https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182720#comment-16182720
 ] 

Sunil G commented on YARN-7244:
-------------------------------

Thanks [~jlowe] for adding more clarity on this.

A 'pull' model may be better and could work for all such cases. As Jason 
suggested, if apps could get the latest dirs from 
{{getLocalDirsForRead/Write}}, the shuffle handler would always have a list of 
valid dirs. The only potential issue I see is that once a set of dirs is pulled 
from {{LocalDirAllocator#ctx.localDirs}}, those dirs are validated again only 
when another getLocalPathForWrite/Read is invoked. So there could be a window 
in which we get stale dirs. If the new API 
{{LocalDirAllocator#getLocalDirsForRead}} could call {{confChanged}}, then I 
think it would be a source of truth for localDirs at a given point in time.
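
To make the idea concrete, here is a rough, self-contained sketch of a read 
API that re-checks the configuration before returning its dir list. It does 
not use the real Hadoop classes: {{DirSnapshotAllocator}}, 
{{refreshIfConfChanged}}, the {{Supplier}} standing in for the configured 
local dirs and the {{Predicate}} standing in for the disk check are all 
hypothetical names chosen for illustration, and the actual 
{{LocalDirAllocator}} internals may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical stand-in for an allocator; NOT Hadoop's LocalDirAllocator. */
class DirSnapshotAllocator {
  // Assumption: returns the current comma-separated dir list, like a conf value.
  private final Supplier<String> configuredDirs;
  // Assumption: stands in for whatever disk validation the real code performs.
  private final Predicate<String> isUsable;

  private String lastSeenConf;                         // conf value at last refresh
  private List<String> localDirs = Collections.emptyList();

  DirSnapshotAllocator(Supplier<String> configuredDirs, Predicate<String> isUsable) {
    this.configuredDirs = configuredDirs;
    this.isUsable = isUsable;
  }

  /** Rebuilds the validated dir list only if the configured value has changed. */
  private void refreshIfConfChanged() {
    String current = configuredDirs.get();
    if (current.equals(lastSeenConf)) {
      return;                                          // nothing new, keep the snapshot
    }
    List<String> valid = new ArrayList<>();
    for (String raw : current.split(",")) {
      String dir = raw.trim();
      if (!dir.isEmpty() && isUsable.test(dir)) {
        valid.add(dir);
      }
    }
    localDirs = Collections.unmodifiableList(valid);
    lastSeenConf = current;
  }

  /** Pull API: every call refreshes first, so callers never see an old list. */
  synchronized List<String> getLocalDirsForRead() {
    refreshIfConfChanged();
    return localDirs;
  }
}
```

With this shape, a caller such as the shuffle handler would ask for the dirs 
on each lookup instead of caching them at startup, so a disk added after 
startup shows up on the next call once the configured value changes.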

bq.Do you think, we can improve this to skip as default behavior itself
Currently in this patch, you are trying to avoid the disk validation check when 
shouldFilter is false. To add more context: maybe we could skip this check 
here, provided the ShuffleHandler end has a valid list of dirs based on the 
earlier API.

> ShuffleHandler is not aware of disks that are added
> ---------------------------------------------------
>
>                 Key: YARN-7244
>                 URL: https://issues.apache.org/jira/browse/YARN-7244
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: YARN-7244.001.patch, YARN-7244.002.patch
>
>
> The ShuffleHandler permanently remembers the list of "good" disks on NM 
> startup. If disks are later added to the node, map tasks will start using 
> them, but the ShuffleHandler will not be aware of them. The end result is 
> that the data cannot be shuffled from the node, leading to fetch failures 
> and re-runs of the map tasks.


