[
https://issues.apache.org/jira/browse/YARN-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920489#comment-13920489
]
Sunil G commented on YARN-1781:
-------------------------------
1. Disk usage can come down again when some tasks finish and the service deletes
some data from the local directories.
A disk-full condition is therefore likely to be temporary, caused by
many tasks accessing the local directories at the same time.
So there could be periodic re-checks to ensure that such directories are added
back to the list of good directories.
Otherwise these directories may be lost for good.
2. As I have commented in
https://issues.apache.org/jira/browse/YARN-257?focusedCommentId=13919295&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13919295
it would be better for the LocalDirAllocator to check for a high percentage of
disk usage and not assign such a directory to a task.
These measures might help keep new tasks from failing because
of an immediate disk-full condition.
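
To make the two points above concrete, here is a rough sketch in Java. It is
only an illustration under my own assumptions, not the attached patch and not
the real NodeManager or LocalDirAllocator code: the class LocalDirTracker, its
methods, and the injected usage probe are all hypothetical names. checkDirs()
covers point 1 (re-checking full directories so they can return to the good
list) and pickDirForTask() covers point 2 (skipping a directory whose usage is
too high at assignment time).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch; names are illustrative and are not the NodeManager's
// or LocalDirAllocator's real API.
class LocalDirTracker {
    private final Set<String> goodDirs = new LinkedHashSet<>();
    private final Set<String> fullDirs = new LinkedHashSet<>();
    private final double maxUsedPercent;
    // Returns the used percentage (0-100) of a directory's partition.
    // Injected so the sketch can be exercised without real disks.
    private final ToDoubleFunction<String> usageProbe;

    LocalDirTracker(List<String> dirs, double maxUsedPercent,
                    ToDoubleFunction<String> usageProbe) {
        this.goodDirs.addAll(dirs);
        this.maxUsedPercent = maxUsedPercent;
        this.usageProbe = usageProbe;
    }

    // Point 1: re-check every directory, good or full, so that a directory
    // whose usage has dropped is moved back to the good list.
    synchronized void checkDirs() {
        List<String> all = new ArrayList<>(goodDirs);
        all.addAll(fullDirs);
        for (String dir : all) {
            if (usageProbe.applyAsDouble(dir) > maxUsedPercent) {
                goodDirs.remove(dir);
                fullDirs.add(dir);
            } else {
                fullDirs.remove(dir);
                goodDirs.add(dir);
            }
        }
    }

    // Point 2: at assignment time, skip any good directory whose usage is
    // currently above the limit instead of handing it to a task.
    synchronized Optional<String> pickDirForTask() {
        for (String dir : goodDirs) {
            if (usageProbe.applyAsDouble(dir) <= maxUsedPercent) {
                return Optional.of(dir);
            }
        }
        return Optional.empty();
    }

    synchronized List<String> getGoodDirs() {
        return new ArrayList<>(goodDirs);
    }

    synchronized List<String> getFullDirs() {
        return new ArrayList<>(fullDirs);
    }

    // One possible probe, based on java.io.File. A partition whose size
    // cannot be determined is treated as full.
    static double usedPercent(String dir) {
        File f = new File(dir);
        long total = f.getTotalSpace();
        if (total <= 0) {
            return 100.0;
        }
        return 100.0 * (total - f.getUsableSpace()) / total;
    }
}
```

A caller could pass LocalDirTracker::usedPercent as the probe and invoke
checkDirs() from whatever periodic health check already exists. The 90 percent
style limit used when trying this out is just an example value; the right
threshold would be up to the user.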
> NM should allow users to specify max disk utilization for local disks
> ---------------------------------------------------------------------
>
> Key: YARN-1781
> URL: https://issues.apache.org/jira/browse/YARN-1781
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Varun Vasudev
> Assignee: Varun Vasudev
> Attachments: apache-yarn-1781.0.patch
>
>
> This is related to YARN-257 (it's probably a sub-task?). Currently, the NM
> does not detect full disks and allows full disks to be used by containers,
> leading to repeated failures. YARN-257 deals with graceful handling of full
> disks. This ticket is only about detection of full disks by the disk health
> checkers.
> The NM should allow users to set a maximum disk utilization for local disks
> and mark disks as bad once they exceed that utilization. At the very least,
> the NM should detect full disks.
--
This message was sent by Atlassian JIRA
(v6.2#6252)