[
https://issues.apache.org/jira/browse/YARN-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15344527#comment-15344527
]
Allen Wittenauer commented on YARN-5274:
----------------------------------------
This is pretty much one of many things a health check script *could* be doing.
One of the key reasons why things like this weren't built into the code base
early on is because it's nearly impossible to figure out what is happening
locally and do the right thing:
* What happens on SSDs?
* What if there are no SMART-enabled devices on the box?
* Do we loop through all devices and try to figure out which configured
directories map to which disks?
* What if we have a volume manager or pooled storage?
etc.
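To illustrate the directory-to-disk question above, here is a minimal sketch, not anything that exists in YARN. It assumes a mount table in the whitespace-separated layout of /proc/mounts; the function name `backing_device`, the sample table, and the directory paths are all made up for illustration.

```python
import posixpath

def backing_device(path, mounts_text):
    """Return (device, mount_point) for the longest mount point containing path.

    mounts_text is assumed to look like /proc/mounts: one mount per line,
    device in the first field and mount point in the second.
    """
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        device, mount_point = fields[0], fields[1]
        if mount_point == "/":
            contains = path.startswith("/")
        else:
            # Require a path-component boundary so /data/yarnx does not
            # match a mount at /data/yarn.
            contains = path == mount_point or path.startswith(mount_point + "/")
        if contains and (best is None or len(mount_point) > len(best[1])):
            best = (device, mount_point)
    return best

# Hypothetical mount table: a plain partition, an LVM volume, a pooled dataset.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw 0 0
/dev/mapper/vg0-yarn /data/yarn xfs rw 0 0
tank/logs /var/log/yarn zfs rw 0 0
"""

for directory in ("/data/yarn/local", "/var/log/yarn", "/tmp/nm-local-dir"):
    print(directory, "->", backing_device(directory, SAMPLE_MOUNTS))
```

Even when the lookup succeeds, two of the three results here name a volume-manager or pool object rather than a physical disk, so there is still no obvious device to hand to smartctl.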
This really feels like overstepping our bounds and increasing code surface area
for not a lot of win and a lot of long term pain. This is especially true for
something like smartctl that requires privilege. That's a ton of baggage to
add.
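For comparison, the health check script route mentioned at the top might look roughly like the sketch below. It relies on two things: the NodeManager treats a line of script output beginning with ERROR as "node unhealthy", and smartctl's exit status is a bitmask in which bit 3 (value 8) means the SMART status check returned "DISK FAILING". The helper names and the device list are assumptions for illustration; which devices to check is left to the operator, who knows the box.

```python
import subprocess

DISK_FAILING = 0x08  # smartctl exit-status bit 3: SMART status is "DISK FAILING"

def health_line(device, exit_status):
    """Turn a smartctl exit status into an ERROR line, or None if not failing.

    Other bits (for example bit 1, device open failed, which is what an
    unprivileged run typically hits) are deliberately ignored here: they
    mean "unknown", not "bad disk".
    """
    if exit_status & DISK_FAILING:
        return "ERROR SMART reports %s as failing" % device
    return None

def check(device):
    """Run `smartctl -H` on one device; return an ERROR line or None."""
    try:
        result = subprocess.run(
            ["smartctl", "-H", device],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return None  # smartctl is not installed on this box
    return health_line(device, result.returncode)

def main(devices):
    """Print one ERROR line per failing device, as a health script would."""
    for device in devices:
        line = check(device)
        if line:
            print(line)

# A real script would end with something like: main(["/dev/sda"])
# where the device list is hypothetical and site-specific.
```

Note that the script still has to run with enough privilege for smartctl to open the device, which is the baggage described above; keeping it in a site-local script at least keeps that decision out of the code base.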
FWIW: I feel like most of the stuff presented in the umbrella JIRA suffers from
the same problems. If one takes a simplistic view of how machines are
configured, fine. But that may not even cover the majority of real-world
installs!
> Use smartctl to determine health of disks
> -----------------------------------------
>
> Key: YARN-5274
> URL: https://issues.apache.org/jira/browse/YARN-5274
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Varun Vasudev
>
> It would be nice to add support for smartctl (on machines where it is
> available) to determine disk health for the YARN local and log dirs (if
> smartctl is applicable). The current disk checking mechanism misses out on
> issues like bad sectors, etc.