[ https://issues.apache.org/jira/browse/YARN-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345328#comment-15345328 ]
Allen Wittenauer commented on YARN-5274:
----------------------------------------
bq. The node health script is meant for the health of the node. It can't mark a
single disk as bad.
Yes, I'm very familiar with both the health check (esp given I'm the one who
pushed for it to get added to begin with...) and smartctl.
bq. The health test to determine if a disk is bad should be valid whether the
disk is an HDD or SSD. We shouldn't use smartctl if it doesn't apply to the
storage in question, and should fall back on the existing checks.
If I configure a file system to use /hadoop/1/tmp and /hadoop/1's mount device
is hadoop1/1, now what? Is it going to be smart enough to look to see what
devices the hadoop1 pool has in it?
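
To make the question concrete, here is a minimal sketch (not anything from the NodeManager) of resolving a configured directory to its mount source. It assumes a Linux-style mount table in the /proc/mounts format; the function names, the sample table, and the "zfs" filesystem type are hypothetical and chosen only to mirror the /hadoop/1 and hadoop1/1 names above.

```python
import posixpath


def find_mount_source(path, mounts_text):
    """Return (source, mount_point, fstype) for the mount containing path.

    mounts_text is assumed to use the /proc/mounts layout:
    "<source> <mount_point> <fstype> <options> <dump> <pass>".
    The longest matching mount point wins. Returns None if nothing matches.
    """
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        source, mount_point, fstype = fields[0], fields[1], fields[2]
        if mount_point == "/":
            contains = True
        else:
            contains = path == mount_point or path.startswith(mount_point + "/")
        if contains and (best is None or len(mount_point) > len(best[1])):
            best = (source, mount_point, fstype)
    return best


def is_plain_block_device(source):
    """Rough check: only sources under /dev/ look like a device node."""
    return source.startswith("/dev/")


# Hypothetical mount table; the zfs entry is an assumption for illustration.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw 0 0
hadoop1/1 /hadoop/1 zfs rw 0 0
/dev/sdb1 /hadoop/2 ext4 rw 0 0
"""

if __name__ == "__main__":
    source, mount_point, fstype = find_mount_source("/hadoop/1/tmp", SAMPLE_MOUNTS)
    print(source, mount_point, fstype, is_plain_block_device(source))
```

In this sketch the lookup for /hadoop/1/tmp ends at the name hadoop1/1, which is not a device node. Anything that wants per-disk health would still have to ask the pool which physical devices back it, and that step is filesystem-specific.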
bq. Where explicit monitoring does not exist, the NM can take some pro-active
steps to detect bad disks.
But that's my point: explicit monitoring DOES exist, just not inside Hadoop.
There are whole industries based around hardware monitoring that users should
be deploying. Trying to do it all is part of why YARN is descending into
chaos. There are times when it is appropriate to walk away and say "this isn't
our core competency, let someone else do it." This is one of them.
Besides: why is this a YARN-specific problem? Shouldn't this be in HADOOP so
that both HDFS and YARN can take advantage of any code written?
> Use smartctl to determine health of disks
> -----------------------------------------
>
> Key: YARN-5274
> URL: https://issues.apache.org/jira/browse/YARN-5274
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Varun Vasudev
>
> It would be nice to add support for smartctl (on machines where it is
> available) to determine disk health for the YARN local and log dirs (if
> smartctl is applicable). The current disk checking mechanism misses out on
> issues like bad sectors, etc.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)