[
https://issues.apache.org/jira/browse/YARN-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Suma Shivaprasad updated YARN-3254:
-----------------------------------
Attachment: YARN-3254-006.patch
Thanks [~sunilg]. Attached patch with review comments addressed.
unused import java.util.NoSuchElementException
--> Removed
Javadoc could also add param info, since we have new APIs such as
DirectoryCollection#getDirectoryErrorInfo etc.
It's better to use errorInformation instead of getValue.
directoryErrorInfo.put(entry.getKey(), entry.getValue());
--> Fixed
Could the message below be kept at class level in LocalDirsHandlerService?
final String diskCapacityExceededErrorMsg
--> Fixed
One doubt in DirectoryCollection.checkDirs()
As per latest patch, directoryErrorInfo is the super set which contains dirs
from errorDirs, fullDirs and nothing more.
Now, during the buildDiskErrorReport call, directoryCollection.isDiskUnHealthy
is invoked for erroredLocal/LogDirsList and diskFullLocal/LogDirsList. I think
this is not needed, as the contents of errorDirs and fullDirs are in that map
for sure. In that case, could we skip this check and remove that API? Please
correct me if I missed something here.
--> Thanks for catching this. There was one case where errorDirs was being
updated but directoryErrorInfo was not, i.e. in createNonExistentDirs, which
caused the lookup to return null. I have fixed that as part of the updated
patch. I am not sure whether any other such cases exist, so I added a check
to be safe when retrieving the error information.
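
The bookkeeping concern above can be sketched roughly as follows. This is only
an illustration under stated assumptions, not the code in the patch:
DirErrorTracker, markError, markFull, describe and report are hypothetical
names, and only errorDirs, fullDirs and directoryErrorInfo are taken from the
discussion. The idea is that every path that adds a dir to errorDirs or
fullDirs also records an entry in the map, while the lookup still tolerates a
missing entry in case some update path was overlooked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the bookkeeping discussed above.
// It is NOT the actual DirectoryCollection implementation.
final class DirErrorTracker {
    private final List<String> errorDirs = new ArrayList<>();
    private final List<String> fullDirs = new ArrayList<>();
    // Intended to be a superset of errorDirs and fullDirs.
    private final Map<String, String> directoryErrorInfo = new HashMap<>();

    // Record a dir that failed a check, keeping list and map in sync.
    void markError(String dir, String message) {
        errorDirs.add(dir);
        directoryErrorInfo.put(dir, message);
    }

    // Record a dir that exceeded its capacity limit, keeping list and map in sync.
    void markFull(String dir, String message) {
        fullDirs.add(dir);
        directoryErrorInfo.put(dir, message);
    }

    // Defensive lookup: fall back to the bare dir name if no entry exists.
    String describe(String dir) {
        String info = directoryErrorInfo.get(dir);
        return info != null ? dir + " : " + info : dir;
    }

    // Build one report line covering errored dirs first, then full dirs.
    String report() {
        List<String> parts = new ArrayList<>();
        for (String dir : errorDirs) {
            parts.add(describe(dir));
        }
        for (String dir : fullDirs) {
            parts.add(describe(dir));
        }
        return String.join("; ", parts);
    }
}
```

With this shape, a dir that somehow reaches a list without a map entry still
shows up in the report by name instead of causing a null to be dereferenced.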
> HealthReport should include disk full information
> -------------------------------------------------
>
> Key: YARN-3254
> URL: https://issues.apache.org/jira/browse/YARN-3254
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Affects Versions: 2.6.0
> Reporter: Akira Ajisaka
> Assignee: Suma Shivaprasad
> Fix For: 3.0.0-beta1
>
> Attachments: Screen Shot 2015-02-24 at 17.57.39.png, Screen Shot
> 2015-02-25 at 14.38.10.png, YARN-3254-001.patch, YARN-3254-002.patch,
> YARN-3254-003.patch, YARN-3254-004.patch, YARN-3254-005.patch,
> YARN-3254-006.patch
>
>
> When a NodeManager's local disk gets almost full, the NodeManager sends a
> health report to ResourceManager that "local/log dir is bad" and the message
> is displayed on ResourceManager Web UI. It's difficult for users to detect
> why the dir is bad.