[ https://issues.apache.org/jira/browse/YARN-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689755#comment-16689755 ]
Wangda Tan commented on YARN-9030:
----------------------------------
[~suma.shivaprasad], it seems the logic of verifyAndCreateRemoteLogDir is wrong.
Here's my local edit of the method:
{code}
/**
 * Verify and create the remote log directory.
 */
public void verifyAndCreateRemoteLogDir() {
  // Check whether the top-level directory (TLD) exists.
  FileSystem remoteFS;
  try {
    remoteFS = getFileSystem(conf);
  } catch (IOException e) {
    throw new YarnRuntimeException("Unable to get Remote FileSystem instance",
        e);
  }
  boolean remoteExists = true;
  Path remoteRootLogDir = getRemoteRootLogDir();
  try {
    FsPermission perms = remoteFS.getFileStatus(remoteRootLogDir)
        .getPermission();
    if (!perms.equals(TLDIR_PERMISSIONS)) {
      LOG.warn("Remote Root Log Dir [" + remoteRootLogDir
          + "] already exists, but with incorrect permissions. "
          + "Expected: [" + TLDIR_PERMISSIONS + "], Found: [" + perms + "]."
          + " The cluster may have problems with multiple users.");
    }
  } catch (FileNotFoundException e) {
    remoteExists = false;
  } catch (IOException e) {
    throw new YarnRuntimeException(
        "Failed to check permissions for dir [" + remoteRootLogDir + "]", e);
  }

  try {
    Path qualified = remoteRootLogDir.makeQualified(remoteFS.getUri(),
        remoteFS.getWorkingDirectory());
    if (!remoteExists) {
      LOG.warn("Remote Root Log Dir [" + remoteRootLogDir
          + "] does not exist. Attempting to create it.");
    }
    // Called in any case, not only when the directory is missing.
    remoteFS.mkdirs(qualified, new FsPermission(TLDIR_PERMISSIONS));

    // Not possible to query the FileSystem API to check if it supports
    // chmod, chown etc. Hence resorting to catching exceptions here.
    // Remove when the FS API is ready.
    try {
      remoteFS.setPermission(qualified, new FsPermission(TLDIR_PERMISSIONS));
    } catch (UnsupportedOperationException use) {
      LOG.info("Unable to set permissions for configured filesystem since"
          + " it does not support this: " + remoteFS.getScheme());
      fsSupportsChmod = false;
    }

    UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
    String primaryGroupName = null;
    try {
      primaryGroupName = loginUser.getPrimaryGroupName();
    } catch (IOException e) {
      LOG.warn("No primary group found. The remote root log directory"
          + " will be created with the HDFS superuser being its group "
          + "owner. JobHistoryServer may be unable to read the directory.");
    }
    // Set the owner on the remote directory only if the primary group exists.
    if (primaryGroupName != null) {
      try {
        remoteFS.setOwner(qualified, loginUser.getShortUserName(),
            primaryGroupName);
      } catch (UnsupportedOperationException use) {
        LOG.info("File system does not support setting user/group: "
            + remoteFS.getScheme(), use);
      }
    }
  } catch (IOException e) {
    throw new YarnRuntimeException(
        "Failed to create remoteLogDir [" + remoteRootLogDir + "]", e);
  }
}
{code}
Several issues with the previous code:
1)
{code}
Path qualified = remoteRootLogDir.makeQualified(remoteFS.getUri(),
remoteFS.getWorkingDirectory());
{code}
Needs to be called in any case (you're right, but it should be placed inside the try block).
2)
{code}
remoteFS.mkdirs(qualified, new FsPermission(TLDIR_PERMISSIONS));
{code}
Needs to be called in any case (not only when remoteExists == false).
3) Removed duplicated else block at the end.
4) Removed unused parameters.
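
To illustrate points 2 and the chmod fallback outside of Hadoop, here is a rough, stdlib-only analogue using java.nio.file instead of the Hadoop FileSystem API. It is only a sketch: the class and method names (RemoteDirSketch, ensureDir) are made up for illustration, and the permission string is a placeholder (java.nio cannot express the sticky bit, so it is not TLDIR_PERMISSIONS). Files.createDirectories plays the role of mkdirs (a no-op if the directory already exists), and Files.setPosixFilePermissions throws UnsupportedOperationException on filesystems without POSIX permission support.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical illustration only; not part of the YARN patch.
class RemoteDirSketch {

  /**
   * Creates the directory unconditionally, then tries to set permissions.
   * Returns false if the filesystem does not support chmod-style calls.
   */
  static boolean ensureDir(Path dir, String perms) throws IOException {
    // Safe to call whether or not the directory exists already.
    Files.createDirectories(dir);

    Set<PosixFilePermission> wanted = PosixFilePermissions.fromString(perms);
    try {
      Files.setPosixFilePermissions(dir, wanted);
      return true;
    } catch (UnsupportedOperationException e) {
      // Non-POSIX filesystem: keep the directory and skip the chmod.
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("app-logs-sketch");
    Path logDir = root.resolve("logs");

    boolean first = ensureDir(logDir, "rwxrwx---");  // directory is created
    boolean second = ensureDir(logDir, "rwxrwx---"); // directory already exists

    System.out.println("exists=" + Files.isDirectory(logDir)
        + " chmodSupported=" + first + "/" + second);
  }
}
```

The second call succeeds even though the directory is already there, which is why the create call does not need to be guarded by an existence check.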
Does this make sense to you?
> Log aggregation changes to handle filesystems which do not support permissions
> ------------------------------------------------------------------------------
>
> Key: YARN-9030
> URL: https://issues.apache.org/jira/browse/YARN-9030
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Suma Shivaprasad
> Assignee: Suma Shivaprasad
> Priority: Major
> Attachments: YARN-9030.1.patch
>
>
> Some cloud storage systems, such as ADLS, do not support permissions, in which
> case they throw an UnsupportedOperationException. Log aggregation should handle
> these cases and not set permissions for the log aggregation base dir and sub dirs.
> {noformat}
> 2018-11-12 15:37:28,726 WARN logaggregation.LogAggregationService (LogAggregationService.java:initApp(209)) - Application failed to init aggregation
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to check permissions for dir [abfs://[email protected]/app-logs]
>     at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.verifyAndCreateRemoteLogDir(LogAggregationFileController.java:277)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:238)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:204)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:347)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:69)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}