[ 
https://issues.apache.org/jira/browse/YARN-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated YARN-4467:
----------------------------------
    Description: 
Shell.checkIsBashSupported() creates a bash shell command to verify if the 
system supports bash. However, its error message is misleading, and the logic 
should be updated.

An IOException thrown by the shell command does not necessarily mean that bash 
failed to run. If the shell command process is interrupted, Shell.runCommand 
throws an InterruptedIOException, which is a subclass of IOException.
{code:title=Shell.checkIsBashSupported|borderStyle=solid}
    ShellCommandExecutor shexec;
    boolean supported = true;
    try {
      String[] args = {"bash", "-c", "echo 1000"};
      shexec = new ShellCommandExecutor(args);
      shexec.execute();
    } catch (IOException ioe) {
      LOG.warn("Bash is not supported by the OS", ioe);
      supported = false;
    }
{code}
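
The effect of the broad catch clause can be seen in isolation. The following is 
a minimal sketch, not Hadoop code: the class BroadCatchDemo and its classify 
method are hypothetical names, and the exceptions are constructed by hand 
rather than produced by a real shell command.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class BroadCatchDemo {
  /**
   * Hypothetical helper: rethrows the given failure inside the same kind of
   * catch clause used in the snippet above and returns the resulting message.
   */
  static String classify(IOException failure) {
    try {
      throw failure;
    } catch (IOException ioe) {
      // InterruptedIOException extends IOException, so it lands here too.
      return "Bash is not supported by the OS";
    }
  }

  public static void main(String[] args) {
    // A genuine launch failure and an interrupt yield the same message.
    System.out.println(classify(new IOException("Cannot run program")));
    System.out.println(classify(new InterruptedIOException("interrupted")));
  }
}
```

Both calls print the same warning text, which is why the log cannot tell a 
missing bash apart from an interrupted thread.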
An example of this misleading warning appeared in a recent Jenkins job:
https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/

The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a 
thread, waits for 1 second, and then interrupts the thread, expecting it to 
terminate. However, the method Shell.checkIsBashSupported swallowed the 
interrupt, so the test failed.
{noformat}
2015-12-16 21:31:53,797 WARN  util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
java.io.InterruptedIOException: java.lang.InterruptedException
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
        at org.apache.hadoop.util.Shell.run(Shell.java:838)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
        at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
        at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
        at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
        at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
        at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
        at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
Caused by: java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
        ... 15 more
{noformat}
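
What "swallowing" an interrupt means for the caller can be sketched without 
Hadoop. In the hypothetical class below, Thread.sleep stands in for waiting on 
the bash process; the names SwallowedInterruptDemo and interruptVisibleAfter 
are illustrative only. The worker is interrupted while blocked, catches the 
exception, and then checks whether its interrupt flag is still set.

```java
public class SwallowedInterruptDemo {
  /**
   * Interrupts a blocked worker thread and reports whether the worker still
   * sees its interrupt flag after handling the exception.
   */
  static boolean interruptVisibleAfter(boolean restoreFlag)
      throws InterruptedException {
    final boolean[] seen = new boolean[1];
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(60_000); // stands in for waiting on the child process
      } catch (InterruptedException e) {
        // The flag was cleared when the exception was thrown.
        if (restoreFlag) {
          Thread.currentThread().interrupt();
        }
      }
      // Code further up the call stack can only react to what it sees here.
      seen[0] = Thread.currentThread().isInterrupted();
    });
    worker.start();
    worker.interrupt();
    worker.join();
    return seen[0];
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(interruptVisibleAfter(false)); // prints "false"
    System.out.println(interruptVisibleAfter(true));  // prints "true"
  }
}
```

When the handler neither rethrows nor restores the flag, later code in the 
same thread has no way to learn that an interrupt was requested.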

The original design is undesirable because it swallows a potential interrupt, 
which caused TestRPCWaitForProxy.testInterruptedWaitForProxy to fail. 
Unfortunately, the method is invoked while a static member variable is 
initialized (see Shell.<clinit> in the stack trace), and Java does not allow a 
static initializer to throw a checked exception. We should remove the static 
member variable so that the method can throw the interrupt exception. The node 
manager should call the static method instead of using the static member 
variable.
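
One possible shape for such a method is sketched below. This is an 
illustration under stated assumptions, not the attached patch: the class 
BashProbe, the Command interface (a stand-in for 
ShellCommandExecutor.execute()), and the method name isBashSupported are all 
hypothetical.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class BashProbe {
  /** Hypothetical stand-in for ShellCommandExecutor.execute(). */
  interface Command {
    void execute() throws IOException;
  }

  /**
   * Returns whether the probe command ran. An interrupt is propagated to the
   * caller instead of being reported as "bash not supported".
   */
  static boolean isBashSupported(Command probe) throws InterruptedIOException {
    try {
      probe.execute();
      return true;
    } catch (InterruptedIOException iioe) {
      // Says nothing about bash itself, so let the caller handle it.
      throw iioe;
    } catch (IOException ioe) {
      // A genuine failure to launch or run the command.
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(isBashSupported(() -> { }));
    System.out.println(isBashSupported(() -> {
      throw new IOException("no bash");
    }));
    try {
      isBashSupported(() -> {
        throw new InterruptedIOException("interrupted");
      });
    } catch (InterruptedIOException iioe) {
      System.out.println("interrupt propagated");
    }
  }
}
```

Because the method declares a checked exception, it can be called from 
ordinary code paths but not used to initialize a static field, which is the 
reason for calling it on demand.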

  was:
Shell.checkIsBashSupported() creates a bash shell command to verify if the 
system supports bash. However, its error message is misleading, and the logic 
should be updated.

An IOException thrown by the shell command does not necessarily mean that bash 
failed to run. If the shell command process is interrupted, Shell.runCommand 
throws an InterruptedIOException, which is a subclass of IOException.
{code:title=Shell.checkIsBashSupported|borderStyle=solid}
    ShellCommandExecutor shexec;
    boolean supported = true;
    try {
      String[] args = {"bash", "-c", "echo 1000"};
      shexec = new ShellCommandExecutor(args);
      shexec.execute();
    } catch (IOException ioe) {
      LOG.warn("Bash is not supported by the OS", ioe);
      supported = false;
    }
{code}
An example of this misleading warning appeared in a recent Jenkins job:
https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/

The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a 
thread, waits for 1 second, and then interrupts the thread, expecting it to 
terminate. However, the method Shell.checkIsBashSupported swallowed the 
interrupt, so the test failed.
{noformat}
2015-12-16 21:31:53,797 WARN  util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
java.io.InterruptedIOException: java.lang.InterruptedException
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
        at org.apache.hadoop.util.Shell.run(Shell.java:838)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
        at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
        at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
        at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
        at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
        at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
        at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
Caused by: java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
        ... 15 more
{noformat}

The original design is undesirable because it swallows a potential interrupt, 
which caused TestRPCWaitForProxy.testInterruptedWaitForProxy to fail. 
Unfortunately, the method is invoked while a static member variable is 
initialized (see Shell.<clinit> in the stack trace), and Java does not allow a 
static initializer to throw a checked exception. We should remove the static 
member variable so that the method can throw the interrupt exception.


> Shell.checkIsBashSupported swallowed an interrupted exception
> -------------------------------------------------------------
>
>                 Key: YARN-4467
>                 URL: https://issues.apache.org/jira/browse/YARN-4467
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Wei-Chiu Chuang
>              Labels: shell, supportability
>         Attachments: HADOOP-12652.001.patch, YARN-4467.001.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)