[ 
https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16340101#comment-16340101
 ] 

Jim Brennan commented on YARN-7796:
-----------------------------------

We ran into this same issue when running a container-executor that was compiled 
on RHEL 6 on a RHEL 7 system.  While we have verified that the patch in this 
Jira does avoid the segmentation fault, we are concerned that the root cause of 
the problem remains, and may bite us later.

The -fstack-check flag was added to the compiler command line in YARN-6721.

Based on my testing, a container-executor (without the patch from this Jira) 
compiled on RHEL 6 with the -fstack-check flag always hits this segmentation 
fault when run on RHEL 7.  But if you compile without this flag, the 
container-executor runs on RHEL 7 with no problems.  I also verified this with 
a simple standalone program that only performs the copy_file step.
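
For illustration, a test routine of that kind might look roughly like the
sketch below. It is not the actual container-executor.c code or the program
used in the testing above; the function name copy_fd_stack_buffer is
hypothetical, and only the 128K on-stack buffer is taken from the issue
description.

```c
/* Illustrative sketch only: NOT the actual container-executor.c code.
 * It mimics the pattern in the stack trace: a 128K variable-length
 * buffer on the stack, used to copy one file descriptor to another. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper name. Returns 0 on success, -1 on error. */
int copy_fd_stack_buffer(int in_fd, int out_fd) {
  const size_t buffer_size = 128 * 1024;  /* size quoted in the issue */
  char buffer[buffer_size];               /* large stack allocation (VLA) */

  for (;;) {
    ssize_t len = read(in_fd, buffer, buffer_size);
    if (len == 0) {
      return 0;                           /* end of input */
    }
    if (len < 0) {
      if (errno == EINTR) continue;       /* interrupted: retry the read */
      return -1;
    }

    /* write() may be partial, so loop until the whole chunk is out. */
    ssize_t off = 0;
    while (off < len) {
      ssize_t n = write(out_fd, buffer + off, (size_t)(len - off));
      if (n < 0) {
        if (errno == EINTR) continue;     /* interrupted: retry the write */
        return -1;
      }
      off += n;
    }
    assert(off == len);                   /* every byte read was written */
  }
}
```

Calling this from a small main() on two open file descriptors, and building
the result once with -fstack-check and once without, is one way to compare
the two behaviours across build and run hosts.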

[~grepas] - was this the case for you? Were you running a container-executor 
that was compiled on an earlier Red Hat release?

> Container-executor fails with segfault on certain OS configurations
> -------------------------------------------------------------------
>
>                 Key: YARN-7796
>                 URL: https://issues.apache.org/jira/browse/YARN-7796
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>             Fix For: 3.1.0, 3.0.1
>
>         Attachments: YARN-7796.000.patch, YARN-7796.001.patch, 
> YARN-7796.002.patch
>
>
> There is a relatively large (128K) buffer allocated on the stack in 
> container-executor.c for the purpose of copying files. As the gdb stack 
> trace below shows, this allocation can fail with SIGSEGV. This happens 
> only on certain OS configurations - I can reproduce this issue on RHEL 6.9:
> {code:java}
> [Thread debugging using libthread_db enabled]
> main : command provided 0
> main : run as user is ***
> main : requested yarn user is ***
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6 
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", 
> out_filename=0x932930 
> "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_000001.tokens",
>  perm=384)
>     at 
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
> 966     char buffer[buffer_size];
> (gdb) bt
> #0  0x00000000004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6 
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", 
> out_filename=0x932930 
> "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_000001.tokens",
>  perm=384)
>     at 
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
> #1  0x0000000000409a81 in initialize_app (user=<value optimized out>, 
> app_id=0x7ffd669fd2b7 "application_1516711246952_0001", 
> nmPrivate_credentials_file=0x7ffd669fd2d6 
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", 
> local_dirs=0x9331c8, log_roots=<value optimized out>, args=0x7ffd669fb168)
>     at 
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:1122
> #2  0x0000000000403f90 in main (argc=<value optimized out>, argv=<value 
> optimized out>) at 
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c:558
> {code}


