[ 
https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343452#comment-16343452
 ] 

Jim Brennan commented on YARN-7796:
-----------------------------------

[~miklos.szeg...@cloudera.com] - I believe the max stack size reported by 
ulimit is in KB, so 10 MB for RHEL 6 and 8 MB for RHEL 7. The 128 KB stack 
allocation in container-executor should be well within those limits.
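
To make the comparison above concrete, here is a minimal sketch that reads the 
soft stack limit with getrlimit(RLIMIT_STACK), which reports bytes rather than 
the KB that ulimit prints, and compares it with a 128 KB buffer. The helper 
names and the COPY_BUFFER_BYTES macro are hypothetical and are not taken from 
container-executor.c. The check also ignores stack space already in use, so it 
is only a rough sanity test.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Assumed size of the copy buffer discussed in this issue (128 KB). */
#define COPY_BUFFER_BYTES (128 * 1024)

/*
 * Hypothetical helper: returns the soft stack limit in bytes,
 * 0 if the limit is unlimited, or -1 if getrlimit() fails.
 */
long long soft_stack_limit_bytes(void) {
  struct rlimit rl;
  if (getrlimit(RLIMIT_STACK, &rl) != 0) {
    return -1;
  }
  if (rl.rlim_cur == RLIM_INFINITY) {
    return 0;
  }
  return (long long) rl.rlim_cur;
}

/*
 * Hypothetical helper: returns 1 if a buffer of `bytes` is smaller than
 * the limit (a limit of 0 means unlimited), otherwise 0. A negative
 * limit, as returned on error above, always yields 0.
 */
int fits_in_stack_limit(long long limit_bytes, long long bytes) {
  return limit_bytes == 0 || bytes < limit_bytes;
}

/* Prints the limit in KB, the unit that `ulimit -s` uses. */
void print_stack_limit_kb(void) {
  long long limit = soft_stack_limit_bytes();
  if (limit < 0) {
    perror("getrlimit");
  } else if (limit == 0) {
    printf("stack limit: unlimited\n");
  } else {
    printf("stack limit: %lld KB, 128 KB buffer fits: %d\n",
           limit / 1024, fits_in_stack_limit(limit, COPY_BUFFER_BYTES));
  }
}
```

Calling print_stack_limit_kb() on each host would show whether the limit seen 
by the process matches what ulimit reports in the shell.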

It does appear that gcc generates dynamic stack-checking code when the binary 
is compiled on RHEL 6, and that code seems to work when run on RHEL 6. But the 
same binary does not seem to work on RHEL 7. This suggests to me that the code 
generated on RHEL 6 is in some way incompatible with the RHEL 7 kernel.
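
Whatever the cause turns out to be, the copy itself does not need a stack 
buffer. Below is a minimal sketch of a copy loop that takes its 128 KB buffer 
from the heap instead. It is an illustration only and is not taken from the 
attached patches; the function name copy_fd_with_heap_buffer and the 
COPY_BUFFER_BYTES macro are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed buffer size, matching the 128 KB mentioned in this issue. */
#define COPY_BUFFER_BYTES (128 * 1024)

/*
 * Hypothetical helper: copies everything readable from in_fd to out_fd
 * using a heap-allocated buffer. Returns 0 on success, -1 on failure.
 * The caller keeps ownership of both file descriptors.
 */
int copy_fd_with_heap_buffer(int in_fd, int out_fd) {
  char *buffer = malloc(COPY_BUFFER_BYTES);
  if (buffer == NULL) {
    return -1;
  }
  int result = 0;
  for (;;) {
    ssize_t n = read(in_fd, buffer, COPY_BUFFER_BYTES);
    if (n == 0) {
      break;                    /* end of input */
    }
    if (n < 0) {
      if (errno == EINTR) {
        continue;               /* interrupted, retry the read */
      }
      result = -1;
      break;
    }
    /* write() may be partial, so loop until the whole chunk is out. */
    ssize_t off = 0;
    while (off < n) {
      ssize_t w = write(out_fd, buffer + off, (size_t) (n - off));
      if (w < 0) {
        if (errno == EINTR) {
          continue;             /* interrupted, retry the write */
        }
        result = -1;
        break;
      }
      off += w;
    }
    if (result != 0) {
      break;
    }
  }
  free(buffer);
  return result;
}
```

With this shape, a failed allocation shows up as an ordinary error return 
rather than as a SIGSEGV at the point where the buffer is declared.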

 

> Container-executor fails with segfault on certain OS configurations
> -------------------------------------------------------------------
>
>                 Key: YARN-7796
>                 URL: https://issues.apache.org/jira/browse/YARN-7796
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>             Fix For: 3.1.0, 3.0.1
>
>         Attachments: YARN-7796.000.patch, YARN-7796.001.patch, 
> YARN-7796.002.patch
>
>
> There is a relatively large (128 KB) buffer allocated on the stack in 
> container-executor.c for the purpose of copying files. As the gdb stack 
> trace below shows, this allocation can fail with SIGSEGV. This happens 
> only on certain OS configurations - I can reproduce this issue on RHEL 6.9:
> {code:java}
> [Thread debugging using libthread_db enabled]
> main : command provided 0
> main : run as user is ***
> main : requested yarn user is ***
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6 
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", 
> out_filename=0x932930 
> "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_000001.tokens",
>  perm=384)
>     at 
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
> 966     char buffer[buffer_size];
> (gdb) bt
> #0  0x00000000004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6 
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", 
> out_filename=0x932930 
> "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_000001.tokens",
>  perm=384)
>     at 
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
> #1  0x0000000000409a81 in initialize_app (user=<value optimized out>, 
> app_id=0x7ffd669fd2b7 "application_1516711246952_0001", 
> nmPrivate_credentials_file=0x7ffd669fd2d6 
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", 
> local_dirs=0x9331c8, log_roots=<value optimized out>, args=0x7ffd669fb168)
>     at 
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:1122
> #2  0x0000000000403f90 in main (argc=<value optimized out>, argv=<value 
> optimized out>) at 
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c:558
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
