[ 
https://issues.apache.org/jira/browse/YARN-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347175#comment-16347175
 ] 

Jim Brennan commented on YARN-7857:
-----------------------------------

Thanks [[email protected]] for the detailed analysis!

I am not suggesting that we revert the fix from YARN-7796 - it clearly resolves 
the failure we were seeing and works in all cases.

The proposal in this Jira is to remove the {{-fstack-check}} flag because it 
has been shown to cause binary incompatibility issues, depending on the version 
of gcc a binary is compiled with and the OS it is run on.
{quote}As a conclusion, the stack check code seems to be legitimate. However, 
the code might address the same memory later ending up with the same crash 
without stack checking.
{quote}
I'm not sure I follow this. It sounds like you've shown that the size of the 
buffer matters: a 110 KB buffer compiled on RHEL 6 with {{-fstack-check}} works 
on RHEL 7, while a 128 KB buffer fails. But the 128 KB buffer works when 
compiled on RHEL 7 (with or without {{-fstack-check}}). I agree that the RHEL 6 
version of the stack checking code may be tripping some kernel protection when 
the buffer is big enough, while the RHEL 7 version does not. That seems like an 
incompatibility.

My concern is that if we leave the {{-fstack-check}} flag there, some future 
change may cause a similar problem to the one we fixed in YARN-7796.
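
For anyone who wants to repeat the buffer-size comparison, here is a minimal sketch of the kind of function involved. It is not the actual container-executor code: the {{BUF_SIZE}} macro and the file-descriptor signature are assumptions made for illustration, and the only point is that the copy buffer lives on the stack, which is the part {{-fstack-check}} instruments.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical knob: size of the on-stack copy buffer in bytes.
 * Override at build time, e.g. -DBUF_SIZE=112640 for 110 KB. */
#ifndef BUF_SIZE
#define BUF_SIZE (128 * 1024)
#endif

/* Copy everything readable from in_fd to out_fd through a large
 * stack-allocated buffer. Returns 0 on success, -1 on I/O error. */
static int copy_file(int in_fd, int out_fd) {
  char buffer[BUF_SIZE]; /* large stack frame; probed when built with -fstack-check */
  ssize_t n;

  while ((n = read(in_fd, buffer, sizeof(buffer))) > 0) {
    ssize_t off = 0;
    /* write() may be partial, so loop until the chunk is flushed */
    while (off < n) {
      ssize_t w = write(out_fd, buffer + off, (size_t)(n - off));
      if (w < 0) {
        return -1;
      }
      off += w;
    }
  }
  return n < 0 ? -1 : 0;
}
```

Calling this from a small driver, building it on RHEL 6 once with and once without {{-fstack-check}} for each buffer size, and then running the resulting binaries on RHEL 7 should be enough to show whether the flag and the buffer size interact the way the analysis suggests.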

> -fstack-check compilation flag causes binary incompatibility for 
> container-executor between RHEL 6 and RHEL 7
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7857
>                 URL: https://issues.apache.org/jira/browse/YARN-7857
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>
> The segmentation fault in container-executor reported in [YARN-7796] appears 
> to be due to a binary compatibility issue with the {{-fstack-check}} flag 
> that was added in [YARN-6721].
> Based on my testing, a container-executor (without the patch from 
> [YARN-7796]) compiled on RHEL 6 with the {{-fstack-check}} flag always hits this 
> segmentation fault when run on RHEL 7. But if you compile without this flag, 
> the container-executor runs on RHEL 7 with no problems. I also verified this 
> with a simple program that just does the copy_file.
> I think we need to either remove this flag, or find a suitable alternative.


