[
https://issues.apache.org/jira/browse/YARN-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821472#comment-15821472
]
zhengchenyu commented on YARN-6091:
-----------------------------------
Here is the server where the application runs normally:
{code}
[right server]# cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
[right server]# cat /proc/version
Linux version 3.10.0-229.20.1.el7.x86_64 ([email protected]) (gcc
version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Tue Nov 3 19:10:07 UTC
2015
[right server]# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-linker-hash-style=gnu
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin
--enable-initfini-array --disable-libgcj
--with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install
--with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC)
[right server]# rpm -qa | grep glibc
glibc-devel-2.17-106.el7_2.8.x86_64
glibc-2.17-106.el7_2.8.x86_64
glibc-headers-2.17-106.el7_2.8.x86_64
glibc-common-2.17-106.el7_2.8.x86_64
{code}
Here is the server where the application runs abnormally:
{code}
[wrong server]# cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
[wrong server]# cat /proc/version
Linux version 3.10.0-229.el7.x86_64 ([email protected]) (gcc
version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #1 SMP Fri Mar 6 11:36:42 UTC
2015
[wrong server]# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-linker-hash-style=gnu
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin
--enable-initfini-array --disable-libgcj
--with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install
--with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC)
[wrong server]# rpm -qa | grep glibc
glibc-common-2.17-106.el7_2.6.x86_64
glibc-devel-2.17-106.el7_2.6.x86_64
glibc-2.17-106.el7_2.6.x86_64
glibc-headers-2.17-106.el7_2.6.x86_64
glibc-2.17-106.el7_2.6.i686
{code}
> the AppMaster register failed when use Docker on LinuxContainer
> ----------------------------------------------------------------
>
> Key: YARN-6091
> URL: https://issues.apache.org/jira/browse/YARN-6091
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager, yarn
> Affects Versions: 2.8.0
> Environment: CentOS
> Reporter: zhengchenyu
> Priority: Critical
> Fix For: 2.8.0
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> On some servers, when I use Docker on LinuxContainer, I found that the
> AppMaster failed to register with the ResourceManager. This did not happen
> on other servers.
> I found that pclose (in container-executor.c) returns different values on
> different servers, even though the process launched by popen is running
> normally. Some servers return 0, and others return 13.
> YARN treats the application as failed when pclose returns nonzero and
> removes the AMRMToken, so the AppMaster registration then fails because the
> ResourceManager has already removed this application's token.
> In container-executor.c, the check is whether the return code is zero. But
> according to the pclose man page, it is a return value of -1 that indicates
> an error. So I changed the check, which solved this problem.