[ https://issues.apache.org/jira/browse/YARN-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036979#comment-15036979 ]
Sidharta Seethana commented on YARN-4309:
-----------------------------------------

Hi [~vvasudev],

I am using the find command that you have in the patch against broken symlinks - it is not clear to me how broken symlink info is captured (please see below). Could you please clarify?

{code}
q (19:50:35) ~/symlink-test$ ls -l
total 0
q (19:50:47) ~/symlink-test$ ln -s world hello
q (19:51:03) ~/symlink-test$ find -L . -maxdepth 5 -type l -ls
2149279432 0 lrwxrwxrwx 1 sseethana sseethana 5 Dec 2 19:51 ./hello -> world
q (19:51:15) ~/symlink-test$ echo $?
0
q (19:51:52) ~/symlink-test$ uname -a
Linux q 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 11:36:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
{code}

> Add debug information to application logs when a container fails
> ----------------------------------------------------------------
>
> Key: YARN-4309
> URL: https://issues.apache.org/jira/browse/YARN-4309
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Varun Vasudev
> Assignee: Varun Vasudev
> Attachments: YARN-4309.001.patch, YARN-4309.002.patch,
> YARN-4309.003.patch, YARN-4309.004.patch, YARN-4309.005.patch
>
>
> Sometimes when a container fails, it can be pretty hard to figure out why it
> failed.
> My proposal is that if a container fails, we collect information about the
> container local dir and dump it into the container log dir. Ideally, I'd like
> to tar up the directory entirely, but I'm not sure of the security and space
> implications of such an approach. At the very least, we can list all the files
> in the container local dir, and dump the contents of launch_container.sh (into
> the container log dir).
> When log aggregation occurs, all this information will automatically get
> collected and make debugging such failures much easier.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
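
For reference on the `find -L . -maxdepth 5 -type l -ls` question in the comment: as far as the GNU find documentation describes it, once `-L` is in effect `-type l` is true only for symbolic links that cannot be resolved, so the `./hello -> world` line in the transcript would itself be the broken-symlink record. A minimal sketch to check this locally, assuming a POSIX shell with `mktemp` and a find that supports `-L` and `-maxdepth`; the names `target` and `good` are made up for illustration and do not come from the patch:

```shell
#!/bin/sh
# Sketch: compare a resolvable symlink with a dangling one under `find -L`.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo data > target
ln -s target good     # hypothetical link that resolves to an existing file
ln -s world hello     # dangling link, as in the comment: "world" does not exist

# With -L, find follows symlinks, so "good" is tested as a regular file.
# Only links that cannot be followed are still tested as type l.
broken=$(find -L . -maxdepth 5 -type l)
echo "$broken"        # prints ./hello

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

If `good` also appeared in the output, the find in use would not be following links the way this sketch assumes.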