[
https://issues.apache.org/jira/browse/YARN-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sandy Ryza updated YARN-499:
----------------------------
Summary: On container failure, surface logs to client (was: On container
failure, include last n lines of logs in diagnostics)
> On container failure, surface logs to client
> --------------------------------------------
>
> Key: YARN-499
> URL: https://issues.apache.org/jira/browse/YARN-499
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.0.3-alpha
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Attachments: YARN-499.patch
>
>
> When a container fails, the only way to diagnose it is to look at the logs.
> ContainerStatuses include a diagnostic string that is reported back to the
> resource manager by the node manager.
> Currently in MR2, I believe whatever is sent to the task's standard out is
> added to the diagnostics string, but MR redirects standard out to a file
> called stdout. In MR1, this string was populated with the last few lines of
> the task's stdout file and was printed to the console, allowing for easy
> debugging.
> Handling this would help to soothe the infuriating problem of an AM dying for
> a mysterious reason before setting a tracking URL (MAPREDUCE-3688).
> This could be done in one of two ways:
> * Use tee to send MR's standard out to both the stdout file and standard out.
> This requires modifying ShellCommandExecutor to keep only a rolling window of
> what it reads, as we wouldn't want to store the entire task log in NM memory.
> * Read the task's log files. This would require standardizing the container
> log file names or making them configurable. Right now the log files are
> determined in userland, and all that YARN is aware of is the log directory.
> Does this present any issues I'm not considering? If so, might this only
> be needed for AMs?
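
The "rolling" read mentioned in the first option could look roughly like the
sketch below. It is only an illustration of the idea, not code from the
attached patch: TailBuffer, readAll, and toDiagnostics are hypothetical names
that do not exist in YARN. The buffer keeps the last maxLines lines of a
stream and discards older ones, so the number of retained lines stays fixed no
matter how long the task log grows.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not part of YARN): retains only the last maxLines lines. */
final class TailBuffer {
  private final int maxLines;
  private final Deque<String> lines = new ArrayDeque<>();

  TailBuffer(int maxLines) {
    if (maxLines <= 0) {
      throw new IllegalArgumentException("maxLines must be positive");
    }
    this.maxLines = maxLines;
  }

  /** Adds a line, evicting the oldest one once the buffer is full. */
  void add(String line) {
    if (lines.size() == maxLines) {
      lines.removeFirst();
    }
    lines.addLast(line);
  }

  /** Drains the reader line by line, keeping at most maxLines lines in memory. */
  static TailBuffer readAll(Reader in, int maxLines) {
    TailBuffer tail = new TailBuffer(maxLines);
    try (BufferedReader reader = new BufferedReader(in)) {
      String line;
      while ((line = reader.readLine()) != null) {
        tail.add(line);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return tail;
  }

  /** Joins the retained lines into one string, e.g. for a diagnostics field. */
  String toDiagnostics() {
    return String.join("\n", lines);
  }
}
```

The same buffer would work for the second option, since a Reader can wrap
either a process's output stream or a log file. Note that it bounds the number
of lines, not bytes; a real implementation would probably also want to cap the
line length.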