[
https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559577#comment-14559577
]
Xuan Gong commented on YARN-221:
--------------------------------
bq. All the known policies will be part of YARN, including
SampleRateContainerLogAggregationPolicy. So we still need to configure the sample
rate for that policy. If we don't put it in YarnConfiguration, where can we put it?
It seems we already have a bunch of configuration properties in
YarnConfiguration that are specific to the plugin implementation, such as container
executor properties.
I thought about this. How about adding a new protocol field, String
ContainerLogAggregationPolicyParameter, alongside ContainerLogAggregationPolicy
in logAggregationContext? In ContainerLogAggregationPolicyParameter, users can
define parameters in any format that their ContainerLogAggregationPolicy
understands. For example, we could set ContainerLogAggregationPolicyParameter
to "SR:0.2", and SampleRateContainerLogAggregationPolicy would contain the
logic to parse it.
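Also, we could change the interface to:
{code}
public interface ContainerLogAggregationPolicy {
  // Decides whether the logs of the given container should be aggregated.
  public boolean shouldDoLogAggregation(ContainerId containerId, int exitCode);
  // Parses the policy-specific parameter string, e.g. "SR:0.2".
  public void parseParameters(String parameters);
}
{code}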
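To make the proposal concrete, here is a minimal sketch of how a sample-rate policy might implement the two methods and parse the "SR:0.2" format. Several things here are assumptions for illustration, not code from any YARN-221 patch: the class name SampleRatePolicySketch, the use of a plain String in place of ContainerId (so the sketch compiles without Hadoop), the default rate of 1.0, the rule that failed containers are always aggregated, and the strictness of the parameter validation.

```java
import java.util.concurrent.ThreadLocalRandom;

// Simplified stand-in for the proposed interface: containerId is a plain
// String here so the sketch compiles without Hadoop on the classpath.
interface ContainerLogAggregationPolicy {
    boolean shouldDoLogAggregation(String containerId, int exitCode);
    void parseParameters(String parameters);
}

// Hypothetical sample-rate policy; not the code from any YARN-221 patch.
class SampleRatePolicySketch implements ContainerLogAggregationPolicy {
    private static final String PREFIX = "SR:";
    private double sampleRate = 1.0; // assumed default: aggregate everything

    @Override
    public void parseParameters(String parameters) {
        if (parameters == null || !parameters.startsWith(PREFIX)) {
            throw new IllegalArgumentException(
                "Expected \"SR:<rate>\", got: " + parameters);
        }
        // Double.parseDouble throws NumberFormatException (a subclass of
        // IllegalArgumentException) if the rate is not a valid number.
        double rate = Double.parseDouble(
            parameters.substring(PREFIX.length()).trim());
        if (Double.isNaN(rate) || rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException(
                "Sample rate must be in [0, 1], got: " + rate);
        }
        sampleRate = rate;
    }

    double getSampleRate() {
        return sampleRate;
    }

    @Override
    public boolean shouldDoLogAggregation(String containerId, int exitCode) {
        // Assumption: logs of failed containers are always kept.
        if (exitCode != 0) {
            return true;
        }
        // Successful containers are sampled at the configured rate.
        return ThreadLocalRandom.current().nextDouble() < sampleRate;
    }

    public static void main(String[] args) {
        SampleRatePolicySketch policy = new SampleRatePolicySketch();
        policy.parseParameters("SR:0.2");
        System.out.println("sample rate = " + policy.getSampleRate());
        System.out.println("failed container aggregated: "
            + policy.shouldDoLogAggregation("container_01", 1));
    }
}
```

Keeping the parameter as an opaque String means the protocol does not need to change when a new policy is added; the cost is that each policy has to validate its own input, as parseParameters does above.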
bq. How MR overrides the default policy. Maybe we can have YarnRunner at the MR
level honor the YARN property "yarn.container-log-aggregation-policy.class" at the
per-job level when it creates the ApplicationSubmissionContext with the proper
LogAggregationContext. That way we don't have to create extra log
aggregation properties specific to the MR layer.
Good question. Another possible solution could be to parse them from the
command line when users launch their MR application with ToolRunner.run.
> NM should provide a way for AM to tell it not to aggregate logs.
> ----------------------------------------------------------------
>
> Key: YARN-221
> URL: https://issues.apache.org/jira/browse/YARN-221
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: log-aggregation, nodemanager
> Reporter: Robert Joseph Evans
> Assignee: Ming Ma
> Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch,
> YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch
>
>
> The NodeManager should provide a way for an AM to tell it that either the
> logs should not be aggregated, that they should be aggregated with a high
> priority, or that they should be aggregated but with a lower priority. The
> AM should be able to do this in the ContainerLaunch context to provide a
> default value, but should also be able to update the value when the container
> is released.
> This would allow for the NM to not aggregate logs in some cases, and avoid
> connection to the NN at all.