[ 
https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559577#comment-14559577
 ] 

Xuan Gong commented on YARN-221:
--------------------------------

bq. All the known policies will be part of YARN, including 
SampleRateContainerLogAggregationPolicy. So we still need to configure the 
sample rate for that policy. If we don't put it in YarnConfiguration, where can 
we put it? It seems we already have a bunch of configuration properties in 
YarnConfiguration that are specific to a plugin implementation, such as the 
container executor properties.

I thought about this. How about adding a new protocol field, String 
ContainerLogAggregationPolicyParameter, alongside ContainerLogAggregationPolicy 
in logAggregationContext? In ContainerLogAggregationPolicyParameter, users can 
use any parameter format that their ContainerLogAggregationPolicy 
understands. For example, we could set ContainerLogAggregationPolicyParameter 
to "SR:0.2", and SampleRateContainerLogAggregationPolicy would contain the 
logic to parse that parameter.
We could also change the interface to:
{code}
public interface ContainerLogAggregationPolicy {
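    // Decides whether the logs of the given container should be aggregated.
    boolean shouldDoLogAggregation(ContainerId containerId, int exitCode);
    // Parses the policy-specific parameter string, e.g. "SR:0.2".
    void parseParameters(String parameters);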
}
{code} 
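
To make the "SR:0.2" idea concrete, here is a minimal sketch of how SampleRateContainerLogAggregationPolicy might parse its parameter. It is an illustration under assumptions, not the patch itself: ContainerId is replaced by a plain String so the snippet compiles without Hadoop on the classpath, and the default rate of 1.0, the injected Random, and the getSampleRate() accessor are hypothetical.

```java
import java.util.Random;

// Stand-in for the proposed interface. Assumption: the real one would take
// org.apache.hadoop.yarn.api.records.ContainerId instead of a String.
interface ContainerLogAggregationPolicy {
    boolean shouldDoLogAggregation(String containerId, int exitCode);
    void parseParameters(String parameters);
}

class SampleRateContainerLogAggregationPolicy implements ContainerLogAggregationPolicy {
    private static final String PREFIX = "SR:";

    // Hypothetical default: aggregate every container until a rate is parsed.
    private double sampleRate = 1.0;
    private final Random random;

    SampleRateContainerLogAggregationPolicy(Random random) {
        this.random = random;
    }

    // Accepts strings of the form "SR:<rate>" with 0.0 <= rate <= 1.0.
    @Override
    public void parseParameters(String parameters) {
        if (parameters == null || !parameters.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Expected \"SR:<rate>\", got: " + parameters);
        }
        // NumberFormatException is a subclass of IllegalArgumentException.
        double rate = Double.parseDouble(parameters.substring(PREFIX.length()).trim());
        if (Double.isNaN(rate) || rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException("Sample rate must be in [0, 1], got: " + rate);
        }
        sampleRate = rate;
    }

    // Aggregates a container's logs with probability sampleRate.
    // The container ID and exit code are ignored in this sketch.
    @Override
    public boolean shouldDoLogAggregation(String containerId, int exitCode) {
        return random.nextDouble() < sampleRate;
    }

    double getSampleRate() {
        return sampleRate;
    }
}
```

With this shape, the NM would only need to hand the opaque parameter string to whichever policy class the application names; the parameter format stays private to each policy.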

bq. How MR overrides the default policy. Maybe we can have YarnRunner at the MR 
level honor the YARN property "yarn.container-log-aggregation-policy.class" at 
the per-job level when it creates the ApplicationSubmissionContext with the 
proper LogAggregationContext. That way we don't have to create extra log 
aggregation properties specific to the MR layer.

Good question. Another possible solution would be to parse them from the 
command line, if users launch their MR application with ToolRunner.run.
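
As a rough illustration of the command-line route: ToolRunner hands generic options such as "-D key=value" to GenericOptionsParser before the tool runs. The sketch below imitates only that one form in plain Java so it is self-contained; it is not Hadoop's parser, and the GenericOptionSketch class, the parseDefines helper, and the com.example.MyPolicy class name are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the "-D key=value" handling that ToolRunner gets
// from GenericOptionsParser. Assumption: only the two-token "-D key=value"
// form is handled here.
class GenericOptionSketch {
    // Property name taken from the discussion above.
    static final String POLICY_CLASS_KEY = "yarn.container-log-aggregation-policy.class";

    // Collects "-D key=value" pairs into a map and appends every other
    // argument, in order, to the caller-supplied 'remaining' list.
    static Map<String, String> parseDefines(String[] args, List<String> remaining) {
        Map<String, String> props = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                String kv = args[++i];
                int eq = kv.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("Expected key=value after -D, got: " + kv);
                }
                props.put(kv.substring(0, eq), kv.substring(eq + 1));
            } else {
                remaining.add(args[i]);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        List<String> remaining = new ArrayList<>();
        Map<String, String> props = parseDefines(args, remaining);
        System.out.println("policy class: " + props.get(POLICY_CLASS_KEY));
        System.out.println("job arguments: " + remaining);
    }
}
```

A job submitter could then read the policy class (and, by the same mechanism, a policy parameter) from the parsed properties when building the LogAggregationContext.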

> NM should provide a way for AM to tell it not to aggregate logs.
> ----------------------------------------------------------------
>
>                 Key: YARN-221
>                 URL: https://issues.apache.org/jira/browse/YARN-221
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: log-aggregation, nodemanager
>            Reporter: Robert Joseph Evans
>            Assignee: Ming Ma
>         Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, 
> YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch
>
>
> The NodeManager should provide a way for an AM to tell it that either the 
> logs should not be aggregated, that they should be aggregated with a high 
> priority, or that they should be aggregated but with a lower priority.  The 
> AM should be able to do this in the ContainerLaunch context to provide a 
> default value, but should also be able to update the value when the container 
> is released.
> This would allow for the NM to not aggregate logs in some cases, and avoid 
> connection to the NN at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
