Xuan Gong commented on YARN-221:

[~mingma] Thanks for working on this. I have some general comments and want to 
discuss with you.
We could have a common interface called ContainerLogAggregationPolicy which can 
include at least this function:
* doLogAggregationForContainer (You might need a better name.). And this 
function will be called by AppLogAggregator to check whether the log for this 
container need to be aggregated.

So, instead of creating a enum type: ContainerLogAggregationPolicy

We could create some basic policy which implements the common interface 
ContainerLogAggregationPolicy, such as AllContainerLogAggregationPolicy, 
NonContainerLogAggregationPolicy, AMContainerOnlyLogAggregationPolicy, 
FailContainerOnlyLogAggregationPolicy, SampleRateContainerLogAggregationPolicy, 
I think that this way might be more extendible. And in the future, clients can 
implement their own ContainerLogAggregationPolicy which can be more complex.
With this, we do not need add any new configurations in service side.
+  public static final String LOG_AGGREGATION_SAMPLE_PERCENT = NM_PREFIX
+      + "log-aggregation.worker-sample-percent";
+  public static final float DEFAULT_LOG_AGGREGATION_SAMPLE_PERCENT = 1.0f;
+  public static final String LOG_AGGREGATION_AM_LOGS = NM_PREFIX
+      + "log-aggregation.am-enable";
+  public static final boolean DEFAULT_LOG_AGGREGATION_AM_LOGS = true;
can be removed

Also, instead of adding ContainerLogAggregationPolicy into CLC, we could add 
ContainerLogAggregationPolicy into LogAggregationContext which already can be 
accessed by NM.

Thoughts ?

> NM should provide a way for AM to tell it not to aggregate logs.
> ----------------------------------------------------------------
>                 Key: YARN-221
>                 URL: https://issues.apache.org/jira/browse/YARN-221
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: log-aggregation, nodemanager
>            Reporter: Robert Joseph Evans
>            Assignee: Ming Ma
>         Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, 
> YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch
> The NodeManager should provide a way for an AM to tell it that either the 
> logs should not be aggregated, that they should be aggregated with a high 
> priority, or that they should be aggregated but with a lower priority.  The 
> AM should be able to do this in the ContainerLaunch context to provide a 
> default value, but should also be able to update the value when the container 
> is released.
> This would allow for the NM to not aggregate logs in some cases, and avoid 
> connection to the NN at all.

This message was sent by Atlassian JIRA

Reply via email to