Xuan Gong commented on YARN-221:

Thanks for the latest patch. I think that we are close. The patch looks good 
overall.  One nit:
*  could we modify this doc in AppLogAggregatorImpl, too
    // Create a set of Containers whose logs will be uploaded in this cycle.
    // It includes:
    // a) all containers in pendingContainers: those containers are finished
    //    and satisfy the retentionPolicy.
    // b) some set of running containers: For all the Running containers,
    // we have ContainerLogsRetentionPolicy.AM_AND_FAILED_CONTAINERS_ONLY,
    // so simply set wasContainerSuccessful as true to
    // bypass FAILED_CONTAINERS check and find the running containers 
    // which satisfy the retentionPolicy.

Also, I realized that ContainerTokenIdentifier is used here
boolean shouldDoLogAggregation(ContainerTokenIdentifier containerToken,  int 
Currently, it is fine. But if in future, we might need other information which 
the ContainerTokenIdentifier can not provide. So, probably, we could have our 
own ContainerLogContext instead of using ContainerTokenIdentifier ? In that 
case, if we have requirement to use other information, we could add. 

Thoughts ?

> NM should provide a way for AM to tell it not to aggregate logs.
> ----------------------------------------------------------------
>                 Key: YARN-221
>                 URL: https://issues.apache.org/jira/browse/YARN-221
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: log-aggregation, nodemanager
>            Reporter: Robert Joseph Evans
>            Assignee: Ming Ma
>         Attachments: YARN-221-6.patch, YARN-221-7.patch, YARN-221-8.patch, 
> YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, YARN-221-trunk-v3.patch, 
> YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch
> The NodeManager should provide a way for an AM to tell it that either the 
> logs should not be aggregated, that they should be aggregated with a high 
> priority, or that they should be aggregated but with a lower priority.  The 
> AM should be able to do this in the ContainerLaunch context to provide a 
> default value, but should also be able to update the value when the container 
> is released.
> This would allow for the NM to not aggregate logs in some cases, and avoid 
> connection to the NN at all.

This message was sent by Atlassian JIRA

Reply via email to