[ 
https://issues.apache.org/jira/browse/YARN-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-4265:
------------------------
    Attachment: YARN-4265-trunk.005.patch

Thanks [~djp] and [~liuml07] for the help! Since we need to handle files with 
appends, we cannot directly use the directory modification time to decide 
whether the contents of a directory have changed. This means we also need to 
change some logic in the cleanLogs method. I redesigned the cleanLogs method to 
perform a log scan with two methods:
Method 1: For the given directory, search (in a depth-first fashion) to find 
all application log directories. For each of them, call method 2. 
Method 2: For the given application log directory, check all files inside. If 
any file has been "recently" (as defined by the configs) updated, skip 
removing this directory. Otherwise, remove this application log directory. 

In this way we can search inside a directory for all application log 
directories that need to be reclaimed. 
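
To make the two-method scan concrete, here is a minimal sketch. It is not the 
code in the patch: it uses java.nio.file on a local file system rather than 
the Hadoop FileSystem API, and the class name LogCleanerSketch, the helper 
names, the "application_" directory-name prefix check, and the retainMillis 
parameter are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LogCleanerSketch {

    // Assumption: an application log directory is recognized by its name prefix.
    static boolean isAppLogDir(Path dir) {
        return dir.getFileName().toString().startsWith("application_");
    }

    // Method 1: depth-first search for application log directories under dir.
    static void cleanLogs(Path dir, long retainMillis, long nowMillis) throws IOException {
        List<Path> children;
        try (Stream<Path> s = Files.list(dir)) {
            children = s.collect(Collectors.toList());
        }
        for (Path child : children) {
            if (!Files.isDirectory(child)) {
                continue;
            }
            if (isAppLogDir(child)) {
                if (shouldClean(child, retainMillis, nowMillis)) {
                    deleteRecursively(child);
                }
            } else {
                cleanLogs(child, retainMillis, nowMillis);
            }
        }
    }

    // Method 2: a directory may be removed only if no file inside it was
    // modified within the last retainMillis milliseconds.
    static boolean shouldClean(Path appDir, long retainMillis, long nowMillis) throws IOException {
        List<Path> entries;
        try (Stream<Path> s = Files.walk(appDir)) {
            entries = s.collect(Collectors.toList());
        }
        for (Path p : entries) {
            if (Files.isRegularFile(p)) {
                long age = nowMillis - Files.getLastModifiedTime(p).toMillis();
                if (age <= retainMillis) {
                    return false; // recently updated file found, keep the directory
                }
            }
        }
        return true;
    }

    // Delete children before parents by visiting paths in reverse sorted order.
    static void deleteRecursively(Path dir) throws IOException {
        List<Path> entries;
        try (Stream<Path> s = Files.walk(dir)) {
            entries = s.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : entries) {
            Files.delete(p);
        }
    }
}
```

The key point of the sketch is that the decision is based on per-file 
modification times rather than on the directory's own modification time, so a 
directory whose files are still being appended to is never reclaimed. 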

According to Junping's suggestion, I've also added a new unit test 
(testCleanLogs) to cover common cases for the cleanLogs method. 

> Provide new timeline plugin storage to support fine-grained entity caching
> --------------------------------------------------------------------------
>
>                 Key: YARN-4265
>                 URL: https://issues.apache.org/jira/browse/YARN-4265
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-4265-trunk.001.patch, YARN-4265-trunk.002.patch, 
> YARN-4265-trunk.003.patch, YARN-4265-trunk.004.patch, 
> YARN-4265-trunk.005.patch, YARN-4265.YARN-4234.001.patch, 
> YARN-4265.YARN-4234.002.patch
>
>
> To support the newly proposed APIs in YARN-4234, we need to create a new 
> plugin timeline store. The store may behave similarly to the 
> EntityFileTimelineStore proposed in YARN-3942, but cache data at cache id 
> granularity instead of application id granularity. Let's have this storage 
> as a standalone one, instead of updating EntityFileTimelineStore, to keep the 
> existing store (EntityFileTimelineStore) stable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
