[ 
https://issues.apache.org/jira/browse/YARN-5470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408087#comment-15408087
 ] 

Junping Du edited comment on YARN-5470 at 8/4/16 4:42 PM:
----------------------------------------------------------

Thanks Xuan for contributing the patch. 

{noformat}
+        + "to get exact matched log files. Use \"ALL\" or \".*\"to "
+        + "fetch all the log files for the container. Specific -regex "
+        + "for using java regex to find matched log files.");
{noformat}
I think ".\*" matches everything only when it is interpreted as a regular 
expression, doesn't it? In the normal (non-regex) case we use "\*" directly as 
the wildcard. So we could support several ways to fetch all log files: 
"-logFiles ALL", "-logFiles \*" and "-logFiles -regex .\*".
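
To make the distinction concrete, here is a minimal sketch in plain Java. It is 
not taken from the patch: the class and method names are hypothetical, and it 
assumes that regex mode requires the whole file name to match. It shows that 
"system.out" selects different files in exact mode and regex mode, and that a 
bare "*" is not a valid Java regex at all, so it would need its own handling if 
we accept it as a wildcard.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper for illustration only; not part of the YARN code base.
final class LogFileSelectorSketch {

  /** Exact mode: the pattern must equal the file name character for character. */
  static boolean matchesExact(String pattern, String fileName) {
    return pattern.equals(fileName);
  }

  /** Regex mode (assumed): the pattern is a Java regex that must match the whole name. */
  static boolean matchesRegex(String pattern, String fileName) {
    return Pattern.matches(pattern, fileName);
  }

  /** Returns true when the string compiles as a Java regex. */
  static boolean isValidRegex(String pattern) {
    try {
      Pattern.compile(pattern);
      return true;
    } catch (PatternSyntaxException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // In exact mode "system.out" selects only the file with that literal name.
    System.out.println(matchesExact("system.out", "system.out"));
    System.out.println(matchesExact("system.out", "systemXout"));
    // In regex mode the dot matches any character, so "systemXout" is selected too.
    System.out.println(matchesRegex("system.out", "systemXout"));
    // ".*" is a valid regex that matches every name; a bare "*" does not compile.
    System.out.println(isValidRegex(".*"));
    System.out.println(isValidRegex("*"));
  }
}
```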

{noformat}
   private List<String> getMatchedLogFiles(ContainerLogsRequest options,
-      Collection<String> candidate) throws IOException {
+      Collection<String> candidate, boolean useRegex) throws IOException {
     List<String> matchedFiles = new ArrayList<String>();
     List<String> filePattern = options.getLogTypes();
+    boolean fetchAll = fetchAllLogFiles(
+        filePattern.toArray(new String[filePattern.size()]));
     for (String file : candidate) {
-      if (isFileMatching(file, filePattern)) {
+      if (fetchAll) {
         matchedFiles.add(file);
       }
+      if (useRegex) {
+        if (isFileMatching(file, filePattern)) {
+          matchedFiles.add(file);
+        }
+      } else {
+        if (filePattern.contains(file)) {
+          matchedFiles.add(file);
+        }
+      }
     }
     return matchedFiles;
   }
{noformat}
This could add duplicate log files if the user specifies ALL together with some 
specific log file. Why not return a Set instead of a List? Do we need to access 
the items in order or by index? If not, replacing the List with a Set gets rid 
of the duplication issue and also allows a fast path, such as returning 
candidate directly when fetchAll = true.
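
Roughly what I have in mind, as a standalone sketch rather than a replacement 
for the patch. The assumptions are mine: the class name is hypothetical, the 
method takes the pattern list directly instead of a ContainerLogsRequest, 
fetchAllLogFiles is assumed to check for "ALL" or ".*", and regex mode is 
assumed to match the whole file name. A LinkedHashSet is used so the original 
order of candidate is kept in case some caller depends on it.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical standalone sketch; names and signatures are simplified.
final class MatchedLogFilesSketch {

  /** Assumed behaviour: "ALL" or ".*" among the patterns means fetch everything. */
  static boolean fetchAllLogFiles(List<String> patterns) {
    return patterns.contains("ALL") || patterns.contains(".*");
  }

  static Set<String> getMatchedLogFiles(List<String> patterns,
      Collection<String> candidates, boolean useRegex) {
    // Fast path: no per-file matching is needed when everything is requested.
    if (fetchAllLogFiles(patterns)) {
      return new LinkedHashSet<>(candidates);
    }
    // A Set cannot hold the same file twice, whatever patterns overlap.
    Set<String> matched = new LinkedHashSet<>();
    for (String file : candidates) {
      if (useRegex) {
        for (String pattern : patterns) {
          if (Pattern.matches(pattern, file)) {
            matched.add(file);
            break;
          }
        }
      } else if (patterns.contains(file)) {
        matched.add(file);
      }
    }
    return matched;
  }

  public static void main(String[] args) {
    List<String> candidates = Arrays.asList("stdout", "stderr", "syslog");
    // ALL plus a specific file: each candidate is returned once.
    System.out.println(
        getMatchedLogFiles(Arrays.asList("ALL", "stderr"), candidates, false));
    // Regex mode with overlapping patterns: still no duplicates.
    System.out.println(
        getMatchedLogFiles(Arrays.asList("std.*", "stdout"), candidates, true));
  }
}
```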


> Differentiate exactly match with regex in yarn log CLI
> ------------------------------------------------------
>
>                 Key: YARN-5470
>                 URL: https://issues.apache.org/jira/browse/YARN-5470
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-5470.1.patch, YARN-5470.2.patch
>
>
> Since YARN-5089, the YARN log CLI "-logFiles" option supports regular 
> expressions. However, we should differentiate an exact match from a regex 
> match, since a user could pass something like "system.out", which has 
> different semantics in each mode.


