[ 
https://issues.apache.org/jira/browse/YARN-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047081#comment-14047081
 ] 

huozhanfeng commented on YARN-2231:
-----------------------------------

Index: MapReduceChildJVM.java
===================================================================
--- MapReduceChildJVM.java      (revision 1387)
+++ MapReduceChildJVM.java      (revision 1388)
@@ -37,6 +37,7 @@
 @SuppressWarnings("deprecation")
 public class MapReduceChildJVM {
 
+  private static final String tailCommand = "tail";
   private static String getTaskLogFile(LogName filter) {
     return ApplicationConstants.LOG_DIR_EXPANSION_VAR + Path.SEPARATOR + 
         filter.toString();
@@ -161,9 +162,12 @@
 
     TaskAttemptID attemptID = task.getTaskID();
     JobConf conf = task.conf;
-
+    long logSize = TaskLog.getTaskLogLength(conf);
+    
     Vector<String> vargs = new Vector<String>(8);
-
+    if (logSize > 0) {
+      vargs.add("(");
+    }
     vargs.add(Environment.JAVA_HOME.$() + "/bin/java");
 
     // Add child (task) java-vm options.
@@ -206,7 +210,6 @@
     vargs.add("-Djava.io.tmpdir=" + childTmpDir);
 
     // Setup the log4j prop
-    long logSize = TaskLog.getTaskLogLength(conf);
     setupLog4jProperties(task, vargs, logSize);
 
     if (conf.getProfileEnabled()) {
@@ -229,8 +232,22 @@
 
     // Finally add the jvmID
     vargs.add(String.valueOf(jvmID.getId()));
-    vargs.add("1>" + getTaskLogFile(TaskLog.LogName.STDOUT));
-    vargs.add("2>" + getTaskLogFile(TaskLog.LogName.STDERR));
+    if (logSize > 0) {
+      vargs.add("|");
+      vargs.add(tailCommand);
+      vargs.add("-c");
+      vargs.add(String.valueOf(logSize));
+      vargs.add(">>" + getTaskLogFile(TaskLog.LogName.STDOUT));
+      vargs.add("; exit $PIPESTATUS ) 2>&1 | ");
+      vargs.add(tailCommand);
+      vargs.add("-c");
+      vargs.add(String.valueOf(logSize));
+      vargs.add(">>" + getTaskLogFile(TaskLog.LogName.STDERR));
+      vargs.add("; exit $PIPESTATUS");
+    } else {
+      vargs.add("1>" + getTaskLogFile(TaskLog.LogName.STDOUT));
+      vargs.add("2>" + getTaskLogFile(TaskLog.LogName.STDERR));
+    }
 
     // Final commmand
     StringBuilder mergedCommand = new StringBuilder();

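For reference, the shell construct that the patch assembles can be tried in isolation, outside the NodeManager. The following is a minimal bash sketch, not part of the patch: the function name run_capped and its argument order are hypothetical. It assumes bash (PIPESTATUS is a bash array, and an unquoted $PIPESTATUS expands to its first element) and a tail that supports -c. Unlike the patch, it truncates the log files with > instead of appending with >>.

```shell
#!/usr/bin/env bash

# run_capped LIMIT STDOUT_FILE STDERR_FILE CMD [ARGS...]
# (hypothetical helper) Runs CMD, keeping only the last LIMIT bytes of its
# stdout and stderr, and returns CMD's own exit status.
run_capped() {
  local limit=$1 out=$2 err=$3
  shift 3
  (
    # Inner pipeline: stdout of CMD goes through tail into STDOUT_FILE.
    "$@" | tail -c "$limit" > "$out"
    # Propagate the exit status of CMD, not of tail.
    exit "${PIPESTATUS[0]}"
  ) 2>&1 | tail -c "$limit" > "$err"
  # Outer pipeline: element 0 is the subshell, which exited with CMD's status.
  return "${PIPESTATUS[0]}"
}
```

Because the subshell's stdout has already been sent to the stdout file, the only data reaching the outer pipe is what was written to stderr inside the parentheses. Running this by hand under the same bash that the container launcher uses is one way to separate problems in the shell construct from problems in how the command is launched.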

> Provide feature  to limit MRJob's stdout/stderr size
> ----------------------------------------------------
>
>                 Key: YARN-2231
>                 URL: https://issues.apache.org/jira/browse/YARN-2231
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: log-aggregation, nodemanager
>    Affects Versions: 2.3.0
>         Environment: CentOS release 5.8 (Final)
>            Reporter: huozhanfeng
>              Labels: features
>
> When an MR job writes too much to stdout or stderr, the disk fills up. 
> This is now affecting our platform management.
> I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (the 
> approach comes from org.apache.hadoop.mapred.TaskLog) to generate the 
> following command:
> exec /bin/bash -c "( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx1024m -Djava.io.tmpdir=$PWD/tmp 
> -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_000002
>  -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA 
> org.apache.hadoop.mapred.YarnChild 10.106.24.108 53911 
> attempt_1403930653208_0003_m_000000_0 2 | tail -c 102 
> >/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_000002/stdout
>  ; exit $PIPESTATUS ) 2>&1 |  tail -c 10240 
> >/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_000002/stderr
>  ; exit $PIPESTATUS "
> But it does not take effect.
> Then, when I debug the NodeManager with "export YARN_NODEMANAGER_OPTS=-Xdebug 
> -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y" and set 
> breakpoints at 
> org.apache.hadoop.util.Shell (line 450: process = builder.start()) and 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch 
> (line 161: List<String> newCmds = new ArrayList<String>(command.size())), 
> the command works.
> I suspect a concurrency problem is preventing the shell pipeline from 
> working properly. This matters to us, and I need your help.
> My email: huozhanf...@gmail.com
> Thanks



--
This message was sent by Atlassian JIRA
(v6.2#6252)
