huozhanfeng created YARN-2231:

             Summary: Provide feature to limit MRJob's stdout/stderr size
                 Key: YARN-2231
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: log-aggregation, nodemanager
    Affects Versions: 2.3.0
         Environment: CentOS release 5.8 (Final)
            Reporter: huozhanfeng

When an MRJob prints too much to stdout or stderr, the logs fill the disk. This 
is now affecting the management of our platform.

I have modified org.apache.hadoop.mapred.TaskLog to generate the launch command 
as follows:
exec /bin/bash -c "( $JAVA_HOME/bin/java 
-Dhadoop.metrics.log.level=WARN  -Xmx1024m$PWD/tmp -Dhadoop.root.logger=DEBUG,CLA 
org.apache.hadoop.mapred.YarnChild 53911 
attempt_1403930653208_0003_m_000000_0 2 | tail -c 102 
 ; exit $PIPESTATUS ) 2>&1 |  tail -c 10240 
 ; exit $PIPESTATUS "

But it doesn't take effect.
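The truncation pattern itself can be checked in isolation. Below is a minimal, hypothetical bash sketch (not the NodeManager's actual launcher; seq and the 10240-byte cap are stand-ins for YarnChild and the configured limit), showing that piping through tail -c does cap the output while $PIPESTATUS preserves the child's exit status:

```shell
#!/bin/bash
# Minimal sketch of the limiting pattern generated above:
# run a noisy child, cap its combined stdout/stderr at 10240
# bytes with tail -c, and keep the child's own exit status
# via PIPESTATUS. "seq" stands in for the real YarnChild.
( seq 1 100000; exit 3 ) 2>&1 | tail -c 10240 > stdout.capped
child_status=${PIPESTATUS[0]}   # the child's status (3), not tail's
echo "captured $(wc -c < stdout.capped) bytes; child exited ${child_status}"
```

If this sketch behaves correctly on the same host, the pipe-and-tail pattern itself is sound, which would point the investigation at how the NodeManager composes and executes the generated command rather than at the shell pipeline.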

And then, when I used "export YARN_NODEMANAGER_OPTS=-Xdebug 
-Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y" to debug the 
NodeManager, I found that when I set breakpoints at 
org.apache.hadoop.util.Shell (line 450: process = builder.start()) and 
(line 161: List<String> newCmds = new ArrayList<String>(command.size())), the 
cmd will take effect.
I suspect there is a concurrency problem that keeps the pipe shell from 
performing properly. This matters to us, and I need your help.

my email:


This message was sent by Atlassian JIRA
