[ 
https://issues.apache.org/jira/browse/YARN-7179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ting Dai updated YARN-7179:
---------------------------
    Description: 
In hadoop-0.23.0, inside the readAcontainerLogs function, a corrupted valueStream corrupts the output produced through writer.

{code:java}
  public static void readAcontainerLogs(DataInputStream valueStream, Writer writer) throws IOException {
      ....
      while (true) {
        try {
          fileType = valueStream.readUTF();
        } catch (EOFException e) {
          return;
        }
        fileLengthStr = valueStream.readUTF();      // corrupted
        fileLength = Long.parseLong(fileLengthStr); // 0
        writer.write("\n\nLogType:");
        writer.write(fileType);
        writer.write("\nLogLength:");
        writer.write(fileLengthStr);
        writer.write("\nLog Contents:\n");
        BoundedInputStream bis = new BoundedInputStream(valueStream, fileLength); // empty stream
        InputStreamReader reader = new InputStreamReader(bis);
        int currentRead = 0;
        int totalRead = 0;
        while ((currentRead = reader.read(cbuf, 0, bufferSize)) != -1) { // always returns -1
          writer.write(cbuf);
          totalRead += currentRead;
        }
      }
  }
{code}

     
When fileLengthStr is corrupted, in particular when it is read as "0" while valueStream is actually not empty, fileLength becomes 0, so bis and reader wrap an effectively empty stream. The empty reader makes currentRead return -1 immediately, so writer.write inside the while loop never executes and the real log contents are left unconsumed in valueStream. On the next iteration, fileType and fileLengthStr are then read from those unconsumed contents, so fileType, fileLength, bis and reader are all corrupted.
For example, given a DataInputStream containing:
    {color:#d04437}"text", "0", "This is the content", "16", "Another content"{color}
the writer will emit the following log data:
      {color:#d04437}"LogType:text
      LogLength:0
      Log Contents:
      LogType:This is the content
      LogLength:2
      Log Contents:"{color}
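To make the misalignment concrete, here is a minimal, self-contained sketch of the failure mode. It is an assumption-laden reconstruction, not Hadoop code: the class name CorruptLogDemo and helper corruptStream are hypothetical, every token is framed with writeUTF rather than the real aggregated-log format, and a plain readNBytes call stands in for Commons IO's BoundedInputStream.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class CorruptLogDemo {

    // Builds the example stream from the report: "text", "0",
    // "This is the content", "16", "Another content".
    // (Hypothetical helper; writeUTF framing is an assumption of this sketch.)
    public static byte[] corruptStream() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(bos);
        dos.writeUTF("text");
        dos.writeUTF("0");                   // corrupted length
        dos.writeUTF("This is the content"); // real contents, never consumed
        dos.writeUTF("16");
        dos.writeUTF("Another content");
        return bos.toByteArray();
    }

    // Replays the readAcontainerLogs loop; readNBytes(fileLength)
    // plays the role of BoundedInputStream + InputStreamReader.
    public static String readLogs(DataInputStream valueStream) throws IOException {
        StringWriter writer = new StringWriter();
        while (true) {
            String fileType;
            try {
                fileType = valueStream.readUTF();
            } catch (EOFException e) {
                return writer.toString();
            }
            String fileLengthStr = valueStream.readUTF();
            long fileLength = Long.parseLong(fileLengthStr);
            writer.write("\n\nLogType:" + fileType);
            writer.write("\nLogLength:" + fileLengthStr);
            writer.write("\nLog Contents:\n");
            // With fileLength == 0 nothing is consumed here, so the next
            // readUTF() picks up the unread log contents as a "LogType".
            byte[] contents = valueStream.readNBytes((int) fileLength);
            writer.write(new String(contents, StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(corruptStream()));
        // The second entry's contents surface as a bogus LogType header.
        System.out.println(readLogs(in));
    }
}
```

Running this prints an entry whose LogType is the swallowed string "This is the content", mirroring the corruption described above.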





> The log data is corrupted 
> --------------------------
>
>                 Key: YARN-7179
>                 URL: https://issues.apache.org/jira/browse/YARN-7179
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>            Reporter: Ting Dai
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
