[
https://issues.apache.org/jira/browse/YARN-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhijie Shen updated YARN-947:
-----------------------------
Attachment: YARN-947.3.patch
I created a new incremental patch, which includes the following modifications:
1. The biggest change is the addition of two more sets of protobuf records
alongside the set of XXXXHistoryData: the set of XXXXStartData and the set of
XXXXFinishData. In fact, XXXXHistoryData = XXXXStartData + XXXXFinishData. The
only duplicated part is the Id, which serves as the key.
XXXXStartData contains the fields that are determined when the object (RMApp,
RMAppAttempt or RMContainer) starts, while XXXXFinishData contains the fields
that are determined when the object finishes. With the separate records, we
can redesign the writer interface to write one part of the data when the object
starts and the other part when the object finishes, thereby reducing the loss of
information when the history data cannot be completely recorded (e.g. RM crash).
2. Change all protobuf records from interfaces to abstract classes, and add a
built-in newInstance method for users to call.
3. Improve toString() of XXXXPBImpl here as well, which was filed as YARN-1066.
Therefore, I'll close that JIRA as a duplicate.
4. Fix a bug in ContainerHistoryDataPBImpl.
5. Instead of recording ContainerState, I now record the container's exit
status. The reason is stated in YARN-1123:
https://issues.apache.org/jira/browse/YARN-1123?focusedCommentId=13793962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13793962
ContainerState is always FINISHED for all the containers, which makes it
meaningless. ContainerExitStatus, which is the exit code, can instead indicate
problems in the container.
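To illustrate items 1, 2 and 5 together, here is a minimal sketch in plain Java. It is not code from the patch: the real records are protobuf-backed, and the classes below are simplified stand-ins whose fields, the merge() helper, and the nullable exit status are all assumptions made for illustration.

```java
import java.util.Objects;

// Simplified stand-ins only. The real records are protobuf-backed; every
// field and method here is an assumption made for illustration.

abstract class ContainerStartData {
    // Built-in factory (item 2): callers never name the implementation class.
    static ContainerStartData newInstance(String containerId, long startTime) {
        return new Impl(containerId, startTime);
    }

    abstract String getContainerId();
    abstract long getStartTime();

    private static final class Impl extends ContainerStartData {
        private final String containerId;
        private final long startTime;

        Impl(String containerId, long startTime) {
            this.containerId = Objects.requireNonNull(containerId, "containerId");
            this.startTime = startTime;
        }

        @Override String getContainerId() { return containerId; }
        @Override long getStartTime() { return startTime; }
    }
}

abstract class ContainerFinishData {
    static ContainerFinishData newInstance(String containerId, long finishTime, int exitStatus) {
        return new Impl(containerId, finishTime, exitStatus);
    }

    abstract String getContainerId();
    abstract long getFinishTime();
    abstract int getExitStatus(); // exit code rather than a state (item 5)

    private static final class Impl extends ContainerFinishData {
        private final String containerId;
        private final long finishTime;
        private final int exitStatus;

        Impl(String containerId, long finishTime, int exitStatus) {
            this.containerId = Objects.requireNonNull(containerId, "containerId");
            this.finishTime = finishTime;
            this.exitStatus = exitStatus;
        }

        @Override String getContainerId() { return containerId; }
        @Override long getFinishTime() { return finishTime; }
        @Override int getExitStatus() { return exitStatus; }
    }
}

// History = start + finish, joined on the shared id (item 1).
final class ContainerHistoryData {
    private final ContainerStartData start;
    private final ContainerFinishData finish; // null if never written, e.g. RM crash

    private ContainerHistoryData(ContainerStartData start, ContainerFinishData finish) {
        this.start = start;
        this.finish = finish;
    }

    // Hypothetical helper: combines the two partial records into one view.
    static ContainerHistoryData merge(ContainerStartData start, ContainerFinishData finish) {
        Objects.requireNonNull(start, "start");
        if (finish != null && !start.getContainerId().equals(finish.getContainerId())) {
            throw new IllegalArgumentException("start and finish records have different ids");
        }
        return new ContainerHistoryData(start, finish);
    }

    String getContainerId() { return start.getContainerId(); }
    long getStartTime() { return start.getStartTime(); }
    boolean isFinished() { return finish != null; }

    // Returns null when the finish record is missing.
    Integer getExitStatus() { return finish == null ? null : finish.getExitStatus(); }
}
```

In this sketch, calling ContainerHistoryData.merge(start, null) still yields the id and start time, which is the kind of partial information the split records are meant to preserve when the finish half is never written.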
[~vinodkv], would you please review it again?
> Defining the history data classes for the implementation of the
> reading/writing interface
> -----------------------------------------------------------------------------------------
>
> Key: YARN-947
> URL: https://issues.apache.org/jira/browse/YARN-947
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Zhijie Shen
> Assignee: Zhijie Shen
> Fix For: YARN-321
>
> Attachments: YARN-947.1.patch, YARN-947.2.patch, YARN-947.3.patch
>
>
> We need to define history data classes that have exactly the fields to be
> stored. That way, the implementations don't all need to duplicate the
> logic to extract the required information from RMApp, RMAppAttempt and
> RMContainer.
> We use protobuf to define these classes, so that they can be serialized
> to and deserialized from bytes, which makes them easier to persist.