[ 
https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637300#comment-14637300
 ] 

Zhijie Shen commented on YARN-3908:
-----------------------------------

bq. Is it the event id + timestamp? How about the event type? If you look at 
the equals() and the hashCode() implementations of TimelineEvent, it uses the 
timestamp, the event type, and even the info as a whole, but the id is not used 
for equality. How does that square with the stated intent that the event id and 
the timestamp form the identity?

There's no event type now. In v1 it was called type, but in v2 it has been 
renamed to id. We want id + ts to identify an event object uniquely, which 
supports the case where the same event happens multiple times. It also lets us 
avoid a combined ID like "container_allocation_13421543243". Does this make 
sense?
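
As a rough sketch of what identity by id + ts could look like, assuming a 
hypothetical EventKey class (this is not the actual TimelineEvent code, and the 
field names are made up for illustration):

```java
import java.util.Objects;

// Hypothetical illustration only; not the real TimelineEvent class.
final class EventKey {
    final String id;       // e.g. "container_allocation"
    final long timestamp;  // when this occurrence happened

    EventKey(String id, long timestamp) {
        this.id = id;
        this.timestamp = timestamp;
    }

    // Two keys are equal when both id and timestamp match; info is ignored.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof EventKey)) {
            return false;
        }
        EventKey other = (EventKey) o;
        return timestamp == other.timestamp && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, timestamp);
    }
}
```

With this shape, two occurrences of "container_allocation" at different 
timestamps are distinct keys, so the id itself does not need a numeric suffix.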

bq. Is pretty much the only access pattern "give me all the events that belong 
to this entity"?

Yeah: get the events of one entity in chronological order, or get just part of 
them via filtering.

bq. Two TimelineEvents are equal only if the timestamp is equal AND the type is 
equal AND the entire info maps are equal. What would we query by event type, 
timestamp and event info key? Do users always have to specify the timestamp?

There's no type, only an ID. In the current reader API we cannot do sub-entity 
filtering, but in the future we can try to support, for example, getting the 
events in a given time window. If two events have the same <id, ts> but 
different info, we may consider them the same event carrying different 
information. The later put will append more k/v pairs or update the existing 
ones.

bq. Do we need to store only the latest event for each timestamp, or all of 
them? It would almost sound like the key should be type and timestamp, but what 
about the entire event info map?

In the DB, I think the proper logic is: if we put <event1, ts1> and 
<event1, ts2>, we should have two separate records persisted; and if we put 
<event1, ts1, info: \[k1=v1, k2=v2\]> and then <event1, ts1, info: \[k1=v1'\]>, 
we should update the same record and set k1=v1'.
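
The put semantics above can be sketched with plain maps. This is only an 
in-memory illustration under assumed names (EventStoreSketch, put, info, 
recordCount); it says nothing about the actual HBase schema or writer code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not HBaseTimelineWriterImpl.
final class EventStoreSketch {
    // event id -> timestamp -> info map; one record per <id, ts> pair.
    private final Map<String, Map<Long, Map<String, String>>> records = new HashMap<>();

    // A new <id, ts> creates a record; an existing one has its info merged,
    // so later values overwrite earlier ones and new keys are appended.
    void put(String id, long ts, Map<String, String> info) {
        records.computeIfAbsent(id, k -> new TreeMap<>())
               .computeIfAbsent(ts, k -> new HashMap<>())
               .putAll(info);
    }

    // Returns the merged info for <id, ts>, or null if no such record exists.
    Map<String, String> info(String id, long ts) {
        Map<Long, Map<String, String>> byTs = records.get(id);
        return byTs == null ? null : byTs.get(ts);
    }

    // Counts the distinct <id, ts> records.
    int recordCount() {
        int n = 0;
        for (Map<Long, Map<String, String>> byTs : records.values()) {
            n += byTs.size();
        }
        return n;
    }
}
```

Putting <event1, ts1> and <event1, ts2> yields two records, while a second put 
to <event1, ts1> with k1=v1' updates k1 and leaves k2=v2 in place.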



> Bugs in HBaseTimelineWriterImpl
> -------------------------------
>
>                 Key: YARN-3908
>                 URL: https://issues.apache.org/jira/browse/YARN-3908
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Vrushali C
>         Attachments: YARN-3908-YARN-2928.001.patch, 
> YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, 
> YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, 
> YARN-3908-YARN-2928.005.patch
>
>
> 1. In HBaseTimelineWriterImpl, the info column family contains the basic 
> fields of a timeline entity plus events. However, entity#info map is not 
> stored at all.
> 2. event#timestamp is also not persisted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
