Vrushali C commented on YARN-4053:

The way I see this, it comes down to a basic question of whether we really 
*need* floating point precision in metric values. For instance, cost is a 
metric which could have a decimal value upon calculation. But, in my opinion, 
a cost of 5 dollars versus 5.347891 dollars versus 5.78913 dollars is 
not that different. A cost of 6.x dollars is different from 5.x. I believe 
that it does not matter THAT much whether cost is 5.347891 or 5.78913. These are 
Hadoop applications; the time duration is rarely going to be exactly consistent 
for exactly the same code. So metrics will usually fluctuate slightly 
between different runs of the exact same job. 

Storage and querying of Longs is straightforward and clean. No ambiguity in 

Contrast that with storing various numerical data types in metrics:
- all the complexity of storing column prefixes that tell us which type 
is stored, so that serialization to/from HBase can be done correctly.
- filtering in HBase becomes much more complicated with all these 
different data types.
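
To make the contrast concrete, here is a minimal sketch in plain Java (no HBase 
dependency). The class and method names are hypothetical and are not taken from 
the patch. It assumes metric values are non-negative longs written as fixed-width, 
8-byte big-endian arrays, and that the store compares values byte by byte as 
unsigned bytes. Under those assumptions a long round-trips without any type 
prefix and its byte order matches its numeric order, whereas a string-style 
encoding of the same numbers does not sort numerically.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class MetricEncodingSketch {

    // Encode a long as a fixed-width, 8-byte big-endian array.
    // No type prefix is needed because every value has the same layout.
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Decode the 8-byte array back into a long.
    static long decodeLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    // String-style encoding: the decimal digits as UTF-8 bytes.
    static byte[] encodeAsString(long value) {
        return Long.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Lexicographic comparison treating each byte as unsigned
    // (assumed here to be how a byte-oriented filter compares values).
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // Fixed-width longs: byte order agrees with numeric order (9 < 10).
        System.out.println(compareBytes(encodeLong(9L), encodeLong(10L)) < 0);
        // String encoding: "9" sorts after "10" byte-wise.
        System.out.println(compareBytes(encodeAsString(9L), encodeAsString(10L)) > 0);
        // Round trip without any type information.
        System.out.println(decodeLong(encodeLong(5L)) == 5L);
    }
}
```

Negative values would not sort correctly with this plain two's-complement layout, 
so the sketch only covers the non-negative case.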

> Change the way metric values are stored in HBase Storage
> --------------------------------------------------------
>                 Key: YARN-4053
>                 URL: https://issues.apache.org/jira/browse/YARN-4053
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>         Attachments: YARN-4053-YARN-2928.01.patch
> Currently HBase implementation uses GenericObjectMapper to convert and store 
> values in backend HBase storage. This converts everything into a string 
> representation (ASCII/UTF-8 encoded byte array).
> While this is fine in most cases, it does not quite serve our use case for 
> metrics. 
> So we need to decide how we are going to encode and decode metric values and 
> store them in HBase.

This message was sent by Atlassian JIRA
