Varun Saxena commented on YARN-4053:

So to resolve this, we need some other way of storing metric values. The 
options are as follows:
# Keep the current way of storing metric values and write a custom filter to 
match them. But this would require the new filter to be deployed on all 
region servers, so this solution may not be feasible. If we do not want 
to do that, then for lexicographic comparison to work, the byte arrays being 
compared must be of equal length.
# Store values as primitive types, that is, a long as 8 bytes, an integer as 4 
bytes and so on. But this can create problems in lexicographic comparison too. 
Say metric m1 is stored as a long, but a query to the reader is of the form 
{{m1 > 40000}}. As 40000 will be interpreted as an Integer, we will try to 
compare 4 bytes against 8 bytes.
So the solution for this is to store every integral value as a long (8 bytes) 
and every floating point value as a double. The same approach can be used while 
matching on the reader side.
# But the above solution may not work if we want to support BigInteger and 
BigDecimal values (i.e. numerical values > 8 bytes). Although 8 bytes should 
normally be enough, aggregated values may exceed 8 bytes. In this case, we can 
decide the maximum number of bytes we need to support. 16 bytes, or for that 
matter even 12 bytes, should be more than enough for all realistic scenarios. 
While encoding, we can pad with zeroes in front if the number is less than 16 
# Another option is to continue supporting the string representation and 
restrict the maximum number of digits we support before and after the decimal 
point, say 30 digits before the decimal point and 3 after. We can pad the 
remaining bytes with zeroes while storing so that comparison can be done.
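
To make the byte-length argument concrete, here is a rough sketch in plain Java (no HBase dependency) of the string encoding versus the fixed-width options above. The class and method names are hypothetical and only for illustration, {{compareBytes}} merely stands in for an unsigned byte-wise comparison on the region server, and the sketch assumes non-negative metric values, since two's complement negatives would sort after positives under such a comparison.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the timeline service code.
public class MetricEncodingSketch {

    // Unsigned lexicographic comparison of two byte arrays; a stand-in for
    // the byte-wise ordering applied to stored cell values.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        // Equal prefix: the shorter array sorts first.
        return Integer.compare(a.length, b.length);
    }

    // String representation, as described in the issue: UTF-8 bytes of the
    // decimal text, so the encoded length depends on the number of digits.
    static byte[] encodeAsString(Number value) {
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Option 2: widen every integral value to a big-endian 8-byte long,
    // on both the writer side and the reader (query) side.
    static byte[] encodeAsLong(Number value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value.longValue()).array();
    }

    // Option 3: left-pad a non-negative BigInteger with zero bytes up to a
    // fixed width (for example 16 bytes).
    static byte[] encodeFixedWidth(BigInteger value, int width) {
        if (value.signum() < 0) {
            throw new IllegalArgumentException("sketch assumes non-negative values");
        }
        byte[] raw = value.toByteArray();
        // toByteArray() may prepend a 0x00 sign byte; skip it before padding.
        int start = (raw.length > 1 && raw[0] == 0) ? 1 : 0;
        int len = raw.length - start;
        if (len > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " bytes");
        }
        byte[] out = new byte[width];
        System.arraycopy(raw, start, out, width - len, len);
        return out;
    }
}
```

With the string encoding, 9 compares as greater than 10 because the first character decides the result. With both sides widened to 8 bytes, an Integer 40000 coming from a query and a Long 40000 coming from storage encode to identical bytes, and ordering follows numeric order for non-negative values. The same holds for the zero-padded 16-byte form when a value exceeds 8 bytes.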

> Change the way metric values are stored in HBase Storage
> --------------------------------------------------------
>                 Key: YARN-4053
>                 URL: https://issues.apache.org/jira/browse/YARN-4053
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
> Currently the HBase implementation uses GenericObjectMapper to convert 
> and store values in the backend HBase storage. This converts everything into a 
> string representation (ASCII/UTF-8 encoded byte array).
> While this is fine in most cases, it does not quite serve our use case for 
> metrics. 
> So we need to decide how we are going to encode and decode metric values and 
> store them in HBase.

This message was sent by Atlassian JIRA
