[
https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697445#comment-14697445
]
Varun Saxena commented on YARN-4053:
------------------------------------
To resolve this, we need a different way of storing metric values. The options
are as follows:
# Keep the current way of storing metric values and write a custom filter to
match them. However, the new filter would have to be deployed on all region
servers, so this solution may not be feasible. If we do not want a custom
filter, the byte arrays being compared must be of equal length for
lexicographic comparison to work.
# Store values as primitive types, that is, a long as 8 bytes, an integer as 4
bytes, and so on. But this can also break lexicographic comparison. Say metric
m1 is stored as a long, but a query to the reader is of the form {{m1 >
40000}}. As 40000 will be interpreted as an Integer, we would end up comparing
4 bytes against 8 bytes.
The solution is to store every integral value as a long (8 bytes) and every
floating-point value as a double. The same conversion can be applied on the
reader side when matching.
# The above solution may not work, however, if we want to support BigInteger
and BigDecimal values (i.e. numerical values wider than 8 bytes). Although 8
bytes should normally be enough, aggregated values may exceed 8 bytes. In this
case, we can decide the maximum number of bytes we need to support. 16 bytes,
or even 12 bytes, should be more than enough for all realistic scenarios.
While encoding, we can pad with leading zeroes if the number occupies fewer
than 16 bytes.
# Another option is to keep the string representation and restrict the maximum
number of digits supported before and after the decimal point, say 30 digits
before the decimal point and 3 after. We can pad the remaining positions with
zeroes while storing so that comparison can be done.
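
To make options 2 to 4 above concrete, here is a minimal sketch using only the
JDK, not the actual timeline service or HBase code. The class and method names
(MetricEncodingSketch, encodeLong, encodeWide, encodeDecimal, compareBytes) are
hypothetical. It also assumes non-negative values, because signed
two's-complement bytes do not sort correctly under an unsigned byte-wise
comparison, and it assumes HALF_UP rounding to 3 decimal places for option 4.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;
import java.nio.ByteBuffer;

final class MetricEncodingSketch {
    static final int WIDE_WIDTH = 16;  // option 3 width (assumption)
    static final int INT_DIGITS = 30;  // option 4: digits before the point
    static final int FRAC_DIGITS = 3;  // option 4: digits after the point

    /** 4-byte big-endian encoding, shown only to reproduce the width mismatch. */
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    /** Option 2: every integral value becomes 8 big-endian bytes. */
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Option 3: left-pad a non-negative BigInteger with zero bytes to 16 bytes. */
    static byte[] encodeWide(BigInteger value) {
        if (value.signum() < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        byte[] raw = value.toByteArray(); // big-endian; may start with a 0x00 sign byte
        int start = (raw.length > 1 && raw[0] == 0) ? 1 : 0;
        int len = raw.length - start;
        if (len > WIDE_WIDTH) {
            throw new IllegalArgumentException("value needs more than " + WIDE_WIDTH + " bytes");
        }
        byte[] out = new byte[WIDE_WIDTH];
        System.arraycopy(raw, start, out, WIDE_WIDTH - len, len);
        return out;
    }

    /** Option 4: fixed-width decimal string, zero-padded in front. */
    static String encodeDecimal(BigDecimal value) {
        if (value.signum() < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        String plain = value.setScale(FRAC_DIGITS, RoundingMode.HALF_UP).toPlainString();
        int intDigits = plain.indexOf('.');
        if (intDigits > INT_DIGITS) {
            throw new IllegalArgumentException("more than " + INT_DIGITS + " integer digits");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = intDigits; i < INT_DIGITS; i++) {
            sb.append('0');
        }
        return sb.append(plain).toString();
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Stored long 50000 against a 4-byte query value 40000: widths differ.
        System.out.println(compareBytes(encodeLong(50000L), encodeInt(40000)));
        // Same values, both encoded as 8-byte longs.
        System.out.println(compareBytes(encodeLong(50000L), encodeLong(40000L)));
    }
}
```

With mixed widths, the stored long 50000 sorts before the 4-byte 40000, so
{{m1 > 40000}} would fail to match. Once both sides use the same fixed width,
the byte order agrees with the numeric order for non-negative values.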
> Change the way metric values are stored in HBase Storage
> --------------------------------------------------------
>
> Key: YARN-4053
> URL: https://issues.apache.org/jira/browse/YARN-4053
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
>
> Currently the HBase implementation uses GenericObjectMapper to convert and
> store values in the backend HBase storage. This converts everything into a
> string representation (ASCII/UTF-8 encoded byte array).
> While this is fine in most cases, it does not quite serve our use case for
> metrics.
> So we need to decide how we are going to encode and decode metric values and
> store them in HBase.
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)