[ 
https://issues.apache.org/jira/browse/YARN-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630472#comment-15630472
 ] 

Vrushali C commented on YARN-5336:
----------------------------------

Some other interesting points to keep in mind:

As per https://hbase.apache.org/book.html#table_schema_rules_of_thumb , we 
should aim to have cells no larger than 10 MB, or 50 MB if we use MOB. 
Otherwise, we should consider storing the cell data in HDFS and storing a 
pointer to the data in HBase.
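
A minimal sketch of how a writer-side check based on these limits might look. 
The names (placement, max_cell_bytes, the constants) are hypothetical and are 
not part of YARN or HBase. The 10 MB and 50 MB figures are the rules of thumb 
quoted above, treated here as defaults that a configuration could override, 
and 1 MB is assumed to mean 1024 * 1024 bytes.

```python
MB = 1024 * 1024  # assumption: binary megabytes

# Defaults from the HBase rules of thumb; hypothetical names, assumed to be
# overridable through configuration in a real writer.
DEFAULT_MAX_CELL_BYTES = 10 * MB
DEFAULT_MAX_MOB_CELL_BYTES = 50 * MB


def max_cell_bytes(mob_enabled=False):
    """Return the default upper bound for a single cell value."""
    return DEFAULT_MAX_MOB_CELL_BYTES if mob_enabled else DEFAULT_MAX_CELL_BYTES


def placement(value_size, mob_enabled=False, limit=None):
    """Decide where a value of value_size bytes should be stored.

    Returns "hbase" when the value fits under the limit, otherwise
    "hdfs_pointer", meaning the data goes to HDFS and only a pointer
    to it is written to HBase.
    """
    if value_size < 0:
        raise ValueError("value_size must be non-negative")
    if limit is None:
        limit = max_cell_bytes(mob_enabled)
    return "hbase" if value_size <= limit else "hdfs_pointer"
```

With the defaults, a 20 MB value is redirected to HDFS unless MOB is in use; 
passing an explicit limit models the configurable part of the proposal.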

Aim to have regions sized between 10 and 50 GB.

A typical schema has between 1 and 3 column families per table. HBase tables 
should not be designed to mimic RDBMS tables.

Around 50-100 regions is a good number for a table with 1 or 2 column families. 
Remember that a region is a contiguous segment of a column family.

Keep your column family names as short as possible. The column family names are 
stored for every value (ignoring prefix encoding). They should not be 
self-documenting and descriptive like in a typical RDBMS.
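
A rough illustration of why short column family names matter. This is a 
back-of-the-envelope sketch only: it counts the family-name bytes repeated 
once per stored value and ignores prefix encoding and compression, both of 
which reduce the real cost. The function name and the example family names 
are hypothetical.

```python
def family_name_overhead_bytes(num_cells, family_name):
    """Bytes spent repeating the column family name once per cell.

    Ignores prefix encoding and compression, so this is an upper bound.
    """
    return num_cells * len(family_name.encode("utf-8"))


cells = 1_000_000_000
long_name = family_name_overhead_bytes(cells, "information")  # 11 bytes per cell
short_name = family_name_overhead_bytes(cells, "i")           # 1 byte per cell
savings = long_name - short_name  # 10_000_000_000 bytes before encoding
```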

About medium-sized objects (https://hbase.apache.org/book.html#hbase_mob):

While HBase can technically handle binary objects with cells that are larger 
than 100 KB in size, HBase’s normal read and write paths are optimized for 
values smaller than 100 KB in size. When HBase deals with large numbers of 
objects over this threshold, referred to here as medium objects, or MOBs, 
performance is degraded due to write amplification caused by splits and 
compactions. When using MOBs, ideally your objects will be between 100 KB and 
10 MB. HBase FIX_VERSION_NUMBER adds support for better managing large numbers 
of MOBs while maintaining performance, consistency, and low operational 
overhead. MOB support is provided by the work done in HBASE-11339. To take 
advantage of MOB, you need to use HFile version 3. Optionally, configure the 
MOB file reader’s cache settings for each RegionServer (see Configuring the MOB 
Cache), then configure specific columns to hold MOB data. Client code does not 
need to change to take advantage of HBase MOB support. The feature is 
transparent to the client.



> Put in some limit for accepting key-values in hbase writer
> ----------------------------------------------------------
>
>                 Key: YARN-5336
>                 URL: https://issues.apache.org/jira/browse/YARN-5336
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Vrushali C
>              Labels: YARN-5355
>
> As recommended by [~jrottinghuis] , need to add in some limit (default and 
> configurable) for accepting key values to be written to the backend.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
