[jira] [Updated] (YARN-3595) Performance optimization using connection cache of Phoenix timeline writer

Li Lu (JIRA) Thu, 07 May 2015 22:27:08 -0700

     [ https://issues.apache.org/jira/browse/YARN-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Li Lu updated YARN-3595:
------------------------
    Description: 
The story behind the connection cache in Phoenix timeline storage is a little 
long. In YARN-3033 we planned to have a shared writer layer for all 
collectors in the same collector manager. This lets the collectors reuse a 
single storage layer connection, which suits conventional storage layer 
connections because they are typically heavy-weight. 

Phoenix, on the other hand, implements its own connection interface layer to be 
light-weight but thread-unsafe. To make these connections work with our 
"multiple collector, single writer" model, we're adding a thread-indexed 
connection cache. However, many performance-critical factors are yet to be 
tested. 
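
As a rough illustration of the thread-indexed idea, here is a minimal sketch 
that hands each writer thread its own connection, so a thread-unsafe 
connection is never shared between threads. The class name 
ThreadIndexedConnectionCache and the Supplier-based factory are hypothetical, 
and a plain ConcurrentHashMap stands in for the real cache; this is not the 
actual Phoenix timeline writer code, and it has no size bound or expiry. 

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/**
 * Hypothetical sketch (not the YARN implementation): caches one connection
 * per calling thread so that a thread-unsafe connection is never shared.
 */
final class ThreadIndexedConnectionCache<C> {
  // Keyed by the calling thread; the real cache would also bound size and age.
  private final Map<Thread, C> connections = new ConcurrentHashMap<>();
  // Assumed factory that opens a new connection on demand.
  private final Supplier<C> factory;

  ThreadIndexedConnectionCache(Supplier<C> factory) {
    this.factory = factory;
  }

  /** Returns the calling thread's connection, creating it on first use. */
  C get() {
    return connections.computeIfAbsent(Thread.currentThread(),
        t -> factory.get());
  }

  /** Number of connections currently cached. */
  int size() {
    return connections.size();
  }
}
```

With this shape, repeated calls from one collector thread reuse the same 
connection, while a second thread transparently gets its own. 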

In this JIRA we're tracking performance optimization efforts using this 
connection cache. Previously we had a draft, but there was one implementation 
challenge with cache evictions: there may be races between the Guava cache's 
removal listener calls (which close the connection) and normal references to 
the connection. We need to carefully define how they synchronize. 
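
One possible way to define that synchronization, shown only as a sketch, is 
to wrap each cached connection in a small reference-counted holder: the 
removal listener marks the entry as evicted instead of closing it directly, 
and the connection is closed once the last user releases it. The class 
RefCountedConnection and its method names are hypothetical, and this is just 
one option, not the approach chosen in this JIRA. It relies only on the 
connection being AutoCloseable, which java.sql.Connection is. 

```java
/**
 * Hypothetical sketch: defers close() until no caller is still using the
 * connection, so an eviction cannot close it underneath an active writer.
 */
final class RefCountedConnection<C extends AutoCloseable> {
  private final C connection;
  private int users = 0;            // callers currently holding the connection
  private boolean evicted = false;  // set once the cache has dropped the entry
  private boolean closed = false;   // ensures close() runs at most once

  RefCountedConnection(C connection) {
    this.connection = connection;
  }

  /** Returns the connection, or null if it was evicted and must be re-fetched. */
  synchronized C acquire() {
    if (evicted) {
      return null;
    }
    users++;
    return connection;
  }

  /** Called by a user when it is done; closes if eviction is pending. */
  synchronized void release() throws Exception {
    if (users == 0) {
      throw new IllegalStateException("release() without matching acquire()");
    }
    users--;
    closeIfIdle();
  }

  /** What a removal listener would call instead of closing directly. */
  synchronized void evict() throws Exception {
    evicted = true;
    closeIfIdle();
  }

  synchronized boolean isClosed() {
    return closed;
  }

  // Closes only when the entry is evicted and nobody is using it.
  private void closeIfIdle() throws Exception {
    if (evicted && users == 0 && !closed) {
      closed = true;
      connection.close();
    }
  }
}
```

The trade-off is an extra lock acquisition on every use of a connection, 
which is itself one of the performance factors that would need measuring. 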

Performance-wise, at the very beginning stage we may need to understand:

# whether the current thread-based indexing is an appropriate approach, or 
whether there are better ways to index the connections. 
# the best size for the cache, presumably to be used as the proposed default 
value of a configuration. 
# how long we need to keep a connection in the cache. 

Please feel free to add to this list. 

  was:
The story about the connection cache in Phoenix timeline storage is a little 
bit long. In YARN-3033 we planned to have shared writer layer for all 
collectors in the same collector manager. In this way we can better reuse the 
same heavy-weight storage layer connection, therefore it's more friendly to 
conventional storage layer connections which are typically heavy-weight. 

Phoenix, on the other hand, implements its own connection interface layer to be 
light-weight, thread-unsafe. To make these connections work with our "multiple 
collector, single writer" model, we're adding a thread indexed connection 
cache. However, many performance critical factors are yet to be tested. 

In this JIRA we're tracing performance optimization efforts using this 
connection cache. Currently 


At the very beginning stage we may need to understand:

# If the current, thread-based indexing is an appropriate approach, or we can 
use some better ways to index the connections. 
# the best size of the cache, presumably as the proposed default value of a 
configuration. 
# how long we need to preserve a connection in the cache. 

Please feel free to add this list. 


> Performance optimization using connection cache of Phoenix timeline writer
> --------------------------------------------------------------------------
>
>                 Key: YARN-3595
>                 URL: https://issues.apache.org/jira/browse/YARN-3595
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Li Lu
>            Assignee: Li Lu



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
