[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588202#comment-15588202
 ] 

Rohith Sharma K S commented on YARN-5611:
-----------------------------------------

Summary of the discussion with [~vinodkv], [~vvasudev] and [~sunilg] on 
supporting an update API for application timeouts.

*Current Patch Approach*: the update API is a synchronous call, and its timeout 
parameter is relative to now, i.e. targetTimeoutFromNow (in seconds) for the 
given ApplicationTimeoutTypes. For example, if the user passes 30 seconds for 
an update, the RM calculates (now + 30 seconds) and stores the result in the 
state store. 
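
To make the current behaviour concrete, here is a minimal sketch of the 
RM-side calculation. The class and method names are hypothetical and not taken 
from the patch; only the (now + relative seconds) arithmetic comes from the 
description above. 

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

final class RelativeTimeoutSketch {
    // Hypothetical helper (not from the patch): the RM adds the relative
    // timeout to its own current time to obtain the absolute expiry.
    static Instant expiryFromNow(Clock rmClock, long targetTimeoutFromNowSeconds) {
        return rmClock.instant().plusSeconds(targetTimeoutFromNowSeconds);
    }

    public static void main(String[] args) {
        // A fixed clock stands in for the RM's clock so the result is repeatable.
        Clock rmClock = Clock.fixed(Instant.parse("2016-10-19T06:00:00Z"), ZoneOffset.UTC);
        System.out.println(expiryFromNow(rmClock, 30)); // 2016-10-19T06:00:30Z
    }
}
```

Because the result depends on when the RM handles the call, sending the same 
relative value again later yields a different expiry. 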

*Concern on update API approach* 
# A synchronous call is too expensive. If the state store has a problem, other 
RPC calls are blocked.
So the synchronous call needs to be changed to an asynchronous call. Note that 
updatePriority is also a blocking call, which needs to be solved once this is 
supported on the server side. 

*Challenges*
# The YARN client needs to adopt a polling mechanism against the RM to confirm 
its update. The first important thing to consider for polling is an 
*idempotent* API. So, to the user, the *input remains relative*, but the client 
needs to convert the relative time to a targeted time and send the request to 
the RM with that targeted time. The RM takes the requested targeted timeout 
as-is and uses it. For example, if the user inputs a timeout of 30 seconds, the 
YARN client changes it to targetedTimeout=(now + 30 seconds) and sends it to 
the RM. To the RM, an update request always carries a targeted time.
# Multiple clients update the timeout for the same application. The issue here 
is how a YARN client gets to know that its own update was confirmed. How long 
does the YARN client need to retry/wait before bailing out? 
# *More importantly, the timezone issue* if the YARN client converts the time 
as explained in the first challenge. Say the RM is running in IST and the YARN 
client is running in the PDT timezone. Even though the YARN client converts the 
user input to a targeted time and sends it to the RM, the client and server 
timezones differ, so the targetedTimeout value is totally different on each 
side.
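
The conversion in the first challenge and the timezone concern in the third 
can be illustrated with a small sketch. It assumes the targeted time is sent 
as UTC-based epoch milliseconds, which is an assumption and not something the 
patch specifies; the class and method names are hypothetical. 

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class TargetTimeSketch {
    // Hypothetical client-side conversion: relative seconds -> absolute
    // target, expressed as milliseconds since the epoch (UTC-based).
    static long toTargetEpochMillis(Instant now, long relativeSeconds) {
        return now.plusSeconds(relativeSeconds).toEpochMilli();
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2016-10-19T06:00:00Z");
        long target = toTargetEpochMillis(now, 30);

        // The same target rendered as wall-clock time in two zones.
        ZonedDateTime inIst = Instant.ofEpochMilli(target).atZone(ZoneId.of("Asia/Kolkata"));
        ZonedDateTime inPdt = Instant.ofEpochMilli(target).atZone(ZoneId.of("America/Los_Angeles"));

        // The wall-clock readings differ between the two zones ...
        System.out.println(inIst.toLocalDateTime());
        System.out.println(inPdt.toLocalDateTime());
        // ... but both describe the same instant.
        System.out.println(inIst.toInstant().equals(inPdt.toInstant())); // true
    }
}
```

Resending the same target value asks for the same expiry each time, which is 
what makes the request safe to repeat while polling. A wall-clock time sent 
without its zone would not have this property. 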


*Some thoughts on solving the challenges*
# Server (RM)
## Always accept the targeted timeout in UTC and store it in the state store in 
UTC only. For RM monitoring, convert the UTC timestamp to the local machine's 
timezone and use it. This solves the timezone issue on the RM side.
## Once the RM receives an update request, it should first store the new value 
in the state store and then update the monitoring service. 
## The RM should check whether the targeted timeout is the same as the 
requested time. If yes, return a boolean flag indicating that the update 
succeeded. Always keep a copy of the updated timeout values in RMApp and return 
these values when somebody queries for the timeout values. 
# YarnClient
## Convert the user's input timeout to a targeted timeout in UTC and send the 
update request to the RM.
## Keep sending the update request in a loop until the response is true, 
indicating update confirmation.
## Bail out and fail the client operation after some configured value. 
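
A rough sketch of the client loop and the RM-side check follows. Everything 
here is hypothetical: TimeoutUpdater, FakeRm and updateWithRetry are 
illustrative names, not YARN APIs, and the asynchronous state-store write is 
only simulated by a call counter. 

```java
final class UpdateTimeoutRetrySketch {
    /** Hypothetical stand-in for the RM update call; returns true once confirmed. */
    interface TimeoutUpdater {
        boolean updateTimeout(long targetEpochMillis);
    }

    /** In-memory fake RM: the "store" completes after a fixed number of calls. */
    static final class FakeRm implements TimeoutUpdater {
        private final int callsUntilStored;
        private int calls;
        private long storedTarget = Long.MIN_VALUE;

        FakeRm(int callsUntilStored) {
            this.callsUntilStored = callsUntilStored;
        }

        @Override
        public boolean updateTimeout(long targetEpochMillis) {
            calls++;
            if (calls >= callsUntilStored) {
                storedTarget = targetEpochMillis; // simulated store completion
            }
            // Confirm only when the stored value equals the requested target.
            return storedTarget == targetEpochMillis;
        }
    }

    /** Resend the same absolute target until confirmed or attempts run out. */
    static boolean updateWithRetry(TimeoutUpdater rm, long targetEpochMillis,
                                   int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (rm.updateTimeout(targetEpochMillis)) {
                return true; // update confirmed
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false; // bail out: the caller should fail the operation
    }

    public static void main(String[] args) {
        long target = 1_000_000L; // arbitrary absolute target for illustration
        System.out.println(updateWithRetry(new FakeRm(3), target, 5, 10));  // true
        System.out.println(updateWithRetry(new FakeRm(10), target, 5, 10)); // false
    }
}
```

The loop is only safe because every attempt carries the same absolute target. 
The sketch does not address the multiple-client case, where another client's 
update could replace the stored value between attempts. 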


Any comments on the approach are welcome. 

> Provide an API to update lifetime of an application.
> ----------------------------------------------------
>
>                 Key: YARN-5611
>                 URL: https://issues.apache.org/jira/browse/YARN-5611
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: 0001-YARN-5611.patch, 0002-YARN-5611.patch, 
> 0003-YARN-5611.patch, YARN-5611.v0.patch
>
>
> YARN-4205 monitors the lifetime of an application if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
