Rohith Sharma K S commented on YARN-3813:

Thanks [~sunilg] for going through the design doc and for the feedback.
bq. BasicAppMonitoringManager which can keep an entry of <appId, 
Basically, we mean that the auxiliary service is a separate service that starts a new 
thread to monitor running applications, i.e. very similar to any other service 
in the RM such as ZKRMStateStore/ClientRMService. 
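
As a rough illustration of such a monitoring service (this is not code from the design doc; the class name AppTimeoutMonitor, its methods, and the appId-to-deadline map are all hypothetical), a single background thread could periodically scan the registered applications and hand the overdue ones to a callback:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of a timeout monitor; names are assumptions, not YARN APIs. */
final class AppTimeoutMonitor {
    // appId -> absolute deadline in milliseconds (assumed bookkeeping format).
    private final Map<String, Long> deadlines = new ConcurrentHashMap<>();
    // Invoked once for each application whose deadline has passed.
    private final Consumer<String> onTimeout;
    // One daemon thread runs the periodic scan.
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "app-timeout-monitor");
            t.setDaemon(true);
            return t;
        });

    AppTimeoutMonitor(Consumer<String> onTimeout) {
        this.onTimeout = onTimeout;
    }

    /** Start tracking an application submitted at submitTimeMillis. */
    void register(String appId, long submitTimeMillis, long timeoutMillis) {
        deadlines.put(appId, submitTimeMillis + timeoutMillis);
    }

    /** Stop tracking an application, e.g. because it finished on its own. */
    void unregister(String appId) {
        deadlines.remove(appId);
    }

    /** One scan: removes and reports every app whose deadline is <= nowMillis. */
    List<String> checkOnce(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis >= entry.getValue()) {
                it.remove();
                expired.add(entry.getKey());
                onTimeout.accept(entry.getKey());
            }
        }
        return expired;
    }

    /** Run checkOnce on the background thread every intervalMillis. */
    void start(long intervalMillis) {
        scheduler.scheduleAtFixedRate(
            () -> checkOnce(System.currentTimeMillis()),
            intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

In the RM, the callback would presumably trigger a kill of the application; here it is left as a plain Consumer so the sketch stays self-contained.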

bq. Could we have a new TIMEOUT event in RMAppImpl for this. In that case, we 
may not need a flag.
Yes, having a separate TIMEOUT event and TIMEOUT state is a good approach and 
is the other option. Initially we considered adding a new TIMEOUT state, but that requires 
very large changes across all the modules. To keep it simple, we manage it in the 
KILLED state with a proper diagnostic message and a new flag. The new flag 
identifies whether the app timed out or not, which is required for calculating 
metrics and for handling RM restart.
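
To make the KILLED-plus-flag idea concrete, here is a minimal sketch (the AppOutcome class, its fields, and the diagnostic wording are hypothetical and not taken from RMAppImpl). The final state stays KILLED, and a boolean records whether the kill came from a timeout, so metrics can tell the two cases apart:

```java
import java.util.List;

/** Hypothetical record of how an application ended; not a YARN class. */
final class AppOutcome {
    enum State { FINISHED, FAILED, KILLED }

    final String appId;
    final State state;
    // Assumed flag: true only when the app was killed because it timed out.
    // It would need to be stored with the app state to survive an RM restart.
    final boolean timedOut;
    final String diagnostics;

    private AppOutcome(String appId, State state, boolean timedOut, String diagnostics) {
        this.appId = appId;
        this.state = state;
        this.timedOut = timedOut;
        this.diagnostics = diagnostics;
    }

    static AppOutcome killedByUser(String appId) {
        return new AppOutcome(appId, State.KILLED, false, "Application killed by user.");
    }

    static AppOutcome killedByTimeout(String appId, long timeoutMillis) {
        return new AppOutcome(appId, State.KILLED, true,
            "Application killed: did not complete within " + timeoutMillis + " ms.");
    }

    /** Metrics helper: counts KILLED apps that carry the timeout flag. */
    static long countTimedOut(List<AppOutcome> outcomes) {
        return outcomes.stream()
            .filter(o -> o.state == State.KILLED && o.timedOut)
            .count();
    }
}
```

With this shape, no new state is needed in the state machine; the cost is that every consumer interested in timeouts has to check the flag in addition to the state.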

> Support Application timeout feature in YARN. 
> ---------------------------------------------
>                 Key: YARN-3813
>                 URL: https://issues.apache.org/jira/browse/YARN-3813
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: scheduler
>            Reporter: nijel
>         Attachments: YARN Application Timeout .pdf
> It will be useful to support application timeout in YARN. Some use cases do 
> not care about the output of an application if the application has not 
> completed within a specific time. 
> *Background:*
> The requirement is to show the CDR statistics of the last few minutes, say 
> every 5 minutes. The same job will run continuously with a different dataset,
> so one job will be started every 5 minutes. The estimated time for this 
> task is 2 minutes or less. 
> If the application does not complete in the given time, the output is not 
> useful.
> *Proposal*
> So the idea is to support application timeout, in which a timeout parameter is 
> given while submitting the job. 
> Here, the user expects the application to finish (complete or be killed) within the 
> given time.
> One option for us is to move this logic to the application client (which submits the 
> job). 
> But it would be nice if this could be generic logic and made more robust.
> Kindly provide your suggestions/opinions on this feature. If it sounds good, I 
> will update the design doc and prototype patch.

This message was sent by Atlassian JIRA
