[
https://issues.apache.org/jira/browse/YARN-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618545#comment-14618545
]
Sunil G commented on YARN-3813:
-------------------------------
Hi [~nijel]
Thanks for sharing the draft. I have a couple of doubts.
- bq. Add a new auxiliary service RMAppTimeOutService to track the running
applications and invoke the kill action.
I have a suggestion here. We could have a BasicAppMonitoringManager that keeps
an entry of <appId, app.getSubmissionTime> for each application.
An AppMonitoringManager interface can expose APIs such as addAppMonitoringInfo
and removeAppMonitoringInfo. In the BasicAppMonitoringManager implementation, a
timer task can monitor the entries registered via addAppMonitoringInfo at app
submission time. If any app times out, it can raise a TIMEOUT event to
RMAppImpl.
- bq. Add a flag in RMApp to identify the timed out application. This is for
metric purpose.
Could we have a new TIMEOUT event in RMAppImpl for this? In that case, we may
not need a flag.
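
To make the suggestion above more concrete, here is a rough, standalone sketch
of the monitoring manager. It is only an illustration and not existing YARN
code: application ids are plain strings, a single timeout value applies to all
apps, and a Consumer callback stands in for raising a TIMEOUT event to
RMAppImpl. The names checkTimeouts, start, stop, timeoutMs and onTimeout are
hypothetical; only AppMonitoringManager, BasicAppMonitoringManager,
addAppMonitoringInfo and removeAppMonitoringInfo come from the comment itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical interface; method names follow the suggestion above. */
interface AppMonitoringManager {
  void addAppMonitoringInfo(String appId, long submissionTimeMs);

  void removeAppMonitoringInfo(String appId);
}

/** Sketch only: tracks <appId, submissionTime> and reports apps that time out. */
class BasicAppMonitoringManager implements AppMonitoringManager {
  private final Map<String, Long> submissionTimes = new ConcurrentHashMap<>();
  private final long timeoutMs;             // assumption: one timeout for all apps
  private final Consumer<String> onTimeout; // stand-in for a TIMEOUT event to RMAppImpl
  private ScheduledExecutorService timer;

  BasicAppMonitoringManager(long timeoutMs, Consumer<String> onTimeout) {
    this.timeoutMs = timeoutMs;
    this.onTimeout = onTimeout;
  }

  @Override
  public void addAppMonitoringInfo(String appId, long submissionTimeMs) {
    submissionTimes.put(appId, submissionTimeMs);
  }

  @Override
  public void removeAppMonitoringInfo(String appId) {
    submissionTimes.remove(appId);
  }

  /** Starts the periodic timer task that scans the registered entries. */
  synchronized void start(long checkIntervalMs) {
    if (timer != null) {
      return; // already running
    }
    timer = Executors.newSingleThreadScheduledExecutor();
    timer.scheduleAtFixedRate(
        () -> checkTimeouts(System.currentTimeMillis()),
        checkIntervalMs, checkIntervalMs, TimeUnit.MILLISECONDS);
  }

  synchronized void stop() {
    if (timer != null) {
      timer.shutdownNow();
      timer = null;
    }
  }

  /** One monitoring pass; takes the current time as a parameter so it can be tested. */
  void checkTimeouts(long nowMs) {
    for (Map.Entry<String, Long> entry : submissionTimes.entrySet()) {
      boolean expired = nowMs - entry.getValue() >= timeoutMs;
      // remove(key, value) succeeds at most once, so each app is reported once.
      if (expired && submissionTimes.remove(entry.getKey(), entry.getValue())) {
        onTimeout.accept(entry.getKey());
      }
    }
  }
}
```

With something like this, the submission path would call addAppMonitoringInfo,
the completion path would call removeAppMonitoringInfo, and the callback would
be the place to dispatch the TIMEOUT event, so RMApp would not need a separate
flag.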
> Support Application timeout feature in YARN.
> ---------------------------------------------
>
> Key: YARN-3813
> URL: https://issues.apache.org/jira/browse/YARN-3813
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: scheduler
> Reporter: nijel
> Attachments: YARN Application Timeout .pdf
>
>
> It will be useful to support Application Timeout in YARN. In some use cases,
> the output of an application is of no interest if the application does not
> complete within a specific time.
> *Background:*
> The requirement is to show the CDR statistics of the last few minutes, say
> every 5 minutes. The same job runs continuously with a different dataset,
> so one job is started every 5 minutes. The estimated time for this
> task is 2 minutes or less.
> If the application does not complete in the given time, the output is not
> useful.
> *Proposal*
> So the idea is to support an application timeout, where a timeout parameter
> is given while submitting the job.
> Here, the user expects the application to finish (complete or be killed)
> within the given time.
> One option for us is to move this logic to the application client (which
> submits the job).
> But it would be nice if this could be generic logic, which would make it
> more robust.
> Kindly provide your suggestions/opinions on this feature. If it sounds good, I
> will update the design doc and prototype patch.