Chris Douglas commented on YARN-3784:

- The docs for the timeout don't state its units
- {{FiCaSchedulerApp}} contains many whitespace-only changes
- Collapse the nested if into a single condition joined with {{&&}} at:
+    if (this.preemptionTimeout != 0) {
+      if (timeout > this.preemptionTimeout) {
- Would it be possible to test more than that the reported timeout is non-zero? 
If this used a {{Clock}} instead of calling {{System.currentTimeMillis}} 
directly, the unit test would be easier to write.
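
The {{&&}} and {{Clock}} suggestions above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the code from the patch: {{MillisClock}}, {{ManualClock}}, and {{PreemptionTimeoutTracker}} are hypothetical names, and milliseconds are assumed as the unit.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a clock abstraction; returns milliseconds. */
interface MillisClock {
  long getTime();
}

/** Test clock whose time only moves when the test advances it. */
final class ManualClock implements MillisClock {
  private final AtomicLong now = new AtomicLong();

  @Override
  public long getTime() {
    return now.get();
  }

  void advance(long millis) {
    now.addAndGet(millis);
  }
}

/** Hypothetical holder for a preemption timeout; not the class from the patch. */
final class PreemptionTimeoutTracker {
  private final MillisClock clock;
  private final long markedAt;          // clock reading when preemption was signalled
  private final long preemptionTimeout; // milliseconds (assumed unit); 0 means "not set"

  PreemptionTimeoutTracker(MillisClock clock, long preemptionTimeoutMs) {
    this.clock = clock;
    this.markedAt = clock.getTime();
    this.preemptionTimeout = preemptionTimeoutMs;
  }

  /** The nested if from the review, collapsed into one condition with &&. */
  boolean exceedsConfigured(long timeout) {
    return this.preemptionTimeout != 0 && timeout > this.preemptionTimeout;
  }

  /** Time left before the timeout expires, never negative. */
  long remainingMillis() {
    long elapsed = clock.getTime() - markedAt;
    return Math.max(0L, preemptionTimeout - elapsed);
  }
}
```

With the clock injected, a unit test can advance time by hand and assert the exact remaining value rather than only checking that it is non-zero.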

If containers are preempted for multiple causes (e.g., over-capacity, NM 
decommission), then the time to preempt could vary widely. The 
{{ProportionalCapacityPreemptionPolicy}} also limits the preempted capacity per 
round, so a single global timeout will be very pessimistic. Would it make sense 
to change {{timeout}} to {{nextkill}}? More general solutions would be 
significantly more work.
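
One possible reading of {{nextkill}} is "time until the earliest pending kill" rather than a single global timeout. The sketch below assumes that reading. The {{NextKill}} class, the map of container IDs to kill deadlines, and the millisecond unit are all hypothetical and do not come from the patch.

```java
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical helper; assumes each container carries its own kill deadline. */
final class NextKill {
  private NextKill() {}

  /**
   * Milliseconds from nowMs until the earliest kill deadline, clamped at zero.
   * Returns an empty value when no container is pending a kill.
   */
  static OptionalLong millisUntilNextKill(Map<String, Long> killDeadlinesMs, long nowMs) {
    OptionalLong earliest = killDeadlinesMs.values().stream()
        .mapToLong(Long::longValue)
        .min();
    if (!earliest.isPresent()) {
      return OptionalLong.empty();
    }
    return OptionalLong.of(Math.max(0L, earliest.getAsLong() - nowMs));
  }
}
```

Under this reading the AM learns only when the next container will be killed, which avoids reporting one worst-case figure for containers preempted for different reasons.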

> Indicate preemption timeout along with the list of containers to AM 
> (preemption message)
> ---------------------------------------------------------------------------------------
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch
> Currently, during preemption, the AM is notified with a list of containers 
> that are marked for preemption. This change also sends a timeout duration 
> along with the container list, so that the AM knows how much time it has to 
> shut down its containers gracefully (assuming a preemption policy is loaded 
> in the AM).
> This will help in NM decommissioning scenarios, where an NM is 
> decommissioned after a timeout (which also kills the containers on it). The 
> timeout tells the AM that the RM can forcefully kill those containers once 
> the timeout expires.
