[ https://issues.apache.org/jira/browse/YARN-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947656#comment-15947656 ]
Jason Lowe commented on YARN-6401:
----------------------------------

Ah, sorry. I was thinking it was ignoring SIGTERM and thus not cleaning up because it would get killed by the subsequent SIGKILL. Instead it sounds like it _is_ responding to SIGTERM but not cleaning up. Isn't that a bit odd? The whole point of SIGTERM is to request a shutdown of the process rather than forcing one.

I'm not an httpd expert, so I started digging into the docs to try to understand why it wouldn't do something sane with TERM but does with a non-standard signal like WINCH. Turns out it does handle TERM, but it does so aggressively, such that in-progress requests may be interrupted/canceled. WINCH only advises the httpd child processes to exit, which sounds like active requests could continue to be processed but the listen port is no longer monitored, so no new requests will be processed.

What worries me here is that we can still end up with a disorderly shutdown even if YARN sent WINCH instead of TERM. The default delay between the TERM and KILL signals is relatively short, which is why the processing httpd does for TERM seems more appropriate here. If a request could take hundreds of milliseconds to process, then the KILL is going to arrive too soon after the WINCH signal unless the delay between the two signals is widened. However, that delay is not a per-app setting, and making it a per-app setting would cause a DoS problem. Containers are often killed because YARN needs the container to leave in a timely manner (e.g.: container running beyond limits, preemption, etc.).

So I still think this is something better handled by the application framework (in this case Slider) rather than YARN.

MapReduce has a similar example. MapReduce jobs can be killed via YARN, but it's harsh and things are often lost when this occurs.
That's why the {{mapred job -kill}} command first tries to kill the job by contacting the AM and requesting it to do an orderly shutdown outside of YARN, and only falls back on YARN to terminate the containers if the job is unresponsive to the kill request.

I think the same thing applies here. If we really want an orderly shutdown of httpd so we won't kill outstanding requests (even if they can take a while), then Slider (or some layer on top of Slider) should support sending the WINCH signals to the containers for the app, and then the app can terminate when all containers have completed their shutdown. Then the application can implement an arbitrary, application-specific shutdown sequence and timing. If YARN needs to do the killing directly, then we cannot wait an arbitrary amount of time for the app to clean up and shut down gracefully.

I think YARN will still need some support to send the WINCH signal in either case. Currently containers can be sent signals after YARN-1897, but it's only a restricted subset that can be translated cross-platform. That would need to be extended to support more arbitrary signals like WINCH.

> terminating signal should be able to specify per application to support graceful-stop
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-6401
>                 URL: https://issues.apache.org/jira/browse/YARN-6401
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: kyungwan nam
>
> When stopping a container, YARN first sends SIGTERM to the process.
> After a while, it sends SIGKILL if the process is still alive.
> This sequence is always the same for any application.
> But for a graceful stop, sometimes a different signal needs to be sent instead of SIGTERM.
> For instance, if apache httpd is running on Slider, SIGWINCH should be sent to stop it gracefully.
> The way to stop gracefully depends on the application.
> It would be good if we could define the terminating signal per application.
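
The escalation discussed in the comment above (an advisory signal such as WINCH first, then TERM, then KILL, each after a bounded grace period) can be sketched roughly as follows. This is only an illustration in plain Python for POSIX systems, not YARN or Slider code: the name {{graceful_stop}}, the {{steps}} parameter, and the grace periods used in the usage note are all hypothetical.

```python
import signal
import subprocess


def graceful_stop(proc, steps):
    """Escalate through (signal, grace_seconds) steps, ending with SIGKILL.

    Hypothetical helper for illustration only. Returns the last signal sent
    before the process was seen to have exited, or None if the process had
    already exited before any signal was sent.
    """
    last_sent = None
    for sig, grace in steps:
        if proc.poll() is not None:
            return last_sent  # exited on its own or after an earlier signal
        proc.send_signal(sig)
        last_sent = sig
        try:
            proc.wait(timeout=grace)
            return sig
        except subprocess.TimeoutExpired:
            pass  # still alive: fall through to the next, harsher signal
    if proc.poll() is not None:
        return last_sent
    proc.kill()  # last resort: SIGKILL cannot be caught or ignored
    proc.wait()
    return signal.SIGKILL
```

An application-level controller might call it as {{graceful_stop(proc, [(signal.SIGWINCH, 30.0), (signal.SIGTERM, 5.0)])}}, where the 30 and 5 second grace periods are made-up values. The point of the sketch is that the sequence and timing live with the application, which is free to wait as long as it likes, while a platform-level kill has to keep the overall delay short.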