[
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Naganarasimha G R updated YARN-3367:
------------------------------------
Attachment: YARN-3367-YARN-2928.v1.010.patch
Thanks [~varun_saxena] for the comments.
bq. I think we can try to drain the queue on stop and process the async events
or some sync event sitting in the queue. We would need to do this before we
call shutdownNow as that will interrupt the thread.
An earlier patch did try to drain the queue, but I missed that in later
patches. IMO we should not wait indefinitely, though, since the server might
be down. So in this patch I use shutdown, which does not stop the live
workers, wait up to 10 seconds, and then exit. If we want something more
sophisticated, we need to introduce additional logic so that stop drains
everything without getting blocked.
Thoughts?
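To make the bounded wait concrete, here is a minimal sketch of that stop sequence using only java.util.concurrent. It is not the code from the attached patch: the class name BoundedDrainStop and the stop() helper are hypothetical, and the timeout is passed in by the caller rather than fixed at 10 seconds.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of TimelineClient or the patch. */
public class BoundedDrainStop {

  /**
   * Lets queued and running tasks finish for at most the given timeout,
   * then forces termination. Returns the number of queued tasks that
   * were dropped without ever starting.
   */
  public static int stop(ExecutorService executor, long timeout, TimeUnit unit) {
    // shutdown() rejects new tasks but does not interrupt live workers,
    // so work that is already queued keeps draining.
    executor.shutdown();
    try {
      if (executor.awaitTermination(timeout, unit)) {
        return 0; // everything drained within the timeout
      }
    } catch (InterruptedException e) {
      // Preserve the interrupt status and fall through to a forced stop.
      Thread.currentThread().interrupt();
    }
    // Timed out (for example, the server is down): interrupt the workers
    // and collect whatever never got a chance to run.
    List<Runnable> dropped = executor.shutdownNow();
    return dropped.size();
  }
}
```

Calling stop(executor, 10, TimeUnit.SECONDS) gives the behaviour described above: a healthy server lets the queue drain quickly, while an unreachable one delays exit by at most the timeout.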
> Replace starting a separate thread for post entity with event loop in
> TimelineClient
> ------------------------------------------------------------------------------------
>
> Key: YARN-3367
> URL: https://issues.apache.org/jira/browse/YARN-3367
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Junping Du
> Assignee: Naganarasimha G R
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-3367-YARN-2928.v1.005.patch,
> YARN-3367-YARN-2928.v1.006.patch, YARN-3367-YARN-2928.v1.007.patch,
> YARN-3367-YARN-2928.v1.008.patch, YARN-3367-YARN-2928.v1.009.patch,
> YARN-3367-YARN-2928.v1.010.patch, YARN-3367-feature-YARN-2928.003.patch,
> YARN-3367-feature-YARN-2928.v1.002.patch,
> YARN-3367-feature-YARN-2928.v1.004.patch, YARN-3367.YARN-2928.001.patch,
> sjlee-suggestion.patch
>
>
> Since YARN-3039, we have a loop in TimelineClient that waits for
> collectorServiceAddress to be ready before posting any entity. In consumers
> of TimelineClient (like the AM), we start a new thread for each call to avoid
> a potential deadlock in the main thread. This approach has at least 3 major
> defects:
> 1. The consumer needs additional code to wrap a thread around each call to
> putEntities() in TimelineClient.
> 2. It consumes many threads unnecessarily.
> 3. Events can be delivered out of order because each posting thread leaves
> the waiting loop at an arbitrary time.
> We should have something like an event loop on the TimelineClient side:
> putEntities() only puts the entities into a queue, and a separate thread
> delivers the queued entities to the collector via REST calls.
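The event loop proposed in the description could look roughly like the sketch below. It is only an illustration under stated assumptions: EntityDispatcherSketch, putEntity() and the sender callback are hypothetical names, entities are plain strings, and the REST call to the collector is replaced by a java.util.function.Consumer.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical sketch of a single-threaded event loop, not TimelineClient code. */
public class EntityDispatcherSketch {
  // Distinct String instance used as a stop marker; compared by identity.
  private static final String STOP = new String("<stop>");

  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
  private final Thread worker;

  /** The sender stands in for the REST call to the collector (assumption). */
  public EntityDispatcherSketch(Consumer<String> sender) {
    worker = new Thread(() -> {
      try {
        while (true) {
          String entity = queue.take(); // blocks until something is queued
          if (entity == STOP) {
            return; // everything queued before stop() has been sent
          }
          sender.accept(entity);
        }
      } catch (InterruptedException e) {
        // Interrupted while waiting: exit without draining.
      }
    }, "entity-dispatcher");
    worker.setDaemon(true);
    worker.start();
  }

  /** Non-blocking for the caller: only enqueues, so no extra thread is needed. */
  public void putEntity(String entity) {
    queue.add(entity);
  }

  /** Drains what is already queued, waiting at most timeoutMillis. */
  public boolean stop(long timeoutMillis) {
    queue.add(STOP);
    try {
      worker.join(timeoutMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !worker.isAlive();
  }
}
```

Because a single thread takes entities from a FIFO queue, they reach the sender in the order they were put, which addresses defect 3, and callers no longer need a wrapper thread per call.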