[ 
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135321#comment-15135321
 ] 

Sangjin Lee commented on YARN-3367:
-----------------------------------

You're not seeing that because 
{{TestV2TimelineClient#putObjects()}} swallows the interrupt status:

{code}
      if (sleepBeforeReturn) {
        try {
          Thread.sleep(TIME_TO_SLEEP);
        } catch (InterruptedException e) {
          // do nothing, its a test code
        }
      }
{code}

For the test to work, it should at least restore the interrupt by calling 
{{Thread.currentThread().interrupt()}} in the catch block. Otherwise, the 
code becomes un-interruptible. With that change, the check for 
{{Thread.currentThread().isInterrupted()}} makes the test pass correctly 
(you'll see the right log).

In general, the thread becomes un-interruptible if any operation catches and 
handles an InterruptedException without restoring the interrupt.
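
As a rough illustration of the point above, here is a minimal sketch of a sleep that restores the interrupt. The class and method names ({{InterruptRestoreSketch}}, {{sleepRestoringInterrupt}}, {{interruptSurvivesSleep}}) are hypothetical stand-ins and not the actual test code; only the {{Thread}} and {{CountDownLatch}} calls are real JDK APIs.

```java
import java.util.concurrent.CountDownLatch;

final class InterruptRestoreSketch {

    // Hypothetical stand-in for the sleeping branch of the test's putObjects().
    static void sleepRestoringInterrupt(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Thread.sleep() clears the flag when it throws, so set it again
            // to let callers further up the stack observe the interrupt.
            Thread.currentThread().interrupt();
        }
    }

    // Interrupts a worker that is sleeping (or about to sleep) and reports
    // whether the worker still sees its interrupt flag afterwards.
    static boolean interruptSurvivesSleep() {
        final boolean[] seen = new boolean[1];
        final CountDownLatch started = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            started.countDown();
            sleepRestoringInterrupt(60_000);
            seen[0] = Thread.currentThread().isInterrupted();
        });
        worker.start();
        try {
            started.await();   // make sure the worker is running first
            worker.interrupt();
            worker.join();     // join() makes the write to seen[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("caller interrupted while waiting", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("interrupt visible after sleep: " + interruptSurvivesSleep());
    }
}
```

With an empty catch block instead, the {{isInterrupted()}} check after the sleep would see a cleared flag, which matches the behavior described above.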

Another point: I think you want to restore the previous code that waits for 
the executor to stop completely. Otherwise, the stop() method may return 
before the executor has finished, and the draining feature may not get a 
chance to work.
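
To make the suggestion concrete, here is a minimal sketch of a stop path that waits for the executor before returning. This is not the code from the patch: the class name, the {{DRAIN_TIMEOUT_MS}} value, and the helper methods are assumptions for illustration. Only the {{ExecutorService}} calls ({{shutdown()}}, {{awaitTermination()}}, {{shutdownNow()}}) are standard JDK APIs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class DrainOnStopSketch {

    // Hypothetical timeout; the real value would be a design decision.
    static final long DRAIN_TIMEOUT_MS = 2_000;

    // Returns true if all queued work finished within the timeout.
    static boolean stop(ExecutorService executor) {
        // Reject new tasks, but let already-queued tasks keep running.
        executor.shutdown();
        try {
            if (executor.awaitTermination(DRAIN_TIMEOUT_MS, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            // Restore the interrupt so the caller can still see it.
            Thread.currentThread().interrupt();
        }
        // Draining did not finish in time (or we were interrupted): force the stop.
        executor.shutdownNow();
        return false;
    }

    // Queues some trivial tasks, stops the executor, and counts how many ran.
    static int drainedCount(int tasks) {
        AtomicInteger delivered = new AtomicInteger();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        for (int i = 0; i < tasks; i++) {
            executor.execute(delivered::incrementAndGet);
        }
        stop(executor);
        return delivered.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks drained before stop returned: " + drainedCount(5));
    }
}
```

Without the {{awaitTermination()}} step, {{stop()}} could return while tasks are still queued, which is the case where draining would not get a chance to run.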


> Replace starting a separate thread for post entity with event loop in 
> TimelineClient
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-3367
>                 URL: https://issues.apache.org/jira/browse/YARN-3367
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Junping Du
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-3367-YARN-2928.v1.005.patch, 
> YARN-3367-YARN-2928.v1.006.patch, YARN-3367-YARN-2928.v1.007.patch, 
> YARN-3367-YARN-2928.v1.008.patch, YARN-3367-YARN-2928.v1.009.patch, 
> YARN-3367-YARN-2928.v1.010.patch, YARN-3367-YARN-2928.v1.011.patch, 
> YARN-3367-feature-YARN-2928.003.patch, 
> YARN-3367-feature-YARN-2928.v1.002.patch, 
> YARN-3367-feature-YARN-2928.v1.004.patch, YARN-3367.YARN-2928.001.patch, 
> sjlee-suggestion.patch
>
>
> Since YARN-3039, we have added a loop in TimelineClient that waits for 
> collectorServiceAddress to be ready before posting any entity. In consumers of 
> TimelineClient (like the AM), we start a new thread for each call to avoid 
> a potential deadlock in the main thread. This approach has at least 3 major 
> defects:
> 1. The consumer needs additional code to wrap a thread around each call to 
> putEntities() in TimelineClient.
> 2. It consumes many thread resources unnecessarily.
> 3. Events could be delivered out of order because each posting 
> thread leaves the waiting loop at an arbitrary time.
> We should have something like an event loop on the TimelineClient side: 
> putEntities() only puts the related entities into a queue, and a 
> separate thread delivers the queued entities to the collector via REST 
> calls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
