[
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120627#comment-15120627
]
Sangjin Lee commented on YARN-3367:
-----------------------------------
I went over the patch in some more detail, and it's definitely much closer. I
have one major (?) suggestion and a few minor ones.
What you have implemented in {{TimelineClientImpl.EntitiesHolder}} is basically
what's already available in the JDK in {{FutureTask}}. What's needed here is a
result-bearing synchronizer along with the ability to run the task. That's
exactly what {{FutureTask}} is. You can define {{EntitiesHolder}} this way to
take advantage of it:
{code}
private final class EntitiesHolder extends FutureTask<Void> {
  private final
      org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntities entities;
  private final boolean isSync;

  EntitiesHolder(
      final
      org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntities entities,
      final boolean isSync) {
    super(new Callable<Void>() {
      // publishEntities()
      public Void call() throws Exception {
        MultivaluedMap<String, String> params = new MultivaluedMapImpl();
        params.add("appid", contextAppId.toString());
        params.add("async", Boolean.toString(!isSync));
        putObjects("entities", params, entities);
        return null;
      }
    });
    this.entities = entities;
    this.isSync = isSync;
  }

  public boolean isSync() {
    return isSync;
  }

  public org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntities
      getEntities() {
    return entities;
  }
}
{code}
If you have a {{FutureTask}}, when you need to run {{publishEntities()}} you
can simply call {{run()}}. For example,
{code}
if (entitiesHolder != null) {
  if (entitiesHolder.isSync()) {
    entitiesHolder.run();
  } else {
{code}
When you need to join with the result of that run on another thread (either its
normal completion or an exception), you can simply call {{get()}}. You can
catch {{ExecutionException}} and look at its cause to get the real exception:
{code}
// In a sync call we need to wait until it is published, and if there is any
// error, throw it back
try {
  entitiesHolder.get();
} catch (ExecutionException e) {
  throw new YarnException(
      "Failed while adding entity to the queue for publishing",
      e.getCause());
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
  throw new YarnException(
      "Failed while adding entity to the queue for publishing", e);
}
{code}
Doing it this way removes a significant amount of code and results in much
simpler and more robust code. I have saved the prototype, and I'd be happy to
share the diff. Let me know.
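For illustration only, the {{run()}}/{{get()}} hand-off described above can be sketched in isolation from the patch. This is a minimal sketch, not the prototype: {{FutureTaskJoinSketch}}, {{newTask()}} and {{runAndJoin()}} are hypothetical names, and the {{IOException}} merely stands in for a failed publish. Only {{FutureTask}}, {{Callable}} and {{ExecutionException}} are real JDK types.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical demo class; not part of TimelineClientImpl.
final class FutureTaskJoinSketch {

  /** Builds a task whose body either succeeds or throws (stand-in for a publish). */
  static FutureTask<Void> newTask(final boolean fail) {
    return new FutureTask<Void>(new Callable<Void>() {
      @Override
      public Void call() throws Exception {
        if (fail) {
          throw new IOException("publish failed");
        }
        return null;
      }
    });
  }

  /**
   * Runs the task on a worker thread and joins on the calling thread.
   * Returns "ok" on normal completion, or a description of the real cause.
   */
  static String runAndJoin(FutureTask<Void> task) {
    new Thread(task).start(); // FutureTask is a Runnable; the worker calls run()
    try {
      task.get(); // blocks until run() has completed or thrown
      return "ok";
    } catch (ExecutionException e) {
      // The exception thrown inside call() is the cause.
      Throwable cause = e.getCause();
      return cause.getClass().getSimpleName() + ": " + cause.getMessage();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore the interrupt status
      return "interrupted";
    }
  }

  public static void main(String[] args) {
    System.out.println(runAndJoin(newTask(false)));
    System.out.println(runAndJoin(newTask(true)));
  }
}
```

The point of the sketch is that the waiting thread needs no extra lock, flag, or exception field of its own: completion and failure both travel through {{get()}}.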
On to the other feedback:
(TimelineClientImpl.java)
- l.877: I would get rid of this constructor, and simply pass in the boolean in
both cases
- l.921: Why only 2 calls? Also, should this be configurable?
- l.923: This doesn't need to be {{ScheduledExecutorService}}.
{{ExecutorService}} is all we need. Also, let's rename it to {{executor}}
instead of {{scheduler}}.
- l.925: {{stopped}} is superfluous because the underlying {{ExecutorService}}
manages the stoppage via interrupt. Let's remove it.
- l.931: the name {{createThread()}} is not quite accurate; how about
{{createRunnable()}}?
- l.1007: if you want to check the status, you can replace {{stopped}} with
{{scheduler.isShutdown()}}
- l.1025: restore the interrupt status (via
{{Thread.currentThread().interrupt()}})
- l.1040: same
- l.1047: {{Executors.newSingleThreadExecutor()}}
- l.1048: {{executor.execute(createRunnable());}}
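Taken together, the executor-related suggestions above might look roughly like the following. This is a sketch under stated assumptions rather than the actual patch: {{DispatcherSketch}}, {{dispatch()}}, {{stop()}} and {{demo()}} are hypothetical names, and the queued tasks just record an integer instead of making a REST call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical dispatcher; the names here are illustrative only.
final class DispatcherSketch {
  private final BlockingQueue<FutureTask<Void>> queue =
      new LinkedBlockingQueue<FutureTask<Void>>();
  private ExecutorService executor;

  /** The loop body; no separate "stopped" flag, the interrupt ends it. */
  private Runnable createRunnable() {
    return new Runnable() {
      @Override
      public void run() {
        try {
          while (!Thread.currentThread().isInterrupted()) {
            queue.take().run(); // run each queued task in FIFO order
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // restore status and exit
        }
      }
    };
  }

  void start() {
    executor = Executors.newSingleThreadExecutor();
    executor.execute(createRunnable());
  }

  void dispatch(FutureTask<Void> task) {
    queue.add(task); // unbounded queue, so add() does not block
  }

  void stop() {
    executor.shutdownNow(); // interrupts the worker blocked in take()
  }

  boolean isStopped() {
    return executor.isShutdown();
  }

  /** Queues three tasks, waits for the last one, and returns the delivery order. */
  static List<Integer> demo() {
    final List<Integer> delivered =
        Collections.synchronizedList(new ArrayList<Integer>());
    DispatcherSketch dispatcher = new DispatcherSketch();
    dispatcher.start();
    try {
      FutureTask<Void> last = null;
      for (int i = 1; i <= 3; i++) {
        final int id = i;
        last = new FutureTask<Void>(new Callable<Void>() {
          @Override
          public Void call() {
            delivered.add(id);
            return null;
          }
        });
        dispatcher.dispatch(last);
      }
      last.get(); // a single worker runs tasks in order, so all three are done
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      dispatcher.stop();
    }
    return new ArrayList<Integer>(delivered);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [1, 2, 3]
  }
}
```

Because a single worker thread drains a FIFO queue, ordering is preserved without any extra coordination, and {{isShutdown()}} is available wherever the status needs to be checked.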
> Replace starting a separate thread for post entity with event loop in
> TimelineClient
> ------------------------------------------------------------------------------------
>
> Key: YARN-3367
> URL: https://issues.apache.org/jira/browse/YARN-3367
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Junping Du
> Assignee: Naganarasimha G R
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-3367-YARN-2928.v1.005.patch,
> YARN-3367-YARN-2928.v1.006.patch, YARN-3367-feature-YARN-2928.003.patch,
> YARN-3367-feature-YARN-2928.v1.002.patch,
> YARN-3367-feature-YARN-2928.v1.004.patch, YARN-3367.YARN-2928.001.patch
>
>
> Since YARN-3039, TimelineClient loops until collectorServiceAddress is
> ready before posting any entity. Consumers of TimelineClient (such as the
> AM) start a new thread for each call to avoid a potential deadlock in the
> main thread. This approach has at least 3 major defects:
> 1. The consumer needs additional code to wrap a thread around each call to
> putEntities() in TimelineClient.
> 2. It consumes many threads unnecessarily.
> 3. Events can be delivered out of order because each posting thread leaves
> the waiting loop at an arbitrary time.
> We should have something like an event loop on the TimelineClient side:
> putEntities() only puts the entities into a queue, and a separate thread
> delivers the queued entities to the collector via REST calls.