Vinod Kumar Vavilapalli updated YARN-4372:
    Attachment: YARN-4372-20151119.1.txt

Attaching a patch that should fix this.

The patch does the following:
 - Reordered the service creation in MiniYARNCluster to be AHS -> RM -> NMs.
 - Moved URI creation in TimelineClient to service-start so that RM can safely 
start sending events. This should also be fine for existing users of 
TimelineClient, since they cannot do any real work before client.start().
 - Added a simple test in TestMiniYARNCluster to validate the service order.
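
To illustrate the ordering constraint, here is a minimal sketch of a container that starts its child services strictly in the order they were added. It is a toy model only: ServiceOrderSketch, OrderedServices and the service names are hypothetical and do not reflect the actual MiniYARNCluster code in the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of ordered service startup. The class and method names here are
// hypothetical; this is not the MiniYARNCluster or CompositeService code.
class ServiceOrderSketch {

    /** Holds child services and starts them in the order they were added. */
    static final class OrderedServices {
        private final List<String> pending = new ArrayList<>();
        private final List<String> started = new ArrayList<>();

        void add(String serviceName) {
            pending.add(serviceName);
        }

        void start() {
            // A later service is only started after every earlier one.
            for (String serviceName : pending) {
                started.add(serviceName);
            }
        }

        List<String> startedOrder() {
            return new ArrayList<>(started);
        }
    }

    /** Adds services in the order named above: AHS -> RM -> NMs. */
    static List<String> startOrder(int numNodeManagers) {
        OrderedServices cluster = new OrderedServices();
        cluster.add("AHS");
        cluster.add("RM");
        for (int i = 0; i < numNodeManagers; i++) {
            cluster.add("NM-" + i);
        }
        cluster.start();
        return cluster.startedOrder();
    }

    public static void main(String[] args) {
        System.out.println(startOrder(2));
    }
}
```

A test along the lines of the one added to TestMiniYARNCluster could then simply assert that AHS appears before RM, and RM before every NM, in the recorded order.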


Responding to your comments on this JIRA as well as YARN-2859.

bq. the timeline service was having some issue starting the web service though 
the port was correctly set (was not an expert with jersey and guice, hence had 
stopped further analysis there)
Can you try my uploaded patch here together with YARN-4350 and see if you still 
find issues?

bq. In MiniYARNCluster RM servicewrapper is first added and then AHSwrapper, 
and also actual AHS service is started in a thread, so RM's will be using the 
wrong timelineclient address(port is zero) as AHS service is not yet 
This should be fixed in the patch - I changed the order to be AHS -> RM -> NM. 
The separate thread is not an issue as RM will only start after AHS 
successfully starts.

bq. In Timeline client Impl's serviceInit URI for timeline REST service is set. 
So even though we create the correct service order (as per previous step), RM's 
SMP will fail to publish, as timelineweb address is got only after the AHS 
service is started.
Fixed: TimelineClient now delays URI creation until service start.
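
As a rough sketch of why the delay helps: if the client reads the address only at start time, it sees the port that the server actually bound rather than the placeholder that was configured at init time. Everything below is hypothetical (the class LazyUriClientSketch, the configuration key and the example port); it is not the TimelineClient implementation.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy client showing deferred URI creation. All names are hypothetical; this
// is not the TimelineClient implementation.
class LazyUriClientSketch {
    static final String ADDRESS_KEY = "timeline.webapp.address"; // made-up key

    private final Map<String, String> conf;
    private URI resourceUri; // stays null until start()

    LazyUriClientSketch(Map<String, String> conf) {
        this.conf = conf;
    }

    void init() {
        // Deliberately does not read the address: the server may not have
        // bound its port yet, so the configured value may still be port 0.
    }

    void start() {
        // Read the address only now, after the server has had a chance to
        // publish the port it actually bound.
        resourceUri = URI.create("http://" + conf.get(ADDRESS_KEY) + "/ws/v1/timeline/");
    }

    URI resourceUri() {
        if (resourceUri == null) {
            throw new IllegalStateException("client has not been started");
        }
        return resourceUri;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ADDRESS_KEY, "localhost:0");     // ephemeral port requested
        LazyUriClientSketch client = new LazyUriClientSketch(conf);
        client.init();                            // client initialized early
        conf.put(ADDRESS_KEY, "localhost:8188");  // server started, real port known
        client.start();
        System.out.println(client.resourceUri());
    }
}
```

Had the URI been built in init(), the client in this sketch would have kept pointing at port 0 even after the server came up.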

bq. Also one more point(not related to this jira) to note in MiniYARNCluster, 
we no more support old AHS interface so basically 
yarn.timeline-service.generic-application-history.store-class should not be 
configured in ApplicationHistoryServerWrapper so that levelDBtimelinestore is 
created. which i feel is correct at least for existing 2.7.x versions. And if 
you agree with it, can we fix it along with this jira as it's a small thing?
Even if that store is specified, TimelineDataManager is still instantiated in 
the server, so the assumption that the LevelDB store is not created does not 
hold. Agree?

> Cannot enable system-metrics-publisher inside MiniYARNCluster
> -------------------------------------------------------------
>                 Key: YARN-4372
>                 URL: https://issues.apache.org/jira/browse/YARN-4372
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: YARN-4372-20151119.1.txt
> [~Naganarasimha] found this at YARN-2859, see [this 
> comment|https://issues.apache.org/jira/browse/YARN-2859?focusedCommentId=15005746&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15005746].
> Because of the way daemons are started inside MiniYARNCluster, RM is not set 
> up correctly to send information to TimelineService.

This message was sent by Atlassian JIRA