[ https://issues.apache.org/jira/browse/YARN-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16414411#comment-16414411 ]
Xuan Gong commented on YARN-1151:
---------------------------------

Thanks for the comments, [~vinodkv]. All of them make sense. I thought about these points during an offline discussion with [~leftnoteasy].

Today, when we use a customized aux-service, the NM loads the class from the configured local path. If anything goes wrong (for example, the configured local path does not exist, or the file permissions prevent loading the class), the NM fails. We want to keep that behavior here. To load the object remotely, we make it a blocking call. If there is anything wrong with HDFS (for example, HDFS is down), we let the NM fail immediately.

About whether we should simply re-use ResourceLocalizationService:

* ResourceLocalizationService and AuxService are services at the same level, which means neither has a strong dependency on the other. Also, when we start the NM, the current workflow is roughly: ResourceLocalizationService init -> AuxServices init -> ResourceLocalizationService start -> AuxServices start. To use ResourceLocalizationService, we would have to change that workflow so that AuxServices does its init and start only after ResourceLocalizationService has started successfully.
* All localization calls in ResourceLocalizationService are non-blocking, which could help with the case where HDFS is down (we could simply wait until HDFS is back up). But this would introduce a lot of complexity. We would need a state machine for the AuxService, and we would have to decide how the NMs handle inconsistent AuxServices state (on NM 1 the aux services have started, but on NM 2 they have not) and how applications handle it (should the application fail immediately, or wait until the related aux service starts?).

Overall, as I understand it, re-using ResourceLocalizationService would be a big change.
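To make the ordering constraint and the fail-fast behavior concrete, here is a minimal sketch in plain Java. It does not use the real NodeManager classes: StartupOrderSketch, Service, Localizer, runStartup, and the remoteStoreUp flag are hypothetical names for illustration only, and the sketch assumes the blocking remote load happens during AuxServices init.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StartupOrderSketch {
    /** Minimal stand-in for a lifecycle service; this is not Hadoop's Service API. */
    interface Service {
        void init();
        void start();
    }

    /** Hypothetical stand-in for ResourceLocalizationService. */
    static class Localizer implements Service {
        private final List<String> log;
        boolean started = false;

        Localizer(List<String> log) { this.log = log; }

        public void init() { log.add("ResourceLocalizationService init"); }

        public void start() {
            started = true;
            log.add("ResourceLocalizationService start");
        }
    }

    /** Hypothetical stand-in for AuxServices with a blocking remote load. */
    static class AuxServices implements Service {
        private final List<String> log;
        private final Localizer localizer;
        private final boolean remoteStoreUp; // assumption: models "HDFS is reachable"

        AuxServices(List<String> log, Localizer localizer, boolean remoteStoreUp) {
            this.log = log;
            this.localizer = localizer;
            this.remoteStoreUp = remoteStoreUp;
        }

        public void init() {
            // At this point the localizer has been initialized but not started,
            // so it cannot be used to fetch the aux-service JAR.
            log.add("AuxServices init (localizer started=" + localizer.started + ")");
            // Blocking load: any failure aborts the whole startup immediately.
            if (!remoteStoreUp) {
                throw new IllegalStateException(
                    "cannot load aux-service JAR: remote store unavailable");
            }
            log.add("AuxServices loaded remote JAR");
        }

        public void start() { log.add("AuxServices start"); }
    }

    /** Runs init on every service in order, then start on every service in order. */
    static List<String> runStartup(boolean remoteStoreUp) {
        List<String> log = new ArrayList<>();
        Localizer localizer = new Localizer(log);
        AuxServices aux = new AuxServices(log, localizer, remoteStoreUp);
        List<Service> services = Arrays.asList(localizer, aux);
        for (Service s : services) {
            s.init();
        }
        for (Service s : services) {
            s.start();
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : runStartup(true)) {
            System.out.println(line);
        }
        try {
            runStartup(false);
        } catch (IllegalStateException e) {
            System.out.println("NM startup failed: " + e.getMessage());
        }
    }
}
```

With the remote store up, the log shows AuxServices running its init while the localizer is still not started, which is why re-using it would require reordering the startup. With the remote store down, the exception propagates out of runStartup, mirroring the "let the NM fail immediately" behavior described above.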
Re-using ResourceLocalizationService would be a large, critical change, and possibly an incompatible one, because it changes application behavior.

> Ability to configure auxiliary services from HDFS-based JAR files
> -----------------------------------------------------------------
>
>                 Key: YARN-1151
>                 URL: https://issues.apache.org/jira/browse/YARN-1151
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.1.0-beta, 2.9.0
>            Reporter: john lilley
>            Assignee: Xuan Gong
>            Priority: Major
>              Labels: auxiliary-service, yarn
>         Attachments: YARN-1151.1.patch, YARN-1151.branch-2.poc.2.patch,
>                      YARN-1151.branch-2.poc.3.patch, YARN-1151.branch-2.poc.patch,
>                      [YARN-1151] [Design] Configure auxiliary services from HDFS-based JAR files.pdf
>
>
> I would like to install an auxiliary service in Hadoop YARN without actually
> installing files/services on every node in the system. Discussions on the
> user@ list indicate that this is not easily done. The reason we want an
> auxiliary service is that our application has some persistent-data components
> that are not appropriate for HDFS. In fact, they are somewhat analogous to
> the mapper output of MapReduce's shuffle, which is what led me to
> auxiliary-services in the first place. It would be much easier if we could
> just place our service's JARs in HDFS.