[ https://issues.apache.org/jira/browse/YARN-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15523560#comment-15523560 ]
Gour Saha commented on YARN-5610: --------------------------------- [~jianhe] addressing the second set of comments/feedback. Will upload another patch with the code changes. {quote} Application#get/setNumberOfContainers, this API is a bit inconsistent * when submit, it means the default number of containers for each component * when query app status, it means the total number of containers acroos all components. I feel it's better to make it consistent, for the number of running containers, user can just infer it from the size of containers list or add a new filed called containers_running {quote} I think you raised a very good point. We have an attribute expectedNumberOfContainers which captures the actual no of containers at runtime. I think we overloaded numberOfContainers to be the requested no of containers for app submission payload and then the actual no of containers at runtime. Keeping your suggestion in mind, I am planning to rename expectedNumberOfContainers to numberOfRunningContainers so that the notion of numberOfContainers remains consistent. You are right that the size of the containers list will give the total no of running containers across all components of an app, but an explicit field to capture that was desired by the early Swagger API reviewers. {quote} Question on the quickLinks API, why does the component need to have quicklinks field? IIUC, the quicklinks is mainly to show on the UI? In that case, having it in the application object is enough? {quote} Application and component level quicklinks are required during app submission POST payload. Each app level quicklink will be satisfied by a corresponding component and hence needs the correlation specified in the app definition. You are right that, once the app is deployed the GET call will return quicklinks at the app level only. {quote} Can you upload the new Swagger specification ? {quote} Filed sub-task YARN-5675. Will upload a patch there in yaml format. {quote} Configuration object: how are the properties filed passed to the container ? {quote} They are created on the work dir and mounted as a file into the container {quote} ConfigFile#dest_file (The absolute path that this configuration file should be mounted as) The user cannot assume an abritrary absolute path is valid in the remote container ? It needs to be a relative path to the container local dir? {quote} This is the absolute path that will be visible from inside the container. The relative path at the host level will be mounted into the container as this absolute path. You are right, that the app owner will not know and cannot even specify what the absolute path will be at the host level. {quote} ConfigFile#src_file: I think this could support all kinds of config files on HDFS, so that user doesn't need to specify all configs in the request payload. e.g. user simply provide a URI of the configFile on HDFS, YARN will localize the file for container to use {quote} You are right, the src_file could be a HDFS file. Note, as per the swagger definition src_file is required when config file is of type template, and would typically be a network file or HDFS file. Additionally, Slider also does property substitutions (which is an important feature) in this template file using property values specified in the request payload. {quote} I dont see the place where queuName is set, also why is queueName set to label_expression ? {noformat} if (queueName != null) { resCompOptTriples.addAll(Arrays.asList(compName, ResourceKeys.YARN_LABEL_EXPRESSION, queueName)); } {noformat} {quote} You are right, the queue feature was implemented for quick internal testing but was not fully wired in the submitted patch. I have formally added it in the swagger definition and completed the wiring in the code. The new code patch will contain all these changes. {quote} Given that APP_NAME is staic and specified by the user, why do we need to provide a place_holder for App_name and then substitue? why can't user specify the app_name in the first place. {quote} The reason is to prevent the user to specify the hard coded app name in multiple places. The same app definition JSON is typically used to submit multiple apps with different names. Hence without the ${APP_NAME} placeholder the user will need to modify the app name in multiple places. {quote} All these null checks are not needed, because they are validated upfront already if (comp.getArtifact() != null && comp.getArtifact().getType() != null && comp.getArtifact().getType() == Artifact.TypeEnum.DOCKER) { {quote} Actually these checks are required, since for a simple app definition with no components the code creates a default empty component with name DEFAULT_COMPONENT_NAME. In this case it does not create any artifacts and hence this check prevents NPE in such cases. {quote} comp.getArtifact().getId() == null ? application.getArtifact() .getId() : comp.getArtifact().getId()); {quote} Again these checks are required because the app definition can contain an app-level artifact specified and empty at component-level, in which case as per swagger API definition we need to use the app-level value at the component-level. {quote} invokeSliderClientRunnable: I don't quite understand why we need to set the setContextClassLoader and then reset back, could you explain {quote} Removed {quote} why is this code needed to be called in every rest API? {noformat} File sliderJarFile = SliderUtils .findContainingJar(SliderClient.class); if (sliderJarFile != null) { logger.debug("slider.libdir={}", sliderJarFile.getParentFile() .getAbsolutePath()); System.setProperty("slider.libdir", sliderJarFile.getParentFile() .getAbsolutePath()); } } catch (Throwable t) { logger.warn("Unable to determine 'slider.libdir' path", t); } {noformat} {quote} Removed from everywhere except createSliderApp, since the create action needs to know the path to upload all Slider jars if the dependencies have not been uploaded to HDFS at the framework level. Uploading jars for every create action is not desirable, but at least functionally will make create app work, until Slider’s dependency tarball is uploaded at the framework (cluster) level (a one-time operation). {quote} remove unused destroySliderClient method {quote} Removed {quote} deleteApplication API: listing all apps from RM is an expensive call. Maybe we can directly call kill/stop application and handle the Exception and return proper return-code. {quote} This has been a significant pain point in Slider code-base. Even if I remove these calls (primarily added to handle REST calls elegantly and provide as much helpful messaging to the end user), you will see that SliderClient.actionDestroy still makes findAllLiveInstances call. We need to do a cleanup at the SliderClient level. It was never designed to run as a service. I filed bug SLIDER-1168 to take care of this. Note, it won't be very straight-forward since SliderClient currently does not store the applicationId in its persistent state. Anyway, we should be able to find an elegant solution for SLIDER-1168. Once we do the Slider-level cleanup, I will replace these calls from the REST service code, with alternate faster checks. {quote} getApplication API: listing all apps is an expensive call in YARN. Instead We can handle the ApplicationNotFoundExcetion from Yarn if the app does not exist. {quote} The effort mentioned in the previous point will take care of this as well. {quote} getSliderApplicationStatus, getSliderApplicationRegistry: I wonder why the corresponding implementation in slider: SliderClient#actionStatus, actionRegistry need to write the output into a file and then let caller read the file. why not returnning the output directly to the caller. {quote} Slider Client was designed to be invoked from the command line and never envisioned to be used as a service. Hence all these code needs cleanup or newer API needs to be introduced in slider-client. SLIDER-1170 has been filed to take care of this. The REST service code will be modified after we fix SLIDER-1170. {quote} This code is not needed, it can never happen. {noformat} // ensure that the application name from the status report matches the // requested application name if (StringUtils.isEmpty(appName) || !app.getName().equals(appName)) { return Response.status(Status.NOT_FOUND).build(); } {noformat} {quote} Removed. {quote} This code seems not needed, as artifact is used nowhere, we can remove {noformat} // application info if (applicationRoles != null && !componentNames.isEmpty()) { JsonObject applicationRole = jsonGetAsObject(applicationRoles, componentNames.get(0)); if (applicationRole != null) { Artifact artifact = new Artifact(); // how to get artifact id - docker image name?? artifact.setId(null); } } {noformat} {quote} I need to find a way to get the docker image name, which is why this code is lingering around. Can YARN-5430 return the docker image name of a container as well? {quote} Again, may be we can just call stop to save the getSliderList call. {noformat} int livenessCheck = getSliderList(appName); if (livenessCheck == 0) { return stopSliderApplication(appName); {noformat} {quote} Agreed, SLIDER-1168 will help remove this as well. {quote} The RestApi interface is empty, may be we can remove ? {quote} Sorry, this is due to my laziness. Added the public APIs to this interface - will help us to implement newer versions of the API in future. > Initial code for native services REST API > ----------------------------------------- > > Key: YARN-5610 > URL: https://issues.apache.org/jira/browse/YARN-5610 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Gour Saha > Assignee: Gour Saha > Attachments: YARN-4793-yarn-native-services.001.patch, > YARN-5610-yarn-native-services.002.patch > > > This task will be used to submit and review patches for the initial code drop > for the native services REST API -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org