[ https://issues.apache.org/jira/browse/YARN-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15474563#comment-15474563 ]
Gour Saha commented on YARN-5610: --------------------------------- [~jianhe] thank you for reviewing the patch. Addressing the first set of comments/feedback. Will upload the patch with the code changes. h4. API Models: {quote} \{artifact, resource, launch_command, number_of_containers\} in Application seems duplicated with those inside the component. I feel in this scenario, a default global setting for artifacts, launch_command etc. is not that appropriate, different components may likely have different requirements. IMHO, we only need the ones in Component, this makes the interface cleaner and underlying implementation simpler? {quote} The reason these attributes are at the Application level as well is because there will be simple applications which will not have any components. For these simple app-definitions forcing app-owners to create components is not going to be very user-friendly. {quote} unique_component_support: what is the primary use-case to have distinct component name ? {quote} This is SLIDER-1100. It is a very powerful feature where app-owners don't need to define multiple roles of a component with almost everything same, except a few configurations. This attribute is how they express this feature through the API. More details can be found in the Slider Jira. {quote} What is the BaseResource object for? Why does Application, ApplicationStatus, Container, Resource need to extend this class? {quote} This class is to host common attributes of the resources. To start with we have uri. My guess is the list will grow. {quote} What does the Artifact#APPLICATION mean ? {quote} In case, of complex and nested applications some components will be by themselves full blown and independent applications itself. The APPLICATION type artifact refers to such external application definitions, compared to the simpler artifact types like a docker image. {quote} ApplicationState: What is difference RUNNNG vs STARTED, FINISHED vs STOPPED {quote} STARTED is when Yarn has moved an application from ACCEPTED to RUNNING state, but according to app-owner the application is not running/useful until IPs gets assigned, DNS entry gets added and/or the app reaches a stable running state where it can start serving end-user requests. So according to an app-owner, it is STARTED but not RUNNING yet. So both these states are helpful since STARTED tells them that it has been deployed by Yarn and RUNNING tells them that their application is ready to serve requests. I think FINISHED needs to be removed as STOPPED is good enough. Let me look further why it was introduced. {quote} Application#lifetime: it is String type. Does this mean we have to define a scheme for user to specify the time in string format? How about just using long type ? {quote} That is because app owners can specify the time with a unit such as 30mins or 10hours or 20days. The swagger definition defines this. The implementation has been kept simple for now, but will support the units. {quote} ApplicationStatus#errorMessage, how about call it diagnostics ? sometimes we may also return non-error messages. {quote} This is a good point. Let me change the definition. h4. Implementation: {quote} “hadoop-yarn-services-api” should be under hadoop-yarn-slider module as peer to hadoop-yarn-slider-core {quote} I think we need to discuss this further since very likely the REST service will be running in the RM JVM so it might go into the resourcemanager module. But let’s discuss this further. {quote} why the changes needed in hadoop-project/pom.xml {quote} Seems like the top level dependencies are specified here, hence had to add swagger and its content related dependencies. Am I wrong, should the top-level dependencies be specified in some other pom? {quote} We should not use a deprecated getPort() method logger.info("Listening at port = {}", applicationApiServer.getPort());, jenkins will report error. {quote} Fixed {quote} couple of things for below code HADOOP_CONFIG = getHadoopConfigs(); SLIDER_CONFIG = getSliderClientConfiguration(); We cannot load hdfs config, that's for hdfs servers. Any reason you need the hdfs configs? {quote} Agreed. Removed. {quote} Instead of calling these two methods, I think we can just call YarnConfiguration yarnConf = new YarnConfiguration(). This will automatically load the yarn-site and core-site configs. {quote} Done {quote} Why do we need to explicitly call initHadoopBinding, which is already called the super.init() previously. {quote} Good catch. Removed. {quote} These two catch clauses are identical, and Exception extends Throwable, so we only need catch Throwable, if that's desired. {quote} Agreed. Removed the Throwable catch block. {quote} This will never return null, because the numberOfContainers is intialized as 1. you might want to check zero ? if (application.getNumberOfContainers() == null) \{ throw new IllegalArgumentException(ERROR_CONTAINERS_COUNT_INVALID); \} {quote} Actually, if you pass a JSON for the create/POST call with number_of_containers set to empty string “” or null the Application object is created with numberOfContainers set to null. Hence that check is there. Zero is an acceptable number since there might be certain components which will be defined but no instances will be started in the beginning. After the app is started these components at some point will likely be flexed up from 0 to some n no of instances (n > 0). {quote} The lifetime field will never be null, because it is intilized as "unlimited" by default {quote} Again for the same reason as numberOfContainers if in the JSON lifetime been explicitly set to “” or null this check is required. {quote} IIUC, all these code are not needed, because appOptions is only used for logging, uniqueGlobalPropertyCache is not used logically, Python is not required any more in yarn-slider {quote} Python code has been removed. appOptions and uniqueGlobalPropertyCache are required as appOptions is the way application configuration properties are injected into Slider client. {quote} In agent-less world, the status command is probably not required. We need a different mechanism to determine container status. let's remove this for now appConfOptTriples.addAll(Arrays.asList(compName, configPrefix.toLowerCase() + ".statusCommand", DEFAULT_STATUS_CMD)); {quote} Agreed. Removed. {quote} remove the unused parameter globalConf in createAppConfigComponent {quote} I have some TODOs on global config which I wanted to get back to. Let me check and remove if not required. {quote} remove unused method createAppConfigGlobal {quote} I remember writing some comment as to why this method is required. Let me check and remove if not required. {quote} remove the commented out code in this class, there are quite a few places {quote} Most of the commented code is related to Slider AM resource setting and global config. I will revisit this in the next patch and clean up if not required. {quote} Can you please explain what this code tries to accomplish ? it's hard to understand what it tries to do. {quote} Sorry I should have added some comments there. This is what I added now - // If artifact is of type APPLICATION, then in the POST JSON there will // be no component definition for that artifact. Hence it's corresponding id // field is added. Every external APPLICATION has a unique id field. List<String> convertedDeps = new ArrayList<>(); . . // If the DNS dependency property is set to true for a component, it means // that it is ensured that DNS entry has been added for all the containers // of this component, before moving on to the next component in the DAG. if (hasPropertyWithValue(component, PROPERTY_DNS_DEPENDENCY, "true")) { . . {quote} The lifetime is tied to the application according to the API, why does it need to pass down to every component? IIUC, the flow should be, once the app timeouts, all components for this app timeout. {quote} This is a dirty way how lifetime is handled in Slider. It will be thrown away once YARN-3813 is used to set and manage lifetime. > Initial code for native services REST API > ----------------------------------------- > > Key: YARN-5610 > URL: https://issues.apache.org/jira/browse/YARN-5610 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Gour Saha > Assignee: Gour Saha > Attachments: YARN-4793-yarn-native-services.001.patch > > > This task will be used to submit and review patches for the initial code drop > for the native services REST API -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org