[ 
https://issues.apache.org/jira/browse/YARN-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15474563#comment-15474563
 ] 

Gour Saha commented on YARN-5610:
---------------------------------

[~jianhe] thank you for reviewing the patch. Addressing the first set of 
comments/feedback. Will upload the patch with the code changes.

h4. API Models:
{quote}
\{artifact, resource, launch_command, number_of_containers\}
in Application seems duplicated with those inside the component. I feel in this 
scenario, a default global setting for artifacts, launch_command etc. is not 
that appropriate, different components may likely have different requirements. 
IMHO, we only need the ones in Component, this makes the interface cleaner and 
underlying implementation simpler?
{quote}
The reason these attributes are at the Application level as well is because 
there will be simple applications which will not have any components. For these 
simple app-definitions forcing app-owners to create components is not going to 
be very user-friendly.

{quote}
unique_component_support: what is the primary use-case to have distinct 
component name ?
{quote}
This is SLIDER-1100. It is a very powerful feature where app-owners don't need 
to define multiple roles of a component with almost everything same, except a 
few configurations. This attribute is how they express this feature through the 
API. More details can be found in the Slider Jira.

{quote}
What is the BaseResource object for? Why does Application, ApplicationStatus, 
Container, Resource need to extend this class?
{quote}
This class is to host common attributes of the resources. To start with we have 
uri. My guess is the list will grow.

{quote}
What does the Artifact#APPLICATION mean ?
{quote}
In case, of complex and nested applications some components will be by 
themselves full blown and independent applications itself. The APPLICATION type 
artifact refers to such external application definitions, compared to the 
simpler artifact types like a docker image.

{quote}
ApplicationState: What is difference RUNNNG vs STARTED, FINISHED vs STOPPED
{quote}
STARTED is when Yarn has moved an application from ACCEPTED to RUNNING state, 
but according to app-owner the application is not running/useful until IPs gets 
assigned, DNS entry gets added and/or the app reaches a stable running state 
where it can start serving end-user requests. So according to an app-owner, it 
is STARTED but not RUNNING yet. So both these states are helpful since STARTED 
tells them that it has been deployed by Yarn and RUNNING tells them that their 
application is ready to serve requests. I think FINISHED needs to be removed as 
STOPPED is good enough. Let me look further why it was introduced.

{quote}
Application#lifetime: it is String type. Does this mean we have to define a 
scheme for user to specify the time in string format? How about just using long 
type ?
{quote}
That is because app owners can specify the time with a unit such as 30mins or 
10hours or 20days. The swagger definition defines this. The implementation has 
been kept simple for now, but will support the units.

{quote}
ApplicationStatus#errorMessage, how about call it diagnostics ? sometimes we 
may also return non-error messages.
{quote}
This is a good point. Let me change the definition.

h4. Implementation:
{quote}
“hadoop-yarn-services-api” should be under hadoop-yarn-slider module as peer to 
hadoop-yarn-slider-core
{quote}
I think we need to discuss this further since very likely the REST service will 
be running in the RM JVM so it might go into the resourcemanager module. But 
let’s discuss this further.

{quote}
why the changes needed in hadoop-project/pom.xml
{quote}
Seems like the top level dependencies are specified here, hence had to add 
swagger and its content related dependencies. Am I wrong, should the top-level 
dependencies be specified in some other pom?

{quote}
We should not use a deprecated getPort() method logger.info("Listening at port 
= {}", applicationApiServer.getPort());, jenkins will report error.
{quote}
Fixed

{quote}
couple of things for below code

HADOOP_CONFIG = getHadoopConfigs();

SLIDER_CONFIG = getSliderClientConfiguration();
We cannot load hdfs config, that's for hdfs servers. Any reason you need the 
hdfs configs?
{quote}
Agreed. Removed.

{quote}
Instead of calling these two methods, I think we can just call 
YarnConfiguration yarnConf = new YarnConfiguration(). This will automatically 
load the yarn-site and core-site configs.
{quote}
Done

{quote}
Why do we need to explicitly call initHadoopBinding, which is already called 
the super.init() previously.
{quote}
Good catch. Removed.

{quote}
These two catch clauses are identical, and Exception extends Throwable, so we 
only need catch Throwable, if that's desired.
{quote}
Agreed. Removed the Throwable catch block.

{quote}
This will never return null, because the numberOfContainers is intialized as 1. 
you might want to check zero ?
      if (application.getNumberOfContainers() == null) \{
        throw new IllegalArgumentException(ERROR_CONTAINERS_COUNT_INVALID);
      \}
{quote}
Actually, if you pass a JSON for the create/POST call with number_of_containers 
set to empty string “” or null the Application object is created with 
numberOfContainers set to null. Hence that check is there. Zero is an 
acceptable number since there might be certain components which will be defined 
but no instances will be started in the beginning. After the app is started 
these components at some point will likely be flexed up from 0 to some n no of 
instances (n > 0).

{quote}
The lifetime field will never be null, because it is intilized as "unlimited" 
by default
{quote}
Again for the same reason as numberOfContainers if in the JSON lifetime been 
explicitly set to “” or null this check is required.

{quote}
IIUC, all these code are not needed, because appOptions is only used for 
logging, uniqueGlobalPropertyCache is not used logically, Python is not 
required any more in yarn-slider
{quote}
Python code has been removed. appOptions and uniqueGlobalPropertyCache are 
required as appOptions is the way application configuration properties are 
injected into Slider client.

{quote}
In agent-less world, the status command is probably not required. We need a 
different mechanism to determine container status. let's remove this for now

appConfOptTriples.addAll(Arrays.asList(compName, configPrefix.toLowerCase()
+ ".statusCommand", DEFAULT_STATUS_CMD));
{quote}
Agreed. Removed.

{quote}
remove the unused parameter globalConf in createAppConfigComponent
{quote}
I have some TODOs on global config which I wanted to get back to. Let me check 
and remove if not required.

{quote}
remove unused method createAppConfigGlobal
{quote}
I remember writing some comment as to why this method is required. Let me check 
and remove if not required.

{quote}
remove the commented out code in this class, there are quite a few places
{quote}
Most of the commented code is related to Slider AM resource setting and global 
config. I will revisit this in the next patch and clean up if not required.

{quote}
Can you please explain what this code tries to accomplish ? it's hard to 
understand what it tries to do.
{quote}
Sorry I should have added some comments there. This is what I added now -
    // If artifact is of type APPLICATION, then in the POST JSON there will
    // be no component definition for that artifact. Hence it's corresponding id
    // field is added. Every external APPLICATION has a unique id field.
    List<String> convertedDeps = new ArrayList<>();
.
.

    // If the DNS dependency property is set to true for a component, it means
    // that it is ensured that DNS entry has been added for all the containers
    // of this component, before moving on to the next component in the DAG.
    if (hasPropertyWithValue(component, PROPERTY_DNS_DEPENDENCY, "true")) {
.
.

{quote}
The lifetime is tied to the application according to the API, why does it need 
to pass down to every component? IIUC, the flow should be, once the app 
timeouts, all components for this app timeout.
{quote}
This is a dirty way how lifetime is handled in Slider. It will be thrown away 
once YARN-3813 is used to set and manage lifetime.


> Initial code for native services REST API
> -----------------------------------------
>
>                 Key: YARN-5610
>                 URL: https://issues.apache.org/jira/browse/YARN-5610
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Gour Saha
>            Assignee: Gour Saha
>         Attachments: YARN-4793-yarn-native-services.001.patch
>
>
> This task will be used to submit and review patches for the initial code drop 
> for the native services REST API 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to