[jira] [Commented] (YARN-7217) Improve API service usability for updating service spec and state

Eric Yang (JIRA) Thu, 26 Oct 2017 13:07:40 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221127#comment-16221127
 ]


Eric Yang commented on YARN-7217:
---------------------------------

[~jianhe], thank you for reviewing the patch.  Here are the answers:

{quote}
- should solr and fs be a pluggable implementation of a common interface ? 
Basically, should it be either fs or solr back-end. Right now it's both there.
{quote}

This JIRA is a transitional step.  Solr is used as an alternate storage 
mechanism to bridge a gap in the current HDFS storage mechanism, which cannot 
list applications for all users.  Let's leave the storage change to another 
JIRA.

{quote}
- getServicesList: it assumes solr is enabled, if not, it will throw NPE. I 
think we should conditionally check if solr is enabled, if not, throw exception 
saying only solr backend is supported for this endpoint.
{quote}

getServicesList never throws an NPE.  ysc is initialized in the constructor.  
If Solr is disabled, the endpoint returns the SERVICE_UNAVAILABLE HTTP status 
code.  This is verified by testGetServicesList in the TestApiServer test case.

{quote}
- similarly for getServiceSpec endpoint, it will throw NPE because ysc is null, 
if solr is not enabled.
{quote}

Same answer as above: ysc is never null because it is initialized in the 
constructor.  If ysc were left uninitialized when Solr is disabled, as 
suggested, then an NPE could occur.  I agree that the code can be more 
consistent about how it checks whether Solr is enabled, and I will revise it 
accordingly.
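
For illustration only, here is a minimal sketch of the behaviour described in the two answers above.  It is written in Python rather than the Java of the patch, and the class and method names (YarnSolrClientSketch, ApiServerSketch, get_services_list) are hypothetical stand-ins, not the actual patch code.  It assumes the client object is always constructed and carries its own enabled flag, so the endpoint checks the flag instead of checking for null.

```python
from http import HTTPStatus


class YarnSolrClientSketch:
    """Hypothetical stand-in for YarnSolrClient; it owns the enabled flag."""

    def __init__(self, enabled, entries=None):
        self.enabled = enabled
        self._entries = list(entries or [])

    def get_services(self):
        # Callers are expected to check `enabled` first.
        if not self.enabled:
            raise RuntimeError("Solr backend is disabled")
        return list(self._entries)


class ApiServerSketch:
    """Hypothetical endpoint holder; the client is always constructed."""

    def __init__(self, solr_enabled, entries=None):
        # Created unconditionally, so self.ysc is never None.
        self.ysc = YarnSolrClientSketch(solr_enabled, entries)

    def get_services_list(self):
        # A disabled backend maps to an HTTP status, not a null dereference.
        if not self.ysc.enabled:
            return HTTPStatus.SERVICE_UNAVAILABLE, []
        return HTTPStatus.OK, self.ysc.get_services()
```

With Solr disabled, get_services_list() in this sketch returns SERVICE_UNAVAILABLE and an empty list; with Solr enabled it returns OK and the stored entries.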

{quote}
- similarly TestYarnNativeServices#testChangeSpec, as discussed, we won't need 
to restart the entire service to update the spec ? what's the use case for this 
?
{quote}

Per discussion this morning, it is best to keep the configuration change and 
the restart operation as two separate calls.  This allows the configuration to 
be updated while deployment is held off until a suitable time window becomes 
available, at which point the service is restarted.  This gives the system 
administrator more fine-grained control: persist the desired configuration 
change, then choose either to restart the service or to add more nodes without 
a restart.
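
As a rough illustration of keeping the two calls separate, the Python sketch below models a service whose stored spec and running configuration can differ until a restart.  The ServiceSketch class and its put_spec / put_state methods are hypothetical names for this example only; they are not the API server implementation, and the spec dictionary is a simplified stand-in for the real Service object.

```python
import copy


class ServiceSketch:
    """Hypothetical in-memory model: stored spec vs. what is running."""

    def __init__(self, spec):
        self.spec = copy.deepcopy(spec)      # desired configuration
        self.running = copy.deepcopy(spec)   # configuration in effect
        self.state = "STARTED"

    def put_spec(self, new_spec):
        # Persist the desired configuration only; nothing is redeployed.
        self.spec = copy.deepcopy(new_spec)

    def put_state(self, state):
        # Starting (or restarting) picks up whatever spec is stored.
        if state == "STARTED":
            self.running = copy.deepcopy(self.spec)
        self.state = state


svc = ServiceSketch({"name": "amp", "env": {"MYSQL_PASSWORD": "password"}})
svc.put_spec({"name": "amp", "env": {"MYSQL_PASSWORD": "changed"}})
pending = svc.running != svc.spec   # change saved, not yet applied

# Later, in a suitable maintenance window:
svc.put_state("STOPPED")
svc.put_state("STARTED")
applied = svc.running == svc.spec   # restart applied the change
```

In this model the spec update alone leaves the running configuration untouched, and only the later stop/start pair applies it.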

{quote}
- Should it be if solr is enabled, create the solrClient ? if solr is not 
enabled, there's no point creating the solrClient
{quote}

The Solr enabled flag is kept in the YarnSolrClient object to keep its 
internal state atomic, instead of tracking the flag in ServiceClient.  I could 
add an if statement to skip initialization of the YARN Solr client.  However, 
it seems redundant to then guard against NPE in if statements wherever a 
YarnSolrClient that skipped initialization is used.  Hence, I will not make a 
change here.

{quote}
- updateComponent api should also update the spec in solr ?
- the username parameter is not used in findAppEntry API at all, but the 
deployApp inserts the username, then why is the username required in the first 
place ?
- similarly, username is not used in deleteApp, then why do we need to get the 
username in caller in the first place
{quote}

I will fix these bugs.

{quote}
All services configs are currently in YarnServiceConf class, I think we can put 
the new configs there to not mix with the core YarnConfigurations, until the 
feature and config namings are stable, we can merge them back to 
YarnConfiguration.
{quote}

We should avoid introducing sub-configuration classes without exposing them at 
the upper level.  The chance of someone else introducing a duplicate hierarchy 
is high, and then it becomes painful to merge.  I recommend moving the 
configuration knobs up to the upper level now, to avoid doing the same work 
over and over.  This is a difference in philosophy about how to handle 
changes.  Since we are already on a branch, there is no risk in introducing 
them to yarn-common directly.  I will not make a change here.

{quote}
could you explain below logic ? looks like it tries to look for all entries 
with "id:appName" and the while loop continues until the last one is found, and 
returns the last one . Presumably there's only 1 entry, then why is a while loop 
required? If there are multiple entries, why return the last one ?
{quote}

There will only be one match because this is a single entry query.  However, 
Solr doesn't have a single entry lookup interface, so I just use the common 
Iterator interface provided by Solr.  That is why the lookup is in a while 
loop.  I can change it to if .. else to make it more readable.
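
To make the two forms concrete, here is a small Python sketch (not the Java code in the patch; the function names and the list-of-dicts result shape are assumptions for illustration).  The loop form walks the whole iterator and keeps the last document, while the if/else form takes the first document only.  For a query that matches at most one entry, both return the same result.

```python
def find_app_entry_loop(results):
    """Loop form: iterate over every result and keep the last one seen."""
    entry = None
    for doc in results:
        entry = doc
    return entry


def find_app_entry_single(results):
    """if/else form: take the first result, or None when nothing matched."""
    sentinel = object()
    first = next(iter(results), sentinel)
    if first is sentinel:
        return None   # no match
    else:
        return first  # the single match


one_hit = [{"id": "amp"}]
same = find_app_entry_loop(one_hit) == find_app_entry_single(one_hit)
```

The two forms only diverge if the query ever returns more than one document, which is the case the reviewer asked about.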

Thanks for the suggestions.  I will make the improvements and upload another 
patch.  Let me know if you have any doubts about my comments.  Thanks

> Improve API service usability for updating service spec and state
> -----------------------------------------------------------------
>
>                 Key: YARN-7217
>                 URL: https://issues.apache.org/jira/browse/YARN-7217
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: api, applications
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>         Attachments: YARN-7217.yarn-native-services.001.patch, 
> YARN-7217.yarn-native-services.002.patch, 
> YARN-7217.yarn-native-services.003.patch, 
> YARN-7217.yarn-native-services.004.patch, 
> YARN-7217.yarn-native-services.005.patch
>
>
> The API service for deploying and managing YARN services has several 
> limitations.
> {{updateService}} API provides multiple functions:
> # Stopping a service.
> # Starting a service.
> # Increasing or decreasing the number of containers.  (This was removed in 
> YARN-7323).
> The overloading is buggy depending on how the configuration should be applied.
> h4. Scenario 1
> A user retrieves a Service object from a getService call, and the Service 
> object contains state: STARTED.  The user would like to increase the number 
> of containers for the deployed service.  The JSON has been updated to 
> increase the container count.  The PUT method does not actually increase the 
> container count.
> h4. Scenario 2
> A user retrieves a Service object from a getService call, and the Service 
> object contains state: STOPPED.  The user would like to make an environment 
> configuration change.  The configuration does not get updated after the PUT 
> method.
> This can be addressed by rearranging the logic so that START/STOP happens 
> after the configuration update.  However, there are other potential 
> combinations that can break the PUT method.  For example, a user may want to 
> make configuration changes but not restart the service until a later time.
> h4. Scenario 3
> There is no API to list all deployed applications by the same user.
> h4. Scenario 4
> Desired state (spec) and current state are represented by the same Service 
> object.  There is no easy way to tell whether "state" is the desired state 
> to reach or the current state of the service.  It would be nice to be able 
> to retrieve the desired state and the current state through separate entry 
> points.  Implementing /spec and /state can resolve this problem.
> h4. Scenario 5
> Listing all services deployed by the same user can trigger a directory 
> listing operation on the namenode if HDFS is used as storage for metadata.  
> When hundreds of users use the Service UI to view or deploy applications, 
> this can amount to a denial-of-service attack on the namenode.  The sparse 
> small metadata files also reduce the efficiency of namenode memory usage.  
> Hence, a cache layer for storing service metadata can reduce namenode stress.
> h3. Proposed change
> ApiService can separate the PUT method into two PUT methods for configuration 
> changes vs operation changes.  New API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/spec
> Request Data:
> {
>   "name": "amp",
>   "components": [
>     {
>       "name": "mysql",
>       "number_of_containers": 2,
>       "artifact": {
>         "id": "centos/mysql-57-centos7:latest",
>         "type": "DOCKER"
>       },
>       "run_privileged_container": false,
>       "launch_command": "",
>       "resource": {
>         "cpus": 1,
>         "memory": "2048"
>       },
>       "configuration": {
>         "env": {
>           "MYSQL_USER":"${USER}",
>           "MYSQL_PASSWORD":"password"
>         }
>       }
>      }
>   ],
>   "quicklinks": {
>     "Apache Document Root": 
> "http://httpd.${SERVICE_NAME}.${USER}.${DOMAIN}:8080/",
>     "PHP MyAdmin": "http://phpmyadmin.${SERVICE_NAME}.${USER}.${DOMAIN}:8080/"
>   }
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
>   "name": "amp",
>   "components": [
>     {
>       "name": "mysql",
>       "state": "STOPPED"
>      }
>   ]
> }
> {code}
> SOLR can be used to cache Yarnfile to improve lookup performance and reduce 
> stress of namenode small file problems and high frequency lookup.  SOLR is 
> chosen for caching metadata because its indexing feature can be used to build 
> full text search for application catalog as well.
> For a service that requires configuration changes to increase or decrease 
> node count, the calling sequence is:
> {code}
> # GET /ws/v1/services/{service_name}/spec
> # Change number_of_containers to desired number.
> # PUT /ws/v1/services/{service_name}/spec to update the spec.
> # PUT /ws/v1/services/{service_name}/state to stop existing service.
> # PUT /ws/v1/services/{service_name}/state to start service.
> {code}
> For components that can increase node count without rewriting configuration:
> {code}
> # GET /ws/v1/services/{service_name}/spec
> # Change number_of_containers to desired number.
> # PUT /ws/v1/services/{service_name}/spec to update the spec.
> # PUT /ws/v1/services/{service_name}/component/{component_name} to change 
> node count.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

