[
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219725#comment-16219725
]
Jian He commented on YARN-7217:
-------------------------------
Meta questions as Billie mentioned:
- should solr and fs be a pluggable implementation of a common interface ?
Basically, should it be either fs or solr back-end. Right now it's both there.
Code comments:
- For below, one constructor can call another to avoid code duplication
{code}
public ApiServer() {
YARN_CONFIG = new YarnConfiguration();
try {
if (SERVICE_CLIENT==null) {
SERVICE_CLIENT = new ServiceClient();
SERVICE_CLIENT.init(YARN_CONFIG);
SERVICE_CLIENT.start();
}
} catch (Exception e) {
LOG.error("Fail to initialize ServiceClient, ApiServer is unavailable.");
}
}
/**
* Constructor used by ResourceManager.
*
* @param conf
*/
@Inject
public ApiServer(Configuration conf) {
YARN_CONFIG = conf;
try {
if (SERVICE_CLIENT==null) {
SERVICE_CLIENT = new ServiceClient();
SERVICE_CLIENT.init(YARN_CONFIG);
SERVICE_CLIENT.start();
}
} catch (Exception e) {
LOG.error("Fail to initialize ServiceClient, ApiServer is unavailable.");
}
}
{code}
- getServicesList: it assumes solr is enabled, if not, it will throw NPE. I
think we should conditionally check if solr is enabled, if not, throw exception
saying only solr backend is supported for this endpoint.
- similarly for getServiceSpec endpoint, it will throw NPE because ysc is null,
if solr is not enabled.
- updateServiceSpec: it only updates the spec in hdfs and solr, what's the use
case of this api ?
- similarly TestYarnNativeServices#testChangeSpec, as discussed, we won't need
to restart the entire service to update the spec ? what's the use case for this
?
- {{/services/{service_name}/state}} : call it status ? IMHO, state is like
STOPPED or STARTED, status is more like a group of information
- Should it be if solr is enabled, create the solrClient ? if solr is not
enabled, there's no point creating the solrClient
{code}
addService(yarnClient);
+ createYarnSolrClient(configuration);
super.serviceInit(configuration);
{code}
- The entire code can be inside the isEnabled block, similarly for other places
{code}
if (UserGroupInformation.isSecurityEnabled()) {
try {
userName = UserGroupInformation.getCurrentUser().getUserName();
} catch(KerberosAuthException e) {
userName = null;
}
}
if (ysc.isEnabled()) {
try {
ysc.deployApp(service, userName);
} catch (ServiceUnavailableException e) {
throw new YarnException("Fail to persist service spec.");
}
}
{code}
- we can create a common method for this, since it is used in so many places
{code}
if (UserGroupInformation.isSecurityEnabled()) {
try {
userName = UserGroupInformation.getCurrentUser().getUserName();
} catch(KerberosAuthException e) {
userName = null;
}
}
{code}
- updateComponent api should also update the spec in solr ?
- All services configs are currently in YarnServiceConf class, I think we can
put the new configs there to not mix with the core YarnConfigurations, until
the feature and config namings are stable, we can merge them back to
YarnConfiguration.
- ApiServerTest class is not used, remove?
- ServiceClientTest: createYarnSolrClient and serviceInit are the same as in
the parent class, no need to override ?
- the username parameter is not used in findAppEntry API at all, but the
deployApp inserts the username, then why is the username required in the
first place ?
- similarly, username is not used in deleteApp, then why do we need to get the
username in caller in the first place
- could you explain below logic ? looks like it tries to look for all entries
with "id:appName" and the while loop continues until the last one is find, and
return the last one . Presumbaly there's only 1 entry, then why is a while
loop required? If there are multiple entries, why returning the last one ?
{code}
QueryResponse response;
try {
response = solr.query(query);
Iterator<SolrDocument> appList = response.getResults()
.listIterator();
while (appList.hasNext()) {
SolrDocument d = appList.next();
entry = mapper.readValue(d.get("yarnfile_s").toString(),
Service.class);
found = true;
}
if (!found) {
throw new ApplicationNotFoundException("Application entry is " +
"not found: " + appName);
}
} catch (SolrServerException | IOException e) {
LOG.error("Error in finding deployed application: " + appName, e);
}
return entry;
{code}
- YarnSolrClient#getEnabled is duplicated with isEnabled method.
- YarnSolrClient#check, if the caller is only invoking the solrClient API if
solr is enabled, then it won't enter this method in the first place, so the
check is not needed?
> Improve API service usability for updating service spec and state
> -----------------------------------------------------------------
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
> Issue Type: Task
> Components: api, applications
> Reporter: Eric Yang
> Assignee: Eric Yang
> Attachments: YARN-7217.yarn-native-services.001.patch,
> YARN-7217.yarn-native-services.002.patch,
> YARN-7217.yarn-native-services.003.patch,
> YARN-7217.yarn-native-services.004.patch,
> YARN-7217.yarn-native-services.005.patch
>
>
> API service for deploy, and manage YARN services have several limitations.
> {{updateService}} API provides multiple functions:
> # Stopping a service.
> # Start a service.
> # Increase or decrease number of containers. (This was removed in YARN-7323).
> The overloading is buggy depending on how the configuration should be applied.
> h4. Scenario 1
> A user retrieves Service object from getService call, and the Service object
> contains state: STARTED. The user would like to increase number of
> containers for the deployed service. The JSON has been updated to increase
> container count. The PUT method does not actually increase container count.
> h4. Scenario 2
> A user retrieves Service object from getService call, and the Service object
> contains state: STOPPED. The user would like to make a environment
> configuration change. The configuration does not get updated after PUT
> method.
> This is possible to address by rearranging the logic of START/STOP after
> configuration update. However, there are other potential combinations that
> can break PUT method. For example, user like to make configuration changes,
> but not yet restart the service until a later time.
> h4. Scenario 3
> There is no API to list all deployed applications by the same user.
> h4. Scenario 4
> Desired state (spec) and current state are represented by the same Service
> object. There is no easy way to identify "state" is desired state to reach
> or, the current state of the service. It would be nice to have ability to
> retrieve both desired state, and current state with separated entry points.
> By implementing /spec and /state, it can resolve this problem.
> h4. Scenario 5
> List all services deploy by the same user can trigger a directory listing
> operation on namenode if hdfs is used as storage for metadata. When hundred
> of users use Service UI to view or deploy applications, this will trigger
> denial of services attack on namenode. The sparse small metadata files also
> reduce efficiency of Namenode memory usage. Hence, a cache layer for storing
> service metadata can reduce namenode stress.
> h3. Proposed change
> ApiService can separate the PUT method into two PUT methods for configuration
> changes vs operation changes. New API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/spec
> Request Data:
> {
> "name": "amp",
> "components": [
> {
> "name": "mysql",
> "number_of_containers": 2,
> "artifact": {
> "id": "centos/mysql-57-centos7:latest",
> "type": "DOCKER"
> },
> "run_privileged_container": false,
> "launch_command": "",
> "resource": {
> "cpus": 1,
> "memory": "2048"
> },
> "configuration": {
> "env": {
> "MYSQL_USER":"${USER}",
> "MYSQL_PASSWORD":"password"
> }
> }
> }
> ],
> "quicklinks": {
> "Apache Document Root":
> "http://httpd.${SERVICE_NAME}.${USER}.${DOMAIN}:8080/",
> "PHP MyAdmin": "http://phpmyadmin.${SERVICE_NAME}.${USER}.${DOMAIN}:8080/"
> }
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
> "name": "amp",
> "components": [
> {
> "name": "mysql",
> "state": "STOPPED"
> }
> ]
> }
> {code}
> SOLR can be used to cache Yarnfile to improve lookup performance and reduce
> stress of namenode small file problems and high frequency lookup. SOLR is
> chosen for caching metadata because its indexing feature can be used to build
> full text search for application catalog as well.
> For service that requires configuration changes to increase or decrease node
> count. The calling sequence is:
> {code}
> # GET /ws/v1/services/{service_name}/spec
> # Change number_of_containers to desired number.
> # PUT /ws/v1/services/{service_name}/spec to update the spec.
> # PUT /ws/v1/services/{service_name}/state to stop existing service.
> # PUT /ws/v1/services/{service_name}/state to start service.
> {code}
> For components that can increase node count without rewrite configuration:
> {code}
> # GET /ws/v1/services/{service_name}/spec
> # Change number_of_containers to desired number.
> # PUT /ws/v1/services/{service_name}/spec to update the spec.
> # PUT /ws/v1/services/{service_name}/component/{component_name} to change
> node count.
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]