[
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208278#comment-16208278
]
Eric Yang commented on YARN-7217:
---------------------------------
[~billie.rinaldi] A noble idea doesn't always translate to reality. Is it
possible to add a standby namenode without restarting the datanodes? Is it
possible to add another resource manager without restarting the node managers?
Is it possible to add another HiveServer2 without restarting Knox for load
balancing? All of the above questions have the same answer: no. Hadoop has
spent 12 years of significant resources to ensure that RPC call retries and
elastic datanodes can happen. In recent years, Ambari was created to ensure
that Hadoop configuration is recorded before service operations are performed,
for serviceability. However, most applications will not receive the same
amount of investment as Hadoop in developing reliability and serviceability.
Therefore, I do not know if flex without restart would be deemed as important
as it was once promised.
Please note that the current PUT method retains the flex operation as it was
written. It only provides an additional endpoint to record the number of
containers to be increased or decreased in the event of a service restart. The
service can then resume with the same number of containers it had prior to the
stop. Perhaps we can add a flag to the Service object, in the component
section, to indicate whether the component can increase/decrease nodes without
a restart. This would be a hint to the backend to allow increasing the node
count without a restart. This helps keep the existing Slider flex
functionality for HBase region servers and for new emerging applications. Does
this sound like a reasonable enhancement?
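To make the proposal concrete, here is a minimal sketch of how such a hint could drive the backend's decision. The `Component` class, the `flex_without_restart` field name, and the returned action labels are all hypothetical and are not taken from the existing Service API; the only behavior carried over from the discussion is that the requested count is always recorded so a restart resumes at that size.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    number_of_containers: int
    # Hypothetical hint: True if the component tolerates a live resize.
    flex_without_restart: bool = False


def flex(component: Component, requested: int, service_running: bool) -> str:
    """Record the requested count and report how it would take effect."""
    if requested < 0:
        raise ValueError("number_of_containers must not be negative")
    # Always persist the desired count so a restart resumes at this size.
    component.number_of_containers = requested
    if not service_running:
        return "RECORDED"
    if component.flex_without_restart:
        return "APPLY_LIVE"
    return "APPLY_ON_RESTART"


regionserver = Component("regionserver", 3, flex_without_restart=True)
webapp = Component("webapp", 2)
print(flex(regionserver, 5, service_running=True))  # APPLY_LIVE
print(flex(webapp, 4, service_running=True))        # APPLY_ON_RESTART
```

Defaulting the flag to False keeps the conservative behavior (record now, apply on restart) for applications that have not been built to handle a live resize.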
> PUT method for update service for Service API doesn't function correctly
> ------------------------------------------------------------------------
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
> Issue Type: Task
> Components: api, applications
> Reporter: Eric Yang
> Assignee: Eric Yang
> Attachments: YARN-7217.yarn-native-services.001.patch,
> YARN-7217.yarn-native-services.002.patch
>
>
> The PUT method for updateService API provides multiple functions:
> # Stop a service.
> # Start a service.
> # Increase or decrease the number of containers.
> The overloading is buggy depending on how the configuration should be applied.
> Scenario 1
> A user retrieves the Service object from a getService call, and the Service
> object contains state: STARTED. The user would like to increase the number
> of containers for the deployed service. The JSON has been updated to
> increase the container count. The PUT method does not actually increase the
> container count.
> Scenario 2
> A user retrieves the Service object from a getService call, and the Service
> object contains state: STOPPED. The user would like to make an environment
> configuration change. The configuration does not get updated after the PUT
> method.
> It is possible to address this by rearranging the logic so that START/STOP
> happens after the configuration update. However, there are other potential
> combinations that can break the PUT method. For example, a user may want to
> make configuration changes but not restart the service until a later time.
> The alternative is to separate the PUT method into a PUT method for
> configuration vs status. This increases the number of actions that can be
> performed. The new API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/config
> Request data:
> {
>   "name": "[service_name]",
>   "number_of_containers": 5
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
>   "name": "[service_name]",
>   "state": "STOPPED|STARTED"
> }
> {code}
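
For illustration, a client for the two proposed endpoints quoted above might build its requests roughly as follows. This is only a sketch: the paths and JSON fields come from the proposal, while the base URL, the helper names, and the choice of Python's standard `urllib` are assumptions made for the example. The requests are constructed but not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumed base URL for illustration; the real host and port depend on the cluster.
BASE_URL = "http://localhost:8088"


def _put(path: str, body: dict) -> Request:
    """Build (but do not send) a JSON PUT request."""
    return Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def config_request(service_name: str, number_of_containers: int) -> Request:
    """Request for the proposed .../config endpoint (configuration only)."""
    body = {"name": service_name, "number_of_containers": number_of_containers}
    return _put(f"/ws/v1/services/{quote(service_name)}/config", body)


def state_request(service_name: str, state: str) -> Request:
    """Request for the proposed .../state endpoint (start/stop only)."""
    if state not in ("STOPPED", "STARTED"):
        raise ValueError("state must be STOPPED or STARTED")
    body = {"name": service_name, "state": state}
    return _put(f"/ws/v1/services/{quote(service_name)}/state", body)


req = config_request("my-service", 5)
print(req.get_method(), req.full_url)
```

Keeping the two builders separate mirrors the point of the proposal: a configuration change and a state change are distinct calls, so a user can update the configuration now and restart later.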