[
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208192#comment-16208192
]
Eric Yang commented on YARN-7217:
---------------------------------
[~gsaha] said:
{quote}
Are you saying that the API will not support flexing of no of containers when
the service is running?
{quote}
The flex operation was implemented sub-optimally in Slider. For example:
increase the node count for HBase region servers, stop the service, and then
resume the service. Should the service resume with the 4 nodes it was
initially started with, or with the 10 nodes it was later increased to? Slider
had a noble idea in offering a flex operation, but an operation that does not
store its configuration is an operational hazard because the operation
becomes non-repeatable.
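The hazard can be illustrated with a toy model. This is only a sketch, not
Slider's actual implementation: the ToyService class and all of its method
names are hypothetical. It contrasts a flex that changes only the running
container count with a change that is first written to the stored
configuration.

```python
from dataclasses import dataclass


@dataclass
class ToyService:
    """Hypothetical model of a service; not Slider or YARN code."""
    stored_count: int          # container count persisted in the service spec
    running_count: int = 0     # containers currently running

    def start(self) -> None:
        # A (re)start can only rely on what was persisted.
        self.running_count = self.stored_count

    def stop(self) -> None:
        self.running_count = 0

    def flex_without_store(self, count: int) -> None:
        # Changes the live service only; the stored spec is left untouched.
        self.running_count = count

    def update_config(self, count: int) -> None:
        # Persists the change; it takes effect on the next start.
        self.stored_count = count


def restart(service: ToyService) -> int:
    """Stop and start the service, returning the resulting container count."""
    service.stop()
    service.start()
    return service.running_count


# Flex only: the increase to 10 is not recorded, so a restart cannot see it.
flexed = ToyService(stored_count=4)
flexed.start()
flexed.flex_without_store(10)
count_after_flex_restart = restart(flexed)

# Config change first: the restart reproduces the requested size.
configured = ToyService(stored_count=4)
configured.start()
configured.update_config(10)
count_after_config_restart = restart(configured)
```

In this toy model the flexed service falls back to 4 containers, while the
configured one comes back with 10. Whether Slider resumed at 4 or 10 in
practice is exactly the ambiguity raised above; the sketch only shows why an
unrecorded change cannot be replayed.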
YARN-7216 proposes to decouple configuration changes from performing an
operation. This JIRA proposes to split the PUT method for updateService into
two calls, one for config changes and one for service operations. This will
provide more insight into the overall transaction and make it reproducible.
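A client-side sketch of the two-call split might look as follows. The /config
and /state paths and the JSON fields are taken from the proposal in the issue
description; the base URL, the helper names, and the service name hbase-demo
are assumptions for illustration only. The requests are built but not sent.

```python
import json
from urllib.request import Request

# Assumed API server address (hypothetical); replace with a real endpoint.
BASE_URL = "http://api.example.com/ws/v1/services"


def build_put(service_name: str, resource: str, payload: dict) -> Request:
    """Build (but do not send) a PUT request for one sub-resource."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/{service_name}/{resource}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def config_request(service_name: str, number_of_containers: int) -> Request:
    # Step 1: record the desired configuration.
    return build_put(service_name, "config", {
        "name": service_name,
        "number_of_containers": number_of_containers,
    })


def state_request(service_name: str, state: str) -> Request:
    # Step 2: perform the operation as a separate call.
    if state not in ("STOPPED", "STARTED"):
        raise ValueError(f"unsupported state: {state}")
    return build_put(service_name, "state", {
        "name": service_name,
        "state": state,
    })


config_req = config_request("hbase-demo", 10)
state_req = state_request("hbase-demo", "STARTED")
```

Against a server that implemented the proposed paths, each request could be
sent with urllib.request.urlopen. Keeping the two calls separate means the
stored configuration always records what the next start will apply.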
Do we need a flex operation that does not restart the service? The answer is
likely no. There are 3 services (DataNode, NodeManager, HBase region server)
that can add slave nodes without restarting masters. The majority of use cases
for changing node count will result in configuration changes and force a
restart. Most software follows this second model: make config changes, then
restart services. This ensures the change request is repeatable. Hence, the
need to support a flex operation without a configuration change would be
greatly reduced.
> PUT method for update service for Service API doesn't function correctly
> ------------------------------------------------------------------------
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
> Issue Type: Task
> Components: api, applications
> Reporter: Eric Yang
> Assignee: Eric Yang
> Attachments: YARN-7217.yarn-native-services.001.patch,
> YARN-7217.yarn-native-services.002.patch
>
>
> The PUT method for updateService API provides multiple functions:
> # Stop a service.
> # Start a service.
> # Increase or decrease number of containers.
> The overloading is buggy depending on how the configuration should be applied.
> Scenario 1
> A user retrieves Service object from getService call, and the Service object
> contains state: STARTED. The user would like to increase number of
> containers for the deployed service. The JSON has been updated to increase
> container count. The PUT method does not actually increase container count.
> Scenario 2
> A user retrieves Service object from getService call, and the Service object
> contains state: STOPPED. The user would like to make an environment
> configuration change. The configuration does not get updated after PUT
> method.
> This is possible to address by rearranging the logic of START/STOP after
> configuration update. However, there are other potential combinations that
> can break the PUT method. For example, a user may want to make configuration
> changes but not restart the service until a later time.
> The alternative is to separate the PUT method into a PUT method for
> configuration vs. status. This increases the number of actions that can be
> performed. The new API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/config
> Request Data:
> {
> "name":"[service_name]",
> "number_of_containers": 5
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
> "name": "[service_name]",
> "state": "STOPPED|STARTED"
> }
> {code}