[
https://issues.apache.org/jira/browse/YARN-9075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710477#comment-16710477
]
Billie Rinaldi commented on YARN-9075:
--------------------------------------
Thanks for taking a look, [~cheersyang]!
bq. 1. Does this allow us to set different configs for AUX services on
different nodes?
I have thought it would be useful to start different aux services on different
nodes. We could have a Service object specify which nodes to run on through its
PlacementPolicy (possibly a reduced capability placement policy). This will
allow us to have a single manifest, so each NM can read the manifest and
determine which aux services it should load. For the first patch, I was only
planning to support the same aux services on all nodes, as we do today, but I
agree it would be useful to add support for different services on different
nodes in a later patch. I may need to add sub-tasks to this ticket.
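As a rough illustration of the single-manifest idea above, the per-node
selection could look something like the sketch below. Everything in it is
hypothetical: AuxManifestSketch, AuxServiceEntry, allowedHosts and servicesFor
are made-up names, and a plain host set stands in for whatever reduced
PlacementPolicy might eventually be used. It only shows each NM filtering one
shared manifest down to the services it should load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: none of these types exist in YARN.
final class AuxManifestSketch {

  /** One manifest entry. An empty host set means "run on every node". */
  static final class AuxServiceEntry {
    final String name;
    final String version;
    final Set<String> allowedHosts; // stand-in for a reduced placement policy

    AuxServiceEntry(String name, String version, Set<String> allowedHosts) {
      this.name = name;
      this.version = version;
      this.allowedHosts = allowedHosts;
    }
  }

  /** Returns the names of the services that the NM on the given host should load. */
  static List<String> servicesFor(String host, List<AuxServiceEntry> manifest) {
    List<String> selected = new ArrayList<>();
    for (AuxServiceEntry entry : manifest) {
      // Unrestricted entries apply everywhere; restricted ones need a host match.
      if (entry.allowedHosts.isEmpty() || entry.allowedHosts.contains(host)) {
        selected.add(entry.name);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<AuxServiceEntry> manifest = List.of(
        new AuxServiceEntry("mapreduce_shuffle", "1", Set.of()),
        new AuxServiceEntry("spark_shuffle", "1", Set.of("nm1.example.com")));
    System.out.println(servicesFor("nm1.example.com", manifest));
    System.out.println(servicesFor("nm2.example.com", manifest));
  }
}
```

With the first patch's behavior (same aux services on all nodes), every entry
would simply have an empty host set.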
bq. 2. What happens if an AUX service fails to reload (during upgrade)? Would
that crash the NM process?
I think aux service init or start failures currently cause runtime exceptions
in the NM.
bq. 3. I guess this is not depending on HDFS, any implementation of FileSystem
would work?
Yes, we should be able to read the manifest from any FileSystem.
bq. 4. I think it needs to expose some query APIs via NMAdmin to check AUX
service status and general runtime info, e.g. resource usage, listening ports,
version, etc. Does this make sense?
I have considered adding an NM REST endpoint such as
/ws/v1/node/auxiliaryservices where the user could retrieve information about
the aux services that are currently running. I hadn't thought about adding
information beyond aux service name and version. Start time sounds like a good
candidate for inclusion. I'm not sure how we would get the resource usage and
port information when the aux services are running in-process.
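To make the name / version / start time idea concrete, here is a minimal
sketch of what such an endpoint's payload might carry. The class names, the
JSON field names ("services", "name", "version", "startTime") and the
hand-rolled serialization are all assumptions for illustration; nothing here is
an existing NM API or a settled response format.

```java
import java.util.List;

// Hypothetical sketch only: not an existing NodeManager class or response schema.
final class AuxServiceStatusSketch {

  /** The per-service fields discussed above: name, version and start time. */
  static final class AuxServiceInfo {
    final String name;
    final String version;
    final long startTimeMillis;

    AuxServiceInfo(String name, String version, long startTimeMillis) {
      this.name = name;
      this.version = version;
      this.startTimeMillis = startTimeMillis;
    }
  }

  /** Escapes only backslashes and double quotes, which is enough for simple names. */
  static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  /** Renders the running services as a compact JSON document. */
  static String toJson(List<AuxServiceInfo> services) {
    StringBuilder sb = new StringBuilder("{\"services\":[");
    for (int i = 0; i < services.size(); i++) {
      AuxServiceInfo s = services.get(i);
      if (i > 0) {
        sb.append(',');
      }
      sb.append("{\"name\":\"").append(escape(s.name))
          .append("\",\"version\":\"").append(escape(s.version))
          .append("\",\"startTime\":").append(s.startTimeMillis)
          .append('}');
    }
    return sb.append("]}").toString();
  }

  public static void main(String[] args) {
    System.out.println(toJson(List.of(
        new AuxServiceInfo("mapreduce_shuffle", "1", 1544000000000L))));
  }
}
```

Resource usage and port information are deliberately left out of the sketch,
for the reason given above.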
> Dynamically add or remove auxiliary services
> --------------------------------------------
>
> Key: YARN-9075
> URL: https://issues.apache.org/jira/browse/YARN-9075
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Reporter: Billie Rinaldi
> Assignee: Billie Rinaldi
> Priority: Major
> Attachments: YARN-9075_Dynamic_Aux_Services_V1.pdf
>
>
> It would be useful to support adding, removing, or updating auxiliary
> services without requiring a restart of NMs.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)