[ 
https://issues.apache.org/jira/browse/YARN-9075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710477#comment-16710477
 ] 

Billie Rinaldi commented on YARN-9075:
--------------------------------------

Thanks for taking a look, [~cheersyang]!

bq. 1. Does this allow us to set different configs for AUX services on 
different nodes?
I think it would be useful to start different aux services on different 
nodes. Each Service object could specify which nodes it runs on through its 
PlacementPolicy (possibly a reduced-capability placement policy). That would 
let us keep a single manifest: each NM reads the manifest and determines 
which aux services it should load. For the first patch, I was only planning 
to support the same aux services on all nodes, as we do today, but I agree it 
would be useful to add support for different services on different nodes in a 
later patch. I may need to add sub-tasks to this ticket.
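
To make the single-manifest idea concrete, here is a minimal sketch of how an 
NM might pick its aux services out of a shared manifest. Everything in it is 
an assumption for illustration: the AuxServiceSpec class, the allowedNodes 
field (a plain set of host names standing in for a real PlacementPolicy), the 
servicesForNode helper, and the host names are all hypothetical and are not 
taken from the patch.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class AuxServiceManifestSketch {

    /**
     * Hypothetical manifest entry. An empty allowedNodes set means
     * "run on every node", which matches today's behavior.
     */
    static final class AuxServiceSpec {
        final String name;
        final String version;
        final Set<String> allowedNodes;

        AuxServiceSpec(String name, String version, Set<String> allowedNodes) {
            this.name = name;
            this.version = version;
            this.allowedNodes = allowedNodes;
        }
    }

    /** Returns the names of the services that the given node should load. */
    static List<String> servicesForNode(List<AuxServiceSpec> manifest, String nodeHost) {
        return manifest.stream()
            .filter(s -> s.allowedNodes.isEmpty() || s.allowedNodes.contains(nodeHost))
            .map(s -> s.name)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative manifest: one service everywhere, one restricted to a single host.
        List<AuxServiceSpec> manifest = List.of(
            new AuxServiceSpec("mapreduce_shuffle", "1", Set.of()),
            new AuxServiceSpec("example_service", "1", Set.of("node1.example.com")));

        System.out.println(servicesForNode(manifest, "node1.example.com"));
        System.out.println(servicesForNode(manifest, "node2.example.com"));
    }
}
```

With this shape, every NM reads the same manifest and applies the same filter 
with its own host name, so no per-node configuration files are needed.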

bq. 2. What happens if an AUX service fails to reload (during upgrade)? Would 
that crash the NM process?
I think aux service init or start failures currently cause runtime exceptions 
in the NM. 

bq. 3. I guess this is not depending on HDFS, any implementation of FileSystem 
would work?
Yes, we should be able to read the manifest from any FileSystem.

bq. 4. I think it needs to expose some query APIs via NMAdmin to check AUX 
service status and general runtime info, e.g. resource usage, listening 
ports, version, etc. Does this make sense?
I have considered adding an NM REST endpoint such as 
/ws/v1/node/auxiliaryservices where the user could retrieve information about 
the aux services that are currently running. I hadn't thought about adding 
information beyond aux service name and version. Start time sounds like a good 
candidate for inclusion. I'm not sure how we would get the resource usage and 
port information when the aux services are running in-process. 
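
As a rough sketch of what such an endpoint might return, the example below 
models one entry per running aux service with the three fields discussed here 
(name, version, start time) and renders the list as JSON. The class names, 
the JSON field names, and the "services" wrapper are assumptions for 
illustration only; no response format has been settled.

```java
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

public class AuxServiceInfoSketch {

    /** Hypothetical per-service record: name, version, and start time only. */
    static final class AuxServiceInfo {
        final String name;
        final String version;
        final Instant startTime;

        AuxServiceInfo(String name, String version, Instant startTime) {
            this.name = name;
            this.version = version;
            this.startTime = startTime;
        }

        /** Renders one entry. No JSON escaping is done, so values must be plain text. */
        String toJson() {
            return String.format(
                "{\"name\":\"%s\",\"version\":\"%s\",\"startTime\":\"%s\"}",
                name, version, startTime);
        }
    }

    /** Renders the whole list under an assumed top-level "services" key. */
    static String toJson(List<AuxServiceInfo> services) {
        return services.stream()
            .map(AuxServiceInfo::toJson)
            .collect(Collectors.joining(",", "{\"services\":[", "]}"));
    }

    public static void main(String[] args) {
        List<AuxServiceInfo> running = List.of(
            new AuxServiceInfo("mapreduce_shuffle", "2", Instant.parse("2018-12-05T12:00:00Z")));
        System.out.println(toJson(running));
    }
}
```

Resource usage and port information are left out on purpose, since it is not 
clear how they would be collected for in-process services.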

> Dynamically add or remove auxiliary services
> --------------------------------------------
>
>                 Key: YARN-9075
>                 URL: https://issues.apache.org/jira/browse/YARN-9075
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>            Priority: Major
>         Attachments: YARN-9075_Dynamic_Aux_Services_V1.pdf
>
>
> It would be useful to support adding, removing, or updating auxiliary 
> services without requiring a restart of NMs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

