[
https://issues.apache.org/jira/browse/YARN-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797804#comment-13797804
]
Steve Loughran commented on YARN-1139:
--------------------------------------
# You don't need to convert any exceptions now, because the inner
{{serviceStart()/serviceStop()}} methods are declared to throw exceptions. Just
pass them up. The only reason the existing services didn't have their exception
catch/wrap logic changed as part of YARN-117 is that I didn't want to add extra
changes.
# AbstractService catches a failure and relays it to noteFailure(), which saves
away the first exception caught; {{getFailureCause()}} and
{{getFailureState()}} return that exception and the state when it happened.
# When an exception is caught during a state change, it triggers a
{{Service.stop()}} action, which is why stop is required to be a best-effort
operation that does what it can even when stopping a partially inited or
started service.
# AbstractService then calls {{ServiceStateException.convert(e);}} to convert
the exception into a RuntimeException; if it already is one it is left alone,
otherwise it is wrapped in a ServiceStateException.
# The composite service runs through its children, starting each one in turn.
The first one that fails by throwing a runtime exception will trigger the
noteFailure operation on the parent, then the composite service's stop()
operation, which walks back through all inited services (but not the
NOTINITED ones; things failed when we tried that), stopping them in turn.
What that means is that if a child service fails, the composite should pick
that up and save it as its own failure cause.
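
As a rough illustration of the sequence above, here is a minimal, self-contained
sketch. It does not use the Hadoop classes: {{SketchService}},
{{SketchComposite}} and {{SketchStateException}} are hypothetical stand-ins
invented for this example, and the real AbstractService/CompositeService do more
(locking, listeners, configuration). It only shows the shape of "note the first
failure, stop best-effort, convert, pass up".

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a lifecycle service; NOT Hadoop's AbstractService. Not thread-safe. */
class SketchService {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    /** Stand-in for the wrapper put around non-runtime exceptions. */
    static class SketchStateException extends RuntimeException {
        SketchStateException(Throwable cause) { super(cause); }
    }

    private State state = State.NOTINITED;
    private Throwable failureCause;   // only the first failure is kept
    private State failureState;

    State getState() { return state; }
    Throwable getFailureCause() { return failureCause; }
    State getFailureState() { return failureState; }

    // Hooks for subclasses: they may throw anything, no conversion needed inside them.
    protected void serviceInit() throws Exception { }
    protected void serviceStart() throws Exception { }
    protected void serviceStop() throws Exception { }

    // This sketch enters the new state first, then runs the hook.
    final void init() {
        state = State.INITED;
        try { serviceInit(); } catch (Exception e) { throw fail(e); }
    }

    final void start() {
        state = State.STARTED;
        try { serviceStart(); } catch (Exception e) { throw fail(e); }
    }

    final void stop() {
        if (state == State.STOPPED) return;   // stopping twice is a no-op
        state = State.STOPPED;
        try { serviceStop(); } catch (Exception e) { noteFailure(e); throw convert(e); }
    }

    /** Records the first failure and the state it happened in; later failures are ignored. */
    final void noteFailure(Exception e) {
        if (failureCause == null) { failureCause = e; failureState = state; }
    }

    /** RuntimeExceptions are returned untouched; anything else gets wrapped. */
    static RuntimeException convert(Throwable t) {
        return (t instanceof RuntimeException) ? (RuntimeException) t : new SketchStateException(t);
    }

    private RuntimeException fail(Exception e) {
        noteFailure(e);
        try { stop(); } catch (RuntimeException ignored) { /* best-effort stop */ }
        return convert(e);
    }
}

/** Hypothetical parent: starts children in order, stops inited/started ones in reverse. */
class SketchComposite extends SketchService {
    private final List<SketchService> children = new ArrayList<>();

    void addService(SketchService child) { children.add(child); }

    @Override protected void serviceInit() { for (SketchService c : children) c.init(); }

    // No try/catch: a child's RuntimeException simply escapes to the parent's start().
    @Override protected void serviceStart() { for (SketchService c : children) c.start(); }

    @Override protected void serviceStop() {
        for (int i = children.size() - 1; i >= 0; i--) {
            SketchService c = children.get(i);
            if (c.getState() == State.INITED || c.getState() == State.STARTED) {
                try { c.stop(); } catch (RuntimeException ignored) { /* keep stopping the rest */ }
            }
        }
    }
}

class CompositeFailureDemo {
    public static void main(String[] args) {
        SketchComposite parent = new SketchComposite();
        parent.addService(new SketchService());
        parent.addService(new SketchService() {
            @Override protected void serviceStart() throws Exception {
                throw new IOException("child could not start");
            }
        });
        parent.init();
        try {
            parent.start();
        } catch (RuntimeException e) {
            // The parent's failure cause is the very exception that came up from the child.
            System.out.println("same exception recorded: " + (parent.getFailureCause() == e));
        }
    }
}
```

Note that the parent never catches or re-wraps anything itself: the child's
already-converted exception passes through the parent's own convert step
unchanged, and that is what ends up as the parent's failure cause.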
I've actually done a couple more child-holding services for my own work, which
I'd happily push back into trunk/2.3
[https://github.com/hortonworks/hoya/tree/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service]
* The
[SequenceService|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service/SequenceService.java]
runs its children in sequence, failing when one fails
* The
[CompoundService|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service/CompoundService.java]
stops as soon as any one of its children fails, again propagating any faults up
These both implement a [Parent interface| Parent.java] so that they can be
treated uniformly, and allow other bits of the code to add children
Alongside that:
* [EventNotifyingService|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service/EventNotifyingService.java]:
sleeps, notifies a callback, stops
* [ForkedProcessService|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/service/ForkedProcessService.java]:
forks off a native process, stops when the process stops, kills the process
when it itself is stopped, and forwards up exceptions on a process failure
These let me build up more complex workflows like this one [to start
accumulo|https://github.com/hortonworks/hoya/blob/develop/hoya-core/src/main/java/org/apache/hadoop/hoya/providers/accumulo/AccumuloProviderService.java#L331],
which runs a sequence of "accumulo init" (if needed), followed by, in parallel,
"accumulo start" and a delayed event callback. That callback will, if accumulo
start hasn't failed in the meantime, trigger the request for containers for
whatever other accumulo roles have been added.
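
To make the shape of that workflow concrete, here is a small sketch using only
java.util.concurrent. It is not the Hoya code: {{WorkflowSketch}} and its
{{run}} method are hypothetical names made up for this example, and plain
Runnables stand in for the services. It assumes a failure of the start step
shows up as a RuntimeException on its thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of "init, then start in parallel with a delayed callback". */
final class WorkflowSketch {
    private WorkflowSketch() { }

    /**
     * @param initIfNeeded optional first step, run to completion before anything else; may be null
     * @param start        long-lived step, run on its own thread
     * @param delayMillis  how long to wait before firing the callback
     * @param callback     fired only if start has not failed during the delay
     * @return true if the callback ran, false if start failed (or the wait was interrupted) first
     */
    static boolean run(Runnable initIfNeeded, Runnable start,
                       long delayMillis, Runnable callback) {
        // Sequence step: a RuntimeException here propagates and aborts the workflow.
        if (initIfNeeded != null) {
            initIfNeeded.run();
        }

        // Parallel step: a failure of start trips the latch.
        CountDownLatch startFailed = new CountDownLatch(1);
        Thread starter = new Thread(() -> {
            try {
                start.run();
            } catch (RuntimeException e) {
                startFailed.countDown();
            }
        }, "start-step");
        starter.setDaemon(true);
        starter.start();

        try {
            // Wait out the delay, returning early only if start fails in the meantime.
            if (startFailed.await(delayMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        callback.run();
        return true;
    }
}
```

A call such as {{WorkflowSketch.run(initStep, startStep, 5000, requestContainers)}}
(all three Runnables being placeholders) would then only request containers if
the start step survived the first five seconds.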
Anyway, the services will catch, record, wrap and relay exceptions; the parents
just need to be able to handle the fact that it will be a RuntimeException that
comes back, and there is no need to catch and wrap it again if you want to pass
it upstream.
> [Umbrella] Convert all RM components to Services
> ------------------------------------------------
>
> Key: YARN-1139
> URL: https://issues.apache.org/jira/browse/YARN-1139
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 2.1.0-beta
> Reporter: Karthik Kambatla
> Assignee: Tsuyoshi OZAWA
>
> Some of the RM components - state store, scheduler etc. are not services.
> Converting them to services goes well with the "Always On" and "Active"
> service separation proposed on YARN-1098.
> Given that some of them already have start(), stop() methods, it should not
> be too hard to convert them to services.
> That would also be a cleaner way of addressing YARN-1125.
--
This message was sent by Atlassian JIRA
(v6.1#6144)