Hi Tom
Responses inline.
On Feb 11, 2011, at 6:34 PM, Tom White wrote:
> Hi David,
>
> This sounds like an interesting project. If I understand correctly,
> this would be a separate service, but would interact with other
> services to expand/contract them? So there'd be a core piece.
Well, the monitor and coordinator could be a single whirr service
started at boot by the cli, and one of the monitors would be selected as the
coordinator by leader election.
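To make the election idea concrete, here is a minimal in-memory sketch, assuming the usual "lowest sequence number wins" rule used in ZooKeeper-style election recipes. MonitorElection and its methods are hypothetical names for illustration only; nothing here is an existing Whirr or ZooKeeper API, and a real version would keep the membership in zk rather than in a local map.

```java
import java.util.Optional;
import java.util.TreeMap;

// In-memory sketch of "lowest sequence number wins" leader election.
// MonitorElection is a hypothetical name, not a Whirr or ZooKeeper class.
public class MonitorElection {
    // Sequence number -> node id, kept sorted so the first entry is the leader.
    private final TreeMap<Long, String> members = new TreeMap<>();
    private long nextSequence = 0;

    // A monitor joins and receives a monotonically increasing sequence number.
    public synchronized long join(String nodeId) {
        long sequence = nextSequence++;
        members.put(sequence, nodeId);
        return sequence;
    }

    // A monitor leaves (or its session is considered dead).
    public synchronized void leave(long sequence) {
        members.remove(sequence);
    }

    // The coordinator is the member holding the lowest sequence number.
    public synchronized Optional<String> coordinator() {
        if (members.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(members.firstEntry().getValue());
    }

    public static void main(String[] args) {
        MonitorElection election = new MonitorElection();
        long a = election.join("node-a");
        election.join("node-b");
        System.out.println(election.coordinator().get()); // node-a
        election.leave(a);
        System.out.println(election.coordinator().get()); // node-b
    }
}
```

The point of the sketch is only that the coordinator role is not tied to a fixed node: when the current coordinator leaves, the next monitor in sequence order takes over.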
> It's probably useful to implement manual expansion of clusters first
> (https://issues.apache.org/jira/browse/WHIRR-214), then add logic to
> (optionally) do it automatically. I don't think auto-scaling makes
> sense for all services, does it? (E.g. you probably don't want to
> automatically grow a ZK cluster.) Which service do you have in mind
> for auto-scaling?
Regarding manual expansion of the cluster (WHIRR-214), this would of
course be a requirement. I had planned to start work on it as per Andrei's
suggestion.
Of course auto-scaling would not make sense for all services (zk is one
where it would not), but I imagine it could be useful for hadoop/hbase/cassandra,
i.e. all the services that would benefit from WHIRR-214.
Cheers
David
> Cheers,
> Tom
>
> On Fri, Feb 11, 2011 at 6:22 AM, David Alves <[email protected]> wrote:
>> Hi
>>
>> I have the following requirement for my work, and I'd like to hear
>> opinions on the possible inclusion of such features in whirr.
>> I need an elastic scaling monitor and coordinator, i.e. a whirr
>> process that would be running on some or all of the nodes that:
>> - would collect load metrics (both generic and specific to each
>> application)
>> - would feed them through an elastic decision making engine (also
>> specific to each application as it depends on the specific metrics)
>> - would then act on those decisions by either expanding or
>> contracting the cluster.
>>
>> Some specifics:
>> - it need not be completely distributed, i.e. it can have a specific
>> assigned node that will monitor/coordinate but this node must not be fixed,
>> i.e. it could/should change if the previous coordinator leaves the cluster.
>> - each application would define the set of metrics that it emits and
>> use a local monitor process to feed them to the coordinator.
>> - the monitor process should emit some standard metrics (Disk I/O,
>> CPU Load, Net I/O, memory)
>> - the coordinator would have a pluggable decision engine policy also
>> defined by the application that would consume metrics and make a decision.
>> - whirr would take care of requesting/releasing nodes and
>> adding/removing them from the relevant services.
>>
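A rough sketch of how the metrics and the pluggable decision policy listed above could fit together, assuming metrics are plain name/value pairs and the coordinator simply averages them across nodes. Every name here (ScalingSketch, DecisionPolicy, CpuThresholdPolicy, the "cpu.load" key, the thresholds) is made up for illustration and is not part of Whirr.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: per-node metrics -> averaged view -> pluggable policy.
public class ScalingSketch {
    public enum Decision { EXPAND, CONTRACT, HOLD }

    // Each application would supply its own implementation.
    public interface DecisionPolicy {
        Decision decide(Map<String, Double> clusterMetrics);
    }

    // Example policy: scale on average CPU load only (assumed metric name).
    public static class CpuThresholdPolicy implements DecisionPolicy {
        private final double low;
        private final double high;

        public CpuThresholdPolicy(double low, double high) {
            this.low = low;
            this.high = high;
        }

        @Override
        public Decision decide(Map<String, Double> clusterMetrics) {
            Double cpu = clusterMetrics.get("cpu.load");
            if (cpu == null) {
                return Decision.HOLD; // no data, do nothing
            }
            if (cpu > high) {
                return Decision.EXPAND;
            }
            if (cpu < low) {
                return Decision.CONTRACT;
            }
            return Decision.HOLD;
        }
    }

    // Coordinator side: average each metric over the reports received from monitors.
    public static Map<String, Double> average(List<Map<String, Double>> reports) {
        Map<String, Double> sums = new HashMap<>();
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, Double> report : reports) {
            for (Map.Entry<String, Double> entry : report.entrySet()) {
                sums.merge(entry.getKey(), entry.getValue(), Double::sum);
                counts.merge(entry.getKey(), 1, Integer::sum);
            }
        }
        Map<String, Double> averages = new HashMap<>();
        for (Map.Entry<String, Double> entry : sums.entrySet()) {
            averages.put(entry.getKey(), entry.getValue() / counts.get(entry.getKey()));
        }
        return averages;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> reports = new ArrayList<>();
        reports.add(Collections.singletonMap("cpu.load", 0.9));
        reports.add(Collections.singletonMap("cpu.load", 0.8));
        DecisionPolicy policy = new CpuThresholdPolicy(0.25, 0.75);
        System.out.println(policy.decide(average(reports))); // EXPAND
    }
}
```

Acting on EXPAND or CONTRACT (requesting or releasing nodes) is deliberately left out, since that part would belong to whirr itself.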
>> Some implementation ideas:
>> - it could run on top of zookeeper. zk is already a requirement for
>> several services and would allow coordinator state to be stored reliably so
>> that another node can pick up if the previous coordinator leaves the cluster.
>> - it could use Avro to serialize/deserialize metrics data
>> - it should be optional, i.e. simply another service that the whirr
>> cli starts
>> - it would also be nice to have a monitor/coordinator web page that
>> would display metrics and view cluster status in an aggregated view.
>>
>> What do you think?
>> Is this something you envision as possible, and if yes are there any
>> use cases other than mine?
>> I'd, of course, be willing to do and contribute work on this.
>>
>> Cheers
>> David
>>
>>