Hi David,

This sounds like an interesting project. If I understand correctly, this would be a separate service, but would interact with other services to expand/contract them? So there'd be a core piece.

It's probably useful to implement manual expansion of clusters first (https://issues.apache.org/jira/browse/WHIRR-214), then add logic to (optionally) do it automatically. I don't think auto-scaling makes sense for all services, does it? (E.g. you probably don't want to automatically grow a ZK cluster.) Which service do you have in mind for auto-scaling?

Cheers,
Tom

On Fri, Feb 11, 2011 at 6:22 AM, David Alves <[email protected]> wrote:
> Hi
>
> I have the following requirement for my work, and I'd like to hear
> opinions on the possible inclusion of such features in whirr.
> I need an elastic scaling monitor and coordinator, i.e. a whirr
> process that would be running on some or all of the nodes that:
> - would collect load metrics (both generic and specific to each
> application)
> - would feed them through an elastic decision making engine (also
> specific to each application as it depends on the specific metrics)
> - would then act on those decisions by either expanding or contracting
> the cluster.
>
> Some specifics:
> - it need not be completely distributed, i.e. it can have a specific
> assigned node that will monitor/coordinate but this node must not be fixed,
> i.e. it could/should change if the previous coordinator leaves the cluster.
> - each application would define the set of metrics that it emits and
> use a local monitor process to feed them to the coordinator.
> - the monitor process should emit some standard metrics (Disk I/O, CPU
> Load, Net I/O, memory)
> - the coordinator would have a pluggable decision engine policy also
> defined by the application that would consume metrics and make a decision.
> - whirr would take care of requesting/releasing nodes and
> adding/removing them from the relevant services.
>
> Some implementation ideas:
> - it could run on top of zookeeper. zk is already a requirement for
> several services and would allow reliably storing coordinator state so that
> another node can pick up if the previous coordinator leaves the cluster.
> - it could use Avro to serialize/deserialize metrics data
> - it should be optional, i.e. simply another service that the whirr
> cli starts
> - it would also be nice to have a monitor/coordinator web page that
> would display metrics and cluster status in an aggregated view.
>
> What do you think?
> Is this something you envision as possible, and if yes are there any
> use cases other than mine?
> I'd, of course, be willing to do and contribute work on this.
>
> Cheers
> David
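
For illustration only, the "metrics in, pluggable policy, scaling decision out" flow described in the quoted proposal could be sketched roughly as below. This is not Whirr code: every name here (NodeMetrics, Action, cpu_threshold_policy, coordinate) is hypothetical, the thresholds are arbitrary, and the step where nodes are actually requested or released is left out. Coordinator failover and metric serialization are also out of scope.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean
from typing import Callable, Sequence


class Action(Enum):
    EXPAND = "expand"
    CONTRACT = "contract"
    HOLD = "hold"


@dataclass(frozen=True)
class NodeMetrics:
    """One sample from a node's local monitor (hypothetical fields)."""
    node_id: str
    cpu_load: float  # fraction of CPU in use, 0.0 to 1.0


# A pluggable policy is any callable from the latest samples to an Action.
DecisionPolicy = Callable[[Sequence[NodeMetrics]], Action]


def cpu_threshold_policy(low: float, high: float) -> DecisionPolicy:
    """Build a policy that decides on the cluster-wide mean CPU load."""
    def decide(samples: Sequence[NodeMetrics]) -> Action:
        if not samples:
            return Action.HOLD  # no data yet, so leave the cluster alone
        avg = mean(s.cpu_load for s in samples)
        if avg > high:
            return Action.EXPAND
        if avg < low:
            return Action.CONTRACT
        return Action.HOLD
    return decide


def coordinate(samples: Sequence[NodeMetrics],
               policy: DecisionPolicy,
               current_size: int,
               min_size: int = 1) -> int:
    """Return the target cluster size for this round (one node per step)."""
    action = policy(samples)
    if action is Action.EXPAND:
        return current_size + 1
    if action is Action.CONTRACT:
        return max(min_size, current_size - 1)  # never drop below min_size
    return current_size
```

Because the policy is just a callable, each application could supply its own one over its own metrics, while the coordinator only ever sees an Action. A service that should not auto-scale (the ZK case above) could simply use a policy that always returns Action.HOLD.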
