+1, sounds good.

Tom

On Fri, Feb 11, 2011 at 10:50 AM, David Alves <[email protected]> wrote:
> Hi Tom
>
> Responses inline.
>
> On Feb 11, 2011, at 6:34 PM, Tom White wrote:
>
>> Hi David,
>>
>> This sounds like an interesting project. If I understand correctly,
>> this would be a separate service, but would interact with other
>> services to expand/contract them? So there'd be a core piece.
>
> Well, the monitor and coordinator could be a single whirr service
> started at boot by the cli, and one of the monitors would be selected as
> the coordinator by leader election.
>
>> It's probably useful to implement manual expansion of clusters first
>> (https://issues.apache.org/jira/browse/WHIRR-214), then add logic to
>> (optionally) do it automatically. I don't think auto-scaling makes
>> sense for all services, does it? (E.g. you probably don't want to
>> automatically grow a ZK cluster.) Which service do you have in mind
>> for auto-scaling?
>
> Regarding manual expansion of the cluster (WHIRR-214), this would of
> course be a requirement. I had planned to start work on this as per
> Andrei's suggestion.
> Of course auto-scaling would not make sense for all services (zk is
> one of them), but I imagine it would/could be useful for
> hadoop/hbase/cassandra, i.e. all the services that would benefit from
> WHIRR-214.
>
> Cheers
> David
>
>> Cheers,
>> Tom
>>
>> On Fri, Feb 11, 2011 at 6:22 AM, David Alves <[email protected]> wrote:
>>> Hi
>>>
>>> I have the following requirement for my work, and I'd like to hear
>>> opinions on the possible inclusion of such features in whirr.
>>> I need an elastic scaling monitor and coordinator, i.e. a whirr
>>> process that would be running on some or all of the nodes that:
>>> - would collect load metrics (both generic and specific to each
>>> application)
>>> - would feed them through an elastic decision making engine (also
>>> specific to each application as it depends on the specific metrics)
>>> - would then act on those decisions by either expanding or
>>> contracting the cluster.
>>>
>>> Some specifics:
>>> - it need not be completely distributed, i.e. it can have a specific
>>> assigned node that will monitor/coordinate, but this node must not be
>>> fixed, i.e. it could/should change if the previous coordinator leaves
>>> the cluster.
>>> - each application would define the set of metrics that it emits and
>>> use a local monitor process to feed them to the coordinator.
>>> - the monitor process should emit some standard metrics (Disk I/O,
>>> CPU Load, Net I/O, memory)
>>> - the coordinator would have a pluggable decision engine policy, also
>>> defined by the application, that would consume metrics and make a
>>> decision.
>>> - whirr would take care of requesting/releasing nodes and
>>> adding/removing them from the relevant services.
>>>
>>> Some implementation ideas:
>>> - it could run on top of zookeeper. zk is already a requirement for
>>> several services and would allow reliably storing coordinator state so
>>> that another node can pick up if the previous coordinator leaves the
>>> cluster.
>>> - it could use Avro to serialize/deserialize metrics data
>>> - it should be optional, i.e. simply another service that the whirr
>>> cli starts
>>> - it would also be nice to have a monitor/coordinator web page that
>>> would display metrics and cluster status in an aggregated view.
>>>
>>> What do you think?
>>> Is this something you envision as possible, and if yes are there any
>>> use cases other than mine?
>>> I'd, of course, be willing to do and contribute work on this.
>>>
>>> Cheers
>>> David
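
For illustration only, the pluggable decision engine from the original proposal (monitors report metrics, an application-defined policy turns them into an expand/contract decision) might be sketched roughly as below. This is not Whirr code: every name here (Action, MetricSample, DecisionPolicy, CpuThresholdPolicy) and the 0.80/0.20 thresholds are hypothetical, and leader election, ZooKeeper state and Avro serialization are left out entirely.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean
from typing import Protocol, Sequence


class Action(Enum):
    """What the coordinator would ask whirr to do next (hypothetical)."""
    EXPAND = "expand"
    CONTRACT = "contract"
    HOLD = "hold"


@dataclass(frozen=True)
class MetricSample:
    """One report from a node-local monitor (hypothetical shape)."""
    node: str
    cpu_load: float     # fraction of CPU in use, 0.0 to 1.0
    memory_used: float  # fraction of memory in use, 0.0 to 1.0


class DecisionPolicy(Protocol):
    """Application-defined policy that the coordinator would call."""

    def decide(self, samples: Sequence[MetricSample]) -> Action: ...


@dataclass
class CpuThresholdPolicy:
    """Toy policy: scale on the cluster-wide average CPU load."""
    high: float = 0.80  # assumed threshold, not from the thread
    low: float = 0.20   # assumed threshold, not from the thread

    def decide(self, samples: Sequence[MetricSample]) -> Action:
        if not samples:
            return Action.HOLD  # no data, so make no change
        avg = mean(s.cpu_load for s in samples)
        if avg > self.high:
            return Action.EXPAND
        if avg < self.low:
            return Action.CONTRACT
        return Action.HOLD


policy = CpuThresholdPolicy()
busy = [MetricSample("node-1", 0.95, 0.6), MetricSample("node-2", 0.90, 0.7)]
decision = policy.decide(busy)  # average load 0.925 is above the high mark
```

A service that should never be scaled automatically (ZK, as noted above) would simply not register a policy, or would use one that always returns HOLD.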
