Thanks Joshua.

mahadev
On 1/13/09 10:43 AM, "Joshua Tuberville" <joshuatubervi...@eharmony.com> wrote:
> Thanks to everyone for the proposed schemes. I created ZOOKEEPER-272 per your
> request, Mahadev.
>
> Joshua
>
>
> -----Original Message-----
> From: Mahadev Konar [mailto:maha...@yahoo-inc.com]
> Sent: Monday, January 12, 2009 7:04 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: Maximum number of children
>
> I was going to suggest bucketing with predefined hashes.
> /root/template/date/hashbucket/hash
>
> For the issue raised by Joshua regarding the length of the output from the
> server:
> This is a bug. We seem to allow any number of children (< int) of a node, and
> the getChildren call then fails to return the children. This leads to a
> chicken-and-egg problem: how do you get rid of the nodes if you cannot list
> them?
>
> Here we aren't saving anything, since the server has already processed the
> request and sent us the data. We should get rid of this hard-coded limit. I
> am not sure why we had this limit.
>
> Can you open a jira for this, Joshua?
>
> thanks
> mahadev
>
>
> On 1/12/09 5:39 PM, "Stu Hood" <stuh...@mailtrust.com> wrote:
>
>> To continue with your current design, you could create a trie based on shared
>> hash prefixes.
>>
>> /root/template/date/1a5e67/2b45dc
>> /root/template/date/1a5e67/3d4a1f
>> /root/template/date/3d4a1f/1a5e67
>> /root/template/date/3d4a1f/2b45dc
>>
>> Alternatively, you could use what the maildir mail storage format uses:
>> /root/template/date/eh/eharmony.com/jo/joshuatuberville
>>
>> With the second one, just check that all of the characters you support in
>> email addresses are also supported in znode names.
>>
>> Thanks,
>> Stu
>>
>>
>> -----Original Message-----
>> From: "Joshua Tuberville" <joshuatubervi...@eharmony.com>
>> Sent: Monday, January 12, 2009 7:53pm
>> To: "'zookeeper-user@hadoop.apache.org'" <zookeeper-user@hadoop.apache.org>
>> Subject: Maximum number of children
>>
>> Hello,
>>
>> We are attempting to use ZooKeeper to coordinate daily email thresholds. To
>> do this we created a node hierarchy of
>>
>> /root/template/date/email_hash
>>
>> The idea is that we only send the template to an email address once per
>> day. This is intended to support millions of email hashes per day. From the
>> ZooKeeper perspective, we just attempt a create: if it succeeds we proceed,
>> and if we get a node-exists exception we stop processing. This has worked
>> fine for over 2 million email hashes so far in testing. However, we
>> also want to prune all previous days' nodes to conserve memory. We have run
>> into a hard limit while using the getChildren method for a given
>> /root/template/date. If the List of children exceeds the hardcoded 4,194,304
>> byte limit, ClientCnxn$SendThread.readLength() throws an exception on line
>> 490.
>> So we are stuck: we cannot delete a node that has children, and we cannot
>> list (and therefore delete) the children of a node whose child names total
>> more than 4 MB.
>>
>> Any feedback or guidance is appreciated.
>>
>> Joshua Tuberville
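
For illustration only, here is a minimal Python sketch of the hash-prefix bucketing suggested in the thread. It only computes znode paths and does not talk to ZooKeeper. The function names, the choice of SHA-1, the two-character bucket prefix, and the 4-byte per-name overhead used in the size estimate are all assumptions made for this example; none of them comes from the thread or from the ZooKeeper client.

```python
import hashlib

# The hardcoded response limit quoted in the thread.
MAX_RESPONSE_BYTES = 4_194_304


def bucketed_path(root, template, date, email, prefix_len=2):
    """Build /root/template/date/<bucket>/<hash> for one email address.

    Hypothetical helper: SHA-1 and a 2-character hex bucket are assumptions.
    """
    digest = hashlib.sha1(email.strip().lower().encode("utf-8")).hexdigest()
    bucket = digest[:prefix_len]  # shared hash prefix becomes the bucket znode
    return "/".join(["", root, template, date, bucket, digest])


def max_children(name_len, per_name_overhead=4):
    """Rough count of child names of a given length that fit under the limit.

    per_name_overhead is an assumed serialization cost per name, not a
    documented figure.
    """
    return MAX_RESPONSE_BYTES // (name_len + per_name_overhead)


if __name__ == "__main__":
    print(bucketed_path("root", "template", "2009-01-12", "someone@example.com"))
    print(max_children(40))
```

With a two-character hex prefix there are 256 buckets per date, so a few million hashes per day would leave each bucket with child lists in the tens of thousands, which under the assumed overhead stays below the limit. Pruning a previous day would then mean listing the buckets first, then the children of each bucket.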