Hello,

I have a custom crawler project that consists of four subsystems:

a. Links discovery (5+ EC2 instances)
b. Web crawlers (20+ EC2 instances)
c. Links DB (MongoDB x 1)
d. Contents DB (MongoDB x 1)


1. The web crawlers (b) fetch links in batches from the Links DB (c),
crawl them, and save the results into the Contents DB (d).
2. Links discovery (a) fetches content in batches from the Contents DB
(d), analyzes it, and saves the links of interest into the Links DB (c).
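
To make the data flow concrete, here is a minimal sketch of the two
steps above. It is not our real code: in-memory deques stand in for the
two MongoDB instances, and every name in it (links_db, contents_db,
take_batch, crawl_step, discovery_step, fetch, extract_links) is made
up for illustration.

```python
from collections import deque

# Stand-ins for the two databases (assumption: the real ones are MongoDB).
links_db = deque()     # Links DB (c): URLs waiting to be crawled
contents_db = deque()  # Contents DB (d): (url, body) pairs waiting for analysis


def take_batch(store, size):
    """Remove and return up to `size` items from the front of `store`."""
    batch = []
    while store and len(batch) < size:
        batch.append(store.popleft())
    return batch


def crawl_step(fetch, batch_size=10):
    """Step 1: a crawler (b) takes links from (c) and stores results in (d)."""
    for url in take_batch(links_db, batch_size):
        contents_db.append((url, fetch(url)))


def discovery_step(extract_links, batch_size=10):
    """Step 2: links discovery (a) takes content from (d) and feeds (c)."""
    for url, body in take_batch(contents_db, batch_size):
        links_db.extend(extract_links(url, body))
```

The point of the sketch is that (c) and (d) are used here purely as
work queues between (a) and (b): each item is taken once by one
consumer and then handed to the other stage.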


The current system works fine, but I want to explore a more scalable
design so that (c) and (d) do not become the bottleneck in the future.

It seems that both (c) and (d) could be replaced by 0mq?

I haven't done any serious testing yet. Do experienced users have any
recommendations?


Thanks.
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
