When I run my recursive script, I expect it to traverse every link within our rules to four levels by default.  As the Spider works through our website it collects a great number of links: level 1 = 1, level 2 = 27, level 3 = 475, level 4 = 2670.  It gets through everything fine until it reaches the 129th of the 2670 links on level 4, at which point it throws the exception below.  The information is baffling because it doesn't seem like the script should fail there, yet the failure is completely consistent: it happens in the same place every time, so I haven't ruled out a website issue.  Please read through the stack trace and my comments afterward before drawing any conclusions:
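For context, the spider is essentially shaped like this (a simplified sketch, not the actual code; the link-filtering rules are omitted and the names are mine):

    require 'watir'

    class Spider
      def initialize(max_depth = 4)
        @max_depth = max_depth
        @visited   = {}
        @ie        = Watir::IE.new
      end

      def crawl(url, depth = 1)
        return if depth > @max_depth || @visited[url]
        @visited[url] = true
        @ie.goto(url)
        hrefs = @ie.links.map { |link| link.href }  # snapshot hrefs before navigating away
        hrefs.each { |href| crawl(href, depth + 1) }
      end
    end

    Spider.new.crawl('http://www.example.com/')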

So it's still a complete mystery, but the suggestive detail is that a single IE process always dies on exactly the 2800th link.  I googled for "internet explorer" + 2800 but had no success.
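If the limit really is per IE process, one workaround might be to recycle the browser before it gets there.  A rough, untested sketch (RECYCLE_AT is just a guess, set safely below 2800):

    require 'watir'

    RECYCLE_AT = 2500  # guess; safely below the observed 2800-link death

    @ie    = Watir::IE.new
    @count = 0

    def visit(url)
      if @count >= RECYCLE_AT
        @ie.close            # drop the aging IE process...
        @ie = Watir::IE.new  # ...and carry on in a fresh one
        @count = 0
      end
      @ie.goto(url)
      @count += 1
    end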

That's a lot of links, and a round number.  I wonder if there's some protective mechanism in IE itself at work.  

Maybe you could farm the work out among several scripts, each talking to its own IE process?
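Something along these lines, say: a parent script splits the pending URLs into chunks and spawns one worker script per chunk, so no single IE instance ever gets near the failure point (untested, and the file and script names below are made up):

    WORKERS = 4  # number of IE processes to spread the work across

    urls   = File.readlines('level4_urls.txt').map { |line| line.chomp }
    chunks = urls.each_slice((urls.size / WORKERS.to_f).ceil).to_a

    pids = chunks.each_with_index.map do |chunk, i|
      File.open("chunk#{i}.txt", 'w') { |f| f.puts(chunk) }
      # worker.rb would open its own Watir::IE and goto each URL in its chunk file
      Process.spawn('ruby', 'worker.rb', "chunk#{i}.txt")
    end
    pids.each { |pid| Process.wait(pid) }

Running the workers as separate OS processes would also sidestep the WIN32OLE threading headaches you would likely hit trying to drive several IE windows from one Ruby script.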

