---------- Forwarded message ----------
From: ramesh kumar <[email protected]>
Date: 9 March 2011 13:27
Subject: RE: Reg. Research using Wikipedia
To: [email protected]

Dear Mr. Gerard,

Thanks for your instant response. But is there a minimum time gap to leave between one request and the next, for example 1 sec or 1 millisecond? If so, I can add a sleep in my program.

At the same time, I have 3.1 million (3101144) wiki article titles. If I set 1 sec between requests, then for 1 day it is 60 (sec) * 60 (min) * 24 (hr) = 86400 / 2 = 43200 requests per day (considering 1 sec sleep between one request and the next), and 3101144 / 43200 is about 71 days. So I expect the program to take 71 days to finish all 3.1 million article titles.

Is there any way our university IP address could be given permission, for example by our department head sending an official email to the Wikipedia server administrators, so that the program I run from this particular IP address is not treated as an attack? Then our administrator would allow us to make faster requests, such as one every 0.5 sec, and I could finish my experiment within 35 days.

Expecting your positive reply.

Regards,
Ramesh

----------------------------------------------------------------------
> Date: Wed, 9 Mar 2011 10:39:43 +0000
> Subject: Re: Reg. Research using Wikipedia
> From: [email protected]
> To: [email protected]
>
> I asked the wikitech-l list, which is where the system administrators
> talk, and they said:
>
> "If they use the API and wait for one request to finish before they
> start the next one (i.e. don't make parallel requests), that's pretty
> much always fine."
>
> http://lists.wikimedia.org/pipermail/wikitech-l/2011-March/052137.html
>
> Hopefully this will put your network administrators' minds at rest :-)
>
> - d.
>
> On 9 March 2011 09:47, ramesh kumar <[email protected]> wrote:
> > Dear Members,
> > I am Ramesh, pursuing my PhD in Monash University, Malaysia. My Research is
> > on blog classification using Wikipedia Categories.
> > As for my experiment, I use 12 main categories of Wikipedia.
> > I want to identify " which particular article belongs to which main 12
> > categories?".
> > So I wrote a program to collect the subcategories of each article and
> > classify based on 12 categories offline.
> > I have downloaded already wiki-dump which consists of around 3 million
> > article titles.
> > My program takes this 3 million article titles and goes to
> > online Wikipedia website and fetch the subcategories.
> > Our university network administrators are worried that, Wikipedia would
> > consider as DDOS attack and could block our IP address, if my program
> > functions.
> > In order to get permission from Wikipedia, I was searching allover. I could
> > able to find wikien-l members can help me.
> > Could you please suggest me, whom to contact, what is the procedure to get
> > approval for our IP address to do the process or other suggestions
> > Eagerly waiting for a positive reply
> > Thanks and Regards
> > Ramesh

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
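
For illustration only, the advice quoted in this thread (use the API, and wait for each request to finish before starting the next) together with the time estimate in the message above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the program discussed in the thread: the endpoint URL, the User-Agent string, the helper names and the 1 sec pause are all assumptions, and result continuation for articles with very many categories is left out. The figure of 43200 requests per day corresponds to about 2 sec per request (request plus sleep), which is what the estimate helper uses.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint and a hypothetical User-Agent; replace with real contact details.
API_URL = "https://en.wikipedia.org/w/api.php"
USER_AGENT = "CategoryResearchSketch/0.1 (contact: researcher@example.org)"


def build_query_url(title):
    """Build an API URL asking for the categories of one article title."""
    params = {
        "action": "query",
        "prop": "categories",
        "titles": title,
        "cllimit": "max",
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def estimated_days(n_titles, seconds_per_request):
    """Days needed when requests run strictly one after another."""
    return n_titles * seconds_per_request / 86400


def fetch_categories(titles, pause=1.0):
    """Fetch categories one title at a time, never in parallel.

    Each request finishes before the next one starts, and the loop
    sleeps for `pause` seconds in between (the pause is an assumption).
    Continuation of long category lists is not handled here.
    """
    results = {}
    for title in titles:
        request = urllib.request.Request(
            build_query_url(title), headers={"User-Agent": USER_AGENT}
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        for page in data.get("query", {}).get("pages", {}).values():
            results[page.get("title", title)] = [
                c["title"] for c in page.get("categories", [])
            ]
        time.sleep(pause)
    return results


if __name__ == "__main__":
    # About 2 sec per request (1 sec request + 1 sec sleep) for 3101144 titles.
    print(round(estimated_days(3101144, 2.0), 1))
```

Calling fetch_categories(["Albert Einstein"]) would make a single live request; the estimate helper needs no network access and can be used to compare different pauses before starting a long run.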
