---------- Forwarded message ----------
From: ramesh kumar <[email protected]>
Date: 9 March 2011 13:27
Subject: RE: Reg. Research using Wikipedia
To: [email protected]


Dear Mr. Gerard,
Thanks for your prompt response.
Is there a minimum time gap I should leave between one request and the next,
for example 1 second or 1 millisecond?
If so, I can add a sleep to my program. At the same time, I have 3.1
million (3101144) wiki article titles.
If I sleep for 1 second between requests, then in one day there are
60 (sec) * 60 (min) * 24 (hr) = 86400 seconds, and 86400 / 2 = 43200
requests per day (considering 1 second of sleep between one request
and the next).
3101144 / 43200 = 71.8, so about 72 days.
So I expect the program to take about 72 days to finish all 3.1 million
article titles.
Is there any way our university IP address could be given permission, for
example by sending an official email from our department head to the
Wikipedia server administrators, so that the program I run from this
particular IP address is not considered an attack? Our administrator
would then allow us to send requests faster, for example with a 0.5
second sleep, so that I could finish my experiment in about 36 days.
Expecting your positive reply.
Regards,
Ramesh
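
As a rough check of the estimate above, here is a minimal sketch of the arithmetic. It assumes that each request itself takes about 1 second, which is what the division by 2 in the message implies; the function name days_needed and both timing parameters are illustrative, not measured values.

```python
SECONDS_PER_DAY = 60 * 60 * 24  # 86400


def days_needed(n_titles: int, request_s: float, sleep_s: float) -> float:
    """Days needed to process n_titles strictly one after another.

    request_s is an ASSUMED average duration of one request;
    sleep_s is the pause inserted after each request.
    """
    seconds_per_title = request_s + sleep_s
    requests_per_day = SECONDS_PER_DAY / seconds_per_title
    return n_titles / requests_per_day


if __name__ == "__main__":
    titles = 3101144
    # 1 s request + 1 s sleep: the scenario in the message (43200 requests/day)
    print(round(days_needed(titles, 1.0, 1.0), 1))
    # 1 s request + 0.5 s sleep: only the sleep is shortened
    print(round(days_needed(titles, 1.0, 0.5), 1))
```

Under these assumptions, shortening only the sleep from 1 second to 0.5 seconds does not halve the total time, because the request duration stays the same. Halving the run would require the whole request-plus-sleep cycle to drop to about 1 second.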

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Date: Wed, 9 Mar 2011 10:39:43 +0000
> Subject: Re: Reg. Research using Wikipedia
> From: [email protected]
> To: [email protected]
>
> I asked the wikitech-l list, which is where the system administrators
> talk, and they said:
>
> "If they use the API and wait for one request to finish before they
> start the next one (i.e. don't make parallel requests), that's pretty
> much always fine."
>
> http://lists.wikimedia.org/pipermail/wikitech-l/2011-March/052137.html
>
> Hopefully this will put your network administrators' minds at rest :-)
>
>
> - d.
>
>
>
> On 9 March 2011 09:47, ramesh kumar <[email protected]> wrote:
> > Dear Members,
> > I am Ramesh, pursuing my PhD at Monash University, Malaysia. My research is
> > on blog classification using Wikipedia categories.
> > For my experiment, I use 12 main categories of Wikipedia.
> > I want to identify which of these 12 main categories each article
> > belongs to.
> > So I wrote a program to collect the subcategories of each article and
> > classify them against the 12 categories offline.
> > I have already downloaded a wiki dump, which contains around 3 million
> > article titles.
> > My program takes these 3 million article titles, goes to the
> > online Wikipedia website and fetches the subcategories.
> > Our university network administrators are worried that Wikipedia would
> > treat this as a DDoS attack and could block our IP address if my program
> > runs.
> > I searched all over for how to get permission from Wikipedia, and found
> > that wikien-l members may be able to help me.
> > Could you please suggest whom to contact, and what the procedure is to get
> > approval for our IP address to do this, or offer other suggestions?
> > Eagerly waiting for a positive reply.
> > Thanks and regards,
> > Ramesh
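
The advice quoted above (use the API, and wait for one request to finish before starting the next) could be followed along these lines. This is only a sketch under stated assumptions: it uses the MediaWiki API's action=query with prop=categories, sends up to 50 titles per request, and ignores result continuation, so articles with long category lists may come back incomplete. The names fetch_categories, parse_categories and http_get_json, the User-Agent string and the contact address are all placeholders invented for this example.

```python
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"
# Placeholder: identify the script and give a real contact address.
USER_AGENT = "CategoryResearchScript/0.1 (contact: you@example.org)"


def http_get_json(params):
    """Send one GET request to the API and return the decoded JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def parse_categories(data):
    """Map each returned page title to its list of category titles."""
    pages = data.get("query", {}).get("pages", {})
    return {
        page["title"]: [c["title"] for c in page.get("categories", [])]
        for page in pages.values()
    }


def fetch_categories(titles, get_json=http_get_json, delay_s=1.0, batch_size=50):
    """Fetch categories batch by batch, strictly one request at a time."""
    result = {}
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        params = {
            "action": "query",
            "prop": "categories",
            "titles": "|".join(batch),
            "cllimit": "max",
            "format": "json",
        }
        # The next request starts only after this one has returned.
        result.update(parse_categories(get_json(params)))
        time.sleep(delay_s)
    return result
```

Calling fetch_categories(["Malaysia", "Blog"]) would send a single request for both titles. Batching titles this way reduces the number of requests far more than shortening the sleep does, while still never running requests in parallel.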

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
