Dear Ben Sidi Ahmed,
DBpedia might have all the data you need, already extracted (also in 
over 80 languages):
http://wiki.dbpedia.org/Downloads37

Here are the first two sentences of each article in a structured format:
http://downloads.dbpedia.org/3.7/en/short_abstracts_en.nt.bz2
Here are the long abstracts (usually the article's whole first section):
http://downloads.dbpedia.org/3.7/en/long_abstracts_en.nt.bz2
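
Since you are working in Java, here is a minimal sketch of how one line of 
these dumps could be read. It assumes the files are N-Triples (as the .nt 
extension suggests), one triple per line, and that you have decompressed 
them first, since the JDK has no built-in bzip2 reader. The class name, 
the method names and the sample line are made up for illustration; they 
are not part of DBpedia.

```java
/** Sketch: pull subject, text and language tag out of one N-Triples line. */
class AbstractLineParser {

    /** Returns {subject, text, lang}, or null if the line has no quoted literal. */
    static String[] parse(String line) {
        int subjectEnd = line.indexOf('>');
        int literalStart = line.indexOf('"');
        if (!line.startsWith("<") || subjectEnd < 0 || literalStart < subjectEnd) {
            return null;
        }
        // Walk to the closing quote, stepping over backslash-escaped characters.
        int i = literalStart + 1;
        while (i < line.length() && line.charAt(i) != '"') {
            i += (line.charAt(i) == '\\') ? 2 : 1;
        }
        if (i >= line.length()) {
            return null; // unterminated literal
        }
        String text = unescape(line.substring(literalStart + 1, i));
        String rest = line.substring(i + 1).trim(); // e.g. "@en ."
        String lang = "";
        if (rest.startsWith("@")) {
            int end = 1;
            while (end < rest.length()
                    && (Character.isLetterOrDigit(rest.charAt(end)) || rest.charAt(end) == '-')) {
                end++;
            }
            lang = rest.substring(1, end);
        }
        return new String[] { line.substring(1, subjectEnd), text, lang };
    }

    /** Decodes backslash escapes: quote, backslash, n, r, t, and 4- or 8-digit hex code points. */
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 >= s.length()) {
                out.append(c);
                continue;
            }
            char e = s.charAt(++i);
            switch (e) {
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 't': out.append('\t'); break;
                case 'u': // 4 hex digits follow; malformed input will throw here
                    out.appendCodePoint(Integer.parseInt(s.substring(i + 1, i + 5), 16));
                    i += 4;
                    break;
                case 'U': // 8 hex digits follow
                    out.appendCodePoint(Integer.parseInt(s.substring(i + 1, i + 9), 16));
                    i += 8;
                    break;
                default: out.append(e); // escaped quote or backslash
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample line in the assumed format, not copied from the dump.
        String sample = "<http://dbpedia.org/resource/London> "
                + "<http://www.w3.org/2000/01/rdf-schema#comment> "
                + "\"London is a \\\"sample\\\" abstract.\"@en .";
        String[] parts = parse(sample);
        System.out.println(parts[0] + " [" + parts[2] + "]: " + parts[1]);
    }
}
```

To process a whole file you would call parse() on every line, for example 
from a BufferedReader loop, and skip the lines for which it returns null.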

If you only need them for single articles, you can also query DBpedia 
directly with SPARQL:
The first two sentences for London (all languages):
http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FLondon%3E+rdfs%3Acomment+%3Fshort_abstract+.%0D%0A}

The first two sentences for London (English only):
http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FLondon%3E+rdfs%3Acomment+%3Fshort_abstract+.%0D%0AFILTER+%28lang%28%3Fshort_abstract%29%3D%22en%22%29%0D%0A}

All short abstracts that contain the keyword "London":
http://dbpedia.org/snorql/?query=SELECT+*+WHERE+{%0D%0A%3Fs+rdfs%3Acomment+%3Fshort_abstract+.%0D%0AFILTER+%28lang%28%3Fshort_abstract%29%3D%22en%22%29%0D%0AFILTER+%28bif%3Acontains%28%3Fshort_abstract%2C+%22London%22%29+%29%0D%0A}
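
From Java you would probably not go through the SNORQL web form but build 
such a query URL yourself. Below is a rough sketch of that. It assumes the 
SPARQL endpoint at http://dbpedia.org/sparql accepts the query in a "query" 
parameter (the links above use the SNORQL front end instead), and the class 
and method names are only illustrative. The queries themselves are the same 
as in the links above, with the rdfs prefix written out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch: build request URLs for the short-abstract queries shown above. */
class DbpediaQueryUrl {

    // Assumption: public SPARQL endpoint, not taken from the links above.
    static final String ENDPOINT = "http://dbpedia.org/sparql";

    static final String PREFIX =
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n";

    /** Short abstract of one resource in one language, e.g. ("London", "en"). */
    static String shortAbstractQuery(String resourceName, String lang) {
        // resourceName and lang are inserted verbatim; validate them in real code.
        return PREFIX
                + "SELECT ?short_abstract WHERE {\n"
                + "  <http://dbpedia.org/resource/" + resourceName + "> rdfs:comment ?short_abstract .\n"
                + "  FILTER (lang(?short_abstract) = \"" + lang + "\")\n"
                + "}";
    }

    /** English short abstracts containing a single keyword, as in the last link. */
    static String keywordQuery(String keyword) {
        // bif:contains is the full-text function used in the link above.
        return PREFIX
                + "SELECT * WHERE {\n"
                + "  ?s rdfs:comment ?short_abstract .\n"
                + "  FILTER (lang(?short_abstract) = \"en\")\n"
                + "  FILTER (bif:contains(?short_abstract, \"" + keyword + "\"))\n"
                + "}";
    }

    /** Percent-encodes the query and appends it to the endpoint. */
    static String toUrl(String sparql) {
        try {
            return ENDPOINT + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        System.out.println(toUrl(shortAbstractQuery("London", "en")));
        System.out.println(toUrl(keywordQuery("London")));
    }
}
```

The resulting URL can then be fetched with whatever HTTP client you already 
use; the result format you get back depends on the endpoint.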

You can also query them on a synchronized database (which gets updates 
every 5 minutes from Wikipedia):
http://live.dbpedia.org/

Hope that helps,
Sebastian


On 11/27/2011 06:02 PM, Khalida BEN SIDI AHMED wrote:
> Hello!
> I don't know if the subject of this question belongs to the scope of this
> group. Anyway, I will be pleased if I find an answer to my question.
> I'm writing some Java code in order to realize NLP tasks upon texts using
> Wikipedia. What can I do in order to extract the first paragraph of a
> Wikipedia article? Thanks a lot.
>
> Truly yours
> Ben Sidi Ahmed
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>


-- 
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org


