On Wed, Jan 22, 2014 at 10:31 AM, Matthew Flaschen
<mflasc...@wikimedia.org> wrote:

> On 01/21/2014 09:47 PM, Amir Ladsgroup wrote:
>
>> One of the things I can't understand is why we are extracting summary of
>> pages for Yahoo? Is it our job to do it? the dumps are really huge
>> e.g. for wikidata <http://dumps.wikimedia.org/wikidatawiki/20140106/>:
>> wikidatawiki-20140106-abstract.xml
>> <http://dumps.wikimedia.org/wikidatawiki/20140106/wikidatawiki-20140106-abstract.xml>
>> 14.1 GB
>> Compare it to: full history:
>> wikidatawiki-20140106-pages-meta-history.xml.bz2
>> <http://dumps.wikimedia.org/wikidatawiki/20140106/wikidatawiki-20140106-pages-meta-history.xml.bz2>
>> 8.8 GB
>>
>
> That's because the Yahoo one isn't compressed.
>
Why? Can we make it compressed? It's really annoying to see that huge file
there for (almost) no reason.
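
To make the question concrete, here is a rough sketch of what compressing it
could look like, assuming the abstract file is a plain XML file on disk. The
function name, paths and chunk size are made up for illustration and are not
taken from the actual dump scripts:

```python
import bz2
import shutil


def compress_to_bz2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream src_path into a bzip2 file at dst_path.

    The input is copied in chunks, so a multi-GB dump never has to fit
    in memory. Hypothetical helper, not part of the dump tooling.
    """
    with open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    return dst_path
```

Something like compress_to_bz2("wikidatawiki-20140106-abstract.xml",
"wikidatawiki-20140106-abstract.xml.bz2") would then produce a file in the
same format as the history dump. How much it shrinks would have to be
measured.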


> I'm not sure if Yahoo still uses those abstracts, but I wouldn't be
> surprised at all if other people are.
>
> Matt Flaschen
>
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>



-- 
Amir