Hi. I think there may still be problems with the 2016-04-07 English
Wikipedia dump. It's missing many articles in the Module namespace.

Here are some details:
* I downloaded
https://dumps.wikimedia.org/enwiki/20160407/enwiki-20160407-pages-articles.xml.bz2
and got an XML file of 10.8 GB, so it does not look severely truncated.
* I ran the following grep commands. Note that the grep for Module:Hatnote
returns nothing. I ran the last grep to show that the search pattern itself
is correct.
root~> grep "<title>Earth</title>" /home/root/xowa/wiki/en.wikipedia.org/enwiki-latest-pages-articles.xml
    <title>Earth</title>
root~> grep "<title>Template:About</title>" /home/root/xowa/wiki/en.wikipedia.org/enwiki-latest-pages-articles.xml
    <title>Template:About</title>
root~> grep "<title>Module:Hatnote</title>" /home/root/xowa/wiki/en.wikipedia.org/enwiki-latest-pages-articles.xml
root~> grep "<title>Module:" /home/root/xowa/wiki/en.wikipedia.org/enwiki-latest-pages-articles.xml
    <title>Module:Location map/data/Croatia/doc</title>
    <title>Module:Location map/data/USA Alabama/doc</title>
    ...
* The following Modules appear to be missing from the 2016-04-07 dump:
Module:Use_mdy_dates
Module:Pp-move-indef
Module:Protection_banner
Module:Unsubst
* By my count, there were 2,970 articles in the Module namespace in the
2016-03-05 dump. In contrast, there are only 652 in the 2016-04-07 dump.
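
For anyone who wants to repeat the count and the missing-title check without
grep, here is a minimal Python sketch. It assumes each <title> element sits on
its own line, as in the grep output above, and that titles in the dump use
spaces where the list above uses underscores. The function names
(check_namespace, open_dump) are made up for illustration and are not part of
any dump tooling.

```python
import bz2
import html
import re

# Assumption: every <title>...</title> element fits on a single line.
TITLE_RE = re.compile(r"<title>(.*?)</title>")


def normalize(title):
    """Turn URL-style underscores into the spaces used in dump titles."""
    return title.replace("_", " ")


def iter_titles(lines):
    """Yield each page title found in an iterable of text lines."""
    for line in lines:
        match = TITLE_RE.search(line)
        if match:
            # Dump titles are XML-escaped (e.g. &amp;), so unescape them.
            yield html.unescape(match.group(1))


def check_namespace(lines, prefix, expected):
    """Count titles starting with prefix and list the expected ones not seen."""
    wanted = {normalize(t) for t in expected}
    count = 0
    seen = set()
    for title in iter_titles(lines):
        if title.startswith(prefix):
            count += 1
            if title in wanted:
                seen.add(title)
    return count, sorted(wanted - seen)


def open_dump(path):
    """Open a dump as text, reading .bz2 files without unpacking them first."""
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


# Hypothetical usage; the path is a placeholder for your local copy:
# with open_dump("enwiki-20160407-pages-articles.xml.bz2") as dump:
#     total, missing = check_namespace(
#         dump, "Module:", ["Module:Hatnote", "Module:Unsubst"])
#     print(total, missing)
```

Streaming line by line keeps memory use flat, which matters for a file of
this size. Running it against both the 2016-03-05 and 2016-04-07 dumps should
give directly comparable counts.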

Let me know if you need any other information. I believe that anyone can
verify the above, but I'd be happy to provide more detail.
Thanks.




On Thu, Apr 14, 2016 at 8:49 AM, Ariel Glenn WMF <ar...@wikimedia.org>
wrote:

> It hasn't failed.  It's still running but the jobs that previously failed
> have been left in that status until they get rerun.  That's standard
> behavior.  Don't worry, be happy! :-)
>
> Ariel
>
> On Thu, Apr 14, 2016 at 2:15 PM, Nicolas Vervelle <nverve...@gmail.com>
> wrote:
>
>> But at least, pages-articles worked, so it's ok for me.
>>
>> On Thu, Apr 14, 2016 at 1:13 PM, Nicolas Vervelle <nverve...@gmail.com>
>> wrote:
>>
>>> Well, enwiki failed again today...
>>>
>>> On Wed, Apr 13, 2016 at 4:37 PM, Ariel Glenn WMF <ar...@wikimedia.org>
>>> wrote:
>>>
>>>> You are right. Two jobs were competing for enwiki since I allocated one
>>>> more lousy core to the host that runs them. I've fixed the config to avoid
>>>> that. It will resume in a few hours with cron.
>>>>
>>>> Ariel
>>>>
>>>> On Wed, Apr 13, 2016 at 4:37 PM, Nicolas Vervelle <nverve...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks Ariel,
>>>>>
>>>>> It seems to have worked for some dumps (frwiki for example), but other
>>>>> dumps are still failing (enwiki for example)
>>>>>
>>>>> Nico
>>>>>
>>>>> On Tue, Apr 12, 2016 at 11:04 AM, Ariel Glenn WMF <ar...@wikimedia.org
>>>>> > wrote:
>>>>>
>>>>>> Hi Nicolas,
>>>>>>
>>>>>> These will be picked up on reruns, which will happen over the next
>>>>>> day or so.  The failure was caused by an obscure hhvm bug which only
>>>>>> triggers under certain circumstances.  For more information about that,
>>>>>> see: https://phabricator.wikimedia.org/T94277
>>>>>>
>>>>>> This morning I did jobs cleanup, switched the dump jobs to use php5
>>>>>> again and the dumps have restarted.
>>>>>>
>>>>>> Ariel
>>>>>>
>>>>>> On Tue, Apr 12, 2016 at 11:25 AM, Nicolas Vervelle <
>>>>>> nverve...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Is anyone working on the failed dumps for April? (enwiki, frwiki,
>>>>>>> ruwiki, itwiki, ...)
>>>>>>>
>>>>>>> Nico
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Xmldatadumps-l mailing list
>>>>>>> Xmldatadumps-l@lists.wikimedia.org
>>>>>>> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
