Are you limiting your count to namespace 0?

On Thu, Aug 20, 2020 at 10:45 AM Yuki Kumagai <yuki.kuma...@cognitionx.io> wrote:
> Hiya
>
> I have a question about wikipedia xml database dump. Apologies if this
> wasn't an appropriate place for asking a question.
> On a wikipedia page, it's mentioned that the current number of articles in
> english is: 6,144,248
> https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia
>
> However when I count the number of page elements in recent dump (excluding
> redirects) it's about ~10 million
> I was just wondering what would be the reason for this?
>
> Thank you in advance
>
> --
> *Yuki Kumagai*
> Senior Engineer
> CognitionX <https://cognitionx.com/>
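
To make the namespace question concrete, here is a minimal sketch (not necessarily how the original count was done) that streams an export file and tallies non-redirect pages both overall and in namespace 0 only. It assumes each <page> element carries an <ns> child and that redirects are marked with a <redirect> child, as in recent export schema versions; the names count_pages and local_name are made up for this example.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace-uri}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def count_pages(source):
    """Stream a MediaWiki XML export and tally <page> elements.

    Returns (all_non_redirects, ns0_non_redirects).
    Assumption: redirects carry a <redirect> child and every page has <ns>.
    """
    all_non_redirects = 0
    ns0_non_redirects = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end" or local_name(elem.tag) != "page":
            continue
        ns = None
        is_redirect = False
        for child in elem:
            name = local_name(child.tag)
            if name == "ns":
                ns = (child.text or "").strip()
            elif name == "redirect":
                is_redirect = True
        if not is_redirect:
            all_non_redirects += 1
            if ns == "0":
                ns0_non_redirects += 1
        root.clear()  # drop finished pages so memory stays bounded
    return all_non_redirects, ns0_non_redirects


# Tiny hand-written sample (not real dump data) to show the difference.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Apple</title><ns>0</ns><id>1</id></page>
  <page><title>Apples</title><ns>0</ns><id>2</id><redirect title="Apple"/></page>
  <page><title>Talk:Apple</title><ns>1</ns><id>3</id></page>
  <page><title>Category:Fruit</title><ns>14</ns><id>4</id></page>
</mediawiki>"""

if __name__ == "__main__":
    print(count_pages(io.BytesIO(SAMPLE)))
```

For a real dump, the same function should work on a file object opened with bz2.open(path, "rb"), where path is whichever dump file you downloaded. Even the namespace 0 figure may not match the published number exactly, since the dump is a snapshot from a different date and the site statistics apply their own definition of an article.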
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l