Basically, the XML dumps have two IDs: page_id and revision_id. The page_id identifies the article. In this case, 14640471 is the page_id for Mars (https://en.wikipedia.org/wiki/Mars).
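To make the two IDs concrete, here is a minimal Python sketch of where they sit inside a dump's <page> element. The sample XML is a hand-written fragment, not copied from a real dump, and the function names are made up for illustration. It assumes the usual layout in which the <id> directly under <page> is the page_id and the <id> under <revision> is the revision_id.

```python
import io
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration only; a real dump carries many
# more elements (timestamp, contributor, text, ...) per page.
SAMPLE_DUMP = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Mars</title>
    <ns>0</ns>
    <id>14640471</id>
    <revision>
      <id>699008434</id>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_page_ids(source):
    """Yield (title, page_id, revision_id) for each <page> in the stream."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = page_id = revision_id = None
        for child in elem:
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "id":
                # <id> directly under <page> is the page_id
                page_id = int(child.text)
            elif name == "revision":
                # <id> under <revision> is the revision_id; if a page has
                # several revisions, the last one seen is kept
                for sub in child:
                    if local_name(sub.tag) == "id":
                        revision_id = int(sub.text)
                        break
        yield title, page_id, revision_id
        elem.clear()  # release the page's children once it has been consumed


rows = list(iter_page_ids(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
print(rows)
```

Keying your own data on the second field of each tuple (the page_id) rather than the third is what keeps it stable across dumps.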
The revision_id points to the latest revision of the article. For Mars, the latest revision_id is 699008434, which was generated on 2016-01-09 (https://en.wikipedia.org/w/index.php?title=Mars&oldid=699008434). Note that a new revision_id is generated every time a page is edited.

So, to answer your question, the IDs never change. 14640471 will always point to Mars, while 699008434 points to the 2016-01-09 revision of Mars.

That said, different dumps will have different revision_ids, because an article may be updated. If Mars gets updated tomorrow, and the English Wikipedia dump is generated afterwards, then that dump will list Mars with a new revision_id (something higher than 699008434). However, that dump will still show Mars with a page_id of 14640471.

You're probably better off using the page_id.

Finally, you can also use the Wikimedia API to get a view similar to the dump. For example: https://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Mars&rvprop=content|ids

Hope this helps.

On Mon, Jan 11, 2016 at 5:09 AM, Luigi Assom <[email protected]> wrote:

> yep, same here!
>
> Also another question about consistency of _IDs in time.
> I was working with an old version of wikipedia dump, and testing some
> data models I built on the dump using as pivot a few topics.
> I might have data corrupted on my side, but just to be sure:
> are _IDs of article *persistent* over time, or are they subjected to
> change?
>
> Might happen that due any fallback or merge in an article history, ID
> would change?
> E.g. as test article "Mars" would first point to a version _ID ="4285430"
> and then changed to "14640471"
>
> I need to ensure _IDs will persist.
> thank you!
>
>
> *P.S. sorry for cross posting - I've replied from wrong email - could you
> please delete the other message and keep only this email address? thank
> you! *
>
> On Mon, Jan 11, 2016 at 11:05 AM, XDiscovery Team <[email protected]>
> wrote:
>
>> yep, same here!
>>
>> Also another question about consistency of _IDs in time.
>> I was working with an old version of wikipedia dump, and testing some
>> data models I built on the dump using as pivot a few topics.
>> I might have data corrupted on my side, but just to be sure:
>> are _IDs of article *persistent* over time, or are they subjected to
>> change?
>>
>> Might happen that due any fallback or merge in an article history, ID
>> would change?
>> E.g. as test article "Mars" would first point to a version _ID ="4285430"
>> and then changed to "14640471"
>>
>> I need to ensure _IDs will persist.
>> thank you!
>>
>>
>> On Mon, Jan 11, 2016 at 6:22 AM, Tilman Bayer <[email protected]>
>> wrote:
>>
>>> On Sun, Jan 10, 2016 at 4:05 PM, Bernardo Sulzbach <[email protected]> wrote:
>>>
>>>> On Sun, Jan 10, 2016 at 9:55 PM, Neil Harris <[email protected]>
>>>> wrote:
>>>> > Hello! I've noticed that no enwiki dump seems to have been generated so far
>>>> > this month. Is this by design, or has there been some sort of dump failure?
>>>> > Does anyone know when the next enwiki dump might happen?
>>>> >
>>>>
>>>> I would also be interested.
>>>>
>>>> --
>>>> Bernardo Sulzbach
>>>>
>>>> _______________________________________________
>>>> Wikitech-l mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>>>
>>>
>>> CCing the Xmldatadumps mailing list
>>> <https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l>, where
>>> someone has already posted
>>> <https://lists.wikimedia.org/pipermail/xmldatadumps-l/2016-January/001214.html>
>>> about what might be the same issue.
>>>
>>> --
>>> Tilman Bayer
>>> Senior Analyst
>>> Wikimedia Foundation
>>> IRC (Freenode): HaeB
>>>
>>> _______________________________________________
>>> Xmldatadumps-l mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>>>
>>
>> --
>> *Luigi Assom*
>> Founder & CEO @ XDiscovery - Crazy on Human Knowledge
>> *Corporate*
>> www.xdiscovery.com
>> *Mobile App for knowledge Discovery*
>> APP STORE <http://tiny.cc/LearnDiscoveryApp> | PR
>> <http://tiny.cc/app_Mindmap_Wikipedia> | WEB
>> <http://www.learndiscovery.com/>
>>
>> T +39 349 3033334 | +1 415 707 9684
>>
>
> --
> *Luigi Assom*
>
> T +39 349 3033334 | +1 415 707 9684
> Skype oggigigi
>
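
As a rough companion to the API query mentioned in the reply at the top of this thread, the following Python sketch builds a similar query URL and pulls the page_id and latest revision_id out of a JSON response. It makes a few assumptions: format=json is added to the query (the URL in the reply does not set a format), rvprop is reduced to ids so the article text is not fetched, the sample response is hand-written to match the API's default JSON layout, and the helper names are made up for illustration. No network request is made here.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build a revisions query for one title (format=json is an assumption)."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids",
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_ids(response):
    """Return (page_id, title, latest_revision_id) for each page in a response."""
    results = []
    for page in response["query"]["pages"].values():
        revisions = page.get("revisions", [])
        # A missing page has no "pageid" and no "revisions" entry
        revision_id = revisions[0]["revid"] if revisions else None
        results.append((page.get("pageid"), page["title"], revision_id))
    return results


# Hand-written sample in the default JSON layout, where pages are keyed by
# page_id; the values are the ones quoted in the reply.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "14640471": {
                "pageid": 14640471,
                "ns": 0,
                "title": "Mars",
                "revisions": [{"revid": 699008434}],
            }
        }
    }
}

print(build_query_url("Mars"))
print(extract_ids(SAMPLE_RESPONSE))
```

Running the same query against a later state of the article should return a different revid but, per the reply above, the same pageid.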
