On Wed, Jul 9, 2014 at 6:13 PM, Daniel Kinzler <daniel.kinz...@wikimedia.de> wrote:
> On 09.07.2014 08:14, Dimitris Kontokostas wrote:
> > Hi,
> >
> > Is it easy to summarize the added value (or supported use cases) of switching
> > to PubSubHubbub?
>
> * It's easier to handle than OAI, because it uses the standard dump format.
> * It's also push-based, avoiding constant polling on small wikis.
> * The OAI extension has been deprecated for a long time now.
>
> > The edit stream in Wikidata is so huge that I can hardly think of anyone wanting
> > to be in *real-time* sync with Wikidata
> > With 20 p/s their infrastructure should be pretty scalable to not break.
>
> The "push" aspect is probably most useful for small wikis. It's true, for large
> wikis, you could just poll, since you would hardly ever poll in vain.
>
> It would be very nice if the sync could be filtered by namespace, category, etc.
> But PubSubHubbub (I'll use "PuSH" from now on) doesn't really support this, sadly.
>
> > Maybe I am biased with DBpedia but by doing some experiments on English
> > Wikipedia we found that the ideal update interval with OAI-PMH was every ~5 minutes.
> > OAI aggregates multiple revisions of a page to a single edit
> > so when we ask: "get me the items that changed the last 5 minutes" we skip the
> > processing of many minor edits
> > It looks like we lose this option with PubSubHubbub right?
>
> I'm not quite positive on this point, but I think with PuSH, this is done by the
> hub. If the hub gets 20 notifications for the same resource in one minute, it
> will only grab and distribute the latest version, not all 20.
>
> But perhaps someone from the PuSH development team could confirm this.

It'd be great if the dev team could confirm this.
Besides push notifications, is polling an option in PuSH? I skimmed through the
spec but couldn't find this.

> > As we already asked before, does PubSubHubbub support mirroring a wikidata
> > clone? The OAI-PMH extension has this option
>
> Yes, there is a client extension for PuSH, allowing for seamless replication of
> one wiki into another, including creation and deletion (I don't know about
> moves/renames).
>
> --
> Daniel Kinzler
> Senior Software Developer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.

--
Kontokostas Dimitris
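
The aggregation discussed in this thread (several revisions of one page collapsing into a single update, whether done by OAI-PMH or, as Daniel suspects, by a PuSH hub) can be sketched roughly as follows. This is a minimal illustration only, not code from either extension or from the PuSH spec: `Revision`, `coalesce` and `changed_since` are hypothetical names, and revision IDs are assumed to increase over time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    page: str         # page title or entity ID, e.g. "Q42" (illustrative)
    rev_id: int       # revision ID, assumed to increase over time
    timestamp: float  # seconds since the epoch


def coalesce(revisions):
    """Keep only the newest revision per page, in first-seen page order."""
    latest = {}
    for rev in revisions:
        current = latest.get(rev.page)
        if current is None or rev.rev_id > current.rev_id:
            latest[rev.page] = rev
    return list(latest.values())


def changed_since(revisions, since):
    """Coalesced revisions whose timestamp is at or after `since`."""
    return coalesce(r for r in revisions if r.timestamp >= since)
```

A consumer asking "what changed in the last 5 minutes" would then call `changed_since(revs, now - 300)` and process one revision per page, skipping the intermediate edits.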
_______________________________________________
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech