On Thu, Jul 9, 2020 at 3:35 PM Gerard Meijssen <[email protected]> wrote:
> Hoi,
> Is this different from Special:MediaSearch ??
>

I'm assuming that you are asking whether the new WCQS is different from
the Special:MediaSearch prototype [1]. Yes, it is quite different. WCQS
is a low-level SPARQL interface, oriented toward power users and tools,
that allows federation with WDQS and the Wikidata dataset.
Special:MediaSearch is a higher-level search interface, backed by
Elasticsearch. It uses the same underlying data, but in a very
different way.

Somewhat unrelated: we are also planning some work on
Special:MediaSearch to better integrate it with our current search
infrastructure [2].

[1] https://commons.wikimedia.org/wiki/Special:MediaSearch
[2] https://phabricator.wikimedia.org/T257043

Thanks,
> GerardM
>
> On Thu, 9 Jul 2020 at 15:23, Guillaume Lederrey <[email protected]>
> wrote:
>
>> Hello all!
>>
>> The Search Platform team will join the Wikidata office hours on July 21st
>> at 16:00 UTC [1]. We are looking forward to discussing Wikidata Query
>> Service and anything else you might find of interest.
>>
>> We've been hard at work on Wikimedia Commons Query Service (WCQS) [2].
>> This will be a SPARQL endpoint similar to WDQS, but serving the Structured
>> Data on Commons dataset. Our goal is to open a beta service, hosted on
>> Wikimedia Cloud Service (WMCS), by the end of July. The service will
>> require an account on Commons for authentication and will allow federation
>> with WDQS. We don't have a streaming update process ready yet; for a
>> start, the data will be reloaded from Commons dumps weekly.
>>
>> As part of that work, the dumps for Structured Data on Commons are now
>> available [3]. Note that the prefix used in the TTL dumps is "wd", which
>> does not make much sense. We are working with WMDE on renaming the
>> prefixes, but this is more complex than expected since "wd" is hardcoded in
>> more places than it should be.
>> Those prefixes should only be valid in the local context of the dumps,
>> so renaming them is technically a non-breaking change. That being said,
>> if you start using those dumps, make sure you don't rely on this prefix,
>> or that you are ready for a rename [4].
>>
>> We are planning to dig more into the data we have to get a better
>> understanding of the use cases around WDQS [5] (not much content on that
>> task yet, but it is coming). Some very preliminary analysis indicates that
>> less than 2% of the queries on WDQS generate more than 90% of the load.
>> This is definitely something we need to better understand. We will be
>> working on defining the kind of questions we need to answer, and improving
>> our data collection to be able to answer those questions.
>>
>> We have started an internal discussion around "planning for disaster"
>> [6]. We want to better understand the potential failure scenarios around
>> WDQS and have a plan if that worst case does happen. This will include some
>> analytics work and some testing to better understand the constraints and
>> what degraded mode we might still be able to provide in case of
>> catastrophic failure.
>>
>> Thanks for reading!
>>
>> Guillaume
>>
>> [1] https://www.wikidata.org/wiki/Wikidata:Events#Office_hours
>> [2] https://phabricator.wikimedia.org/T251488
>> [3] https://dumps.wikimedia.org/other/wikibase/commonswiki/
>> [4] https://dumps.wikimedia.org/other/wikibase/commonswiki/README_commonsrdfdumps.txt
>> [5] https://phabricator.wikimedia.org/T257045
>> [6] https://phabricator.wikimedia.org/T257055
>>
>>
>> --
>> Guillaume Lederrey
>> Engineering Manager, Search Platform
>> Wikimedia Foundation
>> UTC+1 / CET
>> _______________________________________________
>> Wikidata mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>

--
Guillaume Lederrey
Engineering Manager, Search Platform
Wikimedia Foundation
UTC+1 / CET
