Hi again,

I am not saying that provenance statements are unnecessary; I am arguing that they are not necessary by default in Linked Data responses. The current situation is like displaying Wikipedia's page editing history at the bottom of each page. What percentage of users would find that useful? Try opening DBpedia and Wikidata URIs in a Linked Data browser and tell me which one gets rendered in a more user-friendly way.
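A minimal sketch of what a slimmer default response could contain, assuming a client (or server) simply keeps the triples whose subject is the requested entity and drops the links to statement nodes. The function name `slim_description` and the sample triples are hypothetical and for illustration only; this is not how Wikidata or any particular Linked Data browser behaves. Only the namespace IRIs are the standard Wikidata/W3C ones.

```python
# Standard namespace IRIs used in Wikidata's RDF export.
WD = "http://www.wikidata.org/entity/"
WDS = "http://www.wikidata.org/entity/statement/"
WDT = "http://www.wikidata.org/prop/direct/"
P = "http://www.wikidata.org/prop/"
PS = "http://www.wikidata.org/prop/statement/"
WDREF = "http://www.wikidata.org/reference/"
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL = "http://www.w3.org/2002/07/owl#"


def slim_description(triples, entity):
    """Keep only triples about `entity` that do not point at statement nodes.

    `triples` is an iterable of (subject, predicate, object) strings.
    Hypothetical helper: it drops statement/reference nodes, schema terms
    and descriptions of other entities, all of which have other subjects.
    """
    return [
        (s, p, o)
        for (s, p, o) in triples
        if s == entity and not o.startswith(WDS)
    ]


ENTITY = WD + "Q1748"
STATEMENT = WDS + "Q1748-cfb94fd5-464b-1b83-a513-dd751882b7ce"

# Illustrative sample data, NOT copied from the real response.
sample = [
    (ENTITY, WDT + "P31", WD + "Q515"),                    # direct claim
    (ENTITY, P + "P31", STATEMENT),                        # link to statement node
    (STATEMENT, PS + "P31", WD + "Q515"),                  # statement value
    (STATEMENT, PROV + "wasDerivedFrom", WDREF + "example-ref"),  # provenance
    (WDT + "P518", RDF_TYPE, OWL + "ObjectProperty"),      # schema term
    (WD + "Q1775415", RDFS_LABEL, '"feminine"@en'),        # other entity's label
]

slim = slim_description(sample, ENTITY)
for triple in slim:
    print(triple)
```

With this sample, only the direct claim about the entity survives; everything dropped remains available from the statement, property, and entity URIs themselves.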

The provenance statements are still in the RDF triplestore, so they can still be queried with SPARQL, right? Or they could be accessed via a secondary resource, for example with a query parameter added to the URL.

If the above is controversial, then I would hope that removing the schema terms from the responses would not be?

Best,
Martynas

On Tue, Jan 6, 2026 at 3:57 PM <[email protected]> wrote:
>
> Dear all,
>
> I fully agree with Andra's response in terms of content. Provenance and
> governance are crucial for responsible use and propagation of information.
>
> I myself work as a healthcare professional in the field of medical guidelines
> and biomedical research and am confronted daily with the question of whether
> a term definition is reliable or not.
>
> It makes a difference whether a term is defined by an authoritative body,
> such as a WHO expert working group, or by an undefined institution.
>
> Also, from my involvement with LIFES, I see that term definitions, and the
> sources of those definitions, are an overlooked aspect in the data community,
> which greatly complicates machine interpretability and reuse of data.
>
> Especially when Wikidata/Wikibase is used for a controlled vocabulary in a
> KAG.
>
> When no source is listed for a term (definition), lossless propagation of
> information is not guaranteed, and the definition is therefore essentially
> useless for further use.
>
> The problem outlined can be explained by two aspects.
>
> First, it is an intrinsic given. Systems such as Wikidata are not designed to
> go beyond a lexicological concept. The world is much more complex than that
> and needs to be described with a far more expressive encyclopedic model.
>
> In practice, knowledge graph-like systems get stuck for more complex
> knowledge models.
>
> Second, it is a result of uncontrolled growth of source silos, which gives
> term mapping a disproportionate role in the (poorly defined) propagation of
> information.
>
> It would be better to address these through an extensive federative policy.
>
> Sincerely,
>
> Frans van der Horst
>
> From: Andra Waagmeester <[email protected]>
> Sent: Tuesday, 6 January 2026 12:53
> To: Discussion list for the Wikidata project <[email protected]>
> Subject: [Wikidata] Re: RDF Linked Data responses of Wikidata URIs
>
> Dear Martynas,
>
> I strongly disagree that the provenance statements should be removed
> from the default responses, since it is exactly the provenance that makes
> Wikidata so valuable. Wikidata comes with a lot of noise, since often
> references are not provided. Personally, I mostly consider a Wikidata
> statement without a reference to be without any value and best ignored. So if
> we remove the provenance, Wikidata becomes just a bag of noise.
>
> Having said this, I do acknowledge that Wikidata comes with a lot of baggage
> or weight, but there are some decent tools out there to subset Wikidata into
> more manageable portions.
>
> We did a paper on that a few years back:
> https://www.semantic-web-journal.net/system/files/swj3491.pdf
>
> Cheers,
>
> Andra
>
> On Tue, 6 Jan 2026 at 11:15, Martynas Jusevičius
> <[email protected]> wrote:
>
> Hi all,
>
> I hope this is the right place for this discussion :)
>
> First of all, as developer of software for RDF Linked Data
> consumption, I am naturally delighted that Wikidata serves Linked Data
> and supports content negotiation (not many services get it right).
>
> However, IMO, the amount of meta-triples not relevant to the requested
> entity, and the sheer size of the RDF data that it causes, make
> Wikidata's RDF responses pretty much unusable.
>
> Let's take a single entity as an example:
>
> curl -L -H "Accept: text/turtle" 'https://www.wikidata.org/entity/Q1748'
>
> The size of the Turtle response is 1.6MB!
>
> All of the schema metadata such as property and class descriptions are
> not needed as they can be discovered by dereferencing the respective
> term URIs:
>
> wdno:P2960 a owl:Class ;
>     owl:complementOf _:e8842935d39a233def3d267ae3737d8c .
>
> _:e8842935d39a233def3d267ae3737d8c a owl:Restriction ;
>     owl:onProperty wdt:P2960 ;
>     owl:someValuesFrom owl:Thing .
>
> p:P518 a owl:ObjectProperty .
> psv:P518 a owl:ObjectProperty .
> pqv:P518 a owl:ObjectProperty .
> prv:P518 a owl:ObjectProperty .
> wdt:P518 a owl:ObjectProperty .
> ps:P518 a owl:ObjectProperty .
> pq:P518 a owl:ObjectProperty .
> pr:P518 a owl:ObjectProperty .
>
> wd:Q1775415 a wikibase:Item ;
>     rdfs:label "feminine"@en ;
>     skos:prefLabel "feminine"@en ;
>     schema:name "feminine"@en ;
>     schema:description "grammatical gender"@en .
>
> and so on and so forth.
>
> Then I would argue that the provenance statements such as
> <http://www.wikidata.org/entity/statement/Q1748-cfb94fd5-464b-1b83-a513-dd751882b7ce>
> are also *not* necessary for the majority of use cases of the majority
> of users.
>
> I suppose they are included to provide a complete and "truthy"
> response, but by doing so the usability of the data is diminished. I
> think the provenance statements should be removed from the default
> responses and relegated to some "complete" or "truthy" profile with a
> distinct URI, linked to from the default response.
>
> What do you think?
>
> Martynas
> atomgraph.com
> _______________________________________________
> Wikidata mailing list -- [email protected]
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/6CALPNUWKMID3UE2RK7OCIZIGOAKNAVK/
> To unsubscribe send an email to [email protected]
