As for subsetting Wikidata, what sort of users have the resources
to do that? It would also mean new entity URIs (due to a different
hostname) that are not widely known (including by LLMs), so it is not a
practical solution IMO.

On Tue, Jan 6, 2026 at 12:54 PM Andra Waagmeester <[email protected]> wrote:
>
> Dear Martynas,
>
>      I strongly disagree that the provenance statements should be removed
> from the default responses, since it is exactly the provenance that makes
> Wikidata so valuable. Wikidata comes with a lot of noise, since references
> are often not provided. Personally, I mostly consider a Wikidata statement
> without a reference to be without any value and best ignored. So if we
> remove the provenance, Wikidata becomes just a bag of noise.
> Having said this, I do acknowledge that Wikidata comes with a lot of baggage
> or weight, but there are some decent tools out there to subset Wikidata into
> more manageable portions.
> We did a paper on that a few years back: 
> https://www.semantic-web-journal.net/system/files/swj3491.pdf
>
> Cheers,
>
> Andra
>
> On Tue, 6 Jan 2026 at 11:15, Martynas Jusevičius
> <[email protected]> wrote:
>>
>> Hi all,
>>
>> I hope this is the right place for this discussion :)
>>
>> First of all, as a developer of software for RDF Linked Data
>> consumption, I am naturally delighted that Wikidata serves Linked Data
>> and supports content negotiation (not many services get it right).
>>
>> However, IMO, the amount of meta-triples not relevant to the requested
>> entity, and the sheer size of the RDF data that it causes, make
>> Wikidata's RDF responses pretty much unusable.
>>
>> Let's take a single entity as an example:
>>
>>     curl -L -H "Accept: text/turtle" 'https://www.wikidata.org/entity/Q1748'
>>
>> The size of the Turtle response is 1.6 MB!
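
For anyone who wants to reproduce the measurement without curl, here is a rough Python sketch of the same content-negotiated request. It is only an illustration: the function names and the User-Agent string are made up for this example, and the size you get will vary as the entity is edited.

```python
import urllib.request

ENTITY_URI = "https://www.wikidata.org/entity/Q1748"  # entity from the curl example


def build_request(uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Build a GET request that asks for RDF via the Accept header."""
    headers = {
        "Accept": media_type,
        # Assumption: a descriptive User-Agent; this value is a placeholder.
        "User-Agent": "size-check-sketch/0.1 (example only)",
    }
    return urllib.request.Request(uri, headers=headers)


def fetch_size_bytes(uri: str, media_type: str = "text/turtle") -> int:
    """Dereference the URI (urlopen follows redirects, like curl -L) and
    return the length of the response body in bytes."""
    with urllib.request.urlopen(build_request(uri, media_type), timeout=60) as resp:
        return len(resp.read())


def to_megabytes(n_bytes: int) -> float:
    """Convert a byte count to decimal megabytes."""
    return n_bytes / 1_000_000
```

Calling to_megabytes(fetch_size_bytes(ENTITY_URI)) needs network access; passing a different media_type makes it easy to compare serializations of the same entity.
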
>>
>> All of the schema metadata such as property and class descriptions are
>> not needed as they can be discovered by dereferencing the respective
>> term URIs:
>>
>> wdno:P2960 a owl:Class ;
>>   owl:complementOf _:e8842935d39a233def3d267ae3737d8c .
>>
>> _:e8842935d39a233def3d267ae3737d8c a owl:Restriction ;
>>   owl:onProperty wdt:P2960 ;
>>   owl:someValuesFrom owl:Thing .
>>
>> p:P518 a owl:ObjectProperty .
>> psv:P518 a owl:ObjectProperty .
>> pqv:P518 a owl:ObjectProperty .
>> prv:P518 a owl:ObjectProperty .
>> wdt:P518 a owl:ObjectProperty .
>> ps:P518 a owl:ObjectProperty .
>> pq:P518 a owl:ObjectProperty .
>> pr:P518 a owl:ObjectProperty .
>>
>> wd:Q1775415 a wikibase:Item ;
>>   rdfs:label "feminine"@en ;
>>   skos:prefLabel "feminine"@en ;
>>   schema:name "feminine"@en ;
>>   schema:description "grammatical gender"@en .
>>
>> and so on and so forth.
>>
>> Then I would argue that the provenance statements such as
>> <http://www.wikidata.org/entity/statement/Q1748-cfb94fd5-464b-1b83-a513-dd751882b7ce>
>> are also *not* necessary for the majority of use cases of the majority
>> of users.
>>
>> I suppose they are included to provide a complete and "truthy"
>> response, but by doing so the usability of the data is diminished. I
>> think the provenance statements should be removed from the default
>> responses and relegated to some "complete" or "truthy" profile with a
>> distinct URI, linked to from the default response.
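
To make the proposal concrete, here is a minimal client-side sketch of what such a slimmer default could contain. It assumes the response was requested as N-Triples (one triple per line) and relies on the standard Wikidata RDF namespaces; the bucket names, the function names and the sample triples are invented for illustration and are not an existing Wikidata profile.

```python
import re
from collections import Counter

ENTITY_IRI = "http://www.wikidata.org/entity/Q1748"
PROP_CLAIM_PREFIX = "http://www.wikidata.org/prop/P"        # p: links to statement nodes
STATEMENT_NS = "http://www.wikidata.org/entity/statement/"  # wds: statement nodes

# Captures the subject (IRI or blank node) and predicate IRI of one N-Triples line.
TRIPLE_RE = re.compile(r"^(<[^>]*>|_:\S+)\s+<([^>]*)>\s+.*\.\s*$")


def bucket(line, entity_iri):
    """Classify one N-Triples line; returns None for blank or comment lines."""
    match = TRIPLE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate = match.group(1).strip("<>"), match.group(2)
    if subject == entity_iri:
        # p: triples only point at statement nodes; everything else is direct data.
        return "statement-link" if predicate.startswith(PROP_CLAIM_PREFIX) else "entity"
    if subject.startswith(STATEMENT_NS):
        return "statement-node"
    return "other"  # other entities' labels, ontology axioms, value nodes, ...


def summarize(lines, entity_iri):
    """Count how many triples fall into each bucket."""
    buckets = (bucket(line, entity_iri) for line in lines)
    return Counter(b for b in buckets if b is not None)


def slim(lines, entity_iri):
    """Keep only the triples that directly describe the requested entity."""
    return [line for line in lines if bucket(line, entity_iri) == "entity"]


# Invented sample triples (not actual Wikidata content), reusing URIs from this thread.
SAMPLE = [
    "<http://www.wikidata.org/entity/Q1748> <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q515> .",
    '<http://www.wikidata.org/entity/Q1748> <http://www.w3.org/2000/01/rdf-schema#label> "Example"@en .',
    "<http://www.wikidata.org/entity/Q1748> <http://www.wikidata.org/prop/P31> <http://www.wikidata.org/entity/statement/Q1748-cfb94fd5-464b-1b83-a513-dd751882b7ce> .",
    "<http://www.wikidata.org/entity/statement/Q1748-cfb94fd5-464b-1b83-a513-dd751882b7ce> <http://www.wikidata.org/prop/statement/P31> <http://www.wikidata.org/entity/Q515> .",
    '<http://www.wikidata.org/entity/Q1775415> <http://www.w3.org/2000/01/rdf-schema#label> "feminine"@en .',
    "_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .",
]
```

Running summarize(SAMPLE, ENTITY_IRI) shows how the triples split between the entity itself, the statement nodes and everything else, and slim(SAMPLE, ENTITY_IRI) returns only the first two lines. The same two calls on a real N-Triples response would show how much of the payload the proposed default would drop.
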
>>
>> What do you think?
>>
>> Martynas
>> atomgraph.com
>> _______________________________________________
>> Wikidata mailing list -- [email protected]
>> Public archives at 
>> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/6CALPNUWKMID3UE2RK7OCIZIGOAKNAVK/
>> To unsubscribe send an email to [email protected]
>