I think various applications would benefit from a more configurable output: 
for instance, returning only the labels for specified languages, or returning 
a graph around a list of QIDs rather than just a single QID. Other options 
could be: with or without the ontology, only the truthy statements, with or 
without provenance, with or without qualifiers, literal values, without 
Wikimedia links or redundancy. This sounds like an API.
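
For the labels case, something close to this already exists in the MediaWiki 
action API (wbgetentities), although it returns JSON rather than RDF. A 
minimal sketch, where the helper name labels_url and the second QID (Q35) are 
only illustrative:

```python
from urllib.parse import urlencode

# Wikidata's MediaWiki action API endpoint.
API = "https://www.wikidata.org/w/api.php"


def labels_url(qids, languages):
    """Build a wbgetentities URL that asks only for labels of the given
    entities in the given languages (hypothetical helper name)."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(qids),              # several QIDs in one request
        "props": "labels",                  # leave out claims, sitelinks, ...
        "languages": "|".join(languages),   # restrict the label languages
        "format": "json",
    }
    return API + "?" + urlencode(params)


# No request is sent here; this only prints the URL.
print(labels_url(["Q1748", "Q35"], ["en", "da"]))
```

The other options in the list above (qualifiers or not, provenance or not, 
and so on) are not covered by this sketch.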

(I do not think that subsetting Wikidata would generate new entity URIs.)

(I do not get a 1.6 MB Turtle file for Q1748. The sizes I see are: 192K 
Q1748.json, 1000K Q1748.jsonld, 476K Q1748.ttl.)
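
One way to compare such sizes is to fetch the per-format URLs under 
Special:EntityData and count the bytes. The sketch below assumes that this is 
a fair stand-in for the content-negotiated /entity/ URL, and that the 
"flavor" parameter (simple, full, dump) behaves as described on the Wikidata 
data access page; both should be checked before relying on the numbers. The 
function names and the User-Agent string are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE = "https://www.wikidata.org/wiki/Special:EntityData/"


def entity_data_url(qid, ext, flavor=None):
    """Build a Special:EntityData URL for one entity and one format.

    `flavor` is assumed to be one of "simple", "full" or "dump" and is
    only meaningful for the RDF formats.
    """
    url = f"{BASE}{qid}.{ext}"
    if flavor:
        url += "?" + urlencode({"flavor": flavor})
    return url


def response_size(url):
    """Download the URL and return the body size in bytes (network needed)."""
    req = Request(url, headers={"User-Agent": "size-check-sketch/0.1"})
    with urlopen(req) as resp:
        return len(resp.read())


def compare_sizes(qid="Q1748"):
    """Print the response size for a few format and flavor combinations."""
    variants = [("json", None), ("jsonld", None), ("ttl", None),
                ("ttl", "simple"), ("ttl", "dump")]
    for ext, flavor in variants:
        url = entity_data_url(qid, ext, flavor)
        print(f"{response_size(url):>10}  {url}")
```

Calling compare_sizes() performs the downloads; nothing is fetched when the 
functions are only defined.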

Finn Årup Nielsen
https://people.compute.dtu.dk/faan/
________________________________________
From: Martynas Jusevičius <[email protected]>
Sent: 6 January 2026 16:07
To: Discussion list for the Wikidata project
Subject: [Wikidata] Re: RDF Linked Data responses of Wikidata URIs

As for subsetting Wikidata, what sort of users have the resources
to do that? Also, that would mean new entity URIs (due to a different
hostname) which are not widely known (including by LLMs), so not a
practical solution IMO.

On Tue, Jan 6, 2026 at 12:54 PM Andra Waagmeester <[email protected]> wrote:
>
> Dear Martynas,
>
>      I strongly disagree that the provenance statements should be removed 
> from the default responses, since it is exactly the provenance that makes 
> Wikidata so valuable. Wikidata comes with a lot of noise, since often 
> references are not provided. Personally, I mostly consider a Wikidata 
> statement without a reference to be without any value and best ignored. So 
> if we remove the provenance, Wikidata becomes just a bag of noise.
> Having said this, I do acknowledge that Wikidata comes with a lot of baggage 
> or weight, but there are some decent tools out there to subset Wikidata into 
> more manageable portions.
> We did a paper on that a few years back: 
> https://www.semantic-web-journal.net/system/files/swj3491.pdf
>
> Cheers,
>
> Andra
>
> On Tue, 6 Jan 2026 at 11:15, Martynas Jusevičius 
> <[email protected]> wrote:
>>
>> Hi all,
>>
>> I hope this is the right place for this discussion :)
>>
>> First of all, as developer of software for RDF Linked Data
>> consumption, I am naturally delighted that Wikidata serves Linked Data
>> and supports content negotiation (not many services get it right).
>>
>> However, IMO, the amount of meta-triples not relevant to the requested
>> entity, and the sheer size of the RDF data that it causes, make
>> Wikidata's RDF responses pretty much unusable.
>>
>> Let's take a single entity as an example:
>>
>>     curl -L -H "Accept: text/turtle" 'https://www.wikidata.org/entity/Q1748'
>>
>> The size of the Turtle response is 1.6MB!
>>
>> All of the schema metadata such as property and class descriptions are
>> not needed as they can be discovered by dereferencing the respective
>> term URIs:
>>
>> wdno:P2960 a owl:Class ;
>>   owl:complementOf _:e8842935d39a233def3d267ae3737d8c .
>>
>> _:e8842935d39a233def3d267ae3737d8c a owl:Restriction ;
>>   owl:onProperty wdt:P2960 ;
>>   owl:someValuesFrom owl:Thing .
>>
>> p:P518 a owl:ObjectProperty .
>> psv:P518 a owl:ObjectProperty .
>> pqv:P518 a owl:ObjectProperty .
>> prv:P518 a owl:ObjectProperty .
>> wdt:P518 a owl:ObjectProperty .
>> ps:P518 a owl:ObjectProperty .
>> pq:P518 a owl:ObjectProperty .
>> pr:P518 a owl:ObjectProperty .
>>
>> wd:Q1775415 a wikibase:Item ;
>>   rdfs:label "feminine"@en ;
>>   skos:prefLabel "feminine"@en ;
>>   schema:name "feminine"@en ;
>>   schema:description "grammatical gender"@en .
>>
>> and so on and so forth.
>>
>> Then I would argue that the provenance statements such as
>> <http://www.wikidata.org/entity/statement/Q1748-cfb94fd5-464b-1b83-a513-dd751882b7ce>
>> are also *not* necessary for the majority of use cases of the majority
>> of users.
>>
>> I suppose they are included to provide a complete and "truthy"
>> response, but by doing so the usability of the data is diminished. I
>> think the provenance statements should be removed from the default
>> responses and relegated to some "complete" or "truthy" profile with a
>> distinct URI, linked to from the default response.
>>
>> What do you think?
>>
>> Martynas
>> atomgraph.com
>> _______________________________________________
>> Wikidata mailing list -- [email protected]
>> Public archives at 
>> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/6CALPNUWKMID3UE2RK7OCIZIGOAKNAVK/
>> To unsubscribe send an email to [email protected]
_______________________________________________
Wikidata mailing list -- [email protected]
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/6OQZEUPZIX6QFJLJFAOUPETEYVW2LWGF/
To unsubscribe send an email to [email protected]
