Sebotic added a comment.
@thiemowmde Agreed, it would bring down the error rate for this specific identifier/string. But currently, that is the only one we have a size distribution for. I think Lydia intended to solve this issue here for every property of data type string. So if, for technical reasons (MySQL index field length), it has to be limited to 768 for now, that would be fine for chemistry for the time being, but what about other properties?
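To make the trade-off concrete for other properties, here is a minimal sketch of how the share of rejected values could be estimated once a length distribution is available. The sample values, the function name `share_over_limit`, and counting the limit in characters rather than bytes are all assumptions for illustration, not anything taken from Wikibase.

```python
def share_over_limit(values, limit=768):
    """Return the fraction of values longer than `limit` characters.

    Assumption: the limit is counted in characters; a byte-based
    limit would need len(v.encode("utf-8")) instead.
    """
    if not values:
        return 0.0
    too_long = sum(1 for v in values if len(v) > limit)
    return too_long / len(values)


# Hypothetical sample standing in for the real values of one property.
sample = ["C" * 100, "C" * 400, "C" * 768, "C" * 1500]

print(share_over_limit(sample, limit=400))
print(share_over_limit(sample, limit=768))
```

Running the same function over the values of each string property would show which ones, apart from the chemistry identifier, are actually affected by a given limit.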
The general question is whether it would be useful/doable to allow e.g. a medium blob for a text field in the first place, or to create a new data type which stores text in blobs. Currently, medium blobs are used to store the whole JSON, if I am correct. Certainly, this would create indexing, access and potential misuse problems. It would also prohibit data access via WDQS.
Another point to keep in mind is that SPARQL queries become problematic if the field length gets too long. I think I hit a limit of ~4K chars on the maximum query length on query.wikidata.org. Is there a maximum length that the current Wikidata design and the SPARQL endpoint agree on, and that could only be lifted by a major redesign of Wikidata's technical infrastructure?
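As a rough illustration of the query-length concern, the sketch below builds a lookup query for a string value and checks its URL-encoded length against an assumed ceiling. The 4000-character figure is only my approximate observation from above, not a documented limit; `PXXX` is a placeholder property ID, and the helper names are hypothetical. Whether the real limit applies to the raw or the encoded query is also an assumption here.

```python
from urllib.parse import quote

# Rough figure from the observation above, not a documented limit.
ASSUMED_MAX_QUERY_CHARS = 4000


def build_lookup_query(value, prop="PXXX"):
    """Build a SPARQL query that looks up items by an exact string value."""
    # Escape backslashes first, then double quotes, for a SPARQL string literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'SELECT ?item WHERE {{ ?item wdt:{prop} "{escaped}" . }}'


def fits_in_get_request(value, prop="PXXX", max_chars=ASSUMED_MAX_QUERY_CHARS):
    """Check whether the URL-encoded query stays within the assumed ceiling."""
    encoded = quote(build_lookup_query(value, prop), safe="")
    return len(encoded) <= max_chars


print(fits_in_get_request("A" * 100))
print(fits_in_get_request("A" * 5000))
```

Note that URL encoding inflates values containing non-alphanumeric characters, so a stored string well below the ceiling may still produce a query that exceeds it.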
Cc: daniel, thiemowmde, EgonWillighagen, Sebotic, Scott_WUaS, Sadads, Pasleim, Aklapper, Lydia_Pintscher, D3r1ck01, Izno, Wikidata-bugs, aude, Mbch331
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
