Mahir256 renamed this task from "Wikidata search suggestions do not return anything if a character whose decomposition contains nukta is present" to "Wikidata search suggestions do not display on screen if character whose decomposition contains nukta is present in search query".
Mahir256 updated the task description.
Herald added a subscriber: PokestarFan.
CHANGES TO TASK DESCRIPTION
Most Indic-language sites and Commons appear to process characters such as ढ़, য়, ਖ਼, and ଡ଼ appropriately when they are present in search queries, returning appropriate suggestions. Note that these characters are precomposed, i.e. //not// already decomposed into a consonant and a nukta. (The bolding of the text within the search suggestions corresponding to what was typed does not appear, but that is not as troublesome a matter.)
Wikidata's search functionality returns the proper JSON response given a search containing the aforementioned characters, but the results are not rendered properly at all. In particular, the warning "//The value passed for "search" contains invalid or non-normalized data. Textual data should be valid, NFC-normalized Unicode without C0 control characters other than HT (\t), LF (\n), and CR (\r).//" is attached to the results. This causes either 1) the waiting icon to remain indefinitely, if it is a new search query, or 2) the previous results to remain, if it is a modification to another search query.
To see 1) for yourself, you can change your interface language to Bengali, copy the text "বিষয়শ্রেণী:" ("Category:" in Bengali) and paste it into Wikidata's search box, and see no category pages pop up. Change the "য়" in that word to "য + ়", after removing the two spaces and the plus from that quotation, and such category pages will appear. To see 2) for yourself, change the "য + ়" back to "য়" and add "উইকিপিডিয়া" (Wikipedia) to the end, and the results shown will not change.
This does not appear to be an issue with all characters for which a decomposition exists in the Unicode standard, as searches such as "Cañada" (where the "ñ" decomposes into "n + U+0303") do return suggestions properly, without any warning attached to the JSON response.
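One possible explanation for the difference between "ñ" and the nukta letters, offered here as a sketch and not as a confirmed diagnosis of the Wikidata behaviour: the Unicode standard lists the precomposed nukta letters as composition exclusions, so their NFC form is the decomposed consonant + nukta sequence, while "ñ" (U+00F1) is already in NFC. The snippet below uses only Python's standard `unicodedata` module to check this; the helper names `is_nfc` and `to_nfc` are made up for illustration and are not part of any Wikidata or MediaWiki API.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


def to_nfc(text: str) -> str:
    """Normalize text to NFC, as a client could do before sending a query."""
    return unicodedata.normalize("NFC", text)


# Precomposed nukta letters like those in the task description, written as
# escapes so the source file itself cannot be silently normalized.
NUKTA_LETTERS = ["\u095D", "\u09DF", "\u0A59", "\u0B5C"]

for ch in NUKTA_LETTERS:
    normalized = to_nfc(ch)
    code_points = " ".join(f"U+{ord(c):04X}" for c in normalized)
    # Each of these is a composition exclusion, so NFC yields two code points.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: is_nfc={is_nfc(ch)}, NFC -> {code_points}")

# By contrast, the precomposed "ñ" in "Cañada" is already NFC-normalized.
print("Ca\u00F1ada is_nfc:", is_nfc("Ca\u00F1ada"))
```

Under this reading, a query containing precomposed "য়" (U+09DF) is not NFC-normalized and could trigger the warning quoted above, whereas the manually decomposed "য + ়" sequence and the precomposed "ñ" would not.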
TASK DETAIL
EMAIL PREFERENCES
To: Mahir256
Cc: PokestarFan, daniel, thiemowmde, Aftabuzzaman, Mahir256, Aklapper, GoranSMilovanovic, QZanden, Izno, Wikidata-bugs, aude, Mbch331
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
