https://bugzilla.wikimedia.org/show_bug.cgi?id=70873
--- Comment #3 from Nik Everett <[email protected]> ---
(In reply to Bartosz Dziewoński from comment #2)
> Oh, so URLs are one "segment", and this doesn't find "substrings"? That
> makes sense.
>
> Splitting on these characters sounds reasonable to me. There are some cases
> like "AC/DC", but that shouldn't cause any problems, right?

You've got it. The way search works is that all the words are segmented (tokenized), then normalized, and then indexed for quick lookup. The trick is that each language is subtly different, and I only speak English, so I can only validate that the choices make sense there. And it's hard to propose changes that cross many languages.

Anyway, I'll see if I can make a tool to easily look at how words are segmented in your language. And I'll see if I can make it easy to experiment a bit with stuff.
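
To make the segment / normalize / index pipeline described above concrete, here is a rough Python sketch. It is not the actual search backend: the function names, the whitespace tokenizer, the lowercase-only normalization, and the split characters "/.:" are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict


def tokenize(text, split_chars=""):
    """Segment text on whitespace, then optionally on extra characters.

    split_chars is a hypothetical knob: with "" a URL stays one segment,
    with "/.:" it is broken into its parts.
    """
    tokens = text.split()
    if split_chars:
        pattern = "[" + re.escape(split_chars) + "]+"
        tokens = [part for tok in tokens
                  for part in re.split(pattern, tok) if part]
    return tokens


def normalize(token):
    """Toy normalization: lowercase only (real analyzers do much more)."""
    return token.lower()


def build_index(docs, split_chars=""):
    """Build an inverted index mapping normalized token -> set of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for tok in tokenize(text, split_chars):
            index[normalize(tok)].add(doc_id)
    return index


def search(index, query, split_chars=""):
    """Return ids of docs containing every token of the query."""
    terms = [normalize(t) for t in tokenize(query, split_chars)]
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Made-up sample documents for illustration.
docs = {
    1: "See http://example.org/wiki/Search for details",
    2: "AC/DC released an album",
}
whole = build_index(docs)            # URL kept as a single segment
split = build_index(docs, "/.:")     # URL split on the extra characters
```

With the URL kept whole, a lookup for "search" matches nothing, because only exact segments are indexed and substrings are never consulted. Once the same split characters are applied to both the documents and the query, "search" finds document 1, and "AC/DC" still finds document 2 because the query is segmented the same way as the text.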
