Hi Stas,

while you are at it, it would be very useful to make a few things
searchable (maybe some already are by now):
* "primary" (not references/qualifiers) years, for birth/death/floruit etc.
* "primary" string/monolingual values (title, taxon name, etc.)
* "primary" IDs, e.g. VIAF (might cause confusion with years, so maybe only
  add numerical IDs if 5+ digits?)

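To make the three bullets concrete, here is a rough Python sketch of the extraction I have in mind. It is only an illustration, not how the search indexer works: it assumes the standard Wikibase entity JSON layout (claims, mainsnak, datavalue), and the function name `searchable_values` and the `min_id_digits` parameter are made up for this example. It also assumes that non-numeric IDs are always kept and that the sign of BCE years is dropped.

```python
import re

# Leading sign and year of a Wikibase time string, e.g. "+1952-03-11T00:00:00Z".
YEAR_RE = re.compile(r"^[+-](\d+)-")


def searchable_values(entity, min_id_digits=5):
    """Collect searchable strings from main snaks only.

    Qualifiers and references are never read, so only "primary" values
    are returned. Hypothetical helper, for illustration only.
    """
    values = []
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            if snak.get("snaktype") != "value":
                continue  # skip "somevalue" / "novalue" snaks
            datatype = snak.get("datatype")
            value = snak["datavalue"]["value"]
            if datatype == "time":
                match = YEAR_RE.match(value["time"])
                if match:
                    # Strip leading zeros; the BCE sign is dropped (assumption).
                    values.append(str(int(match.group(1))))
            elif datatype == "monolingualtext":
                values.append(value["text"])
            elif datatype == "string":
                values.append(value)
            elif datatype == "external-id":
                # Short purely numeric IDs could be mistaken for years.
                if not value.isdigit() or len(value) >= min_id_digits:
                    values.append(value)
    return values
```

With a cutoff of 5 digits, a numeric ID such as "1234" is skipped while "113230702" is kept, which avoids most collisions with four-digit years.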
Cheers,
Magnus

On Wed, Oct 25, 2017 at 1:50 AM Stas Malyshev <[email protected]> wrote:

> Hi!
>
> As I am working on improving Wikidata fulltext search[1], I'd like to
> talk about the search results page. Right now the search results page for
> Wikidata is less than ideal; here are the issues I see with it:
>
> - No match highlighting
> - Meaningless data, like word count (anybody care to guess what it is
> counting? Anybody ever used it?) and byte count (more useful than word
> count but not by much)
> - Obviously, search quality is not super high, but that should be
> improved with proper description indexing
>
> While working on improving the situation, I would like to solicit
> opinions on a set of questions about how the search results page
> should look. Namely:
>
> 1. If the match is made on a label/description that does not match the
> current display language, we could opt for:
> a) Displaying the description that matched, highlighted. Optionally
> maybe display the language of the match (in display language?)
> b) Displaying the description in display language, un-highlighted.
> Which option is preferable?
>
> 2. What do we do if the match is on an alias? Do we display the matching
> alias, the original label, or both? The question above also applies if
> the match is on an alias in another language.
>
> 3. It looks clear to me that word count is useless. Is byte count
> useful and does it need to be kept?
>
> 4. Do we want to display any other parameters of the entity? E.g. we
> have in the index: statement_count, sitelink_count, label_count,
> incoming_links, etc. Do we want to display any?
>
> 5. Display format for Wikidata and for other Wikipedia sites is different:
> Wikipedia:
>
> Title
> Snippet
>
> Wikidata:
>
> Title: Description
>
> I.e. Wikipedia puts the title on a separate line, while Wikidata keeps it
> on the same line, separated by a colon. Is there any reason for this
> difference? Do we want to go back to the common format?
>
> Also if you have any other things/ideas/comments about how fulltext
> search output for Wikidata should be, please tell me.
>
> I am sending this to wikidata-tech and discovery team list only for now,
> since it's still work in progress and half-baked; we could open this for
> wider discussion later if necessary.
>
> [1] https://phabricator.wikimedia.org/T178851
>
> Thanks,
> --
> Stas Malyshev
> [email protected]
>
> _______________________________________________
> Wikidata-tech mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
>
