Hi Stas,

while you are at it, some things would be very useful to be searchable
(maybe some are already by now):
* "primary" (not references/qualifiers) years, for birth/death/floruit etc.
* "primary" string/monolingual values (title, taxon name, etc.)
* "primary" IDs, e.g. VIAF (might cause confusion with years, so maybe only
add numerical IDs if 5+ digits?)
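
To make the three bullets concrete, here is a rough sketch of what I mean, not a proposal for the actual indexing code. It assumes an entity dict shaped like the Wikidata JSON export (claims -> statements -> mainsnak); the function name and the MIN_ID_DIGITS constant are made up for illustration, and the year handling simply drops the sign for BCE dates.

```python
import re

# Hypothetical threshold from the suggestion above: purely numeric IDs
# shorter than this are skipped so they are not confused with years.
MIN_ID_DIGITS = 5


def primary_search_terms(entity):
    """Collect searchable terms from main snaks only.

    Qualifiers and references are never looked at, which is what
    "primary" means in the list above.
    """
    terms = []
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            if snak.get("snaktype") != "value":
                continue  # "somevalue"/"novalue" carry nothing to index
            datatype = snak.get("datatype")
            value = snak["datavalue"]["value"]
            if datatype == "time":
                # Time strings look like "+1952-03-11T00:00:00Z".
                match = re.match(r"[+-](\d+)-", value["time"])
                if match:
                    terms.append(str(int(match.group(1))))
            elif datatype == "monolingualtext":
                terms.append(value["text"])
            elif datatype == "string":
                terms.append(value)
            elif datatype == "external-id":
                # Keep non-numeric IDs; keep numeric ones only if long enough.
                if not value.isdigit() or len(value) >= MIN_ID_DIGITS:
                    terms.append(value)
    return terms
```

With that, a birth date given as a main value would be indexed by its year, while the same date used as a qualifier or in a reference would not.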

Cheers,
Magnus

On Wed, Oct 25, 2017 at 1:50 AM Stas Malyshev <[email protected]>
wrote:

> Hi!
>
> As I am working on improving Wikidata fulltext search[1], I'd like to
> talk about the search results page. Right now the search results page for
> Wikidata is less than ideal; here are the issues I see with it:
>
> - No match highlighting
> - Meaningless data, like word count (anybody cares to guess what it is
> counting? Anybody ever used it?) and byte count (more useful than word
> count but not by much)
> - Obviously, search quality is not super high, but that should be
> improved with proper description indexing
>
> While working on improving the situation, I would like to solicit
> opinions on a set of questions about how the search results page
> should look. Namely:
>
> 1. If the match is made on a label/description that does not match the
> current display language, we could opt for:
> a) Displaying the description that matched, highlighted. Optionally
> maybe display the language of the match (in display language?)
> b) Displaying the description in display language, un-highlighted.
> Which option is preferable?
>
> 2. What do we do if the match is on an alias? Do we display the matching
> alias, the original label, or both? The question above also applies if
> the match is on an alias in another language.
>
> 3. It looks clear to me that word count is useless. Is byte count
> useful and does it need to be kept?
>
> 4. Do we want to display any other parameters of the entity? E.g. we
> have in the index: statement_count, sitelink_count, label_count,
> incoming_links, etc. Do we want to display any?
>
> 5. Display format for Wikidata and for other Wikipedia sites is different:
> Wikipedia:
>
> Title
> Snippet
>
> Wikidata:
>
> Title: Description
>
> I.e. Wikipedia puts the title on a separate line, while Wikidata keeps it
> on the same line, separated by a colon. Is there any reason for this
> difference? Do we want to go back to the common format?
>
> Also, if you have any other ideas/comments about how fulltext
> search output for Wikidata should look, please tell me.
>
> I am sending this to wikidata-tech and discovery team list only for now,
> since it's still work in progress and half-baked, we could open this for
> wider discussion later if necessary.
>
> [1] https://phabricator.wikimedia.org/T178851
>
> Thanks,
> --
> Stas Malyshev
> [email protected]
>
> _______________________________________________
> Wikidata-tech mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
>