Hey :)

Thanks for getting this started.

On Wed, Oct 25, 2017 at 2:49 AM, Stas Malyshev <smalys...@wikimedia.org> wrote:
> Hi!
>
> As I am working on improving Wikidata fulltext search[1], I'd like to
> talk about the search results page. Right now the search results page for
> Wikidata is less than ideal; here are the issues I see with it:
>
> - No match highlighting
> - Meaningless data, like word count (anybody care to guess what it is
> counting? Has anybody ever used it?) and byte count (more useful than word
> count, but not by much)
> - Obviously, search quality is not super high, but that should be
> improved with proper description indexing
>
> While working on improving the situation, I would like to solicit
> opinions on a set of questions about what the search results page
> should look like. Namely:
>
> 1. If the match is made on a label/description that does not match the
> current display language, we could opt for:
> a) Displaying the description that matched, highlighted. Optionally
> maybe display the language of the match (in display language?)
> b) Displaying the description in display language, un-highlighted.
> Which option is preferable?

We already show small language indicators in other places when a label
comes from a fallback language. I believe we should have them here as
well. Otherwise it is confusing where certain labels suddenly come
from, because you might not see them when you go to the actual item.
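
To make the fallback indicator concrete, here is a minimal sketch in Python. It is for illustration only: the function names, the fallback chain, and the bracketed indicator format are all assumptions, not how Wikibase actually implements this.

```python
def pick_label(labels, display_lang, fallback_chain):
    """Return (text, lang) for the first language that has a label.

    `labels` maps language codes to label text. `fallback_chain` is a
    hypothetical ordered list of languages to try after `display_lang`.
    """
    for lang in [display_lang, *fallback_chain]:
        if lang in labels:
            return labels[lang], lang
    return None, None


def render_label(labels, display_lang, fallback_chain):
    """Render a label, adding a language indicator only for fallback labels."""
    text, lang = pick_label(labels, display_lang, fallback_chain)
    if text is None:
        return "(no label)"
    if lang == display_lang:
        # Label is in the display language: no indicator needed.
        return text
    # Hypothetical indicator format: label followed by the language code.
    return f"{text} [{lang}]"
```

With a display language of `en` and a fallback chain of `["de"]`, an item that only has a German label would render with a `[de]` marker, so it is clear why that label does not show up as the English label on the item page.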

> 2. What do we do if the match is on an alias? Do we display the matching
> alias, the original label, or both? The question above also applies if the
> match is on an alias in another language.

I'm slightly leaning toward showing both.

> 3. It looks clear to me that word count is useless. Is byte count
> useful and does it need to be kept?

It helps when you want to get a sense of how large an item is and
whether it is worth your attention. Whether people actually use it in
search results, I'm not sure. They definitely do in recent changes and
history.

> 4. Do we want to display any other parameters of the entity? E.g. we
> have in the index: statement_count, sitelink_count, label_count,
> incoming_links, etc. Do we want to display any?

I'd say in this case we could get rid of the word/byte count. To get a
good glimpse of the quality of the item, I'd say we'd want to show the
number of statements (excluding identifier statements), identifiers,
and sitelinks.
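
As a rough sketch of those three numbers, the Python below derives them from an entity in the standard Wikidata JSON layout (`claims` grouped by property, each statement carrying a `mainsnak` with a `datatype`, plus a `sitelinks` map). The function name and the choice to treat `external-id` main snaks as "identifiers" are assumptions for illustration. The search index fields mentioned above (`statement_count`, `sitelink_count`) may be computed differently.

```python
def quality_counts(entity):
    """Count non-identifier statements, identifier statements and sitelinks."""
    statements = 0
    identifiers = 0
    # "claims" maps a property ID to the list of statements for that property.
    for statement_group in entity.get("claims", {}).values():
        for statement in statement_group:
            datatype = statement.get("mainsnak", {}).get("datatype")
            if datatype == "external-id":
                identifiers += 1
            else:
                statements += 1
    return {
        "statements": statements,
        "identifiers": identifiers,
        "sitelinks": len(entity.get("sitelinks", {})),
    }
```

Keeping identifiers separate matters because an item can have many external IDs and still say very little about its subject.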

> 5. Display format for Wikidata and for Wikipedia sites is different:
> Wikipedia:
>
> Title
> Snippet
>
> Wikidata:
>
> Title: Description
>
> I.e. Wikipedia puts the title on a separate line, while Wikidata keeps it on
> the same line, separated by colon. Is there any reason for this
> difference? Do we want to go back to the common format?

Not sure if we had a reason tbh.

> Also, if you have any other ideas or comments about how fulltext
> search output for Wikidata should look, please tell me.
>
> I am sending this to wikidata-tech and discovery team list only for now,
> since it's still work in progress and half-baked, we could open this for
> wider discussion later if necessary.
>
> [1] https://phabricator.wikimedia.org/T178851
>
> Thanks,
> --
> Stas Malyshev
> smalys...@wikimedia.org
>
> _______________________________________________
> Wikidata-tech mailing list
> Wikidata-tech@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-tech



-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
