On 8 May 2013 18:26, Sumana Harihareswara <suma...@wikimedia.org> wrote:

> Recently a lot of people have been talking about what's possible and
> what's necessary regarding MediaWiki, CatScan-like tools, and real
> category intersection; this mail has some pointers.
>
> The long-term solution is a sparkly query for, e.g., people with aspects
> novelist + Singaporean, and it would be great if Wikidata could be the
> data-source.  Generally people don't really want to search using
> hierarchical categories; they want tags and they want AND. But
> MediaWiki's current power users do use hierarchical labels, so any
> change would have to deal with current users' expectations.  Also my
> head hurts just thinking of the "but my intuitively obvious ontology is
> better than yours" arguments.
>

To put a nice clear stake in the ground, a magic-world-of-loveliness
sparkly proposal for 2015* might be:

* Categories are implemented in Wikidata
* -> They're in whatever language the user wants (so fr:Chat and en:Cat and
nl:kat and zh-hant:貓 …)
* -> They're properly queryable
* -> They're shared between wikis (pooled expertise)
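
As a rough illustration of the "whatever language the user wants" point, a
category could be one shared item carrying a label per language. This is
only a sketch: the Category class, the item_id value and the fallback rule
are all hypothetical, not how Wikidata actually stores things.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One shared category item with a label per language (hypothetical model)."""

    item_id: str  # made-up identifier, not a real Wikidata ID
    labels: dict = field(default_factory=dict)

    def label(self, lang, fallback="en"):
        # Prefer the requested language, then the fallback, then the raw ID.
        return self.labels.get(lang) or self.labels.get(fallback) or self.item_id


cat = Category(
    "category-cat",
    {"en": "Cat", "fr": "Chat", "nl": "kat", "zh-hant": "貓"},
)
```

Every wiki would then point at the same item and simply render a different
label, which is where the pooled expertise comes from.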

* Pages are implicitly in the parent categories of their explicit categories
* -> Pages in <Politicians from the Netherlands> are in <People from the
Netherlands by profession> (its first parent) and <People from the
Netherlands> (its first parent's parent) and <Politicians> (its second
parent) and <People> (its second parent's parent) and …
* -> Yes, this poses issues given the sometimes cyclic nature of
categories' hierarchies, but this is relatively trivial to code around

* Readers can search, querying across categories regardless of whether
they're implicit or explicit
* -> A search for the intersection of <People from the Netherlands> with
<Politicians> will effectively return results for <Politicians from the
Netherlands> (and the user doesn't need to know or care that this is an
extant or non-extant category)
* -> Searches might be more than just intersections, e.g. "<Painters from
the United Kingdom> AND <Living people> NOT <Members of the Royal Academy>"
or whatever.
* -> Such queries might be cached (and, indeed, the intersections that
people search for might be used to suggest new categorisation schemata that
wikis had previously not considered - e.g. <British politicians> & <People
with pet cats> & <People who died in hot-air ballooning accidents>)
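
Once each page has its full set of categories (explicit plus implicit), the
AND/NOT searches above reduce to plain set operations. A minimal sketch,
assuming a hypothetical in-memory PAGE_CATEGORIES map with placeholder page
names; a real system would do this in the database or a search index rather
than in a Python loop.

```python
# Hypothetical data: page title -> all categories it is in (already expanded).
PAGE_CATEGORIES = {
    "Page A": {"Painters from the United Kingdom", "Living people"},
    "Page B": {
        "Painters from the United Kingdom",
        "Living people",
        "Members of the Royal Academy",
    },
    "Page C": {"Painters from the United Kingdom"},
}


def search(page_categories, all_of=(), none_of=()):
    """Return pages that are in every all_of category and in no none_of one."""
    required = set(all_of)
    excluded = set(none_of)
    return sorted(
        page
        for page, categories in page_categories.items()
        if required <= categories and excluded.isdisjoint(categories)
    )


matches = search(
    PAGE_CATEGORIES,
    all_of=["Painters from the United Kingdom", "Living people"],
    none_of=["Members of the Royal Academy"],
)
```

Here matches holds only "Page A": Page B is excluded by the NOT clause and
Page C is missing <Living people>.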

* Editors can tag articles with leaf or branch categories, potentially
overlapping, and the system will rationalise the categories on save to the
minimal spanning subset (or whatever is most useful for users, the
database, and/or both)
* -> Editors don't need to know the hierarchy of categories *a priori* when
adding pages to them (yay, less difficulty)
* -> Power editors don't need to type in loads of different categories if
they have a very specific one in mind (yay, still flexible)
* -> Categories shown to readers aren't necessarily the categories saved in
the database, at editorial judgement (otherwise, would a page not be in
just a single category, namely the intersection of all its tagged
categories?)
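
One possible reading of "rationalise on save" is: drop any tagged category
that is already implied by a more specific tag. A minimal sketch under that
assumption; the PARENTS map and both function names are hypothetical, and it
assumes the tagged categories do not sit on a cycle together (cycles would
need collapsing first, or both tags would be dropped).

```python
# Hypothetical, acyclic parent map: category -> direct parent categories.
PARENTS = {
    "Politicians from the Netherlands": [
        "People from the Netherlands by profession",
        "Politicians",
    ],
    "People from the Netherlands by profession": ["People from the Netherlands"],
    "People from the Netherlands": ["People"],
    "Politicians": ["People"],
}


def ancestors(category, parents):
    """Return every category above the given one (not including itself)."""
    seen = set()
    stack = [category]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def rationalise(tags, parents):
    """Keep only the tags that no other tag already implies."""
    tags = set(tags)
    implied = set()
    for tag in tags:
        implied |= ancestors(tag, parents) - {tag}
    return tags - implied
```

So tagging a page with <Politicians from the Netherlands>, <Politicians> and
<People> would store just the first, while <Politicians> plus <People from
the Netherlands> would keep both, since neither implies the other.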

Apart from the time and resources needed to make this happen and
operational, does this sound like something we'd want to do? It feels like
this, or something like it, would serve our editors and readers the best
from their perspective, if not our sysadmins. :-)

[Snip]

> I think the best place to pursue this topic is probably in
> https://meta.wikimedia.org/wiki/Talk:Beyond_categories .  It's unlikely
> Wikimedia Foundation will be able to make engineers available to work on
> this anytime soon, but I would not be surprised if the Wikidata
> developer community or volunteers found this interesting enough to work on.


I guess I should post this there too, maybe once someone's told me if it's
madcap. ;-)

J.
-- 
James D. Forrester
Product Manager, VisualEditor
Wikimedia Foundation, Inc.

jforres...@wikimedia.org | @jdforrester
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
