Without deliberately turning this into an even longer-term plan (I think it
is a great idea), another long-term solution to the same problem would be
for categories to be largely replaced by tags, as Flow gets Wikipedians used
to the idea of tagging.  That way they lose much of their absoluteness and
therefore some of their controversy.

Categories are hard for Wikipedia because compromise is not possible.
Consensus can be reached on a subtly different compromise version of the
wording of a sentence or paragraph, but there is no compromise on
categories.  A category either exists or does not.  A page either goes in
or does not.

With tags, a biography could relatively uncontroversially be tagged as
"Novelist, Woman, Best Selling, American, Blonde Haired, Enjoys Spicy Food"
even if nearly everybody agrees that half the tags, while true, are
entirely unimportant and not relevant to the subject's area of notability.
Whether some tags, like race and appearance, should exist at all may still
generate debate, but if they are only ever available as modifiers and not
as hard categories, their offense would be softened.

For some subjects, entirely uncontroversial tags could be extracted from
Wikidata.

It would be a content shakeup and therefore perhaps politically difficult,
but it would take a lot of the technical challenge out of joins, even
permitting joins (automatic or manual) with tags translated into
equivalent versions in other languages.
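
As a rough sketch of the cross-language join idea, assuming each tag has a
language-neutral identifier with per-language labels (the ids, labels, page
titles and function names below are all made up for illustration and are
not taken from Wikidata or MediaWiki):

```python
# Hypothetical data: tag ids, labels and pages are invented for illustration.
LABELS = {
    "tag-1": {"en": "Novelist", "fr": "Romancier"},
    "tag-2": {"en": "American", "fr": "Américain"},
}

# Pages on any wiki store language-neutral tag ids, not localized labels.
PAGE_TAGS = {
    ("en", "Example Author"): {"tag-1", "tag-2"},
    ("fr", "Auteur Exemple"): {"tag-1"},
}


def tag_id_for(label, lang):
    """Translate a localized label back to its language-neutral tag id."""
    for tag_id, names in LABELS.items():
        if names.get(lang) == label:
            return tag_id
    return None


def pages_with_labels(labels, lang):
    """Return pages on every wiki that carry all of the given localized tags."""
    ids = {tag_id_for(label, lang) for label in labels}
    if not ids or None in ids:
        return set()  # no tags given, or an unknown label
    return {page for page, tags in PAGE_TAGS.items() if ids <= tags}


french_query = pages_with_labels(["Romancier"], "fr")
english_query = pages_with_labels(["Novelist", "American"], "en")
```

Because the join runs on ids rather than labels, a query phrased in French
can match pages tagged on the English wiki without any per-wiki category
mapping.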

All possible combinations of tag-derived categories would then "exist", and
it would just be a matter of debate as to whether there is a justification
to add a link from a page to "Biography+Novelist+Enjoys Spicy Food" or if
that is a meaningless category.  If the link is reverted, the one person
interested in that exact category could still always visit it; other users
simply would not be directed to it unless they probe talk page debates.
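
To illustrate why every combination "exists" without anyone creating it,
here is a minimal sketch that treats a tag-derived category as a set
intersection computed on demand.  The in-memory store, function names and
sample pages are assumptions for illustration, not an existing MediaWiki
API:

```python
from collections import defaultdict

# Hypothetical in-memory tag store; names and data are illustrative only.
pages_by_tag = defaultdict(set)


def tag_page(page, *tags):
    """Attach each tag to the page."""
    for tag in tags:
        pages_by_tag[tag].add(page)


def pages_with_all(*tags):
    """Return the pages carrying every requested tag (the implied category)."""
    if not tags:
        return set()
    tagged = [pages_by_tag.get(tag, set()) for tag in tags]
    return set.intersection(*tagged)


tag_page("Author A", "Biography", "Novelist", "Woman", "Enjoys Spicy Food")
tag_page("Author B", "Biography", "Novelist")
tag_page("Chef C", "Biography", "Enjoys Spicy Food")

# "Biography+Novelist+Enjoys Spicy Food" is never stored anywhere.
spicy_novelists = pages_with_all("Biography", "Novelist", "Enjoys Spicy Food")
```

Nothing has to be created or deleted for a combination to be queryable; the
only editorial question left is whether a page should link to it.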

Luke Welling


On Thu, May 9, 2013 at 12:38 PM, James Forrester
<jforres...@wikimedia.org> wrote:

> [I worry we're talking about operational details, which should be a wider
> discussion, rather than a technology/feasibility conversation to which this
> list is more suited. Perhaps moving this on-wiki would be best?]
>
> On 9 May 2013 09:28, Brad Jorsch <bjor...@wikimedia.org> wrote:
>
> > On Wed, May 8, 2013 at 10:47 PM, James Forrester
> > <jforres...@wikimedia.org> wrote:
> > > * Pages are implicitly in the parent categories of their explicit
> > > categories
> > > * -> Pages in <Politicians from the Netherlands> are in <People from
> > > the Netherlands by profession> (its first parent) and <People from the
> > > Netherlands> (its first parent's parent) and <Politicians> (its second
> > > parent) and <People> (its second parent's parent) and …
> > > * -> Yes, this poses issues given the sometimes cyclic nature of
> > > categories' hierarchies, but this is relatively trivial to code around
> >
> > Category cycles are the least of it. The fact that the existing
> > category hierarchy isn't based on any sensible-for-inference ontology
> > is a bigger problem.
> >
> > Let's consider what would happen to one of my favorite examples on
> > enwiki:
> > * The article for Romania is in <Black Sea countries>. Ok.
> > * And that category is in <Black Sea>, so Romania is in that too.
> > Which is a little strange, but not too bad.
> > * And <Black Sea> is in <Seas of Russia> and <Landforms of Ukraine>.
> > Huh? Romania doesn't belong in either of those, despite that being
> > equivalent to your example where pages in <Politicians from the
> > Netherlands> also end up in <People> via <Politicians>.
> >
> > And it gets worse the further up you go. You would have Romania in
> > <Liquids> a few more levels up.
> >
> > For this to work, each wiki would have to redo its category hierarchy
> > as a real ontology based on is-a relationships, rather than the
> > current is-somehow-related-to. Or we would have to introduce some
> > magic word or something to tell MediaWiki that <Politicians> is-a
> > <People> is a valid inference while <Black Sea countries> is-a <Black
> > Sea> isn't.
> >
> > In other words, code-wise adding "tags" to an article is the same as
> > categories with inference and querying. But trying to use the existing
> > category setup as it exists on something like enwiki as "tags" for
> > inference (or querying, to a lesser extent) seems like GIGO.
> >
>
> Quite - hence the bit of my proposal where the categories would get created
> on Wikidata from scratch as a synthesis of the needs of the editing
> community. :-)
>
> Implicitly, these would have clear semantics about the correctitude of
> their usage governed by something analogous to how Wikidata's community are
> managing the roll-out of statements on the system. In terms of tools to
> prevent this becoming an issue, Wikidata's nature means we could easily
> make sure that the domain of a category would be limited (e.g. "Fluids"
> maps to "substances", not "instances of substances").
>
>
>
> > > * Readers can search, querying across categories regardless of whether
> > > they're implicit or explicit
> > > * -> A search for the intersection of <People from the Netherlands>
> > > with <Politicians> will effectively return results for <Politicians
> > > from the Netherlands> (and the user doesn't need to know or care that
> > > this is an extant or non-extant category)
> >
> > A person who is originally from the Netherlands but moved to Germany
> > and became a politician there would be in <People from the
> > Netherlands> and <Politicians>, but maybe should not be in
> > <Politicians from the Netherlands> depending on how exactly you define
> > that category.
> >
>
> Indeed; I deliberately chose to use <Politicians from the Netherlands>
> rather than <Politicians of the Netherlands> or <Politicians in the
> Netherlands> which are distinct categories with entirely different
> semantics, but you're right that semantics would need to be clear.
>
> J.
> --
> James D. Forrester
> Product Manager, VisualEditor
> Wikimedia Foundation, Inc.
>
> jforres...@wikimedia.org | @jdforrester
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>