[I worry we're talking about operational details, which should be a
wider discussion, rather than a technology/feasibility conversation to
which this list is more suited. Perhaps moving this on-wiki would be
best?]

On 9 May 2013 09:28, Brad Jorsch <bjor...@wikimedia.org> wrote:

> On Wed, May 8, 2013 at 10:47 PM, James Forrester
> <jforres...@wikimedia.org> wrote:
> > * Pages are implicitly in the parent categories of their explicit
> > categories
> > * -> Pages in <Politicians from the Netherlands> are in <People from the
> > Netherlands by profession> (its first parent) and <People from the
> > Netherlands> (its first parent's parent) and <Politicians> (its second
> > parent) and <People> (its second parent's parent) and …
> > * -> Yes, this poses issues given the sometimes cyclic nature of
> > categories' hierarchies, but this is relatively trivial to code around
>
> Category cycles are the least of it. The fact that the existing
> category hierarchy isn't based on any sensible-for-inference ontology
> is a bigger problem.
>
> Let's consider what would happen to one of my favorite examples on enwiki:
> * The article for Romania is in <Black Sea countries>. Ok.
> * And that category is in <Black Sea>, so Romania is in that too.
> Which is a little strange, but not too bad.
> * And <Black Sea> is in <Seas of Russia> and <Landforms of Ukraine>.
> Huh? Romania doesn't belong in either of those, despite that being
> equivalent to your example where pages in <Politicians from the
> Netherlands> also end up in <People> via <Politicians>.
>
> And it gets worse the further up you go. You would have Romania in
> <Liquids> a few more levels up.
>
> For this to work, each wiki would have to redo its category hierarchy
> as a real ontology based on is-a relationships, rather than the
> current is-somehow-related-to. Or we would have to introduce some
> magic word or something to tell MediaWiki that <Politicians> is-a
> <People> is a valid inference while <Black Sea countries> is-a <Black
> Sea> isn't.
>
> In other words, code-wise adding "tags" to an article is the same as
> categories with inference and querying. But trying to use the existing
> category setup as it exists on something like enwiki as "tags" for
> inference (or querying, to a lesser extent) seems like GIGO.
>

Quite - hence the bit of my proposal where the categories would get
created on Wikidata from scratch as a synthesis of the needs of the
editing community. :-)

Implicitly, these would have clear semantics, with the correctness of
their usage governed by something analogous to how Wikidata's community
are managing the roll-out of statements on the system. In terms of tools
to prevent this becoming an issue, Wikidata's nature means we could
easily make sure that the domain of a category would be limited (e.g.
"Fluids" maps to "substances", not "instances of substances").

> > * Readers can search, querying across categories regardless of whether
> > they're implicit or explicit
> > * -> A search for the intersection of <People from the Netherlands> with
> > <Politicians> will effectively return results for <Politicians from the
> > Netherlands> (and the user doesn't need to know or care that this is an
> > extant or non-extant category)
>
> A person who is originally from the Netherlands but moved to Germany
> and became a politician there would be in <People from the
> Netherlands> and <Politicians>, but maybe should not be in
> <Politicians from the Netherlands> depending on how exactly you define
> that category.
>

Indeed; I deliberately chose to use <Politicians from the Netherlands>
rather than <Politicians of the Netherlands> or <Politicians in the
Netherlands>, which are distinct categories with entirely different
semantics, but you're right that the semantics would need to be clear.

J.
--
James D. Forrester
Product Manager, VisualEditor
Wikimedia Foundation, Inc.

jforres...@wikimedia.org | @jdforrester
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
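
For anyone who wants to poke at the inference question concretely, here
is a rough Python sketch of the idea discussed in this thread. It is not
anything MediaWiki or Wikidata implements. The names PARENTS, PAGES,
implied_categories and search, and the "is_a" / "related" edge labels,
are all made up for illustration, and the labels on each edge are my own
assumption about how a community might classify them. It shows two
things: a visited set is enough to cope with category cycles, and
inference only gives sane answers if it is restricted to is-a edges.

```python
from collections import deque

# Hypothetical parent links: category -> [(parent, relation), ...].
# The relation labels are illustrative assumptions, not real wiki data.
PARENTS = {
    "Politicians from the Netherlands": [
        ("People from the Netherlands by profession", "is_a"),
        ("Politicians", "is_a"),
    ],
    "People from the Netherlands by profession": [
        ("People from the Netherlands", "is_a"),
    ],
    "Politicians": [("People", "is_a")],
    # Countries on the Black Sea are related to the sea, not a kind of sea.
    "Black Sea countries": [("Black Sea", "related")],
    "Black Sea": [
        ("Seas of Russia", "is_a"),
        ("Landforms of Ukraine", "is_a"),
    ],
}

# Hypothetical pages with their explicit categories.
PAGES = {
    "Example Dutch politician": ["Politicians from the Netherlands"],
    "Romania": ["Black Sea countries"],
}


def implied_categories(explicit, parents, follow=("is_a",)):
    """Return the explicit categories plus every ancestor reachable
    through edges whose relation is in `follow`."""
    seen = set()
    queue = deque(explicit)
    while queue:
        category = queue.popleft()
        if category in seen:
            continue  # already expanded; this is what makes cycles harmless
        seen.add(category)
        for parent, relation in parents.get(category, []):
            if relation in follow and parent not in seen:
                queue.append(parent)
    return seen


def search(pages, parents, wanted, follow=("is_a",)):
    """Return the pages that are (explicitly or implicitly) in all of
    the `wanted` categories, sorted by title."""
    wanted = set(wanted)
    return sorted(
        title
        for title, explicit in pages.items()
        if wanted <= implied_categories(explicit, parents, follow)
    )
```

With only is-a edges followed, a search for the intersection of
<People from the Netherlands> and <Politicians> finds the example
politician, and Romania stays out of <Seas of Russia>. Passing
follow=("is_a", "related") reproduces Brad's problem: Romania ends up in
<Seas of Russia> and <Landforms of Ukraine>.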