Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

Daniel Kinzler Wed, 17 Oct 2018 07:05:05 -0700

My (very belated) thoughts on this issue:

Wiki content grows in a messy way, and it stays messy until the messiness causes
problems. Once it causes problems, people are motivated to clean it up.


I propose to implement hierarchical search based on very simple, predictable
rules, e.g. by having a configurable list of transitive relationships that get
evaluated to a certain depth. I'd go for subclasses, geographical inclusion, and
subspecies at first.
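
To make the idea concrete, here is a minimal sketch, not a description of any existing implementation. It assumes the three relationships map to the Wikidata properties P279 (subclass of), P131 (located in the administrative territorial entity) and P171 (parent taxon). The toy statements, the names TRANSITIVE_PROPERTIES, MAX_DEPTH, build_reverse_index and expand_term, and the depth of 3 are all hypothetical. A search term is expanded to everything that reaches it through the configured relationships, up to a fixed depth, with no special cases for dirty data.

```python
from collections import defaultdict, deque

# Hypothetical configuration: the properties treated as transitive,
# and how many hops to follow. Both are assumptions for illustration.
TRANSITIVE_PROPERTIES = {"P279", "P131", "P171"}
MAX_DEPTH = 3

# Toy (subject, property, object) statements; labels stand in for item IDs.
STATEMENTS = [
    ("toy poodle", "P279", "poodle"),
    ("poodle", "P279", "dog"),
    ("dog", "P279", "mammal"),
    ("mammal", "P279", "animal"),
    ("poodle photo", "P180", "poodle"),  # "depicts" is not in the configured list
]


def build_reverse_index(statements, properties):
    """Map each object to the subjects that point at it via a configured property."""
    index = defaultdict(set)
    for subject, prop, obj in statements:
        if prop in properties:
            index[obj].add(subject)
    return index


def expand_term(term, index, max_depth):
    """Breadth-first expansion of a search term; returns {item: depth}.

    Items already seen are skipped, so cycles in the data cannot cause
    an endless loop; they just show up as (possibly surprising) results.
    """
    seen = {term: 0}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        depth = seen[current]
        if depth == max_depth:
            continue  # depth limit reached, do not follow further
        for child in index.get(current, ()):
            if child not in seen:
                seen[child] = depth + 1
                queue.append(child)
    return seen


if __name__ == "__main__":
    index = build_reverse_index(STATEMENTS, TRANSITIVE_PROPERTIES)
    for item, depth in sorted(expand_term("dog", index, MAX_DEPTH).items()):
        print(depth, item)
```

Returning the depth alongside each item is a deliberate choice in this sketch: it gives a simple way to tell people why a result showed up, which is what lets them go and fix the data.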

Doing this will NOT produce good results. To get good results, you would have to
implement a lot of special cases and heuristics to work around dirty data. I say:
let it produce bad results, tell people why the results are bad, and what they
can do about it!

The Wikimedia community is AMAZING at making good use of whatever capabilities
the software offers, and at adapting content to make the software produce the
results they want. By providing limited but clearly defined software support for
hierarchical search, we allow the community to optimize the content to work with
that search.
Keeping the rules simple means that other consumers can then follow the same
rules, and the content will work for them as well.

-- daniel

On 29.09.2018 at 19:25, Gerard Meijssen wrote:
> Hoi,
> There is also the age-old conundrum where some want to enforce their rules for
> the good of all because (argument of the day follows).
> 
> First of all, Wikidata is very much a child of Wikipedia. It has its own
> structures and people have endeavoured to build those same structures in
> Wikidata, never mind that it is a very different medium and never mind that
> there are 280+ Wikipedias that might consider things to be different. The start
> of Wikidata was also an auspicious occasion where it was thought to be OK to
> adopt an external German authority. That proved to be a disaster and there are
> still residues of this awful decision. It did not take long to show the
> shortcomings of this schedule and it was replaced by something more sensible.
> 
> However, we got something really Wiki and it was all too wild. It did not take
> long for me to ask for someone to explain the current structures and nobody
> volunteered. So I did what I do best, I largely ignored the results of the
> classes and subclasses. It does not work for me. It works against me so my
> current strategy is to ignore this nonsense and concentrate on including data.
> The reason is simple; once data is included, it is easy to slice it, dice it,
> and structure it as we see fit at a later date.
> 
> So when our priority becomes to make our data reusable and more open, we should
> agree on it. So far we have not because we choose to fight each other. Some
> have ideas, some have invested too much in what we have at this time. When we
> are to make our data reusable, we should agree on what it is exactly we aim to
> achieve. Is it to support Commons, or is it to support some external standard
> that is academically sound? I would always favour what is practical and easily
> measured.
> 
> I would support Commons first. It has the benefit that it will bring our
> communities together in a clear objective. It has the benefit that changes in
> the operations of Wikidata support the whole of the Wikimedia universe and
> consequently financial, technical and operational needs and investments are
> easily understood. It also means that all the bureaucracy that has materialised
> will be shown to be in the way when it is.
> 
> So my question is not if we are a Wiki, my question is whether we are Wiki
> enough and willing to change our ways for our own good.
> Thanks,
>       GerardM
> 
> On Sat, 29 Sep 2018 at 16:38, Thad Guidry <thadgui...@gmail.com
> <mailto:thadgui...@gmail.com>> wrote:
> 
>     Ettore,
> 
>     Wikidata has the ability of crowdsourcing...unfortunately, it is not
>     effectively utilized.
> 
>     It's because Wikidata does not yet provide a voting feature on
>     statements, where, as the vote count gets higher, more resistance to
>     changing the statement is required.
>     But that breaks the notion of a "wiki" for some folks.
>     And there we circle back to Gerard's age-old question of ... should
>     Wikidata really be considered a wiki at all for the benefit of society? Or
>     should it apply voting/resistance to keep it tidy, factual and less messy?
> 
>     We have the technology to implement voting/resistance on statements.  I
>     personally would utilize that feature and many others probably would as
>     well.  Crowdsourcing the low voted facts back to applications like
>     OpenRefine, or the recently sent out Survey vote mechanism for spam
>     analysis on the low voted statements could highlight where things are
>     untidy and implement vote casting to clean them up.
> 
>     "...the burden of proof has to be placed on authority, and it should be
>     dismantled if that burden cannot be met..."
> 
>     -Thad
>     +ThadGuidry <https://plus.google.com/+ThadGuidry>
> 
> 
>     On Sat, Sep 29, 2018 at 2:49 AM Ettore RIZZA <ettoreri...@gmail.com
>     <mailto:ettoreri...@gmail.com>> wrote:
> 
>         Hi,
> 
>         Wikidata's ontology is a mess, and I do not see how it could be
>         otherwise. While the creation of new properties is controlled, any
>         fool can decide that a woman <https://www.wikidata.org/wiki/Q467> is no
>         longer a human or is part of a family. Maybe I'm a fool too? I wanted
>         to remove the claim that a ship <https://www.wikidata.org/wiki/Q11446>
>         is an instance of "ship type" because it produces weird circular
>         inferences in my application; but maybe that makes sense to someone
>         else.
> 
>         There will never be a universal ontology on which everyone agrees. I
>         wonder (sorry to think aloud) if Wikidata should not rather facilitate
>         the use of external classifications. Many external ids are knowledge
>         organization systems (ontologies, thesauri, classifications ...). I
>         dream of a simple query that could search, in Wikidata, "all elements
>         of the same class as 'poodle' according to the classification of
>         imagenet" <http://imagenet.stanford.edu/synset?wnid=n02113335>.
> 
>     _______________________________________________
>     Wikidata mailing list
>     Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
>     https://lists.wikimedia.org/mailman/listinfo/wikidata
> 
> 
> 
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
> 


-- 
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
