On Wed, Dec 16, 2015 at 11:12 AM, Andrea Zanni <[email protected]>
wrote:

> On Sun, Dec 13, 2015 at 9:35 PM, Jane Darnell <[email protected]> wrote:
>
> > Andrea,
> > I totally agree on the mission/vision thing, but am not sure what you mean
> > exactly by scale - do you mean that Wikidata shouldn't try to be so
> > granular that it has a statement to cover each factoid in any Wikipedia
> > article, or do you mean we need to talk about what constitutes notability
> > in order not to grow Wikidata exponentially to the point the servers crash?
> > Jane
> >
> >
> Hi Jane, I explained myself poorly (sometimes English is too difficult :-)
>
> What I mean is that the error *could* be on a different scale,
> another order of magnitude.
> The propagation of the error is multiplied, it's not just a single error on
> a wikipage: it's an error propagated in many wikipages, and then Google,
> etc.
> A single point of failure.
>


Exactly: a single point of failure. A system where a single point of
failure can have such consequences, potentially corrupting knowledge
forever, is a bad system. It's not robust.

In the op-ed, I mentioned the Brazilian aardvark hoax[1] as an example of
error propagation (which happened entirely without Wikidata's and the
Knowledge Graph's help). It took the New Yorker quite a bit of research to
piece together and confirm what happened, research which I understand would
not have happened if the originator of the hoax had not been willing to
talk about his prank.

It was the same with the fake Maurice Jarre quotes in Wikipedia[2] that
made their way into mainstream press obituaries a few years ago. If the
hoaxer had not come forward, no one would have been the wiser. The fake
quotes would have remained a permanent part of the historical record.

More recent cases include the widely repeated (including by Associated
Press, for God's sake, to this day) claim that Joe Streater was involved in
the Boston College basketball point shaving scandal[3] and the Amelia
Bedelia hoax.[4]

If even things people insert as a joke propagate around the globe as a
result of this vulnerability, then there is a clear and present potential
for purposeful manipulation. We've seen enough cases of that, too.[5]

This is not the sort of system the Wikimedia community should be helping to
build. The very values at the heart of the Wikimedia movement are about
transparency, accountability, multiple points of view, pluralism,
democracy, opposing dominance and control by vested interests, and so
forth.

What is the way forward?

Wikidata should, as a matter of urgency, rescind its decision to make its
content available under the CC0 licence. Global propagation without
attribution is a terrible idea.

Quite apart from that, in my opinion Wikidata's CC0 licensing also
infringes Wikipedia contributors' rights as enshrined in Wikipedia's CC
BY-SA licence, a point Lydia Pintscher did not even contest on the Signpost
talk page. As I understand her response,[6] she restricts herself to
asserting that the responsibility for any potential licence infringement
lies with Wikidata contributors rather than with her and Wikimedia
Deutschland. That's passing the buck.

If Wikidata is not prepared to follow CC BY-SA, the way DBpedia does[7],
the next step should be a DMCA takedown notice for material mass-imported
from Wikipedia.

And of course, Wikidata needs to step up its efforts to cite verifiable
sources.


[1] http://www.newyorker.com/tech/elements/how-a-raccoon-became-an-aardvark
[2] http://www.theguardian.com/commentisfree/2009/may/04/journalism-obituaries-shane-fitzgerald
[3] http://awfulannouncing.com/2014/guilt-wikipedia-joe-streater-became-falsely-attached-boston-college-point-shaving-scandal.html
    Associated Press: http://bigstory.ap.org/article/list-worst-scandals-college-sports
[4] http://www.dailydot.com/lol/amelia-bedelia-wikipedia-hoax/
[5] http://www.newsweek.com/2015/04/03/manipulating-wikipedia-promote-bogus-business-school-316133.html
    and http://www.dailydot.com/lifestyle/wikipedia-plastic-surgery-otto-placik-labiaplasty/
    and many others
[6] https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Wikipedia_Signpost/2015-12-09/Op-ed&diff=695228403&oldid=695228022
[7] http://wiki.dbpedia.org/terms-imprint


> Of course, the opposite is also true: it's a single point of openness,
> correction, information.
> I was just wondering if this different scale is a factor in making
> Wikipedia and Wikidata different enough to accept/reject Andreas's arguments.
>
> Andrea
>
>
>
> > On Sun, Dec 13, 2015 at 7:10 PM, Andrea Zanni <[email protected]>
> > wrote:
> >
> > > I really feel we are drowning in a glass of water.
> > > The issue of "data quality" or "reliability" that Andreas raises is well
> > > known:
> > > what I don't understand is whether the "scale" of it is much bigger on Wikidata
> > > than Wikipedia,
> > > and if this different scale makes it much more important. The scale of the
> > > issue is maybe something worth discussing, and not the issue itself? Is the
> > > fact that Wikidata is centralised different from statements on Wikipedia? I
> > > don't know, but to me this is a more neutral and interesting question.
> > >
> > > I often say that the Wikimedia world made quality a "Heisenbergian"
> > > feature: you always have to check if it's there.
> > > The point is: it's been always like this.
> > > We always had to check for quality, even when we used Britannica or
> > > authority controls or whatever "reliable" sources we wanted. Wikipedia, and
> > > now Wikidata, is made for everyone to contribute, it's open and honest in
> > > being open, vulnerable, prone to errors. But we are transparent, we say
> > > that in advance, we can claim any statement to the smallest detail. Of
> > > course it's difficult, but we can do it. Wikidata, as Lydia said, can
> > > actually have conflicting statements in every item: we "just" have to put
> > > them there, as we did to Wikipedia.
> > >
> > > If Google uses our data and they are wrong, that's bad for them. If they
> > > correct the errors and do not give us the corrections, that's bad for us
> > > and not ethical of them. The point is: there is no license (as far as I
> > > know) that can force them to contribute to Wikidata. That is, IMHO, the
> > > problem with "over-the-top" actors: they can harness collective intelligence
> > > and "not give back." Even with CC-BY-SA, they could store (as they are
> > > probably already doing) all the data in their knowledge vault, which is
> > > secret as it is an incredible asset for them.
> > >
> > > I'd be happy to insert a new clause of "forced transparency" in CC-BY-SA or
> > > CC0, but it's not there.
> > >
> > > So, as we are working via GLAMs with Wikipedia for getting reliable
> > > sources and content, we are working with them also for good statements and
> > > data. Putting good data in Wikidata makes it better, and I don't understand
> > > what the problem is here (I understand, again, the issue of putting too
> > > much data and still having a small community).
> > > For example: if we are importing different reliable databases, and the
> > > institutions behind them find it useful and helpful to have an aggregator
> > > of identifiers and authority controls, what is the issue? There is value in
> > > aggregating data, because you can spot errors and inconsistencies. It's not
> > > easy, of course, to find a good workflow, but, again, that is *another*
> > > problem.
> > >
> > > So, in conclusion: I find many issues in Wikidata, but not on the
> > > mission/vision, just in the complexity of the project, the size of the
> > > dataset, the size of the community.
> > >
> > > Can we talk about those?
> > >
> > > Aubrey
> > >
> > >
> > >
> > > On Sun, Dec 13, 2015 at 6:40 PM, Andreas Kolbe <[email protected]>
> > wrote:
> > >
> > > > On Sun, Dec 13, 2015 at 5:32 PM, geni <[email protected]> wrote:
> > > >
> > > > > On 13 December 2015 at 15:57, Andreas Kolbe <[email protected]>
> > > wrote:
> > > > >
> > > > > > Jane,
> > > > > >
> > > > > > > The issue is that you can't cite one Wikipedia article as a
> > > > > > > source in another.
> > > > > >
> > > > >
> > > > >
> > > > > However you can within the same article per [[WP:LEAD]].
> > > > >
> > > >
> > > >
> > > > Well, of course, if there are reliable sources cited in the body of the
> > > > article that back up the statements made in the lead. You still need to
> > > > cite a reliable source though; that's Wikipedia 101.
> > > > _______________________________________________
> > > > Wikimedia-l mailing list, guidelines at:
> > > > https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
> > > > [email protected]
> > > > Unsubscribe:
> https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> > > > <mailto:[email protected]?subject=unsubscribe>
> > > >
