Liam,

I am interested in anything demonstrating that the things I am concerned
about are not a problem.

Further comments interspersed below.

On Fri, Nov 27, 2015 at 12:51 PM, Liam Wyatt <[email protected]> wrote:

> On 27 November 2015 at 12:08, Andreas Kolbe <[email protected]> wrote:
>
> > The Wikimedia movement has always had an important principle: that all
> > content should be traceable to a "reliable source". Throughout the first
> > decade of this movement and beyond, Wikimedia content has never been
> > considered a reliable source. For example, you can't use a Wikipedia
> > article as a reference in another Wikipedia article.
> >
> > Another important principle has been the disclaimer: pointing out to
> > people that the data is anonymously crowdsourced, and that there is no
> > guarantee of reliability or fitness for use.
> >
> > Both of these principles are now being jettisoned.
> >
> > Wikipedia content is considered a reliable source in Wikidata...
> >
>  <snip>
>
> I agree that "reliable source" referencing and "crowdsourced content" are
> indeed principles of our movement. However, I disagree that Wikidata is
> "jettisoning" them. In fact, quite the contrary!
>
> The purpose of the statement "imported from --> English Wikipedia" in the
> "reference" field of a Wikidata item's statement is PRECISELY to indicate
> to the user that this information has not been INDEPENDENTLY verified to a
> reliable source and that Wikipedia is NOT considered a reliable source.
> Furthermore, it provides a PROVENANCE of that information to help stop
> people from circular referencing. That is - clearly stating that the
> specific fact in Wikidata has come from Wikipedia helps to avoid the
> structured-data equivalent of "citogenesis": https://xkcd.com/978/ If/When
> a person can provide a reliable reference for that same fact, they are
> encouraged to add an actual reference. Note, the Wikidata statements used
> for facts coming in from Wikipedia use the property "imported from". This
> is deliberately different from the property "reference URL" which is what
> you would use when adding an actual reference to a third-party reliable
> online source.
>


How does the presence of that information in Wikidata help if the Google
user just gets the info in the Knowledge Graph without any indication that
it comes from Wikidata? After all, CC0 specifically waives the right to
attribution that Wikipedia's licence retains.[1][2] No re-user of Wikidata
content is required to say where the data came from, and they typically
don't.

So, absent this information, don't you think it likely that users will
simply propagate information they find in Google and on other reusers'
sites? Rather than preventing citogenesis, I think it's citogenesis on
steroids, given that Google has far more users than any Wikimedia project.

This CC0, no-attribution arrangement may financially benefit Google,
because they can dispense with a source link that might lead users away
from their own site and their ads, but how does it benefit the public, or
indeed benefit Wikimedia? Are we all just working to make Google richer, or
are we working for the public?

Moreover, according to data on Wikimedia Labs[3], about half of all
statements in Wikidata have *no reference whatsoever*. That's *in addition*
to the third that are only referenced to a Wikipedia.
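
To make those three categories concrete, here is a minimal Python sketch of
how one might bucket an item's statements into "unreferenced", "referenced
only to a Wikipedia" and "referenced to something else". It assumes the item
is available as JSON in the layout of Wikidata's entity export (claims ->
statements -> references -> snaks) and that P143 is the "imported from"
property. The function names and the sample item are made up for
illustration, and this is not necessarily how the Labs tool arrives at its
figures.

```python
from collections import Counter

# Assumption: P143 is Wikidata's "imported from" property.
IMPORTED_FROM = "P143"


def classify_statement(statement):
    """Bucket one statement by the kind of references attached to it."""
    saw_import = False
    for ref in statement.get("references", []):
        props = set(ref.get("snaks", {}))
        if not props:
            continue  # an empty reference block tells us nothing
        if IMPORTED_FROM in props:
            saw_import = True  # provenance note, not an independent source
        else:
            return "externally_referenced"
    return "wikipedia_only" if saw_import else "unreferenced"


def tally(entity):
    """Count the statements of one item per bucket."""
    counts = Counter()
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            counts[classify_statement(statement)] += 1
    return counts


# Hypothetical item, trimmed to the fields the sketch actually reads.
sample = {
    "claims": {
        "P31": [{"references": [{"snaks": {"P143": [{}]}}]}],
        "P569": [{"references": [{"snaks": {"P854": [{}]}}]}],
        "P19": [{}],
        "P20": [{"references": [{"snaks": {"P143": [{}], "P813": [{}]}}]}],
    }
}

counts = tally(sample)
print(dict(counts))
```

Note that a reference carrying "imported from" plus a retrieval date still
counts as Wikipedia-only here, since the date adds no independent source.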

Yet all of this material is meant to form an input to the Google Knowledge
Graph, following Google's abandonment of Freebase in favour of
Wikidata.[4][5]



> Furthermore, the fact that many statements in Wikidata are not given a
> reference (yet) is not necessarily a "problem". For example - this
> https://www.wikidata.org/wiki/Q21481859 is a Wikidata item for a scientific
> publication with 2891 co-authors!! This is an extreme example, but it
> demonstrates my point... None of those 2891 statements has a specific
> reference listed for it, because all of them are self-evidently referenced
> to the scientific publication itself. The same is true of the other
> properties applied to this item (volume, publication date, title, page
> number...). All of these could be "referenced" to the very first property
> in the Wikidata item - the DOI of the scientific article:
> http://www.sciencedirect.com/science/article/pii/S0370269312008581 This
> item is not "less reliable" because it doesn't have the same footnote
> repeated almost three thousand times, but if you merely look at statistics
> of "unreferenced wikidata statements" it would APPEAR that it is very
> poorly cited.
> So, I think we need a more nuanced view of what "proper referencing" means
> in the context of Wikidata.
>


I take your point, even though I am unsure what value this Wikidata listing
adds for the public, given that it merely reproduces details from the
publisher's page. Might we be reinventing the wheel? And if there is value
added for the public in some way that escapes me, surely it would not be
difficult to have the bot add the reference automatically when importing
the data from the publisher's page, thereby showing that it is referenced
and making it easier to spot when someone subsequently adds the name of his
classmate as a joke?

I'll add an extreme example of my own, from the opposite end of the
spectrum: for five months in 2014, Wikidata told the world that Franklin D.
Roosevelt was also known as "Adolf Hitler".[6]

If obvious unsourced vandalism lasts as long as that, I am not sanguine
about the likelihood of more subtle distortions being spotted in a timely
manner. Note that manipulation of Knowledge Graph content was reportedly a
problem with Freebase as well.[4]


[1] https://creativecommons.org/publicdomain/zero/1.0/
[2]
https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License
[3] https://tools.wmflabs.org/wikidata-todo/stats.php
[4]
https://www.seroundtable.com/google-freebase-wikidata-knowledge-graph-19591.html
[5]
http://searchengineland.com/google-close-freebase-helped-feed-knowledge-graph-211103
[6] https://www.wikidata.org/w/index.php?title=Q8007&oldid=124603129
https://www.wikidata.org/w/index.php?title=Q8007&diff=next&oldid=154290374
_______________________________________________
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
[email protected]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 
<mailto:[email protected]?subject=unsubscribe>
