Hoi,
Thank you for another approach. When Wikidata imports data from Wikipedia,
it essentially stands on the shoulders of giants. Yes, there are sources in
Wikipedia, and they do not prevent occasional issues. Yes, we import a lot
of data from Wikipedia and this makes life at Wikidata easy and what we do
obvious. It all started with improving quality at Wikidata by making
interwiki links manageable, and we are still often involved in fixing
wikilinks in Wikidata because the assumptions behind linking some articles
are "funny".

When you look at Wikipedia, a lot of the fixtures are essentially about
data. A category or a list can be replicated in many ways by querying
Wikidata. The inverse is that Wikidata can be populated from Wikipedia.
Consequently, when we say that we know about men and women in so many
Wikipedias, it is because of this import that we can and do. When Wikipedia is
correct, Wikidata is. When Wikipedias do not agree, you will find this
expressed in Wikidata.

When people build tools and bots, and they have done so for a long time, it
is EXACTLY based on the assumption that Wikipedia is essentially correct,
and it is why the quality and quantity of Wikidata is already this good. When
you want to consider Wikidata and its complexity, it is important to look
at the statistics. The statistics by Magnus are the most relevant because
they help explain many of the issues of Wikidata.

One important point. No Wikipedia can claim Wikidata as it is a composite.
Wikipedia policies do not apply. When people insist that all the data in
Wikidata has to be 100% correct, forget it. Wikipedia is not 100% correct
either and that is what we build upon. It has never been 100% correct and
it is impossible to achieve this any time soon.

What we can do is build upon existing qualities, compare and curate. It is
for instance fairly easy to improve on Wikipedia based upon the information
that is already there but shown to be problematic. It is easy when we
collaborate as it will improve the quality of what we offer. One problem is
that we are SO bad at collaboration. Wikipedians work on one article at a
time. When I work on awards there are easily 60 persons involved, and I
trust Wikipedia to be right. I blog regularly about the kind of issues I
encounter. I am not involved in single items unless they are of relevance
to me, like Bassel, the only Wikipedian sentenced to death. So I did add
new items for the red links in the award he received, and I did ask Magnus
to help me with a list for that award. I added the website I used on the
award and that is as far as I go.

When you want to talk about the issues, what is it that you want to
achieve? So far there has been little interest in Wikidata. When you want
to learn about issues, research the issues. Find methods to calculate the
error rate, find methods to compare Wikidata with the Wikipedias and with
other sources in a meaningful way. But do approach it like Magnus does. His
contributions help us make a positive difference. When you find numbers for
now that you cannot replicate with the next dump and the next, they are
essentially without much value because they do not enable us to improve on
what we have. They do not help us engage our minds to make a difference. I
ask Amir regularly to run a bot based on the statistics produced by Magnus;
we are not at the stage where we have such tasks automated...
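
To make "numbers you can replicate with the next dump" concrete, here is a
minimal sketch of one possible comparison. Everything in it is an
assumption for illustration: the function name, the idea of reducing each
source to a plain item-to-value mapping, and the sample values are all
hypothetical and not taken from any existing tool or from Magnus's
statistics. It counts how often two sources disagree on the items they
share and attaches a Wilson score interval, so that the same calculation
can be rerun on every dump and the results compared.

```python
import math


def disagreement_rate(source_a, source_b, z=1.96):
    """Return (rate, low, high) for items present in both mappings.

    source_a and source_b are hypothetical dicts mapping an item id to a
    single value (for instance a gender statement). The interval is a
    Wilson score interval at the confidence level implied by z.
    """
    shared = sorted(set(source_a) & set(source_b))
    n = len(shared)
    if n == 0:
        raise ValueError("no shared items to compare")

    mismatches = sum(1 for item in shared if source_a[item] != source_b[item])
    p = mismatches / n

    # Wilson score interval: behaves better than the naive interval
    # when n is small or p is close to 0 or 1.
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Made-up sample data, only to show the call.
wikidata = {"item1": "male", "item2": "female", "item3": "male", "item4": "female"}
wikipedia = {"item1": "male", "item2": "female", "item3": "female", "item5": "male"}

rate, low, high = disagreement_rate(wikidata, wikipedia)
print(f"disagreement {rate:.2%} (interval {low:.2%} to {high:.2%})")
```

A disagreement is not yet an error; it only tells you where to look. The
value is in running the same calculation on each dump and watching the
number move.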

Andrea, Wikidata is a wiki. It is young and it has already proven itself
for several applications. What can be done improves as our data improves.
We have a lack of data on many subjects because it is where Wikipedia is
lacking. How will we approach for instance the fact that we have fewer than
1000 Syrians and one of them is an emperor of the Roman empire and another
is Bassel?

Let us be bold and allow us to be a wiki. Let us work towards the quality
that is possible to achieve and do not burden us with the assumptions of
some Wikipedias. When you are serious, get involved.
Thanks,
      GerardM

On 13 December 2015 at 19:10, Andrea Zanni <zanni.andre...@gmail.com> wrote:

> I really feel we are drowning in a glass of water.
> The issue of "data quality" or "reliability" that Andreas raises is well
> known:
> what I don't understand is whether its "scale" is much bigger on Wikidata
> than on Wikipedia,
> and whether this different scale makes it much more important. The scale of the
> issue is maybe something worth discussing, and not the issue itself? Is the
> fact that Wikidata is centralised different from statements on Wikipedia? I
> don't know, but to me this is a more neutral and interesting question.
>
> I often say that the Wikimedia world made quality a "Heisenbergian"
> feature: you always have to check if it's there.
> The point is: it has always been like this.
> We always had to check for quality, even when we used Britannica or
> authority controls or whatever "reliable" sources we wanted. Wikipedia, and
> now Wikidata, is made for everyone to contribute, it's open and honest in
> being open, vulnerable, prone to errors. But we are transparent, we say
> that in advance, we can claim any statement to the smallest detail. Of
> course it's difficult, but we can do it. Wikidata, as Lydia said, can
> actually have conflicting statements in every item: we "just" have to put
> them there, as we did to Wikipedia.
>
> If Google uses our data and they are wrong, that's bad for them. If they
> correct the errors and do not give us the corrections, that's bad for us
> and unethical of them. The point is: there is no license (as far as I
> know) that can force them to contribute to Wikidata. That is, IMHO, the
> problem with "over-the-top" actors: they can harness collective intelligence
> and "not give back." Even with CC-BY-SA, they could store (as they are
> probably already doing) all the data in their knowledge vault, which is
> secret as it is an incredible asset for them.
>
> I'd be happy to insert a new clause of "forced transparency" in CC-BY-SA or
> CC0, but it's not there.
>
> So, as we are working via GLAMs with Wikipedia for getting reliable
> sources and content, we are working with them also for good statements and
> data. Putting good data in Wikidata makes it better, and I don't understand
> what is the problem here (I understand, again, the issue of putting too
> much data and still having a small community).
> For example: if we are importing different reliable databases, and the
> institutions behind them find it useful and helpful to have an aggregator
> of identifiers and authority controls, what is the issue? There is value in
> aggregating data, because you can spot errors and inconsistencies. It's not
> easy, of course, to find a good workflow, but, again, that is *another*
> problem.
>
> So, in conclusion: I find many issues in Wikidata, but not on the
> mission/vision, just in the complexity of the project, the size of the
> dataset, the size of the community.
>
> Can we talk about those?
>
> Aubrey
>
>
>
> On Sun, Dec 13, 2015 at 6:40 PM, Andreas Kolbe <jayen...@gmail.com> wrote:
>
> > On Sun, Dec 13, 2015 at 5:32 PM, geni <geni...@gmail.com> wrote:
> >
> > > On 13 December 2015 at 15:57, Andreas Kolbe <jayen...@gmail.com> wrote:
> > >
> > > > Jane,
> > > >
> > > > The issue is that you can't cite one Wikipedia article as a source in
> > > > another.
> > > >
> > >
> > >
> > > However you can within the same article per [[WP:LEAD]].
> > >
> >
> >
> > Well, of course, if there are reliable sources cited in the body of the
> > article that back up the statements made in the lead. You still need to
> > cite a reliable source though; that's Wikipedia 101.
> > _______________________________________________
> > Wikimedia-l mailing list, guidelines at:
> > https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
> > Wikimedia-l@lists.wikimedia.org
> > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> > <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>
> >