Erik,

Should interactive web, internet-of-things, or offline services that
rely on the Foundation's CC-BY-SA encyclopedia content be required to
attribute authorship by specifying the revision date from which the
transcluded content is derived?

On Thu, Oct 12, 2017 at 7:01 AM, Erik Moeller <eloque...@gmail.com> wrote:
> On Tue, Oct 10, 2017 at 7:31 AM, Andreas Kolbe <jayen...@gmail.com> wrote:
>
>> Wikidata has its own problems in that regard that have triggered ongoing
>> discussions and concerns on the English Wikipedia.[1]
>
> Tensions between different communities with overlapping but
> non-identical objectives are unavoidable. Repository projects like
> Wikidata and Wikimedia Commons provide huge payoff: they dramatically
> reduce duplication of effort, enable small language communities to
> benefit from the work done internationally, and can tackle a more
> expansive scope than the immediate needs of existing projects. A few
> examples include:
>
> - Wiki Loves Monuments, recognized as the world's largest photo competition
> - Partnerships with countless galleries, libraries, archives, and museums
> - Wikidata initiatives like mySociety's "Everypolitician" project or Gene Wiki
>
> This is not without its costs, however. Differing policies, levels of
> maturity, and social expectations will always fuel some level of
> conflict, and the repository approach creates huge usability
> challenges. The latter is also true for internal wiki features like
> templates, which shift information out of the article space,
> disempowering users who no longer understand how the whole is
> constructed from its parts.
>
> I would call these usability and "legibility" issues the single
> biggest challenge in the development of Wikidata, Structured Data for
> Commons, and other repository functionality. Much related work has
> already been done or is ticketed in Phabricator, such as the effective
> propagation of changes into watchlists, article histories, and
> notifications. Much more will need to follow.
>
> With regard to the issue of citations, it's worth noting that it's
> already possible to _conditionally_ load data from Wikidata, excluding
> information that is unsourced or only sourced circularly (i.e. to
> Wikipedia itself). [1] Template invocations can also override values
> provided by Wikidata, for example, if there is a source, but it is not
> considered reliable by the standards of a specific project.
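> To illustrate the filtering logic in a Python sketch (the actual
> implementation is a Lua module, and the claim structure here is
> heavily simplified, though P143, "imported from Wikimedia project",
> really is the property that marks circular sourcing):

```python
# Sketch only: include a Wikidata value when it has at least one
# reference that is not circular, i.e. not merely imported from a
# Wikimedia project (property P143). Claims are modeled as dicts with
# a "value" and a list of references, each reference a set of the
# property IDs it uses.
CIRCULAR_PROPERTIES = {"P143"}  # "imported from Wikimedia project"

def is_usable(claim, onlysourced=True):
    """True if the claim may be displayed under the 'onlysourced' rule."""
    if not onlysourced:
        return True
    for ref in claim.get("references", []):
        # A reference counts as genuine if it uses any property
        # beyond the circular ones.
        if set(ref) - CIRCULAR_PROPERTIES:
            return True
    return False

def get_value(claim, local_override=None, onlysourced=True):
    """A locally supplied value always wins over the Wikidata claim."""
    if local_override is not None:
        return local_override
    return claim["value"] if is_usable(claim, onlysourced) else None
```

> The local override mirrors how a template invocation can supersede
> the value fetched from Wikidata.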
>
>> If a digital voice assistant propagates a Wikimedia mistake without telling
>> users where it got its information from, then there is not even a feedback
>> form. Editability is of no help at all if people can't find the source.
>
> I'm in favor of always indicating at least provenance (something like
> "Here's a quote from Wikipedia:"), even for short excerpts, and I
> certainly think WMF and chapters can advocate for this practice.
> However, where short excerpts are concerned, it's not at all clear
> that there is a _legal_ issue here, or that full compliance with all
> requirements of the license is a reasonable "ask".
>
> Bing's search result page manages a decent compromise, I think: it
> shows excerpts from Wikipedia clearly labeled as such, and it links to
> the CC-BY-SA license if you expand the excerpt, e.g.:
> https://www.bing.com/search?q=france
>
> I know that over the years, many efforts have been undertaken to
> document best practices for re-use, ranging from local
> community-created pages to chapter guides and tools like the
> "Lizenzhinweisgenerator". I don't know which of these is the best
> available nowadays, but if no adequate one exists, it might be a good
> idea to develop a new, comprehensive guide that takes into account
> voice applications, tabular data, and so on.
>
> Such a guide would ideally not just be written from a license
> compliance perspective, but also include recommendations, e.g., on how
> to best indicate provenance, distinguishing "here's what you must do"
> from "here's what we recommend".
>
>>> Wikidata will often provide a shallow first level of information about
>>> a subject, while other linked sources provide deeper information. The
>>> more structured the information, the easier it becomes to validate in
>>> an automatic fashion that, for example, the subset of country
>>> population time series data represented in Wikidata is an accurate
>>> representation of the source material. Even when a large source
>>> dataset is mirrored by Wikimedia (for low-latency visualization, say),
>>> you can hash it, digitally sign it, and restrict modifiability of
>>> copies.
>
>> Interesting, though I'm not aware of that being done at present.
>
> At present, Wikidata allows users to model constraints on internal
> data validity. These constraints are used for regularly generated
> database reports as well as on-demand lookup via
> https://www.wikidata.org/wiki/Special:ConstraintReport . This kicks
> in, for example, if you enter an implausible number in a population
> field, or mark a country as female.
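> To make the idea concrete, here is a toy Python version of two such
> checks, a range constraint and an "applies to this kind of item at
> all" constraint; the property names and thresholds are invented for
> illustration:

```python
# Illustrative constraint table: a range rule for numeric properties
# and a scope rule restricting which item classes a property fits.
CONSTRAINTS = {
    "population":    {"kind": "range", "min": 0, "max": 8_000_000_000},
    "sex or gender": {"kind": "scope", "applies_to": {"human"}},
}

def check(item_class, statements):
    """Return human-readable constraint violations for one item."""
    violations = []
    for prop, value in statements.items():
        rule = CONSTRAINTS.get(prop)
        if rule is None:
            continue  # no constraint modeled for this property
        if rule["kind"] == "range" and not rule["min"] <= value <= rule["max"]:
            violations.append(f"{prop}: {value} is out of range")
        elif rule["kind"] == "scope" and item_class not in rule["applies_to"]:
            violations.append(f"{prop}: does not apply to a {item_class}")
    return violations
```

> So check("country", {"population": -5, "sex or gender": "female"})
> flags both statements, much as the constraint report would.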
>
> There is a project underway to also validate against external sources; see:
>
>   https://www.mediawiki.org/wiki/Wikibase_Quality_Extensions#Special_Page_Cross-Check_with_external_databases
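> The "hash it, digitally sign it" idea mentioned earlier can be
> sketched minimally like this (a real deployment would additionally
> sign the digest with a maintainer key, which is omitted here; the
> sample data is made up):

```python
import hashlib

def dataset_fingerprint(raw_bytes):
    """SHA-256 digest of a dataset snapshot, used as an integrity check."""
    return hashlib.sha256(raw_bytes).hexdigest()

# Record the digest when the upstream dataset is mirrored...
upstream = b"year,population\n2015,81686611\n2016,82348669\n"
recorded = dataset_fingerprint(upstream)

def mirror_is_unmodified(mirror_bytes, recorded_digest):
    """...and later verify that the local copy still matches it."""
    return dataset_fingerprint(mirror_bytes) == recorded_digest
```

> Any edit to the mirrored copy changes the digest, so unauthorized
> modification is detectable at a glance.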
>
> Wikidata still tends to deal with relatively small amounts of data; a
> highly annotated item like Germany (Q183), for example, comes in at
> under 1MB in uncompressed JSON form. Time series data like GDP is
> often included only for a single point in time, or for a subset of the
> available data. The relatively new "Data:" namespace on Commons exists
> to store raw datasets; this is only used to a very limited extent so
> far, but there are some examples of how such data can be visualized,
> e.g.:
>
>   https://en.wikipedia.org/wiki/Template:Graph:Population_history
>
> Giving volunteers more powerful tools to select and visualize data
> while automating much of the effort of maintaining data integrity
> seems like an achievable and strategic goal, and as these examples
> show, some building blocks for this are already in place.
>
>>> But the proprietary knowledge graphs are valuable to users in ways
>>> that the previous generation of search engines was not. Interacting
>>> with a device like you would with a human being ("Alexa/Google/Siri,
>>> is yarrow edible?") makes knowledge more accessible and usable,
>>> including to people who have difficulty reading long texts, or who are
>>> not literate at all. In this sense I don't think WMF should ever find
>>> itself in the position to argue _against_ inclusion of information
>>> from Wikimedia projects in these applications.
>
>> There is a distinct likelihood that they will make reading Wikipedia
>> articles progressively obsolete, just like the availability of Googling has
>> dissuaded many people from sitting down and reading a book.
>
> There is an important distinction between "lookup" and "learning"; the
> former is a transactional activity ("Is this country part of the Euro
> zone?") and the latter an immersive one ("How did the EU come
> about?"). Where we now get instant answers from home assistants or
> search engines, we may have previously skimmed, or performed our own
> highly optimized search in the local knowledge repository called a
> "bookshelf".
>
> In other words, even if some instant answers lead to a drop in
> Wikipedia views, it would be unreasonable to assume that those views
> were "reads" rather than "skims". When you're on a purely
> transactional journey, you appreciate almost anything that shortens
> it.
>
> I don't think Wikimedia should fight the gravity of a user's
> intentions out of its own pedagogical motives. Rather, it should make
> both lookup and learning as appealing as possible. Doing well in the
> "lookup" category is important to avoid handing too much control off
> to gatekeepers, and being good in the "learning" category holds the
> greatest promise for lasting positive impact.
>
> As for the larger social issue, at least in the US, the youngest (most
> googley) generation is the one that reads the most books, and
> income/education are very strong predictors of whether people do or
> not:
> http://www.pewresearch.org/fact-tank/2015/10/19/slightly-fewer-americans-are-reading-print-books-new-survey-finds/
>
>>> The applications themselves are not the problem; the centralized
>>> gatekeeper control is. Knowledge as an open service (and network) is
>>> actually the solution to that root problem. It's how we weaken and
>>> perhaps even break the control of the gatekeepers. Your critique seems
>>> to boil down to "Let's ask Google for more crumbs". In spite of all
>>> your anti-corporate social justice rhetoric, that seems to be the path
>>> to developing a one-sided dependency relationship.
>
>> I considered that, but in the end felt that given the extent to which
>> Google profited from volunteers' work, it wasn't an unfair ask.
>
> While I think your proposal to ask Google to share access to resources
> it already has digitized or licensed is worth considering, I would
> suggest being very careful about the long term implications of any
> such agreements. Having a single corporation control volunteers'
> access to proprietary resources means that such access can also be
> used as leverage down the road, or abruptly be taken away for other
> reasons.
>
> I think it would be more interesting to spin off the existing
> "Wikipedia Library" into its own international organization (or house
> it within an existing one), tasked with giving free knowledge
> contributors (potentially including those on other free knowledge
> projects like OSM) access to proprietary resources, and pursuing
> public and private funding of its own. The development of many
> relationships may take longer, but it is more sustainable in the long
> run. Moreover, it has the potential to lead to powerful collaborations
> with existing public/nonprofit digitization and preservation efforts.
>
>> Publicise the fact that Google and others profit from volunteer work, and
>> give very little back. The world could do with more articles like this:
>>
>> https://www.washingtonpost.com/news/the-intersect/wp/2015/07/22/you-dont-know-it-but-youre-working-for-facebook-for-free/
>
> I have plenty of criticisms of Facebook, but the fact that users don't
> get paid for posting selfies isn't one of them. My thoughts on how the
> free culture movement (not limited to Wikipedia) should interface with
> the for-profit sector are as follows, FWIW:
>
> 1) Demand appropriate levels of taxation on private profits, [2]
> sufficient investments in public education and cultural institutions,
> and "open licensing" requirements on government contracts with private
> corporations.
>
> 2) Require compliance with free licenses, first gently, then more
> firmly. This is a game of diminishing returns, and it's most useful to
> go after the most blatant and problematic cases. As noted above, "fair
> use" limits should be understood and taken into consideration.
>
> 3) Encourage corporations to be "good citizens" of the free culture
> world, whether it's through indicating provenance beyond what's
> legally required, or by contributing directly (open source
> development, knowledge/data donations, in-kind goods/services,
> financial contributions). The payoff for them is goodwill and a
> thriving (i.e. also profitable) open Internet that more people in more
> places use for more things.
>
> 4) Build community-driven, open, nonprofit alternatives to
> out-of-control corporate quasi-monopolies. As far as proprietary
> knowledge graphs are concerned, I will reiterate: open data is the
> solution, not the problem.
>
> Cheers,
> Erik
>
> [1] See the getValue function in
> https://en.wikipedia.org/wiki/Module:WikidataIB , specifically its
> "onlysourced" parameter. The module also adds a convenient "Edit this
> on Wikidata" link to each claim included from there.
>
> [2] As far as Wikimedia organizations are concerned, specific tax
> policy will likely always be out of scope of political advocacy, but
> the other points need not be.
>
> _______________________________________________
> Wikimedia-l mailing list, guidelines at: 
> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
> https://meta.wikimedia.org/wiki/Wikimedia-l
> New messages to: Wikimedia-l@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 
> <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>
