On 17/01/2016 00:49, Risker wrote:
Hmm.  The majority of those crawlers are from search engines - the very
search engines that keep us in the top 10 of their results (and often in
the top 3), thus leading to the usage and donations that we need to
survive. If they have to pay, then they might prefer to change their
algorithm, or reduce the frequency of scraping (thus also failing to catch
updates to articles including removal of vandalism in the lead paragraphs,
which is historically one of the key reasons for frequently crawling the
same articles).  Those crawlers are what attracts people to our sites, to
read, to make donations, to possibly edit.  Of course there are lesser
crawlers, but they're not really big players.
As usual, you nailed it! That's why I wrote "negotiation": any extra cost should be fairly modulated, and it shouldn't drive the top players away from our services.

I'm at a loss to understand why the Wikimedia Foundation should take on the
costs and indemnities associated with hiring staff to create a for-pay API
that would have to meet the expectations of a customer (or more than one
customer) that hasn't even agreed to pay for access.  If they want a
specialized API (and we've been given no evidence that they do), let THEM
hire the staff, pay them, write the code in an appropriately open-source
way, and donate it to the WMF with the understanding that it could be
modified as required, and that it will be accessible to everyone.
+1 is not enough; let's make it +1e12.

It is good that the WMF has studied the usage patterns.  Could a link be
given to the report, please?  It's public, correct?  This is exactly the
point of transparency.  If only the WMF has the information, then it gives
an excuse for the community's comments to be ignored "because they don't
know the facts".  So let's lay out all the facts on the table, please.

Judging from Lila's ongoing choices, I'm pretty sure they will.

On 17/01/2016 03:11, Denny Vrandecic wrote:
To give a bit more thoughts: I am not terribly worried about current
crawlers. But currently, and more in the future, I expect us to provide
more complex and thus expensive APIs: a SPARQL endpoint, parsing APIs, etc.
These will be simply expensive to operate. Not for infrequent users - say,
to the benefit of us 70,000 editors - but for use cases that involve tens
or millions of requests per day. These have the potential of burning a lot
of funds to basically support the operations of commercial companies whose
mission might or might not be aligned with ours.
Then a good synthesis would be: "let's have Google(*) fund scholarships/Summers of Code/whatever to build new functionality, then have Google reimburse(**) our facilities' usage and/or increase our user base(***)".

(*) by "Google" I mean any big player
(**) by "reimburse" I mean give us a fairly and proportionally determined amount of money based upon *actual* exploitation of our hardware/networking resources. This "reimburse" could also be colo space or whatever we'd need.
(***) as several people already pointed out we're in a symbiotic relationship with Google (and others): they need our knowledge, we need their traffic. As long as our sectors are distinct all is right with the symbiosis.

IMHO there's room to increase our advantages without breaking the symbiosis and, above all, without losing sight of our mission.


Wikimedia-l mailing list