Hi again,

The solutions discussed here seem quite a bit more general than what I was thinking about. Of course it would be nice to have a uniform, cross-client way to indicate tools in any MW web service or API, but this is a somewhat bigger (and probably more long-term) goal than what I had in mind. It is a good idea to suggest a standard approach to tool developers and to have a documentation page on it, but it would take some time until enough tools adopt it for it to be useful.

For our present task, we just need some more signals we can use. Analysing SPARQL queries requires us to parse them anyway, so comments are fine. In general, the data we are looking at has a lot of noise, so we cannot rely on a single field. We will combine user agents, X-Analytics, query comments, and also query shapes (if you get 1M+ similar-looking queries in one hour, you know it's a bot). With the current data, the query shape is often our main clue, so comments would already be a big step forward.
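
As a rough illustration of how a query comment could serve as one of these signals, here is a minimal Python sketch. The "#TOOL: <name>" comment form and the function name tool_from_comment are assumptions made up for this example, not a convention agreed in this thread. The sketch only looks at lines that begin with "#", so a "#" inside an IRI on a triple line is ignored; it does not handle multi-line string literals, which a real SPARQL parser would.

```python
import re
from typing import Optional

# Hypothetical convention (an assumption for this sketch): the tool names
# itself in a comment line of the form "#TOOL: <name>".
TOOL_COMMENT = re.compile(
    r"^\s*#\s*TOOL\s*:\s*(.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def tool_from_comment(query: str) -> Optional[str]:
    """Return the tool name from the first '#TOOL:' comment line, if any."""
    match = TOOL_COMMENT.search(query)
    return match.group(1) if match else None


if __name__ == "__main__":
    example = (
        "#TOOL: my-tool 1.0\n"
        "SELECT ?x WHERE { ?x <http://www.w3.org/2000/01/rdf-schema#label> ?l }"
    )
    print(tool_from_comment(example))
```

A comment found this way would still be only one noisy signal, to be combined with the user agent, X-Analytics, and the query shape.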

Best,

Markus


On 04.10.2016 07:05, Yuri Astrakhan wrote:
For consistency between all possible clients, we seem to have only two
options: either part of the query, or the X-Analytics header. The
user-agent header is not really an option because it is not available
for all types of clients, and we want to have just one way for everyone.
Headers other than X-Analytics will need custom handling, whereas we
already have plenty of Varnish code to deal with the X-Analytics header,
split it into parts, and for Hive to parse it. Yes, it will be an extra
line of code in JS ($.ajax instead of $.get), but I am sure this is not
such a big deal if we provide cookie-cutter code. Parsing the query string
in Varnish/Hive is also some complex extra work, so let's keep
X-Analytics. Proposed required values (semicolon separated):
* tool=<name of the tool>
* toolver=<version of the tool>
* contact=<some way of contacting you, e.g. @twitter, em...@example.com, +1.212.555.1234, ...>

Bikeshedding?  See also:  https://wikitech.wikimedia.org/wiki/X-Analytics
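
For tool authors calling from a script rather than a browser, the proposed header could be assembled roughly as in the Python sketch below. The function names x_analytics_value and build_request are hypothetical, the endpoint URL in the usage part is an assumption, and rejecting ";" inside a value is a choice made for this sketch (so the semicolon-separated format stays parseable), not part of the proposal. The request is only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def x_analytics_value(tool: str, toolver: str, contact: str) -> str:
    """Join the three proposed fields as semicolon-separated key=value pairs."""
    fields = {"tool": tool, "toolver": toolver, "contact": contact}
    for key, value in fields.items():
        # Assumption of this sketch: a ';' inside a value would break the
        # semicolon-separated format, so refuse it instead of escaping it.
        if ";" in value:
            raise ValueError(f"{key} must not contain ';': {value!r}")
    return ";".join(f"{key}={value}" for key, value in fields.items())


def build_request(endpoint: str, query: str, header_value: str) -> Request:
    """Build (but do not send) a GET request carrying the X-Analytics header."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"X-Analytics": header_value})


if __name__ == "__main__":
    value = x_analytics_value("my-tool", "1.0", "@example_handle")
    # The endpoint below is an assumption for illustration only.
    request = build_request(
        "https://query.wikidata.org/sparql",
        "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
        value,
    )
    print(value)
    print(request.full_url)
```

The same three fields would go into the headers option of $.ajax on the browser side; only the way the header is attached differs.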

On Tue, Oct 4, 2016 at 12:45 AM Stas Malyshev <smalys...@wikimedia.org> wrote:

    Hi!

    > Using custom HTTP headers would, of course, complicate calls for the
    > tool authors (i.e., myself). $.ajax instead of $.get and all that. I
    > would be less inclined to change to that.

    Yes, if you're using a browser, you probably can't change the user agent.
    In that case I guess we need either X-Analytics or to put it in the query.
    Or maybe the Referer header would be fine then - it is also recorded. If
    the Referer is distinct enough, it can be used.

    --
    Stas Malyshev
    smalys...@wikimedia.org

    _______________________________________________
    Analytics mailing list
    analyt...@lists.wikimedia.org
    https://lists.wikimedia.org/mailman/listinfo/analytics



_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


