Hello Fae,

Are you generating json somewhere or thiking to create an API for that
information?

I have some ideas that I started to develop last year, I will ping you in
private.

Best regards and thanks by helping us with the important information.

Rodrigo Padula

2015-02-26 14:01 GMT-03:00 Fæ <fae...@gmail.com>:

> Now the run has been running a bit longer, it can be seen[1] that the
> real sink is the logging table from wikidata. Other projects are
> pretty quick from a few minutes to Commons taking half an hour.
> Wikidata takes 7 hours for each query. It's a big log!
>
> I am testing out separating wikidata from the other queries, so this
> might get the majority done much more quickly. Either way, all the
> reports should be finished within about 3 days. After that the monthly
> updates should be done within an afternoon. :-)
>
> Links
> 1. https://meta.wikimedia.org/wiki/Special:Contributions/Faebot
>
> Fae
>
> On 26 February 2015 at 06:02, MZMcBride <z...@mzmcbride.com> wrote:
> > Fæ wrote:
> >>Re: https://meta.wikimedia.org/wiki/User:Faebot/thanks
> >>
> >>I am glad the tables are useful, hopefully stimulating more positive
> >>use of the thanks notifier by contributors.
> >>
> >>The reports are updating *slowly*, currently at April 2014... This is
> >>in part because of the WMFlabs outage yesterday, though in general any
> >>report of the "logging" table is going to be slow (it is the largest
> >>table on the wiki database). The first run-through will take several
> >>days as it is going back through all of 2014. Once it is only
> >>reporting on the previous month, I suspect it will finish monthly
> >>updates within the first first day.
> >
> > Hi.
> >
> > Hmm, I'm not sure how you're measuring largest, but I imagine on most
> > wikis there are more rows in the revision table than there are in the
> > logging table. For example, on the English Wikipedia, there are
> > approximately 598,859,006 rows in the revision table and 62,731,285 rows
> > in the logging table. I suspect on most wikis, revision, text, and maybe
> > archive would typically be larger than logging, except in weird cases
> > such as loginwiki[_p]. And then of course there are the *links tables.
> But
> > it depends on whether you're comparing size on disk or number of rows.
>
> Yes I got it wrong! :-)
>
> > You probably want to use logging_userindex instead of logging. The former
> > is typically significantly faster due to the way we use table views.
>
> That's how it works. The key line is
>     SELECT log_user_text, count(log_action)
>     FROM {}wiki_p.logging_userindex
>
> Where {} iterates over wiki short names.
>
> > I have a bit of experience with database reports. Off-hand I'd say it
> > should be possible to query all of this information in under an hour.
> With
> > the index on logging[_userindex].log_action, even a large table such as
> > logging shouldn't be too awful to query for this information. If you have
> > queries that are taking a very long time, we should probably investigate.
>
> Maybe, I have experimented a little with ways of writing the query
> (such as using log_user rather than log_user_text) but have not found
> a magic trick yet. Remember I'm reporting all of 2014 first. After
> this first run, the time taken is unlikely to be noticed by anyone, so
> volunteer time vs. processing time will mean I'll probably get on with
> other projects, rather than spending my wiki-time making this SQL
> based report more efficient. Naturally, if someone wants to offer me
> some funding, or bribes me by nominating my Commons RFA, I'll look
> again. ;-)
>
> Cheers,
> Fae
>
> _______________________________________________
> Wikimedia-l mailing list, guidelines at:
> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
> Wikimedia-l@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>
_______________________________________________
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 
<mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>

Reply via email to