On Thu, Mar 21, 2013 at 10:55 PM, Yuri Astrakhan
<[email protected]> wrote:

> The API is fairly complex to measure and set performance targets for. If a
> bot requests 5000 pages in one call, together with all their links &
> categories, it might take a very long time (seconds if not tens of
> seconds). Comparing that to another API request that gets an HTML section
> of a page, which takes a fraction of a second (especially when coming from
> cache), is not very useful.
>

This is true, and I think we'd want to look at a metric like 99th
percentile latency.  There's room for corner cases taking much longer, but
they really have to be corner cases.  Standards also have to be flexible,
with different acceptable ranges for different uses.  Yet if 30% of
requests for an API method that fetches pages took tens of seconds, we'd
likely have to disable it entirely until its use or the number of pages per
request could be limited.
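As a rough illustration (sample values are made up, and this uses the
simple nearest-rank method rather than any particular monitoring tool's
definition), a percentile over a window of request latencies can be
computed like this:

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile of a list of request latencies (ms):
    the smallest sample such that at least p percent of samples
    are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("no samples")
    ordered = sorted(latencies_ms)
    rank = math.ceil(p / 100.0 * len(ordered))
    return ordered[rank - 1]

# Hypothetical window of request latencies in milliseconds.
samples = [120, 95, 110, 4500, 130, 105, 98, 115, 125, 101]
print(percentile(samples, 90))  # -> 130
print(percentile(samples, 99))  # -> 4500 (the long tail dominates p99)
```

Note how a single pathological request drags p99 up to 4500ms while p90
stays at 130ms, which is why long-tail percentiles catch problems that
averages hide.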

On Fri, Mar 22, 2013 at 1:32 AM, Peter Gehres <[email protected]> wrote:
>
> > From where would you propose measuring these data points?  Obviously
> > network latency will have a great impact on some of the metrics, and a
> > consistent location would help to define the pass/fail of each test. I do
> > think another useful benchmark of Ops "features" would be a set of
> > latency-to-datacenter values, but I know that is a much harder task.
> > Thanks for putting this together.
> >
> >
> > On Thu, Mar 21, 2013 at 6:40 PM, Asher Feldman
> > <[email protected]> wrote:
> >
> > > I'd like to push for a codified set of minimum performance standards
> > > that new MediaWiki features must meet before they can be deployed to
> > > larger Wikimedia sites such as English Wikipedia, or be considered
> > > complete.
> > >
> > > These would look like (numbers pulled out of a hat, not actual
> > > suggestions):
> > >
> > > - p999 (long tail) full page request latency of 2000ms
> > > - p99 page request latency of 800ms
> > > - p90 page request latency of 150ms
> > > - p99 banner request latency of 80ms
> > > - p90 banner request latency of 40ms
> > > - p99 db query latency of 250ms
> > > - p90 db query latency of 50ms
> > > - 1000 write requests/sec (if applicable; write operations must be
> > > free from concurrency issues)
> > > - guidelines about degrading gracefully
> > > - specific limits on total resource consumption across the stack per
> > > request
> > > - etc..
> > >
> > > Right now, varying amounts of effort are made to highlight potential
> > > performance bottlenecks in code review, and engineers are encouraged
> > > to profile and optimize their own code.  But beyond "is the site still
> > > up for everyone / are users complaining on the village pump / am I
> > > ranting in irc", we've offered no guidelines as to what sort of
> > > request latency is reasonable or acceptable.  If a new feature (like
> > > aftv5, or flow) turns out not to meet perf standards after deployment,
> > > that would be a high-priority bug, and the feature might be disabled,
> > > depending on the impact, or if it is not addressed in a reasonable
> > > time frame.  Obviously standards like this can't be applied to certain
> > > existing parts of MediaWiki, but systems other than the parser or
> > > preprocessor that don't meet the new standards should at least be
> > > prioritized for improvement.
> > >
> > > Thoughts?
> > >
> > > Asher
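To make the shape of such a standard concrete: per-metric thresholds like
those Asher lists could be encoded and checked mechanically. The sketch
below is purely illustrative; the metric names, the `violations` helper,
and the threshold values (taken from the hat-pulled numbers above) are all
hypothetical, not an actual proposal.

```python
# Illustrative encoding of per-metric latency standards:
# (metric name, percentile) -> maximum acceptable latency in ms.
# All names and numbers here are hypothetical.
STANDARDS_MS = {
    ("page_request", 99.9): 2000,
    ("page_request", 99): 800,
    ("page_request", 90): 150,
    ("banner_request", 99): 80,
    ("banner_request", 90): 40,
    ("db_query", 99): 250,
    ("db_query", 90): 50,
}

def violations(measured):
    """Return (metric, percentile, measured, limit) for every breach.

    `measured` maps (metric, percentile) -> observed latency in ms;
    metrics with no defined standard are ignored.
    """
    return [
        (metric, pct, measured[(metric, pct)], limit)
        for (metric, pct), limit in STANDARDS_MS.items()
        if (metric, pct) in measured and measured[(metric, pct)] > limit
    ]

# A feature whose measured p99 page latency is 950ms fails the 800ms bar:
print(violations({("page_request", 99): 950, ("db_query", 90): 45}))
# -> [('page_request', 99, 950, 800)]
```

A check like this could run against production metrics after deployment,
turning "not meeting perf standards" from a judgment call into a concrete,
automatable pass/fail signal.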
> > > _______________________________________________
> > > Wikitech-l mailing list
> > > [email protected]
> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l