https://bugzilla.wikimedia.org/show_bug.cgi?id=21919

--- Comment #47 from Bawolff <[email protected]> 2011-03-17 04:51:39 UTC ---
I poked at this last weekend, and a little bit this week.

I think most of the issues brought up in previous reviews have been addressed,
and that this might be ready for another review.

Some remaining issues that I'm not sure and would really like some input on:
Major ones:
*Caching. Currently it caches the feed in memcache for up to a maximum of 12
hours (It also does checks to see if the feed is still valid, so the moment
someone adds something to the relevant categories the cache is no longer
considered valid, so everything is nice)
**Is 12 hours an appropriate max time limit (Given that cache entries are
checked that they're still good before being used).
**Should we be sending maxage/s-maxage headers to cache on the client side
and/or in squids. (Currently its sent with cache-control private i believe)
***If we did do caching on that level, the longest (s)maxage that would be
acceptable would be very short (like half an hour) since we want these things
to be up to date
**Should cache times differ between Sitemap and RSS/ATOM formats.

*It does a check to see if cl_timestamp is null, and returns the current
timestamp if so. This seems very wrong. I'm not sure what circumstances a null
cl_timestamp can happen (There was one but it looked accidental and is now
removed). I'm not sure if checking for a null cl_timestamp and throwing an
exception if that happens is the right way to go. (Is there any circumstance
cl_timestamp can be null?)

*The flagged revision support is flaky.
This may be a valid concern. Its outside of my knowledge of flagged revs what
the best approach is. It appears that DPL also detects dpl in the same way.

Minor ones:
*The default parameters could be tweaked. I'm not sure what the ideal defaults
would be. A lot depends on which of the use-cases we want to optimize for.
*[bikeshed issue] This extension does rss feeds based on category membership.
That feature may be of general interest to some people totally independent from
the Sitemap aspect. Perhaps a more general name should be chosen like
IntersectionFeed. But thats really not important.




Response to some of the review comments:
*"It's doing DIY category intersection"

That is sort of the feature request. (The intersection part. The DIY part I'm
not sure what was meant)

*"Also, at least one of the two sort modes is likely to be very DB-inefficient.
Can't tell for sure cause I can't read the query too well, the code is not very
pretty and the query is complex and very dynamic"

The lastedit (which is really order by last_touched, which doesn't correspond
to edit) sort order is rather scary (I'm not familiar with db scaling and stuff
ery much, but I think it does a full table scan of the page table, and a file
sort which is very very bad from my understanding). Anyways, lastedit is not
very important to the use-case and could be removed.

the Categoryadd sort order is not as bad (Still not great). It might filesort
if the smallest category specified is not the first category specified. It is
no worst than the already deployed intersection (DynamicPageList) extension
(Which has many scary queries ;) Actually its the same query as intersection.

*"General hating on direct output/echo/$wgOut->disable()".

This is done the same way as its done in core. In ideal world, this extension
might be better done as an api module, but I don't think thats a critical
issue. Most special pages that output an xml document call $wgOut->disable().
All the feed stuff in core calls $wgOut->disable() and outputs with echo.

*"is there a way to integrate multiple site maps?"
We are not anywhere near outputting more than 1000 articles over a 2 day
period. So I don't feel this is a near or long term concern.


Thanks,
-bawolff

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to