https://bugzilla.wikimedia.org/show_bug.cgi?id=34778
Rob Sanderson <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WONTFIX                     |---

--- Comment #34 from Rob Sanderson <[email protected]> ---
Hi Tim,

The Memento team has carefully analyzed your feedback. We hope our response below can convince you to change your opinion regarding Memento support in Wikipedia, and we would very much appreciate further communication regarding the matter.

Many thanks!

Rob

=> Problem 1:
Asking users to install a Firefox extension to make navigation easier is not how I imagine a secure and user-friendly web would work. Perhaps if this were supported by unmodified browsers, it would be more attractive for us. The browsers have a long history of introducing features in advance of their use on the web, so I don't think it's a "chicken-and-egg" problem.

Response:
It's difficult to argue with this point. We would obviously much prefer native adoption by browsers over a plug-in solution. But, without a plug-in, there would be no way to demonstrate the cross-site time travel capability introduced by Memento. Also, it is hard to see what incentive browser manufacturers have to natively implement Memento's datetime negotiation as long as there is no critical mass of servers supporting it. Our failed attempts to get the attention of Mozilla and Opera support this view, but if your experience is different, any assistance you could give would be greatly appreciated.

At this point, Memento enjoys growing adoption by web archives (Internet Archive, British Library Web Archive, UK National Archives) and it has the unanimous support of the International Internet Preservation Consortium. Adoption by Wikipedia could help build the essential critical mass that, we think, could give us the momentum to credibly approach browser manufacturers.
Given Wikipedia's track record as an early adopter of innovative technologies (as emphasized by editors in the RFC discussion re Memento support), we were hopeful to have your support in working towards establishing that critical mass.

======

=> Problem 2:
TimeGate responses, as specified by the Internet-Draft, appear to be effectively uncacheable with currently used HTTP proxy software. We have no way to remove resources from a cache with a finer granularity than a URI. So when the page is changed, we would have the choice of either:

* Purging the TimeGate URI when the page is changed, in which case all versions of that resource would be simultaneously purged, reducing the hit rate for rarely-accessed old revisions, or
* Not purging it, in which case responses for recent Accept-Datetime values would become stale. Also, there would be no way to purge revisions which are removed from the database by RevisionDelete.

Response:
We very much share the concern about cacheability, as exemplified by the Memento protocol responses for Original Resources and Mementos. However, when it comes to TimeGates, the situation regarding caching deserves some further consideration:

* RFC 2616 states, as quoted below, that 302 responses are by default not cached:
  "A response received with any other status code (e.g. status codes 302 and 307) MUST NOT be returned in a reply to a subsequent request unless there are cache-control directives or another header(s) that explicitly allow it."
* Caching 302 responses from a TimeGate will yield marginal benefit, if any:
  - Datetime negotiation values exist on a continuum, unlike e.g. media type negotiation, for which values reside in a discrete set. In the latter case, chances that a cache has an entry for a specific value out of the (small) discrete set are significant. In the TimeGate case, chances are dramatically lower, if not negligible, given the size of the value space.
    For example, when only taking into account day granularity, the value space for Wikipedia has a cardinality of over 3650 (365 days * 10+ years). Adding hours, minutes, and seconds to the value space brings this cardinality to over 365*10*24*60*60 (roughly 315 million). Chances of a cache hit become very small.
  - The overhead on the server resulting from not caching TimeGate responses remains limited, as responses only contain headers without a representation in the body. Please see for example http://www.mementoweb.org/guide/rfc/ID/#a200-step4-http

======

=> Problem 3:
Additionally, the definition of the Vary header in the Internet-Draft appears to conflict with the definition in HTTP (RFC 2616), as implemented by MediaWiki, PHP, Squid, etc. It's unclear what the "negotiate" value is for or how it will interact with the Vary header values that MediaWiki must send to HTTP proxy servers.

Response:
We see no conflict with the Vary definition of RFC 2616, as it states the following about the field names used in Vary:
"The field-names given are not limited to the set of standard request-header fields defined by this specification."

Furthermore, the "negotiate" value for Vary has become widely used since its introduction in RFC 2295, which details Transparent Content Negotiation. The "negotiate" value is used by default for negotiated responses by Apache servers. However, we agree that the "negotiate" value serves no real purpose without the corresponding Negotiate request header and can be regarded as a remnant of the early days of Memento, during which RFC 2295 was a significant inspiration. We are most willing to remove this value from the Vary header in the Memento protocol and hence also from the MediaWiki plugin.

======

=> Problem 4:
The Internet-Draft seems to unnecessarily overspecify server and client behaviour.
For example, depending on the server software, it may be difficult to implement the requirement that TimeGates respond to request methods other than GET and POST with an HTTP 405 code.

Response:
The concern regarding HTTP 405 is fair and we would be most willing to remove this requirement from the specification. Other feedback regarding instances of overspecification would be very welcome, as we could take it into account when wrapping up the Internet-Draft. From our perspective, we have tried to clearly detail a variety of existing and anticipated situations in a consistent manner, aiming to draft a specification that really helps implementers. But, in our enthusiasm, we may indeed have gone overboard.

======

=> Problem 5:
(In reply to comment #29)
> That said it would be kind of a cool thing, provided the effort was minimal.

I don't think the effort would be minimal. The code quality is poor, and would suffer from a high rate of bit rot due to poor integration with the MediaWiki core. For example, mmAcceptDateTime() assumes $_GET['oldid'] will have a certain interpretation by the MediaWiki core, and sends header values corresponding to this interpretation, regardless of what MediaWiki decides to actually do with that parameter. The assumption is already incorrect and will become more incorrect over time.

Response:
This comment regarding poor software quality comes as a big surprise, as we have invested very significant resources in improving the initial code base, through many iterations, in response to feedback from MediaWiki people. This is the first time we have heard about the false assumption re mmAcceptDateTime(). Our developer Harihar Shankar states the following in this regard:
"I am determining if the current resource is a version of an article by looking at the URL and check if there is "oldid" in it. This is definitely not the best way to do it, but I looked extensively in their documentation and I could not find a better alternative.
This issue has not been brought up by the code reviewers so far."

We would be very interested in learning what the appropriate approach is, and in hearing about other problems with the code. In both cases, we will be most happy to make the required changes to bring the code to the desired quality level.
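
As a rough check on the value-space argument in the response to Problem 2, the following Python sketch works through the arithmetic. The 10-year history, the 365-day year, the figure of 1000 cached entries, and the assumption that Accept-Datetime values are spread uniformly over the value space are all illustrative assumptions, not figures from the Memento Internet-Draft or from Wikipedia's caches.

```python
# Rough sketch of the Accept-Datetime value space discussed under Problem 2.
# Assumptions (illustrative only): 10 years of history, 365-day years, and a
# cache keyed on the exact Accept-Datetime value with uniformly spread requests.

YEARS = 10
DAYS_PER_YEAR = 365
SECONDS_PER_DAY = 24 * 60 * 60


def value_space(years: int, values_per_day: int = 1) -> int:
    """Number of distinct datetime values at the given granularity."""
    return years * DAYS_PER_YEAR * values_per_day


def hit_ratio(cached_entries: int, space: int) -> float:
    """Chance that a uniformly chosen value matches one of the cached entries."""
    return min(1.0, cached_entries / space)


day_values = value_space(YEARS)                      # day granularity
second_values = value_space(YEARS, SECONDS_PER_DAY)  # second granularity

print(day_values)     # 3650
print(second_values)  # 315360000

# Hypothetical cache holding 1000 distinct TimeGate responses for one URI.
print(hit_ratio(1000, day_values))
print(hit_ratio(1000, second_values))
```

Under these assumptions, even a cache that already holds many TimeGate responses for a single URI would rarely see an exact match at second granularity, which is the point the response makes about caching 302 responses.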
