Hi Dan,

Thank you very much for your offer of assistance from the WMF.  We have several 
issues that need to be addressed.

1.  Completely eliminating the use of MediaWiki's global variables.

In our extension, we have eliminated the use of all of MediaWiki's global 
variables except $wgScriptPath.  We use it to construct the URIs for the 
Memento headers with the wfAppendQuery and wfExpandUrl functions.  Is there a 
better way to get the full URI for the MediaWiki installation (including the 
'index.php' part of the path) without resorting to this variable, so that we 
can reconstruct the URIs of past articles?
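For concreteness, the construction we currently use looks roughly like the 
sketch below.  This is a simplified stand-in for the real extension code: the 
article title and oldid values are hypothetical, while wfAppendQuery, 
wfExpandUrl, PROTO_CURRENT, and $wgScriptPath are MediaWiki core.

```php
// Rough sketch of our current approach (not the exact extension code).
// $wgScriptPath is the one remaining global we would like to eliminate.
global $wgScriptPath;

$titleText = 'Main_Page';   // hypothetical article title
$oldid     = 12345;         // hypothetical revision id

// e.g. "/w/index.php?title=Main_Page&oldid=12345"
$relative = wfAppendQuery(
	$wgScriptPath . '/index.php',
	array( 'title' => $titleText, 'oldid' => $oldid )
);

// Expand to a full URI for use in the Memento Link/Location headers.
$mementoUri = wfExpandUrl( $relative, PROTO_CURRENT );
```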

2.  Test installations

We were hoping one of your test Wikipedia instances was available so that the 
community could experiment with our extension further.

3.  How best to handle performance testing

We are planning on conducting performance testing, either at Los Alamos, Old 
Dominion University, or one of the test Wikipedia instances, and wanted your 
input on what credible experiments we should set up to demonstrate the 
performance impact of our extension on a MediaWiki installation.

Our plan was to use the following test groups:
    1.  no Memento MediaWiki extension installed - access to current and old 
revision (memento) pages
    2.  no Memento MediaWiki extension installed - using a screen-scraping 
script to simulate the use of the history pages associated with each article in 
a way that attempts to achieve the goals of Memento, but only via MediaWiki's 
native UI
    3.  no Memento MediaWiki extension installed - use of MediaWiki's existing 
XML API to achieve the same goals as Memento
    4.  use of our native Memento MediaWiki extension with only the mandatory 
headers - access to current and old revision (memento) pages
    5.  use of our native Memento MediaWiki extension with only the mandatory 
headers - with the focus on performing time negotiation and acquiring the 
correct revision
    6.  use of our native Memento MediaWiki extension with all headers - access 
to current and old revision (memento) pages
    7.  use of our native Memento MediaWiki extension with all headers - again 
focusing on time negotiation

During each of these test runs, we would use utilities such as vmstat, iostat, 
and/or collectl to measure load on the system, including memory and disk 
access, and compare the results across multiple runs.
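As a concrete sketch of what one measurement run might look like: the script 
below samples /proc/loadavg as a self-contained stand-in for the vmstat, 
iostat, or collectl invocation we would actually use, and the short sleep 
stands in for a hypothetical test driver (run-test-group.sh).

```shell
#!/bin/sh
# Sketch of one measurement run; names are placeholders, not real scripts.
LOG=load-run1.log
: > "$LOG"

# Background sampler.  /proc/loadavg stands in here for vmstat/iostat/collectl,
# which would be swapped in on the real test hosts.
( while :; do cat /proc/loadavg >> "$LOG"; sleep 5; done ) &
SAMPLER=$!

# A real run would invoke ./run-test-group.sh for one of the seven test
# groups; we just sleep briefly so the sketch is self-contained.
sleep 1

kill "$SAMPLER"
wait "$SAMPLER" 2>/dev/null
echo "collected $(wc -l < "$LOG") samples in $LOG"
```

The per-run log files would then be lined up across the seven test groups for 
comparison.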

Also, are there pre-existing tools for testing MediaWiki that we should be 
using, and is there anything we are missing in our methodology?

4.  Architectural feedback to ensure that we've followed MediaWiki's best 
practices

Our extension is more object-oriented than its first incarnation, utilizing a 
mediator pattern, a strategy pattern, template methods, and factory methods to 
achieve its goals.  I can generate a simplified inheritance diagram to show the 
relationships, but was wondering if we should trim down the levels of 
inheritance for performance reasons.

5.  Advice on how best to market this extension

We can advertise the extension on the wikitech-l and mediawiki-l lists, and we 
do have a MediaWiki extension page, but we were wondering if there are 
conferences, web sites, etc. that could help get the word out that our 
extension is available for use, review, input, and further extension.  Any 
advice would be most helpful.

Thanks in advance,

Shawn M. Jones
Graduate Research Assistant
Department of Computer Science
Old Dominion University
________________________________________
From: [email protected] 
[[email protected]] on behalf of Dan Garry 
[[email protected]]
Sent: Monday, November 11, 2013 5:47 PM
To: Wikimedia developers
Subject: Re: [Wikitech-l] Memento Extension for MediaWiki: Advice on Further 
Development

Hi Shawn,

Thanks for starting this discussion!

Other than the suggestions that've been provided, how are you looking for
the WMF to help you with this extension? Our engineers are very limited on
time, so it might be helpful to hear from you about how you'd like us to
help.

Thanks,
Dan


On 1 November 2013 19:50, Shawn Jones <[email protected]> wrote:

> Hi,
>
> I'm currently working on the Memento Extension for Mediawiki, as announced
> earlier today by Herbert Van de Sompel.
>
> The goal of this extension is to work with the Memento framework, which
> attempts to display web pages as they appeared at a given date and time in
> the past.
>
> Our goal is for this to be a collaborative effort focusing on solving
> issues and providing functionality in "the Wikimedia Way" as much as
> possible.
>
> Without further ado, I have the following technical questions (I apologize
> in advance for the fire hose):
>
> 1.  The Memento protocol has a resource called a TimeMap [1] that takes an
> article name and returns text formatted as application/link-format.  This
> text contains a machine-readable list of all of the prior revisions
> (mementos) of this page.  It is currently implemented as a SpecialPage
> which can be accessed like
> http://www.example.com/index.php/Special:TimeMap/Article_Name.  Is this
> the best method, or is it more preferable for us to extend the Action class
> and add a new action to $wgActions in order to return a TimeMap from the
> regular page like
> http://www.example.com/index.php?title=Article_Name&action=gettimemap without
> using the SpecialPage?  Is there another preferred way of solving
> this problem?
>
> 2.  We currently make several database calls using the select method
> of the Database Object.  After some research, we realized that Mediawiki
> provides some functions that do what we need without making these database
> calls directly.  One of these needs is to acquire the oldid and timestamp
> of the first revision of a page, which can be done using
> Title->getFirstRevision()->getId() and
> Title->getFirstRevision()->getTimestamp() methods.  Is there a way to get
> the latest ID and latest timestamp?  I see I can do Title->getLatestRevID()
> to get the latest revision ID; what is the best way to get the latest
> timestamp?
>
> 3.  In order to create the correct headers for use with the Memento
> protocol, we have to generate URIs.  To accomplish this, we use the
> $wgServer global variable (through a layer of abstraction); how do we
> correctly handle situations if it isn't set by the installation?  Is there
> an alternative?  Is there a better way to construct URIs?
>
> 4.  We use exceptions to indicate when showErrorPage should be run; should
> the hooks that catch these exceptions and then run showErrorPage also
> return false?
>
> 5.  Is there a way to get previous revisions of embedded content, like
> images?  I tried using the ImageBeforeProduceHTML hook, but found that
> setting the $time parameter didn't return a previous revision of an image.
>  Am I doing something wrong?  Is there a better way?
>
> 6.  Are there any additional coding standards we should be following
> besides those on the "Manual:Coding_conventions" and "Manual:Coding
> Conventions - Mediawiki" pages?
>
> 7.  We have two styles for serving pages back to the user:
>        * 302-style[2], which uses a 302 redirect to tell the user's
> browser to go fetch the old revision of the page (e.g.
> http://www.example.com/index.php?title=Article&oldid=12345)
>        * 200-style[3], which actually modifies the page content in place
> so that it resembles the old revision of the page
>      Which of these styles is preferable as a default?
>
> 8.  Some sites don't wish to have their past Talk/Discussion pages
> accessible via Memento.  We have the ability to exclude namespaces (Talk,
> Template, Category, etc.) via configurable option.  By default it excludes
> nothing.  What namespaces should be excluded by default?
>
> Thanks in advance for any advice, assistance, further discussion, and
> criticism on these and other topics.
>
> Shawn M. Jones
> Graduate Research Assistant
> Department of Computer Science
> Old Dominion University
>
> [1] http://www.mementoweb.org/guide/rfc/ID/#Pattern6
> [2] http://www.mementoweb.org/guide/rfc/ID/#Pattern1.1
> [3] http://www.mementoweb.org/guide/rfc/ID/#Pattern1.2
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l




--
Dan Garry
Associate Product Manager for Platform
Wikimedia Foundation
