Thanks Brian,

Defaulting to only allow $wgContentNamespaces, or more specifically, 
MWNamespace::getContentNamespaces(), worked great.

--Shawn

________________________________________
From: wikitech-l-boun...@lists.wikimedia.org 
[wikitech-l-boun...@lists.wikimedia.org] on behalf of Brian Wolff 
[bawo...@gmail.com]
Sent: Friday, November 01, 2013 6:43 PM
To: Wikimedia developers
Subject: Re: [Wikitech-l] Memento Extension for MediaWiki: Advice on Further Development

Hi, I responded inline.

On 11/1/13, Shawn Jones <sj...@cs.odu.edu> wrote:
> Hi,
>
> I'm currently working on the Memento Extension for MediaWiki, as announced
> earlier today by Herbert Van de Sompel.
>
> The goal of this extension is to work with the Memento framework, which
> attempts to display web pages as they appeared at a given date and time in
> the past.
>
> Our goal is for this to be a collaborative effort focusing on solving issues
> and providing functionality in "the Wikimedia Way" as much as possible.
>
> Without further ado, I have the following technical questions (I apologize
> in advance for the fire hose):
>
> 1.  The Memento protocol has a resource called a TimeMap [1] that takes an
> article name and returns text formatted as application/link-format.  This
> text contains a machine-readable list of all of the prior revisions
> (mementos) of this page.  It is currently implemented as a SpecialPage which
> can be accessed like
> http://www.example.com/index.php/Special:TimeMap/Article_Name.  Is this the
> best method, or is it more preferable for us to extend the Action class and
> add a new action to $wgActions in order to return a TimeMap from the regular
> page like
> http://www.example.com/index.php?title=Article_Name&action=gettimemap
> without using the SpecialPage?  Is there another preferred way of solving
> this problem?

Special Page vs Action is usually considered equally ok for this sort
of thing. However creating an api module would probably be the
preferred method to return such machine readable data about a page.
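
For illustration, a minimal Python sketch of what a TimeMap body in
application/link-format might look like, whichever entry point (special
page, action, or API module) ends up serving it. This is not the
extension's PHP code; the function names, the example.com base URL, and
the 14-digit input timestamp format (YYYYMMDDHHMMSS, UTC) are assumptions
made for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def mw_to_http_date(ts: str) -> str:
    """Convert a 14-digit UTC timestamp (YYYYMMDDHHMMSS) to an HTTP date."""
    dt = datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return format_datetime(dt, usegmt=True)


def build_timemap(title, revisions, base="http://www.example.com/index.php"):
    """Serialize (oldid, timestamp) pairs, oldest first, as link-format text.

    The URL layout mirrors the examples in the thread and is hypothetical.
    """
    links = [
        f'<{base}/{title}>; rel="original"',
        f'<{base}/Special:TimeMap/{title}>; rel="self"; '
        'type="application/link-format"',
    ]
    for oldid, ts in revisions:
        uri = f"{base}?title={title}&oldid={oldid}"
        links.append(
            f'<{uri}>; rel="memento"; datetime="{mw_to_http_date(ts)}"'
        )
    # Entries in link-format are separated by commas.
    return ",\n".join(links)


if __name__ == "__main__":
    print(build_timemap("Article_Name", [(12345, "20131101184300")]))
```

Keeping the serialization separate from the entry point means the same
function could back a special page today and an API module later.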

> 2.  We currently make several database calls using the select method of
> the Database Object.  After some research, we realized that MediaWiki
> provides some functions that do what we need without making these database
> calls directly.  One of these needs is to acquire the oldid and timestamp of
> the first revision of a page, which can be done using
> Title->getFirstRevision()->getId() and
> Title->getFirstRevision()->getTimestamp() methods.  Is there a way to get
> the latest ID and latest timestamp?  I see I can do Title->getLatestRevID()
> to get the latest revision ID; what is the best way to get the latest
> timestamp?

Use existing wrapper functions around DB calls where you can, but if
you need to, it's OK to query the DB directly.

For the last part, probably something along the lines of
WikiPage::factory( $titleObj )->getRevision()->getTimestamp()

> 3.  In order to create the correct headers for use with the Memento
> protocol, we have to generate URIs.  To accomplish this, we use the
> $wgServer global variable (through a layer of abstraction); how do we
> correctly handle situations if it isn't set by the installation?  Is there
> an alternative?  Is there a better way to construct URIs?

$wgServer is always filled out (Setup.php sets it if the user doesn't).
However, you probably shouldn't be using it directly. Which method is
most appropriate depends on what sort of URLs you want, but generally
the Title class has methods like getFullURL for this sort of thing.


> 4.  We use exceptions to indicate when showErrorPage should be run; should
> the hooks that catch these exceptions and then run showErrorPage also return
> false?

I haven't looked at your code, so I'm not sure about the context, but in
general a hook returns false to denote that no further processing should
take place. Displaying an error message sounds like a good criterion for
returning false. That said, things may depend on the hook and what
precisely you're doing.
>
> 5.  Is there a way to get previous revisions of embedded content, like
> images?  I tried using the ImageBeforeProduceHTML hook, but found that
> setting the $time parameter didn't return a previous revision of an image.
> Am I doing something wrong?  Is there a better way?

FlaggedRevisions manages to set an old version of an image, so it's
possible. I think you might want to do something with the
BeforeParserFetchFileAndTitle hook as well. For the time parameter,
make sure the function you're using has the $time parameter marked as
pass by reference. Also note: the time parameter is the timestamp at
which the image version was created; it does not mean "get whatever
image was current at the time specified" (I believe).

>
> 6.  Are there any additional coding standards we should be following besides
> those on the "Manual:Coding_conventions" and "Manual:Coding Conventions -
> Mediawiki" pages?

Those are the important ones. As a rule of thumb, try to make your
code look like it fits in with the rest of MediaWiki.

>
> 7.  We have two styles for serving pages back to the user:
>        * 302-style[2], which uses a 302 redirect to tell the user's browser
> to go fetch the old revision of the page (e.g.
> http://www.example.com/index.php?title=Article&oldid=12345)
>        * 200-style[3], which actually modifies the page content in place so
> that it resembles the old revision of the page
>      Which of these styles is preferable as a default?

My first reaction would be the 302 style, as it more clearly indicates
that you're viewing an old page, and people could copy and paste the URL
to see the exact same version. It also seems better to have different
URLs for different objects (caching and all). [That's just a first
reaction, I haven't thought about it deeply]
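
To make the 302 style concrete, here is a rough Python sketch (not the
extension's PHP code) of picking the revision that was current at a
requested datetime and answering with a redirect to its oldid URL. The
function names, the base URL, and the assumption that the revision list is
already sorted oldest first are all hypothetical; a real response would
also carry the Link headers the Memento protocol asks for.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def select_oldid(revisions, accept_datetime):
    """Return the oldid of the revision current at accept_datetime.

    revisions: list of (timezone-aware datetime, oldid), oldest first.
    accept_datetime: an HTTP date such as the Accept-Datetime header value.
    Requests that predate the page fall back to the first revision.
    """
    wanted = parsedate_to_datetime(accept_datetime)
    timestamps = [ts for ts, _ in revisions]
    # Index of the last revision whose timestamp is <= the requested time.
    index = bisect_right(timestamps, wanted) - 1
    return revisions[max(index, 0)][1]


def redirect_response(title, oldid, base="http://www.example.com/index.php"):
    """Build a (status, headers) pair for a 302-style answer."""
    return 302, {
        "Location": f"{base}?title={title}&oldid={oldid}",
        "Vary": "accept-datetime",
    }


if __name__ == "__main__":
    revs = [
        (datetime(2013, 1, 1, tzinfo=timezone.utc), 100),
        (datetime(2013, 6, 1, tzinfo=timezone.utc), 200),
    ]
    oldid = select_oldid(revs, "Fri, 01 Mar 2013 00:00:00 GMT")
    print(redirect_response("Article", oldid))
```

Because the redirect target is an ordinary oldid URL, each old version
keeps its own address, which is the caching and copy-and-paste benefit
described above.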
>
> 8.  Some sites don't wish to have their past Talk/Discussion pages
> accessible via Memento.  We have the ability to exclude namespaces (Talk,
> Template, Category, etc.) via configurable option.  By default it excludes
> nothing.  What namespaces should be excluded by default?

That's probably going to be a political issue that varies by project.
As a first approximation, maybe default to only the namespaces in
$wgContentNamespaces.
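
As a rough Python sketch of that default (not MediaWiki code): treat the
setting as an allow-list that starts out as the content namespaces, rather
than an exclude-list that starts out empty. The numeric IDs below are
MediaWiki's standard namespace numbers; the function name and the
`allowed` parameter are hypothetical.

```python
# Standard MediaWiki namespace numbers.
NS_MAIN, NS_TALK, NS_TEMPLATE, NS_CATEGORY = 0, 1, 10, 14

# Assumption: a stock wiki whose only content namespace is the main one.
DEFAULT_CONTENT_NAMESPACES = frozenset({NS_MAIN})


def memento_enabled(namespace_id, allowed=DEFAULT_CONTENT_NAMESPACES):
    """Return True if pages in this namespace should be served via Memento."""
    return namespace_id in allowed


if __name__ == "__main__":
    print(memento_enabled(NS_MAIN))   # main-namespace articles
    print(memento_enabled(NS_TALK))   # talk pages
    # A site that also wants templates opts in explicitly.
    print(memento_enabled(NS_TEMPLATE, allowed={NS_MAIN, NS_TEMPLATE}))
```

With an allow-list, a wiki that adds a new namespace exposes nothing extra
until an administrator opts in.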

>
> Thanks in advance for any advice, assistance, further discussion, and
> criticism on these and other topics.
>
> Shawn M. Jones
> Graduate Research Assistant
> Department of Computer Science
> Old Dominion University
>
> [1] http://www.mementoweb.org/guide/rfc/ID/#Pattern6
> [2] http://www.mementoweb.org/guide/rfc/ID/#Pattern1.1
> [3] http://www.mementoweb.org/guide/rfc/ID/#Pattern1.2
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Good luck on developing your extension. Last of all I don't want to
sound negative, but please keep in mind that if your goal is
deployment on Wikipedia, that is not just a technical issue, but also
a political one, and a goal that is rather hard to accomplish...

Cheers,
Brian

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
