On Fri, 10 Sep 2010 23:11:27 +0000, Dan Nessett wrote:

> We are currently attempting to refactor some specific modifications to
> the standard MW code we use (1.13.2) into an extension so we can upgrade
> to a more recent maintained version. One modification we have keeps a
> flag in the revisions table specifying that article text was imported
> from WP. This flag generates an attribution statement at the bottom of
> the article that acknowledges the import.
> 
> I don't want to start a discussion about the various legal issues
> surrounding text licensing. However, assuming we must acknowledge use of
> licensed text, a legitimate technical issue is how to associate state
> with an article in a way that records the import of licensed text. I
> bring this up here because I assume we are not the only site that faces
> this issue.
> 
> Some of our users want to encode the attribution information in a
> template. The problem with this approach is anyone can come along and
> remove it. That would mean the organization legally responsible for the
> site would entrust the integrity of site content to any arbitrary
> author. We may go this route, but for the sake of this discussion I
> assume such a strategy is not viable. So, the remainder of this post
> assumes we need to keep such licensing state in the db.
> 
> After asking around, one suggestion was to keep the licensing state in
> the page_props table. This seems very reasonable and I would be
> interested in comments by this community on the idea. Of course, there
> has to be a way to get this state set, but it seems likely that could be
> achieved using an extension triggered when an article is edited.
> 
> Since this post is already getting long, let me close by asking whether
> support for associating licensing information with articles might be
> useful to a large number of sites. If so, then perhaps it belongs in the
> core.

The discussion about whether to support license data in the database has 
settled down. There seems to be some support. So, I think the next step 
is to determine the best technical approach. Below I provide a strawman 
proposal. Note that this is only to foster discussion on technical 
requirements and approaches. I have nothing invested in the strawman.

Implementation location: In an extension

Permissions: include two new permissions - 1) addlicensedata, and 2) 
modifylicensedata. These are pretty self-explanatory. Sites that wish to 
give all users the ability to provide and modify licensing data would 
assign these permissions to everyone. Sites that wish to allow all users 
to add licensing data, but restrict those who are allowed to modify it, 
would give the first permission to everyone and the second to a limited 
group.
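
To make the two-permission split concrete, here is a minimal sketch in Python (not MediaWiki code). The group names and the GROUP_RIGHTS table are assumptions for illustration only; just the two permission names come from the proposal above.

```python
# Hypothetical group-to-rights table. Only the two right names come from
# the proposal; the groups "*" and "sysop" are assumptions.
GROUP_RIGHTS = {
    "*": {"addlicensedata"},                         # everyone may add
    "sysop": {"addlicensedata", "modifylicensedata"},  # limited group may modify
}


def user_can(groups, right):
    """Return True if any of the user's groups grants the given right."""
    effective = set(groups) | {"*"}  # every user implicitly belongs to "*"
    return any(right in GROUP_RIGHTS.get(g, set()) for g in effective)
```

With this table, user_can([], "modifylicensedata") is false while user_can(["sysop"], "modifylicensedata") is true, which is the "everyone adds, few modify" configuration described above.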

Database schema: Add a "licensing" table to the db with the following 
columns - 1) revision_or_image, 2) revision_id, 3) image_id, 4) 
content_source, 5) license_id, 6) user_id.

The first three columns identify the revision or image with which the 
licensing data is associated; revision_or_image indicates which of the 
two id columns applies. I am not particularly adept with SQL, so there 
may be a better way to do this. The content_source column is a string 
holding a URL or other reference to the source of the licensed content. 
The license_id identifies the specific license for the content. The 
user_id identifies the user who added the licensing information. It may 
be useful if a site wishes to allow whoever added the licensing 
information to delete or modify it. However, there is a complication: 
since IP addresses are easily spoofed, this entry should only be valid 
for logged-in users.

Add a "license" table with the following columns - 1) license_id, 2) 
license_text, 3) license_name and 4) license_version. The license_id in 
the licensing table references rows in this table.
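
As a starting point for discussion, the two tables could be sketched as follows, using Python's sqlite3 module purely so the schema can be tried out. The column types, the NOT NULL choices, and the CHECK constraints that tie revision_or_image to the two id columns are all assumptions; the proposal above only names the columns.

```python
import sqlite3

# Column names follow the strawman; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE license (
    license_id      INTEGER PRIMARY KEY,
    license_text    TEXT NOT NULL,
    license_name    TEXT NOT NULL,
    license_version TEXT NOT NULL
);
CREATE TABLE licensing (
    revision_or_image TEXT NOT NULL
        CHECK (revision_or_image IN ('revision', 'image')),
    revision_id       INTEGER,
    image_id          INTEGER,
    content_source    TEXT NOT NULL,
    license_id        INTEGER NOT NULL REFERENCES license(license_id),
    user_id           INTEGER,  -- NULL when the contributor was not logged in
    -- Exactly one of the two id columns is set, matching the discriminator.
    CHECK ((revision_or_image = 'revision'
            AND revision_id IS NOT NULL AND image_id IS NULL)
        OR (revision_or_image = 'image'
            AND image_id IS NOT NULL AND revision_id IS NULL))
);
"""


def open_db():
    """Create an in-memory database holding the two strawman tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn
```

The CHECK constraint is one way to keep the first three columns consistent with each other. Someone more adept with SQL may prefer two separate tables, one for revisions and one for images.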

One complication: when a page or image is reverted, the licensing table 
must be updated to reflect the current state.
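
One possible way to handle a revert, sketched below, is to copy the licensing rows of the restored revision onto the revision that the revert creates. This assumes a revert produces a new revision id, and it uses a cut-down licensing table (no image columns) to stay short. The function name and the sample ids are hypothetical.

```python
import sqlite3


def copy_license_rows(conn, restored_rev_id, new_rev_id):
    """Copy the licensing rows of the restored revision onto the new one."""
    conn.execute(
        "INSERT INTO licensing (revision_id, content_source, license_id, user_id) "
        "SELECT ?, content_source, license_id, user_id "
        "FROM licensing WHERE revision_id = ?",
        (new_rev_id, restored_rev_id),
    )
    conn.commit()


# Simplified table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE licensing (revision_id INTEGER, content_source TEXT, "
    "license_id INTEGER, user_id INTEGER)"
)
conn.execute("INSERT INTO licensing VALUES (10, 'http://example.org/source', 1, 7)")

# Revision 12 is a revert back to the content of revision 10.
copy_license_rows(conn, restored_rev_id=10, new_rev_id=12)
```

An alternative policy would be to look up licensing data by walking back to the revision whose content is currently displayed, which avoids duplicating rows at the cost of a more involved query.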

Data manipulation: The extension would use suitable hooks to insert, 
modify and render licensing data. Insertion and modification would 
probably use a relevant Edit Page or Article Management hook. Rendering 
would probably use a Page Rendering Hook.

Page rendering: You probably don't want to dump licensing data directly 
onto a page. Instead, it is preferable to output a short licensing 
statement like:

"Content on this page uses licensed content. For details, see licensing 
data."

The phrase "licensing data" would be a link to a special page that 
accesses the licensing table and displays the license data associated 
with the page.
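
The rendering step might look roughly like the sketch below. The special page name "Special:LicenseData", the /wiki/ URL prefix, and the CSS class are assumptions; only the wording of the statement comes from the proposal.

```python
from html import escape
from urllib.parse import quote


def licensing_footer(page_title, has_license_rows):
    """Return the short licensing statement, or '' if the page has no license data."""
    if not has_license_rows:
        return ""
    # Hypothetical special page that lists the licensing rows for a page.
    href = "/wiki/Special:LicenseData/" + quote(page_title.replace(" ", "_"))
    return (
        '<div class="licensing-notice">Content on this page uses licensed '
        'content. For details, see <a href="%s">licensing data</a>.</div>'
        % escape(href, quote=True)
    )
```

Keeping the statement short and pushing the details to a separate page means the article itself does not have to change when the licensing rows are later modified.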

-- 
-- Dan Nessett


_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
