On Mon, Jan 26, 2009 at 3:47 AM, Petr Kadlec <[email protected]> wrote:
>> Is the code available and I have missed it? Do we have any other
>> implementation?
>
> I tried to do something similar (two examples are at
> http://mormegil.info/wp/blame/AIM.htm
> http://mormegil.info/wp/blame/AFC_Ajax.htm); the code is nothing
> secret, even though it is not too clean, and there is also no rocket
> science [you have been warned:
> https://opensvn.csie.org/traccgi/MWTools/browser/MWTools/trunk/PageBlame]:

I also have a blame engine of my own design.  It is new and I haven't
released the source.

> The biggest problem I see with such tools is that it is IMHO unusable
> for any copyright-related purposes. My tool works by diffing the
> article revisions and tracking who was the last author of every word.
> Even though you can be much smarter than that, I don't believe you
> would be able to track all copyright-relevant contributions with that.
> As an example, consider using that tool on an article that was created
> by:
> 1. Importing an article with all its history from the English
> Wikipedia to some other-language wiki.
> 2. Translating it into the local language (for more fun, imagine a
> language using a different script, e.g. Russian, or even Chinese)
>
> There is IMHO no way the blame tool could track copyright properly
> through the translation (which it has to, copyright-wise). And even in
> the general case, I believe such tracking would be an AI-hard task
> (often, even a human is unable to do it properly…). Of course, such
> Blame tools are great for many reasons (which is why I wrote them),
> but I think the current context (license change, attribution etc.)
> does not fit them at all.

I think I have a more positive view than you do.  As a tool, blame
engines can certainly inform copyright discussions and provide
relevant information, even though I agree they aren't a complete
solution by themselves.

For example, in situations where one is trying to list a fixed
number of "major authors" (as the GFDL provides), blame tools can
make a reasonable guess at which authors are relevant.  They also help
estimate answers to important meta questions, such as "How many
authors does a typical Wikipedia article really have?"
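
To make the idea concrete, here is a minimal Python sketch of the
word-level approach described in the quoted text (diff consecutive
revisions and credit each word to the last author who introduced it),
followed by a simple ranking of authors by surviving words.  This is
not the code of either tool mentioned in this thread.  The names
blame, major_authors and history, the use of difflib, and the
whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def blame(revisions):
    """Return [(word, author), ...] for the latest revision.

    `revisions` is a list of (author, text) pairs, oldest first.
    Each word is credited to the author of the revision that last
    inserted or replaced it.
    """
    blamed = []  # (word, author) pairs for the revisions seen so far
    for author, text in revisions:
        old_words = [word for word, _ in blamed]
        new_words = text.split()  # naive whitespace tokenization
        matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged words keep their earlier attribution.
                updated.extend(blamed[i1:i2])
            else:
                # Inserted or replaced words go to this revision's author;
                # for a pure deletion the new range is empty.
                updated.extend((word, author) for word in new_words[j1:j2])
        blamed = updated
    return blamed


def major_authors(revisions, limit=5):
    """Rank authors by how many surviving words are credited to them."""
    counts = Counter(author for _, author in blame(revisions))
    return counts.most_common(limit)


# Hypothetical three-revision history, oldest first.
history = [
    ("Alice", "The quick brown fox"),
    ("Bob", "The quick brown fox jumps over the lazy dog"),
    ("Carol", "The slow brown fox jumps over the lazy dog"),
]
print(major_authors(history))  # [('Bob', 5), ('Alice', 3), ('Carol', 1)]
```

A sketch like this credits only the literal words that survive, so it
cannot see through a translation or a heavy rewrite; it gives a first
estimate of who the major authors are, nothing more.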

When the license calls for attribution to be handled in a "reasonable"
way, I suspect one could make a good case that relying on a good
blame engine would often produce a reasonable attempt at attribution,
even though there are cases (like translation) where it will fail.
Attribution generated by a blame engine can be a good starting point,
though it may not be the final answer.

-Robert Rohde

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
