On Mon, Jan 26, 2009 at 3:47 AM, Petr Kadlec <[email protected]> wrote:
>> Is the code available and I have missed it? Do we have any other
>> implementation?
>
> I tried to do something similar (two examples are at
> http://mormegil.info/wp/blame/AIM.htm
> http://mormegil.info/wp/blame/AFC_Ajax.htm); the code is nothing
> secret, even though it is not too clean, and there is also no rocket
> science [you have been warned:
> https://opensvn.csie.org/traccgi/MWTools/browser/MWTools/trunk/PageBlame]:

I also have a blame engine of my own design. It is new and I haven't
released the source.

> The biggest problem I see with such tools is that it is IMHO unusable
> for any copyright-related purposes. My tool works by diffing the
> article revisions and tracking who was the last author of every word.
> Even though you can be much smarter than that, I don't believe you
> would be able to track all copyright-relevant contributions with that.
> As an example, consider using that tool on an article that was created
> by:
> 1. Importing an article with all its history from the English
> Wikipedia to some other-language wiki.
> 2. Translating it into the local language (for more fun, imagine a
> language using a different script, e.g. Russian, or even Chinese)
>
> There is IMHO no way the blame tool could track copyright properly
> through the translation (which it has to, copyright-wise). And even in
> the general case, I believe such tracking would be an AI-hard task
> (often, even a human is unable to do it properly…). Of course, such
> Blame tools are great for many reasons (which is why I wrote them),
> but I think the current context (license change, attribution etc.)
> does not fit them at all.

I think I have a more positive view than you do. Blame engines can
certainly inform copyright discussions and provide relevant
information, even though I agree they aren't a complete solution by
themselves. For example, when one is trying to list a fixed number of
"major authors" (as the GFDL provides), a blame tool can make a
reasonable guess at which authors are relevant. Blame tools also help
estimate the answer to important meta questions, such as "How many
authors does a typical Wikipedia article really have?"
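
A minimal sketch of the word-level approach quoted above (diff successive revisions, credit each word to the last author who introduced it), followed by a simple "major authors" ranking. This is not the PageBlame code or the unreleased engine mentioned above; the function names `blame` and `major_authors`, the whitespace tokenization, and the use of Python's `difflib` are all assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def blame(revisions):
    """Return [(word, author), ...] for the latest revision.

    `revisions` is a list of (author, text) pairs, oldest first.
    Hypothetical helper: words are split on whitespace only.
    """
    words, owners = [], []
    for author, text in revisions:
        new_words = text.split()
        new_owners = []
        matcher = SequenceMatcher(a=words, b=new_words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged words keep whoever owned them before.
                new_owners.extend(owners[i1:i2])
            else:
                # Inserted or replaced words go to this revision's author;
                # deleted words contribute nothing because j1 == j2.
                new_owners.extend([author] * (j2 - j1))
        words, owners = new_words, new_owners
    return list(zip(words, owners))


def major_authors(blamed, limit=5):
    """Rank authors by how many surviving words are credited to them."""
    counts = Counter(author for _, author in blamed)
    return counts.most_common(limit)


history = [
    ("alice", "the quick brown fox"),
    ("bob", "the quick red fox jumps"),
]
print(major_authors(blame(history)))
```

Note that a sketch like this shows the translation problem directly: if a later revision replaces every word with its translation, every word is credited to the translator and the earlier authors disappear from the ranking.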

When the license calls for attribution to be handled in a "reasonable"
way, I suspect one could make a good case that relying on a good blame
engine would often produce a reasonable attempt at attribution, even
though there are cases (like translation) where it will fail.
Attribution generated by blaming can be a good starting point, though
not necessarily the final answer.

-Robert Rohde

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
