On Mon, Nov 24, 2008 at 8:35 PM, Luca de Alfaro <[EMAIL PROTECTED]> wrote:
[snip]
> So I don't think based on what you say that the system is tripping over
> diffs.

For example: I can't figure out why the text in the image caption is
colored here:
http://wiki-trust.cse.ucsc.edu/index.php/Digital_room_correction

I couldn't initially figure out why *anything* above the external link
section was colored… though the inability to diff contributed to that.

On Mon, Nov 24, 2008 at 8:22 PM, Luca de Alfaro <[EMAIL PROTECTED]> wrote:
> I agree with Gregory that it is very useful to quantify the usefulness of
> trust information on text -- otherwise, all comparisons are very subjective.
> In our WikiSym 08 paper, we measure various parameters of the "trust"
> coloring we compute, including:
>
> - Recall of deletions. Only 3.4% of text is in the lower half of trust
>   values, yet this is 66% of the text that is deleted in the very next
>   revision.
> - Precision of deletions. Text in the bottom half of trust values has
>   probability 33% of being deleted in the next revision, against a
>   probability of 1.9% for general text. The deletion probability rises
>   to 62% for text in the bottom 20% of trust values.
> - We study the correlation between the trust of a word, sampled at random
>   in all revisions, and the future lifespan of a word (correcting for the
>   finite horizon effect due to the finite number of revisions in each
>   article), showing positive correlation.
[snip]

These performance metrics are better than I would have guessed from
browsing through the output.

How does the color mapping reflect the trust values? Basically, when I use
it I see a *lot* of colored things which are perfectly fine. At least for
me, the difference between shades is far less cognitively significant than
colored vs. non-colored, so that may be the source of my confusion.

Have you compared your system to a simple toy trust metric? I'd propose
"revisions by users in their first week and before their first 7 (?) edits
are untrusted".
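
To make the toy rule concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the `Revision` record, the function names, and the thresholds are made up (the 7-edit cutoff is as much a guess here as the "7 (?)" above), and none of it comes from MediaWiki or WikiTrust. The second helper computes precision and recall of deletions in the sense quoted above, so the same numbers could be reported for the toy rule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical thresholds; the 7-edit figure is a guess.
NEW_ACCOUNT_WINDOW = timedelta(days=7)
MIN_PRIOR_EDITS = 7


@dataclass
class Revision:
    """Illustrative record, not a MediaWiki or WikiTrust type."""
    timestamp: datetime        # when the revision was saved
    user_registered: datetime  # when the author's account was created
    user_prior_edits: int      # edits the author had made before this one


def is_untrusted(rev: Revision) -> bool:
    """Toy rule: flag a revision made during the author's first week
    and before the author's first MIN_PRIOR_EDITS edits."""
    in_first_week = rev.timestamp - rev.user_registered < NEW_ACCOUNT_WINDOW
    few_edits = rev.user_prior_edits < MIN_PRIOR_EDITS
    # Literal reading of the rule ("and"). Use "or" instead to flag a
    # revision when either condition holds.
    return in_first_week and few_edits


def deletion_precision_recall(words):
    """words: iterable of (flagged, deleted_in_next_revision) boolean pairs.

    precision = share of flagged text that is deleted in the next revision
    recall    = share of deleted text that had been flagged
    """
    words = list(words)
    flagged = sum(1 for f, _ in words if f)
    deleted = sum(1 for _, d in words if d)
    both = sum(1 for f, d in words if f and d)
    precision = both / flagged if flagged else 0.0
    recall = both / deleted if deleted else 0.0
    return precision, recall
```

Running `deletion_precision_recall` once with words flagged by `is_untrusted` on the authoring revision, and once with words flagged by low trust value, would put both metrics on the same scale.
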
This rule reflects the existing automatic trust system on the site
(auto-confirmation), and also reflects a type of trust checking applied
manually by editors. I think that's the bar any more sophisticated trust
metric needs to outperform.

Thank you so much for your response!

_______________________________________________
Wikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
