The comparison tool on https://tools.wmflabs.org/copyvios/ can look for repeated phrases.
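To give a rough idea of what "looking for repeated phrases" can mean, here is a minimal Python sketch. It is not the copyvios tool's actual algorithm, which I have not checked. It only assumes that a "phrase" is a run of n consecutive words and scores two texts by the Jaccard overlap of their phrase sets. The function names `shingles` and `phrase_overlap` and the default `n=3` are made up for illustration.

```python
import re


def shingles(text, n=3):
    """Return the set of n-word phrases (word n-grams) found in text."""
    # Lowercase and keep only word characters so punctuation does not matter.
    words = re.findall(r"\w+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def phrase_overlap(text_a, text_b, n=3):
    """Return (Jaccard score, shared phrases) for two plain-text articles."""
    a, b = shingles(text_a, n), shingles(text_b, n)
    shared = a & b
    union = a | b
    # Two texts shorter than n words have no phrases; treat that as no overlap.
    score = len(shared) / len(union) if union else 0.0
    return score, shared
```

You would pass in the plain text of the two articles (fetching and stripping the wikitext is left out here). A larger `n` finds only longer verbatim phrases, so it leans towards detecting copying. A smaller `n` is closer to a general vocabulary overlap.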
You might be able to tweak that tool a bit.

On Sat, 4 May 2019 at 12:48, Haifeng Zhang <[email protected]> wrote:

> Dear folks,
>
> Is there a way to compute content similarity between two Wikipedia
> articles?
>
> For example, I can think of representing each article as a vector of
> likelihoods over possible topics.
>
> But I wonder whether there is other work people have already explored
> in the past.
>
> Thanks,
>
> Haifeng
> _______________________________________________
> Wiki-research-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
