Dear Haifeng,

Would you not be able to use ordinary information retrieval techniques such as bag-of-words/phrases and tf-idf? Explicit semantic analysis (ESA) uses this approach (though its primary focus is word-level semantic similarity).
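A minimal sketch of that idea in plain Python (my own illustration, not ESA itself): represent each article as a tf-idf weighted bag of words and compare with cosine similarity. I use a smoothed idf, as common library implementations do; in practice you would likely reach for a library such as scikit-learn or gensim instead.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn a list of document strings into sparse tf-idf dicts."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()                      # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    n = len(docs)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        # smoothed idf: log((1+n)/(1+df)) + 1, so shared terms keep weight
        vecs.append({t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
                     for t, c in tf.items()})
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

vs = tfidf_vectors(["the cat sat on the mat",
                    "the cat lay on the mat",
                    "stocks fell sharply today"])
print(cosine(vs[0], vs[1]))  # higher: overlapping vocabulary
print(cosine(vs[0], vs[2]))  # lower: disjoint vocabulary
```

For real Wikipedia articles you would of course tokenize properly (strip markup, punctuation, stopwords) rather than splitting on whitespace.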

There are a few papers on ESA: https://tools.wmflabs.org/scholia/topic/Q5421270

I have also used it in "Open semantic analysis: The case of word level semantics in Danish" http://www2.compute.dtu.dk/pubdb/views/edoc_download.php/7029/pdf/imm7029.pdf


Finn Årup Nielsen
http://people.compute.dtu.dk/faan/



On 04/05/2019 13:47, Haifeng Zhang wrote:
Dear folks,

Is there a way to compute content similarity between two Wikipedia articles?

For example, I can think of representing each article as a vector of
likelihoods over possible topics.

But I wonder whether there is other work that people have already explored in the past.


Thanks,

Haifeng
_______________________________________________
Wiki-research-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l