en:WP:DYK has a measure of 1,500+ characters of prose, which is a useful
Probably doesn't translate to CJK languages, which have radically different
information content per character.
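
To make the length idea concrete, here is a minimal sketch of a prose-length check on raw wikitext. It is not the DYK tooling, just a rough approximation: the helper names (rough_prose_length, looks_like_stub) are made up for illustration, the markup stripping is deliberately crude (file and category links, lists and nested tables are not handled properly), and treating the 1,500-character figure as a stub cutoff is an assumption that would need a different value for CJK wikis.

```python
import re

# Threshold quoted from en:WP:DYK; using it as a stub cutoff is an assumption.
DYK_MIN_PROSE_CHARS = 1500


def rough_prose_length(wikitext: str) -> int:
    """Return a very rough count of readable-prose characters in wikitext."""
    text = wikitext
    # Drop HTML comments, self-closing <ref .../> tags, then <ref>...</ref> footnotes.
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)
    text = re.sub(r"<ref\b[^>]*/>", "", text)
    text = re.sub(r"<ref\b[^>]*>.*?</ref>", "", text, flags=re.DOTALL)
    # Strip templates from the innermost outwards until none are left.
    previous = None
    while previous != text:
        previous = text
        text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    # Drop (non-nested) tables.
    text = re.sub(r"\{\|.*?\|\}", "", text, flags=re.DOTALL)
    # Keep only the visible label of [[target|label]] and [[target]] links.
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]]*)\]\]", r"\1", text)
    # Drop section headings such as "== See also ==".
    text = re.sub(r"^=+.*?=+\s*$", "", text, flags=re.MULTILINE)
    # Drop bold/italic quote markup and any remaining HTML tags.
    text = re.sub(r"'{2,}", "", text)
    text = re.sub(r"<[^>]+>", "", text)
    # Collapse whitespace so layout does not inflate the count.
    return len(" ".join(text.split()))


def looks_like_stub(wikitext: str, min_chars: int = DYK_MIN_PROSE_CHARS) -> bool:
    """Flag a page whose rough prose length falls below min_chars."""
    return rough_prose_length(wikitext) < min_chars
```

Since min_chars is a parameter, a per-language threshold can be passed in rather than hard-coding the English figure.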
...let us be heard from red core to black sky
On Tue, Sep 20, 2016 at 9:26 PM, Robert West <w...@cs.stanford.edu> wrote:
> Hi everyone,
> Does anyone know if there's a straightforward (ideally
> language-independent) way of identifying stub articles in Wikipedia?
> Whatever works is ok, whether it's publicly available data or data
> accessible only on the WMF cluster.
> I've found lists for various languages (e.g., Italian
> <https://it.wikipedia.org/wiki/Categoria:Stub> or English
> <https://en.wikipedia.org/wiki/Category:All_stub_articles>), but the
> lists are in different formats, so separate code is required for each
> language, which doesn't scale.
> I guess in the worst case, I'll have to grep for the respective stub
> templates in the respective wikitext dumps, but even this requires knowing
> for each language what the respective template is. So if anyone could point
> me to a list of stub templates in different languages, that would also be
> Up for a little language game? -- http://www.unfun.me
> Wiki-research-l mailing list