Re: [Wiki-research-l] Identifying Wikipedia stubs in various languages

Stuart A. Yeates Tue, 20 Sep 2016 02:42:12 -0700

en:WP:DYK uses a threshold of 1,500+ characters of prose, which is a useful
cutoff. There is weaponised JavaScript to measure that at en:WP:Did you
know/DYKcheck.


This probably doesn't translate to CJK languages, which have radically
different information content per character.
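
A minimal sketch of that kind of cutoff, in Python. This is not the DYKcheck
script and does not reproduce its rules; it assumes the article's prose has
already been extracted as plain text (no templates, tables or references).
The names prose_char_count, looks_like_stub and min_chars are made up for
illustration. Only the 1,500 figure comes from en:WP:DYK.

```python
import re

# Threshold quoted from en:WP:DYK; everything else here is an assumption.
DYK_MIN_PROSE_CHARS = 1500


def prose_char_count(prose: str) -> int:
    """Count characters of prose after collapsing runs of whitespace."""
    return len(re.sub(r"\s+", " ", prose).strip())


def looks_like_stub(prose: str, min_chars: int = DYK_MIN_PROSE_CHARS) -> bool:
    """Return True if the prose is shorter than the chosen cutoff.

    min_chars is a parameter because a single cutoff is unlikely to suit
    every language (e.g. CJK); any per-language value must be chosen by
    the caller.
    """
    return prose_char_count(prose) < min_chars
```

For example, looks_like_stub(text) applies the en:WP:DYK figure, while
looks_like_stub(text, min_chars=n) lets you try a different cutoff n for a
given language.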

cheers
stuart

--
...let us be heard from red core to black sky

On Tue, Sep 20, 2016 at 9:26 PM, Robert West <[email protected]> wrote:

> Hi everyone,
>
> Does anyone know if there's a straightforward (ideally
> language-independent) way of identifying stub articles in Wikipedia?
>
> Whatever works is ok, whether it's publicly available data or data
> accessible only on the WMF cluster.
>
> I've found lists for various languages (e.g., Italian
> <https://it.wikipedia.org/wiki/Categoria:Stub> or English
> <https://en.wikipedia.org/wiki/Category:All_stub_articles>), but the
> lists are in different formats, so separate code is required for each
> language, which doesn't scale.
>
> I guess in the worst case, I'll have to grep for the respective stub
> templates in the respective wikitext dumps, but even this requires knowing
> for each language what the respective template is. So if anyone could point
> me to a list of stub templates in different languages, that would also be
> appreciated.
>
> Thanks!
> Bob
>
> --
> Up for a little language game? -- http://www.unfun.me
>
> _______________________________________________
> Wiki-research-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
>

