It appears that the summary may have normalized the formatting of the CSS.

143095  display: none;


Your query[1] assumes a space after "display:" and gives 218 results. Using
no space[2] gives 2,473 results, but both queries still assume that no other
declarations occur in the style attribute. A regex query[3] with "display:" +
optional spaces + "none" gives 4,296 results, or a more reasonable average of
about 33 uses per matching page (143,095 / 4,296). That query may be overly
aggressive and match outside of style contexts, but it also matches
*list_style = text-align:center;display:none,* and
*style="font-size: normal; text-align: left; display: none;"* which I
think is a good thing (definitely in the latter case).
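
To make the difference between the queries concrete, here is a rough sketch in Python. It is only an approximation: the pattern is a local stand-in for the insource regex in [3], not the search engine's own matcher, and the sample strings and helper names (matches, average_per_page) are made up for illustration.

```python
import re

# Local stand-in for insource:/display: *none/ :
# literal "display:", zero or more spaces, then "none".
DISPLAY_NONE = re.compile(r"display: *none")

# Illustrative inputs; the third and fourth are the two cases quoted above.
samples = [
    'style="display: none;"',
    'style="display:none;"',
    'list_style = text-align:center;display:none,',
    'style="font-size: normal; text-align: left; display: none;"',
    'style="display: block;"',
]


def matches(text):
    """Return True if the pattern occurs anywhere in the text."""
    return DISPLAY_NONE.search(text) is not None


def average_per_page(total_uses, pages):
    """Average number of uses per page that contains the pattern."""
    return total_uses / pages


results = [matches(s) for s in samples]

for sample, hit in zip(samples, results):
    print(hit, sample)
```

The exact-phrase queries [1] and [2] would only find the first and second samples respectively, while the pattern above finds the first four. With the counts from this thread, average_per_page(143095, 218) is about 656, and average_per_page(143095, 4296) is about 33.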

Parsing a dump of enwiki is more accurate than running insource: queries.
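
As a sketch of what counting from the wikitext could look like, the Python below pulls out style="..." attributes, splits them into declarations, and normalizes each one before counting. This is my guess at the kind of normalization involved, not the script that produced the summary; the function names and the normalization rules (lowercasing, collapsing whitespace, "property: value;" output) are assumptions.

```python
import re
from collections import Counter

# Double-quoted style attributes only; \b keeps names like list_style out.
STYLE_ATTR = re.compile(r'\bstyle\s*=\s*"([^"]*)"', re.IGNORECASE)


def normalize(declaration):
    """Turn one raw declaration into 'property: value;' form, or None."""
    prop, sep, value = declaration.partition(":")
    if not sep:
        return None
    prop = prop.strip().lower()
    value = " ".join(value.split()).lower()  # collapse runs of whitespace
    if not prop or not value:
        return None
    return f"{prop}: {value};"


def count_declarations(wikitext):
    """Count normalized inline style declarations in a wikitext string."""
    counts = Counter()
    for attr in STYLE_ATTR.findall(wikitext):
        # Naive split: a ';' inside a value would be split incorrectly.
        for raw in attr.split(";"):
            normalized = normalize(raw)
            if normalized:
                counts[normalized] += 1
    return counts


example = (
    '<span style="display:none">a</span>'
    '<td style="font-size: normal; text-align: left; display: none;">b</td>'
    '<div style="DISPLAY :  none;">c</div>'
)
counts = count_declarations(example)
print(counts.most_common())
```

Here all three spellings of the declaration are counted under "display: none;", which is the behaviour an exact-phrase insource: query cannot reproduce.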

—Trey

[1]
https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=advanced&search=insource%3A%22style%3D%5C%22display%3A+none%3B%5C%22%22&fulltext=Search&ns0=1&profile=advanced

[2]
https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=advanced&search=insource%3A%22style%3D%5C%22display%3Anone%3B%5C%22%22&fulltext=Search&ns0=1&profile=advanced

[3]
https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=default&search=insource%3A%2Fdisplay%3A+*none%2F&fulltext=Search


Trey Jones
Software Engineer, Discovery
Wikimedia Foundation

On Tue, Oct 27, 2015 at 10:41 AM, Robert Rohde <[email protected]> wrote:

> Okay, I misunderstood those as page counts, which would be way too high.
> Even if they are explicit usage counts, I am still surprised they are that
> high.
>
> BTW, is it surprising to anyone else that style elements aren't searchable
> by default?  Searching for "efcfff" [1], gives only a single article result
> despite "background: #efcfff;" being reported 200k times.
>
> We can however search using "insource:efcfff" [2], which reports 5516
> articles, implying this color is applied _on average_ roughly 39 times per
> article.
>
> "display: none;" would appear even more impressive, with a reported 140k
> uses in just 218 articles [3] or an average of 656 usages per page
> containing it.  That doesn't feel very likely to me.  One possibility would
> be if you mistakenly counted some or all pages outside of the main
> namespace.  Though only 218 articles use "display: none", there are nearly
> 31000 other pages that include it [4], which seems like a much more
> reasonable way to get to 140k total uses.
>
> -Robert
>
> [1]
>
> https://en.wikipedia.org/w/index.php?search=efcfff&title=Special%3ASearch&go=Go
> [2]
>
> https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=advanced&search=insource%3Aefcfff&fulltext=Search&ns0=1&profile=advanced
> [3]
>
> https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=advanced&search=insource%3A%22style%3D%5C%22display%3A+none%3B%5C%22%22&fulltext=Search&ns0=1&profile=advanced
> [4]
>
> https://en.wikipedia.org/w/index.php?title=Special:Search&search=insource%3A%22style%3D%5C%22display%3A%20none%3B%5C%22%22&fulltext=Search&profile=all
>
>
> On Tue, Oct 27, 2015 at 2:32 PM, MZMcBride <[email protected]> wrote:
>
> > Robert Rohde wrote:
> > >On Mon, Oct 26, 2015 at 2:13 AM, MZMcBride <[email protected]> wrote:
> > >>The following are the top ten instances of inline styling from main
> > >>namespace pages on the English Wikipedia, as of about 2015-10-02:
> > >>
> > >>1552197 text-align: center;
> > >>499756  text-align: left;
> > >>355952  background: #dfffdf;
> > >>235222  background: #cfcfff;
> > >>215038  background: #efcfff;
> > >>210702  text-align: right;
> > >>143095  display: none;
> > >>93646   background: #efefef;
> > >>86391   font-size: 90%;
> > >>80420   background: #fff;
> > >
> > >I'm not sure what your bug is, but those counts are way too high to be
> > >accurate reflections of the wikitext in the main namespace on enwiki.
> >
> > Err, based on what? :-)
> >
> > These numbers are instances of style="[...]", not page counts. Looking at
> > a specific example from <https://phabricator.wikimedia.org/P2230>:
> >
> > 1164   font-family: 'microsoft yi baiti', 'noto sans yi', nsimsun-18030,
> >        simsun-18030, 'sil yi', code2000;
> >
> > These 1,164 inline styling instances all come from a single article:
> > <https://en.wikipedia.org/w/index.php?oldid=672244691&action=edit>.
> >
> > Maybe that's the confusion? I tried to make my descriptions as clear as
> > possible and I'm not saying a major bug is impossible, of course, but I
> > don't have any reason so far to doubt the data I collected.
> >
> > Another strange case is "background-color: {{/meta/color}};", which had
> > 16,432 instances. This almost looks like it would try to transclude a
> > subpage of the article, but due to subpages being disabled in the main
> > namespace on the English Wikipedia, it's actually transcluding a template
> > named "/meta/color": <https://en.wikipedia.org/wiki/Template:/meta/color>.
> >
> > I did concurrently look at the approximate number of non-redirect pages
> > that contain inline styling. My findings were that about 408,777
> > non-redirect pages contain some kind of inline styling on the English
> > Wikipedia (cf. <https://phabricator.wikimedia.org/T115228#1752223>).
> >
> > MZMcBride
> >
> >
> >
> > _______________________________________________
> > Wikitech-l mailing list
> > [email protected]
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >