https://bugzilla.wikimedia.org/show_bug.cgi?id=6569

Gabriel Wicke <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|[email protected]. |[email protected]
                   |org                         |

--- Comment #7 from Gabriel Wicke <[email protected]> 2011-11-14 13:45:23 UTC ---
Additional information from
http://lists.wikimedia.org/pipermail/wikitext-l/2011-November/000483.html.
Nested definition lists are rare enough to allow us to decide on a new standard
without breaking too many pages:

> Can we deconstruct the current parser's processing steps and build a set
> of rules that must be followed?

I think the commonly-used structures are quite clearly defined, but the
behaviour of these strange permutations is largely unspecified. The parser
output for the case reported in the bug has already changed in the meantime.

> I think we need to get a dump of English Wikipedia and start using a
> simple PEG parser to scan through it looking for patterns and figuring
> out how often certain things are used - if ever.

I just ran an en-wiki article dump through a zcat/tee/grep pipeline:

pattern              count        example
------------------------------------------------------------------
^                    548498738    (total number of lines)
^;                   681495
^;[^:]+:             153997       ; bla : blub
^[;:*#]+;[^:]+:      3817         *; bla : blub
^;;                  2332
^[:;*#]*;[^:]*::     41           most probably ;::
^[;:*#]*;[^:]+::     17           ;; bla :: blub

Nested definition lists are not exactly common. Lines starting with ';;'
often appear as comments in code listings. The most common other
application appears to be indentation and emphasis. Any change in the
produced structure that keeps indentation and bolding should thus avoid
breaking pages.
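
For anyone who wants to repeat this kind of count, here is a minimal Python
sketch of the same idea. It is not the zcat/tee/grep pipeline I actually ran;
the function names (count_patterns, count_in_dump) are made up for
illustration, the dump is assumed to be gzip-compressed text, and the patterns
are assumed to use extended-regex semantics (`+` means "one or more").

```python
import gzip
import re

# Patterns copied from the table above. Each one is anchored at line start.
PATTERNS = [
    r"^",
    r"^;",
    r"^;[^:]+:",
    r"^[;:*#]+;[^:]+:",
    r"^;;",
    r"^[:;*#]*;[^:]*::",
    r"^[;:*#]*;[^:]+::",
]


def count_patterns(lines, patterns=PATTERNS):
    """Return {pattern: number of lines matching it} for an iterable of lines."""
    compiled = [(p, re.compile(p)) for p in patterns]
    counts = {p: 0 for p in patterns}
    for line in lines:
        # Drop the trailing newline so matching is per line, as grep does it.
        line = line.rstrip("\n")
        for pattern, regex in compiled:
            if regex.match(line):
                counts[pattern] += 1
    return counts


def count_in_dump(path):
    """Count pattern matches in a gzip-compressed text dump (hypothetical path)."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_patterns(handle)
```

Calling count_in_dump() on a local dump file returns a dictionary keyed by
pattern. A single pass over the file checks all patterns, which plays the same
role as fanning the stream out to several greps with tee.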

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
