--- Comment #30 from Aryeh Gregor <> 2010-12-06 19:18:35 UTC ---
(In reply to comment #29)
> Yes I know, but the id duplication is another problem (also for HTML5
> conformance and for having autogenerated summaries to link to the appropriate
> section when we click on them).

The algorithm already accounts for this.  E.g.,

== Foo ==
== "Foo" ==

will give the latter an anchor of #Foo_2 in current trunk.  First the
anchor is generated, then a number is appended if it's the same as a previous
anchor.  You have to have this code anyway, to handle cases like

== Foo ==
== Foo ==

So punycode doesn't gain anything for uniqueness.
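
As a rough illustration of that second step, here is a minimal sketch in
Python.  It is not the parser's actual code: the function name
dedupe_anchors and the "_N" suffix format are assumptions chosen to match
the #Foo_2 style above.

```python
def dedupe_anchors(anchors):
    """Return anchors in order, numbering any that repeat an earlier one.

    Hypothetical helper for illustration only, not MediaWiki code.
    """
    used = set()
    result = []
    for anchor in anchors:
        candidate = anchor
        n = 1
        # Bump the suffix until the candidate is not already taken.
        while candidate in used:
            n += 1
            candidate = f"{anchor}_{n}"
        used.add(candidate)
        result.append(candidate)
    return result


print(dedupe_anchors(["Foo", "Foo", "Bar", "Foo"]))
```

Since the check runs on the generated anchor and not on the heading text,
it catches duplicates no matter how the two headings ended up with the same
anchor.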

> Here we were speaking about invalid characters, and it is clear that a valid
> ID must not contain any dot (and at least must not start with it)

Valid IDs in XHTML 1.0 and HTML5 may contain a dot.  Valid IDs in HTML5 may
start with a dot.

> and that
> converting them using ".XX" hex sequences for each non-ASCII UTF-8-encoded
> character

We no longer do this in trunk.  We just convert runs of whitespace and other
bad characters to a single underscore, and otherwise output as-is (possibly
with a number appended).
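
A minimal sketch of that conversion, in Python for illustration.  The exact
set of "bad" characters below is an assumption, as are the name make_anchor
and the trimming of leading and trailing underscores; check trunk for the
real list.

```python
import re

# Assumption: an illustrative set of "bad" characters (whitespace,
# underscore, quotes, &, #, %).  The real list lives in trunk.
_BAD_RUN = re.compile(r"[\s_'\"&#%]+")


def make_anchor(heading):
    """Collapse each run of bad characters to one underscore.

    Hypothetical helper; everything else, including non-ASCII text,
    is passed through unchanged.
    """
    anchor = _BAD_RUN.sub("_", heading).strip("_")
    # Never return an empty id.
    return anchor or "_"


print(make_anchor("Foo   bar"))
print(make_anchor('"Foo"'))
print(make_anchor("Résumé"))
```

With these assumptions, a quoted heading and its unquoted form produce the
same anchor, which is why the numbering step is still needed afterwards.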

> Really, the generated IDs should be the same and compatible for direct use in
> URLs, or in CSS selectors, or for the XML syntax. This is possible, but it
> will require a better encoding than the bogus current one, plus the general
> need to make them unique by adding some suffixes for duplicates.

The IDs being output by trunk in HTML5 mode can be used directly in URLs (in
reasonably recent browsers), can be used in CSS selectors with proper escaping
(although I doubt much of anyone does), and can be used in XML just as in any
other markup language.
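
To illustrate what "proper escaping" for a CSS selector could look like,
here is a sketch in Python that follows the CSSOM "serialize an identifier"
rules as I understand them.  The function name css_escape_ident is made up
for this example, and none of this is MediaWiki code.

```python
def css_escape_ident(ident):
    """Escape a string for use as a CSS identifier, e.g. after '#'.

    Sketch based on the CSSOM "serialize an identifier" rules.
    """
    out = []
    for i, ch in enumerate(ident):
        cp = ord(ch)
        is_digit = "0" <= ch <= "9"
        if cp == 0:
            # NULL becomes the replacement character.
            out.append("\ufffd")
        elif cp <= 0x1F or cp == 0x7F:
            # Control characters are written as hex escapes.
            out.append(f"\\{cp:x} ")
        elif is_digit and (i == 0 or (i == 1 and ident[0] == "-")):
            # A leading digit (or a digit after a leading '-') needs a hex escape.
            out.append(f"\\{cp:x} ")
        elif ch == "-" and len(ident) == 1:
            out.append("\\-")
        elif (cp >= 0x80 or ch in "-_" or is_digit
              or (ch.isascii() and ch.isalpha())):
            # Non-ASCII, '-', '_', digits and ASCII letters pass through.
            out.append(ch)
        else:
            # Any other ASCII character, such as '.', gets a backslash.
            out.append("\\" + ch)
    return "".join(out)


print("#" + css_escape_ident(".Foo"))
print("#" + css_escape_ident("Foo_2"))
```

So an id containing or starting with a dot is still addressable from CSS;
the dot just has to be backslash-escaped in the selector.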
