https://bugzilla.wikimedia.org/show_bug.cgi?id=11547


Philippe Verdy <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]




--- Comment #3 from Philippe Verdy <[email protected]>  2009-11-14 17:43:32 UTC ---
URL encoding is definitely NOT the correct way to make the "user@" part of an
email address valid. Read the RFCs: URL encoding applies only to the
hierarchical page name within a domain space (under a hierarchical protocol
like "http(s):" or "ftp(s):"), and to query parameters (when those protocols
support them).

Valid user names in email addresses also use a "safe" alphabet different from
that of domain names (which also DO NOT use URL encoding, but rather the
encodings supported in IDNA if they are internationalized, and the DNS
specifications otherwise).
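
To illustrate the difference between the two encodings, here is a minimal
Python sketch. It assumes the standard library's built-in "idna" codec (which
implements the older IDNA 2003 rules) and a made-up domain, "bücher.example";
neither comes from this bug report.

```python
from urllib.parse import quote

# Hypothetical internationalized domain, used only for illustration.
domain = "bücher.example"

# IDNA (Punycode) form: this is the ASCII form used for the name in the DNS.
idna_form = domain.encode("idna").decode("ascii")

# Percent-encoded (URL-encoded) form: only meaningful inside URL paths and
# query strings, never as a domain name or as the "user@" part.
percent_form = quote(domain)

print(idna_form)     # xn--bcher-kva.example
print(percent_form)  # b%C3%BCcher.example
```

The two outputs are unrelated strings, which is the point: percent-encoding a
name does not produce something the DNS (or a mail server) will recognize.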

For example, the underscore character "_" (which is part of my own email
address and cannot be substituted with a "+" or "-", nor even with "%5F") and
the exclamation mark "!" are perfectly safe (and standard) in the "user@"
part. That part is in fact not really described as a user name, but as an
identity specifier whose internal syntax may contain a user name and some
other authorization data that cannot be safely stripped out or separated
(some sites use the colon ":" instead of the exclamation mark).

Mapping arbitrary Unicode characters, in UTF-8 or any other representation,
into a valid "user@" part of an email address is completely unspecified: there
is no reliable algorithm to do this, as the mapping is completely
domain-dependent and may even differ from the mapping used for encoding
usernames in URI schemes other than "mailto:". All that can be done is to
check that the "user@" part provided uses the valid ASCII subset specific to
the "mailto:" URI scheme (which is distinct from the ASCII subsets used
either in the DNS protocol for domain names or in the server-local address
part of HTTP/FTP URLs).
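
Such a check could look roughly like the following Python sketch. It is an
assumption on my part that the relevant subset is the unquoted "dot-atom" form
built from the "atext" characters of RFC 5322 (section 3.2.3); quoted local
parts and length limits are deliberately not handled, and the function name
is_plain_local_part is hypothetical.

```python
import re

# "atext" characters as listed in RFC 5322, section 3.2.3 (ASCII only).
# The hyphen is placed last so it is taken literally inside the class.
_ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"

# dot-atom: one or more atext runs separated by single dots.
_DOT_ATOM = re.compile("[" + _ATEXT + r"]+(?:\.[" + _ATEXT + "]+)*")


def is_plain_local_part(local: str) -> bool:
    """Return True if `local` is an unquoted dot-atom made of ASCII atext.

    This only validates; it never rewrites or "encodes" the input.
    """
    return _DOT_ATOM.fullmatch(local) is not None


print(is_plain_local_part("first_last"))  # underscore is allowed
print(is_plain_local_part("user!host"))   # exclamation mark is allowed
print(is_plain_local_part("us\u00e9r"))   # non-ASCII is rejected, not mapped
```

Note that "%" is itself a legal atext character, so a string such as "a%5Fb"
passes the check as a literal five-character local part; it is not decoded
into "a_b".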

Note also that "user@" parts in email addresses are normally CASE-SIGNIFICANT
(even if most target SMTP servers will accept emails using any case, and even
if some RFCs require that users provide an email address containing a user
name that can be used as a valid label in a DNS subdomain, in order to
activate some functionality). SMTP relay agents (as well as senders) MUST NOT
change the letter case in a pseudo-canonicalization, because they cannot
reliably know whether the recipient server makes the case distinction. Doing
so could simply break the authorization data that is part of the "user@" part
(for example, it could contain Base64-encoded binary data, in addition to
representing the user identity on the target server, where the message will
be delivered to the target POP3/IMAP/WebMail user's mailbox).
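
In code, a case-safe normalization would touch only the domain. The following
Python sketch shows one way to do that; the function name normalize_address
and the sample address are hypothetical, and the split on the last "@" is a
simplification that assumes an unquoted local part.

```python
def normalize_address(address: str) -> str:
    """Lowercase only the domain; leave the "user@" part exactly as given."""
    # Split on the last "@" so the domain is whatever follows it.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not a user@domain address: %r" % (address,))
    # Domain names are case-insensitive; the local part may not be.
    return local + "@" + domain.lower()


print(normalize_address("QmFzZTY0!User@Example.ORG"))
# QmFzZTY0!User@example.org
```

Lowercasing the whole string instead would turn "QmFzZTY0" into "qmfzzty0",
which is no longer the same Base64 data.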


-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
