https://bugzilla.wikimedia.org/show_bug.cgi?id=45161

--- Comment #3 from Brad Jorsch <[email protected]> ---
You seem to be making some major assumptions on how this natural sorting will
work, particularly with respect to punctuation, that may or may not be
warranted.

(In reply to comment #2)
> 
> * 'IPAddress'. It actually only supports IPv4 addresses. If natural sorting
> was implemented, it would be entirely 100% useless. Kill it.

It should be improved to support IPv6, which by no stretch of the imagination
will be handled correctly by "natural sorting".

I also note that "10.123.234.255" in locales that use '.' for grouping by
thousands looks like the number 10123234255, while "255.0.0.1" does not look
like any number, so it would probably be interpreted as 255 followed by some
other junk and be wrongly sorted before 10.123.234.255.

And in locales that use '.' for a decimal separator, is "10.23.45.6" the
numbers "10.23" and "45.6" separated by a period, or is it four integers? This
makes a difference if the table also contains "10.3.0.0".
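
A dedicated parser sidesteps both ambiguities by treating each octet as its
own integer. The following is only a sketch to illustrate the point:
ipv4SortKey is a hypothetical name, not the existing $.tablesorter code, and
IPv6 is deliberately left out.

```javascript
// Hypothetical helper, not part of jquery.tablesorter: turn a dotted-quad
// IPv4 address into a single number so rows can be compared numerically.
function ipv4SortKey(text) {
    var parts = text.trim().split('.');
    if (parts.length !== 4) {
        return NaN; // not a dotted quad
    }
    var key = 0;
    for (var i = 0; i < 4; i++) {
        // Each part must be 1-3 digits and at most 255.
        if (!/^\d{1,3}$/.test(parts[i])) {
            return NaN;
        }
        var octet = Number(parts[i]);
        if (octet > 255) {
            return NaN;
        }
        // Multiply instead of shifting: bit shifts work on signed 32-bit
        // integers and would go negative for addresses above 127.255.255.255.
        key = key * 256 + octet;
    }
    return key;
}

var addresses = ['255.0.0.1', '10.123.234.255', '10.3.0.0', '10.23.45.6'];
addresses.sort(function (a, b) {
    return ipv4SortKey(a) - ipv4SortKey(b);
});
// addresses is now: 10.3.0.0, 10.23.45.6, 10.123.234.255, 255.0.0.1
```

No locale-dependent number parsing is involved, so '.' is never read as a
grouping or decimal separator.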

> * 'currency'. By $.tablesorter definitions this actually means pounds,
> dollars, euros and yens; all other currency signs are not detected and
> treated as 'text'. It would be similarly superseded by natural sorting.
> Kill it.

If it's actually the case that your natural sorting handles it.

> * 'url'. It just removes ftp:, file:, http: and https: protocols and then
> sorts as text. I have absolutely no idea why would anybody consider this
> useful. Kill it.

That does seem like an odd choice. Perhaps the implementors should be asked for
a reason behind it.

> * 'isoDate'. That is YYYY-M(M)-D(D), where (X) means that this part is
> optional. Would be similarly superseded by natural sorting, kill it.

It also handles YYYY/MM/DD.

> * 'usLongDate'. It uses a 114-character regex I'm not going to decipher right
> now, but the regex ends with (AM|PM). English-specific, completely useless
> for other languages, kill it.
>
> * 'date'. It uses some generated regexes based on wgMonthNames and
> wgMonthNamesShort. Assumes that all languages use a
> XXX[,.'-/]XXX[,.'-/]YY(YY) date formats, where X can be either D or M
> depending on wgDefaultDateFormat and wgContentLanguage==en. I have no idea
> why exactly those separators were used. Probably doesn't work for most
> languages; I didn't test carefully, but bug 42607 is due to this
> malfunctioning. Kill it.

*Lots* of assumptions there.

What we probably need are specific parsers for various date formats, and/or a
way for a wiki to call $.tablesorter.addDateParser( 'name', 'format-string' )
to easily add additional date parsers for their language, and/or a general
parser that reads the format string from a "data-date-format" property on the
header cell.
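
To make the format-string idea concrete, here is a rough sketch of a parser
factory driven by such a string. Everything in it is an assumption made for
illustration: the makeDateParser name, the d/m/y format letters, and the
YYYYMMDD sort key are invented here and are not an existing $.tablesorter
API. It handles only all-numeric dates with a full four-digit year; month
names and two-digit years would need more work.

```javascript
// Hypothetical sketch of a format-string-driven date parser. The format
// letters (d, m, y) and the function name are assumptions for this example.
function makeDateParser(format) {
    // e.g. 'd.m.y' gives the field order ['d', 'm', 'y'].
    var order = format.match(/[dmy]/g);
    return function (text) {
        var nums = text.match(/\d+/g);
        if (!order || !nums || nums.length !== order.length) {
            return NaN; // text does not fit the declared format
        }
        var fields = {};
        for (var i = 0; i < order.length; i++) {
            fields[order[i]] = Number(nums[i]);
        }
        // Sortable key: YYYYMMDD as a number (NaN if a field is missing).
        return fields.y * 10000 + fields.m * 100 + fields.d;
    };
}

var parseDmy = makeDateParser('d.m.y');
var parseMdy = makeDateParser('m/d/y');
// Both produce the key 20130204 for 4 February 2013.
var keyDmy = parseDmy('04.02.2013');
var keyMdy = parseMdy('02/04/2013');
```

The same factory could be fed from a "data-date-format" attribute on the
header cell rather than from a per-wiki registration call.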

> * 'time'. Matches a HH:MM time with optional AM/PM. Again, would be obsoleted
> by natural sorting, let's kill it.

In what world would your natural sorting not put 2AM after 1PM? Or is it smart
enough to decide it's looking at a time, as well?
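
For comparison, a sketch of what a time-aware parser has to do: fold the
AM/PM suffix into the hour before comparing. The timeSortKey name and the
exact set of accepted spellings are assumptions for this example, not the
current 'time' parser.

```javascript
// Hypothetical helper: convert "HH:MM", "H:MM AM/PM" or "HAM/PM" into
// minutes since midnight so that times compare correctly.
function timeSortKey(text) {
    var m = /^\s*(\d{1,2})(?::(\d{2}))?\s*(AM|PM)?\s*$/i.exec(text);
    if (!m) {
        return NaN;
    }
    var hours = Number(m[1]);
    var minutes = m[2] ? Number(m[2]) : 0;
    var suffix = m[3] ? m[3].toUpperCase() : '';
    if (suffix) {
        hours = hours % 12; // 12AM becomes 0; 12PM becomes 0, then 12 below
        if (suffix === 'PM') {
            hours += 12;
        }
    }
    return hours * 60 + minutes;
}

var times = ['1PM', '2AM', '12:30 AM'];
times.sort(function (a, b) {
    return timeSortKey(a) - timeSortKey(b);
});
// times is now: 12:30 AM, 2AM, 1PM
```

A comparison that only looks at the leading digits would put 1PM before 2AM.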

> * 'number'. Would be obsoleted by natural sorting, let's kill it.

Your natural sorting is smart enough to handle numbers like 6.022e23 correctly?
Interesting.
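
To show the concern, here is one naive reading of "natural sorting" (split
into digit runs and non-digit runs, compare digit runs as integers) next to
a plain numeric comparison. The proposal does not specify its algorithm, so
naiveNaturalCompare is only an assumed stand-in, not the proposed
implementation.

```javascript
// A deliberately naive "natural" comparison: split each string into digit
// runs and non-digit runs, compare digit runs as integers, the rest as text.
function naiveNaturalCompare(a, b) {
    var ta = a.match(/\d+|\D+/g) || [];
    var tb = b.match(/\d+|\D+/g) || [];
    var n = Math.min(ta.length, tb.length);
    for (var i = 0; i < n; i++) {
        var bothNumeric = /^\d/.test(ta[i]) && /^\d/.test(tb[i]);
        if (bothNumeric) {
            var diff = Number(ta[i]) - Number(tb[i]);
            if (diff !== 0) {
                return diff;
            }
        } else if (ta[i] !== tb[i]) {
            return ta[i] < tb[i] ? -1 : 1;
        }
    }
    return ta.length - tb.length;
}

var values = ['6.022e23', '7e3', '1e-9'];
var natural = values.slice().sort(naiveNaturalCompare);
var numeric = values.slice().sort(function (a, b) {
    return parseFloat(a) - parseFloat(b);
});
// natural: 1e-9, 6.022e23, 7e3   (6.022e23 wrongly placed before 7e3)
// numeric: 1e-9, 7e3, 6.022e23
```

The naive version only ever compares the leading 6 against the leading 7, so
the exponent never gets a say.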

> Is that enough justification?

Not really.

All in all, it seems that the existing parsers started out as relatively
English-centric and were then adjusted to try to address other situations. But
getting rid of all that to rely entirely on a poorly-specified "natural sort"
algorithm doesn't seem like much of an improvement.
