Ariel T. Glenn wrote:
> I'm all for the change, but it would have to be announced well
> in advance of rollout and coordinated with other folks. For example,
> I have a check against rev_len (in bytes) when writing out XML dumps,
> in order to avoid rev id and rev content out of sync errors that we
> have run into multiple times in the past. That code would need to be
> changed to count characters of the text being used for prefetch
> instead of bytes.

Are character counts generally consistent across programming languages? And is there a performance concern with counting characters vs. counting bytes? Another post in this thread suggested that counting characters might be up to five times slower. I have no idea whether that's accurate, but even a small increase could have a nasty impact on dump-processing scripts (as opposed to the negligible impact on revision table inserts).

MZMcBride
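
On the consistency question, a minimal sketch (in Python, purely for illustration; the helper name length_measures and the sample string are made up here, not taken from MediaWiki or the dump scripts) of how the "length" of one string depends on what is being counted. Python's len() on a str counts Unicode code points, whereas Java's String.length() and JavaScript's .length count UTF-16 code units, and PHP's strlen() counts bytes. These can all disagree once text leaves the Basic Multilingual Plane, so a character-based check would need to agree on the unit on both sides.

```python
def length_measures(text: str) -> dict:
    """Return three common notions of 'length' for the same string."""
    return {
        # What a byte-based check (e.g. PHP strlen on UTF-8 text) sees.
        "utf8_bytes": len(text.encode("utf-8")),
        # What Python's len() sees: Unicode code points.
        "code_points": len(text),
        # What Java/JavaScript string length sees: UTF-16 code units
        # (two bytes each, so divide the encoded size by 2).
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
    }


# Hypothetical sample: Latin text with an accent, CJK text, and one emoji.
sample = "na\u00efve \u65e5\u672c\u8a9e \U0001F600"

if __name__ == "__main__":
    for name, value in length_measures(sample).items():
        print(f"{name}: {value}")
```

For this sample the three measures come out as 21 bytes, 11 code points, and 12 UTF-16 code units. The sketch says nothing about the relative speed of the two approaches; that would have to be measured against real dump text.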
