On Wed, 29 May 2013 18:33:03 -0700, Tyler Romeo <[email protected]> wrote:

On Wed, May 29, 2013 at 9:26 PM, Tim Starling <[email protected]> wrote:

37% for the larger replacement array in Html::expandAttributes(), or
for the smaller one in Html::element()? And what was the test case
size: how many replaced bytes compared to non-replaced bytes?

If it was the strtr() in Html::element(), which is the only one which
gives a size reduction, perhaps you should compare it against
htmlspecialchars($s, ENT_NOQUOTES), which should use the same
algorithm as plain htmlspecialchars() but with the same size reduction
as strtr().


Ran another test. I tested on the string
'<&<&<&herllowodsiojgd<&sd<^<6&&"""' repeated 50 times, and I ran the
replacement function 500,000 times. The results were:

htmlspecialchars with ENT_NOQUOTES: 14.025s
htmlspecialchars without ENT_NOQUOTES: 13.457s
strtr: 24.842s
str_replace: 13.184s

Of course, these numbers tend to vary +/- 0.25s every time, so take it with
a grain of salt.

*-- *
*Tyler Romeo*
Stevens Institute of Technology, Class of 2016
Major in Computer Science
www.whizkidztech.com | [email protected]

These stats look a little less like "htmlspecialchars is the most efficient, so we should use it" and a little more like "strtr is implemented inefficiently, so we should try one of the other methods of string replacement".

Reading up online:
* http://stackoverflow.com/questions/8177296/when-to-use-strtr-vs-str-replace
* http://micro-optimization.com/strtr-vs-str_replace
* http://comments.gmane.org/gmane.comp.php.devel/77397

I get the impression that:
* strtr iterates and replaces character-by-character, while str_replace replaces each pair in order, as if you called str_replace multiple times, just replacing rather than iterating
* strtr can safely do an `a -> b, b -> a` replacement where 'abb' becomes 'baa', while str_replace cannot
* strtr's algorithm may be even slower when the strings to be replaced are of varying sizes
* strtr is going to be faster in PHP 5.4 as they've changed the algorithm it uses
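
To make the swap point concrete, here is a minimal sketch. The `$swap` map and the variable names are made up for illustration and are not taken from MediaWiki:

```php
<?php
// Hypothetical swap map, for illustration only.
$swap = ['a' => 'b', 'b' => 'a'];

// strtr() scans the subject once and never re-examines text it has
// already replaced, so the two rules cannot interfere with each other.
$viaStrtr = strtr('abb', $swap);

// str_replace() applies each pair in turn to the whole string:
// 'abb' -> 'bbb' (a => b), then 'bbb' -> 'aaa' (b => a).
$viaStrReplace = str_replace(array_keys($swap), array_values($swap), 'abb');

var_dump($viaStrtr);      // string(3) "baa"
var_dump($viaStrReplace); // string(3) "aaa"
```

This ordering effect only matters when one replacement's output contains another replacement's search string.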

We aren't doing any replacements that need strtr's guarantee. As long as our & -> &amp; replacement is the first replacement in str_replace's array then it should work exactly as we need it.
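
A rough sketch of why the order matters. The pairs below are an assumption chosen for illustration, not a copy of the arrays in our Html class, and the input string is made up:

```php
<?php
// Hypothetical pairs for illustration; not copied from MediaWiki's Html class.
$search  = ['&', '<', '>'];
$replace = ['&amp;', '&lt;', '&gt;'];

$input = 'a < b && "c" > d';

// '&' is handled first, so the '&' inside '&lt;' and '&gt;' is never re-escaped.
$good = str_replace($search, $replace, $input);

// Same pairs with '&' moved last: the '&' produced by the earlier
// pairs gets escaped a second time.
$bad = str_replace(['<', '>', '&'], ['&lt;', '&gt;', '&amp;'], $input);

// Reference result; ENT_NOQUOTES leaves the double quotes untouched.
$reference = htmlspecialchars($input, ENT_NOQUOTES);

var_dump($good === $reference); // bool(true)
var_dump($bad === $reference);  // bool(false)
```

With '&' last, '<' ends up as '&amp;lt;', which is the double-escaping that putting '&' first avoids.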

So it looks like we should just be replacing most of our strtr uses with str_replace instead.

Also, I'd be interested to see those benchmarks re-run on PHP 5.4 now that we know that they changed the algorithm.
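
For anyone who wants to try that, something along these lines should do as a starting point. It is only a sketch: the replacement pairs are my assumption (the arrays used in the original test were not posted), and the iteration count is cut down so it finishes quickly.

```php
<?php
// Hypothetical replacement pairs; the arrays from the original test were not posted.
$search  = ['&', '<', '>'];
$replace = ['&amp;', '&lt;', '&gt;'];
$map     = array_combine($search, $replace);

$candidates = [
    'htmlspecialchars with ENT_NOQUOTES' => function ($s) {
        return htmlspecialchars($s, ENT_NOQUOTES);
    },
    'htmlspecialchars without ENT_NOQUOTES' => function ($s) {
        return htmlspecialchars($s);
    },
    'strtr' => function ($s) use ($map) {
        return strtr($s, $map);
    },
    'str_replace' => function ($s) use ($search, $replace) {
        return str_replace($search, $replace, $s);
    },
];

// Test string from the quoted message, repeated 50 times.
$subject = str_repeat('<&<&<&herllowodsiojgd<&sd<^<6&&"""', 50);

// Kept small so the script finishes quickly; the quoted run used 500,000.
$iterations = 1000;

$timings = [];
foreach ($candidates as $label => $fn) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($subject);
    }
    $timings[$label] = microtime(true) - $start;
}

printf("PHP %s, %d iterations\n", PHP_VERSION, $iterations);
foreach ($timings as $label => $seconds) {
    printf("%-40s %.3fs\n", $label, $seconds);
}
```

Each candidate goes through a closure, so the call overhead is the same for all four, but the absolute times will not match a loop that calls the functions directly. Running the same script under 5.3 and 5.4 should show whether the strtr gap closes.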

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
