https://bugzilla.wikimedia.org/show_bug.cgi?id=164

--- Comment #220 from Tim Starling <[email protected]> 2011-07-20 
12:14:54 UTC ---
(In reply to comment #218)
>             if ( $comparison < 0 ) {
>                 $min = $mid + 1;
>             } elseif ( $comparison > 0 ) {
>                 $max = $mid - 1;

The function is not a classical binary search, which only determines equality,
instead it finds the lower bound of a range. When the test item sorts below the
target ($comparison < 0), it can't be ruled out as a prospective lower bound.
That's why you need $min = $mid, not $min = $mid + 1. 

Maybe $max = $mid - 1 is possible in the $comparison > 0 case, with your
suggested alteration to the test for the before-the-start case. But the
existing code is tested and does work, and the improvement in convergence rate
would be small. 

> OK, I immediately found a bug in the binary search function (which is 
> defintely
> not beinary, has slower than expected convergence, and may even stale on
> testing the same index, due to incorrect handling of min/max indice after
> comparing):

I assume by "stale" you mean an infinite loop. This does not occur because the
midpoint is always different from either of the two endpoints until the
difference $max - $min is reduced to 1 or 0, at which point the loop exits. 

In a textbook binary search, offsetting the midpoint by 1 allows the loop to
continue until $min > $max. But as I said, we can't offset the midpoint in the
$comparison < 0 case, so a slightly earlier loop termination is required to
avoid an infinite loop.

> Note that the final test to return false is unnecessary and causes confusion 
> on
> usage (if you had written this in C/C++ instead of PHP, you would not be able
> to make the distinction between false (no lower bound not found) and 0 (valid
> lower bound). Your code should not depend on those PHP hacks, which consists 
> in
> returning values of distinct types (but with PHP's implicit promotion on basic
> types, it's really dangerous do do that) !
> 
> It's just simpler to let it return -1 (the content already in $max when we are
> before the first item), and then let the caller test the negative sign of the
> result, instead of comparing it then with false.

It's conventional for PHP functions to return false to indicate an error or
other unusual situation, see for instance http://php.net/array_search . It's
not at all dangerous. If I wrote the function in C, I would indeed have used
-1, and I would have called it an ugly hack to work around C's lack of weak
typing.

If you're aware of some actual bug in findLowerBound(), you should file a
separate bug report, or note it on
<http://www.mediawiki.org/wiki/Special:Code/MediaWiki/80443>

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to