https://bugzilla.wikimedia.org/show_bug.cgi?id=164

--- Comment #218 from Philippe Verdy <verd...@wanadoo.fr> 2011-07-20 08:54:53 
UTC ---
OK, I immediately found a bug in the binary search function (which is defintely
not beinary, has slower than expected convergence, and may even stale on
testing the same index, due to incorrect handling of min/max indice after
comparing):

291     function findLowerBound( $valueCallback, $valueCount,
$comparisonCallback, $target ) {
292                    $min = 0;
293                    $max = $valueCount - 1;
294                    do {
295                            $mid = $min + ( ( $max - $min ) >> 1 );
296                            $item = call_user_func( $valueCallback, $mid );
297                            $comparison = call_user_func(
$comparisonCallback, $target, $item );
298                            if ( $comparison > 0 ) {
299                                    $min = $mid;
300                            } elseif ( $comparison == 0 ) {
301                                    $min = $mid;
302                                    break;
303                            } else {
304                                    $max = $mid;
305                            }
306                    } while ( $min < $max - 1 );
307    
308                    if ( $min == 0 && $max == 0 && $comparison > 0 ) {
309                            // Before the first item
310                            return false;
311                    } else {
312                            return $min;
313                    }
314            }

The authout should have better read how TESTED binary searches are implemented.
This should really be:

    function findLowerBound( $valueCallback, $valueCount, $comparisonCallback,
$target ) {
        $min = 0;
        $max = $valueCount - 1;
        while ( $min <= $max ) {
            $mid = ($min + $max) >> 1;
            $item = call_user_func( $valueCallback, $mid );
            $comparison = call_user_func( $comparisonCallback, $item, $target
);
            if ( $comparison < 0 ) {
                $min = $mid + 1;
            } elseif ( $comparison > 0 ) {
                $max = $mid - 1;
            } else {
                return $mid; // or: $max = $mid; break;
            }
        }
        // $target not found, now $max < min (more exactly, $max = $min - 1).
        // $target is to insert between them: $max is the highest lower bound.

        if ( $max < 0 ) {
            // Before the first item
            return false; // Don't test with a simple if !
        } else {
            return $max; // May return 0 (lower bound on first item)
        }
    }

Note that the final test to return false is unnecessary and causes confusion on
usage (if you had written this in C/C++ instead of PHP, you would not be able
to make the distinction between false (no lower bound not found) and 0 (valid
lower bound). Your code should not depend on those PHP hacks, which consists in
returning values of distinct types (but with PHP's implicit promotion on basic
types, it's really dangerous do do that) !

It's just simpler to let it return -1 (the content already in $max when we are
before the first item), and then let the caller test the negative sign of the
result, instead of comparing it then with false.

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to