https://bugzilla.wikimedia.org/show_bug.cgi?id=17057

           Summary: Unusually high number of file SHA1 hash collisions
           Product: Wikimedia
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: Normal
         Component: General/Unknown
        AssignedTo: [email protected]
        ReportedBy: [email protected]


Using enwiki_p, I'm getting data like this:

mysql> SELECT DISTINCT enwiki_p.page.page_title, commonswiki_p.image.img_name
    -> FROM enwiki_p.image, commonswiki_p.image, enwiki_p.categorylinks, enwiki_p.page
    -> WHERE enwiki_p.image.img_sha1 = commonswiki_p.image.img_sha1
    -> AND enwiki_p.page.page_title = enwiki_p.image.img_name
    -> AND enwiki_p.categorylinks.cl_from = enwiki_p.page.page_id
    -> AND enwiki_p.categorylinks.cl_to = 'All_non-free_media'
    -> LIMIT 50;
+----------------+-----------------------------------------------+
| page_title     | img_name                                      |
+----------------+-----------------------------------------------+
| Imas360_10.jpg | +-_of_Led.svg                                 | 
| Imas360_10.jpg | 5von10.png                                    | 
| Imas360_10.jpg | Alfred_de_Musset.jpg                          | 
| Imas360_10.jpg | Amphipodredkils.jpg                           | 
| Imas360_10.jpg | Amphoe_6502.png                               | 
| Imas360_10.jpg | Aschenbecher_mit_Mechanik1.jpg                | 
| Imas360_10.jpg | Austria_1945-55.png                           | 
| Imas360_10.jpg | Bakaiku.JPG                                   | 
| Imas360_10.jpg | Bakweri_cocoyam_farmer_from_Cameroon.jpg      | 
| Imas360_10.jpg | Bartolomeu_Dias_Voyage.PNG                    | 
| Imas360_10.jpg | Benjamin_West.jpg                             | 
| Imas360_10.jpg | Blason-fr-en-Saint-Moreil.svg                 | 
| Imas360_10.jpg | Brno-Nový_Lískovec_from_Petrov_(Brno).JPG     | 
| Imas360_10.jpg | Brännkyrka_kyrka_2005-09-04nr1.jpg            | 
| Imas360_10.jpg | Bundesautobahn_113_number.svg                 | 
| Imas360_10.jpg | Clock_UT+7.png                                | 
| Imas360_10.jpg | Coat_of_Arms_of_Antigua_and_Barbuda.gif       | 
| Imas360_10.jpg | Codex_egberti_-_egbert.jpg                    | 
| Imas360_10.jpg | Cold_fingers.png                              | 
| Imas360_10.jpg | Cross.png                                     | 
| Imas360_10.jpg | Cutty_sark_October_2003.jpg                   | 
| Imas360_10.jpg | DNAn+1_C.svg                                  | 
| Imas360_10.jpg | DNAn+1_T.svg                                  | 
| Imas360_10.jpg | Dabrowskirynek.jpg                            | 
| Imas360_10.jpg | Dalmenyhouse_lighter.jpg                      | 
| Imas360_10.jpg | EtaCarinae.jpg                                | 
| Imas360_10.jpg | Europe_location_ARM.png                       | 
| Imas360_10.jpg | Five-pointed_star.svg                         | 
| Imas360_10.jpg | Flag_of_Kentucky.svg                          | 
| Imas360_10.jpg | Font_Wallace_Pt_Pasteur.jpg                   | 
| Imas360_10.jpg | GeorgeWBush.jpg                               | 
| Imas360_10.jpg | Gorillas_2609.jpg                             | 
| Imas360_10.jpg | Hallingkast.jpg                               | 
| Imas360_10.jpg | Harlekin_Columbine_Tivoli_Denmark.jpg         | 
| Imas360_10.jpg | Helicopter_rescue_sancy_takeoff.jpg           | 
| Imas360_10.jpg | Herb_Korybut.jpg                              | 
| Imas360_10.jpg | Hymenoptera_diagonal.jpg                      | 
| Imas360_10.jpg | IsleofWightmap_1945.jpg                       | 
| Imas360_10.jpg | Jarzabczy_Wierch_a2.jpg                       | 
| Imas360_10.jpg | Karte_Lage_Kanton_Uri.png                     | 
| Imas360_10.jpg | Kit_body_scga06.png                           | 
| Imas360_10.jpg | Kościół_Wniebowstąpienia_Poznań003.jpg        | 
| Imas360_10.jpg | Lilium_bulbiferum_mg-k.jpg                    | 
| Imas360_10.jpg | Macaronesia.jpg                               | 
| Imas360_10.jpg | Maisonmaton.jpg                               | 
| Imas360_10.jpg | Map_of_Scotland_within_the_United_Kingdom.png | 
| Imas360_10.jpg | Market_Square_Shopping_Centre_Geelong.jpg     | 
| Imas360_10.jpg | Mg-TableImage.svg                             | 
| Imas360_10.jpg | Michael_Boogerd.jpg                           | 
| Imas360_10.jpg | Monarch_caterpillar_and_egg.jpg               | 
+----------------+-----------------------------------------------+
50 rows in set (0.07 sec)

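According to the query, the single enwiki file Imas360_10.jpg has the same
img_sha1 as at least 50 unrelated Commons files. Genuine SHA-1 collisions
on this scale are not plausible, so this suggests that the stored hashes
are very broken.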
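
One way to narrow this down might be to count how many matches each shared
hash value accounts for. If a single value (for instance an empty or
placeholder string) produces nearly all of the matches, the rows are not
independent collisions. The following is only a minimal sketch under stated
assumptions: it uses an in-memory SQLite database as a stand-in for the
replicated tables, and the table names en_image and commons_image, the file
names, and the hash values are all made up for illustration.

```python
import sqlite3

# Hypothetical stand-ins for enwiki_p.image and commonswiki_p.image.
# The rows and hash values below are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE en_image (img_name TEXT, img_sha1 TEXT);
    CREATE TABLE commons_image (img_name TEXT, img_sha1 TEXT);
    INSERT INTO en_image VALUES
        ('Local_a.jpg', ''), ('Local_b.jpg', 'abc123');
    INSERT INTO commons_image VALUES
        ('Remote_1.png', ''), ('Remote_2.png', ''), ('Remote_3.png', 'abc123');
""")

def shared_hash_counts(conn):
    """Return (img_sha1, match_count) rows, most frequent hash first."""
    return conn.execute("""
        SELECT e.img_sha1, COUNT(*) AS matches
        FROM en_image AS e
        JOIN commons_image AS c ON e.img_sha1 = c.img_sha1
        GROUP BY e.img_sha1
        ORDER BY matches DESC, e.img_sha1
    """).fetchall()

# Print each shared hash value with the number of joined rows it produces.
for sha1, matches in shared_hash_counts(conn):
    print(repr(sha1), matches)
```

The same GROUP BY could in principle be run against the real tables, where
one dominant hash value in the result would point at bad stored data rather
than at SHA-1 itself.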

I've been told that null editing the pages can fix the hash, though that's
difficult to verify because of replication lag (replag).


