https://bugzilla.wikimedia.org/show_bug.cgi?id=41380

       Web browser: ---
             Bug #: 41380
           Summary: Corrupt images should be detected and reported - by
                    humans or automatic script.
           Product: Wikimedia
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: Media storage
        AssignedTo: [email protected]
        ReportedBy: [email protected]
                CC: [email protected], [email protected],
                    [email protected]
            Blocks: 41371
    Classification: Unclassified
   Mobile Platform: ---


We have reports on Wikimedia Commons that thumbnails can't be generated because
the source images are corrupt.

* * *

[ Analysis ]

The problem is that these images really do seem to be corrupted.

e.g. [[Commons:File:Augustinusbishop.gif]]

/home/dereckson ] fetch
https://upload.wikimedia.org/wikipedia/commons/7/7b/Augustinusbishop.gif
Augustinusbishop.gif                          100% of  346 kB  889 kBps
/home/dereckson ] mogrify Augustinusbishop.gif -resize 200x300
mogrify: corrupt image `Augustinusbishop.gif' @ error/gif.c/ReadGIFImage/1348.

* * *

[ A heavy-to-maintain solution ]

We should have a script that periodically verifies our pictures and reports any
corrupted images it detects.

PIL can detect such images with the verify method.

Here is a sample script (it works for any other format supported by PIL too):
https://bitbucket.org/denilsonsa/small_scripts/src/tip/jpeg_corrupt.py
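A minimal sketch of such a check, assuming Pillow (the maintained PIL fork) is
installed. The function names is_corrupt and find_corrupt are hypothetical and
are not taken from the script linked above. Note that verify() does not decode
the image data, so it may miss some kinds of damage; fully decoding the file
with load() would be a stricter but slower check.

```python
import sys

from PIL import Image


def is_corrupt(path):
    """Return an error description if PIL rejects the file, else None."""
    try:
        with Image.open(path) as img:
            # verify() checks file integrity without decoding pixel data.
            img.verify()
    except Exception as exc:  # PIL raises different exception types per format
        return f"{type(exc).__name__}: {exc}"
    return None


def find_corrupt(paths):
    """Return (path, error) pairs for every file that fails the check."""
    report = []
    for path in paths:
        error = is_corrupt(path)
        if error is not None:
            report.append((path, error))
    return report


if __name__ == "__main__":
    # Usage: python check_images.py file1.gif file2.jpg ...
    for path, error in find_corrupt(sys.argv[1:]):
        print(f"{path}\t{error}")
```

The report is printed as tab-separated lines so that it could be fed to
whatever reporting channel is chosen.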

The infrastructure we need should be sized to check at least 100 000 pictures
per day (about 1.16 images per second), so that if it runs continuously every
picture is verified once every 150 days.
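The arithmetic behind these figures can be sketched as follows. The collection
size is not stated in this report; the 15 million it prints is only what
100 000 pictures per day over 150 days implies. The function names are
hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86 400


def required_rate(images_per_day):
    """Images per second needed to sustain a daily quota when running nonstop."""
    return images_per_day / SECONDS_PER_DAY


def covered_images(images_per_day, cycle_days):
    """Number of images checked once each during one full cycle."""
    return images_per_day * cycle_days


if __name__ == "__main__":
    print(f"{required_rate(100_000):.2f} images/s")       # 1.16 images/s
    print(f"{covered_images(100_000, 150):,} images")     # 15,000,000 images
```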

* * *

We should evaluate whether this is needed or whether human reporting would work
better.

We also have to involve the Wikimedia Commons community to manually fix the
corrupted pictures.

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l