Hi,

Thank you for sharing your thoughts, Jarek :)

On Friday 12 December 2014, 03:44:54, Tuszynski, Jarek W. wrote:
> So all the files in Category:Files with no machine-readable
> license<https://commons.wikimedia.org/wiki/Category:Files_with_no_machine-readable_license>
> need work to be done with licenses, not files. I do not
> know what machine-readable metadata is needed but I can help with adding
> them.

Yes, many of those are tricky because there isn't necessarily a "real" license attached to them (example: https://commons.wikimedia.org/wiki/File:%22A_Basket_full_of_Wool%22_(6360159381).jpg ) or the license isn't specific enough. There are similar discussions at https://meta.wikimedia.org/wiki/Talk:File_metadata_cleanup_drive#How_to_handle_.22and_future_versions.22_cases and https://meta.wikimedia.org/wiki/Talk:File_metadata_cleanup_drive#.22Presumed_Public_domain.22 and the best we might be able to do is to come up with a list of such cases and ask our wonderful lawyers how to handle them :)

> 2) Your number of files missing machine-readable metadata on Commons:
> ~533,000, seems a bit low. According to
> Special:MostTranscludedPages<https://commons.wikimedia.org/wiki/Special:MostTranscludedPages>
> there are 24,136,218 files with licenses ({{License template
> tag<https://commons.wikimedia.org/wiki/Template:License_template_tag>}}),
> and 23,452,741 files with infobox templates ({{Information}} or {{Infobox
> template tag<https://commons.wikimedia.org/wiki/Template:Infobox_template_tag>}},
> so I would expect 683,477 files without any infobox templates.

There are currently ~677,674 files* without any of the following templates:
'Information', 'Painting', 'Blason-fr-en', 'Blason-fr-en-it', 'Blason-xx', 'COAInformation', 'Artwork', 'Art_Photo', 'Photograph', 'Book', 'Map', 'Musical_work', 'Specimen'

If this list is incomplete (it probably is) or incorrect, let me know.

*Source: https://tools.wmflabs.org/mrmetadata/commons_list.txt (warning, 18MB text file).

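For anyone who wants to sanity-check the template list above against a page's wikitext, here is a minimal sketch in Python. It is not how the mrmetadata tool computes its numbers: the function name `has_infobox`, the regular expression, and the name normalization (underscores treated as spaces, first letter case-insensitive) are all illustrative assumptions, and the sketch ignores `Template:` prefixes and template redirects.

```python
import re

# Template names copied from the list in this email.
INFOBOX_TEMPLATES = [
    "Information", "Painting", "Blason-fr-en", "Blason-fr-en-it", "Blason-xx",
    "COAInformation", "Artwork", "Art_Photo", "Photograph", "Book", "Map",
    "Musical_work", "Specimen",
]


def _normalize(name: str) -> str:
    """Assumed normalization: underscores equal spaces, first letter upper-cased."""
    name = " ".join(name.replace("_", " ").split())
    return name[:1].upper() + name[1:]


_KNOWN = {_normalize(n) for n in INFOBOX_TEMPLATES}

# Captures the name in "{{Name|...", "{{Name}}" or "{{ Name }}".
_TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*(?=\||\}\})")


def has_infobox(wikitext: str) -> bool:
    """Return True if the wikitext transcludes any of the listed templates."""
    return any(_normalize(m) in _KNOWN for m in _TEMPLATE_RE.findall(wikitext))
```

A page for which `has_infobox` returns False would be a candidate for the "no infobox" list, although it may still carry machine-readable metadata from other sources.
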
But some of those do have machine-readable metadata picked up by CommonsMetadata even if they don't have an infobox, which brings the number down to ~533,000. They may have templates we're not listing yet, or they may have machine-readable metadata in their EXIF data. Some of the latter are false positives, per https://phabricator.wikimedia.org/T73719

> 3) As I mentioned on
> Commons:Bots/Work_requests#An_example_pattern<https://commons.wikimedia.org/wiki/Commons:Bots/Work_requests#An_example_pattern>
> I would like to first give the original uploaders a chance to fix the
> files. We can do that by writing a standard message, which without any
> threat of deletion, ask for help with bringing their files up to current
> standards.

I'm not opposed to this in principle, but I'm not sure I see the value. We're not going to delete files, or change attribution, or anything like that; we're only going to take the existing information and put it into a template so it's easier to access.

My assumption is that most uploaders wouldn't care about such a change in formatting, and that it would be more work for them to figure out how to do it themselves than for a few bot owners to do it on a wider scale. Is this assumption unreasonable?

> 4) At some point I started adding such files to [[Category:Media
> missing infobox
> template<https://commons.wikimedia.org/wiki/Category:Media_missing_infobox_template>]]
> for better tracking and started sub-categorizing them into
> a. Files with OTRS
>
> b. Files with {{information}} template which have some parsing issues
>
> c. Files with {{PD-Art}} which should use {{Artwork}} template and
> where the name of the uploader, upload date, and even source might not be
> relevant
> d. Files using PD license, like PD-old (except PD-Author or PD-User):
> for those files it might also the name of the uploader, upload date, and
> even source might not be relevant
> It might be easier to add infoboxes for different groups of files.
> For example Magnus'
> add_information.php<http://toolserver.org/%7Emagnus/add_information.php>
> tool does not work well for artworks. We also seem to have users that
> specialize in different subjects and it might be easier to get their
> attention with smaller groups of files of one type.

Thank you for doing this! I think these will be great starting points for specific bot runs :)

--
Guillaume Paumier

_______________________________________________
Wikitech-ambassadors mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-ambassadors
