Hi,

Thank you for sharing your thoughts, Jarek :)

On Friday 12 December 2014, 03:44:54 Tuszynski, Jarek W. wrote:
> So all the files in Category:Files with no machine-readable
> license<https://commons.wikimedia.org/wiki/Category:Files_with_no_machine-r
> eadable_license> need work to be done with licenses, not files. I do not
> know what machine-readable metadata is needed but I can help with adding
> them.

Yes, many of those are tricky because there isn't necessarily a "real" license 
attached to them (example: 
https://commons.wikimedia.org/wiki/File:%22A_Basket_full_of_Wool%22_(6360159381).jpg ) 
or the license isn't specific enough.

There are similar discussions at 
https://meta.wikimedia.org/wiki/Talk:File_metadata_cleanup_drive#How_to_handle_.22and_future_versions.22_cases
and 
https://meta.wikimedia.org/wiki/Talk:File_metadata_cleanup_drive#.22Presumed_Public_domain.22
 and the best we might be able to do is to come up with a list of such cases 
and ask our wonderful lawyers how to handle them :)
 
> 2)      Your number of files missing machine-readable metadata on Commons:
> ~533,000,  seems a bit low. According to
> Special:MostTranscludedPages<https://commons.wikimedia.org/wiki/Special:Mos
> tTranscludedPages> there are 24,136,218 files with licenses ({{License
> template
> tag<https://commons.wikimedia.org/wiki/Template:License_template_tag>}}),
> and 23,452,741 files with infobox templates ({{Information}} or {{Infobox
> template
> tag<https://commons.wikimedia.org/wiki/Template:Infobox_template_tag>}},
> so I would expect 683,477 files without any infobox templates.

There are currently ~677,674 files* without any of the following templates: 

'Information','Painting', 'Blason-fr-en', 'Blason-fr-en-it', 'Blason-xx', 
'COAInformation', 'Artwork', 'Art_Photo','Photograph', 'Book', 'Map', 
'Musical_work', 'Specimen'

If this list is incomplete (it probably is) or incorrect, let me know.

*Source: https://tools.wmflabs.org/mrmetadata/commons_list.txt (warning, 18MB 
text file).

But some of those do have machine-readable metadata picked up by 
CommonsMetadata even if they don't have an infobox, which brings the number 
down to ~533,000. They may have templates we're not listing yet, or they may 
have machine-readable metadata in their EXIF data. Some of the latter are 
false positives, per https://phabricator.wikimedia.org/T73719
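
For illustration, the "no infobox template" check could be sketched roughly 
like this in Python. This is only a sketch under assumptions: the names 
INFOBOX_TEMPLATES and has_infobox are hypothetical, it works on raw wikitext, 
it ignores template redirects and aliases, and it is not necessarily how the 
mrmetadata tool or CommonsMetadata actually detects these files.

```python
import re

# Template names taken from the list above.
INFOBOX_TEMPLATES = [
    'Information', 'Painting', 'Blason-fr-en', 'Blason-fr-en-it', 'Blason-xx',
    'COAInformation', 'Artwork', 'Art_Photo', 'Photograph', 'Book', 'Map',
    'Musical_work', 'Specimen',
]


def _name_pattern(name):
    """Regex for one template name.

    Assumes the usual MediaWiki title rules: the first letter is
    case-insensitive, and spaces and underscores are interchangeable.
    """
    head = "[" + name[0].upper() + name[0].lower() + "]"
    tail = "[ _]+".join(re.escape(part) for part in name[1:].split("_"))
    return head + tail


# Matches "{{Name" followed by "|" or "}}", so that "{{Information field"
# is not mistaken for "{{Information".
_INFOBOX_RE = re.compile(
    r"\{\{\s*(?:"
    + "|".join(_name_pattern(n) for n in INFOBOX_TEMPLATES)
    + r")\s*(?:\||\}\})"
)


def has_infobox(wikitext):
    """Return True if the wikitext transcludes one of the listed templates."""
    return _INFOBOX_RE.search(wikitext) is not None


if __name__ == "__main__":
    print(has_infobox("{{Information\n|description=A basket}}"))
    print(has_infobox("A basket full of wool\n{{PD-old}}"))
```

A file for which has_infobox returns False would be a candidate for the list; 
whether CommonsMetadata still finds machine-readable metadata for it (for 
example in EXIF data) is a separate question that this sketch does not cover.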

> 3)      As I mentioned on
> Commons:Bots/Work_requests#An_example_pattern<https://commons.wikimedia.org
> /wiki/Commons:Bots/Work_requests#An_example_pattern> I would like to first
> give the original uploaders a chance to fix the files. We can do that by
> writing a standard message, which, without any threat of deletion, asks for
> help with bringing their files up to current standards. 

I'm not opposed to this in principle, but I'm not sure I see the value. We're 
not going to delete files, or change attribution, or anything like that; we're 
only going to take the existing information and put it into a template so it's 
easier to access.

My assumption is that most uploaders wouldn't care about such a change in 
formatting, and that figuring out how to do it themselves would be more work 
for them than having a few bot owners do it on a wider scale.

Is this assumption unreasonable?

> 4)      At some point I started adding such files to [[Category:Media
> missing infobox
> template<https://commons.wikimedia.org/wiki/Category:Media_missing_infobox_
> template>]] for better tracking and started sub-categorizing them into
 
> a.       Files with OTRS
> 
> b.      Files with {{information}} template which have some parsing issues
> 
> c.       Files with {{PD-Art}} which should use {{Artwork}} template and
> where the name of the uploader, upload date, and even source might not be
> relevant
 
> d.      Files using PD license, like PD-old (except PD-Author or PD-User):
> for those files too, the name of the uploader, upload date, and
> even source might not be relevant
 
> It might be easier to add infoboxes for different groups of files. For
> example Magnus'
> add_information.php<http://toolserver.org/%7Emagnus/add_information.php>
> tool does not work well for artworks. We also seem to have users that
> specialize in different subjects and it might be easier to get their
> attention with smaller groups of files of one type.

Thank you for doing this! I think these will be great starting points for 
specific bot runs :) 

-- 
Guillaume Paumier

_______________________________________________
Wikitech-ambassadors mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-ambassadors
