Miriam added a comment. |
@Gilles thanks for this! Images and graphics have very different underlying image statistics: it is therefore fairly easy for a classifier to tell them a part. So it should be feasible.
If we can collect some training data, by finding one or more categories in Commons with a substantial number of diverse graphics images, I can try to quickly build a graphics VS photo classifier, by finetuning an existing image classifier (it won't be perfect, but no GPU needed ;) ) @Isaac maybe your colleague can help with this, by sharing which categories and keywords he used to create his training data?
As a side note, such a classifier can be helpful also to improve the accuracy of other image classifiers (e.g. object detectors or image quality classifiers), that are tipycally trained on photographic material and therefore fail completely when classifying non-photographic images.
We did studies in the past to quantitavely explain the importance and the nature of the difference between graphics and images: https://www.dropbox.com/s/y97h8kjx84hbrzk/p242-redi.pdf?dl=0
Cc: Mholloway, PDrouin-WMF, Krenair, d.astrikov, JoeWalsh, Nirzar, dcausse, fgiunchedi, JAllemandou, leila, Capt_Swing, mpopov, Nuria, DarTar, Halfak, Gilles, EBernhardson, dr0ptp4kt, Harej, MusikAnimal, Abit, elukey, diego, Cparle, Ramsey-WMF, Miriam, Isaac, Nandana, JKSTNK, Akovalyov, Lahi, Gq86, E1presidente, Anooprao, SandraF_WMF, GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, V4switch, LawExplorer, Avner, Silverfish, _jensen, Susannaanas, Wong128hk, Jane023, Wikidata-bugs, Base, matthiasmullie, aude, Ricordisamoa, Wesalius, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter, Matanya, Mbch331, jeremyb
_______________________________________________ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs