Hey! I mentioned this at the meeting tonight and thought I'd share it. I'm
also wondering if anyone here has thoughts on how to script this to make it
a little more systematic.

My project was to improve the diversity of photos on career articles, since
NPOV is slightly ambiguous there and we know representation has an impact
on kids.

The basic strategy was a Wikidata query on… jobs? careers? This was then
joined with pageview data, so that I could prioritize the pages by traffic.
Someone on Twitter helped me find the right Wikidata items and construct
the query; sadly I can't find it in my notes, though. (I’ve been poking at
building a new one with the help of ChatGPT but haven’t had much time for
it.) The output was a CSV that I then jammed into Google Sheets to track
progress, but presumably it wouldn't be that hard to regenerate the list
dynamically (and extend it beyond enwiki).
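
For anyone who wants to script this, here is a minimal sketch of the
query-plus-pageviews step in Python. It is not the original query (that one
is lost). It assumes that "instance of profession" (P31 = Q28640) is a
reasonable starting class, which may well be too narrow, and it uses the
public Wikidata Query Service and the Wikimedia pageviews REST API. The
function names, the date range, and the User-Agent string are placeholders
made up for illustration.

```python
import csv
import io
import json
import urllib.error
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"
# Placeholder: Wikimedia asks for a descriptive User-Agent with contact info.
USER_AGENT = "career-photo-sketch/0.1 (contact: you@example.org)"

# Assumption: Q28640 ("profession") is the class to start from; the original
# query may have used different items.
OCCUPATION_QUERY = """
SELECT ?item ?title WHERE {
  ?item wdt:P31 wd:Q28640 .
  ?article schema:about ?item ;
           schema:isPartOf <https://en.wikipedia.org/> ;
           schema:name ?title .
}
"""


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    req = urllib.request.Request(
        url, headers={"User-Agent": USER_AGENT, "Accept": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def sparql_url(query):
    """Build the query-service URL for a SPARQL query, asking for JSON."""
    return SPARQL_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )


def parse_titles(sparql_json):
    """Turn SPARQL JSON results into (QID, enwiki title) pairs."""
    return [
        (b["item"]["value"].rsplit("/", 1)[-1], b["title"]["value"])
        for b in sparql_json["results"]["bindings"]
    ]


def pageviews_url(title, start, end, project="en.wikipedia"):
    """Build the per-article pageviews URL (monthly, human traffic only)."""
    article = urllib.parse.quote(title.replace(" ", "_"), safe="")
    return f"{PAGEVIEWS_BASE}/{project}/all-access/user/{article}/monthly/{start}/{end}"


def total_views(pageviews_json):
    """Sum the views across all returned time buckets."""
    return sum(item["views"] for item in pageviews_json.get("items", []))


def rank_by_views(rows):
    """Sort (qid, title, views) rows with the highest-traffic pages first."""
    return sorted(rows, key=lambda r: r[2], reverse=True)


def to_csv(rows):
    """Render rows as CSV text, ready to import into a spreadsheet."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["qid", "title", "views"])
    writer.writerows(rows)
    return buf.getvalue()


def build_list(start="2024010100", end="2024013100"):
    """Run the whole pipeline: query, fetch pageviews per title, rank, export."""
    rows = []
    for qid, title in parse_titles(fetch_json(sparql_url(OCCUPATION_QUERY))):
        try:
            views = total_views(fetch_json(pageviews_url(title, start, end)))
        except urllib.error.HTTPError:
            views = 0  # the API returns an error when a page has no data
        rows.append((qid, title, views))
    return to_csv(rank_by_views(rows))
```

Calling build_list() hits the network once per article, so a real run would
probably want a short delay between requests. Extending beyond enwiki would
mean changing both the schema:isPartOf URL in the query and the project
argument passed to pageviews_url.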

I then simply did a lot of Flickr, Internet Archive, and US government
searches to find better photos - not just women, also geographic/racial
diversity. Some were pretty easy (especially where the US government has
many people in the named career role) but others were harder. As a general
matter, I didn't start with Commons; I mostly assumed I had to look off
Commons first and then bring the images to Commons, though that wasn't
always true.

Some example edits:

   - adding women, an African, and an Asian to “Presidents”:
   https://en.wikipedia.org/w/index.php?title=President_(government_title)&diff=prev&oldid=841458719
   - add a woman to “Sommelier”:
   https://en.wikipedia.org/w/index.php?title=Sommelier&diff=prev&oldid=842546650
   - add an African man and Mexican group to “Chef”:
   https://en.wikipedia.org/w/index.php?title=Chef&diff=prev&oldid=842550576
   - add a gender-diverse photo and black man to “System Administrator”; if
   I recall correctly the black man was reverted but I didn’t fight it too
   hard:
   https://en.wikipedia.org/w/index.php?title=System_administrator&diff=prev&oldid=840728240

I seem to recall that the attempt to diversify “Lawyer” was reverted, but
most stuck at least in the short term.

Now that Wikidata has matured, and there are maybe more photos out there,
it'd be interesting to turn this into something more structured. E.g.,
there are obviously problems with relying on LLMs to do gender
identification of photos, but maybe they'd work as a first pass to identify
the most problematic pages?

Anyway, throwing that out into the void-
Luis
