Re: [Wikimedia-l] Giving Commons a bigger public

Hay (Husky) Sun, 24 May 2020 13:40:48 -0700

Hey everyone,
first of all, thanks for all the compliments on my tool! I'm really
glad that people think it is of use.

I've also made a couple of updates to the tool, including adding
i18n support, category search and links to PetScan. I've posted an
update on Twitter:
https://twitter.com/hayify/status/1264647593218408449

Just a couple of general observations and thoughts about this project.
When I started development I had two main goals:

1) To showcase the beauty and richness of the media on Wikimedia Commons
There are many files on Commons that are of exceptional quality,
but unfortunately the current search results page doesn't make much
of an effort to showcase that, compared to search engines like Google
or Yandex. This is why the Structured Search results overview focuses
on big thumbnails and not on the metadata (I doubt many users are
interested in the filesize or resolution for example, which are shown
on the current Commons results page). This has been a pet peeve of
mine for many years; I even came across a Phabricator ticket from
almost five(!) years ago where I was already proposing something like
this (https://phabricator.wikimedia.org/T104565).

2) To showcase the usefulness of structured data
Currently the only way to search for structured data is by using the
'haswbstatement' filter in the search engine. I doubt many people know
this (I didn't until Maarten Dammers pointed me to it) and the
user-friendliness is pretty low, given that you need to fill in
property and item numbers by hand. The best way to find media based on
structured data would be using a SPARQL endpoint (just as on
Wikidata), but unfortunately work on that has been slow (see
https://phabricator.wikimedia.org/T141602). So this tool is basically
a stopgap to author Commons search queries using haswbstatement until
we have a proper SPARQL endpoint for SDoC.
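
To make the 'haswbstatement' syntax concrete, here is a minimal Python
sketch of how such a query string could be assembled by hand. It is an
illustration only, not the tool's actual code: the helper name
haswbstatement_query is made up, and the example pair P180=Q146
("depicts" = "house cat") is just a sample. Terms are joined with a
space, which the search engine treats as AND.

```python
import re

# Wikibase IDs: properties look like "P180", items like "Q146".
_PROPERTY_RE = re.compile(r"^P[1-9]\d*$")
_ITEM_RE = re.compile(r"^Q[1-9]\d*$")


def haswbstatement_query(statements):
    """Build a search string from (property, item) pairs.

    Hypothetical helper for illustration only. Each pair becomes one
    'haswbstatement:P...=Q...' term; terms are space-separated.
    """
    terms = []
    for prop, item in statements:
        if not _PROPERTY_RE.match(prop) or not _ITEM_RE.match(item):
            raise ValueError(f"not a valid property/item pair: {prop}={item}")
        terms.append(f"haswbstatement:{prop}={item}")
    return " ".join(terms)


# Files that depict (P180) a house cat (Q146).
print(haswbstatement_query([("P180", "Q146")]))
# prints: haswbstatement:P180=Q146
```

The resulting string can be pasted into the regular Commons search box
as-is.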

A couple of other points that might be of use:
* Structured Search queries are completely compatible with Wikimedia
Commons search engine queries. It's the exact same format. Any query
that can be done using a Commons search query can be done on
Structured Search (given that it uses the 'File' namespace). I've made
a Commons gadget you can use if you want a link from Commons search
results directly to the tool
(https://commons.wikimedia.org/wiki/User:Husky/sdsearch.js)
* I'm really pleased with the translations that have already been
made. Updating the localisations is still a manual process; I'll
change that to something more automatic in the future.
* I don't intend this to be a replacement for the regular search on
Commons; I just hope it can serve as an inspiration and as a solution
for people who want to have something more visual than the current
search UI.
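
Since the query format is the same on both sides, a query can also be
turned into a plain Commons search link. The sketch below is only an
illustration under a few assumptions: the helper name
commons_file_search_url is made up, and it relies on the standard
Special:Search URL parameters, with ns6=1 selecting the 'File'
namespace.

```python
from urllib.parse import urlencode

# Standard entry point of the Commons wiki.
COMMONS_INDEX = "https://commons.wikimedia.org/w/index.php"


def commons_file_search_url(query):
    """Return a Commons search URL for `query`, limited to files.

    Hypothetical helper for illustration; 'ns6' is assumed to select
    the File namespace (namespace number 6).
    """
    params = {"title": "Special:Search", "search": query, "ns6": "1"}
    return COMMONS_INDEX + "?" + urlencode(params)


print(commons_file_search_url("haswbstatement:P180=Q146"))
```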

I don't read this mailing list too much, so for feature requests or
bug reports it's probably best if you submit an issue on GitHub
(https://github.com/hay/wiki-tools) or reach out to me on Twitter
(@hayify) or via Wikimail.

Kind regards,
-- Hay

On Sun, May 24, 2020 at 9:20 PM Gerard Meijssen
<[email protected]> wrote:
>
> Hoi,
> I have been a professional developer for much of my working life. From what
> I know of what Hay has done, I know you are wrong depending on the approach
> taken. Building this functionality can be an iterative process, it does not
> need to be all singing all dancing from the start. At one time the WMF used
> the agile methodology and you can break up a project like this in parts,
> functional parts.
>
> * The first part is a hack to replace the current code with
> internationalised code and have it localised in Translatewiki.net.
> * You then want to build in functionality that measures its use. It also
> measures the extent to which labeling in Wikidata expands because of this
> tool. In essence this is not essential.
> * As the tool becomes more popular, it follows that the database may need
> tuning to allow for its expanded use
>
> * A next phase is for the code to be made into an extension enabling the
> native use in MediaWiki projects. This does not mean Commons, it may be in
> any language project that cares to use it. It is particularly aimed at the
> small languages (fewer than 100.000 articles).
> * Given that measurements are in place, it then follows that we learn what
> it takes to expand the usage of images. Not only but also for our projects.
> For the first time, the small languages take precedence. The primary reason
> is that for those languages there are few pictures that they find when they
> google or bing.
> * When there is an expressed demand for bigger languages < 1.000.000
> articles, we add these on a first come, first served basis.
> This is to ensure a steady growth path in the usage.
> * Once we understand the scaling issues, we can expand to Commons itself
> and any and all projects.
> * Once we consider sharing freely licensed media files a priority, we can
> speed the process up within the limits of what is technically feasible.
>
> At the same time, we keep the standalone function available. It will serve
> a subset of our potential public. This will help us understand the pent up
> demand for a service like this. When the WMF is truly "agile" in its
> development, it is a business decision what priority it gets. Much of what
> I describe has been done by us before; it is not rocket science. The first
> phase could be done within a month. Scaling up the usage and integrating it
> in existing code and projects may indeed take the best part of a year. Again,
> that is not so much a technical but much more a business consideration. As
> always, technical issues may crop up and they are refactored in an agile
> process.
> Thanks,
>       GerardM
>
> On Sun, 24 May 2020 at 20:36, Michael Peel <[email protected]> wrote:
>
> > Hi Gerard,
> >
> > I mostly agree with you. However, I disagree with this:
> >
> > > This proof of concept is largely based on existing WMF functionality so it
> > > takes very little for the Wikimedia Foundation to adopt the code, do it
> > > properly particularly for the Internationalisation.
> >
> > Turning prototype code into production code is never trivial. When you’re
> > writing a prototype, you get to skip all performance and edge case
> > concerns, and you don’t need to integrate it into existing code, you’re
> > just interested in getting something working. I hope (and expect) that the
> > WMF will make improvements to Commons’ multilingual search in the future,
> > but it’s definitely not a “very little” amount of work that needs doing,
> > it’s a year or more worth of developer time.
> >
> > Thanks,
> > Mike
> > _______________________________________________
> > Wikimedia-l mailing list, guidelines at:
> > https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
> > https://meta.wikimedia.org/wiki/Wikimedia-l
> > New messages to: [email protected]
> > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> > <mailto:[email protected]?subject=unsubscribe>
