chelsyx added a comment.

Status of the tasks in this ticket:

  • Search hits based on which element the search is hitting: file name vs. description vs. category
    • This is currently not feasible. A possible solution is described in T177353#3716344, and we would need help from the search backend team.
  • "Unfindable" images metrics: lack of categorization, unhelpful file name, no description (or poor description)
    • Categories: The number of files in a "needing categories" category, along with its breakdown, is shown in T177353#3743257. We have a query that counts files by number of categories, category type (hidden vs. not hidden) and media type, but we are running into problems when running it against the MySQL database. A possible solution is available, but it would take some time.
    • Description: We could use advanced search and/or parse the page content with Hive (using an experimental table set up by Analytics), but it would take some time.
    • File name: We could do this with machine learning as described in T177353#3712897, but it would take some time to train and tune the model.
  • Investigate file annotations and whether any tracking (logging) of them is available
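
To make the category breakdown mentioned above concrete, here is a minimal sketch of that kind of count: files grouped by media type, category type (hidden or not) and number of categories. It is not the query we actually have. It runs on an in-memory SQLite database, and the table names (files, file_categories), the column names and the sample rows are all hypothetical stand-ins rather than the real MediaWiki schema or real data.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical, simplified tables (not the real MediaWiki schema).
        CREATE TABLE files (file_id INTEGER PRIMARY KEY, media_type TEXT);
        CREATE TABLE file_categories (file_id INTEGER, category TEXT, hidden INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO files VALUES (?, ?)",
        [(1, "BITMAP"), (2, "BITMAP"), (3, "AUDIO")],
    )
    conn.executemany(
        "INSERT INTO file_categories VALUES (?, ?, ?)",
        [
            (1, "Cats", 0),
            (1, "Media needing categories", 1),  # hidden maintenance category
            (2, "Dogs", 0),
            (2, "Pets", 0),
            # file 3 has no categories at all
        ],
    )
    return conn


def category_breakdown(conn):
    """Return rows of (media_type, hidden, n_categories, n_files)."""
    query = """
    WITH per_file AS (
        -- One row per file and category type, including zero counts.
        SELECT f.file_id, f.media_type, t.hidden,
               COUNT(c.category) AS n_categories
        FROM files f
        CROSS JOIN (SELECT 0 AS hidden UNION ALL SELECT 1) t
        LEFT JOIN file_categories c
               ON c.file_id = f.file_id AND c.hidden = t.hidden
        GROUP BY f.file_id, f.media_type, t.hidden
    )
    SELECT media_type, hidden, n_categories, COUNT(*) AS n_files
    FROM per_file
    GROUP BY media_type, hidden, n_categories
    ORDER BY media_type, hidden, n_categories
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in category_breakdown(build_demo_db()):
        print(row)
```

The cross join against the two category types keeps files with zero categories of a given type in the result, which matters here because uncategorized files are exactly the ones we want to count.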

Given the difficulties described above, @debt and I have decided to move this ticket to the backlog and work on other SDoC metrics first.


TASK DETAIL
https://phabricator.wikimedia.org/T177353

To: chelsyx
Cc: EBernhardson, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, Lahi, PDrouin-WMF, E1presidente, GoranSMilovanovic, QZanden, EBjune, Acer, Avner, Gehel, FloNight, Susannaanas, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331