Jheald added a comment.
Hi Magnus,
I am intrigued by the idea of making the categorisation information directly accessible in the file's wikibase page. I presume the template hack to add a category statement on the File page would also keep the SQL tables up to date, which so many tools, as well as the category presentation infrastructure, depend on. Code would need to be written to intercept categories being added, changed or rewritten on the page, whether by humans or by tools, to make sure that the change was routed appropriately to the wikibase.
However, I don't buy the idea of 'draining' the categories as information becomes accessible by SPARQL. I think this would go down very badly with Commons. At a minimum, I think there will have to be a long period of parallel running between the category system and SPARQL-driven searches, during which the category system will need to be kept intact. Indeed, I suspect categories will continue to have some important roles even when SPARQL is fully implemented and well populated.
So, rather than removing category statements from the file items, I think it would be better to add a qualifier indicating that the categorisation entry can be accounted for by statements on the file item. It would be good if, to some extent, this could be updated by bot as categorisations are added or revised.
As you have noted above, the translation of the meaning(s) of a category into statements can be very varied. I don't know whether you would agree, but I believe it would be *extremely* helpful to store the main "machine meanings" of categories in some accessible place, where they could be easily edited and accessed by all-comers (humans and machines). (The "category combines" statement on Wikidata is a good example of how this information might be modelled.)
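To make the two ideas above concrete, here is a minimal Python sketch of how a bot might decide whether a categorisation is "accounted for" by statements on the file item. Everything in it is an assumption for illustration only: the `CategoryMeaning` class, the `accounted_for` and `annotate` helpers, and the plain-text property and value names are hypothetical and do not correspond to any existing wikibase API or data model.

```python
from dataclasses import dataclass

# Hypothetical representation: a statement is a (property, value) pair.
# Plain labels are used instead of real wikibase identifiers.
Statement = tuple[str, str]


@dataclass(frozen=True)
class CategoryMeaning:
    """The 'machine meaning' of one category, stored as a set of statements."""
    category: str
    statements: frozenset[Statement]


def accounted_for(meaning: CategoryMeaning, file_statements: set[Statement]) -> bool:
    """True if every statement implied by the category is present on the file."""
    # An empty meaning tells us nothing, so it never counts as accounted for.
    return bool(meaning.statements) and meaning.statements.issubset(file_statements)


def annotate(
    file_categories: list[str],
    file_statements: set[Statement],
    meanings: dict[str, CategoryMeaning],
) -> dict[str, bool]:
    """Map each category on a file to the value the proposed qualifier would take.

    Categories with no recorded machine meaning are marked False, so they are
    kept as ordinary categorisations rather than being treated as redundant.
    """
    result = {}
    for category in file_categories:
        meaning = meanings.get(category)
        result[category] = meaning is not None and accounted_for(meaning, file_statements)
    return result


if __name__ == "__main__":
    meanings = {
        "Cats in Paris": CategoryMeaning(
            "Cats in Paris",
            frozenset({("depicts", "cat"), ("location", "Paris")}),
        ),
    }
    file_statements = {("depicts", "cat"), ("location", "Paris"), ("creator", "Example")}
    print(annotate(["Cats in Paris", "Unsorted images"], file_statements, meanings))
```

In this sketch no category is ever removed; the check only computes a flag that a bot could write as the qualifier and recompute whenever categorisations or statements change.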
I've suggested to Sandra that by far the best way to do this would be to have a wikibase entry for each category -- it would be easily accessible, easily writable, and easily inspectable with tools we substantially already have. I think it would also be a very good platform for live-testing some of the Structured Data technology at scale -- e.g. multi-content revisions, federation, etc. -- in a known environment that does not depend on progress with the more involved designs for the file pages. I'd be very interested to have your opinion on that. I know via Sandra that the project is very wary of adding anything to the roadmap, but it seems to me it might well pay for itself as a useful test platform down the line. I'd also be curious whether you think it would add much of an additional requirement, given that all the enabling technology appears either to be in place already or to be on the main line of the project's development.
