> More specifically, at OSM that's the only Q-numbers people are aware of.
I would like to share my use case (sorry if it is sometimes off-topic).

I am:
- a member of Wikimédia Magyarország Egyesület (Wikimedia Hungary)
- an OSM meetup organizer
- in my mind: 'Q' == Wikidata ; 'Q' == Quality (but this is a false association)
- experienced in data warehousing / relational databases

For me the Q/P prefix is like https://en.wikipedia.org/wiki/Hungarian_notation
*"Hungarian notation aims to remedy this by providing the programmer with explicit knowledge of each variable's data type."*

But now I am not sure:
- What is the real meaning of the Q/P prefix -> Wikidata or Wikibase?

I am involved in some open geodata projects:
#1. adding Wikidata ID concordances to Natural Earth (this is my work)
    https://www.naturalearthdata.com/blog/miscellaneous/natural-earth-v4-1-0-release-notes/
#2. adding Wikidata ID concordances to https://whosonfirst.org/ (Who's On First is a gazetteer of places.)
#3. OSM

At first I tried SPARQL + the Wikidata Query Service. My experience:
- more and more data (like: Q486972, human settlement) -> more timeouts in my complex geo queries (a lot of farms were imported in the Netherlands area, so I have to limit the search radius; ...)
- the data changes all the time, so it is hard to write and validate complex program code.

After a few months, I learned that for heavy data users the Wikidata Query Service is sometimes not perfect (but it is good for light queries!).

So now I am loading the "Wikidata JSON dump" into a Postgres/PostGIS database, and I am writing complex code in SQL (Jaro-Winkler distance, geo distance, detecting Cebuano imports, ranking multiple candidates for matching). And finally I can control the performance of the system (no timeouts) and I have reproducible results.

For example, here is my simple SQL example code. You can see a lot of P/Q codes inside, and as you can expect, I now know a lot of Q/P codes by heart!

    select wd_id
          ,wd_label
          ,get_wdcqv_globecoordinate(data,'P625','P518','Q1233637') as river_mouth
          ,get_wdcqv_globecoordinate(data,'P625','P518','Q7376362') as river_source
    from wd.wdx
    where wd_id='Q626';

And now the "Natural Earth" tables look like this (relational database):

+-------------+------------+-----------+
| name        | wikidataid | iata_code |
+-------------+------------+-----------+
| Birsa Munda | Q598231    | IXR       |
| Barnaul     | Q1858312   | BAX       |
| Bareilly    | Q2788745   |           |
+-------------+------------+-----------+

This is my current workflow.

But my real nightmare will start if other databases start using the Q/P prefix. For example, if other airport-related databases start using Wikibase with Q codes:
- http://ourairports.com/
- https://www.flightradar24.com/data/airports
- https://www.airnav.com/airports/

So every airport will have at least 4 different Q codes! And in the future, I will have to check errors in this spreadsheet (and sometimes I don't see the header):

+-------------+------------+-----------+-------------+-----------+-----------+
| name        | wikidataid | iata_code | ourairports | flightR24 | AirNav    |
+-------------+------------+-----------+-------------+-----------+-----------+
| Birsa Munda | Q598231    | IXR       | Q325324     | Q973      | Q1        |
| Barnaul     | Q1858312   | BAX       | Q42         | Q1        | Q8312     |
| Bareilly    | Q2788745   |           | Q1          | Q31       | Q45       |
+-------------+------------+-----------+-------------+-----------+-----------+

Q1 - everywhere - with different meanings.

And what if some users want to add the new airport IDs back to Wikidata (linking databases)? Why not? So in the future, if I check https://www.wikidata.org/wiki/Q598231 I will see a lot of different Q codes:

    Ourairports  Q325324
    FlightR24    Q973
    AirNav       Q1

And sometimes it will be very hard to communicate to new contributors that Q1(AirNav) =/= Q1(Wikidata).

If I see any database/spreadsheet, and I see a Q code, my current expectation is that this is a Wikidata code. :)
Just check: https://github.com/search?q=Q28+hungary&type=Code

So my current opinion:
- please don't use Q/P prefixes in any new/other databases!
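
The collision described above can be sketched in a few lines of Python. This is only an illustration: the rows are the made-up values from the spreadsheet in this mail, and the short prefixes other than "wd" (oa, fr24, airnav) are assumptions invented for the example, not real namespaces. It shows that a bare Q1 is ambiguous across the columns, and that the ambiguity goes away once every code carries a source prefix.

```python
from collections import defaultdict

# Rows copied from the hypothetical multi-source spreadsheet in this mail.
# The ourairports / flightR24 / AirNav codes are invented, as in the mail.
ROWS = [
    {"name": "Birsa Munda", "wikidataid": "Q598231", "ourairports": "Q325324",
     "flightR24": "Q973", "AirNav": "Q1"},
    {"name": "Barnaul", "wikidataid": "Q1858312", "ourairports": "Q42",
     "flightR24": "Q1", "AirNav": "Q8312"},
    {"name": "Bareilly", "wikidataid": "Q2788745", "ourairports": "Q1",
     "flightR24": "Q31", "AirNav": "Q45"},
]

# Assumed column -> prefix mapping; only "wd" appears in the mail itself.
PREFIXES = {
    "wikidataid": "wd",
    "ourairports": "oa",
    "flightR24": "fr24",
    "AirNav": "airnav",
}


def ambiguous_codes(rows):
    """Return the codes that occur in more than one ID column.

    The result maps each such code to the set of (column, airport name)
    pairs in which it was seen.
    """
    seen = defaultdict(set)
    for row in rows:
        for column in PREFIXES:
            seen[row[column]].add((column, row["name"]))
    return {
        code: uses
        for code, uses in seen.items()
        if len({column for column, _ in uses}) > 1
    }


def qualify(rows):
    """Return copies of the rows with every ID prefixed by its source."""
    qualified = []
    for row in rows:
        new_row = dict(row)
        for column, prefix in PREFIXES.items():
            new_row[column] = f"{prefix}:{row[column]}"
        qualified.append(new_row)
    return qualified


if __name__ == "__main__":
    for code, uses in ambiguous_codes(ROWS).items():
        print(code, sorted(uses))
    print(ambiguous_codes(qualify(ROWS)))
```

With bare codes, Q1 is reported for three different columns and three different airports. After qualify() the same check finds nothing, at the cost of rewriting every stored ID.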
For me, unlearning a lot of Q/P values is hard, so the more experience I have with the Wikidata data model, the less I would like to use any other Wikibase systems with similar Q/P prefixes.

My other pain point is the "Wikidata JSON dump". A little more information would be a big help for me.

For detecting the data quality of items:
- last modification DateTime
- last modification user type ( anonym_user, new_user, experienced_user, bot )
- edit counts by user type, for example: { anonym_user=2 , new_user=0 , experienced_user=0, bot=15 }

Info about the Wikidata life cycle:
- Wikidata redirections / deletions ( now: only in the .ttl files )

I know I am not a typical user, and my problems are not a priority yet.

imho: Integrating Wikidata IDs into other databases has already started ( OSM, Natural Earth, Who's On First, ... ) and some guidelines/support are needed for these cases, before it is too late. Probably the current practice ( OSM, Natural Earth, Who's On First, ... ) is not optimal.

A few months ago I learned an extremely painful lesson:
https://phabricator.wikimedia.org/T202676#4533486

quote>>>
*- "Q" does not mean "wikidata.org". It means "item" and is used by all Wikibase installations so far.*
*- Retroactively "reserving" the letter "Q" to be exclusively used by wikidata.org can't work. It was never meant to be like this, and there is no mechanism for this.*
*- "Q" only means "wikidata.org" to users who know about wikidata.org. These users should not have a problem understanding that the moment an OSM Wikibase installation exists, "osm:Q1" refers to this installation.*
<<<quote

So now I am totally confused. Probably my current practice is a "bad practice"?
:(

And should the "Natural Earth" Wikidata integration add a "wd:" prefix everywhere? But maybe it is too late to change.

+-------------+---------------+-----------+
| name        | wikidataid    | iata_code |
+-------------+---------------+-----------+
| Birsa Munda | wd:Q598231    | IXR       |
| Barnaul     | wd:Q1858312   | BAX       |
| Bareilly    | wd:Q2788745   |           |
+-------------+---------------+-----------+

This is my retrospective, thank you for reading.

best,
Imre

Yuri Astrakhan <[email protected]> wrote (on Thu, Nov 29, 2018, 7:17):

> On Thu, Nov 29, 2018 at 12:51 AM Federico Leva (Nemo) <[email protected]>
> wrote:
>
>> Yuri Astrakhan, 29/11/18 04:14:
>> > The "Q" prefix has a strong identity in itself. Anyone will instantly
>> > say - yes, it's a Wikidata identifier
>>
>> But that's because most people only know one Wikibase installation, not
>> the other way around.
>>
>
> Of course! More specifically, at OSM that's the only Q-numbers people are
> aware of. All other ID systems do not have nearly the same level of
> recognition. It would be silly to wait for government agencies to switch
> to the Q-numbers too, right? Or to wait for 5-10 years until (and IF!) Q
> numbers become more common at other projects that are large enough to
> become well known, and use that potential future as a justification to not
> use a much more convenient system for the next 10 years. The cost of that
> 10 years of "wait and see" is a significant user confusion.
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
