Smalyshev added a comment.

> The WDTK RDF exports are generated based on the original specification. There
> is no technical issue with this and it does not block development to do just
> this.

If by "original specification" you mean the assumption that "all data is proleptic Gregorian", then it does not match the current data. I.e., if I just make the code assume that, it will generate a whole lot of broken dates which will not be interpreted properly by the query engine. In fact, almost all Julian dates will be wrong, and many others will be broken too. I'm not sure how useful it would be to take this road - why have broken data in our dump?

> If not, then better wait until Lydia has a conclusion for what to do with
> dates, rather than implementing your point of view without consensus.

I'm not sure how that's better. If any decision is made, we can always change the code, but just sitting on our hands and doing nothing doesn't look like a good idea.

> Re deep value model: the core of the issue is that you propose to represent
> dates as the "original" string. Denny and I have clarified that we don't find
> this an acceptable representation for dates.

OK, but what I'm still missing is what you would consider an acceptable representation that is able to represent the current data. If we have a date of 0000-02-31 in the data, what do you propose the RDF data should contain? What if it's marked as a Julian date - what should the data contain? What if it is marked with a calendar that is neither Gregorian nor Julian?

> There is no upgrade path from this implementation to the one we actually want.

Why not? If the format changes, you can update your data pretty easily by removing the old value nodes/triples and replacing them with new ones. That is provided somebody actually uses our beta data and gets deep enough into it for this to become a problem in the time it takes us to make a decision. Which, if that is going to take such a long time, is yet another argument for not blocking on it.

> Thus we can as well work on the hypothesis that dates are in ISO 8601:2000 as
> originally intended.

I understand that neither BlazeGraph nor Virtuoso actually interprets the dates as ISO 8601:2000.
We need them to understand our dates. I'm not sure how you propose to solve this. Or am I mistaken in interpreting Jan's conclusions, and they are ISO 8601:2000?

@Lydia_Pintscher could you clarify what you mean by "those dates"? We want to represent all dates, I think, so are you proposing to just ignore the triples with "weird" dates? That would mean the data would look as if these dates do not exist, which may confuse some queries (e.g. a person with no date of death is considered alive, but that is not the same as somebody having a date of death of April 31st, 4 BCE).

TASK DETAIL
  https://phabricator.wikimedia.org/T94064
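
To make the Julian versus proleptic Gregorian mismatch concrete, here is a minimal Python sketch. It is not the exporter code and not how BlazeGraph or Virtuoso handle dates; the function names (`is_valid_date`, `julian_to_gregorian`) are hypothetical, and it assumes astronomical year numbering (year 0 exists). It converts via the Julian Day Number using the standard integer formulas, and shows that a stored day like February 31 is invalid in either calendar, so no conversion can rescue it.

```python
def days_in_month(year, month, calendar):
    """Length of a month under the given calendar's leap-year rule."""
    if calendar == "julian":
        leap = year % 4 == 0
    elif calendar == "gregorian":
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    else:
        # Calendars other than these two are out of scope for this sketch.
        raise ValueError("unsupported calendar: " + calendar)
    lengths = [31, 29 if leap else 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    return lengths[month - 1]


def is_valid_date(year, month, day, calendar):
    """True if (year, month, day) names a real day in the given calendar."""
    if not 1 <= month <= 12:
        return False
    return 1 <= day <= days_in_month(year, month, calendar)


def julian_to_jdn(year, month, day):
    """Julian calendar date -> Julian Day Number (integer arithmetic)."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_gregorian(jdn):
    """Julian Day Number -> proleptic Gregorian (year, month, day)."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097
    c = a - 146097 * b // 4
    d = (4 * c + 3) // 1461
    e = c - 1461 * d // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


def julian_to_gregorian(year, month, day):
    """Convert a valid Julian calendar date to proleptic Gregorian."""
    if not is_valid_date(year, month, day, "julian"):
        raise ValueError("not a valid Julian calendar date")
    return jdn_to_gregorian(julian_to_jdn(year, month, day))


print(julian_to_gregorian(1582, 10, 5))         # (1582, 10, 15)
print(is_valid_date(1900, 2, 29, "julian"))     # True
print(is_valid_date(1900, 2, 29, "gregorian"))  # False
print(is_valid_date(0, 2, 31, "gregorian"))     # False
```

The point of the sketch is that a date marked as Julian cannot simply be relabelled as proleptic Gregorian: the same day has different year-month-day values in the two calendars, and some stored values are not valid days in either.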
