Smalyshev added a comment.

> The WDTK RDF exports are generated based on the original specification. There 
> is no technical issue with this and it does not block development to do just 
> this.


If by original specification you mean the assumption "all data is proleptic 
Gregorian", then it does not match the current data. I.e. if I just make the 
code assume that, it will generate a great many broken dates which will not be 
interpreted properly by the query engine. In fact, almost all Julian dates will 
be wrong, and many others will be broken too. I'm not sure how useful it would 
be to take this road - why have broken data in our dump?
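
To illustrate why treating a Julian date as proleptic Gregorian produces a wrong 
value, here is a minimal sketch (not the actual export code) that converts a 
Julian-calendar date to proleptic Gregorian via the Julian Day Number. The 
function names are hypothetical, and years use astronomical numbering 
(year 0 = 1 BCE), which is an assumption of this example.

```python
def julian_calendar_to_jdn(year, month, day):
    """Julian Day Number of a Julian-calendar date (astronomical years)."""
    a = (14 - month) // 12          # 1 for January/February, else 0
    y = year + 4800 - a             # shift so the year starts in March
    m = month + 12 * a - 3          # March = 0 ... February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_proleptic_gregorian(jdn):
    """Proleptic Gregorian (year, month, day) of a Julian Day Number."""
    f = jdn + 1401 + (((4 * jdn + 274277) // 146097) * 3) // 4 - 38
    e = 4 * f + 3
    g = (e % 1461) // 4
    h = 5 * g + 2
    day = (h % 153) // 5 + 1
    month = (h // 153 + 2) % 12 + 1
    year = e // 1461 - 4716 + (14 - month) // 12
    return (year, month, day)


def julian_to_gregorian(year, month, day):
    """Convert a Julian-calendar date to the proleptic Gregorian calendar."""
    return jdn_to_proleptic_gregorian(julian_calendar_to_jdn(year, month, day))


# The same day has different labels in the two calendars.
print(julian_to_gregorian(1582, 10, 5))   # (1582, 10, 15)
```

A Julian value that is emitted unchanged as if it were Gregorian is therefore 
off by a number of days that depends on the century.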

> If not, then better wait until Lydia has a conclusion for what to do with 
> dates, rather than implementing your point of view without consensus.


I'm not sure how that's better. Once a decision is made, we can always change 
the code, but just sitting with our hands folded and doing nothing doesn't look 
like a good idea.

> Re deep value model: the core of the issue is that you propose to represent 
> dates as the "original" string. Denny and I have clarified that we don't find 
> this an acceptable representation for dates.


OK, but what I'm still missing is what you would consider an acceptable 
representation that can handle the current data. If we have a date of 
0000-02-31 in the data, what do you propose the RDF data should contain? What 
if it's marked as a Julian date - what should the data contain then? What if it 
is marked with a calendar that is neither Gregorian nor Julian?
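
As a concrete illustration of the problem, the following minimal sketch checks 
whether a stored year/month/day can be a real day at all. The function names and 
the "gregorian"/"julian" labels are hypothetical, and both calendars are applied 
proleptically with astronomical year numbering; the data model itself may define 
these things differently.

```python
def is_leap(year, calendar):
    """Leap-year rule for the two calendars handled here."""
    if calendar == "julian":
        return year % 4 == 0
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_valid_day(year, month, day, calendar="gregorian"):
    """True if year-month-day names an existing day in the given calendar."""
    if calendar not in ("gregorian", "julian"):
        # Any other calendar model has no defined handling in this sketch.
        raise ValueError("unsupported calendar: " + calendar)
    if not 1 <= month <= 12:
        return False
    february = 29 if is_leap(year, calendar) else 28
    lengths = [31, february, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    return 1 <= day <= lengths[month - 1]


# February never has 31 days, whichever calendar the value is marked with.
print(is_valid_day(0, 2, 31, "gregorian"))  # False
print(is_valid_day(0, 2, 31, "julian"))     # False
```

A value like 0000-02-31 fails this check under either calendar, so it cannot be 
converted to a typed date without first deciding what it is supposed to mean.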

> There is no upgrade path from this implementation to the one we actually want.


Why not? If the format changes, you can update your data pretty easily by 
removing the old value nodes/triples and replacing them with new ones. And 
that is only if somebody actually uses our beta data and gets deep enough into 
it for this to become a problem in the time it takes us to make a decision. 
Which, if that is going to take so long, is yet another argument for not 
blocking on it.

> Thus we can as well work on the hypothesis that dates are in ISO 8601:2000 as 
> originally intended.


As I understand it, neither BlazeGraph nor Virtuoso actually interprets the 
dates as ISO 8601:2000. We need them to understand our dates. I'm not sure how 
you propose to solve this. Or am I mistaken in interpreting Jan's conclusions 
and they are ISO 8601:2000?

@Lydia_Pintscher could you clarify what you mean by "those dates"? We want to 
represent all dates, I think, so are you proposing to just ignore the triples 
with "weird" dates? That would mean the data would look as if these dates do 
not exist - which may confuse some queries (e.g. a person with no date of death 
is considered alive, but that's not the same as somebody having a date of death 
of April 31st, 4 BCE).


TASK DETAIL
  https://phabricator.wikimedia.org/T94064

To: Smalyshev
Cc: Lydia_Pintscher, Denny, Manybubbles, daniel, mkroetzsch, Smalyshev, 
JanZerebecki, Aklapper, jkroll, Wikidata-bugs, Jdouglas, aude, GWicke


