I am still trying to catch up with the whole discussion and to distill the results, both here and on the wiki.
In the meantime, I have tried to create a prototype of how a complex model can still be entered in a simple fashion. A simple demo can be found here:

<http://simia.net/valueparser/>

The prototype is not internationalized (no i18n). The user has to enter only the value, in a hopefully intuitive way (try it out), and the full interpretation is displayed (that, admittedly, is not intuitive).

Cheers,
Denny

2012/12/20 <[email protected]>

> (Proposal 3, modified)
> * value (xsd:double or xsd:decimal)
> * unit (a wikidata item)
> * totalDigits (xsd:smallint)
> * fractionDigits (xsd:smallint)
> * originalUnit (a wikidata item)
> * originalUnitPrefix (a wikidata item)
>
> JMc: I rearranged the list a bit and suggested simpler naming.
>
> JMc: Is originalUnitPrefix not directly derived from originalUnit?
>
> JMc: It may be more efficient to store the original value than to
> reconstruct it. It may even be better to store the original value
> somewhere else entirely, earlier in the process, e.g. within the
> context that you indicate would be worthwhile to capture, because I
> wouldn't expect a lot of retrievals, but you certainly anticipate
> usage patterns better than I do.
>
> How about just:
>
> Datatype: .number (Proposal 4)
> -----------------------------------------
> :value (xsd:double or xsd:decimal)
> :unit (a wikidata item)
> :totalDigits (xsd:smallint)
> :fractionDigits (xsd:smallint)
> :original (a wikidata item that is a number object)
>
> On 20.12.2012 03:08, Gregor Hagedorn wrote:
>
> On 20 December 2012 02:20, <[email protected]> wrote:
>
> For me the question is how to name the precision information. Do not the
> XSD facets "totalDigits" and "fractionDigits" work well enough? I mean
>
> Yes, that would be one way of modeling it. And I agree with you that,
> although the xsd attributes were originally devised for datatypes,
> there is nothing wrong with re-using them for quantities and
> measurements.
>
> So one way of expressing a measurement with significant digits is:
>
> (Proposal 1)
> * normalizedValue
> * totalDigits
> * fractionDigits
> * originalUnit
> * normalizedUnit
>
> To recover the original information (e.g. that the original value was
> in feet with a given number of significant digits) the software must
> convert normalizedUnit to originalUnit, scale to totalDigits with
> fractionDigits, calculate the remaining powers of ten, and use some
> information, which must be stored together with each unit, on whether
> the result should then be expressed using an SI unit prefix (exa,
> tera, giga, mega, kilo, hecto, deka, centi, etc.). Some units use
> them, others do not, and some units use only some of them. Hectoliter
> is common; hectometer would be very odd. This is slightly complicated
> by the fact that for some units prefix usage in lay topics differs
> from scientific use.
>
> If all numbers were expressed ONLY as total digits with fraction
> digits and unit prefix, i.e. with no power-of-ten exponent, the above
> would be sufficiently complete. However, without additional
> information it does not allow the following entry to be recovered:
>
> 100,230 * 10^3 tons
> (value 1.0023e8, 6 total, 3 fractional digits, original unit tons,
> normalized unit gram)
>
> I had therefore made (on the wiki) the proposal to express it as:
>
> (Proposal 2)
> * normalizedValue
> * significantDigits (= and I am happy with totalDigits instead)
> * originalUnit
> * originalUnitPrefix
> * normalizedUnit
>
> However, I see now that the analysis was wrong: it does need
> fractionDigits in addition to totalDigits, or else a similar problem
> may occur, i.e. the distribution of the total order of magnitude of
> the number among non-fractional digits, fractional digits, powers of
> 10, and powers of 10 expressed through SI prefixes is still
> ambiguous.
>
> So the minimal representation seems to be:
>
> (Proposal 3)
> * normalizedValue (xsd:double or xsd:decimal)
> * totalDigits (xsd:smallint)
> * fractionDigits (xsd:smallint)
> * originalUnit (a wikidata item)
> * originalUnitPrefix (a wikidata item)
> * normalizedUnit (a wikidata item)
>
> Adding the originalUnitPrefix has the advantage that it gathers
> knowledge from users and data creators or resources about which unit
> prefix is appropriate in a given context.
>
> I am very critical of the current Wikidata plan to solve this problem
> by heuristics; I do not yet see a data set that sufficiently tests
> the heuristics. Gathering information from the data entered and
> creating a formatting heuristics module over the coming years
> (instead of weeks) will be valuable for reformatting. Proposal 3
> allows this information to be gathered.
>
> Gregor
>
> Note 1: The question of other means to express accuracy or precision,
> e.g. by error margins, statistical measures of spread such as
> variance, confidence intervals, percentiles, min/max etc. is not yet
> covered.
>
> Given the present discussion, this should probably be agreed upon
> separately.
>
> Note 2: Wikipedia infoboxes may want to override it; this is for
> data entry, review, curation, and a default display where no other
> is defined.
>
> _______________________________________________
> Wikidata-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l

--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
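For illustration, the recovery step discussed for Proposal 3 (convert to the original unit, strip the stored prefix, then split the remaining magnitude into a mantissa and a power of ten) might be sketched in Python as follows. This is only a sketch under stated assumptions, not the Wikidata implementation: the `Quantity` class, the `recover` function, and both lookup tables are hypothetical names; "tonne" is assumed to be the metric tonne (10^6 g); and the figures are chosen for illustration rather than taken from the thread.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical conversion table (grams per unit); assumes the metric tonne.
GRAMS_PER_UNIT = {"gram": Decimal(1), "tonne": Decimal(10) ** 6}

# Hypothetical subset of SI prefixes; "" means the entry used no prefix.
SI_PREFIX_EXPONENT = {"": 0, "kilo": 3, "mega": 6}


@dataclass(frozen=True)
class Quantity:
    """Field names follow Proposal 3; plain strings stand in for items."""
    normalized_value: Decimal
    total_digits: int
    fraction_digits: int
    original_unit: str
    original_unit_prefix: str
    normalized_unit: str


def recover(q: Quantity) -> tuple[Decimal, int]:
    """Return (mantissa, power_of_ten) of the entry as originally typed."""
    # 1. Convert from the normalized unit back to the original unit.
    in_original = (q.normalized_value * GRAMS_PER_UNIT[q.normalized_unit]
                   / GRAMS_PER_UNIT[q.original_unit])
    # 2. Remove the magnitude carried by the stored SI prefix.
    in_prefixed = in_original.scaleb(-SI_PREFIX_EXPONENT[q.original_unit_prefix])
    # 3. The mantissa has (total - fraction) digits before the decimal point;
    #    whatever magnitude is left over becomes the power of ten.
    integer_digits = q.total_digits - q.fraction_digits
    exponent = in_prefixed.adjusted() - (integer_digits - 1)
    step = Decimal(1).scaleb(-q.fraction_digits)
    mantissa = in_prefixed.scaleb(-exponent).quantize(step)
    return mantissa, exponent


plain = Quantity(Decimal("1.0023E11"), 6, 3, "tonne", "", "gram")
kilo = Quantity(Decimal("1.0023E11"), 6, 3, "tonne", "kilo", "gram")
print(recover(plain))  # (Decimal('100.230'), 3)
print(recover(kilo))   # (Decimal('100.230'), 0)
```

With an empty prefix the sketch yields 100.230 x 10^3 tonne; with the prefix "kilo" the same normalized value yields 100.230 kilotonne. The two records differ only in originalUnitPrefix, which is why that field has to be stored rather than guessed.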
