Re: [Wikidata-l] Data values

Herman Bruyninckx Wed, 19 Dec 2012 08:10:56 -0800

On Wed, 19 Dec 2012, Denny Vrandečić wrote:

Martynas,
could you please let me know where RDF or any of the W3C standards cover
topics like units, uncertainty, and their conversion? I would be very much
interested in that.


NIST has created a standard in OWL: "QUDT - Quantities, Units, Dimensions and Data 
Types in OWL and XML":
 <http://www.qudt.org/qudt/owl/1.0.0/index.html>

I fully share Martynas' concerns: most of the problems that are being
discussed in this thread (and that are very relevant and interesting)
should not be solved with an "object oriented" approach (that is, via
properties of objects, and "inheritance") but by semantic modelling (that
is, "composition" of knowledge). For example, one single database
representation of a unit can have multiple "displays" depending on who
wants to see the unit, and in which context; the viewer and the context are
rather simple to add via semantic primitives. For example, the "Topic Map"
semantic standard would fit here very well, in my opinion:
 <http://en.wikipedia.org/wiki/Topic_map>.
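
The "one representation, many displays" idea can be sketched roughly as follows. This is only an illustration in plain Python, not Topic Maps or QUDT syntax; the names Unit, DISPLAYS and display, as well as the locale and context strings, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One stored representation of a unit (hypothetical schema)."""
    key: str  # internal identifier, e.g. "metre"


# Display labels are separate statements about the unit, keyed by
# (unit key, viewer locale, context). They are composed with the unit
# rather than inherited from it.
DISPLAYS = {
    ("metre", "en-GB", "prose"): "metre",
    ("metre", "en-US", "prose"): "meter",
    ("metre", "en-GB", "table"): "m",
    ("metre", "en-US", "table"): "m",
}


def display(unit: Unit, locale: str, context: str) -> str:
    """Return the label for this viewer and context, falling back to the key."""
    return DISPLAYS.get((unit.key, locale, context), unit.key)


metre = Unit("metre")
print(display(metre, "en-US", "prose"))
print(display(metre, "en-GB", "table"))
```

Adding a new viewer or context only adds entries to the lookup; the stored unit itself does not change.
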

Cheers,
Denny


Herman

2012/12/19 Martynas Jusevičius <marty...@graphity.org>
      Hey wikidatians,

      occasionally checking threads in this list like the current one, I get
      a mixed feeling: on one hand, it is sad to see the efforts and
      resources wasted as Wikidata tries to reinvent RDF, and now also
      triplestore design as well as XSD datatypes. What's next, WikiQL
      instead of SPARQL?

      On the other hand, it feels reassuring as I was right to predict this:
      http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00056.html
      http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00750.html

      Best,

      Martynas
      graphity.org

      On Wed, Dec 19, 2012 at 4:11 PM, Daniel Kinzler
      <daniel.kinz...@wikimedia.de> wrote:
      > On 19.12.2012 14:34, Friedrich Röhrs wrote:
      >> Hi,
      >>
      >> Sorry for my ignorance, if this is common knowledge: What is the use case for
      >> sorting millions of different measures from different objects?
      >
      > Finding all cities with more than 100000 inhabitants requires the database to
      > look through all values for the property "population" (or even all properties
      > with countable values, depending on implementation and query planning), compare
      > each value with "100000" and return those with a greater value. To speed this
      > up, an index sorted by this value would be needed.
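
The effect of a sorted index on the population query can be sketched as follows. This is a minimal in-memory stand-in for a database index, using Python's bisect module; the city names and population figures are made up for illustration.

```python
import bisect

# Hypothetical sample data: (city, population). The figures are invented.
cities = [("A", 50_000), ("B", 250_000), ("C", 100_000), ("D", 1_200_000)]

# The "index": entries sorted by the value being queried.
index = sorted((population, name) for name, population in cities)
keys = [population for population, _ in index]


def more_than(threshold):
    """Return the cities whose population is strictly greater than threshold."""
    # Binary search finds the first entry above the threshold, so no
    # value-by-value scan of the whole data set is needed.
    start = bisect.bisect_right(keys, threshold)
    return [name for _, name in index[start:]]


print(more_than(100_000))  # ['B', 'D']
```

Without the sorted keys, every value would have to be compared with the threshold, which is the full scan described above.
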
      >
      >> For cars there could be entries by the manufacturer, by some
      >> car-testing magazine, etc. I don't see how this could be adequately
      >> represented/sorted by a database-only query.
      >
      > If this cannot be done adequately on the database level, then it cannot be done
      > efficiently, which means we will not allow it. So our task is to come up with an
      > architecture that does allow this.
      >
      > (One way to allow "scripted" queries like this to run efficiently is to do this
      > in a massively parallel way, using a map/reduce framework. But that's also not
      > trivial, and would require a whole new server infrastructure.)
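
The map/reduce shape mentioned here can be sketched on a single machine as follows. A thread pool stands in for a real cluster, the partitions and figures are made up, and the function names are hypothetical; an actual framework would distribute the partitions across servers.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical data, already split into partitions as a cluster would hold it.
partitions = [
    [("A", 50_000), ("B", 250_000)],
    [("C", 100_000), ("D", 1_200_000)],
]


def map_partition(rows):
    """Map step: filter one partition independently of the others."""
    return [name for name, population in rows if population > 100_000]


def merge(accumulated, partial):
    """Reduce step: combine the partial results into one list."""
    return accumulated + partial


with ThreadPoolExecutor() as pool:
    # pool.map returns results in the same order as the input partitions.
    mapped = list(pool.map(map_partition, partitions))

result = reduce(merge, mapped, [])
print(result)  # ['B', 'D']
```
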
      >
      >> If however this is necessary, I still don't understand why it must affect the
      >> datavalue structure. If an index is necessary it could be done over a serialized
      >> representation of the value.
      >
      > "Serialized" can mean a lot of things, but an index on some data blob is only
      > useful for exact matches; it cannot be used for greater/lesser queries. We need
      > to map our values to scalar data types the database can understand directly and
      > use for indexing.
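
Why a blob index does not help with greater/lesser queries can be seen in a small example. JSON is used here only as one possible serialization and the "amount" field name is an assumption; the point is that serialized values sort by their characters, not by their numeric value.

```python
import json

values = [9, 100000, 25]

# Serialized blobs are ordered character by character, so "100000" sorts
# before "25", which sorts before "9".
blobs = [json.dumps({"amount": value}) for value in values]
by_blob = [json.loads(blob)["amount"] for blob in sorted(blobs)]

# Scalar values are ordered numerically, which is what a range query needs.
by_scalar = sorted(values)

print(by_blob)    # [100000, 25, 9]
print(by_scalar)  # [9, 25, 100000]
```
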
      >
      >> This needs to be done anyway, since the values are
      >> saved in a specific unit (which is just a Wikidata item). To compare them on a
      >> database level they must all be saved in the same unit, or some sort of
      >> procedure must be used to compare them (or am I missing something again?).
      >
      > If they measure the same dimension, they should be saved using the same unit
      > (probably the SI base unit for that dimension). Saving values using different
      > units would make it impossible to run efficient queries against these values,
      > thereby defeating one of the major reasons for Wikidata's existence. I don't see a
      > way around this.
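
Normalizing to one unit before storing could look roughly like this. The sketch covers lengths only; the conversion table, the to_si helper and the sample entries are hypothetical, and a real system would need a full unit catalogue.

```python
# Conversion factors to the SI base unit of length (metre).
TO_METRE = {"m": 1.0, "km": 1000.0, "cm": 0.01, "mi": 1609.344}


def to_si(amount, unit):
    """Convert a length to metres so that all stored values are comparable."""
    return amount * TO_METRE[unit]


# Hypothetical values as entered by editors, each in its own unit.
entered = [("bridge", 2.5, "km"), ("track", 400, "m"), ("road", 1, "mi")]

# Stored form: one scalar per value, all in metres, ready for a sorted index.
stored = sorted((to_si(amount, unit), name) for name, amount, unit in entered)

longer_than_1km = [name for metres, name in stored if metres > 1000]
print(longer_than_1km)  # ['road', 'bridge']
```

The unit that was originally entered could still be kept next to the normalized value for display purposes; only the indexed scalar has to share a unit.
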
      >
      > -- daniel
      >
      > --
      > Daniel Kinzler, Softwarearchitekt
      > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
      >
      >
      > _______________________________________________
      > Wikidata-l mailing list
      > Wikidata-l@lists.wikimedia.org
      > https://lists.wikimedia.org/mailman/listinfo/wikidata-l


--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.


--
  KU Leuven, Mechanical Engineering, Robotics Research Group
    <http://people.mech.kuleuven.be/~bruyninc> Tel: +32 16 328056
  Vice-President Research euRobotics <http://www.eu-robotics.net>
  Open RObot COntrol Software <http://www.orocos.org>
  Associate Editor JOSER <http://www.joser.org>, IJRR <http://www.ijrr.org>
