Hi,

On Fri, Dec 21, 2012 at 6:14 PM, Denny Vrandečić
<[email protected]> wrote:
>
>
> Friedrich, the term "query answering" simply means the ability to answer 
> queries against the database in Phase 3, e.g. the list of cities located in 
> Ghana with a population over 25,000 ordered by population.
>
> A query system that deals well with intervals -- I would need a pointer for 
> that. For now I was always assuming to use a single value internally to 
> answer such queries. If the value is 90+-20 then the query >100? would not 
> contain that result. Sucks, but I don't know of any better system.
>

So, taking the current data model and our Eiffel Tower example:
We have the entity "Eiffel Tower".
We want to represent "The Eiffel Tower is 324 meters high", so we
would create the statement
(
Assumptions:
Wikipedia pages are used as ids,
I understood the Wikidata object notation correctly,
Number(upperlimit lowerlimit unit) is the object notation for numbers.
I omitted the quantity field because I think it's redundant, at least in
this example, and the confidence could be added as an additional
PropertySnak(?)
)
Statement('
http://en.wikipedia.org/wiki/Eiffel_Tower
PropertyValueSnak(' http://en.wikipedia.org/wiki/Height Number(324 324
http://en.wikipedia.org/wiki/Metre) ')
{reference and rank omitted}
')

(Note: I thought I read somewhere that it was decided that all statements
on Wikidata should have at least one reference, but in the object
notation definition the {references} seems to imply this is an optional
argument. Also, the diagram has a 0..* relation. Did I miss something?)

Assuming a system of tables by unit and multiple is used (i.e. meter is
m0), the table could look like this:

Table m0:

propertyid | property | max | min | other information
1234 | Height | 324 | 324 | ... (or 325 | 323 )

An index could be put over min, max and property, and the query for
all buildings taller than 300 meters could start with:

SELECT propertyid FROM m0 WHERE property = 'Height' AND (min > 300 OR max > 300);

This would allow a query for things with a "Height" greater than 300,
and it would even include things defined as 290+-20, since the max
value would be over 300. The much harder part, IMHO, is the "located in
Ghana" part of the query, but I think the big databases offer such
things as spatial queries (e.g.
http://dev.mysql.com/doc/refman/5.0/en/spatial-extensions.html ).
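
To make the interval idea concrete, here is a minimal sketch using
Python's sqlite3 module. It is only an illustration under my own
assumptions, not how Wikidata stores anything: the column names
min_value and max_value, the extra propertyids and the helper functions
are made up for this example.

```python
import sqlite3

# Hypothetical per-unit table "m0" (values in meters).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE m0 ("
    " propertyid INTEGER, property TEXT,"
    " min_value REAL, max_value REAL)"
)
conn.execute(
    "CREATE INDEX m0_interval ON m0 (property, min_value, max_value)"
)
conn.executemany(
    "INSERT INTO m0 VALUES (?, ?, ?, ?)",
    [
        (1234, "Height", 324, 324),  # exact value: min == max
        (5678, "Height", 270, 310),  # 290+-20
        (9012, "Height", 70, 110),   # 90+-20
        (3456, "Width", 400, 400),   # different property, must not match
    ],
)


def possibly_above(conn, prop, threshold):
    """Ids whose interval reaches above the threshold (max > threshold)."""
    rows = conn.execute(
        "SELECT propertyid FROM m0"
        " WHERE property = ? AND max_value > ? ORDER BY propertyid",
        (prop, threshold),
    )
    return [row[0] for row in rows]


def certainly_above(conn, prop, threshold):
    """Ids whose whole interval lies above the threshold (min > threshold)."""
    rows = conn.execute(
        "SELECT propertyid FROM m0"
        " WHERE property = ? AND min_value > ? ORDER BY propertyid",
        (prop, threshold),
    )
    return [row[0] for row in rows]


print(possibly_above(conn, "Height", 300))
print(certainly_above(conn, "Height", 300))
```

Since min is never larger than max, testing only the max column already
covers the "possibly above" case, while testing only the min column
gives the stricter "certainly above" answer. Which of the two a query
like >300 should mean is a design decision.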

This would also work for queries on dates, with tables that have
mindate and maxdate columns. (Apropos dates, here is a short but
interesting discussion on how they might be saved in databases if
arbitrary dates are needed: http://stackoverflow.com/a/2487792 )
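
A rough sketch of the same idea for dates, again with Python's sqlite3.
Everything here is an assumption made for illustration: the table name
"dates", storing mindate and maxdate as signed integer years (so dates
before year 1 are possible), and the helper may_overlap. A real system
would need a finer resolution than years.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dates (entity TEXT, mindate INTEGER, maxdate INTEGER)"
)
conn.executemany(
    "INSERT INTO dates VALUES (?, ?, ?)",
    [
        ("Eiffel Tower construction", 1887, 1889),
        ("hypothetical ancient event", -500, -400),  # uncertain by a century
        ("hypothetical exact event", 1900, 1900),    # point in time: min == max
    ],
)


def may_overlap(conn, start, end):
    """Entities whose [mindate, maxdate] interval overlaps [start, end]."""
    rows = conn.execute(
        "SELECT entity FROM dates"
        " WHERE mindate <= ? AND maxdate >= ? ORDER BY mindate",
        (end, start),
    )
    return [row[0] for row in rows]


print(may_overlap(conn, 1880, 1890))
print(may_overlap(conn, -450, 1900))
```

Two intervals overlap exactly when each one starts before the other
ends, which is why the query compares mindate with the end of the
requested range and maxdate with its start.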

Representation of "temporal knowledge" (?) seems to be a huge research
topic anyway (e.g.
http://www.math.unipd.it/~kvenable/RT/corso2009/Allen.pdf or
http://www.cs.ox.ac.uk/boris.motik/pubs/nm03fuzzy.pdf, ...), where the
problems of time intervals vs. time points, uncertainty and
"vagueness", and their representation are discussed.

Hope this helps,

Friedrich

_______________________________________________
Wikidata-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-l