On 13/08/13 22:26, Claus Stadler wrote:
Hi Markus,

Thank you for your information. Below some thoughts and comments from
our side.

 >> This is because the Wikidata datatype for numbers is not implemented
yet.
Ok, is there a timeline for when this will be ready?

(Lydia answered this)



 >> To edit Wikidata, you should create an account.
But one can use an existing Wikidata account for performing edits via
the Wikidata API's "login" action, right?

Yes.
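For illustration, here is a minimal sketch of the parameters such a login involves. It assumes the two-step MediaWiki "action=login" flow (first request a token, then confirm with it); the exact parameter names should be verified against the current API documentation.

```python
# Sketch: building the POST parameters for the MediaWiki 'login' action.
# Two-step flow: the first request gets result=NeedToken plus a token,
# the second request echoes that token back to complete the login.
# Parameter names (lgname, lgpassword, lgtoken) are assumptions to check.

def login_params(username, password, token=None):
    """Build the POST parameters for a Wikidata API login request."""
    params = {
        "action": "login",
        "format": "json",
        "lgname": username,
        "lgpassword": password,
    }
    if token is not None:
        # Confirmation step: include the token from the first response.
        params["lgtoken"] = token
    return params

# First request: no token yet.
step1 = login_params("ExampleUser", "secret")
# Second request: pass back the token the API returned.
step2 = login_params("ExampleUser", "secret", token="abc123")
```

The actual HTTP requests (and cookie handling between the two steps) are left out here, since those details depend on the client library used.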



 >> If you intend to do mass edits, ...
I don't think we are. The new DBpedia interface should just support
users by transferring selected DBpedia data items to WD *via their own
WD account*.

Ok, fair enough. One can also do anonymous edits, of course (but people should be warned then that their IP will be recorded publicly).



 >> I could imagine that most of the data gets bulk imported from
Wikipedia infoboxes by the community anyway.

By "bulk imported from Wikipedia infoboxes by the community", are you
referring to an automatic or a manual process?
For anything manual, the idea is to use the DBpedia-Viewer as a support
tool, as it already contains the data from the infoboxes.
If it's automatic, can you explain how the data is extracted from
Wikipedia?

I guess "semi-automatic" best describes it. What usually happens is that some user proposes an import for some specific property (e.g., import sex information from Italian Wikipedia categories Man/Woman), and then this is done. I cannot explain in all cases how users get the information; it's quite amazing what they do ;-) However, there is no manual control of all imported facts, and errors have been known to happen. There are quality control mechanisms on Wikidata to find problems with the current data (whether imported or not).


 >> Note that there are some differences beyond the vocabulary. Most
Wikidata statements have source information attached

The source for an item transferred from DBpedia would be the Wikipedia
edition for the corresponding language.

Yes, that seems to be a good idea.



 >> , and there are also qualifiers (not used heavily yet, since the
selection/filtering mechanisms of Wikidata are quite weak so far, but
this will change). In some domains, such as roles of actors in films,
qualifiers are getting widely used now; so this is not really triple
data any more. But there will be enough triple data left, I guess.

As for qualifiers, they don't really exist on DBpedia, so if someone
wanted to provide them via the DBpedia-Viewer, one would have to
provide them manually anyway. I am currently not sure if they are a
priority for us.

There will be enough cases where they are not needed. With respect to the viewer, the bigger challenge is probably to avoid duplicates/redundant information (entering a property without any qualifier is fine if there is nothing at all yet; but if there is already one with a more specific qualifier, then nothing else should be added).
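A rough sketch of that duplicate check: before adding a plain (qualifier-free) statement, first look at whether the item already carries any statement for that property. The claim layout below loosely mimics the JSON shape of the wbgetclaims API module as we understand it, and the property IDs are hypothetical examples; both would need checking against the real API.

```python
# Sketch: skip adding a qualifier-free statement if the item already has
# any statement for that property (possibly a more specific, qualified
# one). The dict layout and property IDs here are illustrative only.

def should_add_plain_statement(claims, property_id):
    """Return True only if the item has no statement yet for property_id.

    'claims' maps property IDs to lists of claim objects, in the style
    of the wbgetclaims response. Any existing claim -- even one with a
    more specific qualifier -- means we should not add a plain duplicate.
    """
    return not claims.get(property_id)

# Hypothetical item data: one "cast member"-style claim that already
# carries a "character role"-style qualifier.
existing = {
    "P161": [
        {"mainsnak": {"property": "P161"},
         "qualifiers": {"P453": [{"property": "P453"}]}}
    ]
}
should_add_plain_statement(existing, "P161")   # already covered -> False
should_add_plain_statement(existing, "P1082")  # nothing yet -> True
```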



So for the DBpedia-Viewer's "transfer DBpedia triple to Wikidata"
feature we see three options, of which only (c) seems feasible:
(a) The viewer just provides a link to Wikidata, and the user has to
fill out the forms there manually. But we would like a tighter
interaction between the RDF data and WD.
(b) Wikidata offers a way to open an item with a pre-filled edit form.
However, on WD it currently seems that only a single item can be in
edit mode, so this won't work yet, and we are not sure whether it is
ever planned to work.

(c) The DBpedia-Viewer aids the user by providing a pre-filled
edit form, mapping a triple's property and object to the corresponding
WD values.
For validation, the user could be presented with the existing WD values
for that property. Also, upon edit, a popover or tab with the
corresponding WD page could open up.

Yes, I agree that (c) is most convenient. Since all of the WD interface is coded in Javascript, using the Web API for data exchange, one can create custom UIs with similar functionality quite well. In fact, there are user-contributed Javascript modules that can be activated on wikidata.org to get alternative/additional UIs for editing. So this can be integrated into the web site quite easily if the code is there.

So this kind of edit seems to be possible to do with the API, yet we
would need a mapping between WD RDF and the WD IDs.

All URIs in WD RDF contain the relevant IDs already as substrings. Moreover, the WD RDF URIs are resolvable and support content negotiation (though most formats are quite limited, e.g., the RDF is not the complete RDF that you have in the dumps yet). Is there any further mapping you need?
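As a small sketch of that substring extraction: Wikidata entity URIs end in the ID itself (Q... for items, P... for properties), so a suffix split is enough. The URI patterns used below are the ones visible in the RDF exports; other patterns may exist.

```python
# Sketch: recovering the entity ID from a Wikidata RDF URI. Entity URIs
# end in the ID (Q... for items, P... for properties), so we take the
# last path segment and drop a possible file extension such as '.nt'.

def entity_id(uri):
    """Return the trailing entity ID, e.g. 'Q666615' or 'P1082'."""
    last = uri.rstrip("/").rsplit("/", 1)[-1]
    return last.split(".")[0]

entity_id("http://www.wikidata.org/entity/Q666615")
# -> 'Q666615'
entity_id("http://www.wikidata.org/wiki/Special:EntityData/Q666615.nt")
# -> 'Q666615'
```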

Markus



Cheers,
Claus

p.s:

 >> Note that all property ids start with a P. If it's of the form Q...,
then it is not a property.
Oops ;)






On 08/13/2013 09:32 PM, Markus Krötzsch wrote:
Hi Claus,

a brief partial reply:

On 13/08/13 16:20, Claus Stadler wrote:
...
For example, I notice that the Wikidata page for my home town "Berndorf
in Lower Austria" does not contain the population:
http://www.wikidata.org/wiki/Q666615

This is because the Wikidata datatype for numbers is not implemented yet.


Looking at the corresponding DBpedia entry, this information actually
exists there:
http://dbpedia.org/resource/Berndorf,_Lower_Austria

The new DBpedia interface should offer a button next to the "population
8728" triple which enables transfer of this information to Wikidata.

To edit Wikidata, you should create an account. If you intend to do
mass edits, this account should be granted bot status first to avoid
it being blocked if it sends a lot of requests. This is mostly a
community process: you should discuss the intended edit activities
with the community to find out if they are happy with this (this list
is only about the technical aspects). It is good to have additional
inputs, but I could imagine that most of the data gets bulk imported
from Wikipedia infoboxes by the community anyway, which is what
happens with a lot of data right now.


In another GSoC project, Hady Elsahar is working on mappings between the
wikidata RDF vocabulary and the DBpedia vocabulary.
This means, we can in principle map DBpedia RDF data to Wikidata RDF.

Note that there are some differences beyond the vocabulary. Most
Wikidata statements have source information attached, and there are
also qualifiers (not used heavily yet, since the selection/filtering
mechanisms of Wikidata are quite weak so far, but this will change).
In some domains, such as roles of actors in films, qualifiers are
getting widely used now; so this is not really triple data any more.
But there will be enough triple data left, I guess.


However, looking at the Wikidata API [2] there is

action=wbcreateclaim
with the example:

api.php?action=wbcreateclaim&entity=q42&property=p9001&snaktype=novalue&token=foobar&baserevid=7201010




So the core question is, how can we map e.g. properties such as
wikidata:population (if that existed) to their respective Wikidata
property identifier (Q12345)?
This goes for any property that may occur in an RDF dump, such as:
http://www.wikidata.org/wiki/Special:EntityData/Q666615.nt


Note that all property ids start with a P. If it's of the form Q...,
then it is not a property.

Cheers,

Markus





_______________________________________________
Wikidata-tech mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
