Hi Marco, 

On October 1, 2019 11:48:02 PM GMT+02:00, Marco Fossati <foss...@spaziodati.eu> 
wrote:
>Hi Denny,
>
>Thanks for publishing your Colab notebook!
>I went through it and would like to share my first thoughts here. We can
>then move further discussion somewhere else.
>
>1. in general, how can we compare datasets with totally different time
>stamps? Wikidata is alive, Freebase is dead, and the latest DBpedia dump
>is old;

DBpedia has made monthly releases for the past three months, and these will
continue to improve and grow in an agile manner; so far we have focused on
debugging and integration. The maximum age of a dump would be 30 days, which I
think is OK. Denny validated against the live endpoint. That is OK for driving
growth, but unlike validation against dumps it is not scientifically
reproducible.



>2. given that all datasets contain Wikipedia links, perhaps we could use
>them as a bridge for the comparison, instead of Wikidata mappings. I'm
>assuming that Freebase and DBpedia entities with Wikidata mappings are
>subsets of the whole datasets (but this should be verified);
>3. we could use record linkage techniques to connect Wikidata entities
>with Freebase and DBpedia ones, then assess the agreement in terms of
>statements per entity. There has been some experimental work (different
>use case and goal) in the soweego project:
>https://soweego.readthedocs.io/en/latest/validator.html
>
>
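
A minimal sketch of how points 2 and 3 could fit together: entities from two
datasets are joined on their Wikipedia link, and agreement is measured per
entity as the Jaccard overlap of their statements. Everything here is an
assumption for illustration: the function name, the toy records, and above all
the premise that properties and values have already been normalized to a
shared vocabulary, which is the hard part in practice.

```python
from typing import Dict, Set, Tuple

# A statement is a (property, value) pair. ASSUMPTION: both datasets were
# already mapped to the same property names and value formats.
Statement = Tuple[str, str]


def agreement_by_wikipedia_link(
    a: Dict[str, Set[Statement]],
    b: Dict[str, Set[Statement]],
) -> Dict[str, float]:
    """Jaccard overlap of statements for entities sharing a Wikipedia URL.

    Hypothetical helper: keys are Wikipedia URLs, values are statement sets.
    Entities present in only one dataset are skipped.
    """
    scores = {}
    for url in a.keys() & b.keys():
        union = a[url] | b[url]
        # Two empty statement sets are treated as full agreement.
        scores[url] = len(a[url] & b[url]) / len(union) if union else 1.0
    return scores


# Toy records, invented for this example only.
wikidata = {
    "https://en.wikipedia.org/wiki/Douglas_Adams": {
        ("birthYear", "1952"),
        ("occupation", "writer"),
    },
}
dbpedia = {
    "https://en.wikipedia.org/wiki/Douglas_Adams": {
        ("birthYear", "1952"),
        ("occupation", "novelist"),
    },
    "https://en.wikipedia.org/wiki/Berlin": {("country", "Germany")},
}

scores = agreement_by_wikipedia_link(wikidata, dbpedia)
for url, score in sorted(scores.items()):
    print(f"{url}\t{score:.2f}")
```

Exact set overlap is deliberately crude. A real comparison would need fuzzy
value matching, and running it against fixed dumps rather than a live endpoint
would keep the numbers reproducible.
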
>On 10/1/19 1:13 AM, Denny Vrandečić wrote:
>> Marco, I totally agree with what you said - the project has stalled, and
>> there is plenty of opportunity to harvest more data from Freebase and
>> bring it to Wikidata, and this should be reignited.
>Yeah, that would be great.
>There is known work to do, but it's hard to sustain such a big project 
>without allocated resources:
>https://phabricator.wikimedia.org/maniphest/query/CPiqkafGs5G./#R
>
>BTW, there is also version 2 of the Wikidata primary sources tool that
>needs love, although I'm now skeptical that it will be an effective way
>to achieve the Freebase harvesting.
>We should probably rethink the whole thing, and restart small with very
>simple use cases, pretty much like the Harvest templates tool you
>mentioned:
>https://tools.wmflabs.org/pltools/harvesttemplates/
>
>Cheers,
>
>Marco
>
>P.S.: I *might* have found the freshest relevant DBpedia datasets:
>https://databus.dbpedia.org/dbpedia/mappings/mappingbased-objects
>I said *might* because it was really painful to find a download button 
>and to guess among multiple versions of the same dataset:
>https://downloads.dbpedia.org/repo/lts/mappings/mappingbased-objects/2019.09.01/mappingbased-objects_lang=en.ttl.bz2
>@Sebastian may know if it's the good one :-)

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
