Re: [Wikidata] Wikidata HDT dump

Paul Houle Mon, 01 Oct 2018 16:32:49 -0700

You shouldn't have to keep anything in RAM to HDT-ize something, as you could make the dictionary by sorting on disk and also do the joins to look up everything against the dictionary by sorting.
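
To make the idea concrete, here is a minimal sketch of that approach. It is not how the HDT tools are actually implemented. It assumes triples arrive as tab-separated "S<TAB>P<TAB>O" lines whose terms contain no raw tabs, newlines or other control characters, and it uses one flat dictionary (real HDT splits the dictionary into sections). All function names (external_sort, build_dictionary, encode_position, encode_triples) are made up for this illustration.

```python
import heapq
import itertools
import os
import tempfile


def external_sort(lines, chunk_size=100_000):
    """Yield lines in sorted order, holding at most chunk_size lines in RAM."""
    runs = []
    it = iter(lines)
    while True:
        chunk = sorted(itertools.islice(it, chunk_size))
        if not chunk:
            break
        # Spill each sorted chunk to disk as one "run".
        run = tempfile.TemporaryFile("w+", encoding="utf-8")
        run.writelines(line + "\n" for line in chunk)
        run.seek(0)
        runs.append(run)
    # k-way merge of the runs; only one line per run is in memory at a time.
    yield from heapq.merge(*[(l.rstrip("\n") for l in run) for run in runs])
    for run in runs:
        run.close()


def build_dictionary(terms, chunk_size=100_000):
    """Yield (term, id) pairs in sorted term order, with ids starting at 1."""
    unique = (t for t, _ in itertools.groupby(external_sort(terms, chunk_size)))
    return ((t, i) for i, t in enumerate(unique, start=1))


def encode_position(rows, pos, dictionary, chunk_size=100_000):
    """Sort-merge join: replace field `pos` of every row by its dictionary id."""
    def keyed():
        # Move the join key to the front so that sorting the line sorts by key.
        for row in rows:
            f = row.split("\t")
            yield "\t".join([f[pos]] + f[:pos] + f[pos + 1:])

    dict_iter = iter(dictionary)
    term, term_id = next(dict_iter, (None, None))
    for line in external_sort(keyed(), chunk_size):
        f = line.split("\t")
        # Both streams are sorted by term, so the dictionary only moves forward.
        while term != f[0]:
            term, term_id = next(dict_iter)
        rest = f[1:]
        yield "\t".join(rest[:pos] + [str(term_id)] + rest[pos:])


def encode_triples(read_triples, chunk_size=100_000):
    """read_triples() must return a fresh iterator over the triple lines."""
    with tempfile.TemporaryDirectory() as tmp:
        dict_path = os.path.join(tmp, "dictionary.tsv")
        with open(dict_path, "w", encoding="utf-8") as out:
            terms = (t for row in read_triples() for t in row.split("\t"))
            for term, term_id in build_dictionary(terms, chunk_size):
                out.write(f"{term}\t{term_id}\n")

        def read_dictionary():
            # Each join pass streams the on-disk dictionary through its own handle.
            with open(dict_path, encoding="utf-8") as f:
                for line in f:
                    term, term_id = line.rstrip("\n").rsplit("\t", 1)
                    yield term, int(term_id)

        rows = read_triples()
        for pos in range(3):  # one sort + join pass each for S, P and O
            rows = encode_position(rows, pos, read_dictionary(), chunk_size)
        # Returned as a list only to keep the demo short; a real tool
        # would stream the encoded triples back to disk instead.
        return list(rows)
```

Memory use is bounded by chunk_size rather than by the size of the dataset; the price is several sequential passes over the data on disk. The encoded triples come out in a different order than the input, which is fine here since a final sort would be needed anyway.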


------ Original Message ------
From: "Ettore RIZZA" <ettoreri...@gmail.com>

To: "Discussion list for the Wikidata project." <wikidata@lists.wikimedia.org>

Sent: 10/1/2018 5:03:59 PM
Subject: Re: [Wikidata] Wikidata HDT dump

> what computer did you use for this? IIRC it required >512GB of RAM to function.
Hello Laura,
Sorry for my confusing message, I am not at all a member of the HDT team. But according to its creator <https://twitter.com/ciutti/status/1046849607114936320>, 100 GB "with an optimized code" could be enough to produce an HDT like that.
On Mon, 1 Oct 2018 at 18:59, Laura Morales <laure...@mail.com> wrote:
> a new dump of Wikidata in HDT (with index) is available [http://www.rdfhdt.org/datasets/].
Thank you very much! Keep it up!
Out of curiosity, what computer did you use for this? IIRC it required >512GB of RAM to function.
> You will see how Wikidata has become huge compared to other datasets. It contains about twice the limit of 4B triples discussed above.
There is a 64-bit version of HDT that doesn't have this limitation of 4B triples.
> In this regard, what is in 2018 the most user friendly way to use this format?
Speaking for me at least, Fuseki with a HDT store. But I know there are also some CLI tools from the HDT folks.
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
