You shouldn't have to keep anything in RAM to HDT-ize something, as you
could build the dictionary by sorting on disk and also do the joins
that look up every term against the dictionary by sorting.
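
To make that concrete, here is a rough Python sketch of what I mean. It
is not how the HDT tools actually work, just an illustration under some
assumptions of mine: terms contain no tabs or newlines, all terms share
a single ID space (as far as I know real HDT splits the dictionary into
sections), and every function name here is made up. Only run_size
records are held in memory while sorting; everything else goes through
sorted run files on disk and a k-way merge.

```python
import heapq
import itertools
import os
import tempfile


def write_run(records, tmpdir):
    """Write tuples of strings to a new temp file, one tab-separated line each."""
    fd, path = tempfile.mkstemp(dir=tmpdir)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for rec in records:
            f.write("\t".join(rec) + "\n")
    return path


def read_run(path):
    """Stream the tuples back from a run file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield tuple(line.rstrip("\n").split("\t"))


def external_sort(records, key, run_size, tmpdir):
    """Sort on disk: sorted runs of at most run_size records, then a k-way merge."""
    paths = []
    it = iter(records)
    while chunk := sorted(itertools.islice(it, run_size), key=key):
        paths.append(write_run(chunk, tmpdir))
    return heapq.merge(*(read_run(p) for p in paths), key=key)


def build_dictionary(triples, run_size, tmpdir):
    """Sort all terms on disk, drop duplicates, and number them from 1."""
    terms = ((term,) for triple in triples for term in triple)
    ordered = external_sort(terms, lambda r: r, run_size, tmpdir)
    unique = (rec[0] for rec, _ in itertools.groupby(ordered))
    return write_run(((t, str(i)) for i, t in enumerate(unique, 1)), tmpdir)


def encode_column(records, col, dict_path, run_size, tmpdir):
    """Sort-merge join: replace the term in column col by its dictionary ID."""
    dictionary = read_run(dict_path)
    term, term_id = next(dictionary, (None, None))
    for rec in external_sort(records, lambda r: r[col], run_size, tmpdir):
        # Both streams are sorted by term, so the dictionary only moves forward.
        while term < rec[col]:
            term, term_id = next(dictionary)
        yield rec[:col] + (term_id,) + rec[col + 1:]


def encode(triples, run_size=2):
    """Return (dictionary, ID triples); triples are 3-tuples of strings."""
    with tempfile.TemporaryDirectory() as tmpdir:
        dict_path = build_dictionary(triples, run_size, tmpdir)
        records = iter(triples)
        for col in range(3):  # one sort plus one join per triple position
            records = encode_column(records, col, dict_path, run_size, tmpdir)
        # Loaded into memory here only to make the result easy to inspect.
        ids = sorted(tuple(map(int, r)) for r in records)
        dictionary = {t: int(i) for t, i in read_run(dict_path)}
    return dictionary, ids
```

The tiny default run_size is only there to force several runs on toy
input; on a real dump you would pick something that fits in RAM (or just
use an on-disk sort utility) and write the result out instead of
returning it.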
------ Original Message ------
From: "Ettore RIZZA" <ettoreri...@gmail.com>
To: "Discussion list for the Wikidata project."
<wikidata@lists.wikimedia.org>
Sent: 10/1/2018 5:03:59 PM
Subject: Re: [Wikidata] Wikidata HDT dump
> what computer did you use for this? IIRC it required >512GB of RAM to
> function.
Hello Laura,
Sorry for my confusing message; I am not at all a member of the HDT
team. But according to its creator
<https://twitter.com/ciutti/status/1046849607114936320>, 100 GB "with
an optimized code" could be enough to produce an HDT like that.
On Mon, 1 Oct 2018 at 18:59, Laura Morales <laure...@mail.com> wrote:
> a new dump of Wikidata in HDT (with index) is
> available [http://www.rdfhdt.org/datasets/].
Thank you very much! Keep it up!
Out of curiosity, what computer did you use for this? IIRC it
required >512GB of RAM to function.
> You will see how Wikidata has become huge compared to other
> datasets. it contains about twice the limit of 4B triples discussed
> above.
There is a 64-bit version of HDT that doesn't have this limitation of
4B triples.
> In this regard, what is in 2018 the most user friendly way to use
> this format?
Speaking for myself at least, Fuseki with an HDT store. But I know there
are also some CLI tools from the HDT folks.
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata