On Mon, Jul 13, 2020 at 21:22, Adam Sanchez <[email protected]> wrote:
>
> I have 14T SSD (RAID 0)
>
> On Mon, Jul 13, 2020 at 21:19, Amirouche Boubekki
> <[email protected]> wrote:
> >
> > On Mon, Jul 13, 2020 at 19:42, Adam Sanchez <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > I have to launch 2 million queries against a Wikidata instance.
> > > I have loaded Wikidata in Virtuoso 7 (512 RAM, 32 cores, SSD disks with
> > > RAID 0).
> > > The queries are simple, just 2 types.
> >
> > How much SSD in Gigabytes do you have?
> >
> > > select ?s ?p ?o {
> > > ?s ?p ?o.
> > > filter (?s = ?param)
> > > }
Can you confirm that the above query is the same as:
select ?p ?o {
param ?p ?o
}
where param stands for one of the two million parameter values.
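
To make the rewrite concrete, here is a minimal Python sketch that builds the second form of the query for one parameter. It assumes each param is a full IRI; the function name build_query and the example entity IRI are illustrative only and do not come from your setup.

```python
def build_query(param_iri: str) -> str:
    """Return the direct triple-pattern form of the query for one subject IRI."""
    # Assumption: every parameter is a full IRI. Reject characters that
    # cannot appear inside a SPARQL IRI reference (<...>).
    if any(c in param_iri for c in '<>"{}|^`\\ '):
        raise ValueError(f"not a valid IRI: {param_iri!r}")
    # The subject is written directly into the pattern, so no FILTER is needed.
    return f"select ?p ?o {{ <{param_iri}> ?p ?o }}"


# Illustrative parameter value, not taken from the original message.
print(build_query("http://www.wikidata.org/entity/Q42"))
```

Writing the subject directly into the triple pattern states the constraint explicitly instead of relying on the engine to push the filter down into the pattern.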
Also, did you investigate where the bottleneck is? Look into disk
usage and CPU load. glances [0] can provide that information.
Can you run the thread pool on another machine?
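
One way to get numbers to read next to the disk and CPU figures is to time each query on the client side. The sketch below is only an assumption about how your client could be structured: run_query is a hypothetical callable that you would supply (for example, one HTTP request to your SPARQL endpoint), and the default of 8 workers is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_queries(params, run_query, workers=8):
    """Run run_query(param) for every param on a thread pool.

    Returns (wall-clock seconds, list of per-query latencies in seconds).
    run_query is a hypothetical callable supplied by the caller.
    """
    def timed(param):
        start = time.perf_counter()
        run_query(param)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order; list() waits for all results.
        latencies = list(pool.map(timed, params))
    return time.perf_counter() - wall_start, latencies
```

If the mean per-query latency is much larger than the wall-clock time divided by the number of queries, the throughput comes from concurrency rather than from fast individual queries.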
A back-of-the-envelope calculation: 2,000,000 queries in 6 hours
means your system averages about 10.8 milliseconds per query. AFAIK,
that is good.
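
Spelled out in Python, using only the two figures above (2,000,000 queries, 6 hours), the estimate works out as follows. Note that it is an average over wall-clock time; with several client threads running in parallel, an individual query can take longer than this.

```python
queries = 2_000_000
hours = 6

total_seconds = hours * 3600                    # 21,600 s of wall-clock time
ms_per_query = total_seconds * 1000 / queries   # average time budget per query
queries_per_second = queries / total_seconds    # overall throughput

print(f"{ms_per_query:.1f} ms per query, {queries_per_second:.1f} queries per second")
# prints: 10.8 ms per query, 92.6 queries per second
```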
[0] https://github.com/nicolargo/glances/
_______________________________________________
Wikidata mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata