Re: [Wikidata-bugs] [Maniphest] [Commented On] T116547: try computing certains wikidata stats via hadoop (e.g. spark) instead of query.w.o (blazegraph)

Christopher Johnson Mon, 26 Oct 2015 04:05:25 -0700

It is possible that a Hadoop architecture could provide the performance
and scalability needed for robust statistical analysis of the Wikidata RDF
datasets.

It is also possible that Jena may have better integration tools with Hadoop
than Blazegraph.

See https://jena.apache.org/documentation/hadoop/

I do not, however, see a direct relationship between T115242 and
performance, other than that the reasoning behind filtering these "boring"
objects is based on the perceived negative performance impact of allowing
them to be queried from a publicly accessible endpoint.

The intent of T115242 is to provide these objects in a dataset to a
"nonpublic" query interface for metrics evaluation only.

The question that should be asked is whether Blazegraph and the WDQS
platform are robust enough for intensive statistical analysis and, if not,
why not and what can be done to improve them.

On 26 Oct 2015 10:00, "JanZerebecki" <[email protected]>
wrote:

> JanZerebecki added a comment.
>
> @Christopher can as he created https://phabricator.wikimedia.org/T115242.
>
>
> TASK DETAIL
>   https://phabricator.wikimedia.org/T116547
>
> EMAIL PREFERENCES
>   https://phabricator.wikimedia.org/settings/panel/emailpreferences/
>
> To: JanZerebecki
> Cc: Addshore, Christopher, JanZerebecki, Lydia_Pintscher, Aklapper,
> Ricordisamoa, Wikidata-bugs, aude
>
>
>
> _______________________________________________
> Wikidata-bugs mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
>
