It is possible that a Hadoop architecture could provide the performance and scalability needed for robust statistical analysis of the Wikidata RDF datasets.
It is also possible that Jena has better integration tools for Hadoop than Blazegraph does. See https://jena.apache.org/documentation/hadoop/

I do not, however, see a direct relationship between T115242 and performance, other than that the reasoning behind filtering these "boring" objects is based on the perceived negative performance impact of allowing them to be queried from a publicly accessible endpoint. The intent of T115242 is to provide these objects in a dataset to a "nonpublic" query interface for metrics evaluation only.

The question that should be asked is whether Blazegraph and the WDQS platform are robust enough for intensive statistical analysis and, if not, why not and what can be done to improve them.

On 26 Oct 2015 10:00, "JanZerebecki" <[email protected]> wrote:

> JanZerebecki added a comment.
>
> @Christopher can as he created https://phabricator.wikimedia.org/T115242.
>
> TASK DETAIL
> https://phabricator.wikimedia.org/T116547
>
> EMAIL PREFERENCES
> https://phabricator.wikimedia.org/settings/panel/emailpreferences/
>
> To: JanZerebecki
> Cc: Addshore, Christopher, JanZerebecki, Lydia_Pintscher, Aklapper,
> Ricordisamoa, Wikidata-bugs, aude
>
> _______________________________________________
> Wikidata-bugs mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
