Some Scholia query rewriting discussion is here:
https://github.com/WDscholia/scholia/issues/2423

Egon

On Tue, 11 Jun 2024 at 18:02, Samuel Klein <[email protected]> wrote:

> It would be helpful to see how the standard Scholia queries work under
> federation.  (those that need it)
>
> Are there evals for other graph dbs on how they handle federation?
>
> On Tue, Jun 11, 2024 at 10:39 AM Egon Willighagen <
> [email protected]> wrote:
>
>>
>> Hi, thank you for the update.
>>
>> The email writes that "Queries that need federation will need to be
>> rewritten. You can ask for help to rewrite queries".
>>
>> Do you have guidelines on how to do this? It took quite some effort to
>> make some of the (I thought simple) queries work, but later improvements
>> proved more workable. How were they developed? How do people rewrite the
>> SPARQL queries when two or more query triples are distributed over the two
>> SPARQL endpoints, and particularly when they depend on each other?
>>
>> Egon
>>
>>
>> On Tue, 11 Jun 2024 at 16:17, Guillaume Lederrey <[email protected]>
>> wrote:
>>
>>> Hello all!
>>>
>>> The feedback period for our WDQS Graph Split proposal
>>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/WDQS_Split_Refinement>
>>> has come to an end. Many thanks to all people who sent comments, your
>>> contribution is invaluable!
>>>
>>> We’ve incorporated most comments and proposals into our final set of
>>> rules for the graph split
>>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Rules>.
>>> The main proposals (including some that were rejected) were:
>>>
>>>    - Duplicating properties (wd:P*) in both graphs does not seem necessary
>>>    and won't be done
>>>    - The list of publication types that identify what is a
>>>    scholarly article has been improved; see the final list of items
>>>    here
>>>    <https://docs.google.com/spreadsheets/d/1eKX_2Z1rXj1s_zOapQvn_0uD6MVhc-qyqqxbn5loIvk/edit>
>>>    - It was discussed whether sitelinks should inform the nature of the
>>>    split or not; this idea was not incorporated because it might make it
>>>    harder to understand what is where
>>>    - Items that define multiple instance of (P31)
>>>    <https://www.wikidata.org/wiki/Property:P31> values, which might be
>>>    ambiguous, were discussed and investigated. It appears that this might
>>>    not affect a lot of items and that the solution might be to
>>>    disambiguate these instances by creating separate entities (see the
>>>    Clinical Trials section
>>>    <https://www.wikidata.org/wiki/Wikidata_talk:SPARQL_query_service/WDQS_graph_split/WDQS_Split_Refinement#Clinical_trials>
>>>    of the Talk Page).
>>>    - Re-thinking how scholarly articles are modelled was raised,
>>>    especially by identifying the nature of the publication using a separate
>>>    property rather than using instance of (P31)
>>>    <https://www.wikidata.org/wiki/Property:P31>. This idea should
>>>    probably be explored and discussed by the WikiCite community, since it
>>>    does affect the nature of the split but could be a nice criterion to
>>>    take into consideration in the future.
>>>
>>> We are now working on implementing the appropriate tooling to manage
>>> this split, including a new way of processing the Wikidata dumps for an
>>> initial load, modification to the update pipeline to support the graph
>>> split, and additional automation. We hope to have new SPARQL endpoints that
>>> are live updated with the graph split by the end of June. This timeline is
>>> probably slightly optimistic; we’ll let you know when those are ready.
>>>
>>> Once the new SPARQL endpoints that are live updated with the graph split
>>> are available, we will provide a 6-month transition period, during which
>>> the current endpoint (query.wikidata.org/sparql) will keep serving the
>>> full graph. Once that transition is over, query.wikidata.org will only
>>> serve the main graph. Queries that need federation will need to be
>>> rewritten. You can ask for help to rewrite queries
>>> <https://www.wikidata.org/wiki/Wikidata:Request_a_query_rewrite>.
>>>
>>> Thank you all for your help and support!
>>>
>>>
>>> Guillaume
>>>
>>> --
>>> *Guillaume Lederrey* (he/him)
>>> Engineering Manager
>>> Wikimedia Foundation <https://wikimediafoundation.org/>
>>> _______________________________________________
>>> Wikidata mailing list -- [email protected]
>>> Public archives at
>>> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/YS26TSGY3YRSJADWAE3DXSVQR43FNK4K/
>>> To unsubscribe send an email to [email protected]
>>>
>>
>>
>> --
>> Some nanomaterials stress our cells and cause key event, some towards
>> adverse outcomes. Read about it in our new paper "From papers to RDF-based
>> integration of physicochemical data and adverse outcome pathways for
>> nanomaterials", https://doi.org/10.1186/s13321-024-00833
>>
>> --
>> E.L. Willighagen
>> Department of Bioinformatics - BiGCaT
>> Maastricht University (http://www.bigcat.unimaas.nl/)
>> Blog: https://chem-bla-ics.linkedchemistry.info/
>> Mastodon: https://social.edu.nl/@egonw
>> PubList: https://orcid.org/0000-0001-7542-0286
>> _______________________________________________
>> Wikidata mailing list -- [email protected]
>> Public archives at
>> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/UMSJIAK5BFLGIBRJP6IVY572G4D64QCK/
>> To unsubscribe send an email to [email protected]
>>
>
>
> --
> Samuel Klein          @metasj           w:user:sj          +1 617 529 4266
> _______________________________________________
> Wikidata mailing list -- [email protected]
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/PONNUDTGWRDMOFKOTGIKIVHLYLIAY7G2/
> To unsubscribe send an email to [email protected]
>


_______________________________________________
Wikidata mailing list -- [email protected]
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/7YKLTTF2JADX7IOMQYZI6K6YAUYV3BFC/
To unsubscribe send an email to [email protected]
