The approach I use is the following, see this (Bioclipse/Groovy) script:
https://gist.github.com/egonw/ca4c348b9a2d1116efcdb55fa85dd158

It combines a Blazegraph SPARQL trick (a named subquery) with breaking
things up into batches of a certain size:

SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE {
    ?art wdt:P31 wd:Q13442814
  } LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}

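where "$concept" is my search word in the title, $conceptQ is the Wikidata
item for that concept (articles that already have it as main subject, P921,
are skipped), and $batchSize and $offset are filled in by the script to do
the batching. The script creates QuickStatements.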
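
The batching described above can be sketched as follows. This is a minimal
Python sketch, not the Groovy code from the gist: the function names and the
`total` upper bound are assumptions made for illustration, and actually
sending each query to the SPARQL endpoint is left out. The QuickStatements
output assumes the tab-separated "item, property, value" command format.

```python
from string import Template

# The query from this message; the four $placeholders are filled in per batch.
QUERY = Template("""\
SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE {
    ?art wdt:P31 wd:Q13442814
  } LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}
""")


def batch_queries(concept, concept_q, batch_size, total):
    """Yield one query per batch for offsets 0, batch_size, ... below total.

    `total` is a hypothetical upper bound on the number of articles to scan.
    """
    for offset in range(0, total, batch_size):
        yield QUERY.substitute(
            concept=concept.lower(),  # the FILTER compares against lcase(title)
            conceptQ=concept_q,
            batchSize=batch_size,
            offset=offset,
        )


def quickstatements(article_qids, concept_q):
    """Format one 'main subject' (P921) command per article, tab-separated."""
    return "\n".join(f"{qid}\tP921\t{concept_q}" for qid in article_qids)
```

For example, batch_queries("microgravity", "Q48655", 1000, 2500) yields three
queries with OFFSET 0, 1000 and 2000, and quickstatements(["Q57202937"],
"Q48655") gives a single line that sets P921 on that article. Each batch scans
only $batchSize articles before the title FILTER runs, which keeps a single
query small.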

Mind you, I manually check the created statements, because in my domain
(biochem) a simple search results in false positives, hence the "blacklist"
in the script :)
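
How a blacklist can cut down false positives before the manual check is
sketched below. This is a minimal Python sketch under assumptions: it does not
reproduce the logic of the gist, the function name is hypothetical, and the
item IDs and titles in the usage note are made-up placeholders. The idea shown
is to blank out blacklisted phrases in the title and keep the article only if
the search word still occurs.

```python
def drop_false_positives(rows, concept, blacklist):
    """Keep (qid, title) rows whose title still contains the concept
    after every blacklisted phrase has been blanked out."""
    kept = []
    for qid, title in rows:
        text = title.lower()
        for phrase in blacklist:
            # Remove the phrase so a match inside it no longer counts.
            text = text.replace(phrase.lower(), " ")
        if concept.lower() in text:
            kept.append((qid, title))
    return kept
```

With the search word "iron" and the blacklist ["environment"], a row such as
("Q1", "Iron uptake in yeast") is kept, while ("Q2", "Effects of the
environment on growth") is dropped, because its only match for "iron" sits
inside "environment". The remaining rows still need the manual check.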

Egon

On Sat, Dec 15, 2018 at 10:13 AM Fabrizio Carrai <[email protected]>
wrote:

> Thanks Matthias,
> that's a pity. Your suggestion relies on the items being well characterized,
> which, at the time of writing, they are not for the topics I am interested in.
> Could it be an idea to download all the "scholarly articles", select locally
> for the keyword of interest (e.g. "microgravity") and set the property P921
> on all of them? QuickStatements may be helpful for the last step; any
> suggestions for other tools?
>
> Thanks
> Fabrizio
>
> On Fri, 14 Dec 2018 at 22:16, Matthias Erfurth <[email protected]>
> wrote:
>
>> Hi Fabrizio,
>> unfortunately you can't do a full-text search over all the scholarly articles
>> <https://www.wikidata.org/wiki/Q13442814>; it is better to work with
>> indexed properties, so
>> you can query for other articles with microgravity as main subject ...
>> With the ajax based wikidata search
>>
>> SELECT ?item
>> WHERE {
>>     ?item wdt:P31 wd:Q13442814;
>>           wdt:P921 wd:Q48655.
>> }
>>
>> Best regards,
>>
>> ciao matthias
>>
>>
>> *Sent:* Friday, 14 December 2018 at 18:55
>> *From:* "Fabrizio Carrai" <[email protected]>
>> *To:* "Discussion list for the Wikidata project" <
>> [email protected]>
>> *Subject:* Re: [Wikidata] Query on scholarly article fails
>> Thanks again to Ettore, but I immediately found another timeout problem
>> when I just added a FILTER to find all the articles with the word "biokis"
>> in the title
>>
>> SELECT ?istanza_di ?instanza_diLabel WHERE {
>>   ?istanza_di wdt:P31 wd:Q13442814.
>>   ?istanza_di rdfs:label ?instanza_diLabel.
>>   FILTER((LANG(?instanza_diLabel)) = "en").
>>   FILTER(CONTAINS(LCASE(?instanza_diLabel), "biokis"))
>> }
>> LIMIT 100
>>
>> At least one article should be returned:
>> https://www.wikidata.org/wiki/Q57202937
>> but I got a timeout.
>>
>> Thanks to anybody that can help
>>
>> Fabrizio
>>
>>
>> On Fri, 14 Dec 2018 at 10:12, Ettore RIZZA <
>> [email protected]> wrote:
>>
>>> Hello Fabrizio,
>>>
>>> It seems that the problem comes from SERVICE wikibase:label. As said in
>>> another discussion, the query executes in less than one second if you
>>> rewrite it in this way
>>> <https://query.wikidata.org/#SELECT%20%3Fistanza_di%20%3Finstanza_diLabel%20WHERE%20%7B%0A%20%20%3Fistanza_di%20wdt%3AP31%20wd%3AQ13442814.%0A%20%20%3Fistanza_di%20rdfs%3Alabel%20%3Finstanza_diLabel.%0A%20%20FILTER%28%28LANG%28%3Finstanza_diLabel%29%29%20%3D%20%22en%22%29%0A%7D%0ALIMIT%2010>
>>> .
>>>
>>> Cheers,
>>>
>>> Ettore Rizza
>>>
>>> On Fri, 14 Dec 2018 at 09:59, Fabrizio Carrai <[email protected]>
>>> wrote:
>>>
>>>> Hello all,
>>>> the following query ends with a timeout:
>>>>
>>>> SELECT ?istanza_di ?istanza_diLabel WHERE {
>>>>   SERVICE wikibase:label { bd:serviceParam wikibase:language
>>>> "[AUTO_LANGUAGE],en". }
>>>>   ?istanza_di wdt:P31 wd:Q13442814.
>>>> }
>>>> LIMIT 10
>>>>
>>>> Can anybody explain why ?
>>>> Thanks in advance
>>>>
>>>> --
>>>> *Fabrizio*
>>>> _______________________________________________
>>>> Wikidata mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>>
>>
>>
>> --
>> *Fabrizio*
>>
>
>
> --
> *Fabrizio*
>


-- 
Hi, do you like citation networks? Already 51% of all citations are
available <https://i4oc.org/> for innovative new uses
<https://twitter.com/hashtag/acs2ioc>. Join me in asking the American
Chemical Society to join the Initiative for Open Citations too
<https://www.change.org/p/asking-the-american-chemical-society-to-join-the-initiative-for-open-citations>.
SpringerNature, the RSC and many others already did
<https://i4oc.org/#publishers>.

-----
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286>
ImpactStory: https://impactstory.org/u/egonwillighagen