despens created this task.
despens added projects: Wikibase, Wikibase-Containers.
Restricted Application added a subscriber: Aklapper.
Restricted Application added projects: Wikidata, wdwb-tech-focus.

TASK DESCRIPTION
  When a wiki uses a non-standard $wgArticlePath (that is, anything other than
  `/wiki/$1`), the automatic process that feeds Wikibase data into the query
  service stops working.
  
  The error message when running munge:
  
    0:16:33.843 [main] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized subjects: [
  
  Thousands of statement URIs follow, all of the form
  https://artbase.rhizome.org/entity/statement/Q4198-xxxxxx, and then the error
  ends with:
  
    ] while processing https://artbase.rhizome.org/entity/null.  Expected only sitelinks and subjects starting with https://artbase.rhizome.org/wiki/Special:EntityData/ and [https://artbase.rhizome.org/entity/]
  
  This happens on a wiki configured with $wgArticlePath set to `/$1`, with
  matching information in the `sites` table and an Apache .htaccess adjusted
  to handle the URL routes.
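
  To illustrate the suspected mismatch, here is a minimal sketch. It assumes
  that the Special:EntityData URL follows $wgArticlePath and that the Munger
  compares it against the `/wiki/Special:EntityData/` prefix quoted in the
  error. The function names are hypothetical and this is not the Munger's
  actual code; whether this mismatch is what leaves the entity URI as `null`
  is also an assumption.

```python
HOST = "https://artbase.rhizome.org"

# Prefix quoted in the error message; assumed here to be fixed to /wiki/.
EXPECTED_DATA_PREFIX = HOST + "/wiki/Special:EntityData/"


def entity_data_url(host, article_path, entity_id):
    """Build the Special:EntityData URL for an entity under a given $wgArticlePath."""
    return host + article_path.replace("$1", "Special:EntityData/" + entity_id)


def matches_expected_prefix(url):
    """Return True if the URL starts with the prefix named in the error message."""
    return url.startswith(EXPECTED_DATA_PREFIX)


default_url = entity_data_url(HOST, "/wiki/$1", "Q4198")
custom_url = entity_data_url(HOST, "/$1", "Q4198")

print(default_url, matches_expected_prefix(default_url))
print(custom_url, matches_expected_prefix(custom_url))
```

  Under these assumptions only the URL built from the default `/wiki/$1`
  matches the expected prefix; the one built from `/$1` does not.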
  
  After changing $wgArticlePath back to the default `/wiki/$1` (including
  restoring the information in the `sites` table and reverting to the default
  .htaccess), the Munger
<https://github.com/wikimedia/wikidata-query-rdf/blob/master/tools/src/main/java/org/wikidata/query/rdf/tool/rdf/Munger.java>
  process completes without any issues.
  
  It is unclear to me why the Munger verifies sitelinks at all (or does any
  data validation), but if it needs to, it should check them against the
  linked wiki's article path, which can be retrieved through the MediaWiki API
<https://artbase.rhizome.org/w/api.php?action=query&meta=siteinfo&siprop=general>.
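
  As a rough sketch of that idea, the expected prefix could be derived from
  the `articlepath` and `server` fields of the siteinfo response instead of
  being fixed to `/wiki/`. The helper name is hypothetical, the sample
  response is a trimmed illustration rather than real output from the wiki,
  and a real request would also need `format=json` added to the URL above.

```python
import json

# Trimmed, illustrative siteinfo response (not fetched from the live wiki).
SAMPLE_RESPONSE = json.loads("""
{
  "query": {
    "general": {
      "server": "https://artbase.rhizome.org",
      "articlepath": "/$1"
    }
  }
}
""")


def entity_data_prefix(siteinfo):
    """Derive the Special:EntityData prefix from a siteinfo 'general' block."""
    general = siteinfo["query"]["general"]
    server = general["server"]
    # MediaWiki may report a protocol-relative server such as //example.org.
    if server.startswith("//"):
        server = "https:" + server
    return server + general["articlepath"].replace("$1", "Special:EntityData/")


print(entity_data_prefix(SAMPLE_RESPONSE))
```

  With the sample values this yields a prefix without the `/wiki/` segment,
  which is what a wiki configured with `/$1` would need.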
  
  My suggestion is that the existing command-line switch `--skipSiteLinks`
  should at least disable the format check on sitelinks, since in that case
  they are not going to be exported to the query service anyway. Preferable
  would be a separate switch that accepts sitelinks in any form.

TASK DETAIL
  https://phabricator.wikimedia.org/T274354

To: despens
Cc: despens, Aklapper, Samantha_Alipio_WMDE, Akuckartz, darthmon_wmde, Jelabra, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Asahiko, Wikidata-bugs, aude, Lydia_Pintscher, 
Addshore, Mbch331