Krinkle updated the task description.

CHANGES TO TASK DESCRIPTION
### Problem

Wikidata validates each newly stored statement against the constraints from the associated property. The properties themselves are user-generated, and one of the possible constraints for text values is a regex pattern.

Due to the impact of potentially malicious regexes, the MediaWiki PHP backend for Wikidata does //not// use PHP's `preg_match`. Instead, we need to isolate this in some way.

The current workaround uses the SPARQL query service, which incurs a lot of overhead (ping, TCP, HTTP, SPARQL parsing, query engine preparation). As a result, checking the format constraint is slow even for benign regexes. We should investigate whether we can check regexes more locally. However, the mechanism should be tightly restricted in order to avoid denial-of-service attacks via malicious regexes.
...
We can’t directly use Lua’s regexes because their syntax is too different from PCRE, and implementing a PCRE-compatible engine in Lua is probably too much work. I’m not familiar enough with Firejail to comment on that option.

### Solutions

* Lua (via PHP binding?)
* PHP program called as a subprocess inside Firejail.
* re2 (<https://github.com/google/re2>), either via a microservice, or via PHP binding.
* ...
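
To make the subprocess option more concrete, here is a rough sketch of the isolation idea, written in Python rather than PHP and without Firejail. It only illustrates the "run the match in a child process and kill it after a time limit" part; the function name `check_format`, the JSON-over-stdin protocol, and the one-second default timeout are assumptions made for this example, not anything Wikidata actually uses. A real setup would also need the sandboxing (memory limits, no network or filesystem access) that Firejail is meant to provide.

```python
import json
import subprocess
import sys

# Child program: reads {"pattern": ..., "text": ...} as JSON on stdin and
# prints "1" if the pattern matches, "0" otherwise.
# Assumption for this sketch: the pattern has to match the whole value.
_CHILD = (
    "import json, re, sys\n"
    "req = json.load(sys.stdin)\n"
    "print(1 if re.fullmatch(req['pattern'], req['text']) else 0)\n"
)


def check_format(pattern, text, timeout=1.0):
    """Return True or False for match / no match, or None if the check failed.

    The regex runs in a separate interpreter process, so a pattern that
    backtracks forever can be killed without affecting the caller.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD],
            input=json.dumps({"pattern": pattern, "text": text}),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return None
    if proc.returncode != 0:
        # The child crashed, for example because the pattern is invalid.
        return None
    return proc.stdout.strip() == "1"


if __name__ == "__main__":
    print(check_format(r"\d{4}", "2017"))
    # A classic catastrophic-backtracking pattern; expected to hit the timeout.
    print(check_format(r"(a+)+$", "a" * 40 + "!"))
```

Starting a process per check has its own overhead, so whether this is actually faster than the SPARQL round trip would have to be measured. The sketch is only meant to show that the time limit can be enforced from outside the regex engine.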

TASK DETAIL
https://phabricator.wikimedia.org/T176312

To: Krinkle
Cc: Halfak, Anomie, Smalyshev, tstarling, daniel, GWicke, Joe, Lucas_Werkmeister_WMDE, Krinkle, Aklapper, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, Agabi10, SBisson, Wikidata-bugs, aude, jayvdb, fbstj, santhosh, Jdforrester-WMF, Mbch331, Rxy, Jay8g, Ltrlg, bd808, Legoktm
_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
