> Wrt. to using libxml2's regexp implementation, it seems implement only
> a very basic dialect of regular expressions. AFAIK it doesn't even
> support captures (which are probably the premier reason to use
> regexp:match(...), and not just regexp:test() ...)
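
To make the test/match distinction concrete, here is a minimal sketch. It uses Python's re module purely as a stand-in: it is neither libxml2's xmlregexp nor PCRE, and regexp_test / regexp_match are hypothetical names that only mimic what regexp:test() and a non-global regexp:match() are meant to return.

```python
import re


def regexp_test(text, pattern):
    """Hypothetical analogue of regexp:test(): reports only whether the pattern matches."""
    return re.search(pattern, text) is not None


def regexp_match(text, pattern):
    """Hypothetical analogue of a non-global regexp:match():
    the whole match first, then one entry per captured group."""
    m = re.search(pattern, text)
    if m is None:
        return []
    return [m.group(0), *m.groups()]


pattern = r"(\d{4})-(\d{2})-(\d{2})"
text = "released 2017-03-09"

found = regexp_test(text, pattern)    # a yes/no answer, nothing more
parts = regexp_match(text, pattern)   # the matched date plus year, month, day
```

Going by the quoted message, libxml2's xmlregexp offers only the first of these.
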
That's unfortunate. If libxml2 already includes its own regexp implementation, then there is little chance that a branch relying on an external regex implementation would ever be merged.

But is it true that the current regexp implementation could never support captures? When I look at the spec [1] that xmlregexp.c is based on, captures aren't even mentioned, only testing. But when I look at xmlregexp.c itself, the parser apparently handles parentheses gracefully. What I still haven't worked out is whether the automaton transitions generated for those groups could be used to identify captures at execution time. xmlregexp.c is big, and I would need more time to form an opinion on whether captures are possible. I also have no idea how much they would complicate backtracking. Maybe the authors have an opinion on that?

> https://github.com/smilingthax/songtools/tree/master/xsltlib/regexp/
> [...]
> Feel free to fork my implementation into a separate/standalone
> repositiory or even integrate it into lib(e)xslt...

Thanks! My use case only needs match(), and Unicode support is a must. Since I can't be sure that --fileparam will ever be merged, I'll write a little command-line tool anyway to process stylesheets with files passed as parameters. I would then just have to port your PCRE-based regex extension into it. That should do it.

[1] https://www.w3.org/TR/xmlschema-2/#regexs
_______________________________________________
xslt mailing list, project page http://xmlsoft.org/XSLT/
xslt@gnome.org
https://mail.gnome.org/mailman/listinfo/xslt