https://bugzilla.wikimedia.org/show_bug.cgi?id=51457

       Web browser: ---
            Bug ID: 51457
           Summary: Excessive backtracking in
                    attribute_preprocessor_text_line when parsing table
                    cell
           Product: Parsoid
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: tokenizer
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
    Classification: Unclassified
   Mobile Platform: ---

Several busy ('hanging') workers in production were stuck backtracking while
parsing pathological tables on the following page:
http://el.wikipedia.org/wiki/%CE%A0%CE%BF%CF%81%CE%B5%CE%AF%CE%B1_%CF%84%CF%89%CE%BD_%CE%BA%CF%85%CF%80%CF%81%CE%B9%CE%B1%CE%BA%CF%8E%CE%BD_%CE%BF%CE%BC%CE%AC%CE%B4%CF%89%CE%BD_%CF%83%CF%84%CE%B1_%CE%BA%CF%8D%CF%80%CE%B5%CE%BB%CE%BB%CE%B1_%CE%95%CF%85%CF%81%CF%8E%CF%80%CE%B7%CF%82
I tracked this down by attaching the node debugger to those workers.

Backtracking when parsing table cells with optional attributes is hard to
avoid, but in this case there might also be a bug in how the cache key for
memoization is constructed. The large number of quotes on this page
additionally slows down the parsing of potential attributes.
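
To illustrate why the cache key matters, here is a minimal sketch of a
memoizing backtracking parser. It is not Parsoid's tokenizer: the grammar,
the class ToyParser and the helpers good_key, bad_key and count_calls are all
hypothetical. The assumption is that a key containing anything that differs
between attempts (here, a call counter) never produces a cache hit, so the
parser backtracks as if it had no memoization at all.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[object, ...]
KeyFn = Callable[["ToyParser", str, int], Key]


class ToyParser:
    """Toy backtracking parser with a memo table (hypothetical, not Parsoid)."""

    def __init__(self, text: str, key_fn: Optional[KeyFn]) -> None:
        self.text = text
        self.key_fn = key_fn  # None disables memoization entirely
        self.cache: Dict[Key, Optional[int]] = {}
        self.calls = 0  # number of rule bodies actually executed

    def apply(self, rule: str, pos: int) -> Optional[int]:
        """Run a rule at pos; return the end position, or None on failure."""
        key = self.key_fn(self, rule, pos) if self.key_fn else None
        if key is not None and key in self.cache:
            return self.cache[key]
        self.calls += 1
        result = getattr(self, "rule_" + rule)(pos)
        if key is not None:
            self.cache[key] = result
        return result

    def rule_item(self, pos: int) -> Optional[int]:
        # item <- "(" item ")" "a" / "(" item ")" "b" / "x"
        # Both alternatives start by parsing the same nested item, so the
        # second alternative repeats all the work of the first one.
        for suffix in ("a", "b"):
            if self.text.startswith("(", pos):
                inner = self.apply("item", pos + 1)
                if inner is not None and self.text.startswith(")" + suffix, inner):
                    return inner + 2
        if self.text.startswith("x", pos):
            return pos + 1
        return None


def good_key(parser: ToyParser, rule: str, pos: int) -> Key:
    # Only what determines the result: rule name and input position.
    return (rule, pos)


def bad_key(parser: ToyParser, rule: str, pos: int) -> Key:
    # Assumed bug: the key includes a value that changes on every attempt,
    # so lookups never match a previously stored entry.
    return (rule, pos, parser.calls)


def count_calls(depth: int, key_fn: Optional[KeyFn]) -> Tuple[Optional[int], int]:
    """Parse a nested input that always needs the second alternative."""
    text = "(" * depth + "x" + ")b" * depth
    parser = ToyParser(text, key_fn)
    end = parser.apply("item", 0)
    return end, parser.calls


if __name__ == "__main__":
    print(count_calls(10, good_key))  # (31, 11)
    print(count_calls(10, bad_key))   # (31, 2047)
    print(count_calls(10, None))      # (31, 2047)
```

With a correct key the number of rule evaluations grows linearly with the
nesting depth; with the broken key it doubles per level, the same as with no
cache.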

I have some WIP code that speeds things up a lot by not attempting to parse
attributes with clearly invalid names, but I get some test failures in cases
where the PHP parser simply strips invalid attribute names. This needs further
investigation.
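
The difference between the two behaviours can be sketched as follows. This is
only an illustration under assumptions: NAME_RE is a guess at what a "clearly
valid" attribute name looks like, PAIR_RE is a simplistic name=value splitter,
and the function names are hypothetical. Neither function reproduces the
actual Parsoid or PHP parser logic.

```python
import re
from typing import List, Optional, Tuple

# Assumed shape of a plausible attribute name; the real rule may differ.
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.-]*")
# Simplistic name=value splitter, for illustration only.
PAIR_RE = re.compile(r"""(\S+?)=("[^"]*"|'[^']*'|\S+)""")


def parse_pairs(text: str) -> List[Tuple[str, str]]:
    """Split text into (name, value) pairs without validating the names."""
    return [(name, value.strip("\"'")) for name, value in PAIR_RE.findall(text)]


def first_name_is_plausible(text: str) -> bool:
    """Cheap check: does the token before the first '=' look like a name?"""
    if "=" not in text:
        return False
    head = text.split("=", 1)[0].split()
    return bool(head) and NAME_RE.fullmatch(head[-1]) is not None


def attrs_bail_out(text: str) -> Optional[List[Tuple[str, str]]]:
    """Fast path: give up on attribute parsing if the first name is invalid."""
    if not first_name_is_plausible(text):
        return None
    return parse_pairs(text)


def attrs_strip_invalid(text: str) -> List[Tuple[str, str]]:
    """Lenient path: parse everything, then drop pairs with invalid names."""
    return [(n, v) for n, v in parse_pairs(text) if NAME_RE.fullmatch(n)]


if __name__ == "__main__":
    sample = '"foo"=1 class="x"'
    print(attrs_bail_out(sample))       # None
    print(attrs_strip_invalid(sample))  # [('class', 'x')]
```

On the sample input the fast path rejects the whole attribute block, while the
lenient path keeps class="x". A mismatch of this kind is one plausible source
of the failing tests.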

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
