Dear All,

 I need some help tracking down a DB deadlock on a single-server,
 "private" Wiki installation running latest MediaWiki download
 from http://www.mediawiki.org/wiki/Download

 I have been referred to this mailing list from #mediawiki at
 irc.freenode.org, thanks for the pointer!


 Installed versions:
 -------------------

 - MediaWiki: 1.23.2

 - VisualEditor from Git, branch REL1_23, ID 9883566

 - Parsoid from Git, branch REL1_23, ID 8da3673

 - parsoid service from Git, branch master, ID df35443

 - mysql: 5.5.35, Debian wheezy package

 - apache:  2.2.22, Debian wheezy package

 - PHP: 5.4.4, Debian wheezy package

 - OS: Linux  3.2.0-4-686-pae #1 SMP Debian 3.2.57-3 i686 GNU/Linux

 MediaWiki, Parsoid (nodejs) service and mysql database are all on the
 same server.  Setup followed the instructions on
 https://www.mediawiki.org/wiki/Parsoid/Setup


 Problem description:
 --------------------

 Logging in, reading and editing using the standard (not visual) edit
 works fine.

 Clicking "Edit", which starts the visual editor, results in no
 reaction at all for approx 5 seconds, then the page goes dim and a
 progress indicator appears at the top right corner.

 After another approx. 100 seconds, the page returns to normal state
 as if opened or reading.


 Directly accessing the Parsoid service, here running on port 8000,
 promptly delivers the page.


 Preliminary investigation:
 --------------------------

 Enabling debugging in LocalSettings.php and tweaking wfDebugTimer()
 to get absolute millisecond timestamps and client address/port info
 in the log file showed the following sequences of activities
 (anonymized log files available if needed):

 Summarized:

 - client request page by calling
   "GET /wiki/api.php?format=json&action=visualeditor&paction=parse..."

 - MediaWiki starts processing and class the Parsoid service with
   "GET /testwiki/TestPage?oldid=4053 HTTP/1.0"

 - Parsoid service calls
   "GET /wiki/api.php?format=json&action=query&prop=revisions&..."

 - call from Parsoid server into MediaWiki API deadlocks with
   client call into API.

 The question is why is it deadlocking and how to prevent the deadlock.


 Detailed activities analysis:

 - (1) client accesses
   "GET /wiki/api.php?format=json&action=visualeditor&paction=parse..."

 - (2) MediaWiki starts some session bookkeeping, including opening a
   DB transaction with "BEGIN" and performing several queries in
   the transaction.

 - (3) MediaWiki/1.23.2 calls the parsoid service with
   "GET /testwiki/TestPage?oldid=4053 HTTP/1.0"

 - (4) Parsoid/0.1 calls
   "GET /wiki/api.php?format=json&action=query&prop=revisions&..."

 - (5) MediaWiki starts processing (in a new apache/PHP slave) and opens
   a new DB transaction with "BEGIN" performing the same initial
   "UPDATE `wiki_user` SET user_touched ..." query as first query in
   the transaction as in step (2).

 - (6) approx. 40 seconds pass

 - (7) Parsoid service drops connection from (4) to MediaWiki
   (TCP FIN packet observed by tcpdump while capturing the on-wire
   conversation between MediaWiki and the client / the Parsiod service)

 - (8) Parsoid/0.1 calls again
   "GET /wiki/api.php?format=json&action=query&prop=revisions&..."
   from a new connection (client port number changed)

 - (9) MediaWiki starts processing (in a new apache/PHP slave) and opens
   a new DB transaction with "BEGIN" performing the same initial
   "UPDATE `wiki_user` SET user_touched ..." query as first query in
   the transaction as in step (2) and (5).

 - (10) approx. 10 seconds pass

 - (11) MediaWiki receives a
   "SQL ERROR: Lock wait timeout exceeded; try restarting transaction"
   relating to transaction from step (5)

 - (12) MediaWiki attempts a ROLLBACK of (5) and tries to send a
   "HTTP 500" (content captured by tcpdump) but the connection is
   already dead.

 - (13) approx. 30 seconds pass

 - (14) Parsoid service drops connection from (8) to MediaWiki
   (TCP FIN packet observed by tcpdump while capturing the on-wire
   conversation between MediaWiki and the client / the Parsiod service)

 - (15) Parsoid/0.1 calls again
   "GET /wiki/api.php?format=json&action=query&prop=revisions&..."
   from a new connection (client port number changed)

 - (16) MediaWiki starts processing (in a new apache/PHP slave) and
   opens a new DB transaction with "BEGIN" performing the same initial
   "UPDATE `wiki_user` SET user_touched ..." query as first query in
   the transaction as in step (2), (5) and (9).

 - (17) approx. 10 seconds pass

 - (18) MediaWiki receives a
   "SQL ERROR: Lock wait timeout exceeded; try restarting transaction"
   relating to transaction from step (9)

 - (19) MediaWiki attempts a ROLLBACK of (9) and tries to send a
   "HTTP 500" (content captured by tcpdump) but the connection is
   already dead.

 - (20) approx. 10 seconds pass

 - (21) MediaWiki closes the connection to the Parsoid service from
   step (3), without any data exchange after receiving the initial
   "GET /testwiki/TestPage?oldid=4053 HTTP/1.0"

 - (22) MediaWiki performs a DB "COMMIT" in the original client session
   from step (2), after 104 seconds.

 - (23) MediaWiki continues processing the session from step (16), at
   least in part executing the same set of queries as in step (2).

 - (24) MediaWiki performs a DB "COMMIT" in the Parsoid client session
   from step (15), after 20 seconds.

 - (25) MediaWiki sends a "HTTP 200" with a JSON body to the Parsoid
   service as answer to request from step (15).

 - (26) Parsoid service logs messages "starting parsing", a parse tree
   and "completed parsing in 101234 ms". (Note: It is unclear when in
   the range (21) to (26) these messages are logged, but I think this
   position "(26)" is most logical.)  At earlier times, Parsoid logs
   the two messages "Failed API request, 8 retries remaining." and
   "Failed API request, 7 retries remaining." and not precisely known
   time steps, probably between (7)/(8) and (14)/(15), respectively.



 Question:
 ---------

 What could cause this behavior and how should I configure my system to
 prevent the deadlocks?  If this is a Bug in either MediWiki or the
 VisualEditor or Parsoid, how to further investigate and fix it?


 Best regards

    Björn


-- 
|     Bjoern Kahl   +++   Siegburg   +++    Germany     |
| "googlelogin@-my-domain-"   +++   www.bjoern-kahl.de  |
| Languages: German, English, Ancient Latin (a bit :-)) |

Attachment: signature.asc
Description: OpenPGP digital signature

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Reply via email to