Thompsonbry.systap added a comment.

In terms of the machine shape, the general guidelines you give are appropriate. 
 However, here is how it plays out in terms of GC.  Large heaps => long GC 
pauses.  So you want to keep the JVM heap fairly small (4G => 8G).  Analytic 
queries can use the native C process heap for hash index joins and (in the 
future) for storing intermediate solutions.  So the actual C process heap (for 
the JVM) can be bigger.  If you are bulk loading data then you want more write 
cache buffers. Those are 1MB buffers. You can have 6 => 1000s.  This also helps 
for bulk load onto disks that cannot reorder writes (SATA).

The rest of that RAM is going to buffer the file system and decrease IO Wait.
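
As a rough illustration of the sizing above, here is a minimal sketch of the 
arithmetic.  The function name, its parameters and the example machine (64 GiB 
of RAM) are made up for illustration; only the 1MB buffer size and the 4G => 8G 
heap guideline come from this comment.  It also assumes the write cache buffers 
and the native heap are accounted for separately from the JVM heap.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Each write cache buffer is 1MB, per the comment above.
WRITE_CACHE_BUFFER_BYTES = 1 * MIB


def memory_budget(total_ram_gib, jvm_heap_gib, write_cache_buffers,
                  native_heap_gib=0.0):
    """Return a rough RAM split in GiB (hypothetical helper, not a Blazegraph API).

    Whatever is not claimed by the JVM heap, the write cache buffers, or the
    native (C process) heap is left to the OS to buffer the file system.
    """
    write_cache_gib = write_cache_buffers * WRITE_CACHE_BUFFER_BYTES / GIB
    fs_cache_gib = total_ram_gib - jvm_heap_gib - write_cache_gib - native_heap_gib
    if fs_cache_gib < 0:
        raise ValueError("budget exceeds total RAM")
    return {
        "jvm_heap_gib": float(jvm_heap_gib),
        "write_cache_gib": write_cache_gib,
        "native_heap_gib": float(native_heap_gib),
        "fs_cache_gib": fs_cache_gib,
    }


# Example: 64 GiB machine, 8 GiB JVM heap, 2048 write cache buffers,
# and an assumed 4 GiB allowance for native hash joins.
budget = memory_budget(64, 8, 2048, native_heap_gib=4)
print(budget["write_cache_gib"])  # 2.0
print(budget["fs_cache_gib"])     # 50.0
```

The point of the exercise is only that a small JVM heap leaves most of the 
machine to the file system cache.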

Some of our customers also use warmup procedures to avoid cold start 
performance.  There are a couple of aspects of the cold start issue.  One is 
just that things are slow because they are on the disk.  Another is that the 
JVM has not yet optimized the running code.  However, yet another impact is that 
the data has a longer dwell time during query execution because it takes longer 
to execute the query. This makes the GC overhead higher for cold disks / cold 
JVM scenarios.

One warmup procedure is just to copy the journal file to /dev/null.  Just get it 
into the OS cache.
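
The same effect can be had from a script.  This is a minimal sketch, assuming 
that reading the file sequentially and discarding the bytes is enough to pull 
it into the OS cache (the same assumption as the copy above).  The function 
name and the chunk size are hypothetical, and the journal path is whatever 
your deployment uses.

```python
def warm_file(path, chunk_size=8 * 1024 * 1024):
    """Read a file front to back and discard the data.

    Returns the number of bytes read.  The only purpose is the side effect
    of the OS caching the pages that were touched.
    """
    total = 0
    # buffering=0 avoids a second, Python-level buffer; the data is thrown away.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


# Usage (the path is a placeholder, not a real default):
#   warm_file("/path/to/your/journal/file")
```

Whether the whole journal stays cached depends on how much RAM is left for the 
file system.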

Another is to run http://.../bigdata/status?dumpJournal&dumpPages=true.  This 
will run through all of the indices, visit all of their pages, and provide 
some interesting reporting.  We have been discussing a warmup procedure based 
on this but which only visits the non-leaf nodes of the indices.  After that 
warmup any leaf would just be a single IO.  That should eliminate most of the 
IO Wait and GC burden associated with slamming a cold node.

And if you are load balancing across nodes, then you can obviously just load 
balance based on metrics and gradually shift more load to a node as it heats 
up.
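
A minimal sketch of that idea, assuming the metric is a recent average query 
latency per node and that traffic is split in inverse proportion to it.  The 
function names, the node names and the choice of metric are all illustrative 
assumptions, not something any particular load balancer provides.

```python
import random


def load_shares(latency_ms_by_node):
    """Turn per-node latencies into traffic shares that sum to 1.

    A cold node reports high latency and so gets a small share; as it heats
    up and its latency drops, its share grows on the next recalculation.
    """
    inverse = {node: 1.0 / ms for node, ms in latency_ms_by_node.items()}
    total = sum(inverse.values())
    return {node: value / total for node, value in inverse.items()}


def pick_node(shares, rng=random):
    """Pick one node at random, weighted by its current share."""
    nodes = list(shares)
    return rng.choices(nodes, weights=[shares[n] for n in nodes], k=1)[0]


# Example: a warm node at 50 ms and a freshly started node at 200 ms.
# The warm node gets roughly 80% of the traffic, the cold one roughly 20%.
shares = load_shares({"warm-node": 50.0, "cold-node": 200.0})
print(pick_node(shares))
```

Recomputing the shares periodically from fresh metrics gives the gradual shift 
described above.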

Thanks,
Bryan


TASK DETAIL
  https://phabricator.wikimedia.org/T90116

To: Smalyshev, Thompsonbry.systap
Cc: Thompsonbry.systap, Beebs.systap, Haasepeter, Aklapper, Manybubbles, 
jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, GWicke, daniel, JanZerebecki


