ori added a comment.

I rebuilt libjemalloc1 with `--prof-enabled`:


```
lang=bash
$ apt-get source libjemalloc1
$ cd jemalloc-3.5.1
$ cat >> debian/rules
override_dh_auto_configure:
        dh_auto_configure -- --enable-prof
$ dch --distribution trusty-wikimedia --local wmf "Rebuilt with --enable-prof"
$ sudo dpkg-buildpackage -b -uc
```

Then I installed the resultant package on mw1114 and mw1189, the HHVM API 
servers. (It's on the API servers that the leak is most evident.)

I then enabled jemalloc's heap profiling capabilities via `/etc/malloc.conf`. 
The way to do it is to make that file a symlink whose target is an option 
string rather than an actual file. Strange!

```
lang=sh
$ ln -sf "prof:true,prof_gdump:true,lg_prof_sample:21,prof_prefix:/tmp/heaps/
```

  - `prof_gdump` tells jemalloc to dump a memory profile every time the total 
virtual memory exceeds the previous maximum.
  - `lg_prof_sample` specifies the average interval (log base 2) between 
allocation samples, as measured in bytes of allocation activity. 21 = 2 MiB. 
This is granular enough to be useful while keeping the computational overhead 
light.
  - `prof_prefix` sets the filename prefix for profile dumps. If unset, dumps 
are written to the process's working directory -- in this case `/run/hhvm`.

`prof_gdump` ends up being less useful than I thought. Because virtual memory 
usage is steadily increasing, there is a new high water mark every few seconds, 
so I end up with a lot of noise.

The next thing I try is to use the admin server API to request a heap profile 
every ten minutes.

```
lang=sh
#!/bin/bash
# Request a heap profile from HHVM once every ten minutes.
# Profiles are written to /tmp/heaps.

sudo -u apache -g apache mkdir -m0755 -p /tmp/heaps

curl "localhost:9002/jemalloc-prof-activate"

for (( i=1; ; i++ )); do
    curl "localhost:9002/jemalloc-prof-dump?file=/tmp/heaps/${i}.heap"
    sleep 600
done
```

My plan is to use pprof's `--base` arg to subtract a heap profile from its 
successor. The basic invocation looks like this:

```
$ google-pprof `which hhvm` --show_bytes --base=1.heap 2.heap
```

This puts you in interactive mode. You can run 'top' to see the top 10 
routines, etc.

I hack together a simple shell script to help:

```
lang=sh
#!/bin/bash
for (( i=1; ; i++ )); do
    base="$((i)).heap" prof="$((i+1)).heap"
    [ -e "$prof" ] || break
    echo "-------------------- $base => $prof --------------------"
    google-pprof `which hhvm` --show_bytes --text --base="$base" "$prof" 
2>/dev/null | head -6
done
```

Here's the output for an hour's worth of heap profiles:

```
-------------------- 1.heap => 2.heap --------------------
Total: 573794313 B
402653184  70.2%  70.2% 402653184  70.2% HPHP::AsyncFuncImpl::start
48235840   8.4%  78.6% 56624736   9.9% xmlNewText
20972160   3.7%  82.2% 94374384  16.4% xmlSAX2StartElement
18913696   3.3%  85.5% 18913696   3.3% 00007fe80e4343b8
16777520   2.9%  88.5% 16777520   2.9% xmlStrndup
-------------------- 2.heap => 3.heap --------------------
Total: 483619254 B
402653184  83.3%  83.3% 402653184  83.3% HPHP::AsyncFuncImpl::start
50333056  10.4%  93.7% 56624648  11.7% xmlNewText
25166592   5.2%  98.9% 25166592   5.2% xmlNewNodeEatName
23069376   4.8% 103.6% 106957784  22.1% xmlSAX2StartElement
 8388760   1.7% 105.4%  8388760   1.7% xmlStrndup
-------------------- 3.heap => 4.heap --------------------
Total: 139819215 B
44041360  31.5%  31.5% 50332872  36.0% xmlNewText
43129392  30.8%  62.3% 43129392  30.8% HPHP::MemoryManager::newSlab
12583296   9.0%  71.3% 81790932  58.5% xmlSAX2StartElement
12583296   9.0%  80.3% 12583296   9.0% xmlNewNodeEatName
12582979   9.0%  89.3% 12582979   9.0% xmlStrndup
-------------------- 4.heap => 5.heap --------------------
Total: 1258200573 B
1207959552  96.0%  96.0% 1207959552  96.0% HPHP::AsyncFuncImpl::start
33555344   2.7%  98.7% 39846844   3.2% xmlNewText
12609571   1.0%  99.7% 12609571   1.0% 00007fe80e4343b8
12605141   1.0% 100.7% 12605141   1.0% pcre_compile2
12583296   1.0% 101.7% 12583296   1.0% xmlNewNodeEatName
-------------------- 5.heap => 6.heap --------------------
Total: 129656509 B
35652544  27.5%  27.5% 37749720  29.1% xmlNewText
33176456  25.6%  53.1% 33176456  25.6% HPHP::MemoryManager::newSlab
23069376  17.8%  70.9% 23069376  17.8% xmlNewNodeEatName
14694959  11.3%  82.2% 14694959  11.3% 00007fe80e4343b8
14680512  11.3%  93.5% 81791088  63.1% xmlSAX2StartElement
```

Working with XML could be the most common, memory-intensive things the API 
servers do, so the regular appearance of `xmlNewText` in these lists //could// 
be innocent, but it looks suspicious to me.

I'm out of time for now, but the next thing I plan to set a breakpoint on 
`xmlNewText` in gdb and capture some backtraces.

TASK DETAIL
  https://phabricator.wikimedia.org/T820

REPLY HANDLER ACTIONS
  Reply to comment or attach files, or !close, !claim, !unsubscribe or !assign 
<username>.

To: tstarling, ori
Cc: wikibugs-l, tstarling, ori



_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to