Hi folks,

We plan to use Salt in our production environment, and we need to maintain a heartbeat to the Salt server through periodic short-lived connections.

Unfortunately, we found that these heartbeat connections cause the Salt server to leak a significant amount of memory. After further testing, we think the root cause is in *pyzmq* or *libzmq*. We asked for help in the pyzmq community, and they told us it is most likely a libzmq bug.

I'm new to libzmq. I used SystemTap to trace the malloc()/free() calls issued by libzmq during a single heartbeat connection, and the result shows that the calls are unpaired. So either libzmq leaks memory, or the caller above it (pyzmq) does not use the libzmq API correctly.

Here is my test procedure; I hope someone can help us fix the bug:

1) Compile the latest version of libzmq and install it under /usr/local/.
2) Build the latest version of pyzmq and copy "pyzmq/build/lib.linux-x86_64-2.7/zmq" to the "salt/packages/" directory.
3) Start the Salt server: ./salt-master
4) Use *run.sh.ext* (see attachment; it uses the *malloc-free-libzmq.stp* script internally) to monitor the malloc()/free() calls in libzmq.
5) Use *client-request.py* (see attachment) to simulate a heartbeat request to the Salt server.

The *systemtap-libzmq-output.log* file is the output of *run.sh.ext*; it shows that the malloc()/free() calls are unpaired.

NOTE: run.sh would be rejected by the mailing list, so I renamed it to run.sh.ext.

--
Yunkai Zhang
Work at Taobao
malloc-free-libzmq.stp
Description: Binary data
root 11895 23437 0 12:14 pts/6 00:00:00 sudo ./salt-master
root 11896 11895 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
root 11897 11896 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
root 11904 11896 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
root 11905 11896 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
root 11908 11896 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
root 11911 11896 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
root 11914 11896 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
root 11917 11896 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
root 11918 11896 0 12:14 pts/6 00:00:00 python2.7 ./salt-master
tcp 0 0 0.0.0.0:4505 0.0.0.0:* LISTEN 11904/python2.7
[salt process id]: 11904
==Begin==
Malloc, ptr=0x7f8f2c026da0
 0x322685f6fd : operator new(unsigned long, std::nothrow_t const&)+0x1d/0x80 [/usr/lib64/libstdc++.so.6.0.19]
 0x7f8f36a012e7 : zmq::tcp_listener_t::in_event()+0x67/0x230 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c000f70
 0x322685f66d : operator new(unsigned long)+0x1d/0x90 [/usr/lib64/libstdc++.so.6.0.19]
 0x32268bdcf9 : std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&)+0x59/0x80 [/usr/lib64/libstdc++.so.6.0.19]
 0x32268bdee6 : std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, unsigned long)+0x66/0x1d0 [/usr/lib64/libstdc++.so.6.0.19]
 0x32268be4ae : std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long)+0x1e/0x70 [/usr/lib64/libstdc++.so.6.0.19]
 0x7f8f369e0766 : zmq::get_peer_ip_address(int, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&)+0x116/0x120 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fd4c1 : zmq::stream_engine_t::stream_engine_t(int, zmq::options_t const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x6d1/0xda0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a0130e : zmq::tcp_listener_t::in_event()+0x8e/0x230 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c041f00
 0x322685f6fd : operator new(unsigned long, std::nothrow_t const&)+0x1d/0x80 [/usr/lib64/libstdc++.so.6.0.19]
 0x7f8f369f18c9 : zmq::session_base_t::create(zmq::io_thread_t*, bool, zmq::socket_base_t*, zmq::options_t const&, zmq::address_t*)+0xa9/0x150 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a0133d : zmq::tcp_listener_t::in_event()+0xbd/0x230 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Free, ptr=0x0
 0x3217480330 : free+0x0/0xe0 [/usr/lib64/libc-2.18.so]
 0x7f8f369e2ef3 : zmq::mailbox_t::recv(zmq::command_t*, int)+0x143/0x1a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e0269 : zmq::io_thread_t::in_event()+0x49/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c000ce0
 0x322685f66d : operator new(unsigned long)+0x1d/0x90 [/usr/lib64/libstdc++.so.6.0.19]
 0x7f8f369e9b56 : zmq::own_t::process_own(zmq::own_t*)+0xc6/0x100 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cb2 : zmq::object_t::process_command(zmq::command_t&)+0x32/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c0422b0
 0x322685f6fd : operator new(unsigned long, std::nothrow_t const&)+0x1d/0x80 [/usr/lib64/libstdc++.so.6.0.19]
 0x7f8f369ec0c2 : zmq::pipepair(zmq::object_t**, zmq::pipe_t**, int*, bool*)+0x122/0x550 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369f14e5 : zmq::session_base_t::process_attach(zmq::i_engine*)+0xe5/0x260 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cfa : zmq::object_t::process_command(zmq::command_t&)+0x7a/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c042320
 0x7f8f369ec0eb : zmq::pipepair(zmq::object_t**, zmq::pipe_t**, int*, bool*)+0x14b/0x550 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369f14e5 : zmq::session_base_t::process_attach(zmq::i_engine*)+0xe5/0x260 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cfa : zmq::object_t::process_command(zmq::command_t&)+0x7a/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c044b40
 0x322685f6fd : operator new(unsigned long, std::nothrow_t const&)+0x1d/0x80 [/usr/lib64/libstdc++.so.6.0.19]
 0x7f8f369ec04e : zmq::pipepair(zmq::object_t**, zmq::pipe_t**, int*, bool*)+0xae/0x550 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369f14e5 : zmq::session_base_t::process_attach(zmq::i_engine*)+0xe5/0x260 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cfa : zmq::object_t::process_command(zmq::command_t&)+0x7a/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c044bb0
 0x7f8f369ec076 : zmq::pipepair(zmq::object_t**, zmq::pipe_t**, int*, bool*)+0xd6/0x550 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369f14e5 : zmq::session_base_t::process_attach(zmq::i_engine*)+0xe5/0x260 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cfa : zmq::object_t::process_command(zmq::command_t&)+0x7a/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c0473d0
 0x322685f6fd : operator new(unsigned long, std::nothrow_t const&)+0x1d/0x80 [/usr/lib64/libstdc++.so.6.0.19]
 0x7f8f369ec1ab : zmq::pipepair(zmq::object_t**, zmq::pipe_t**, int*, bool*)+0x20b/0x550 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369f14e5 : zmq::session_base_t::process_attach(zmq::i_engine*)+0xe5/0x260 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cfa : zmq::object_t::process_command(zmq::command_t&)+0x7a/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Malloc, ptr=0x7f8f2c047490
 0x322685f6fd : operator new(unsigned long, std::nothrow_t const&)+0x1d/0x80 [/usr/lib64/libstdc++.so.6.0.19]
 0x7f8f369ec1f3 : zmq::pipepair(zmq::object_t**, zmq::pipe_t**, int*, bool*)+0x253/0x550 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369f14e5 : zmq::session_base_t::process_attach(zmq::i_engine*)+0xe5/0x260 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cfa : zmq::object_t::process_command(zmq::command_t&)+0x7a/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Free, ptr=0x7f8f2c000f70
 0x3217480330 : free+0x0/0xe0 [/usr/lib64/libc-2.18.so]
 0x7f8f369fcc60 : zmq::stream_engine_t::~stream_engine_t()+0x2a0/0x3f0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fcdc9 : zmq::stream_engine_t::~stream_engine_t()+0x9/0x20 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fb52f : zmq::stream_engine_t::error()+0x5f/0x100 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fc04e : zmq::stream_engine_t::handshake()+0x38e/0xa60 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fc966 : zmq::stream_engine_t::in_event()+0x246/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fa868 : zmq::stream_engine_t::plug(zmq::io_thread_t*, zmq::session_base_t*)+0x108/0x350 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369f144a : zmq::session_base_t::process_attach(zmq::i_engine*)+0x4a/0x260 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cfa : zmq::object_t::process_command(zmq::command_t&)+0x7a/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Free, ptr=0x7f8f2c026da0
 0x3217480330 : free+0x0/0xe0 [/usr/lib64/libc-2.18.so]
 0x7f8f369fb52f : zmq::stream_engine_t::error()+0x5f/0x100 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fc04e : zmq::stream_engine_t::handshake()+0x38e/0xa60 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fc966 : zmq::stream_engine_t::in_event()+0x246/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369fa868 : zmq::stream_engine_t::plug(zmq::io_thread_t*, zmq::session_base_t*)+0x108/0x350 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369f144a : zmq::session_base_t::process_attach(zmq::i_engine*)+0x4a/0x260 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e7cfa : zmq::object_t::process_command(zmq::command_t&)+0x7a/0x160 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
Free, ptr=0x7f8f2c000ce0
 0x3217480330 : free+0x0/0xe0 [/usr/lib64/libc-2.18.so]
 0x7f8f369e9a6b : zmq::own_t::process_term_req(zmq::own_t*)+0x5b/0x80 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369e025c : zmq::io_thread_t::in_event()+0x3c/0xa0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f369eedda : zmq::poll_t::loop()+0x15a/0x2a0 [/usr/local/lib/libzmq.so.3.1.0]
 0x7f8f36a019c0 : thread_routine+0x30/0xc0 [/usr/local/lib/libzmq.so.3.1.0]
 0x3217807f33 : start_thread+0xc3/0x310 [/usr/lib64/libpthread-2.18.so]
 0x32174f4ded : clone+0x6d/0x90 [/usr/lib64/libc-2.18.so]
--
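To make the pairing check on a log like the one above less error-prone than reading it by eye, something along these lines could be used. It is only a rough sketch: it assumes the records keep the "Malloc, ptr=0x..." / "Free, ptr=0x..." form shown in the log, and the function name unpaired_allocations is made up for illustration, not part of the attached scripts.

```python
import re

# Matches the "Malloc, ptr=0x..." and "Free, ptr=0x..." record headers.
# Stack-frame lines such as "free+0x0/0xe0" do not match this pattern.
RECORD = re.compile(r"\b(Malloc|Free),\s+ptr=(0x[0-9a-fA-F]+)")


def unpaired_allocations(log_text):
    """Pair Malloc/Free records by pointer value (hypothetical helper).

    Returns (live, unknown_frees):
      live          - pointers allocated but not freed within the log
      unknown_frees - pointers freed without a matching Malloc in the log
    """
    live = []
    unknown_frees = []
    for kind, ptr in RECORD.findall(log_text):
        if int(ptr, 16) == 0:
            continue  # free(NULL) is a no-op, so it never needs a partner
        if kind == "Malloc":
            live.append(ptr)
        elif ptr in live:
            live.remove(ptr)  # matched an earlier Malloc of the same address
        else:
            unknown_frees.append(ptr)
    return live, unknown_frees
```

Passing the contents of systemtap-libzmq-output.log to the function gives the list of pointers that were never freed during the capture. Note that a pointer unmatched inside the captured window is not necessarily leaked; it may simply be freed after tracing stopped, so a longer capture spanning many heartbeats would be more conclusive.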
#!/usr/bin/env python2.7
#-*- coding:UTF-8 -*-
#=============================================================================
# FileName: client-request.py
# Desc:
# Author: linxiao.jz
# Email: [email protected]
# Version: 0.1
# LastChange: 2014-04-02 11:35:42
# History:
#=============================================================================
import os
import sys
import cgi
import math
import zlib
import json
import time
import socket
import subprocess
import traceback
from threading import Timer
from threading import Thread
from threading import Lock

RESULT = {}
RESULT['modules'] = {}
RESULT['content'] = {}
LOCK = Lock()
PID_FILE = '/var/run/cmclien.pid'
GIT_URL = '''http://puppet:[email protected]/root/%s.git'''
CONFIG_MAP = {
    'puppet': {
        'DIR': '/etc/puppet/modules/',
        'CLIENT': '/usr/bin/puppet.sh -f',
    },
    'salt' : {
        'DIR': '/etc/salt/base/',
        'CLIENT': '/usr/bin/salt.sh -f',
    }
}

def checkPort(host, port):
    # Open a plain TCP connection and close it right away (the "heartbeat").
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    result = True
    try :
        sock.connect((host, int(port)))
    except socket.error :
        result = False
    finally :
        sock.close()
    return result

# Send one heartbeat to the salt-master publish port.
for x in xrange(1):
    print checkPort('10.69.69.249','4505')
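Since a single short connection only shows a handful of allocations, repeating the heartbeat many times may make any growth in the salt-master process easier to see. The following is a sketch only, not the attached script: check_port mirrors checkPort above but adds a connect timeout, and the heartbeat function with its count and interval parameters is a hypothetical addition.

```python
import socket
import time


def check_port(host, port, timeout=3.0):
    """Open one short TCP connection and close it; True if connect succeeded."""
    try:
        sock = socket.create_connection((host, int(port)), timeout=timeout)
    except socket.error:
        return False
    sock.close()
    return True


def heartbeat(host, port, count, interval=0.0):
    """Run `count` short connections, `interval` seconds apart.

    Returns the number of connections that succeeded.
    (count and interval are illustrative parameters, not from the original.)
    """
    ok = 0
    for i in range(count):
        if check_port(host, port):
            ok += 1
        if interval and i + 1 < count:
            time.sleep(interval)
    return ok
```

For example, heartbeat('10.69.69.249', 4505, 1000) would send 1000 heartbeats to the same host and port used in the script above; comparing the memory usage of the salt-master process before and after such a run should show whether the growth scales with the number of connections.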
run.sh.ext
Description: Binary data
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
