Erlend Hamberg wrote:
> On Saturday 10. October 2009 03.48.29 Guy Harris wrote:
>> The data from the frames in the capture file are not kept in
>> Wireshark's address space - they are read in as necessary, into a
>> small number of buffers (one for the main window, and one for each
>> packet window opened). *HOWEVER*, if data from a frame is reassembled
>> into a higher-level multiple-frame packet, the result of the
>> reassembly is, as noted, kept in Wireshark's address space.
>
> So, when Wireshark reads the capture file, if it finds a single-frame packet,
> it will only create a frame_data structure in memory and possibly data from
> the dissector for that type of packet. But if the packet is made up of several
> frames, the packet is reassembled and kept in memory? If so, do you think this
> could be changed? Would it be worth it?

One thought: per-dissector data usually has to be real memory since the
dissectors access it as, well, memory. The results of reassembly, however,
are (I think always) put into a TVB which you're only allowed[1] to access
via the tvb_ APIs. Couldn't a TVB be backed by something other than memory?
For example, a (non-memory-mapped) file? To keep that from being horrendously
slow, the TVB layer might have to implement some kind of in-memory caching of
the data going to/from the file (so that each tvb_get_guint8() wouldn't
result in a seek plus a 1-byte read). Or maybe the OS would do that well
enough?

[1] tvb_get_ptr() notwithstanding. OK, that is a tvb_ API but it allows you
direct access to the TVB data. Using this API with a file-backed TVB would
require allocating memory and copying it in from disk to return to the user.
BTW, given the big comment about this function in tvbuff.h, I was surprised
to find almost 1300 uses in epan/dissectors/ ...

>> People complain about it enough that, while in *most* cases it might
>> not be a problem, we frequently get mail from people who have to split
>> up capture files to read them - I'd call it enough of a problem that
>> we should work on it (ideally, by reducing the amount of address space
>> required by the aforementioned data items).
>
> Yes, absolutely.
>
> It would still be nice if it would be possible for people to analyse more data
> than will fit in virtual memory (in the case of Linux/Solaris, etc. where the
> swap space is fixed). I see that there is an "abstraction" of memory
> allocation in epan/emem.c (se_alloc* and friends), but g_malloc, and plain
> malloc is used as well, it seems.
> If the functions in emem.c were used for all memory allocation/freeing, that
> would mean that this could be done by intercepting requests for memory in
> those functions.

You mean by sending them to memory-mapped files?
Unless, as Guy pointed out, there's some way to tell the OS to swap out that
memory before normal memory, I think that once you start swapping the UI is
(still) going to become unusable.

> What is the status on the use of these functions? I got the impression from
> README.malloc that these are recommended, but I mostly see allocations done
> using g_malloc. Or is that just allocations that should outlive a capture
> session?

Yes, those functions "should" normally be used. But there are good reasons
not to: for example if we know we're allocating a bunch of memory and we'll
free it after the current frame is dissected (so we can't use ep_ memory) but
before the file is closed (so using se_ memory would mean the allocation
sticks around longer than it needs to). The reassembly code uses g_malloc()
(presumably) for this reason. Another reason, of course, is that the ep_ and
se_ allocators are (relatively) new.
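
To make the file-backed TVB idea above a bit more concrete, here is a rough
sketch of a byte accessor that keeps one page of the backing file in memory,
so that a run of single-byte reads costs one seek and one read rather than one
per byte. This is not Wireshark code: file_buf, FB_PAGE_SIZE and the
file_buf_* functions are names made up for illustration, and a real version
would have to sit behind the existing tvb_ accessors.

```c
#include <assert.h>
#include <stdio.h>

#define FB_PAGE_SIZE 4096  /* hypothetical cache granularity */

/* Hypothetical file-backed buffer; NOT part of Wireshark's tvbuff API. */
typedef struct {
    FILE         *fp;          /* backing file holding the reassembled data */
    long          length;      /* total number of bytes in the backing file */
    long          page_start;  /* file offset of the cached page, -1 if none */
    size_t        page_len;    /* number of valid bytes in the cached page */
    unsigned long disk_reads;  /* how many times the file was actually read */
    unsigned char page[FB_PAGE_SIZE];
} file_buf;

/* Attach to an already-open file of known length; the cache starts empty. */
static void file_buf_init(file_buf *fb, FILE *fp, long length)
{
    fb->fp = fp;
    fb->length = length;
    fb->page_start = -1;
    fb->page_len = 0;
    fb->disk_reads = 0;
}

/* Return the byte at 'offset', loading the containing page on a cache miss.
 * Returns -1 on an I/O error or short read. */
static int file_buf_get_uint8(file_buf *fb, long offset)
{
    assert(offset >= 0 && offset < fb->length);

    if (fb->page_start < 0 || offset < fb->page_start ||
        offset >= fb->page_start + (long)fb->page_len) {
        long start = offset - (offset % FB_PAGE_SIZE);

        if (fseek(fb->fp, start, SEEK_SET) != 0)
            return -1;
        fb->page_len = fread(fb->page, 1, FB_PAGE_SIZE, fb->fp);
        fb->page_start = start;
        fb->disk_reads++;
        if (offset >= start + (long)fb->page_len)
            return -1;  /* the file was shorter than expected */
    }
    return fb->page[offset - fb->page_start];
}

/* Copy 'len' bytes into caller-owned memory, which is roughly what a
 * tvb_get_ptr()-style call would have to do for a file-backed buffer.
 * Returns 0 on success, -1 on error. */
static int file_buf_memcpy(file_buf *fb, unsigned char *dst,
                           long offset, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        int b = file_buf_get_uint8(fb, offset + (long)i);
        if (b < 0)
            return -1;
        dst[i] = (unsigned char)b;
    }
    return 0;
}
```

With a single cached page, sequential access stays cheap, but a dissector
that jumps back and forth across a page boundary would reread the file on
every jump; a real implementation would presumably want several pages, or
could just lean on the OS buffer cache as suggested above.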
