Erlend Hamberg wrote:
> On Saturday 10. October 2009 03.48.29 Guy Harris wrote:
>> The data from the frames in the capture file are not kept in
>> Wireshark's address space - they are read in as necessary, into a
>> small number of buffers (one for the main window, and one for each
>> packet window opened).  *HOWEVER*, if data from a frame is reassembled
>> into a higher-level multiple-frame packet, the result of the
>> reassembly is, as noted, kept in Wireshark's address space.
> 
> So, when Wireshark reads the capture file, if it finds a single-frame packet,
> it will only create a frame_data structure in memory and possibly data from
> the dissector for that type of packet. But if the packet is made up of
> several frames, the packet is reassembled and kept in memory? If so, do you
> think this could be changed? Would it be worth it?

One thought: per-dissector data usually has to be real memory since the 
dissectors access it as, well, memory.

The results of reassembly, however, are (I think always) put into a TVB 
which you're only allowed[1] to access via the tvb_ APIs.  Couldn't a 
TVB be backed by something other than memory?  For example, a 
(non-memory-mapped) file?

To make it not be horrendously slow, the TVB layer might have to 
implement some kind of in-memory caching of the stuff going to/from the 
file (so that each tvb_get_guint8() wouldn't result in a seek plus a 
1-byte read).  Or maybe the OS would do that well enough?
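
A rough sketch of the caching idea, to make it concrete.  None of this 
is Wireshark code: fb_buf, fb_init(), fb_get_uint8() and the 4096-byte 
block size are names and numbers I made up for illustration, and a real 
TVB backend would need more than a single cached block.

```c
#include <stdio.h>
#include <stdint.h>

#define FB_BLOCK 4096  /* assumed cache block size; tune as needed */

/* Hypothetical file-backed buffer; NOT part of the tvbuff API. */
typedef struct {
    FILE         *fp;            /* backing file holding the reassembled data */
    long          length;        /* number of valid bytes in the backing file */
    long          cached_start;  /* file offset of cache[0], or -1 if empty */
    size_t        cached_len;    /* number of valid bytes in cache[] */
    uint8_t       cache[FB_BLOCK];
    unsigned long reads;         /* how many times we really hit the file */
} fb_buf;

static void fb_init(fb_buf *b, FILE *fp, long length)
{
    b->fp = fp;
    b->length = length;
    b->cached_start = -1;
    b->cached_len = 0;
    b->reads = 0;
}

/* Load the block containing 'offset' into the cache. */
static int fb_fill(fb_buf *b, long offset)
{
    long start = offset - (offset % FB_BLOCK);

    if (fseek(b->fp, start, SEEK_SET) != 0)
        return -1;
    b->cached_len = fread(b->cache, 1, FB_BLOCK, b->fp);
    b->cached_start = start;
    b->reads++;
    return 0;
}

/* Rough analogue of tvb_get_guint8(): 0 on success, -1 on failure. */
static int fb_get_uint8(fb_buf *b, long offset, uint8_t *out)
{
    if (offset < 0 || offset >= b->length)
        return -1;

    /* Cache miss: one seek plus one block read instead of a 1-byte read. */
    if (b->cached_start < 0 ||
        offset < b->cached_start ||
        offset >= b->cached_start + (long)b->cached_len) {
        if (fb_fill(b, offset) != 0)
            return -1;
        if (offset >= b->cached_start + (long)b->cached_len)
            return -1;  /* short read: file is smaller than 'length' */
    }

    *out = b->cache[offset - b->cached_start];
    return 0;
}
```

With this, a dissector walking a buffer byte by byte touches the file 
once per 4096 bytes rather than once per byte.  Whether that beats 
simply relying on stdio's and the OS's own buffering is something that 
would have to be measured.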

[1] tvb_get_ptr() notwithstanding.  OK, that is a tvb_ API but it allows 
you direct access to the TVB data.  Using this API with a file-backed 
TVB would require allocating memory and copying it in from disk to 
return to the user.  BTW, given the big comment about this function in 
tvbuff.h, I was surprised to find almost 1300 uses in epan/dissectors/ ...
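
For what it's worth, the "allocate and copy" fallback for a 
tvb_get_ptr()-style call could look something like the sketch below.  
fb_copy_range() is a hypothetical helper, not anything in tvbuff.h, and 
it hands ownership of the copy to the caller, so the lifetime of the 
returned pointer would still have to be worked out for real dissectors.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (NOT part of tvbuff.h): copy 'len' bytes starting at
 * 'offset' from the backing file into freshly allocated memory.
 * Returns NULL on a bad range, allocation failure, or short read.
 * The caller must free() the result.
 */
static uint8_t *fb_copy_range(FILE *fp, long offset, size_t len)
{
    uint8_t *copy;

    if (offset < 0 || len == 0)
        return NULL;

    copy = malloc(len);
    if (copy == NULL)
        return NULL;

    /* The whole range must be readable, otherwise fail cleanly. */
    if (fseek(fp, offset, SEEK_SET) != 0 ||
        fread(copy, 1, len, fp) != len) {
        free(copy);
        return NULL;
    }
    return copy;
}
```

Every such call costs an allocation and a copy, which is one more reason 
the number of direct-pointer users in epan/dissectors/ matters.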

>> People complain about it enough that, while in *most* cases it might
>> not be a problem, we frequently get mail from people who have to split
>> up capture files to read them - I'd call it enough of a problem that
>> we should work on it (ideally, by reducing the amount of address space
>> required by the aforementioned data items).
> 
> Yes, absolutely.
> 
> It would still be nice if it would be possible for people to analyse more data 
> than will fit in virtual memory (in the case of Linux/Solaris, etc. where the 
> swap space is fixed). I see that there is an "abstraction" of memory 
> allocation in epan/emem.c (se_alloc* and friends), but g_malloc and plain 
> malloc are used as well, it seems.
> If the functions in emem.c were used for all memory allocation/freeing, that 
> would mean that this could be done by intercepting requests for memory in 
> those functions.

You mean by sending them to memory-mapped files?  Unless, as Guy pointed 
out, there's some way to tell the OS to swap out that memory before 
normal memory, I think that once you start swapping the UI is (still) 
going to become unusable.

> What is the status on the use of these functions? I got the impression from 
> README.malloc that these are recommended, but I mostly see allocations done 
> using g_malloc. Or is that just allocations that should outlive a capture 
> session?

Yes, those functions "should" normally be used.  But there are good 
reasons not to: for example, if we know we're allocating a bunch of 
memory that we'll free after the current frame is dissected (so we 
can't use ep_ memory) but before the file is closed (so using se_ memory 
would mean the allocation sticks around longer than it needs to).  The 
reassembly code uses g_malloc() (presumably) for this reason.
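
To illustrate the lifetime mismatch, here is a toy scope allocator.  It 
is only a sketch of the idea and not how emem.c is implemented; pool, 
pool_alloc() and pool_free_all() are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of each allocation; the union keeps the
 * user data that follows it suitably aligned. */
typedef union pool_block {
    union pool_block *next;
    max_align_t       align;
} pool_block;

/* Hypothetical scope pool: everything in it is freed together. */
typedef struct {
    pool_block *head;  /* singly linked list of live blocks */
    size_t      live;  /* number of blocks not yet released */
} pool;

static void *pool_alloc(pool *p, size_t size)
{
    pool_block *blk = malloc(sizeof *blk + size);

    if (blk == NULL)
        return NULL;
    blk->next = p->head;
    p->head = blk;
    p->live++;
    return blk + 1;  /* user data starts right after the header */
}

/* Release every block in the pool; there is no per-block free. */
static void pool_free_all(pool *p)
{
    while (p->head != NULL) {
        pool_block *next = p->head->next;
        free(p->head);
        p->head = next;
    }
    p->live = 0;
}
```

Think of ep_ memory as a pool that gets the pool_free_all() treatment 
after each packet and se_ memory as one that gets it when the capture 
file is closed.  A buffer that has to outlive one packet but should go 
away before the file closes fits neither pool, so it ends up being 
managed by hand with g_malloc()/g_free().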

Another reason, of course, is that the ep_ and se_ allocators are 
(relatively) new.
