Hi Guy,

On Sat, Oct 24, 2020 at 01:23:57PM -0700, Guy Harris wrote:
> > sidenote: I've just been doing some development work with io_uring
> > (using liburing) on modern Linux systems, and it's amazing in terms of
> > performance of asynchronous I/O.  Might be worth investigating.
>
> Might be, although we'd need to either:
>
> 1) figure out a way to do that, while hiding the platform-dependent details,
> on *all* (currently living) platforms supported by libpcap:
>
>     macOS;
>     the *BSDs;
>     Solaris;
>     HP-UX;
>     AIX;
>     Windows;
>     Linux;
>
> which don't all have the same asynchronous I/O mechanisms (POSIX aio on most
> if not all of the UN*Xes, "overlapped I/O" or whatever Microsoft calls it on
> Windows)

I hear you.  But fundamentally, if your abstraction API is based on buffers
in memory that
* get allocated/provided by the platform-specific code
* get handed to the platform-specific code for write
you should be able to cover all of those (famous last words).

I think the problem only starts when the higher-layer code tries to handle
the select/poll or even only the write() calls by itself.

Now that I think more about your use case, you probably cannot even have the
platform-specific code handle the allocations for the buffers, as you use
mmap()ed AF_PACKET on the "read" side.  So if the platform-specific AIO
mechanism cannot handle "foreign" memory that it didn't allocate, you will
have to copy.

With io_uring, you can hand in whatever buffers, allocated in whichever way.
There's a small performance benefit if you pre-register the buffers, so that
the mapping in/out of kernel space doesn't have to be done on every I/O
operation.  You _should_ be able to register the entire mmap()ed memory from
the AF_PACKET socket once on startup, though.

> 2) arrange that the user may, but need not, provide their own low-level
> writing code that the new writing APIs call, so they can either use a
> platform-independent mechanism supplied by libpcap or write their own code.
> I think Michael Richardson has been thinking of something such as that.

That is more or less what I'm suggesting above.

I'm happy to hack up an io_uring / liburing backend and contribute it, once
an interface for plugging that in materializes in libpcap.  As I'm not a
regular follower of the related mailing lists, please send me a ping once
you get to that point.

unrelated note: io_uring really does marvels, also for workloads with many
sockets.  It's easy to send and/or receive something like 500k pps from
thousands of sockets on my several-year-old laptop (Lenovo x26).  For a
traditional userspace program using UDP socket based I/O that's quite amazing.
Of course, that's not at all related to libpcap with its mmap()ed socket.

> What *might* be possible to do, in the absence of new libpcap capture APIs,
> would be to have dumpcap, when capturing from the "any" device on Linux and
> writing to a pcapng file:
>
>     when the capture starts, write out Interface Description Blocks (IDBs)
>     for all the currently-known interfaces on the system, and make a table
>     mapping from the kernel's interface indices (ifIndexes, in SNMP terms) to
>     interface IDs for those IDBs;
>
>     when a packet arrives, look up its interface index of the packet, and:
>
>         if it's found, write the packet out with that interface index;
>
>         if it's *not* found, write out an IDB for the new interface,
>         add it to the table, and write the packet out with that interface index.

Irrespective of current/future libpcap, this reflects the kind of logic that
I understood would be required for writing proper pcapng with IDBs on an
"any" interface capture, yes.  Good to hear it might be possible even with
the current code.

Regards,
	Harald
-- 
- Harald Welte <lafo...@gnumonks.org>           http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
                                                  (ETSI EN 300 175-7 Ch. A6)
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev@wireshark.org>
Archives:    https://www.wireshark.org/lists/wireshark-dev
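
For illustration only, the table logic quoted above (kernel interface index
to pcapng interface ID, with an IDB written the first time an interface is
seen) could be sketched roughly as follows in C.  This is not dumpcap or
libpcap code: struct if_map, if_map_lookup(), if_map_add(), on_packet(),
write_idb() and write_packet() are all hypothetical names, the two write
functions are stand-ins that only print, and the sketch assumes that the
packet is written out with the pcapng interface ID taken from the table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical table entry: kernel ifindex -> pcapng interface ID. */
struct if_map_entry {
    int      ifindex;       /* kernel interface index */
    uint32_t interface_id;  /* ordinal of the IDB within the pcapng section */
};

struct if_map {
    struct if_map_entry *entries;
    size_t               count;
    size_t               capacity;
};

/* Stand-in for real pcapng output: only logs what would be written. */
void write_idb(int ifindex, uint32_t interface_id)
{
    printf("IDB    ifindex=%d interface_id=%u\n", ifindex, (unsigned)interface_id);
}

/* Stand-in for writing a packet block that refers to interface_id. */
void write_packet(uint32_t interface_id, size_t len)
{
    printf("packet interface_id=%u len=%zu\n", (unsigned)interface_id, len);
}

/* Linear lookup; returns 0 and sets *id_out if ifindex is already known. */
int if_map_lookup(const struct if_map *m, int ifindex, uint32_t *id_out)
{
    for (size_t i = 0; i < m->count; i++) {
        if (m->entries[i].ifindex == ifindex) {
            *id_out = m->entries[i].interface_id;
            return 0;
        }
    }
    return -1;
}

/* Write an IDB for a new interface and record it; -1 on allocation failure. */
int if_map_add(struct if_map *m, int ifindex, uint32_t *id_out)
{
    if (m->count == m->capacity) {
        size_t new_cap = m->capacity ? m->capacity * 2 : 8;
        struct if_map_entry *p = realloc(m->entries, new_cap * sizeof(*p));
        if (p == NULL)
            return -1;
        m->entries = p;
        m->capacity = new_cap;
    }
    /* Interface IDs are assigned in the order the IDBs are written. */
    uint32_t id = (uint32_t)m->count;
    m->entries[m->count].ifindex = ifindex;
    m->entries[m->count].interface_id = id;
    m->count++;
    write_idb(ifindex, id);
    *id_out = id;
    return 0;
}

/* Per-packet path: look up the ifindex, add an IDB if unknown, then write. */
int on_packet(struct if_map *m, int ifindex, size_t len, uint32_t *id_out)
{
    uint32_t id;

    assert(m != NULL && id_out != NULL);
    if (if_map_lookup(m, ifindex, &id) != 0 && if_map_add(m, ifindex, &id) != 0)
        return -1;
    write_packet(id, len);
    *id_out = id;
    return 0;
}
```

At capture start, if_map_add() would be called once per currently known
interface; afterwards on_packet() handles both the "found" and the
"not found" case.  A linear scan is used only to keep the sketch short.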