On Jul 14, 2012, at 8:26 AM, Jakub Zawadzki wrote:
> It'd be great if we have some abstract and pure (no C/assembly inline)
> language to write dissectors.
Or "to describe protocols and the way packets for those protocols are
displayed" - the languages in question wouldn't be as procedural as C/Lua/etc,
they'd be more descriptive.
> We could invent yet another protocol desciption language,
...but, as you suggest, we probably shouldn't.
> but I was thinking to base grammar on netmon NPL [1] or wsgd [2].
Those are probably the two best choices.
I'm not sure it has to be a choice, though - we could implement both, resources
permitting, of course. (And, of course, given that there are many
already-existing languages that describe protocols - ASN.1, {OSF IDL/MIDL/PIDL}
for DCE RPC, rpcgen for ONC RPC, CORBA IDL, xcb for X11 - we will probably
never have the One True Protocol Description Language.)
> I'm bigger fan of NPL (sorry Olivier), nmparsers project has got large
> collection of dissectors[3]
> which we could use (LLTD - bug #6071, Windows USB Port packets - bug #6520,
> netsh - bug #6694)
> but there might exists some legal (patents for grammar/implementation?!)
> issues.
That would be one concern - even having "our own" language, such as wsgd, runs
the risk of infringing a patent, but, well, *writing software of just about any
sort* runs the risk of infringing a patent; however, we're dealing with a large
corporation in the case of NPL, so there's probably a greater risk that some or
all of it is covered by patents. Were Microsoft to explicitly state that there
are no patents on NPL-the-language or that they're granting a royalty-free
license for all implementations (perhaps with a "mutual assured destruction"
clause, so that were we to patent some feature of Wireshark and sue Microsoft
for violating that patent, our license for their patents would terminate), and
the same applied to any patents they hold on their implementation of NPL that
would block independent useful implementations, that might help.
> With wsgd we could reuse some existing code of plugin.
...and we also have more freedom to extend the language, e.g. to support
preferences for a protocol - Paul Long's blog post says
> A common problem: “No silly, we do HTTP traffic on port 8888, not 80 or 8080!”
>
> While changing port mappings for protocols could be something revealed in the
> user interface, we haven’t gotten that far in Network Monitor 3.0 yet. I
> expect we should address this specific problem on different fronts, i.e. a UI
> for each protocol, and some way to handle dynamic port allocations. And
> there are also some heuristics we can use to identify protocols as well. But
> today, there is a fairly simple way to modify the NPL script for protocols on
> non-standard ports.
I don't know whether, as of 3.4, they support "a UI for each protocol, and some
way to handle dynamic port allocations", but we already have the infrastructure
for that.
NPL also, for strings, offers 3 encodings - to quote the help manual:
> This data type extracts a specified number of characters from a sequence of
> bytes. The characters can be UTF-16, UTF-8, or ASCII, depending on the
> encoding specified.
There's no mention of the Extended Binary-Coded Decimal Interchange Code there,
but we have several dissectors using ENC_EBCDIC, so that would be another place
where we might want to extend NPL were we to use it.
Were there an "Open NPL Consortium" of some sort where multiple implementers of
NPL could propose extensions, and perhaps a way an implementation could offer
private extensions without worrying about colliding with other implementations
or future standards, that might help.
Note, by the way, that having a language of this sort could allow something
such as this.
Consider a protocol with the following description (in a C-like protocol
description language that I'm making up on the fly):
    enum message_type {
        Login = 0,
        Logout = 1,
        Request = 2,
        Response = 3
    };

    struct login {
        ascii string username[16];
        ascii string password[16];
    };

    struct request {
        uint32 bigendian requested_item;
    };

    struct response {
        uint32 bigendian value_size;
        uint8 value[value_size];
    };

    protocol foo {
        uint32 bigendian enum message_type type;
        switch (type) {
        case Login:
            struct login login;
        case Logout:
            /* logout message has only a type */
        case Request:
            struct request request;
        case Response:
            struct response response;
        }
        uint32 bigendian message_id;
    };
which might translate to (in a pseudo-machine language I'm also making up on
the fly):
    uint32 bigendian foo.type saveas x
    switch x:
        0 Login
        1 Logout
        2 Request
        3 Response
    Login:
        ascii string 16 foo.login.username
        ascii string 16 foo.login.password
        goto end
    Logout:
        goto end
    Request:
        uint32 bigendian foo.request.requested_item
        goto end
    Response:
        uint32 bigendian foo.response.value_size saveas y
        uint8 array y foo.response.value
        goto end
    end:
        uint32 bigendian foo.message_id
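To make the idea of interpreting that "machine code" a bit more concrete, here is a minimal Python sketch of an interpreter for it. Everything in it is an assumption made for illustration: the tuple encoding of the instructions, the names FOO_PROGRAM and run, and the guess that the 16-byte ASCII strings are NUL-padded. It does no real error handling (an unknown message type just raises KeyError), so treat it as a sketch, not as how Wireshark would actually do it.

```python
import struct

# Hypothetical tuple encoding of the pseudo-machine code for "foo".
# ("uint32", field, register-or-None), ("string", length, field),
# ("bytes", length-register, field), plus switch/goto/label.
FOO_PROGRAM = [
    ("uint32", "foo.type", "x"),
    ("switch", "x", {0: "Login", 1: "Logout", 2: "Request", 3: "Response"}),
    ("label", "Login"),
    ("string", 16, "foo.login.username"),
    ("string", 16, "foo.login.password"),
    ("goto", "end"),
    ("label", "Logout"),
    ("goto", "end"),
    ("label", "Request"),
    ("uint32", "foo.request.requested_item", None),
    ("goto", "end"),
    ("label", "Response"),
    ("uint32", "foo.response.value_size", "y"),
    ("bytes", "y", "foo.response.value"),
    ("goto", "end"),
    ("label", "end"),
    ("uint32", "foo.message_id", None),
]


def run(program, data):
    """Interpret `program` over the packet bytes `data`; return {field: value}."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    fields, regs = {}, {}
    pc = offset = 0
    while pc < len(program):
        ins = program[pc]
        op = ins[0]
        pc += 1
        if op == "uint32":
            _, name, reg = ins
            (value,) = struct.unpack_from(">I", data, offset)  # big-endian
            offset += 4
            fields[name] = value
            if reg is not None:  # "saveas"
                regs[reg] = value
        elif op == "string":
            _, length, name = ins
            raw = data[offset:offset + length]
            # Assumption: fixed-width strings are padded with NULs.
            fields[name] = raw.rstrip(b"\0").decode("ascii")
            offset += length
        elif op == "bytes":
            _, reg, name = ins
            length = regs[reg]
            fields[name] = data[offset:offset + length]
            offset += length
        elif op == "switch":
            _, reg, table = ins
            pc = labels[table[regs[reg]]]
        elif op == "goto":
            pc = labels[ins[1]]
        elif op != "label":  # labels are no-ops at run time
            raise ValueError("unknown instruction: %r" % (ins,))
    return fields
```

Feeding it a Request message built with struct.pack(">III", 2, 7, 0x4073) should yield foo.type, foo.request.requested_item and foo.message_id, and nothing else.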
Now consider a dissection pass being done for a display filter "foo.message_id
== 0x4073". That full "compiled" program is overkill; that dissection pass
might optimize it into
    uint32 bigendian foo.type saveas x
    switch x:
        0 Login
        1 Logout
        2 Request
        3 Response
    Login:
        skipbytes 32
        goto end
    Logout:
        goto end
    Request:
        skipbytes 4
        goto end
    Response:
        uint32 bigendian foo.response.value_size saveas y
        skipbytes y
        goto end
    end:
        uint32 bigendian foo.message_id
and, for that dissection pass, run that optimized version of the dissection
"machine code" for the foo protocol, and similarly optimized versions of the
dissection code for other protocols. The optimized versions of the dissection "machine code" might
be generated as needed (rather than generating optimized versions for every
protocol, just generate them from the base code the first time we try to run
the code) and cached with the cache key being the set of fields in which the
dissection in question was interested (whether because they're being used in a
filter or for a column or in "-e {field}" in TShark or...).
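The pruning and caching described above could be sketched roughly as follows, again in Python and again with made-up names: the tuple instruction encoding, optimize, optimized_for and the _cache dictionary are all hypothetical. The rule used here is deliberately conservative: a read is kept if its field is wanted or if it saves its value to a register (the sketch doesn't check whether that register is used later); everything else becomes a skip, and adjacent constant skips are merged.

```python
# Hypothetical tuple encoding of the unoptimized "machine code" for "foo".
FOO_PROGRAM = [
    ("uint32", "foo.type", "x"),
    ("switch", "x", {0: "Login", 1: "Logout", 2: "Request", 3: "Response"}),
    ("label", "Login"),
    ("string", 16, "foo.login.username"),
    ("string", 16, "foo.login.password"),
    ("goto", "end"),
    ("label", "Logout"),
    ("goto", "end"),
    ("label", "Request"),
    ("uint32", "foo.request.requested_item", None),
    ("goto", "end"),
    ("label", "Response"),
    ("uint32", "foo.response.value_size", "y"),
    ("bytes", "y", "foo.response.value"),
    ("goto", "end"),
    ("label", "end"),
    ("uint32", "foo.message_id", None),
]


def optimize(program, wanted):
    """Return a copy of `program` that only extracts fields in `wanted`."""
    out = []
    for ins in program:
        op = ins[0]
        if op == "uint32":
            _, name, reg = ins
            # Keep the read if the field is wanted or its value is saved
            # to a register that later instructions may depend on.
            keep = name in wanted or reg is not None
            out.append(ins if keep else ("skipbytes", 4))
        elif op == "string":
            _, length, name = ins
            out.append(ins if name in wanted else ("skipbytes", length))
        elif op == "bytes":
            _, reg, name = ins
            # Variable-length skip: the count comes from a register.
            out.append(ins if name in wanted else ("skipbytes", reg))
        else:
            out.append(ins)  # switch, goto, label are left alone

    # Merge runs of adjacent constant-length skips (e.g. 16 + 16 -> 32).
    merged = []
    for ins in out:
        if (ins[0] == "skipbytes" and isinstance(ins[1], int)
                and merged and merged[-1][0] == "skipbytes"
                and isinstance(merged[-1][1], int)):
            merged[-1] = ("skipbytes", merged[-1][1] + ins[1])
        else:
            merged.append(ins)
    return tuple(merged)


# Cache keyed by protocol name plus the set of fields of interest.
_cache = {}


def optimized_for(proto, program, wanted):
    """Generate the optimized program on first use, then reuse it."""
    key = (proto, frozenset(wanted))
    if key not in _cache:
        _cache[key] = optimize(program, key[1])
    return _cache[key]
```

Calling optimized_for("foo", FOO_PROGRAM, {"foo.message_id"}) should produce the same shape as the optimized listing above (skipbytes 32, skipbytes 4, skipbytes y), and a second call with the same field set should return the cached program rather than regenerating it.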
This would allow us to get some of the effect of
    if (tree) {
        ...
    }
without leaving it up to humans to get it right (which humans often don't), and
allow us to do more such optimization as well (as it's not just "do I need a
protocol tree?", it's "do I need anything other than these few fields and
whatever fields are necessary to get at those fields").
(It also raises the question of whether interpreted execution of that "machine
code" or translation to C or machine language will be faster - interpreted
execution *could* result in a smaller cache footprint if the interpreter is
small enough and the code "high-level" enough to be fairly dense, although it
does involve difficult-at-best-to-predict branches in the interpretive loop.)
Of course, this would allow people to extend Wireshark without needing any C
developer tools, and would reduce the need for stability in the dissector core
code. Translating to a "machine code" of the sort shown above might also
significantly reduce compile time (maybe with support for the CORBA IDL,
building Parlay support won't dim the lights :-)), and if those are all loaded
at startup time, it might make it easier to build configurations of Wireshark
that don't have Every Single Protocol Known To Man and that thus start up more
quickly.
On the other hand, it might also allow protocol descriptions to be shipped
either in source form or binary form with restrictions on redistribution,
providing a way to "get around the GPL" for protocols. Some might consider
that a feature (I seem to remember many years ago Cisco raised this issue about
some protocols) and others might consider it a bug. If we end up with a
consensus of "it's a bug", we might be able to extend the protections of the
GPL to dissector descriptions fed to the interpreter, so that if you make a
"compiled" protocol description available, you must also make the source
available to recipients and must give recipients the right to redistribute the
source or binaries.