[zeromq-dev] XRAP clarification: What character set are strings passed in?

Tom Quarendon Tue, 02 Feb 2016 01:21:47 -0800

When attempting to implement an XRAP client in Java, one of the first questions 
you come across is which character set the strings in XRAP messages should be 
encoded in.
So the client needs to build a GET message, and it needs to put the resource 
name in it. The resource name is a string, so is naturally represented in Java 
as a String object, in Unicode.
I’ve assumed that things that are naturally strings (resource names, content 
types, parameter names, metadata names, error strings) are actually passed in 
UTF-8, but this isn’t specified.
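
To make that assumption concrete, here is a minimal sketch of how a Java client 
might convert such fields to and from wire bytes. It assumes UTF-8, which the 
spec does not currently state; the class name XrapStrings and its methods are 
hypothetical and not part of any XRAP library.

```java
import java.nio.charset.StandardCharsets;

final class XrapStrings {
    // Assumption: string fields (resource names, content types, ...) are
    // UTF-8 on the wire. The XRAP spec does not say this explicitly.
    static byte[] encode(String field) {
        // Name the charset explicitly rather than relying on the platform default.
        return field.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] wire) {
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical resource name containing one non-ASCII character.
        byte[] wire = encode("/music/caf\u00e9");
        System.out.println(wire.length + " bytes on the wire");
        System.out.println("round trip ok: " + decode(wire).equals("/music/caf\u00e9"));
    }
}
```

Naming the charset at every conversion keeps the client independent of the 
JVM's default encoding, whichever encoding the spec ends up mandating.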


I think the specification needs to be explicit about which character set strings 
are passed in, and indeed about which things are actually “strings” in that 
sense. I think it’s clear that the resource names, content types, parameter 
names, metadata names and error strings are intended as human-readable strings. 
However, it gets a bit greyer with parameter/metadata values, etag values and 
content bodies.
For the body, you have to take the content type into account, but even for 
content types that are textual (JSON, XML) you currently just have to assume 
that the encoding is UTF-8 (I don’t *think* that is explicitly defined by 
application/json, but perhaps it is, in which case fine). However, the body 
won’t always be textual. In the music example in the spec, actually retrieving 
the music track would most likely return binary data (I don’t think it would 
return a JSON document with a BASE64/85-encoded piece of binary data in it, 
would it?). So I don’t think you can always assume that the content body is 
UTF-8 text.
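
As a rough illustration of the body problem, a client could decode the body as 
text only when the content type says it is textual, and otherwise hand back the 
raw bytes. The sketch below assumes UTF-8 for textual bodies; the class 
XrapBody and the list of "textual" content types are hypothetical choices, not 
anything the spec defines.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Optional;

final class XrapBody {
    // Hypothetical policy: which content types the client treats as text.
    static boolean isTextual(String contentType) {
        // Drop any parameters such as "; charset=..." before comparing.
        int semi = contentType.indexOf(';');
        String base = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                .trim().toLowerCase(Locale.ROOT);
        return base.startsWith("text/")
                || base.equals("application/json")
                || base.equals("application/xml");
    }

    // Returns the body as a String only for textual content types.
    // Assumption: textual bodies are UTF-8, which the spec does not state.
    static Optional<String> asText(String contentType, byte[] body) {
        if (!isTextual(contentType)) {
            return Optional.empty(); // caller keeps working with the raw bytes
        }
        return Optional.of(new String(body, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] json = "{\"title\":\"x\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(asText("application/json", json).isPresent());
        System.out.println(asText("audio/mpeg", new byte[] {0x49, 0x44, 0x33}).isPresent());
    }
}
```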
Etags are supposed to be opaque to the user, but it’s not clear whether an etag 
has to be opaque textual data or whether it can be binary data. Maybe this is 
clearer in HTTP, since there the etag is passed in an HTTP header value, and 
header values are always ASCII anyway. I think this needs clarification in the 
case of XRAP, as there’s no particular reason it couldn’t be opaque binary data.

Ditto parameter values. I can imagine sending binary-valued parameters. Indeed 
I am. Clearly these could be expressed as BASE64- or BASE85-encoded values, and 
hence as text, at some slight cost, but I think this needs clarification too.
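
For reference, this is roughly what the BASE64 route looks like in Java using 
the standard java.util.Base64 class. The class name BinaryParam is 
hypothetical, and whether XRAP parameter values should be carried this way is 
exactly the open question.

```java
import java.util.Base64;

final class BinaryParam {
    // Turn a binary parameter value into ASCII text for the wire.
    static String toText(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Recover the original bytes from the text form.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] raw = {0x00, (byte) 0xFF, 0x10};
        String text = toText(raw);
        System.out.println(text + " (" + text.length() + " chars for " + raw.length + " bytes)");
    }
}
```

The cost is that every 3 bytes of input become 4 characters of output, so the 
encoded value is about a third larger than the raw one.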

Assuming there is some consensus, can the spec be edited to reflect it?
Thanks.

