Re: xTalk: Call for participation: external API extensions

M. Uli Kusterer Sun, 05 Nov 2000 20:03:07 -0800
>This sounds fine.  Having GetParamSize() is a good idea in any case,
>since we want to support non-null-terminated strings.  Using this
>approach, the external author has a choice.  They can call
>ParamCStringValue(), for example, to get a pointer to memory owned
>and managed by the xTalk host, or they can call
>CopyParamCStringValue() to get their own copy to do with as they
>wish.

Doug,

 yup. And for SC it could map directly to GetHandleSize() :-)

>They could, of course, just call ParamCStringValue() and then copy
>the memory it points to, which would seem to eliminate the need for
>the Copy... functions altogether.

 As long as we have no typed parameters, that would be OK, but if the
internal representation is any different from the kind of string the user
requested (e.g. Unicode, long, double, Rect, whatever...) it will cause two
copies of the strings to be created, one owned by the host (even though the
host doesn't need it) and another one owned by the user.

>The reason I included the Copy...
>functions in the first place was to save double-copying in some
>cases.  For example, if the host is holding a Unicode string and the
>external requests a C string using ParamCStringValue(), the host will
>have to convert to a C string internally and return a pointer to
>that string which the external would then copy.

 Exactly. IMHO it would be much more efficient to have only the CopyXXX
calls. Only one set of calls to maintain, and the engine wouldn't have to
care about disposing of something behind the XCMD's back. It is the very
reason why HyperCard's GetFieldTEHandle() callback (or whatever the exact
name is) returns a copy of a field's TEHandle, and not the original. It is
safer, and allows the host to use whatever internal representation it
desires without having to care about maintaining memory for the user.
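
 To illustrate the copy-only idea, here is a minimal sketch. It assumes a
host that keeps its text as UTF-16 internally; the XParam layout and the
'?' substitution for non-ASCII characters are made up for illustration, and
only the name CopyParamCStringValue() comes from this thread. The point is
that the conversion result goes straight to the caller, so the host never
holds a second copy:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical host-side parameter: this host stores text as UTF-16. */
typedef struct XParam {
    const uint16_t *chars;  /* not null-terminated */
    size_t          count;  /* number of UTF-16 code units */
} XParam;
typedef const XParam *XParamRef;

/* Returns a freshly malloc()ed C string that the caller owns and must
   free(). Non-ASCII code units become '?' to keep the sketch short. */
char *CopyParamCStringValue(XParamRef param)
{
    assert(param != NULL);
    char *out = malloc(param->count + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < param->count; i++)
        out[i] = (param->chars[i] < 0x80) ? (char)param->chars[i] : '?';
    out[param->count] = '\0';
    return out;
}
```

The external calls free() on the result when it is done, and the host is
free to change its internal representation at any time.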

>Unless you're dealing with megabyte-size strings,
>the performance gain is probably insignificant, so maybe we should
>just eliminate the Copy... functions.

 Here are two people coming to exactly opposite decisions based on the same
data. Maybe it's my thinking the Apple way again. Apple used to make the
mistake of letting users directly access its data structures, meaning
it couldn't change the way e.g. a window worked, because then applications
would have had problems getting at the data they wanted. That's what
opacity is all about. And Apple took it one step further by also only
returning copies of its data that are owned by the caller. This way, the
caller can keep a copy exactly as long as needed, and the host doesn't have
to babysit data the user might no longer need.

>Okay, that's good to keep in mind, although in this case eliminating
>the Copy... calls would also solve the problem.  ;-)

 But it would also make many things much more work. I recently wanted to
take some ANSI code that fetches a whole file's data from a compound file
format. It is not unlike having a hard disk in a file. Now here's the
problem: QuickTime (which I needed to decode the file's data) requires me
to pass it a Macintosh Handle. But I had that memory as a malloc()ed data
block. I have to make an additional copy of that image file in a Handle,
just to have QuickTime make a third copy in a graphics buffer. I don't even
want to *see* the memory stats on this one. Also, if the user knows it's
always a copy of the data being passed around, they have a rule to follow
every time they have to consider whether they are leaking. Otherwise they
always have to try and remember: "Did I allocate the memory, or did the
host?" This gets really bad when you have "if" clauses where one branch
allocates the data itself while the other retrieves the data from the host.
You'll always have to keep a flag for who owns the memory.
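
 A small sketch of that bookkeeping problem, with everything in it
hypothetical: HostBorrowedValue() stands in for any host call that returns
host-owned memory, and MaybeOwned is the flag-carrying struct an external
would end up inventing. The second function shows the copy-only rule, where
cleanup is always a plain free():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a host call returning host-owned memory. */
static const char *HostBorrowedValue(void) { return "from host"; }

/* Mixed ownership: the pointer must travel with a flag saying who frees. */
typedef struct {
    const char *text;
    bool        ownedByExternal;
} MaybeOwned;

MaybeOwned GetTextMixed(bool useDefault)
{
    MaybeOwned r;
    if (useDefault) {
        char *copy = malloc(sizeof "default");
        if (copy != NULL)
            memcpy(copy, "default", sizeof "default");
        r.text = copy;
        r.ownedByExternal = true;   /* we allocated it, we free it */
    } else {
        r.text = HostBorrowedValue();
        r.ownedByExternal = false;  /* the host owns it, hands off */
    }
    return r;
}

void ReleaseMixed(MaybeOwned r)
{
    if (r.ownedByExternal)
        free((void *)r.text);
}

/* Copy-only rule: every result belongs to the caller, no flag needed. */
char *GetTextCopied(bool useDefault)
{
    const char *src = useDefault ? "default" : HostBorrowedValue();
    assert(src != NULL);
    size_t n = strlen(src) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, src, n);
    return copy;
}
```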

>Maybe.  The term "array", however, has been used consistently in
>programming for many years to refer to a numerically-indexed sequence
>of values.  Using it to refer to "associative arrays" is, IMHO, a
>bad idea.  Associative arrays are very different beasts which would
>be better to refer to as HashTables or Dictionaries.

 Certainly. However, a string is not necessarily numerically indexed. Also,
"array" is the word that is used in MetaTalk right now, and keeping
consistent naming between the user side and the programmer side might be
helpful. I don't mind calling an associative array a HashTable, but I don't
like it if something else (e.g. a non-terminated string) is suddenly called
an array. Tech support will love you if you do that.

>Getting back to the case at hand, what is the purpose of the
>ParamCharArrayValue() call above, anyway?  If its purpose is to
>provide access to arbitrary binary data, then maybe "byte" or "data"
>would be an appropriate term.  How about changing it to
>ParamDataValue(), which returns a pointer to an array of bytes:

 MetaCard has associative arrays. The way they work is that every variable
can either contain a string value, or it can contain an arbitrary number of
"entries" that are identified by strings. These entries are essentially
variables themselves. Although MetaCard doesn't allow that yet, you could
technically take this idea a step further and allow these entries to have
sub-entries.

 So, if a parameter is just a variable, and an array is simply a variable
containing a list of variables, we could simplify all of this a great deal
by having:

XParamRef       GetHashTableEntry( XParamRef param, char* hash );

which would give you a reference to a variable in an array variable, and
you'd identify this entry by the hash or index you pass in. Is this more
comprehensible?
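
 As a rough sketch of how that could look on the host side: the struct
layout below is purely an assumption (a variable with an optional string
value and an optional list of named entries, each entry being a variable
again), and a linear search stands in for a real hash lookup. Only the
GetHashTableEntry() signature is the one proposed above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: a variable holds a string value and/or named
   entries, and every entry is itself a variable. */
typedef struct XParam XParam;
typedef XParam *XParamRef;

typedef struct XEntry {
    const char *key;
    XParam     *value;
} XEntry;

struct XParam {
    const char *string;      /* string value, or NULL */
    XEntry     *entries;     /* named entries, or NULL */
    size_t      entryCount;
};

/* Returns the entry of param named by hash, or NULL if there is none. */
XParamRef GetHashTableEntry(XParamRef param, char *hash)
{
    assert(param != NULL && hash != NULL);
    for (size_t i = 0; i < param->entryCount; i++) {
        if (strcmp(param->entries[i].key, hash) == 0)
            return param->entries[i].value;
    }
    return NULL;
}
```

Because the result is an XParamRef again, sub-entries come for free: call
GetHashTableEntry() on the entry you just got back.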

>I nearly eliminated the ioLength parameter since we now have
>GetParamSize(), but it's not clear that they would return the same
>value -- should GetParamSize() return the raw size in bytes, or the
>number of characters in a string?  In the case of Unicode they will
>be very different numbers.  Always more questions...

 I also thought about that. We'd probably have to change the name to
GetParamStringLength() or something like that. "length" in the ANSI libs
also means characters, while "size" means bytes, IIRC. Having length here
would be smarter since it would work both for ASCII and Unicode strings,
while otherwise the user would need several calls:

 GetParamAsciiStringLength()
 GetParamUnicodeStringLength()
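
 A sketch of the size/length split, under some loud assumptions: the
XEncoding tag and the XParam layout are invented for illustration, and the
Unicode case is taken to be UTF-16 without surrogate pairs so that two
bytes always make one character:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding tag kept by the host alongside the raw bytes. */
typedef enum { kEncodingAscii, kEncodingUtf16 } XEncoding;

typedef struct XParam {
    const void *data;
    size_t      byteCount;
    XEncoding   encoding;
} XParam;
typedef const XParam *XParamRef;

/* "Size": raw storage in bytes, whatever the encoding. */
size_t GetParamSize(XParamRef param)
{
    assert(param != NULL);
    return param->byteCount;
}

/* "Length": number of characters. Assumes UTF-16 text contains no
   surrogate pairs, so each character takes exactly two bytes. */
size_t GetParamStringLength(XParamRef param)
{
    assert(param != NULL);
    return (param->encoding == kEncodingUtf16) ? param->byteCount / 2
                                                : param->byteCount;
}
```

One call then answers "how many characters?" for either kind of string,
while GetParamSize() stays available for raw binary data.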

Cheers,
-- M. Uli Kusterer

------------------------------------------------------------
             http://www.weblayout.com/witness
       'The Witnesses of TeachText are everywhere...'
 The future of programming: http://freecard.sourceforge.net