This is an early sketch for how I might propose we re-factor the Xerces x-platform support for "3.0".

The Current Way
=============
There are five main locations of platform specific code in Xerces:

    (1) util/Platforms
    (2) util/Compilers
    (3) util/Transcoders
    (4) util/NetAccessors
    (5) util/MsgLoaders

Compilers, Transcoders, NetAccessors, and MsgLoaders are built to support particular functionality that may or may not be available on a given compiler/platform, and are selected at compile time or runtime. Each of these holds fairly specific functionality that is narrowly focused, and is pretty clearly either present, or not.

In Platforms, a new subclass of Platform is created for each port. Functionality that must be implemented by the platform class includes:

    - Factory for NetAccessor, MsgLoader, and Transcoder
    - Panic handling
    - File access
    - Path and Directory handling
    - isAnySlash
    - Standard Input
    - Timing
    - Atomic Ops
    - Mutexes

The Problems
===========
In general, the current approach works well, I believe, for Transcoders, NetAccessors, and MsgLoaders. As I said above, these are generally either simply available or not in a given configuration. And they're pretty well focused on doing one thing.


There are two areas where the Transcoders might be improved:

- In general, about half of the transcoder code is taken up by supporting the LCP functions, which are disjoint from the standard virtual transcoder interface. As I proved recently for the Mac transcoder, the LCP transcoder can be written in terms of the virtual transcoder; doing this would reduce the amount of code needed within any given transcoder implementation, and simplify it.

- There looks to be some redundancy in the (six!) iconv-based transcoders, and perhaps in the Cygwin/Win32 transcoders, but I'm not familiar enough with them yet to understand the reasons for this, or whether/how they might be collapsed. Any input on this from folks would be interesting.

The Platform class is where most of the issues arise. Because the class is monolithic, you pretty much have to either accept it in its entirety or not at all, which means that (a) most new ports need to write a new Platform class, (b) doing so is quite a bit of work, and (c) this leads to a lot of redundancy between ports. Not to mention (d): this architecture doesn't lend itself to the autoconf way of selecting functionality in a fine-grained manner.

The (Potential) Solutions
===================
Given some of my goals for "3.0" of enabling configuration based on autoconf and making it far easier to port Xerces to new platforms, while also trimming redundancy and code bloat, I've started to formulate some ideas on how to re-factor some of this code. I will present those here, and ask for any feedback people have.


The major thing that needs to be done is to break down the monolithic Platform class. We can perhaps leave the interface as it is, but we need to gut it underneath in order to allow ad-hoc selection of bits of functionality therein.

* Create a new "unified" platform subclass that:

- Has factories for NetAccessor, Transcoder, and MsgLoader. They will make their selection based on preprocessor defines, probably.

- Relies on new factories for new Mutex, File, Timer, and AtomicOp classes. The timer functionality may not need to be abstracted to this degree, and could be done with defines if we're lazy.

- Has a panic method that calls abort(). Maybe also have a pluggable panic function for apps that need to do something else.

- The isAnySlash problem should be solved by defines that specify which of the path separators are used, or by having the File class provide a string of path separators. I _think_ I prefer the former. BTW: it looks to me that isAnySlash on most platforms accepts either / or \. This doesn't look right to me: \ isn't a separator in posix paths, is it...??? Another way to handle this might be to use platform specific code in the File class to do the path weaving/manipulation...but this takes us back in time ;)

- Perhaps eliminate the standard input support entirely. (Is this used anywhere--in tests or samples?)
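To make the "defines" option for the path separator concrete, here is a rough sketch. XERCES_PATH_SEPARATORS and isPathSeparator are names I've made up for illustration (neither exists in Xerces today), and I use plain char instead of XMLCh to keep it short. The idea is that configure, or a small platform header, supplies the separator string, and a single shared function replaces every per-platform isAnySlash.

```cpp
// Hypothetical sketch: XERCES_PATH_SEPARATORS is an assumed macro name that a
// configure script or platform header would define. It is not an existing
// Xerces define.
#ifndef XERCES_PATH_SEPARATORS
#  if defined(_WIN32)
#    define XERCES_PATH_SEPARATORS "/\\"   // Windows accepts both / and \ .
#  else
#    define XERCES_PATH_SEPARATORS "/"     // posix: only / separates components
#  endif
#endif

// Returns true if ch is one of the configured path separators.
// Plain char is used for brevity; real code would take an XMLCh.
inline bool isPathSeparator(char ch)
{
    for (const char* p = XERCES_PATH_SEPARATORS; *p != '\0'; ++p)
    {
        if (*p == ch)
            return true;
    }
    return false;
}
```

With something like this, a port that needs a different separator set only has to override the define; it doesn't need its own Platform subclass just for that.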

New class hierarchies will be created for Mutex, FileHandle, Timer and AtomicOp, implementing the various flavors of these required by different platforms. Note, for instance, that posix file handle, mutex, and timer classes will serve many platforms. AtomicOp will probably be written in terms of Mutex for those platforms that don't have built-in atomic support or special functions.
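As a rough illustration of the AtomicOp fallback, the sketch below writes the atomic operations in terms of an abstract Mutex. All of the class and method names are hypothetical, and I've used std::mutex as a stand-in for the concrete mutex just so the example is self-contained; a posix port would presumably wrap pthread_mutex_t instead.

```cpp
#include <mutex>

// Hypothetical abstract mutex; each platform flavor would subclass this.
class Mutex
{
public:
    virtual ~Mutex() {}
    virtual void lock() = 0;
    virtual void unlock() = 0;
};

// Stand-in implementation using std::mutex (an assumption for this sketch).
class StdMutex : public Mutex
{
public:
    void lock() override   { fMutex.lock(); }
    void unlock() override { fMutex.unlock(); }
private:
    std::mutex fMutex;
};

// Fallback atomic operations for platforms without native support:
// every operation is serialized through the supplied mutex.
class MutexAtomicOp
{
public:
    explicit MutexAtomicOp(Mutex& mutex) : fMutex(mutex) {}

    // Increments location and returns the new value.
    int increment(int& location)
    {
        fMutex.lock();
        const int result = ++location;
        fMutex.unlock();
        return result;
    }

    // Decrements location and returns the new value.
    int decrement(int& location)
    {
        fMutex.lock();
        const int result = --location;
        fMutex.unlock();
        return result;
    }

    // Stores newValue only if location equals toCompare.
    // Returns the value location held before the call.
    int compareAndSwap(int& location, int newValue, int toCompare)
    {
        fMutex.lock();
        const int previous = location;
        if (previous == toCompare)
            location = newValue;
        fMutex.unlock();
        return previous;
    }

private:
    Mutex& fMutex;
};
```

A platform with real atomic instructions would supply its own AtomicOp flavor and never touch the mutex; this version is only the lowest common denominator.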

* The Transcoder LCP interface should perhaps be re-factored as a single class that's written in terms of the virtual transcoder. This assumes that for any given platform, the virtual transcoder can always handle the platform's LCP encoding. The task on a given platform, then, is to determine the proper LCP encoding.
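Here is a much-simplified sketch of what I mean by an LCP transcoder written in terms of the virtual transcoder. The interfaces are hypothetical and deliberately smaller than the real Xerces transcoder classes: they use std::string and std::u16string rather than raw XMLByte/XMLCh buffers, and the Latin-1 transcoder is only a toy stand-in for whatever encoding a platform reports as its LCP.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-in for the virtual transcoder interface.
class VirtualTranscoder
{
public:
    virtual ~VirtualTranscoder() {}
    // Converts bytes in this transcoder's encoding to UTF-16 code units.
    virtual std::u16string transcodeFrom(const std::string& bytes) const = 0;
    // Converts UTF-16 code units to bytes in this transcoder's encoding.
    virtual std::string transcodeTo(const std::u16string& chars) const = 0;
};

// Toy ISO-8859-1 transcoder, used here only to exercise the wrapper.
class Latin1Transcoder : public VirtualTranscoder
{
public:
    std::u16string transcodeFrom(const std::string& bytes) const override
    {
        std::u16string out;
        for (std::size_t i = 0; i < bytes.size(); ++i)
            out.push_back(static_cast<char16_t>(static_cast<unsigned char>(bytes[i])));
        return out;
    }

    std::string transcodeTo(const std::u16string& chars) const override
    {
        std::string out;
        for (std::size_t i = 0; i < chars.size(); ++i)
        {
            const char16_t c = chars[i];
            // Anything outside Latin-1 is replaced with '?'.
            out.push_back(c <= 0xFF ? static_cast<char>(static_cast<unsigned char>(c)) : '?');
        }
        return out;
    }
};

// The single generic LCP transcoder: it owns no conversion logic and simply
// delegates to whichever virtual transcoder matches the platform's LCP encoding.
class GenericLCPTranscoder
{
public:
    explicit GenericLCPTranscoder(const VirtualTranscoder& impl) : fImpl(impl) {}

    std::u16string toUnicode(const std::string& local) const
    {
        return fImpl.transcodeFrom(local);
    }

    std::string toLocal(const std::u16string& unicode) const
    {
        return fImpl.transcodeTo(unicode);
    }

private:
    const VirtualTranscoder& fImpl;
};
```

The only per-platform work left in this arrangement is choosing which virtual transcoder to hand to GenericLCPTranscoder, i.e. determining the LCP encoding.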


Open Questions
==============

- isAnySlash, which is misnamed and is simply supposed to determine whether the character is a path separator, is implemented on nearly all the platforms to include both / and \. Isn't this wrong for posix paths???

- Is the openStdInHandle function used anywhere? I haven't searched yet.

- Can the virtual transcoder handle the needs of the LCP transcoder, particularly on the Windows and Cygwin platforms, if we were to implement a generic LCP transcoder in terms of the virtual transcoder? We would need to supply the LCP encoding.

- What is the nature of redundancy in the six iconv-based transcoder implementations? Or between the Win32 and Cygwin transcoders?

- More thinking/research has to go into how we should handle the "Compilers" section. Leave it alone? Do more in terms of finer-grained tests through autoconf?


This has just been a brain-dump of my current thoughts on these issues. Your feedback is encouraged.


James.
