More on the ASCII/Unicode support

Dimitrie O. Paun Thu, 27 Apr 2000 08:39:50 -0700

Now, I initially started this thread by saying that I was 
trying to _eliminate_ our internal API as much as possible, 
and simply use the std. Windows API. My motivation is twofold:
  1. The Win API is huge enough as it is; there is no point in inventing
functions with new semantics for no good reason -- these new APIs
(usually) don't have proper documentation, are unknown to 
developers, etc.
  2. Some of our internal APIs are bad -- they mix policy with mechanism
way too much for Wine's well-being.

There are some other, maybe less important, reasons not to have a
custom API, but I will stop here.

Now, my proposal (having an X function for each A/W pair) will enlarge
the number of internal functions by quite a bit, I agree. However, these
new internal functions can be reduced to the std. API automatically so
they do not suffer from 1 or 2, so it shouldn't be too bad.

After all is said and done, the crux of the matter is that Unicode is
around the corner. There is a lot of support being built in Linux for it,
and we will have to support it sooner rather than later.

We currently have 4 solutions:

A = ASCII
W = Unicode, UTF-16
U = Unicode, UTF-8

-------------------------------------------------------
1. W->A conversion, work internally with A

PROS:
  -- best option for debugging
  -- fast for A (common case today)
  -- use std. Win API 

CONS:
  -- we do NOT support Unicode, we just pretend we do. 
  -- A lot of work, a lot of clutter, close to no gain.
  -- inefficient for the W case
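
To make the first CON of option 1 concrete, here is a minimal sketch in C. All names (DEMO_WCHAR, demo_copy_textA, demo_copy_textW) are hypothetical and are not existing Wine or Windows functions; the narrowing rule (anything outside 7-bit ASCII becomes '?') is an assumption chosen only to show where information gets lost.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t DEMO_WCHAR;   /* stand-in for a UTF-16 code unit */

/* Hypothetical A implementation: the real work only understands bytes. */
size_t demo_copy_textA(char *dst, size_t cap, const char *src)
{
    size_t n = strlen(src);
    if (cap == 0) return 0;
    if (n >= cap) n = cap - 1;       /* truncate to fit the buffer */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* Hypothetical W entry point: narrow to A first, then call the A version.
 * Every code unit outside 7-bit ASCII is replaced by '?', so the caller
 * gets an answer but the Unicode content is gone. */
size_t demo_copy_textW(char *dst, size_t cap, const DEMO_WCHAR *src)
{
    char tmp[256];
    size_t i = 0;
    while (src[i] != 0 && i < sizeof(tmp) - 1) {
        tmp[i] = (src[i] < 0x80) ? (char)src[i] : '?';
        i++;
    }
    tmp[i] = '\0';
    return demo_copy_textA(dst, cap, tmp);
}
```

The W caller pays for an extra pass over the string and still loses every character the A side cannot represent.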


-------------------------------------------------------
2. A->W conversion, work internally with W

PROS:
  -- full Unicode support
  -- fast for W
  -- use of std. Win API
  -- part of Wine is already written this way

CONS:
  -- a lot of clutter
  -- very inefficient in the A case (A->W->U usually)
  -- no support for other encodings (say for Asian languages
which may need more bytes than Unicode supports)
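
A rough sketch of what option 2 costs on the A path, in C. The names (DEMO_WCHAR, demo_text_lengthA, demo_text_lengthW) are hypothetical, and the byte-by-byte widening is an assumption that is only correct for 7-bit ASCII input; a real conversion would have to consult the active code page.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint16_t DEMO_WCHAR;   /* stand-in for a UTF-16 code unit */

/* Hypothetical W implementation: counts code units up to the terminator. */
size_t demo_text_lengthW(const DEMO_WCHAR *text)
{
    size_t n = 0;
    while (text[n] != 0) n++;
    return n;
}

/* Hypothetical A entry point: allocate, widen, call W, free.
 * Assumption: input is 7-bit ASCII, so each byte maps to one code unit. */
size_t demo_text_lengthA(const char *text)
{
    size_t len = strlen(text);
    DEMO_WCHAR *wide = malloc((len + 1) * sizeof *wide);
    size_t result;
    size_t i;
    if (wide == NULL) return 0;
    for (i = 0; i <= len; i++)           /* copies the terminator too */
        wide[i] = (unsigned char)text[i];
    result = demo_text_lengthW(wide);
    free(wide);
    return result;
}
```

Every A call here performs one heap allocation and one full copy before any real work starts, and that is before a further W->U step on the Unix side.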



-------------------------------------------------------
3. A,W call into an X function which carries the encoding around

PROS:
  -- full Unicode support
  -- as fast as 1 for A, and as 2 for W (for common code path like display)
  -- support for new encodings is trivial
  -- not much worse than 2 for debugging
  -- maybe a bit less clutter than in 1 or 2 (debatable)
  -- easy transition from what we have to this 

CONS:
  -- use of non std. Win API
  -- it is not used in Wine currently
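
One possible shape for option 3, sketched in C. Everything here is hypothetical: the demo_encoding enum, the demo_strlenX function and the thin A/W wrappers are illustrations of "an X function which carries the encoding around", not an existing Wine interface.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t DEMO_WCHAR;   /* stand-in for a UTF-16 code unit */

/* Hypothetical encoding tag passed alongside the untyped string pointer. */
typedef enum { DEMO_ENC_A, DEMO_ENC_W } demo_encoding;

/* Hypothetical X function: one implementation, no conversion, no copy.
 * It only branches where the character width actually matters. */
size_t demo_strlenX(const void *text, demo_encoding enc)
{
    size_t n = 0;
    if (enc == DEMO_ENC_W) {
        const DEMO_WCHAR *w = text;
        while (w[n] != 0) n++;
    } else {
        const char *a = text;
        while (a[n] != '\0') n++;
    }
    return n;
}

/* The public A and W entry points reduce to one-line forwarders. */
size_t demo_strlenA(const char *text)
{
    return demo_strlenX(text, DEMO_ENC_A);
}

size_t demo_strlenW(const DEMO_WCHAR *text)
{
    return demo_strlenX(text, DEMO_ENC_W);
}
```

Supporting another encoding in this scheme would mean adding an enum value and a branch, which is the sense in which new encodings are cheap; the price is the extra internal entry point per A/W pair.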


-------------------------------------------------------
4. Write all functions independent of the encoding and recompile
    to get all encodings

PROS:
  -- fastest option for A, W
  -- easy to support future encodings
  -- use of std. API
  -- less clutter (in theory)

CONS:
  -- huge bloat
  -- it is not used in Wine currently
  -- (maybe) difficult transition path
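
Option 4 could look roughly like the following C sketch. The macro DEMO_DEFINE_COUNT and the generated functions are hypothetical; in a real tree the same effect would more likely come from compiling one source file several times with different -D flags, and a macro is used here only to keep the example in a single block.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t DEMO_WCHAR;   /* stand-in for a UTF-16 code unit */

/* One encoding-independent body, instantiated once per character type.
 * Each instantiation produces its own machine code, which is where the
 * bloat comes from. */
#define DEMO_DEFINE_COUNT(NAME, CHAR_T)                 \
    size_t NAME(const CHAR_T *text, CHAR_T wanted)      \
    {                                                   \
        size_t hits = 0;                                \
        for (; *text != 0; text++)                      \
            if (*text == wanted) hits++;                \
        return hits;                                    \
    }

DEMO_DEFINE_COUNT(demo_count_charA, char)
DEMO_DEFINE_COUNT(demo_count_charW, DEMO_WCHAR)
```

Neither variant converts or branches on the encoding at run time, but every function written this way exists once per supported encoding in the binary.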


*****************************************************
Now, I submit to you that 1 is unacceptable. 
We need correct Unicode support.

I don't like 2 for its inefficiency in a Unix environment 
where we will need to convert from W->U anyway.
I think the penalty for the common case (A) will be huge: A->W->U. 
We do memory allocations like crazy (slow), 
we copy data (kill cache), etc. We do not blend in the Unix
environment, and I really don't like it.

I would be happy with 3 or 4. Now, I know Alexandre doesn't
like 4, and (at the gut level :) ), I understand him. I also don't like
3 for its use of internal API, but in the long run it may be
beneficial to know that we call an internal API and not an external
one. 

In any case, we need some guidelines on how to deal with Unicode
in a consistent manner. The current situation is a mess of 1 & 2
which will need cleanup. Alexandre?

Sorry for the long post,
Dimi.