Hi Mike, how many files are you looking to transform?

We have done this (gone from an MS Word document to XXE and DocBook), but
only for a single document.
I saved the document as plain text and then wrote scripts (first awk, then
XSLT) to parse it and transform the resulting XML document into more and more
acceptable output. This approach would not scale to more than a
few files unless they were all in exactly the same format. I would also
advocate going to an intermediate format which only contains the "real
data" and cuts out all the presentation stuff - for something that is
pure documentation DocBook is brilliant, and yes, it can easily hold the
real information, but I see it mainly as another step in the process of
converting "real information" into what I want to present. After all,
stylesheets are about presentation. All of the tools available on the web
made an unacceptable, unusable mess of the source document.
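
To give a rough idea of what I mean by an intermediate format, here is a minimal Python sketch (not the awk/XSLT scripts we actually used). It assumes the text save has numbered headings such as "2.1 Title" and blank lines between paragraphs; the element names `doc`, `section`, `title` and `para` are made up for illustration and would need adapting to a real document.

```python
import re
import xml.etree.ElementTree as ET

# Assumed heading style in the plain-text save: "2.1 Some title"
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(.+)$")


def text_to_intermediate(text):
    """Turn a plain-text save into a small, content-only XML tree."""
    root = ET.Element("doc")
    section = None
    para = []

    def flush():
        # Emit the paragraph collected so far, if there is one.
        nonlocal section
        if para:
            if section is None:
                # Text before the first heading goes into an untitled section.
                section = ET.SubElement(root, "section")
            ET.SubElement(section, "para").text = " ".join(para)
            para.clear()

    for line in text.splitlines():
        line = line.strip()
        match = HEADING.match(line)
        if match:
            flush()
            section = ET.SubElement(root, "section", number=match.group(1))
            ET.SubElement(section, "title").text = match.group(2)
        elif not line:
            flush()  # a blank line ends the current paragraph
        else:
            para.append(line)
    flush()
    return root
```

Calling `ET.tostring(text_to_intermediate(open("manual.txt").read()), encoding="unicode")` (the file name is hypothetical) gives XML with no presentation in it, which a stylesheet can then turn into DocBook or anything else. Note that any paragraph line that happens to start with a number would be mistaken for a heading, which is exactly why an inconsistently structured source is painful.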

Our document shrank from a 1.5 megabyte file down to a few hundred
kilobytes, and now contains both English and German text.
What I will say, though, is that when handling tables the conversion to
DocBook is "really ugly". The results are only acceptable when the
original Word document is consistently structured. Ours was not, so I
was forced to reject the RTF variant in favour of plain text. Other
benefits of using XML are cleaner rendering of tables, the ability to
generate marked-up HTML and PDF, and easy addition (and selection) of German
text. All of these were possible with MS Word, just not as
straightforwardly. We also generate definition files (and test scripts) on
the fly, something that used to be an error-prone cut-and-paste job and
was not possible from Word (I tried using Visual Basic, but fixing the
structure took too long).
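
The language selection is the sort of thing that becomes easy once the content is in XML. A minimal Python sketch, assuming a made-up intermediate layout in which each `entry` carries one `text` child per language (our real files and element names differ, and the output here is only a DocBook-style fragment, not a validated document):

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate data: one <text> element per language.
SAMPLE = """<doc>
  <entry id="timeout">
    <text lang="en">Connection timeout in seconds</text>
    <text lang="de">Verbindungs-Timeout in Sekunden</text>
  </entry>
</doc>"""


def to_docbook_paras(root, lang, fallback="en"):
    """Build a <section> of <para> elements in the requested language."""
    out = ET.Element("section")
    for entry in root.iter("entry"):
        # Map language code to text for this entry.
        texts = {t.get("lang"): t.text for t in entry.findall("text")}
        # Use the fallback language when no translation exists yet.
        chosen = texts.get(lang, texts.get(fallback))
        if chosen is not None:
            ET.SubElement(out, "para").text = chosen
    return out


german = to_docbook_paras(ET.fromstring(SAMPLE), "de")
print(ET.tostring(german, encoding="unicode"))
```

The same loop over the entries could just as well write a definition file or a test script instead of paragraphs, since all outputs are generated from the one source.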

Hope this helps

Cheers Graham
