I asked much the same question a little while back, and this is what I pieced together:

First, save the doc as "HTML (Filtered)" if it's coming from Word 2003 (for earlier versions you can get the filter tool someone else mentioned).

In my case I then wrote a filter for the content management system I built, and I pass the Word HTML through that. What you could do instead is use one of the implementations of Tidy ( http://tidy.sourceforge.net/ ). For a web version try:


...As I type this I'm just testing it on a big Word filtered HTML doc... and yes it seems to do a decent job.
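
If you'd rather run Tidy locally than through a web form, something along these lines might work. It's only a rough sketch: it assumes the command-line tidy binary is installed and on your PATH, and the function names (build_tidy_command, word_html_to_xhtml) are ones I made up for illustration. The options are standard Tidy ones, but the particular mix is a guess at what suits Word output, so tweak to taste. No character encoding is set here, so Tidy's defaults apply.

```python
import shutil
import subprocess


def build_tidy_command(tidy_path="tidy"):
    """Return the argument list for cleaning Word's filtered HTML into XHTML.

    The option mix is an assumption about what suits Word output,
    not a recommended or official configuration.
    """
    return [
        tidy_path,
        "-quiet",                                # no banner or summary text
        "-asxhtml",                              # write XHTML rather than HTML
        "--word-2000", "yes",                    # strip Word-specific markup
        "--clean", "yes",                        # swap presentational tags for CSS rules
        "--bare", "yes",                         # drop more Microsoft-specific clutter
        "--drop-proprietary-attributes", "yes",  # remove vendor-only attributes
        "--show-warnings", "no",                 # only report real errors
    ]


def word_html_to_xhtml(html_bytes, tidy_path="tidy"):
    """Pipe Word HTML (as bytes) through tidy and return the cleaned bytes."""
    if shutil.which(tidy_path) is None:
        raise FileNotFoundError("tidy executable not found: " + tidy_path)
    result = subprocess.run(
        build_tidy_command(tidy_path),
        input=html_bytes,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # tidy exits with 0 when clean, 1 for warnings, 2 for errors
    if result.returncode >= 2:
        raise RuntimeError(result.stderr.decode(errors="replace"))
    return result.stdout
```

To use it, read the saved file in binary mode and pass the bytes to word_html_to_xhtml(), then write the result back out. You'll probably still want to eyeball the output, since Tidy tidies the markup but can't know which of Word's styles you actually care about.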


Hi group.

I'm wondering if there's some easy (and free) way to convert text from a WORD document into clean XHTML that retains the formatting.


******************************************************
The discussion list for http://webstandardsgroup.org/

See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
