McGibbney, Lewis John wrote on 07.01.2012 at 17:39 (+0000):

> My situation is that I have lots of legal documents which exist in
> HTML, these in turn include lots and lots of presentation mark-up
> which I would like to strip before getting down to the Xalan-j XSL
> stuff. […] The question I have is whether this must be done via an
> XSL implementation or whether there is some kind of convenience/util
> interface which can be extended to do this type of thing?

I don't know of any convenience utility to do the job, but note that it
isn't hard using XSLT. You start with an identity transform and then add
rules to drop all attributes you don't want and to skip all elements you
don't want (while still processing their content). And that's it.

  <xsl:template match="@style | @border | @bgcolor | @whatnot" />

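  <xsl:template match="font | div | whatnot">
    <!-- select only node(): the skipped element's own attributes
         would otherwise be copied onto its parent -->
    <xsl:apply-templates select="node()"/>
  </xsl:template>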
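
For completeness, here is a minimal sketch of how the pieces might fit
together in one stylesheet. It assumes the input is already well-formed
XML (for example XHTML) with its elements in no namespace; if the
elements are in the XHTML namespace, the match patterns need a prefix
bound to it. The attribute and element names are only examples; adjust
the lists to whatever you want to strip.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy every node and attribute unchanged
       unless a more specific template below matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop presentational attributes: an empty template emits nothing. -->
  <xsl:template match="@style | @border | @bgcolor"/>

  <!-- Skip the element itself but keep processing its children.
       Only node() is selected, so the skipped element's attributes
       are not copied onto its parent. -->
  <xsl:template match="font | div">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

</xsl:stylesheet>
```

The specific templates win over the identity template because their
match patterns have a higher default priority than @* and node().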

-- 
Michael Ludwig
