Hi,

it seems that the DocBook V5 schema is written in a way that permanently
switches XXE into lenient mode (which is, of course, not very handy for
real editing).

The cause is the info and indexterm elements. I can understand why
info causes lenient mode. If you have an element with a required title,
such as chapter, you can write it either as

<chapter>
   <title>

or

<chapter>
   <info>
     <title>

I can see the ambiguity here, and it is not a problem for me, because I
actually use a customized version of DocBook which allows only the
former way of specifying the title.

What I do not get is the problem with indexterms. Because indexterm can
be a child of almost any element in DocBook, lenient mode is in effect
for almost all DocBook elements. But why must lenient mode be used for
elements that contain indexterm?

The DocBook schema defines three content models for indexterm, which are
then combined into a single pattern used wherever indexterm is allowed.
Suppose that I'm in <para>. Now I should be able to insert any of dozens
of DocBook inline elements, including indexterm. But because there are
three different kinds of indexterm, the whole para element is edited in
lenient mode and element completion is confused. Couldn't this be pushed
one level down? It should be possible to infer from the schema that
indexterm is allowed here, and lenient mode should be activated only
inside indexterm.
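
To make the situation concrete, here is a simplified sketch of the shape
I mean. It is not the actual DocBook V5 source: the pattern names
db.indexterm.* are the real ones, but the attributes and the content
model below are trimmed-down assumptions, and the para wrapper is only
there to make the fragment self-contained.

```rnc
# Simplified sketch, NOT the real docbook.rnc.
# Attributes and content model are illustrative assumptions.
start = element para { (text | db.indexterm)* }

# Choice between three patterns that all declare an element
# with the same name, indexterm.
db.indexterm =
    db.indexterm.singular
  | db.indexterm.startofrange
  | db.indexterm.endofrange

db.indexterm.singular =
    element indexterm {
       attribute class { "singular" }?,
       db.indexterm.contentmodel
    }

db.indexterm.startofrange =
    element indexterm {
       attribute class { "startofrange" },
       db.indexterm.contentmodel
    }

db.indexterm.endofrange =
    element indexterm {
       attribute class { "endofrange" },
       attribute startref { xsd:IDREF },
       empty
    }

# Heavily reduced stand-in for the real content model.
db.indexterm.contentmodel =
    element primary { text }?,
    element secondary { text }?
```

Seen from para, only one element name, indexterm, can occur here; the
three variants differ only in what is allowed inside it.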

Actually, I was able to work around the indexterm problem by refactoring
the indexterm patterns, so a schema defined as

include "docbook.rnc"
{
db.indexterm.singular =
    db.indexterm.singular.attlist,
    db.indexterm.contentmodel

db.indexterm.startofrange =
    db.indexterm.startofrange.attlist,
    db.indexterm.contentmodel

db.indexterm.endofrange =
    db.indexterm.endofrange.attlist,
    empty

db.indexterm =
    element indexterm {
       db.indexterm.singular
     | db.indexterm.startofrange
     | db.indexterm.endofrange
    }
}

doesn't cause lenient mode on the parents of indexterms. But couldn't
this simple case be handled directly by XXE?

I mean, if you find an element whose content model contains something like

... (element X { patternA } | element X { patternB }) ...

you could internally rewrite it to

... (element X { patternA | patternB }) ...

This way, I suppose lenient mode would be triggered much more rarely for
many schemas.

                                Jirka

-- 
------------------------------------------------------------------
   Jirka Kosek     e-mail: jirka at kosek.cz     http://www.kosek.cz
------------------------------------------------------------------
   http://xmlguru.cz    Blog mostly about XML for English readers
------------------------------------------------------------------