Fred Drake wrote:
On 7/22/05, Dmitry Vasiliev <[EMAIL PROTECTED]> wrote:
I think about the following generic algorithm:
1. Preparation stage. Content type and encoding are determined based on the
<?xml?>/<meta> declarations. In case of the 'text/html' type and non-Unicode
content we decode the content. In case of the 'text/xml' type the parser takes
care of the encoding at the cooking stage. We can do it somewhere inside
This is probably right; I'll have to look at the code again.
2. Cooking stage. Nothing interesting for our case.
Wrong; this is when the "bytecode" is generated. At this point, we
can remove the encoding markers (since we've already used them for
3. Rendering stage. Now we can strip the <?xml?>/<meta> declarations. We can do
it somewhere inside PageTemplate.pt_render()/PageTemplate.__call__() methods.
Rendering is the most costly stage, so we want to reduce the work done
here. Avoiding it entirely is best. By removing the encoding markers
at compilation time, we manage to have nothing else to do at this
Ok. Now I think that all this can be done somewhere inside zope.tal. I need to
write a proposal...
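
To make the three stages discussed above concrete, here is a minimal sketch in modern Python. It is not the zope.tal or PageTemplate code: the names prepare(), cook() and render(), both regular expressions, and the UTF-8 fallback are assumptions made for illustration only. It covers just the 'text/html' path; 'text/xml' input is passed through undecoded, on the understanding that the XML parser handles its encoding.

```python
import re

# Hypothetical patterns for illustration; real declarations have more variants.
ENCODING_RE = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([\w.-]+)["\']'
    rb'|<meta[^>]+charset\s*=\s*["\']?([\w.-]+)',
    re.IGNORECASE,
)
MARKER_RE = re.compile(
    r'<\?xml[^>]*\?>\s*|<meta[^>]+charset\s*=[^>]*>\s*',
    re.IGNORECASE,
)


def prepare(data, content_type):
    """Preparation: decode text/html bytes using the declared encoding."""
    if content_type != "text/html" or isinstance(data, str):
        return data  # text/xml is left for the XML parser to decode
    match = ENCODING_RE.search(data)
    # Assumption: fall back to UTF-8 when no declaration is found.
    name = (match.group(1) or match.group(2)).decode("ascii") if match else "utf-8"
    return data.decode(name)


def cook(text):
    """Cooking: strip the encoding markers once, at compile time."""
    # Stand-in for bytecode generation; takes already-decoded text.
    return MARKER_RE.sub("", text)


def render(cooked):
    """Rendering: nothing left to strip, so no extra work per call."""
    return cooked
```

With this split, the stripping cost is paid once in cook(), and render() can be called repeatedly on the cooked result without touching the declarations again.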
BTW, just curious why we need to read HTML files in text mode (See
I don't remember, but it seemed important at the time. It likely has
something to do with newline normalization; the XML parser handles
that for us since the XML specification requires it to, but the HTML
parser doesn't bother.
I doubt this is important in practice, but it may be relied on in the tests.
Maybe we can use "universal newlines" mode instead?
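
As a rough illustration of the "universal newlines" idea, the sketch below uses Python 3's open(), where newline=None translates CRLF and bare CR line endings to LF on read. The helper names read_normalized() and demo() and the sample content are hypothetical; this is not the existing file-reading code.

```python
import os
import tempfile


def read_normalized(path, encoding="utf-8"):
    """Read text with universal newlines: CRLF and CR both become LF."""
    # newline=None is the default in Python 3; spelled out here for clarity.
    with open(path, encoding=encoding, newline=None) as f:
        return f.read()


def demo():
    # Write mixed line endings in binary mode so they reach the disk untouched.
    fd, path = tempfile.mkstemp(suffix=".html")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"<p>one</p>\r\n<p>two</p>\r<p>three</p>\n")
        return read_normalized(path)
    finally:
        os.remove(path)
```

Reading this way gives the HTML parser the same normalized line endings regardless of the platform the file was saved on.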
Dmitry Vasiliev (dima at hlabs.spb.ru)
Zope3-dev mailing list