Hello Koen!

On Thu, May 19, 2016 at 7:11 AM, Koen Deforche <k...@emweb.be> wrote:
> Hey Frank,
>
> 2016-05-18 16:16 GMT+02:00 K. Frank <kfrank2...@gmail.com>:
>>
>> Just to be sure I understand:
>>
>> Could this be as simple as reading in a well-formed html file (say,
>> content.html, although the actual file name / extension is irrelevant --
>> could equally well be content.xyz), for example, using in c++ an ifstream,
>> and using the entire contents of the file as the argument to
>> WText::setText()?
>
> More or less, since you're relying on the browser to be forgiving of the
> erroneous <html>, <head>, <body> and other tags. In practice browsers are
> (too) forgiving of these kinds of markup mixups.
> If you want to do it cleaner, you would have to remove this junk (e.g. by a
> preprocessing step, or on the fly).

Thank you.  I've been experimenting to see how things behave with
what you call erroneous tags, and I have a question.

Could you give me some guidance as to how I should understand
the following behavior?

I have the following html file, test.html:

<!-- <body> -->
<!-- <badtag> -->
<!-- <badmatch> -->
<h1>heading</h1>
<p>paragraph</p>
<!-- </matchbad> -->
<!-- </badtag> -->
<!-- </body> -->

I have a simple Wt application that loads the contents of test.html
into a WText, and also links to test.html:

   new WText ("link to <a href='links/test.html'>test</a>");
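For reference, the loading step looks roughly like the sketch below. The
helper name readWholeFile is just illustrative (it is not part of Wt), and
the Wt call is shown only as a comment so that the snippet stands on its own:

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (the name is illustrative, not a Wt function):
// read the entire contents of a file into a std::string.
std::string readWholeFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the whole stream in one go
    return buffer.str();
}

// In the Wt application the result is then handed to the widget, roughly:
//   text->setText(readWholeFile("links/test.html"));
// where `text` is assumed to be a WText* created elsewhere.
```

The file name and extension play no role here; the string is passed along
exactly as it was read.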

I have four cases:  test.html, as given above; uncommenting the
<body> tag pair; uncommenting the <badtag> pair; and uncommenting
the <badmatch> pair.

In all four cases, when I navigate to test.html through the link, "heading"
and "paragraph" are displayed as I would expect.

In the WText, the original and <badtag> cases display the same as the
link.  In the <body> case, the WText (or at least its contents) doesn't
display at all, and in the <badmatch> case, the raw contents of test.html
are displayed -- that is, no formatting, and the comments and markup
tags are displayed.

I'm just wondering how I should model the processing in my mind in
order to understand this behavior in detail.  For example <badtag>
seems to be ignored, but <body> causes the contents not to be
displayed, while <badmatch> seems to turn off the html parsing.
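One model I can imagine for the <badmatch> case (purely a guess on my part):
if the text goes through a strict, XML-style parse before it is shown, then
an unknown but properly paired tag would pass, while a mismatched pair would
fail the parse. The toy check below only illustrates that distinction. It is
not Wt's parser, the name tagsBalanced is made up, and it says nothing about
why <body> behaves differently:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Toy well-formedness check (NOT Wt's parser): returns true when every
// opening tag is closed by a tag of the same name, in the right order.
bool tagsBalanced(const std::string& markup) {
    std::vector<std::string> open;  // stack of currently open tag names
    std::size_t i = 0;
    while ((i = markup.find('<', i)) != std::string::npos) {
        if (markup.compare(i, 4, "<!--") == 0) {  // skip comments entirely
            std::size_t end = markup.find("-->", i + 4);
            if (end == std::string::npos) return false;
            i = end + 3;
            continue;
        }
        std::size_t close = markup.find('>', i);
        if (close == std::string::npos) return false;
        bool closing = (i + 1 < close && markup[i + 1] == '/');
        bool selfClosing = (close > i + 1 && markup[close - 1] == '/');
        std::size_t nameStart = i + (closing ? 2 : 1);
        std::size_t nameEnd = nameStart;
        while (nameEnd < close &&
               std::isalnum(static_cast<unsigned char>(markup[nameEnd]))) {
            ++nameEnd;
        }
        std::string name = markup.substr(nameStart, nameEnd - nameStart);
        if (closing) {
            // A closing tag must match the most recently opened tag.
            if (open.empty() || open.back() != name) return false;
            open.pop_back();
        } else if (!selfClosing) {
            open.push_back(name);
        }
        i = close + 1;
    }
    return open.empty();
}
```

With this check, the original file and the <badtag> variant count as
balanced, while the <badmatch> variant does not, which at least lines up
with the two kinds of WText behavior I see for those cases.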

(I get the same results with recent versions of all three of Chrome,
Opera, and IE.)

I should note that there is nothing urgent or problematic about this.
I'm just trying to learn the details of what is going on.

(Also, am I right that to be fully well-formed for a free-standing web
page, the html file should have html, head, and body tags, while
these are technically not legal in an html fragment in a WText?)

> ...
> Regards,
> koen


Thanks again.


K. Frank

_______________________________________________
witty-interest mailing list
witty-interest@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/witty-interest