Hello Koen!

On Thu, May 19, 2016 at 7:11 AM, Koen Deforche <k...@emweb.be> wrote:
> Hey Frank,
>
> 2016-05-18 16:16 GMT+02:00 K. Frank <kfrank2...@gmail.com>:
>>
>> Just to be sure I understand:
>>
>> Could this be as simple as reading in a well-formed html file (say,
>> content.html, although the actual file name / extension is irrelevant --
>> could equally well be content.xyz), for example, using in c++ an
>> ifstream, and using the entire contents of the file as the argument to
>> WText::setText()?
>
> More or less, since you're relying on the browser to be forgiving for the
> erroneous <html>, <head>, <body> and other tags. In practice browsers are
> (too) forgiving for these kind of markup mixups.
> If you want to do it cleaner, you would have to remove this junk (e.g. by a
> preprocessing step, or on the fly).
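For reference, the preprocessing step mentioned above could look roughly like the sketch below. It reads the file with an ifstream and strips the document-level wrapper markup (doctype, the whole <head> element, and the <html> / <body> tags) before the string is handed to a WText. The helper names readFile and stripDocumentWrapper are hypothetical, not part of Wt, and the regex approach is an assumption that only suits small, reasonably tidy files; it is not a real HTML parser.

```cpp
#include <fstream>
#include <regex>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Wt): read a whole file into a string.
std::string readFile(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Hypothetical helper (not part of Wt): remove the document-level wrapper
// markup so that only the fragment that was inside <body> remains.
// Regex-based, so it is a sketch for small inputs, not a real HTML parser.
std::string stripDocumentWrapper(const std::string& html)
{
    // <!DOCTYPE ...>
    static const std::regex doctype(R"(<!DOCTYPE[^>]*>)", std::regex::icase);
    // The whole <head> ... </head> element, including its contents.
    static const std::regex head(R"(<head\b[^>]*>[\s\S]*?</head\s*>)",
                                 std::regex::icase);
    // Opening and closing <html> and <body> tags; their contents are kept.
    static const std::regex wrappers(R"(</?(html|body)\b[^>]*>)",
                                     std::regex::icase);

    std::string result = std::regex_replace(html, doctype, "");
    result = std::regex_replace(result, head, "");
    result = std::regex_replace(result, wrappers, "");
    return result;
}
```

In a Wt application the result would then be passed on along the lines of text->setText(Wt::WString::fromUTF8(stripDocumentWrapper(readFile("test.html")))), where text is an existing WText and the file is assumed to be UTF-8 encoded.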

Thank you. I've been experimenting to see how things behave with what you
call erroneous tags, and I have a question. Could you give me some guidance
as to how I should understand the following behavior?

I have the following html file, test.html:

   <!-- <body> -->
   <!-- <badtag> -->
   <!-- <badmatch> -->
   <h1>heading</h1>
   <p>paragraph</p>
   <!-- </matchbad> -->
   <!-- </badtag> -->
   <!-- </body> -->

I have a simple Wt application that loads the contents of test.html into a
WText, and also links to test.html:

   new WText ("link to <a href='links/test.html'>test</a>");

I have four cases: test.html, as given above; uncommenting the <body> tag
pair; uncommenting the <badtag> pair; and uncommenting the <badmatch> pair.

In all four cases, when I navigate to test.html through the link, "heading"
and "paragraph" are displayed as I would expect.

In the WText, the original and <badtag> cases display the same as the link.
In the <body> case, the WText (or at least its contents) doesn't display at
all, and in the <badmatch> case, the raw contents of test.html are
displayed -- that is, no formatting, and the comments and markup tags are
displayed.

I'm just wondering how I should model the processing in my mind in order to
understand this behavior in detail. For example, <badtag> seems to be
ignored, but <body> causes the contents not to be displayed, while
<badmatch> seems to turn off the html parsing. (I get the same results with
recent versions of all three of chrome, opera, and ie.)

I should note that there is nothing urgent or problematic about this. I'm
just trying to learn the details of what is going on.

(Also, am I right that to be fully well-formed for a free-standing web page,
the html file should have html, head, and body tags, while these are
technically not legal in an html fragment in a WText?)

> ...
> Regards,
> koen

Thanks again.

K. Frank

_______________________________________________
witty-interest mailing list
witty-interest@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/witty-interest