This looks like fun with encoding.
It looks like you are running in cmd.exe and calling puts there.
Watir doesn't parse HTML the way you see it with 'view source'; it gets
runtime data from the IE process (a big simplification on my part here).

Let's say I have a test.html page with the title "test" and the HTML
character entity for copyright, written as & copy; (or & #169;), somewhere
in the HTML source. (I put a space between & and copy so it will not be
rendered at runtime as "(c)".)

If I open cmd.exe, start irb, and do this:

irb(main):001:0> require 'watir'
=> true
irb(main):002:0> ie = Watir::IE.attach(:title, "test")
irb(main):005:0> ie.text
=> "test\251"

I get the HTML entity translated into its Latin-1 octal representation,
'\251' (hex is \xA9 and Unicode is \u00A9).
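
To make the three notations concrete, here is a minimal sketch (plain Ruby, no Watir needed) showing that octal 251, hex A9, and decimal 169 are the same number, and that the "\251" and "\xA9" string escapes produce the same single byte. The variable names are only for illustration.

```ruby
# Octal 0251, hex 0xA9 and decimal 169 are three spellings of one number.
octal   = 0251
hex     = 0xA9
decimal = 169
same_number = (octal == hex) && (hex == decimal)

# The string escapes "\251" and "\xA9" both yield a one-byte string.
octal_bytes = "\251".unpack("C*")
hex_bytes   = "\xA9".unpack("C*")

puts same_number          # prints true
puts octal_bytes.inspect  # prints [169]
puts hex_bytes.inspect    # prints [169]
```

So asserting with "\251" or with "\xA9" is the same check; it is only a matter of which notation you prefer to read.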

However, if I do this:
irb(main):009:0> puts ie.text
test⌐
=> nil

I get garbage (you get ? marks). puts writes something out and then the
Windows cmd console does something with it, so somewhere along the way there
are translations from HTML entity to Latin-1 and so on.
The short version is: I would basically stick to asserting on HTML entities
by their octal (or hex) representation.
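
One plausible explanation for the garbage, sketched below: the same byte 0xA9 is the copyright sign in Latin-1 but a different glyph in the OEM code page 437 that cmd.exe often uses. This is an assumption about the console setup (running chcp in cmd.exe shows the active code page), and the sketch relies on String#b and String#encode from Ruby 2.0 or later, which do not exist in the Ruby 1.8.x versions used in this thread.

```ruby
# The single byte that ie.text returned for the copyright entity.
raw = "\xA9".b

# Interpret that byte under two different single-byte encodings
# and convert each result to UTF-8 for comparison.
as_latin1 = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")
as_cp437  = raw.dup.force_encoding("IBM437").encode("UTF-8")

puts as_latin1  # prints ©
puts as_cp437   # prints ⌐
```

The string itself is the same in both cases; only the interpretation applied when displaying it differs, which is why the assertion on the byte still works even when the printed output looks wrong.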

Zeljko's name sometimes shows on my machine as '?eljko' because of the
encoding translations along the way the email travels and whatever encoding
is set on my machine. My Polish last name gets the same treatment. I have 2
non-Latin chars in my last name, so sometimes it arrives with ?? marks.

Paul suggested using the actual entity in the assertion. I tried it and it
doesn't work for me; I get false.

If I run the test as a script (not from irb):
<pre>
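require 'watir'

ie = Watir::IE.attach(:title, "test")
$KCODE = 'u'
puts ie.text
puts "(c)"
# Byte-level checks: octal \251 and hex \xA9 are the same byte
puts "found by octal" if ie.text.include?("\251")
puts "found by hex" if ie.text.include?("\xA9")
# Ruby 1.8 has no \u escape, so "\u00A9" is just the text "u00A9"
puts "found by unicode" if ie.text.include?("\u00A9")
puts "found by entity" if ie.text.include?("(c)")
# ie.text holds rendered text, so the entity source never appears in it
puts "found by html char set copy" if ie.text.include?("&copy;")
puts "found by html char set 169" if ie.text.include?("&#169;")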
</pre>

I get this:
test⌐ ⌐
©
found by octal
found by hex
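
Since only the byte-level checks matched, an assertion can be written against bytes directly. The helper below is a hypothetical sketch, not part of Watir; the name includes_bytes? and the two constants are made up for illustration. It sticks to String#unpack so it does not depend on the string's encoding.

```ruby
# Hypothetical helper (not a Watir API): true if the byte sequence
# `needle` occurs anywhere in the raw bytes of `text`.
def includes_bytes?(text, needle)
  haystack = text.unpack("C*")
  (0..haystack.length - needle.length).any? do |i|
    haystack[i, needle.length] == needle
  end
end

# Copyright sign as one Latin-1 byte and as a two-byte UTF-8 sequence.
COPYRIGHT_LATIN1 = [0xA9]
COPYRIGHT_UTF8   = [0xC2, 0xA9]

puts includes_bytes?("test\251", COPYRIGHT_LATIN1)    # prints true
puts includes_bytes?("test&copy;", COPYRIGHT_LATIN1)  # prints false
```

Note that the lone byte 0xA9 also occurs inside the UTF-8 sequence, and can occur inside other multi-byte UTF-8 characters, so the Latin-1 check is only reliable when the text really is Latin-1.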




marekj



On 7/13/07, jhe <[EMAIL PROTECTED]> wrote:

 I use "puts $ie.text" to print the HTML document in the Windows command
prompt, and find that it returns "?" instead of "\251".

In fact, when I view the source in IE, it displays "&copy;"



Now, it seems the difference is that your $ie.text returns "\251", but my
$ie.text only returns "?". Is it caused by the Ruby version?



I use Ruby-185-21, Watir-1.5.1.1192, and IE 6.

 Regards,

Jason
 ------------------------------

*From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
*On Behalf Of *Željko Filipin
*Sent:* July 12, 2007 17:51
*To:* wtr-general@rubyforge.org
*Subject:* Re: [Wtr-general] How to deal with the copyrihgt symbol



Strange, it works for me.

OK, let's see where the problem is. Navigate to a page that has only &copy;
as its text.
ie.text returns "\251"

ie.text
=> "\251"

ie.text.include?("\251") returns true

ie.text.include?("\251")
=> true

Am I missing something?

I have Ruby 1.8.6, Watir 1.5.1.1192 and IE 6.

Zeljko

_______________________________________________
Wtr-general mailing list
Wtr-general@rubyforge.org
http://rubyforge.org/mailman/listinfo/wtr-general
