Watir mostly uses WIN32OLE to access Windows COM APIs. For example, the document object has an 'outerHTML' COM method.

This method call works when the text is ANSI. But when the text in the web page is UTF-8 and includes Japanese, the non-ANSI (non-ASCII) text is not returned. It's not a matter of translation -- the text is exactly "???????", with question marks replacing all the non-ANSI text.

Once it is munged, there is no way to fix the text, no matter whether I use uconv or iconv or foo-conv.
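To illustrate why no converter can repair the text afterwards, here is a minimal sketch in current Ruby (1.9 or later, which has String#encode; it is not the Ruby 1.8 discussed in this thread). It assumes the munging behaves like a lossy conversion to an ANSI code page; the sample string and the choice of Windows-1252 are made up for the demonstration, and this is not WIN32OLE's actual code.

```ruby
# Hypothetical page text: ASCII followed by three Japanese characters (UTF-8).
original = "abc\u65E5\u672C\u8A9E"

# Lossy step: convert to an ANSI code page, substituting "?" for every
# character that the code page cannot represent.
ansi = original.encode("Windows-1252", undef: :replace, replace: "?")

# Converting back to UTF-8 cannot restore anything: each "?" is now a
# genuine question mark, and the original code points are gone.
round_trip = ansi.encode("UTF-8")

puts round_trip               # abc???
puts round_trip == original   # false
```

Every Japanese character collapses to the same byte, so uconv, iconv and friends have nothing left to work with. The fix has to happen before the lossy step.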

I need a way to get WIN32OLE to give me non-munged text.

I fear that the WIN32OLE library needs to be changed to support UTF-8 (or other encodings).

Because the same method call works in VBA, I believe that the problem is with the WIN32OLE library rather than the COM method itself.

WIN32OLE author Nobu Nakada tells me that WIN32OLE codepage support will be in Ruby 1.8.3. I'm checking it out right now...
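For reference, a hedged sketch of how the code page setting looks in Ruby versions whose win32ole library provides it. The helper name use_utf8_for_ole is made up for this example; WIN32OLE.codepage= and WIN32OLE::CP_UTF8 come from the win32ole library, and the sketch assumes a Ruby build that includes them. I have not verified which release first shipped them.

```ruby
# Hypothetical helper: switch WIN32OLE string conversion to UTF-8.
# Returns true on success, false when win32ole (or its code page
# support) is not available, for example on a non-Windows machine.
def use_utf8_for_ole
  require 'win32ole'
  WIN32OLE.codepage = WIN32OLE::CP_UTF8
  true
rescue LoadError, NameError
  false
end

utf8_enabled = use_utf8_for_ole
puts "WIN32OLE UTF-8 code page enabled: #{utf8_enabled}"
```

With the code page set before any COM calls are made, strings returned from OLE methods should arrive as UTF-8 instead of being squeezed through the ANSI code page.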

Bret

At 01:41 AM 9/8/2005, saud aziz wrote:
Have you guys tried checking Yoshidam's page?

http://www.yoshidam.net/Ruby.html

The following text is from there. Also, notice there is a Perl module as well, and I wonder if that could be reused to make Watir use Perl's Win32 module for popup handling?



2. Uconv module <http://www.yoshidam.net/Ruby.html#uconv>



The XMLParser module can process UTF-16 and UTF-8, but cannot process Japanese encodings (e.g. EUC-JP, Shift_JIS and ISO-2022-JP). The Uconv module provides methods to convert UTF-16, UTF-8 or UCS-4 into EUC-JP or CP932, and EUC-JP or CP932 into UTF-16, UTF-8 or UCS-4.
   * version 0.4.12 download <http://www.yoshidam.net/uconv-0.4.12.tar.gz>
   * Document (Japanese) <http://www.yoshidam.net/uconv_ja.txt>
   * Document (English) <http://www.yoshidam.net/uconv_en.txt>
Changes of version 0.4.12
   * support Ruby 1.8
Changes of version 0.4.11
   * append --enable-compat-win32api option for Win32API compatible CP932 table.
Changes of version 0.4.10
   * fix memory leaks
   * append --enable-fullwidth-reverse-solidus option.
Changes of version 0.4.9
   * add replace_invalid
Changes of version 0.4.8
   * support the tainted status
   * check non-shortest form UTF-8
   * change Exception into Uconv::Error
Changes of version 0.4.6
   * fix s2u_conv
   * add USE_WIN32API
Changes of version 0.4.5
   * fix u2s_conv
   * change UCS/CP932 conversion table
Changes of version 0.4.4
   * SJIS to UCS conversion bug
Changes of version 0.4.3
   * Eliminate non-constant initializers
Changes of version 0.4.2
   * ZWNBSP-preservative mode
Changes of version 0.4.0
   * Support CP932 (a variant of Shift_JIS for Japanese Windows)
Changes of version 0.3.1
   * Fix some memory bugs
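
As a rough illustration of the kind of conversion described above, here is a sketch that uses the built-in String#encode of current Ruby (1.9 or later) rather than Uconv itself, so no Uconv method names are assumed. The sample string is made up; "Windows-31J" is Ruby's name for CP932.

```ruby
# Hypothetical sample: three Japanese characters as a UTF-8 string.
utf8 = "\u65E5\u672C\u8A9E"

# UTF-8 into the Japanese encodings and into UTF-16.
euc   = utf8.encode("EUC-JP")
cp932 = utf8.encode("Windows-31J")
utf16 = utf8.encode("UTF-16LE")

puts "UTF-8:  #{utf8.bytesize} bytes"    # 9
puts "EUC-JP: #{euc.bytesize} bytes"     # 6
puts "CP932:  #{cp932.bytesize} bytes"   # 6
puts "UTF-16: #{utf16.bytesize} bytes"   # 6

# And back again; these conversions are lossless for this text.
puts euc.encode("UTF-8") == utf8         # true
puts cp932.encode("UTF-8") == utf8       # true
```

Unlike the munged "?" case, these round trips preserve the text because every character exists in both encodings.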



8. Unicode library



This is a library for Unicode Normalization.
   * version 0.1 download <http://www.yoshidam.net/unicode-0.1.tar.gz>
   * Document <http://www.yoshidam.net/unicode.txt>
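
For readers unfamiliar with Unicode normalization, a small sketch of what it does. This uses String#unicode_normalize from current Ruby (2.2 or later), not the Unicode library listed above, whose API is not shown here; the sample strings are made up.

```ruby
# "e" followed by a combining acute accent: two code points, one glyph.
decomposed = "e\u0301"

# The same glyph as a single precomposed code point (U+00E9).
composed = "\u00E9"

puts decomposed == composed                          # false
puts decomposed.unicode_normalize(:nfc) == composed  # true
puts composed.unicode_normalize(:nfd).length         # 2
```

Normalization matters when comparing text that came from different sources, since visually identical strings can differ in their code points.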



12. rbuconv library



This is a pure Ruby library for Unicode translation. It can be used on systems without C compilers, and is almost compatible with the Uconv library. Ruby license.
   * version 0.1.2 download <http://www.yoshidam.net/diary/rbuconv-0.1.2.tar.gz>




On 9/7/05, Bret Pettichord <[EMAIL PROTECTED]> wrote:
At 12:14 AM 9/8/2005, Alexey Verkhovsky wrote:
>Bret Pettichord wrote:
>
>>I'd appreciate any insight into this matter that you might have. I've
>>spent a couple days reading up on UTF-8 and Unicode, but don't really
>>have any experience with these things.
>
>Bret, hi
>
>Try to add this to your code:
>
># Enable UTF-8 support
>$KCODE = 'u'
>require 'jcode'
>
>Not sure if it will help for OLE, but it may.
>
>Alex

Thanks for the idea. No luck. Here's my complete test script:



$KCODE = 'u'
require 'jcode'
require 'spreadsheet'

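# Copy the value in row 1 of the given column to row 2, then dump it
# so the round trip through WIN32OLE can be inspected.
def copy_column(column)
  data = $sheet.cells(1, column).value
  $sheet.cells(2, column).value = data
  puts data
  puts data.unpack("U*")             # code points, one per line
  puts "length: #{data.length}"      # byte count under Ruby 1.8
  puts "jlength: #{data.jlength}"    # character count (from jcode)
  require 'breakpoint'; breakpoint   # stop here for interactive inspection
end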

this_dir = File.expand_path(File.dirname(__FILE__))
workbook = Workbook.new("#{this_dir}\\intl_text.xls")
workbook.visible = true
$sheet = workbook.use_page 1

copy_column 'b'
copy_column 'c'
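
For anyone reading the output of the script above, a sketch of what the unpack and length checks show for intact versus munged text. It is written for current Ruby (1.9 or later), where String#length counts characters and bytesize counts bytes, so it does not need $KCODE or jcode. The two sample values and the helper name describe are made up; they are not the contents of intl_text.xls.

```ruby
# Hypothetical cell values: intact Japanese text and its munged form.
intact = "\u65E5\u672C\u8A9E"
munged = "???"

# Print the same diagnostics as the test script for one value.
def describe(label, text)
  puts "#{label}: code points #{text.unpack('U*').inspect}, " \
       "#{text.length} chars, #{text.bytesize} bytes"
end

describe("intact", intact)  # code points [26085, 26412, 35486], 3 chars, 9 bytes
describe("munged", munged)  # code points [63, 63, 63], 3 chars, 3 bytes
```

Code point 63 is the ASCII question mark, so a column of 63s from unpack("U*") means the text was already lost before Ruby received it.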







_____________________
Bret Pettichord
www.pettichord.com

_______________________________________________
Wtr-general mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/wtr-general




--
"..man is a human being, not because of his physical powers for physically the camel is his superior; not because of his size for the elephant is larger; not because of his courage for the lion is more courageous; not because of his appetite for the ox has the greater; not because of coitus for the least of the birds is more virile than he, but rather by virtue of his noble aims and ideals. [As a matter of fact] he was only created to know." (Al- Ghazali; The book of Knowledge, Section 1)

