I'm using the lazy ByteString representation to match against, so it's
no surprise that it fails.

On Fri, Mar 20, 2009 at 11:41 AM, Chris  Kuklewicz <[email protected]> wrote:
> With [Char] and (Seq Char) the text is full unicode.
>
> With ByteString and ByteString.Lazy you are really using
> ByteString.Char8 and ByteString.Lazy.Char8
>
> Here is a test (I saved the source file in utf8):
>
> import Text.Regex.TDFA
> text = "☮☯♲☢☣☠☃"
> regex = "(☢|☣)"
> search :: [[String]]
> search = text =~ regex
> main = do
>  print text
>  print regex
>  print search
>
> in ghci this prints:
>
> *Main> main
> main
> "\9774\9775\9842\9762\9763\9760\9731"
> "(\9762|\9763)"
> [["\9762","\9762"],["\9763","\9763"]]
>
> So this works.  Are you using bytestrings to hold unicode as utf-8 or
> utf-16 ?


yup, utf-8.

Thanks for the quick reply!
-- JP

--~--~---------~--~----~------------~-------~--~----~
Yi development mailing list
[email protected]
http://groups.google.com/group/yi-devel
-~----------~----~----~----~------~----~------~--~---

Reply via email to