https://bugzilla.wikimedia.org/show_bug.cgi?id=58784

--- Comment #5 from Tim Landscheidt <t...@tim-landscheidt.de> ---
(In reply to comment #3)
> [...]
> Setting LANG="en_US.UTF-8" (or any other UTF-8 locale) should solve this
> issue.

It apparently does, because I have:

| export LANG=de_DE.UTF-8

in ~/.profile, and for me:

| scfc@tools-login:~$ diff -u test.out <(./test.sh)
| scfc@tools-login:~$

My test account, however, shows "LANG=en_US.UTF-8" interactively, while
"jsub locale" gives "LANG=", even after "export LANG".  The same happens if I
set the locale to something other than "en_US.UTF-8" before jsub with
"export LANG=de_DE.UTF-8".

My assumption (and fear :-)) is that SGE sources ~/.profile before job
execution, which means that there will be a *lot* of confusion about where to
configure locales and how they are evaluated.

I don't want to go down that road if it can be avoided.  Is it possible to
explicitly set the locale in Python?  Otherwise we could change jsub so that
users can use qsub's "-v" option to set the locale in the environment:

| scfc-test@tools-login:~$ qsub -b y -N locale-en -v LANG=en_US.UTF-8 locale
| Your job 1934859 ("locale-en") has been submitted
| scfc-test@tools-login:~$ qsub -b y -N locale-de -v LANG=de_DE.UTF-8 locale
| Your job 1934865 ("locale-de") has been submitted
| scfc-test@tools-login:~$ fgrep LANG locale-*.o*
| locale-de.o1934865:LANG=de_DE.UTF-8
| locale-de.o1934865:LANGUAGE=
| locale-en.o1934859:LANG=en_US.UTF-8
| locale-en.o1934859:LANGUAGE=
| scfc-test@tools-login:~$

However, that does not seem to solve the Python error:

| scfc-test@tools-login:~$ cat test.py 
| #!/usr/bin/python
| print u"\xe4"
| scfc-test@tools-login:~$ qsub -b y -N python-locale-en -v LANG=en_US.UTF-8 ./test.py
| Your job 1934872 ("python-locale-en") has been submitted
| scfc-test@tools-login:~$ cat python-locale-en.*
| Traceback (most recent call last):
|   File "./test.py", line 2, in <module>
|     print u"\xe4"
| UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
| scfc-test@tools-login:~$
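
The traceback suggests that the interpreter ended up with an ASCII encoding
for stdout in the job.  A minimal sketch that reproduces the same error
outside of SGE, assuming that this is indeed what happens (the helper name
"try_encode" is made up for illustration):

```python
# -*- coding: utf-8 -*-
# Hypothetical helper: encode a unicode string the way "print" would have to
# for a stream with the given encoding, and report what happens.
def try_encode(text, encoding):
    """Return the encoded bytes, or the name of the error that was raised."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return type(exc).__name__


# ASCII cannot represent U+00E4, so this reports the error from the traceback.
print(repr(try_encode(u"\xe4", "ascii")))
# UTF-8 can, so this yields the two-byte sequence for U+00E4.
print(repr(try_encode(u"\xe4", "utf-8")))
```

So the open question is only why Python picks ASCII even though LANG is set
in the job's environment.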

And for the dbreps tool I indeed had to use:

|     # Wrap sys.stdout into a StreamWriter to allow writing unicode.
|     sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout)

But that is Python 2.7.3 (cf.
http://stackoverflow.com/questions/1473577/writing-unicode-strings-via-sys-stdout-in-python,
http://pythonhosted.org/kitchen/unicode-frustrations.html,
https://wiki.python.org/moin/PrintFails).

I don't know what the situation is for Python 3+.
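
As far as I understand the documentation, sys.stdout in Python 3 is a text
stream on top of the binary stream sys.stdout.buffer, and the PYTHONIOENCODING
environment variable can override its encoding.  An untested-on-the-grid
sketch of the equivalent wrapper, assuming that UTF-8 output is what we want
regardless of the locale (the helper name "utf8_writer" is made up):

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


# Demonstrated on an in-memory buffer here; in a job script the argument
# would presumably be sys.stdout.buffer instead (an assumption, not tested
# under SGE).
buf = io.BytesIO()
out = utf8_writer(buf)
out.write(u"\xe4\n")
out.flush()
print(repr(buf.getvalue()))
```

That would keep the encoding decision in the script instead of in
~/.profile or the qsub options.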
