On Mon, Mar 7, 2011 at 10:03 AM, Toshio Kuratomi <[email protected]> wrote:
> On Mon, Mar 07, 2011 at 10:56:07AM -0500, seth vidal wrote:
>> On Mon, 2011-03-07 at 10:52 -0500, James Antill wrote:
>> > ---
>> >  yum/misc.py |    7 ++++++-
>> >  1 files changed, 6 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/yum/misc.py b/yum/misc.py
>> > index 8e81c34..305d4aa 100644
>> > --- a/yum/misc.py
>> > +++ b/yum/misc.py
>> > @@ -977,7 +977,8 @@ def getloginuid():
>> >  # ---------- i18n ----------
>> >  import locale
>> >  import sys
>> > -def setup_locale(override_codecs=True, override_time=False):
>> > +def setup_locale(override_codecs=True, override_time=False,
>> > +                 override_encoding=True):
>> >      # This test needs to be before locale.getpreferredencoding() as that
>> >      # does setlocale(LC_CTYPE, "")
>> >      try:
>> > @@ -995,6 +996,10 @@ def setup_locale(override_codecs=True, override_time=False):
>> >          import codecs
>> >          sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout)
>> >          sys.stdout.errors = 'replace'
>> > +    if override_encoding:
>> > +        # Dear python, please let your 'ascii' default die in a fire. kthxbye
>> > +        reload(sys)
>> > +        sys.setdefaultencoding('utf-8')
>> >
>> >
>> >  def get_my_lang_code():
>>
>>
>> So, you're just interested in seeing what ways this breaks things?
>>
>> How about we apply this to rawhide yum first, just for s&g and see what
>> goes KABOOM before applying upstream?
>>
> Although getting rid of sys.setdefaultencoding() is probably a good thing
> (upstream python claims that using it will break certain aspects of text
> handling in python's internals), I agree that there's a lot of potential to
> break stuff by making this change.  Test and fix will be in order when this
> is applied.
>
Actually -- I read that wrong.  You're adding sys.setdefaultencoding() into
the mix... I thought it was subtracting it.

Adding sys.setdefaultencoding() at this stage in yum's development is not
a safe change.  Martin v. Löwis writes that using sys.setdefaultencoding()
will change the behaviour of hash() and therefore the behaviour of
comparisons::

    http://article.gmane.org/gmane.comp.python.devel/109917

It took me some experimenting to figure out a test case that shows this, so
I'll list it here::

    $ python
    Python 2.7 (r27:82500, Sep 16 2010, 18:02:00)
    [GCC 4.5.1 20100907 (Red Hat 4.5.1-3)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = {u'á': 1}
    >>> u'á'.encode('latin-1') in a
    __main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
    False
    >>> u'á'.encode('utf-8') in a
    False
    >>> u'á'.encode('latin-1') == a.keys()[0]
    False
    >>> u'á'.encode('utf-8') == a.keys()[0]
    False
    >>> import sys
    >>> reload(sys)
    <module 'sys' (built-in)>
    >>> sys.setdefaultencoding('utf-8')
    >>> a = {u'á': 1}
    >>> u'á'.encode('latin-1') in a
    False
    >>> u'á'.encode('utf-8') in a
    False
    >>> u'á'.encode('latin-1') == a.keys()[0]
    False
    >>> u'á'.encode('utf-8') == a.keys()[0]
    True
    >>>

Note the last two results: once the default encoding is utf-8, the UTF-8
byte string compares equal to the dict's key but still is not found by
``in``, so equality and dict lookup no longer agree.

Since sys.setdefaultencoding() is a global change, it affects all code that
runs as part of yum, not just the code inside yum itself.  So this seems
like a change that's going to lead to other breakage, possibly in libraries
that yum uses.  There won't be any possibility of getting those libraries
changed, because the reason the problems occur is that you're doing
something wrong by using sys.setdefaultencoding().

-Toshio
_______________________________________________
Yum-devel mailing list
[email protected]
http://lists.baseurl.org/mailman/listinfo/yum-devel
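
For comparison with the sys.setdefaultencoding() approach discussed in this
message, here is a minimal sketch of the alternative: decoding byte strings
explicitly where they enter the program and leaving the interpreter-wide
default alone. The helper name to_text and the variable names are
hypothetical and chosen only for illustration; they are not taken from the
patch or from yum's API.

```python
# Hypothetical helper, for illustration only (not part of the patch or of yum).
# It decodes byte strings explicitly instead of relying on the global default
# encoding, so nothing outside the caller's own code is affected.

def to_text(value, encoding='utf-8', errors='replace'):
    """Return value as text, decoding byte strings with the given encoding."""
    if isinstance(value, bytes):  # 'bytes' is an alias for 'str' on Python 2.6+
        return value.decode(encoding, errors)
    return value


table = {u'\xe1': 1}                 # u'\xe1' is a-acute, the key used in the session above
utf8_key = u'\xe1'.encode('utf-8')   # bytes as they might arrive from a file or pipe

# After explicit decoding, the key is a text string, so equality and dict
# lookup agree without touching sys.setdefaultencoding().
found = to_text(utf8_key) in table
```

Because the decoding happens at the call site, libraries that yum imports
keep seeing the stock 'ascii' default and behave as their authors expect.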
