On Fri, May 30, 2014 at 05:18:19PM -0400, James Antill wrote: > On Fri, 2014-05-30 at 16:46 +0200, Lars Ellenberg wrote: > > Hi there. > > (Cc me, I'm not subscribed) > > > > I've got a "yum check" taking two hours. > > (really single core CPU time, actually, not much IO going on there, afaics) > > That is not normal. Whenever I've seen this kind of problem it's always > been a problem in the rpm indexes, rpm --rebuilddb might help.
Did that. Does not help. > > I cProfiled it. > > > > 60% of the time it seems to spend in i18n.str_eq > > > > I pastied an SVG (gprof2dot) here: http://paste.fedoraproject.org/105517/raw > > > > It does 2001811124 calls to i18n.str_eq > > Yes, that's 2 billion. > > ... seems excessive ... > > Indeed. > > > rpm -qa | wc -l : 716 > > rpm -qa --provides | wc -l : 71774 > > yum-3.2.29-40.el6.centos.noarch > > That doesn't seem super excessive. > > > Seems like you have some O(2) or worse behaviour below check_provides(), > > You can see the code easily enough, it's generic and could be micro > optimised but it's basically: > > for each package pkg: > for each provide in pkg prov: > lookup prov in rpmdb and make sure pkg is in the results Yes. And somewhere *below* that there seem to be additional loops, where I don't really understand what is supposed to be happening there, or what exactly they are looping over, but it seems to be some linear search? checkPrco for (n, f, (e, v, r)) in self.returnPrco(prcotype): if i18n.str_eq(reqn, n): return 1 matchingPrcos for (n, f, (e, v, r)) in self.returnPrco(prcotype): if not i18n.str_eq(reqn, n): continue > ...so it's roughly: > > (provide lookup in rpmdb) * (provides in rpmdb) > > ...or in your case: > > (time rpm -q --whatprovides blah) * 72000 ~6900 seconds / 72000 "provides" that would mean 95 ms lookup time on average. # time rpm -q --whatprovides does-not-exist --stats no package provides does-not-exist total: 1 0.000000 MB 0.002827 secs digest: 4 0.016060 MB 0.000018 secs dbget: 6 0.016092 MB 0.000450 secs real 0m0.025s user 0m0.015s sys 0m0.009s # time rpm -q --whatprovides bash --stats bash-4.1.2-15.el6_4.x86_64 total: 1 0.000000 MB 0.004547 secs digest: 5 0.063624 MB 0.001372 secs dbget: 7 0.063664 MB 0.000817 secs real 0m0.027s user 0m0.017s sys 0m0.009s # time rpm -q --whatprovides kernel --stats kernel-2.6.32-279.14.1.el6.x86_64 kernel-2.6.32-279.14.1.el6.x86_64 kernel-2.6.32-431.1.2.el6.x86_64 kernel-2.6.32-431.1.2.el6.x86_64 kernel-debug-2.6.32-279.14.1.el6.x86_64 kernel-debug-2.6.32-431.1.2.el6.x86_64 kernel-2.6.32-358.18.1.el6.x86_64 kernel-2.6.32-358.18.1.el6.x86_64 kernel-debug-2.6.32-358.18.1.el6.x86_64 total: 1 0.000000 MB 0.197096 secs digest: 10 14.959840 MB 0.090771 secs dbget: 12 14.959944 MB 0.045128 secs real 0m0.220s user 0m0.173s sys 0m0.046s Same, adding --nodigest --nosignature total: 1 0.000000 MB 0.113099 secs digest: 10 14.959840 MB 0.000140 secs dbget: 12 14.959944 MB 0.045224 secs real 0m0.137s user 0m0.087s sys 0m0.049s Note the factor of >50 in dbget (bash vs kernel) > > ...if the rebuild doesn't fix it, I'd try putting print statements in > rpmsack:check_provides ... to see which package/provide is causing the > problem. Ok. So I added a call counter to i18n.str_eq, and reset that for each "provides" to be checked, and print processed name and calls_to_str_eq (and a similar "import mystats; ++calls" to i18n.str_eq): def check_provides(self): """ For each package, check that a provides search for it's name (and everything it provides) finds it. """ problems = [] for pkg in sorted(self.returnPackages()): for provtup in pkg.provides: import mystats mystats.calls_to_str_eq = 0 name, flags, version = provtup if pkg not in self.getProvides(name, flags, version): problems.append(RPMDBProblemProvides(pkg, provide=provtup)) break try: print "%s %d" % (name, mystats.calls_to_str_eq) except: print "??? %d" % (mystats.calls_to_str_eq) return problems There are many that seem to hit some cache, and have a zero call count. Then there are some small, a few in the 100 or 800 range. Once it starts to check "kernel" provides, its above 64000: ... config(glibc) 564 config(ipa-gothic-fonts) 98 font(:lang=aa) 0 libzip.so()(64bit) 89 libzip.so(SUNWprivate_1.1)(64bit) 89 java-1.6.0-openjdk 177 java-1.6.0-openjdk(x86-64) 178 java_cup-0.10k.jar.so()(64bit) 4 jline 2 config(jpackage-utils) 3 jpackage-utils 4 kbd 3 kbd(x86-64) 4 kbd-misc 2 ... kernel 64770 kernel-x86_64 64773 kernel-xen 32280 kernel(FsmChangeState) 64812 kernel(FsmDelTimer) 64818 ... So for each "provides" line of the kernel package and there are >= ten thousand such provides per kernel package, it needs to do >= 64 thousand str_eq? What's happening there? Lars _______________________________________________ Yum-devel mailing list Yum-devel@lists.baseurl.org http://lists.baseurl.org/mailman/listinfo/yum-devel