On Thu, 17 Feb 2011, seth vidal wrote:
On Thu, 2011-02-17 at 19:09 +0200, Panu Matilainen wrote:
On 02/17/2011 04:17 PM, James Antill wrote:
On Thu, 2011-02-17 at 11:48 +0200, Panu Matilainen wrote:
The open/close file callbacks only need NEVRA information in the "key".
Copy the necessary bits into a regular dict and pass that instead
of the entire header, the callback wont even know the difference
as the header behaves like a dict.
Saves (tens of) megabytes of memory on large transactions.
---
yum/depsolve.py | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
ACK.
Oh crap... just realized this breaks anaconda and possibly something
else too.
I keep forgetting how completely broken up the python callback is: the
header is never there unless its passed as (a part of) the package
"key", wasting huge amounts of memory. So anaconda (and possibly others)
are relying on it being a header (even testing instanceof(rpm.hdr), and
accessing various things (summary, size, count of 'basenames') from there.
Anaconda can be fixed easily enough: just make it use the information
from package object (which it can find from the nevra info) instead of
the assumed header. Or we could just copy a bit more data to the
"minimal header" and it'd still be a big win over the full header... but
there's no telling what header data other API users might try to access.
And then there's the object type change from rpm.hdr to dict too.
So, it's an API change for sure, far more so than I originally thought
it would be :-/
One possibility is making the new behavior opt-in, such as adding a
fullheader=True keyword argument to populateTs() so that yum itself can
benefit from it now, and give callers the old compatible behavior with
full header until they update their code.
ugh - I don't think that's such a fabulous plan. Decorating the code in
more interesting ways doesn't thrill me.
Well, it isn't exactly a whole lot of code. Something like this (untested
but you'll get the idea):
diff --git a/yum/depsolve.py b/yum/depsolve.py
index 997c66d..b99265d 100644
--- a/yum/depsolve.py
+++ b/yum/depsolve.py
@@ -179,7 +179,7 @@ class Depsolve(object):
return False
- def populateTs(self, test=0, keepold=1):
+ def populateTs(self, test=0, keepold=1, fullheader=1):
"""take transactionData class and populate transaction set"""
if self.dsCallback: self.dsCallback.transactionPopulation()
@@ -220,12 +220,18 @@ class Depsolve(object):
txmbr.ts_state = 'i'
txmbr.output_state = TS_INSTALL
- # grab essential data for callback use in hdr-like form
- minhdr = {}
- for item in ['epoch', 'name', 'version', 'release', 'arch']:
- minhdr[item] = hdr[item]
+ # Traditionally the entire header was used as part of the
+ # pkg key, by default give this compatible behavior. However
+ # this wastes gobs of memory for no good reason, the callback
+ # really only needs nevra to locate its files.
+ if fullhdr:
+ cbhdr = hdr
+ else:
+ cbhdr = {}
+ for item in ['epoch', 'name', 'version', 'release',
'arch']:
+ cbhdr[item] = hdr[item]
- self.ts.addInstall(hdr, (minhdr, rpmfile), txmbr.ts_state)
+ self.ts.addInstall(hdr, (cbhdr, rpmfile), txmbr.ts_state)
self.verbose_logger.log(logginglevels.DEBUG_1,
_('Adding Package %s in mode %s'), txmbr.po,
txmbr.ts_state)
if self.dsCallback:
This "decoration" is worth tens of megabytes of saved memory on bigger
installs/updates, me thinks its well worth those couple of lines. It's
either something to this effect, or revert the API-breaking patch and keep
the memory bloat until cowboys come home. This isn't something rpm can
magically fix either - when the rpm-side callback is finally changed to
something better it's going to require bigger changes than the above.
- Panu -
_______________________________________________
Yum-devel mailing list
[email protected]
http://lists.baseurl.org/mailman/listinfo/yum-devel