Hi!

Thanks for the interest in improving metadata (MD) download!
Just a few thoughts (I don't consider myself experienced
in the codebase, especially the createrepo part).

- Sharding metadata is very likely not an option.

The per-file overhead (to download, to store, to query)
is significant, and for a useful fraction of the files to
stay unmodified between updates, we'd need quite a lot of
shards (100+ I guess).
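
A rough back-of-the-envelope sketch of that estimate follows. It
assumes each changed package lands in a uniformly random shard,
independently of the others; the figure of 50 changed packages per
update is made up for illustration, not measured on any real repo.

```python
def expected_unchanged_fraction(num_shards, changed_packages):
    """Expected fraction of shards that no changed package falls into.

    Assumption: every changed package hits one shard chosen uniformly
    at random, independently of the other changes.
    """
    return (1.0 - 1.0 / num_shards) ** changed_packages


# Hypothetical update touching 50 packages, for a few shard counts.
for shards in (10, 100, 1000):
    fraction = expected_unchanged_fraction(shards, 50)
    print("%5d shards: %.3f of them untouched" % (shards, fraction))
```

Under these assumptions a handful of shards are almost all rewritten
on every update, and only at around 100 shards does a majority
survive untouched.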

- Rsync-friendly metadata is IMO a better option, but...

1) pkgKey values are assigned sequentially, so adding or
removing a package in the middle shifts the key of every
later package, touching about 50% of the metadata on average.

2) It's very likely (although I'm not sure) that building
the sqlite DB from scratch from two slightly different inputs
produces two very different databases that rsync poorly
(due to records ending up at different page offsets).
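
To illustrate point 1, here is a toy sketch of sequential key
assignment. Numbering packages in sorted-name order is an assumption
made for the example; I have not checked what order createrepo
actually uses, and the helper names are hypothetical.

```python
def assign_sequential_keys(package_names):
    """Number packages 1..N in sorted order (assumed scheme)."""
    ordered = sorted(package_names)
    return dict((name, key) for key, name in enumerate(ordered, start=1))


def renumbered(old_keys, new_keys):
    """Packages present in both runs whose key changed."""
    return sorted(name for name in old_keys
                  if name in new_keys and old_keys[name] != new_keys[name])


before = assign_sequential_keys(["bash", "gcc", "kernel", "vim", "zsh"])
after = assign_sequential_keys(["bash", "gcc", "glibc", "kernel", "vim", "zsh"])

# Adding "glibc" shifts every package that sorts after it.
print(renumbered(before, after))  # prints ['kernel', 'vim', 'zsh']
```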

So, keeping pkgKeys persistent (1), and building the
new metadata database by copying the old one and applying
a set of inserts/deletes/updates (2), would help a lot.
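
A minimal sketch of that idea, using Python's sqlite3 module. The
table below is a heavily simplified stand-in for the real primary
database schema, apply_delta() is a hypothetical helper, and an
in-memory database replaces the "copy the old file first" step.
Whether the resulting file really rsyncs better is untested here.

```python
import sqlite3


def apply_delta(conn, removed, added):
    """Update the packages table in place instead of rebuilding it.

    Rows that stay keep their pkgKey; new rows get a key above the
    highest one ever used (AUTOINCREMENT never reuses keys).
    """
    with conn:  # one transaction, committed on success
        conn.executemany("DELETE FROM packages WHERE name = ?",
                         [(name,) for name in removed])
        conn.executemany("INSERT INTO packages (name, version) VALUES (?, ?)",
                         added)


# Stand-in for the previous metadata DB (a real tool would copy the file).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages ("
             "pkgKey INTEGER PRIMARY KEY AUTOINCREMENT, "
             "name TEXT UNIQUE, version TEXT)")
conn.executemany("INSERT INTO packages (name, version) VALUES (?, ?)",
                 [("bash", "4.1"), ("kernel", "2.6.32"), ("vim", "7.2")])
conn.commit()

keys_before = dict(conn.execute("SELECT name, pkgKey FROM packages"))
apply_delta(conn, removed=["kernel"], added=[("glibc", "2.12")])
keys_after = dict(conn.execute("SELECT name, pkgKey FROM packages"))

# bash and vim keep their keys; glibc gets a fresh one.
print(keys_before)
print(keys_after)
```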

Then there's another issue: compressed sqlite files
are currently the primary means of metadata distribution,
but that's likely to change.

On the yum side, there are other problems:

3) There is no rsync:// support in libcurl or urlgrabber.

Yum would probably have to exec() rsync, and that integrates
badly (no mirror failover, different progress meters, etc.).
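
For a sense of what exec()ing rsync would involve, here is a rough
sketch of hand-rolled mirror failover around the rsync binary. The
function name, the mirror URLs in the usage note, and the choice of
options are assumptions for illustration; nothing like this exists
in yum or urlgrabber, and progress reporting is not handled at all.

```python
import subprocess


def fetch_with_failover(mirrors, relative_path, dest, run=subprocess.call):
    """Try each rsync mirror in turn.

    Returns the mirror that succeeded, or None if all failed or the
    rsync binary could not be started. `run` is injectable so the
    logic can be exercised without a real rsync.
    """
    for mirror in mirrors:
        source = mirror.rstrip("/") + "/" + relative_path
        cmd = ["rsync", "--times", "--partial", "--timeout=30", source, dest]
        try:
            if run(cmd) == 0:  # rsync exits with 0 on success
                return mirror
        except OSError:
            # rsync is missing or not executable; other mirrors won't help.
            return None
    return None
```

Usage would look something like
fetch_with_failover(["rsync://mirror1.example.org/repo",
"rsync://mirror2.example.org/repo"], "repodata/primary.sqlite",
"/tmp/primary.sqlite"), with both mirror names being placeholders.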

--
Zdenek
