(Forwarded to yum-devel list since it is being held for moderation)


-------- Original Message --------
Subject: A proposal to create a kind of 'delta metadata' (for XML md for now)
Date:   Wed, 30 Jul 2014 01:15:15 +0430
From:   Hedayat Vatankhah <heday...@gmail.com>
To:     rpm-metad...@lists.baseurl.org



Dear all,
I have been thinking about a completely new repository format, but I found that it would need a lot of time from my side and probably lots of discussion. Implementing a kind of delta metadata support for the current format, however, seems much easier, and it could considerably improve updates compared to the current situation, where completely new metadata is generated for every update. I think I can contribute code too.

Anyway, I'd like to propose my idea and ask for feedback. My suggestion is currently limited to the XML MD format only, both because it is in itself more space-efficient than the SQLite DBs and because DNF uses it.

The idea is very simple: when createrepo updates an existing repodata, it doesn't replace the old primary/filelists/etc. data. Instead, it creates new XML files containing data about added/modified/deleted packages. (A deleted package can be marked by an entry that carries no data except its name and EVR; a special attribute might be added if needed.) repomd.xml then contains references to these new files only.
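To make the client side of this concrete, here is a minimal sketch in Python of how one delta could be applied to an already-known package set. Everything in it is an assumption for illustration only: the names DeltaEntry and apply_delta are hypothetical, packages are keyed by (name, EVR), and a deletion is represented as an entry with no data, as described above. It does not reflect any existing createrepo or yum code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical key for a package: (name, EVR).
Key = Tuple[str, str]


@dataclass
class DeltaEntry:
    """One entry of a delta file (illustrative structure, not a real format)."""
    name: str
    evr: str
    # Assumption: an entry without data marks the package as deleted.
    data: Optional[dict] = None


def apply_delta(packages: Dict[Key, dict], delta: List[DeltaEntry]) -> Dict[Key, dict]:
    """Return a new package set with the delta applied; the input is not modified."""
    result = dict(packages)
    for entry in delta:
        key = (entry.name, entry.evr)
        if entry.data is None:
            # Deletion marker: drop the package if we know about it.
            result.pop(key, None)
        else:
            # Added or modified package: store the new data.
            result[key] = entry.data
    return result
```

With this shape, an update from foo-1.0-1 to foo-1.1-1 is simply a deletion entry for the old EVR plus a full entry for the new one.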

Now, how are the older metadata files referenced? They are referenced from the new MD files, just like repomd.xml refers to the latest versions. For example, the new primary.xml file, which contains data about the latest added/modified/deleted packages, references the previous primary.xml file (which in turn might reference an older primary.xml file). Therefore, we have a linked list of primary.xml files, where the head is the latest primary delta and the tail is the main primary.xml file created when the repo was created.
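A client following such a linked list could be sketched roughly as below. This is only an illustration under assumptions: chain_to_fetch and previous_of are hypothetical names, and in practice learning the predecessor of a file would mean downloading and opening that file first. The client walks back from the head until it reaches a file it already has, then applies the deltas oldest first.

```python
from typing import Callable, List, Optional, Set


def chain_to_fetch(
    head: str,
    previous_of: Callable[[str], Optional[str]],
    already_have: Set[str],
) -> List[str]:
    """Return the files the client still needs, ordered oldest first.

    previous_of(name) is a hypothetical lookup that returns the name of the
    file referenced by `name`, or None for the original full primary.xml.
    """
    needed: List[str] = []
    current: Optional[str] = head
    # Walk from the newest delta towards the tail of the list.
    while current is not None and current not in already_have:
        needed.append(current)
        current = previous_of(current)
    # Deltas must be applied in the order they were created.
    needed.reverse()
    return needed
```

A client that is fully up to date finds the head in its cache and gets an empty list back, so it downloads nothing beyond repomd.xml.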

I like the idea of linked lists here, because it is scalable! But if it seems overkill (it does not allow pipelining the download of several delta XML files, because you have to download and open an XML file to learn the name of the previous one), we might come up with some other suggestions for storing the list, e.g. the list might be stored in repomd.xml itself, or in a separate file for interested clients.
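If the whole list of delta files were available up front (for example from repomd.xml or a separate index file), a client would no longer have to fetch them one after another. A rough sketch, assuming a hypothetical fetch callable that downloads one file by name; fetch_all is likewise an illustrative name, not part of any existing tool:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fetch_all(
    names: List[str],
    fetch: Callable[[str], bytes],
    workers: int = 4,
) -> Dict[str, bytes]:
    """Download all listed files concurrently and return them keyed by name.

    `fetch` is a hypothetical download function supplied by the caller.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as `names`.
        return dict(zip(names, pool.map(fetch, names)))
```

This only works because all names are known before any download starts, which is exactly what the pure linked-list layout cannot offer.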

IMHO, this can be easily implemented in createrepo, and should be fairly straightforward for clients to use. What do you think?

Regards,
Hedayat


_______________________________________________
Yum-devel mailing list
Yum-devel@lists.baseurl.org
http://lists.baseurl.org/mailman/listinfo/yum-devel
