On Thu, Oct 6, 2011 at 3:37 PM, Zdenek Pavlas <zpav...@redhat.com> wrote:
> Hi,
>
> I did some experiments with parallelizing metadata downloads
> using the 'bulk urlgrab' api. Metadata initialization code
> is pretty complex so I only came to a 'staged' 3-pass solution
>
> 1st pass: update metalink files.
> 2nd pass: update repomd files.
> 3rd pass: update sqlite files.
>
> Each stage runs downloads in parallel but the preceding
> stage has to (completely) finish first, so if just one
> repo has a stale metalink, all repos have to wait.
>
> Yes, it's possible to rewrite the repo initialization code
> to a state machine, and do the staging independently for
> each repo. But that's IMO ugly and intrusive. The most
> natural approach is to keep yumRepo.py with minimal changes,
> just init repos in separate threads.
>
> I assume sqlite and rpmdb are the main reasons we don't
> support/allow threading in yum, as these break if thread
> A queries a handle opened by thread B.
>
> But we can wrap such calls, serialize them, and process
> all in the main thread's context [see below].
>
> It's a hack, but rewriting a substantial part of yumRepo
> (and supporting most of the old API at the same time)
> seems worse. I'm not convinced that's the way to go,
> just an idea to discuss.
>
> === CUT ===
> from thread import allocate_lock, get_ident
> lock = allocate_lock()
> syn1 = allocate_lock(); syn1.acquire()
> syn2 = allocate_lock(); syn2.acquire()
> curr = []
>
> # single threading wrapper
> def singlethreaded(fun):
>     def forward(*arg, **karg):
>         lock.acquire() # serialize access to 'curr'
>         curr[:] = fun, arg, karg, None, None
>         syn1.release() # job ready
>         syn2.acquire() # wait till done
>         ret, exc = curr[3:]
>         lock.release()
>         if exc: raise exc
>         return ret
>     return forward
>
> # An example API that does not like threads
> main = get_ident()
>
> @singlethreaded
> def foo(arg):
>     assert get_ident() == main
>     print 'foo', arg
>
> @singlethreaded
> def bar(arg):
>     assert get_ident() == main
>     print 'bar', arg
>
> # test
> from thread import start_new
> def thread(n):
>     # this runs in threads, but foo & bar don't.
>     foo(n)
>     bar(n)
> for n in range(3):
>     start_new(thread, (n,))
> while 1:
>     syn1.acquire() # wait for request
>     try: curr[3] = curr[0](*curr[1], **curr[2])
>     except Exception, curr[4]: pass
>     syn2.release() # signal we're done
> === CUT ===
> _______________________________________________
> Yum-devel mailing list
> Yum-devel@lists.baseurl.org
> http://lists.baseurl.org/mailman/listinfo/yum-devel
>
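For reference, the quoted idea (funnel every call to a thread-unsafe API
through the main thread) might look roughly like this on Python 3. This is
only a sketch and not the code from the original post: the queue-based
design and the names _jobs and serve() are assumptions made for
illustration, and nothing here touches yum, sqlite or rpmdb.

```python
import queue
import threading
from concurrent.futures import Future

_jobs = queue.Queue()                  # pending (future, fun, args, kwargs)
_MAIN = threading.main_thread().ident  # thread expected to call serve()

def singlethreaded(fun):
    """Forward calls made from worker threads to the serving thread."""
    def forward(*args, **kwargs):
        if threading.get_ident() == _MAIN:
            return fun(*args, **kwargs)   # already on the main thread
        fut = Future()
        _jobs.put((fut, fun, args, kwargs))
        return fut.result()   # blocks; re-raises an exception from the job
    return forward

def serve(workers):
    """Run queued jobs on the calling thread until all workers have exited."""
    while any(w.is_alive() for w in workers) or not _jobs.empty():
        try:
            fut, fun, args, kwargs = _jobs.get(timeout=0.05)
        except queue.Empty:
            continue
        try:
            fut.set_result(fun(*args, **kwargs))
        except Exception as exc:
            fut.set_exception(exc)   # hand the error back to the caller
```

A worker blocked in forward() is still alive, so serve() keeps polling
until every request has been answered. Unlike the while 1 loop in the
quoted test, it returns once the workers are done.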
Have you taken a look at multiprocessing?
http://docs.python.org/library/multiprocessing.html

Tim
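
To make the suggestion concrete, here is a minimal sketch of per-repo
initialization with multiprocessing.Pool. It assumes each repo can run its
three stages without sharing sqlite or rpmdb handles with the parent
process. update_metalink, update_repomd and update_sqlite are hypothetical
placeholders rather than yum functions; they only return a string so the
flow is visible.

```python
from multiprocessing import Pool

# Hypothetical stage functions: stand-ins for the real metalink, repomd
# and sqlite updates, which this sketch does not try to reproduce.
def update_metalink(repo):
    return "%s: metalink" % repo

def update_repomd(repo):
    return "%s: repomd" % repo

def update_sqlite(repo):
    return "%s: sqlite" % repo

def init_repo(repo):
    """Run the three stages in order for a single repo."""
    stages = (update_metalink, update_repomd, update_sqlite)
    return [stage(repo) for stage in stages]

def init_all(repos, processes=4):
    """Initialize repos in worker processes, one task per repo."""
    pool = Pool(processes)
    try:
        # Each repo moves through its own stages independently, so a
        # stale metalink in one repo does not hold up the others.
        return pool.map(init_repo, repos)
    finally:
        pool.close()
        pool.join()
```

Calling init_all(["fedora", "updates"]) returns one list of stage results
per repo, in input order. On platforms that start workers by spawning a
fresh interpreter, the call needs to sit under an
if __name__ == "__main__": guard. Anything passed back to the parent has to
be picklable, so open database handles would have to stay inside the
worker.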