Jan Kiszka wrote:
 > Gilles Chanteperdrix wrote:
 > > Jan Kiszka wrote:
 > >  > At least when SMP is enabled, __xnlock_get alone already becomes
 > >  > far too heavyweight to be inlined. xnlock_put is fine now, but a
 > >  > closer look at the disassembly still revealed a lot of redundancy
 > >  > related to acquiring and releasing xnlocks. In fact, we are mostly
 > >  > using xnlock_get_irqsave and xnlock_put_irqrestore. Both include
 > >  > fiddling with rthal_local_irq_save/restore, which is also
 > >  > heavyweight on SMP.
 > >  > 
 > >  > So this patch turns the latter two into uninlined functions, which
 > >  > reduces the text size of the nucleus and skins significantly on
 > >  > x86-64/SMP (XENO_OPT_DEBUG_NUCLEUS disabled):
 > > 
 > > I think our human intuition about how long an inline function may be
 > > is far more restrictive than what a processor can take. When looking
 > > at assembly code, it always seems long, whereas in reality it is not
 > > that long for a processor.
 > > 
 > > Besides, IMO, the proper way to uninline the xnlock operations is to
 > > leave the non-contended case inline and to move the spinning out of
 > > line.
 > This patch is not just about uninlining xnlock; that's only one half
 > of the savings. The other one is IRQ disabling via the I-pipe. The
 > problem in our case is that we have no simple single check to find out
 > that we are on the fast path. Rather, we have to do quite some
 > calculations/lookups before the first check, and we have to perform
 > multiple checks even in the best case.
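For reference, the "keep the uncontended case inline, spin out of line" split Gilles suggests could look roughly like the following sketch. All names here (my_lock_t, __my_lock_spin, my_lock_get) are illustrative and not the actual Xenomai xnlock API; C11 atomics stand in for the arch-specific cmpxchg primitives.

```c
#include <stdatomic.h>
#include <assert.h>

/* Hypothetical lock: owner is a CPU id, -1 means unowned. */
typedef struct { atomic_int owner; } my_lock_t;

/* Out-of-line slow path: spin until the lock can be taken.
 * This is the part that bloats callers when inlined. */
static void __my_lock_spin(my_lock_t *lock, int self)
{
    int expected;
    do {
        expected = -1;
    } while (!atomic_compare_exchange_weak(&lock->owner,
                                           &expected, self));
}

/* Inline fast path: one cmpxchg covers the uncontended case;
 * only on failure do we fall back to the spinning helper. */
static inline void my_lock_get(my_lock_t *lock, int self)
{
    int expected = -1;
    if (!atomic_compare_exchange_strong(&lock->owner,
                                        &expected, self))
        __my_lock_spin(lock, self);
}

static inline void my_lock_put(my_lock_t *lock)
{
    atomic_store(&lock->owner, -1);
}
```

Jan's point still applies: this split only helps when a single early check identifies the fast path, which the real xnlock_get_irqsave code cannot do without prior lookups.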

This is my fault, a tradeoff I made: I thought that the atomic_cmpxchg
could be heavy on SMP systems, so I added a first check to see whether
we are recursing. But we can do the two operations in one move if we
accept a failing atomic_cmpxchg when recursing.
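The cmpxchg-first variant described here can be sketched as follows: attempt the atomic acquisition immediately, and treat a failed cmpxchg that observed our own id as a recursive acquisition. Again, the names are hypothetical and C11 atomics substitute for the real primitives; this is not the actual nucleus code.

```c
#include <stdatomic.h>
#include <assert.h>

/* Hypothetical lock: owner is a CPU id, -1 means unowned. */
typedef struct { atomic_int owner; } my_lock_t;

/* Out-of-line contention path, as in the uninlining discussion. */
static void __my_lock_spin(my_lock_t *lock, int self)
{
    int expected;
    do {
        expected = -1;
    } while (!atomic_compare_exchange_weak(&lock->owner,
                                           &expected, self));
}

/* Try the cmpxchg first; on failure, 'expected' holds the observed
 * owner, so owner == self detects recursion without a prior check.
 * Returns 0 when newly acquired, 1 when already held by us. */
static inline int my_lock_get_recursive(my_lock_t *lock, int self)
{
    int expected = -1;
    if (atomic_compare_exchange_strong(&lock->owner,
                                       &expected, self))
        return 0;       /* acquired on the fast path */
    if (expected == self)
        return 1;       /* recursive acquisition, nothing to do */
    __my_lock_spin(lock, self);
    return 0;
}
```

The tradeoff is exactly the one discussed: the recursion-free path pays no extra branch before the cmpxchg, while the recursive path now pays one failing cmpxchg instead of a cheap read.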


                                            Gilles Chanteperdrix.

Xenomai-core mailing list
