> On 19.12.18 19:26, Auel, Kendall via Xenomai wrote:
> > Jan,
> >
> > I'm very much in favor of providing a way to prevent Xenomai modules
> from using features which can result in deadlock, if there is a clean way to
> detect such a situation.
> >
> > We used gettimeofday in one of our modules and it mostly worked great.
> But once in a great while the system would deadlock. Most calls to
> gettimeofday are benign and appear to work normally, which is why it is
> especially problematic. It would have saved some debug cycles if there was a
> kernel log message to warn us of our danger.
> >
> > Or perhaps we could collect a blacklist of references which will produce
> warnings when linking a Xenomai module. All of these things are 'nice to
> have' but certainly not urgent matters.
>
> We do have the infrastructure and a small use case for such RT traps already:
> If you use --mode-check on xeno-config, any usage of malloc and free from
> RT contexts will be detected and reported. These calls are evil as well
> because they tend to not trigger a syscall in the fast path and only fail on
> contention or empty-pool situations of the userspace allocator.

There is still the issue that the Cobalt kernel can interrupt the Linux kernel
while Linux is holding a lock.
Consider a 4-core CPU where several Cobalt threads are bound to one core,
e.g. core 0 (legacy code assuming a single core).

1) Linux wants to update the timekeeper struct and takes its lock on core 0
2) Cobalt preempts the Linux kernel on core 0 while Linux still holds the lock
3) the Cobalt threads run back to back, so core 0 stays in the Cobalt
domain for hundreds of ms
4) finally all Cobalt threads bound to core 0 go idle and Linux can
release the lock

This means that all Linux threads on *any core* that call one of the *gettime
functions (and possibly others) will busy-wait on the lock for that whole time.
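
To illustrate why readers on other cores spin, here is a minimal single-threaded
sketch. It assumes the timekeeper is guarded by a sequence-counter style lock
(odd counter = update in progress); the names timekeeper_sketch, tk_write_begin,
tk_write_end and tk_read are made up for this mail and are not kernel API. The
spin bound exists only so the example terminates; a real reader would keep
retrying until the writer finishes.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the timekeeper data, not the kernel struct. */
struct timekeeper_sketch {
    atomic_uint seq;   /* odd while an update is in progress */
    int64_t now_ns;    /* the protected payload */
};

/* Writer side: steps 1) and 4) of the scenario above. */
static void tk_write_begin(struct timekeeper_sketch *tk)
{
    atomic_fetch_add(&tk->seq, 1);   /* counter becomes odd */
}

static void tk_write_end(struct timekeeper_sketch *tk)
{
    atomic_fetch_add(&tk->seq, 1);   /* counter becomes even again */
}

/*
 * Reader side: retry until a stable, even counter is observed.
 * Returns false if max_spins attempts were not enough, i.e. the
 * writer was "preempted" for the whole time.
 */
static bool tk_read(struct timekeeper_sketch *tk, int64_t *out,
                    unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; i++) {
        unsigned start = atomic_load(&tk->seq);

        if (start & 1u)
            continue;                /* writer active: busy-wait */

        int64_t value = tk->now_ns;

        if (atomic_load(&tk->seq) == start) {
            *out = value;            /* no update happened in between */
            return true;
        }
    }
    return false;
}
```

If the writer is preempted between tk_write_begin() and tk_write_end(), every
tk_read() on every other core keeps spinning, which is the behaviour described
above.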

That an RT thread (potentially just a temporarily promoted non-RT thread, or
one not lazily demoted yet) can additionally deadlock the system comes on top
of this issue.

Regarding what I am allowed to do:
AFAIK a thread started as a Cobalt thread can freely switch between domains,
typically around syscalls, and the switches are "lazy". What are the rules for a
thread that needs to collect some data in RT (potentially using some RT mutexes
with priority inheritance) and then calls into DSOs that aren't compiled with
the "cobalt wrappings" active (say a logging framework that uses libc's
clock_gettime)?
Do I have to demote the thread manually somehow before calling DSO functions, or
is it not allowed at all to use DSOs that were compiled without "cobalt
wrappings"?

> with posix, you are already
> redirected to the RT-safe implementations of those functions.

In my case (POSIX skin, not "native" as I replied earlier), the call came from
another DSO, which is unaffected by the link-time wrapping.
I would likely have to LD_PRELOAD a checker DSO, which seems more sane to me,
as the calls could also originate from implicitly linked DSOs (the C++ runtime
library, for example).
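
A rough sketch of what such a checker DSO could look like, assuming glibc on
Linux. It interposes clock_gettime and forwards to the next definition found
via dlsym(RTLD_NEXT, ...). The flag checker_in_rt_section and the counter
checker_violations are hypothetical names invented here; a real checker would
have to ask Cobalt whether the calling thread is currently in primary mode,
which this sketch does not attempt.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for RTLD_NEXT; must precede the includes */
#endif
#include <dlfcn.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#ifndef RTLD_NEXT              /* glibc's value, in case _GNU_SOURCE came too late */
#define RTLD_NEXT ((void *)-1L)
#endif

/* Hypothetical: set by the application around its RT section. */
_Thread_local int checker_in_rt_section;

/* Number of flagged calls, readable from a debugger or a test. */
atomic_int checker_violations;

typedef int (*clock_gettime_fn)(clockid_t, struct timespec *);

/* Interposed definition: resolved before libc's when preloaded. */
int clock_gettime(clockid_t clk, struct timespec *ts)
{
    static clock_gettime_fn real;

    /* Look up the next clock_gettime in search order (normally libc's). */
    if (!real)
        real = (clock_gettime_fn)dlsym(RTLD_NEXT, "clock_gettime");

    if (checker_in_rt_section) {
        atomic_fetch_add(&checker_violations, 1);
        /* fprintf is itself not RT-safe; acceptable for a debug aid only. */
        fprintf(stderr, "checker: clock_gettime(%d) from RT section\n",
                (int)clk);
    }

    return real ? real(clk, ts) : -1;
}
```

Built as a shared object (for example gcc -shared -fPIC checker.c -o
libchecker.so -ldl, file names being placeholders) and loaded with
LD_PRELOAD=./libchecker.so, this would only see calls that reach the
unwrapped libc symbol, which is exactly the case of DSOs not covered by the
link-time wrapping.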

Norbert
