Jan Kiszka wrote:
Brian L. wrote:
I'm finally set up with netconsole to catch panics/crashes when they
happen, so now I can report more information on the one I alluded to a
week or two ago in my "General Question.." thread.
What I did to cause it was write past the end of a buffer returned by
rt_queue_alloc. I'm not entirely sure if this message came at the
moment of the write (unlikely, IMHO) or later when more xnheap
activity took place. The crash popped up in several different ways
depending on what code paths I enabled/disabled.
What concerns me is that polluting an xnheap can bring the system to
its knees so harshly. I can see why it could be *very* hard to police
this sort of problem without destroying the performance of xnheap, so
it wouldn't surprise me if this is "normal". Still, though, it's sad
that user-space code can bring the system down after something as
innocent as a fencepost error in a string copy routine...
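For illustration, the kind of fencepost error described above usually comes from forgetting that a string of len characters needs len + 1 bytes. A minimal sketch of a copy routine that checks this first, assuming a hypothetical helper named bounded_copy (not part of Xenomai, and not the code that caused the crash):

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, which holds dst_size bytes in total.
 * Returns 0 on success, -1 if src plus its terminating NUL does not fit.
 * bounded_copy is a hypothetical helper, not a Xenomai call.
 */
static int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    /* The fencepost: len characters need len + 1 bytes of storage. */
    if (len >= dst_size)
        return -1;

    memcpy(dst, src, len + 1); /* len + 1 also copies the terminating NUL */
    return 0;
}
```

With a buffer obtained from rt_queue_alloc, dst_size would be the size that was requested from the allocator.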
Thoughts? I've pasted the console dump below.
I remember that control structures and data are tightly knotted in
xnheaps, but I agree with you that this should not lead so easily to
such crashes for user space apps. Maybe some magic number check could
help to reduce the chance of that for now.
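A magic number check of the kind suggested here could look roughly like the sketch below. It is built on plain malloc, not on xnheap, and every name in it (GUARD_MAGIC, guarded_alloc, guarded_check, guarded_free) is hypothetical. It only detects an overrun after the fact, and only one that hits the guard word placed right behind the user data.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_MAGIC 0x5a5aa5a5u /* arbitrary value chosen for this sketch */

/* Hypothetical block layout: [header][user data ...][magic] */
struct guarded_hdr {
    size_t size; /* number of user bytes that follow the header */
};

/* Allocate size user bytes and place the magic right behind them. */
static void *guarded_alloc(size_t size)
{
    struct guarded_hdr *hdr = malloc(sizeof(*hdr) + size + sizeof(uint32_t));
    uint32_t magic = GUARD_MAGIC;

    if (hdr == NULL)
        return NULL;
    hdr->size = size;
    /* memcpy avoids an unaligned store when size is not a multiple of 4 */
    memcpy((unsigned char *)(hdr + 1) + size, &magic, sizeof(magic));
    return hdr + 1;
}

/* Returns 1 if the trailing magic is intact, 0 if it was overwritten. */
static int guarded_check(const void *ptr)
{
    const struct guarded_hdr *hdr = (const struct guarded_hdr *)ptr - 1;
    uint32_t magic;

    memcpy(&magic, (const unsigned char *)ptr + hdr->size, sizeof(magic));
    return magic == GUARD_MAGIC;
}

/* Returns 0 on success, -1 if an overrun was detected; frees in both cases. */
static int guarded_free(void *ptr)
{
    int ok = guarded_check(ptr);

    free((struct guarded_hdr *)ptr - 1);
    return ok ? 0 : -1;
}
```

The cost is a few extra bytes per block plus one comparison per check, which is why it is cheaper than isolating the buffers but also weaker: a write that skips over the guard word goes unnoticed.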
A cleaner long-term solution would be to decouple both regions.
Philippe, is this feasible (I'm not that deep in the internals of xnheap)?
Almost everything could be done, with the proper overhead, I mean. As
you know, there are quite a number of ways to kill your box with RT
activity, even without trashing Xenomai's internal data structures.
Causing a fencepost error when copying data that trashes other crucial
application data comes to mind, which in turn might cause all sorts of
weird behaviours, including hard lockups due to unexpected infinite
loops; this might happen whether or not the system data are isolated in
a write-protected segment. The same goes for plain MMIO areas: just
write garbage over such memory, which might have I/O side effects, and
watch the box go south, no need for Xenomai here.
Since providing isolated individual data buffers is out of the question
for obvious performance reasons, rt_queue_read/write, introduced in 2.2,
provide a way to send/receive data blocks without having to share the
memory with the heap, at the expense of the data being transferred
to/from kernel space during the calls.
In contrast, rt_queue_send/receive have been designed to expose the
internal buffer space directly, so that applications can prepare a
significant amount of data in it before transmission, which saves the
final copy to transfer the buffer, at the expense of the application
being able to access the entire heap memory with no particular
protection.
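The trade-off between the two styles can be shown with a toy model. This is not the Xenomai API: toy_queue, toy_write, toy_alloc and toy_send are hypothetical names, with toy_write standing in for the copy-based style of rt_queue_write and toy_alloc plus toy_send for the in-place style of rt_queue_alloc plus rt_queue_send.

```c
#include <stddef.h>
#include <string.h>

#define TOY_POOL_SIZE 64

/* Hypothetical single-message queue; pool stands in for the shared heap. */
struct toy_queue {
    unsigned char pool[TOY_POOL_SIZE];
    size_t msg_len; /* length of the pending message, 0 if none */
};

/* Copy-based send: the caller never sees pool, and the length is checked. */
static int toy_write(struct toy_queue *q, const void *buf, size_t len)
{
    if (len == 0 || len > TOY_POOL_SIZE)
        return -1;
    memcpy(q->pool, buf, len); /* the extra copy this style pays for */
    q->msg_len = len;
    return 0;
}

/* In-place send, step 1: hand out a pointer into the pool itself. */
static void *toy_alloc(struct toy_queue *q, size_t len)
{
    return (len == 0 || len > TOY_POOL_SIZE) ? NULL : q->pool;
}

/* In-place send, step 2: publish what the caller already wrote in place. */
static int toy_send(struct toy_queue *q, void *buf, size_t len)
{
    if (buf != (void *)q->pool || len == 0 || len > TOY_POOL_SIZE)
        return -1;
    q->msg_len = len; /* no copy: the data is already in the pool */
    return 0;
}
```

In the copy-based path the queue can reject an oversized message before touching its memory. In the in-place path nothing in the model stops the caller from writing past the length it asked for through the returned pointer, which is the exposure described above.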
As our friend Spider-Man says, "With great power comes great
responsibility".
--
Philippe.