Anders Blomdell wrote:
Jan Kiszka wrote:
Am 01.11.2010 17:55, Anders Blomdell wrote:
Jan Kiszka wrote:
Am 28.10.2010 11:34, Anders Blomdell wrote:
Jan Kiszka wrote:
Am 28.10.2010 09:34, Anders Blomdell wrote:
Anders Blomdell wrote:
Anders Blomdell wrote:
Hi,
I'm trying to use rt_eepro100 to send raw Ethernet packets, but I'm
occasionally experiencing weird behaviour.
Versions of things:
linux-2.6.34.5
xenomai-2.5.5.2
rtnet-39f7fcf
The test program runs on two computers, each with an "Intel Corporation
82557/8/9/0/1 Ethernet Pro 100 (rev 08)" controller. One computer acts
as a mirror, sending back every packet it receives from the Ethernet
(only those two computers are on the network), while the other sends
packets and measures the round-trip time. Most packets come back in
approximately 100 us, but occasionally the reception times out (once in
about 100000 packets or more). The packet is then received immediately
when reception is retried, which might indicate a race between
rt_dev_recvmsg and the interrupt, but I might be missing something
obvious.
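To tell a delayed packet (timeout, then success on an immediate retry) apart from a lost one, the sender's receive step could be wrapped roughly as below. This is only a sketch: recv_fn, recv_classified and fake_recv are hypothetical names, the callback stands in for whatever actually calls rt_dev_recvmsg, and a timeout is assumed to be reported as -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical receive callback: returns the number of bytes received,
 * or a negative errno value (-EAGAIN is assumed to mean "timed out"). */
typedef int (*recv_fn)(void *ctx, void *buf, size_t len);

enum rx_outcome {
    RX_ON_TIME,  /* first receive attempt succeeded */
    RX_DELAYED,  /* first attempt timed out, a retry succeeded */
    RX_LOST      /* all attempts timed out, or a non-timeout error occurred */
};

/* Receive one packet, retrying up to max_retries times after a timeout.
 * On success the byte count is stored in *bytes (if bytes is not NULL). */
static enum rx_outcome recv_classified(recv_fn recv, void *ctx, void *buf,
                                       size_t len, int max_retries, int *bytes)
{
    assert(recv != NULL);
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        int ret = recv(ctx, buf, len);
        if (ret >= 0) {
            if (bytes)
                *bytes = ret;
            return attempt == 0 ? RX_ON_TIME : RX_DELAYED;
        }
        if (ret != -EAGAIN)
            break;  /* a real error, not a timeout: stop retrying */
    }
    return RX_LOST;
}

/* Stand-in receiver for experiments: ctx points to an int holding the
 * number of timeouts to simulate before a 4-byte packet "arrives". */
static int fake_recv(void *ctx, void *buf, size_t len)
{
    int *timeouts_left = ctx;
    (void)buf;
    if (*timeouts_left > 0) {
        (*timeouts_left)--;
        return -EAGAIN;
    }
    return len < 4 ? (int)len : 4;
}
```

Counting RX_DELAYED separately from RX_LOST makes it visible whether the packet was already queued when the timeout fired, which is the case described above.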
Changing one of the Ethernet cards to an "Intel Corporation 82541PI
Gigabit Ethernet Controller (rev 05)", while keeping everything else
constant, changes the behaviour somewhat: after receiving a few 100000
packets, reception stops entirely (-EAGAIN is returned), while
transmission proceeds as it should (and the mirror returns packets).
Any suggestions on what to try?
Since the problem disappears with 'maxcpus=1', I suspect I have an SMP
issue (the machine is a Core2 Quad), so I'll move to xenomai-core.
(The original message can be found at
http://sourceforge.net/mailarchive/message.php?msg_name=4CC82C8D.3080808%40control.lth.se
)
Xenomai-core gurus: what is the correct way to debug SMP issues?
Can I run the I-pipe tracer and expect to be able to save at least
150 us of traces for all CPUs? Any hints/suggestions/insights are
welcome...
The I-pipe tracer unfortunately only saves traces for the CPU that
triggered the freeze. To get the full picture, you may want to try the
ftrace port I posted recently for 2.6.35.
2.6.35.7 ?
Exactly.
Finally managed to get the ftrace to work
(one possible bug: had to manually copy
include/xenomai/trace/xn_nucleus.h to
include/xenomai/trace/events/xn_nucleus.h), and it looks like it can be
very useful...
But I don't think it will give much info at the moment, since no
xenomai/ipipe interrupt activity shows up, and adding that is far above
my league :-(
You could use the function tracer, provided you are able to stop the
trace quickly enough on error.
My current theory is that the problem occurs when something like this
takes place:
CPU-i CPU-j CPU-k CPU-l
rt_dev_sendmsg
xmit_irq
rt_dev_recvmsg recv_irq
Can't follow. What races here, and what will go wrong then?
That's the good question. Find attached:
1. .config (so you can check for stupid mistakes)
2. console log
3. latest version of test program
4. tail of ftrace dump
These are the xenomai tasks running when the test program is active:
CPU  PID    CLASS  PRI  TIMEOUT  TIMEBASE  STAT  NAME
0    0      idle   -1   -        master    R     ROOT/0
1    0      idle   -1   -        master    R     ROOT/1
2    0      idle   -1   -        master    R     ROOT/2
3    0      idle   -1   -        master    R     ROOT/3
0    0      rt     98   -        master    W     rtnet-stack
0    0      rt     0    -        master    W     rtnet-rtpc
0    29901  rt     50   -        master          raw_test
0    29906  rt     0    -        master    X     reporter
The lines of interest from the trace are probably:
[003] 2061.347855: xn_nucleus_thread_resume: thread=f9bf7b00 thread_name=rtnet-stack mask=2
[003] 2061.347862: xn_nucleus_sched: status=2000000
[000] 2061.347866: xn_nucleus_sched_remote: status=0
since this is the only place where a packet gets delayed, and the only
place in the trace where sched_remote reports status=0.
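Finding such events in a long ftrace dump can be automated. The following is a minimal sketch that assumes each event is on one line in the "[cpu] timestamp: event: status=value" shape shown above and that the status is printed in hexadecimal; is_remote_sched_status0 is a hypothetical helper, not part of any tracing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if line is an xn_nucleus_sched_remote event with status 0.
 * On a match, the CPU number and timestamp are stored in *cpu and *ts.
 * Assumed line shape:
 *   "[000] 2061.347866: xn_nucleus_sched_remote: status=0" */
static int is_remote_sched_status0(const char *line, int *cpu, double *ts)
{
    char event[64];
    unsigned long status;
    int c;
    double t;

    assert(line != NULL && cpu != NULL && ts != NULL);

    /* %63[^:] reads the event name up to the next colon. */
    if (sscanf(line, " [%d] %lf: %63[^:]: status=%lx",
               &c, &t, event, &status) != 4)
        return 0;  /* different event type or unexpected layout */

    if (strcmp(event, "xn_nucleus_sched_remote") != 0 || status != 0)
        return 0;

    *cpu = c;
    *ts = t;
    return 1;
}
```

Feeding every line of the dump through this helper (for example with fgets) lists all remote reschedules that found nothing to do, which can then be compared against the timestamps of the delayed packets.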
Since the CPU that has rtnet-stack, and hence should be resumed, is
doing heavy I/O at the time of the fault, could it be that
send_ipi/schedule_handler needs barriers to make sure that decisions
are made on the right status?
/Anders