> Here is the sync pattern the code normally achieves, once the parent has
> successfully spawned a child thread, which has to wait for a start signal
> before it may run application code:
>
> 1. parent calls threadobj_start(child)
>         1.1 child->status |= __THREAD_S_STARTED
>         1.2 wait for child->status & __THREAD_S_ACTIVE
>
> 2. child calls threadobj_wait_start(self)
>         2.1 wait for self->status & __THREAD_S_STARTED
>         2.2 raise self->status |= __THREAD_S_ACTIVE
>
> All accesses to the status bits are serialized by a per-thread mutex,
> operated by the threadobj_lock/unlock accessors, which also covers the
> condvar signaling/waiting as one would expect.
>
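For reference, the handshake described above can be reduced to a minimal sketch in plain pthreads. This is not the copperplate code: the struct, the function names and the bit values are hypothetical stand-ins, and a single condvar is shared by both waiters, so each status change is broadcast.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for __THREAD_S_STARTED / __THREAD_S_ACTIVE;
 * the bit values are assumptions, not taken from threadobj.h. */
#define S_STARTED 0x01
#define S_ACTIVE  0x10

struct thread_desc {
	pthread_mutex_t lock;    /* serializes all accesses to status */
	pthread_cond_t barrier;  /* signaled whenever status changes */
	int status;
};

/* Caller must hold td->lock. Blocks until a bit of mask is set. */
static int wait_on_mask(struct thread_desc *td, int mask)
{
	while (!(td->status & mask))
		pthread_cond_wait(&td->barrier, &td->lock);
	return td->status;
}

static void *child_main(void *arg)
{
	struct thread_desc *td = arg;

	pthread_mutex_lock(&td->lock);
	wait_on_mask(td, S_STARTED);           /* step 2.1 */
	td->status |= S_ACTIVE;                /* step 2.2 */
	pthread_cond_broadcast(&td->barrier);
	pthread_mutex_unlock(&td->lock);
	return NULL;
}

/* Runs one parent/child handshake, returns the status seen by the parent. */
int run_handshake(void)
{
	struct thread_desc td;
	pthread_t child;
	int status;

	pthread_mutex_init(&td.lock, NULL);
	pthread_cond_init(&td.barrier, NULL);
	td.status = 0;

	if (pthread_create(&child, NULL, child_main, &td))
		return -1;

	pthread_mutex_lock(&td.lock);
	td.status |= S_STARTED;                /* step 1.1 */
	pthread_cond_broadcast(&td.barrier);
	status = wait_on_mask(&td, S_ACTIVE);  /* step 1.2 */
	pthread_mutex_unlock(&td.lock);

	pthread_join(child, NULL);
	pthread_cond_destroy(&td.barrier);
	pthread_mutex_destroy(&td.lock);
	return status;
}
```

Calling run_handshake() from main() should return with both bits set. Because each waiter re-checks the status word under the lock before sleeping, the order in which parent and child reach the barrier should not matter in this reduced form.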
> When running in pshared mode, thread descriptors (holding ->status, mutex
> and barrier sync) are obtained from /dev/shm. With --disable-pshared, only
> process-private memory is used.
>
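To illustrate what the pshared case implies for the sync objects, here is a hedged sketch of a mutex and condvar placed in shared memory and marked process-shared. It uses an anonymous shared mapping rather than /dev/shm, and the struct and function names are hypothetical, not the copperplate ones.

```c
#define _DEFAULT_SOURCE
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical descriptor holding the status word and its sync objects. */
struct shared_sync {
	pthread_mutex_t lock;
	pthread_cond_t barrier;
	int status;
};

/* Maps a descriptor in shared memory and initializes its mutex and
 * condvar with the PTHREAD_PROCESS_SHARED attribute. Returns NULL on
 * failure. */
struct shared_sync *shared_sync_create(void)
{
	pthread_mutexattr_t mattr;
	pthread_condattr_t cattr;
	struct shared_sync *s;
	int ret;

	s = mmap(NULL, sizeof(*s), PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (s == MAP_FAILED)
		return NULL;

	pthread_mutexattr_init(&mattr);
	ret = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
	if (ret == 0)
		ret = pthread_mutex_init(&s->lock, &mattr);
	pthread_mutexattr_destroy(&mattr);
	if (ret)
		goto fail;

	pthread_condattr_init(&cattr);
	ret = pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
	if (ret == 0)
		ret = pthread_cond_init(&s->barrier, &cattr);
	pthread_condattr_destroy(&cattr);
	if (ret) {
		pthread_mutex_destroy(&s->lock);
		goto fail;
	}

	s->status = 0;
	return s;
fail:
	munmap(s, sizeof(*s));
	return NULL;
}

/* Releases the sync objects and the mapping. */
void shared_sync_destroy(struct shared_sync *s)
{
	pthread_cond_destroy(&s->barrier);
	pthread_mutex_destroy(&s->lock);
	munmap(s, sizeof(*s));
}
```

With a private build, the same objects would simply be initialized with default attributes in ordinary heap or stack memory.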
> Case 1: a race when manipulating the thread status due to inconsistent
> locking. I could not find any so far.
>
> Case 2: a cache coherence issue in SMP, also caused by improper locking.
> Otherwise, the locking should enforce memory barriers as expected.
>
> Case 3: anything not mentioned in other cases...
>
> - Could you paste/copy the disassembly (objdump -dl rather than gdb's
> disass) of the wait_on_barrier() function?
>
>
I have attached the disassembly as wait_on_barrier_disas.txt.


> - Does running both programs with --cpu-affinity=0/1 change the outcome?
>
>
Behavior does not change with any combination of CPU affinities, for
either the "task-1" alchemy test or my event test apps.


> - Without specifying any affinity this time, could you run the current
> test with the debug patch below applied (this is clearly not a fix)? The
> patch forces the code to read the value of the ->status field before
> waiting on the barrier. With that code in and a backtrace showing locals,
> we should be able to check the status word when threadobj_wait_start() is
> entered.
>

> diff --git a/lib/copperplate/threadobj.c b/lib/copperplate/threadobj.c
> index cc64caa..ed85a12 100644
> --- a/lib/copperplate/threadobj.c
> +++ b/lib/copperplate/threadobj.c
> @@ -1273,7 +1273,9 @@ void threadobj_wait_start(void) /* current->lock free. */
>         int status;
>
>         threadobj_lock(current);
> -       status = wait_on_barrier(current, __THREAD_S_STARTED|__THREAD_S_ABORTED);
> +       status = current->status;
> +       if (!(status & __THREAD_S_STARTED))
> +               status = wait_on_barrier(current, __THREAD_S_STARTED|__THREAD_S_ABORTED);
>         threadobj_unlock(current);
>
>         /*
>
> --
> Philippe.
>

I applied the debug patch and have attached the full backtraces of threads 1
and 3 of the "task-1" alchemy test.

At the time of the hang:
 - parent sees status = 73 (0x49), which matches the flags set during
   threadobj_start()
 - child sees status = 8 (0x08) (locked?)
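To make those two values easier to read, here is a small decoding sketch. Only two facts come from the attached backtraces: the child waits on mask 5 (STARTED|ABORTED) and the parent waits on mask 16 (ACTIVE). The split of 5 into individual bits and the names given to bits 8 and 64 are my assumptions and should be checked against threadobj.h.

```c
#include <stdio.h>
#include <string.h>

/* ASSUMED bit layout, to be verified against threadobj.h. */
static const struct {
	unsigned int bit;
	const char *name;
} assumed_flags[] = {
	{ 0x01, "STARTED" },
	{ 0x04, "ABORTED" },
	{ 0x08, "LOCKED" },
	{ 0x10, "ACTIVE" },
	{ 0x40, "SAFE" },
};

/* Writes a '|'-separated list of the flag names set in status to buf.
 * Bits missing from the table are appended as a hex remainder. */
void decode_status(unsigned int status, char *buf, size_t len)
{
	size_t i, used;

	if (len == 0)
		return;
	buf[0] = '\0';

	for (i = 0; i < sizeof(assumed_flags) / sizeof(assumed_flags[0]); i++) {
		if (!(status & assumed_flags[i].bit))
			continue;
		used = strlen(buf);
		snprintf(buf + used, len - used, "%s%s",
			 used ? "|" : "", assumed_flags[i].name);
		status &= ~assumed_flags[i].bit;
	}

	if (status) {
		used = strlen(buf);
		snprintf(buf + used, len - used, "%s0x%x",
			 used ? "|" : "", status);
	}
}

/* Nonzero if a waiter on mask would block given this status word. */
int would_block(unsigned int status, unsigned int mask)
{
	return !(status & mask);
}
```

Under that assumed layout, 73 decodes to STARTED|LOCKED|SAFE and 8 to LOCKED. Independently of the names, 73 & 16 and 8 & 5 are both zero, so each thread's snapshot is consistent with it still waiting on its mask.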

- Charles
-------------- next part --------------
00001868 <wait_on_barrier>:
wait_on_barrier():
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1200
    1868:       b580            push    {r7, lr}
    186a:       b0ce            sub     sp, #312        ; 0x138
    186c:       af00            add     r7, sp, #0
    186e:       1d3b            adds    r3, r7, #4
    1870:       6018            str     r0, [r3, #0]
    1872:       463b            mov     r3, r7
    1874:       6019            str     r1, [r3, #0]
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1204
    1876:       1d3b            adds    r3, r7, #4
    1878:       681b            ldr     r3, [r3, #0]
    187a:       6a9b            ldr     r3, [r3, #40]   ; 0x28
    187c:       f8c7 3134       str.w   r3, [r7, #308]  ; 0x134
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1205
    1880:       463b            mov     r3, r7
    1882:       f8d7 2134       ldr.w   r2, [r7, #308]  ; 0x134
    1886:       681b            ldr     r3, [r3, #0]
    1888:       4013            ands    r3, r2
    188a:       2b00            cmp     r3, #0
    188c:       d148            bne.n   1920 <wait_on_barrier+0xb8>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1207
    188e:       1d3b            adds    r3, r7, #4
    1890:       681b            ldr     r3, [r3, #0]
    1892:       6a5b            ldr     r3, [r3, #36]   ; 0x24
    1894:       f8c7 3130       str.w   r3, [r7, #304]  ; 0x130
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1208
    1898:       f240 0300       movw    r3, #0
    189c:       f2c0 0300       movt    r3, #0
    18a0:       f8c7 312c       str.w   r3, [r7, #300]  ; 0x12c
    18a4:       1d3b            adds    r3, r7, #4
    18a6:       681b            ldr     r3, [r3, #0]
    18a8:       3308            adds    r3, #8
    18aa:       f8c7 3128       str.w   r3, [r7, #296]  ; 0x128
    18ae:       f107 0308       add.w   r3, r7, #8
    18b2:       2100            movs    r1, #0
    18b4:       4618            mov     r0, r3
    18b6:       f7ff fffe       bl      0 <__sigsetjmp>
    18ba:       f8c7 0124       str.w   r0, [r7, #292]  ; 0x124
    18be:       f8d7 3124       ldr.w   r3, [r7, #292]  ; 0x124
    18c2:       2b00            cmp     r3, #0
    18c4:       d009            beq.n   18da <wait_on_barrier+0x72>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1208
 (discriminator 2)
    18c6:       f8d7 312c       ldr.w   r3, [r7, #300]  ; 0x12c
    18ca:       f8d7 0128       ldr.w   r0, [r7, #296]  ; 0x128
    18ce:       4798            blx     r3
    18d0:       f107 0308       add.w   r3, r7, #8
    18d4:       4618            mov     r0, r3
    18d6:       f7ff fffe       bl      0 <__pthread_unwind_next>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1208
 (discriminator 3)
    18da:       f107 0308       add.w   r3, r7, #8
    18de:       4618            mov     r0, r3
    18e0:       f7ff fffe       bl      0 <__pthread_register_cancel>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1209
 (discriminator 3)
    18e4:       1d3b            adds    r3, r7, #4
    18e6:       6818            ldr     r0, [r3, #0]
    18e8:       f7fe fd80       bl      3ec <__threadobj_tag_unlocked>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1210
 (discriminator 3)
    18ec:       1d3b            adds    r3, r7, #4
    18ee:       681b            ldr     r3, [r3, #0]
    18f0:       f503 7280       add.w   r2, r3, #256    ; 0x100
    18f4:       1d3b            adds    r3, r7, #4
    18f6:       681b            ldr     r3, [r3, #0]
    18f8:       3308            adds    r3, #8
    18fa:       4619            mov     r1, r3
    18fc:       4610            mov     r0, r2
    18fe:       f7ff fffe       bl      1364 <threadobj_cond_wait>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1211
 (discriminator 3)
    1902:       1d3b            adds    r3, r7, #4
    1904:       6818            ldr     r0, [r3, #0]
    1906:       f7fe fd61       bl      3cc <__threadobj_tag_locked>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1212
 (discriminator 3)
    190a:       f107 0308       add.w   r3, r7, #8
    190e:       4618            mov     r0, r3
    1910:       f7ff fffe       bl      0 <__pthread_unregister_cancel>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1213
 (discriminator 3)
    1914:       1d3b            adds    r3, r7, #4
    1916:       681b            ldr     r3, [r3, #0]
    1918:       f8d7 2130       ldr.w   r2, [r7, #304]  ; 0x130
    191c:       625a            str     r2, [r3, #36]   ; 0x24
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1214
 (discriminator 3)
    191e:       e7aa            b.n     1876 <wait_on_barrier+0xe>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1206
    1920:       bf00            nop
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1216
    1922:       f8d7 3134       ldr.w   r3, [r7, #308]  ; 0x134
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1217
    1926:       4618            mov     r0, r3
    1928:       f507 779c       add.w   r7, r7, #312    ; 0x138
    192c:       46bd            mov     sp, r7
    192e:       bd80            pop     {r7, pc}

-------------- next part --------------
(gdb) thread 1
[Switching to thread 1 (Thread 0xb6ff0000 (LWP 4007))]
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
46      in ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S
(gdb) bt full
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
No locals.
#1  0xb6f5f6f2 in __pthread_cond_wait (cond=0xb6caa194, mutex=0xb6caa09c) at 
pthread_cond_wait.c:177
        _a2tmp = 11
        _a2 = 11
        _nametmp = 240
        _a3tmp = 3
        _a3 = 3
        _a1 = -1228234344
        _v1tmp = -1228234596
        _a4tmp = 0
        _a1tmp = -1228234344
        _a4 = 0
        _v1 = -1228234596
        _name = 240
        __ret = <optimized out>
        futex_val = 3
        buffer = {__routine = 0xb6f5f3e1 <__condvar_cleanup>, __arg = 
0xbefffa00, __canceltype = -1225038976, __prev = 0x0}
        cbuffer = {oldtype = 1, cond = 0xb6caa194, mutex = 0xb6caa09c, bc_seq = 
0}
        err = <optimized out>
        pshared = <optimized out>
        pi_flag = 1
        val = <optimized out>
        seq = <optimized out>
#2  0xb6f9cfa6 in threadobj_cond_wait (cond=0xb6caa194, lock=0xb6caa09c) at 
../../../xenomai-3/lib/copperplate/threadobj.c:980
        ret = -1228234604
#3  0xb6f9d554 in wait_on_barrier (thobj=0xb6caa094, mask=16) at 
../../../xenomai-3/lib/copperplate/threadobj.c:1231
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {-619902981, 
-754243412, -1090520024, 0, 0, -1090520496, 0, 0, -1224740864, 0 <repeats 17 
times>, 
                2, 0, 5, 0, 1, -1224758200, -1659757782, -1224874387, 0, 
-1224738124, -1224757760, -1090520312, -1224757760, 0, -1, -1224796312, 
-1225176048, 
                -1224794112, -1225021852, -1224802304, -1224801088, 1, 0, 
-1224859353, -1224796312, 1, 5, 0, 0, 1, -1225038976, 0, -1090520024, 0, 
-1090520232, 
                -1090520024, 0, 0}, __mask_was_saved = 0}}, __pad = 
{0xbefffc28, 0x0, 0x0, 0xb6ca496c}}
        __cancel_routine = 0xb6f5e225 <__GI___pthread_mutex_unlock>
        __cancel_arg = 0xb6caa09c
        __not_first_call = 0
        oldstate = 0
        status = 73
#4  0xb6f9d606 in threadobj_start (thobj=0xb6caa094) at 
../../../xenomai-3/lib/copperplate/threadobj.c:1268
        current = 0xb6ca496c
        ret = 0
        oldstate = 1
#5  0xb6fc2d10 in rt_task_start (task=0x20ed0 <t_main>, entry=0x10985 
<main_task>, arg=0xdeadbeef) at ../../../xenomai-3/lib/alchemy/task.c:634
        tcb = 0xb6ca9f6c
        svc = {cancel_type = -559038737}
        ret = 67973
#6  0x00010a62 in main (argc=1, argv=0x21548) at task-1.c:26
        ret = 0
(gdb)
-------------- next part --------------
(gdb) thread 3
[Switching to thread 3 (Thread 0xb6c6f460 (LWP 4016))]
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
46      in ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S
(gdb) bt full
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
No locals.
#1  0xb6f5f6f2 in __pthread_cond_wait (cond=0xb6caa194, mutex=0xb6caa09c) at 
pthread_cond_wait.c:177
        _a2tmp = 11
        _a2 = 11
        _nametmp = 240
        _a3tmp = 1
        _a3 = 1
        _a1 = -1228234344
        _v1tmp = -1228234596
        _a4tmp = 0
        _a1tmp = -1228234344
        _a4 = 0
        _v1 = -1228234596
        _name = 240
        __ret = <optimized out>
        futex_val = 1
        buffer = {__routine = 0xb6f5f3e1 <__condvar_cleanup>, __arg = 
0xb6c6ec18, __canceltype = -1225038976, __prev = 0x0}
        cbuffer = {oldtype = 1, cond = 0xb6caa194, mutex = 0xb6caa09c, bc_seq = 
0}
        err = <optimized out>
        pshared = <optimized out>
        pi_flag = 1
        val = <optimized out>
        seq = <optimized out>
#2  0xb6f9cfa6 in threadobj_cond_wait (cond=0xb6caa194, lock=0xb6caa09c) at 
../../../xenomai-3/lib/copperplate/threadobj.c:980
        ret = -1228234596
#3  0xb6f9d554 in wait_on_barrier (thobj=0xb6caa094, mask=5) at 
../../../xenomai-3/lib/copperplate/threadobj.c:1231
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {-751562301, 
-754243412, -1228476860, -1090520124, 0, -1228477336, -1228476512, -1224802304, 
0, 
                -1228474732, 0 <repeats 16 times>, 1, -1224758200, -1228477152, 
-1228477004, 0, -1228477120, -1224757760, -1228477168, -1224757760, 
-1224738124, 
                -1, 0, -1225177664, -1224794112, 1, -1224794112, 0, 0, 0, 0, 
-1224793672, -1228477112, -1224793672, 0, -1, 0, -1225027584, 119632, 
-1225021852, 
                0, -1228234596, -1090520124, 0, -1228477040, -1228476512, 
-1224802304, 0, -1225404053}, __mask_was_saved = 0}}, __pad = {0xb6c6ee80, 0x0, 
0x0, 
            0xb6c6ed90}}
        __cancel_routine = 0xb6f5e225 <__GI___pthread_mutex_unlock>
        __cancel_arg = 0xb6caa09c
        __not_first_call = 0
        oldstate = 0
        status = 8
#4  0xb6f9d67a in threadobj_wait_start () at 
../../../xenomai-3/lib/copperplate/threadobj.c:1299
        current = 0xb6caa094
        status = 8
#5  0xb6fc2770 in task_prologue_2 (tcb=0xb6ca9f6c) at 
../../../xenomai-3/lib/alchemy/task.c:211
        ret = 0
#6  0xb6fc27b6 in task_entry (arg=0xb6ca9f6c) at 
../../../xenomai-3/lib/alchemy/task.c:227
        __ret = -1228476896
        tcb = 0xb6ca9f6c
        svc = {cancel_type = 1}
        ret = -1228475296
        __FUNCTION__ = "task_entry"
#7  0xb6f9a2d8 in thread_trampoline (arg=0xbefffb94) at 
../../../xenomai-3/lib/copperplate/internal.c:251
        cta = 0xbefffb94
        _cta = {stacksize = 0, detachstate = 1, policy = 1, param_ex = 
{__sched_priority = 99, sched_u = {rr = {__sched_rr_quantum = {tv_sec = 
-1225254836, 
                  tv_nsec = 0}}}}, prologue = 0xb6fc2719 <task_prologue_1>, run 
= 0xb6fc27a5 <task_entry>, arg = 0xb6ca9f6c, __reserved = {status = -38, warm = 
{
              __size = 
"\001\000\000\000\000\000\000\000\060m\376\266\001\000\000", __align = 1}, 
released = 0x10cac}}
        released = {__size = '\000' <repeats 15 times>, __align = 0}
        ret = 0
        __FUNCTION__ = "thread_trampoline"
#8  0xb6f5b424 in start_thread (arg=0x0) at pthread_create.c:335
        pd = 0x0
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-751561765, -754497110, 
-1228475296, -1090520280, 0, -1228476816, -1228476512, -1224802304, 0, 
-1228474732, 
                0 <repeats 54 times>}, mask_was_saved = 0}}, priv = {pad = 
{0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#9  0xb6eb243c in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:89 from 
/lib/arm-linux-gnueabihf/libc.so.6
No locals.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
_______________________________________________
Xenomai mailing list
[email protected]
http://xenomai.org/mailman/listinfo/xenomai
