Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> This adds __xn_sys_drop_u_mode, a syscall to deregister the u_mode user
>>>>>> space address passed on shadow creation for the current thread. We need
>>>>>> this in order to safely shut down a thread that wants to release its
>>>>>> u_mode memory (either malloc'ed or TLS-allocated). So far, the kernel
>>>>>> might have touched this memory even after the release which caused
>>>>>> subtle corruptions.
>>>>>> After we released u_mode, we can no longer make reliable decisions based
>>>>>> on it. For the fast mutex use case, we then assume the worst case, ie.
>>>>>> we enforce syscall-based acquisitions unconditionally. For the
>>>>>> assert_nrt case, we simply acquire the required information via
>>>>>> __xn_sys_current_info.
>>>>> As usual, you are firing patches too fast:
>>>>> - you are lacking the exit syscall trap.
>>>> Don't worry, my memory is not yet _that_ bad. :)
>>>>> - the patch is not correct and will cause kernel-space addresses to be
>>>>> passed to put_user, which probably yields bugs when the kernel is
>>>>> compiled with proper debug flags;
>>>> Right, e.g. ARM requires set_fs() fiddling in addition. But then a simple
>>>> NULL-check is way better. Fixed.
>>>>> - I do not see the point where the new syscall is called with __thread,
>>>>> but I am not sure it is missing, and if it is missing, you will get the
>>>>> rightfull warning when hitting the exit syscall.
>>>> Destructor of (now unconditionally used) xeno_current_mode_key.
>>>>> - the user-space users of the function to get current_mode are still
>>>>> cluttered with special cases to handle the invalid state. And
>>>>> illustrating nicely the deficiency of this approach, you have forgot one
>>>>> user.
>>>> My impression is that you are still a bit biased due to TLS limitations
>>>> on ARM.
>>> No, I think the __thread stuff thinks make C code look too much like C++
>>> where the compiler makes things behind your back for such simple code as:
>>> void *foo = &bar;
>>> This is not what I expect from a C compiler, and I suspect a lot of C
>>> developers, including the people who advocate that C, as a simple
>>> language, and not C++ should be used for free software development.
>>> Yes, there is no gain on ARM, where this generates a function call as
>>> pthread_getspecific and causes bugs with gcc >= 4.3, but not much gain
>>> on x86 either where a function call is so cheap.
>> As are individual memory accesses. The sum makes the difference.
> Ok. If you find before 2.5.3 a platform where this makes a difference of
> more than 5% of this platform's latency, I do not remove the __thread
> stuff. Deal?

If you manage to remove 5% of the configuration variables of Xenomai
that way. Or 5% of the conditional code.

>> Once someone starts testing mutexes or other stuff from TSD destructors,
>> not earlier. And this is so special that anyone of us interested in this
>> test case will naturally be aware of its specifics.
> As you said to err is human. And having a single point for handling an
> exceptional case is a good idea. And this shortens your patch.
>>> I think every sane free-software projects ask people to test their
>>> patches/pull requests before sending them, even the Linux kernel. See
>>> for instance:
>>> I am not asking you for one week. 24 hours would be enough.
>> As you may know, complete test coverage is near to impossible.
> That is no excuse for not making any test.
>> On unlucky days, I happen to send out stuff that broke while continuing to
>> type after the last test ran. But specifically these things were tested
>> in various combinations.
>> It is not very polite to assume the contrary.
> When you send a patch every 5 minutes, it is obvious that it was not
> tested. No politeness involved. Everybody can check the timestamps in
> the mailing lists and do the math.

I think my workflow was misleading regarding this. Not necessarily all
stuff I push into my 'for-upstream' branch is done with all tests at the
time when I push. Official pull request mails are supposed to mark this.
I'm pushing to save my work and, of course, to expose it. So I
appreciate comments, but I surely don't expect them within minutes.

>  I'm trying to avoid the
>> same assumption about other people as well, even when things look
>> differently from time to time. To err is human, to always write correct
>> code is divine.
>> But as you already mentioned in another mail, there is room for
>> improvement /wrt testability. On step forward would be an automated
>> build and run of some git tree that we can check into. One board per
>> arch would be enough to avoid issues we face from time to time when we
>> only stress our favorite archs with new features or fixes.
> We have the automated build.

Then we just need to run it against some 'build-test' branches of
everyone's git repositories on checkin. That would be great.

> I plan seriously to work on the automated
> run for 2.5.3. But it is too late for 2.5.2. Also note that I gave you
> access to my server where several targets are configured and usable. If
> you forgot the password, I can reset it.

I do remember this, but I thought this was for a specific test. If I'm
supposed to access that box regularly, I'm fine, we can arrange this.

>>> In fact, the mistake is partly mine too, I am replying to your patches
>>> as fast as you send them. We were going crazy tuesday with the condition
>>> variable patches, I do not think it gives a reassuring image to the
>>> users.
>> I can only say that, as a developer in this business, I trust code
>> having been discussed to the extreme _way_ more than stuff a single
>> clever person checked in even after meditating over it for weeks. I've
>> seen enough of this, produced by non-clever /me as well as way more
>> clever people in all kinds of projects.
>>> So that at some point, I stopped reviewing your patches. I have
>>> not even read the last 8 you sent. Without going to the other extreme of
>>> a "closed door" development model, I think we can find a more reasonable
>>> trade off than the current one.
>> Well, you have to decide which piece to take, or to request it to be
>> rewritten, or to write yourself.
> Yes. That is the plan. Tomorrow.
>>> Our objective now, is to put 2.5.2 in a stable state.
>>>> +substitute_linux_syscall(xnthread_t *thread, struct pt_regs *regs)
>>>> +{
>>>> +  if (unlikely(__xn_reg_mux(regs) == __NR_exit))
>>>> +          handle_shadow_exit(thread);
>>>> +
>>> Not your fault, but this does not work on ARM. Do not know if
>>> __xn_reg_mux needs fixing there or if we need to add a new macro to get
>>> the roots syscall number.
>> And that's the point of sending patches for discussion.
>> It's regs->ARM_r7 == __NR_SYSCALL_BASE + __NR_MY_SYSCALL, right? If
>> __xn_reg_mux is supposed to be only a mux value for Xenomai, we need a
>> separate accessor for the Linux syscall number on all archs.

Patch is in my queue if you are interested.


PS: Note that this u_mode fix exposed an issue of the sigtest:
cancel_with_signals fails if there is some TSD destructor registered
that issues a Xenomai syscall. Not sure yet if it is a fundamental issue
or just related to the test itself.

Attachment: signature.asc
Description: OpenPGP digital signature

Xenomai-core mailing list

Reply via email to