[Re: [linux-yocto][v5.10/standard/preempt-rt/base and 
v5.4/standard/preempt-rt/base][PATCH v2] fs: dcache: avoid livelocks on 
d_alloc_parallel] On 19/05/2021 (Wed 18:25) Wenlin Kang wrote:

> On 2021/5/19 上午11:12, Bruce Ashfield wrote:
> > [Please note: This e-mail is from an EXTERNAL e-mail address]
> > 
> > Adding Paul (I'd like his opinion).
> 
> 
> Hi Bruce
> 
> 
> > 
> > This is in a really core part of the kernel, and I can't tell from the 
> > patch:
> > 
> >    - has this been submitted to the upstream -rt list ? (I checked and
> > didn't see it)
> 
> 
> I haven't found the one upstream -rt tree which has file
> include/linux/spinlock_rt.h and is included in include/linux/spinlock.h,
> so I don't know where I should submit it, would you please tell me if you
> know? thanks.
> 
> 
> >    - is this a backport ? (A quick check also says no, I don't see it
> > in my 5.12-rt)
> 
> 
> It's not a backport, in yocto 5.12-rt,  it don't have
> include/linux/spinlock_rt.h,  so it don't need this patch.

Filenames may have changed, but the core technology has not, so if you
are basing your conclusions on filename alone, they are wrong.

> 
> 
> >    - what sort of stress testing has been done with the patch ?
> >    - was there a reproducer for the original problem that this has been
> > shown to solve ?
> 
> 
> Steps to reproduce:
> 1)  enable config CONFIG_PREEMPT_RT
> 2)  $ taskset -c 1 chrt -f 70 ./test1.sh &
>      $ taskset -c 1 chrt -f 80 ./test2.sh &
> 
> After about 30 minutes, check whether serial console responds.
> 
> If not  have the patch,  after 30 minutes, the console will not respond any
> more,
> with this patch, the problem get solved.

You are likely simply masking a symptom by serializing things in dcache
vs fixing the underlying problem with your old v4.12-rt

I say this because you are essentially saying there is a priority
inversion in the core dcache code, and if that were true, it would show
up not just here for you but in lots of other code places for other
people as well.

> Though this problem was reproduced on v4.12 with preempt-rt with config
> CONFIG_PREEMPT_RT_FULL,

We'd need to see it reproduced on recent -rt and then if seen and once the
underlying issue was properly solved, we  could work backwards to see
which currently maintained -rt releases were impacted.  You could look
at fixes applied to v4.14-rt as I believe that is the closest to you
that is currently maintained.

> I check the codes, found it should also be exist on
> v5.10/standard/preempt-rt/base and v5.4/standard/preempt-rt/base,

If you are just looking at dcache code and saying they are fundamentally
the same, then again your conclusions won't be accurate.  In addition to
reproducing on a modern -rt, I would also suggest - for your end goal -
to get a vanilla v4.14-rt and use the core of your .config  from
v4.12-rt and retest there.

If you can't reproduce it on v4.14-rt then you have a path ; examine
where the 4.14-rt patches diverged from your v4.12-rt and use that to
narrow in on your issue.

If you can reproduce it on v4.14 then great - you've got something with
simple test cases to reproduce that you can take to the rt mailing list,
and indicate you've managed to make the issue go away but are not sure
it is the right way to resolve it.

> only have a difference on config option, v5.2 and previous version used
> config CONFIG_PREEMPT_RT_FULL,
> v5.10/standard/preempt-rt/base and v5.4/standard/preempt-rt/base used
> CONFIG_PREEMPT_RT.

Yes the config options got renamed along the way, as part of a strategy
to get more rt content into mainline, but again the core technology of
"-rt" remains largely unchanged.

To be clear, I can't recommend applying this patch to any v5.x-rt as is
without more investigation as outlined above.

Paul.
--

> 
> 
> P.S.
> $ cat test1.sh
> #!/bin/bash
> 
> while true; do
>       usleep 1
> done
> 
> $ cat test2.sh
> #!/bin/bash
> 
> while true; do
>       cat /proc/stat > /dev/null
>       usleep 100
> done
> 
> 
> > 
> > Bruce
> > 
> > 
> > On Tue, May 18, 2021 at 10:42 PM Kang Wenlin <[email protected]> 
> > wrote:
> > > From: Wenlin Kang <[email protected]>
> > > 
> > > With RT, if a high priority task that runs d_alloc_parallel() preempts a
> > > low priority task that runs __d_add(), it can make d_alloc_parallel() go
> > > into an infinite retry.
> > > 
> > > low priority task:                      high priority task:
> > > __lookup_slow()
> > >      d_add()
> > >          __d_add
> > >              start_dir_add(dir)          __lookup_slow()
> > >                                              d_alloc_parallel()
> > > 
> > > When the low priority task finished start_dir_add(), i_dir_seq has been
> > > incremented to an odd value. Since __d_add() uses spin_lock(), migration
> > > is disabled, so if the low priority task is preempted during 
> > > start_dir_add()
> > > /end_dir_add(), then high priority task can go into an infinite waiting 
> > > for
> > > i_dir_seq.
> > > 
> > > This patch avoid the issue by enable preempt_disable() during 
> > > start_dir_add()
> > > /end_dir_add().
> > > 
> > > Signed-off-by: Wenlin Kang <[email protected]>
> > > ---
> > >   fs/dcache.c | 6 ++++++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/fs/dcache.c b/fs/dcache.c
> > > index ad9bffc..16e204d 100644
> > > --- a/fs/dcache.c
> > > +++ b/fs/dcache.c
> > > @@ -2667,6 +2667,9 @@ static inline void __d_add(struct dentry *dentry, 
> > > struct inode *inode)
> > >   {
> > >          struct inode *dir = NULL;
> > >          unsigned n;
> > > +#ifdef CONFIG_PREEMPT_RT
> > > +       preempt_disable();
> > > +#endif
> > >          spin_lock(&dentry->d_lock);
> > >          if (unlikely(d_in_lookup(dentry))) {
> > >                  dir = dentry->d_parent->d_inode;
> > > @@ -2685,6 +2688,9 @@ static inline void __d_add(struct dentry *dentry, 
> > > struct inode *inode)
> > >          if (dir)
> > >                  end_dir_add(dir, n);
> > >          spin_unlock(&dentry->d_lock);
> > > +#ifdef CONFIG_PREEMPT_RT
> > > +       preempt_enable();
> > > +#endif
> > >          if (inode)
> > >                  spin_unlock(&inode->i_lock);
> > >   }
> > > --
> > > 1.9.1
> > > 
> > 
> > --
> > - Thou shalt not follow the NULL pointer, for chaos and madness await
> > thee at its end
> > - "Use the force Harry" - Gandalf, Star Trek II
> 
> 
> -- 
> Thanks,
> Wenlin Kang
> 
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#9899): 
https://lists.yoctoproject.org/g/linux-yocto/message/9899
Mute This Topic: https://lists.yoctoproject.org/mt/82928225/21656
Group Owner: [email protected]
Unsubscribe: https://lists.yoctoproject.org/g/linux-yocto/unsub 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to