[Re: [linux-yocto][v5.10/standard/preempt-rt/base and v5.4/standard/preempt-rt/base][PATCH v2] fs: dcache: avoid livelocks on d_alloc_parallel] On 19/05/2021 (Wed 18:25) Wenlin Kang wrote:
> On 2021/5/19 上午11:12, Bruce Ashfield wrote: > > [Please note: This e-mail is from an EXTERNAL e-mail address] > > > > Adding Paul (I'd like his opinion). > > > Hi Bruce > > > > > > This is in a really core part of the kernel, and I can't tell from the > > patch: > > > > - has this been submitted to the upstream -rt list ? (I checked and > > didn't see it) > > > I haven't found the one upstream -rt tree which has file > include/linux/spinlock_rt.h and is included in include/linux/spinlock.h, > so I don't know where I should submit it, would you please tell me if you > know? thanks. > > > > - is this a backport ? (A quick check also says no, I don't see it > > in my 5.12-rt) > > > It's not a backport, in yocto 5.12-rt, it don't have > include/linux/spinlock_rt.h, so it don't need this patch. Filenames may have changed, but the core technology has not, so if you are basing your conclusions on filename alone, they are wrong. > > > > - what sort of stress testing has been done with the patch ? > > - was there a reproducer for the original problem that this has been > > shown to solve ? > > > Steps to reproduce: > 1) enable config CONFIG_PREEMPT_RT > 2) $ taskset -c 1 chrt -f 70 ./test1.sh & > $ taskset -c 1 chrt -f 80 ./test2.sh & > > After about 30 minutes, check whether serial console responds. > > If not have the patch, after 30 minutes, the console will not respond any > more, > with this patch, the problem get solved. You are likely simply masking a symptom by serializing things in dcache vs fixing the underlying problem with your old v4.12-rt I say this because you are essentially saying there is a priority inversion in the core dcache code, and if that were true, it would show up not just here for you but in lots of other code places for other people as well. > Though this problem was reproduced on v4.12 with preempt-rt with config > CONFIG_PREEMPT_RT_FULL, We'd need to see it reproduced on recent -rt and then if seen and once the underlying issue was properly solved, we could work backwards to see which currently maintained -rt releases were impacted. You could look at fixes applied to v4.14-rt as I believe that is the closest to you that is currently maintained. > I check the codes, found it should also be exist on > v5.10/standard/preempt-rt/base and v5.4/standard/preempt-rt/base, If you are just looking at dcache code and saying they are fundamentally the same, then again your conclusions won't be accurate. In addition to reproducing on a modern -rt, I would also suggest - for your end goal - to get a vanilla v4.14-rt and use the core of your .config from v4.12-rt and retest there. If you can't reproduce it on v4.14-rt then you have a path ; examine where the 4.14-rt patches diverged from your v4.12-rt and use that to narrow in on your issue. If you can reproduce it on v4.14 then great - you've got something with simple test cases to reproduce that you can take to the rt mailing list, and indicate you've managed to make the issue go away but are not sure it is the right way to resolve it. > only have a difference on config option, v5.2 and previous version used > config CONFIG_PREEMPT_RT_FULL, > v5.10/standard/preempt-rt/base and v5.4/standard/preempt-rt/base used > CONFIG_PREEMPT_RT. Yes the config options got renamed along the way, as part of a strategy to get more rt content into mainline, but again the core technology of "-rt" remains largely unchanged. To be clear, I can't recommend applying this patch to any v5.x-rt as is without more investigation as outlined above. Paul. -- > > > P.S. > $ cat test1.sh > #!/bin/bash > > while true; do > usleep 1 > done > > $ cat test2.sh > #!/bin/bash > > while true; do > cat /proc/stat > /dev/null > usleep 100 > done > > > > > > Bruce > > > > > > On Tue, May 18, 2021 at 10:42 PM Kang Wenlin <[email protected]> > > wrote: > > > From: Wenlin Kang <[email protected]> > > > > > > With RT, if a high priority task that runs d_alloc_parallel() preempts a > > > low priority task that runs __d_add(), it can make d_alloc_parallel() go > > > into an infinite retry. > > > > > > low priority task: high priority task: > > > __lookup_slow() > > > d_add() > > > __d_add > > > start_dir_add(dir) __lookup_slow() > > > d_alloc_parallel() > > > > > > When the low priority task finished start_dir_add(), i_dir_seq has been > > > incremented to an odd value. Since __d_add() uses spin_lock(), migration > > > is disabled, so if the low priority task is preempted during > > > start_dir_add() > > > /end_dir_add(), then high priority task can go into an infinite waiting > > > for > > > i_dir_seq. > > > > > > This patch avoid the issue by enable preempt_disable() during > > > start_dir_add() > > > /end_dir_add(). > > > > > > Signed-off-by: Wenlin Kang <[email protected]> > > > --- > > > fs/dcache.c | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/fs/dcache.c b/fs/dcache.c > > > index ad9bffc..16e204d 100644 > > > --- a/fs/dcache.c > > > +++ b/fs/dcache.c > > > @@ -2667,6 +2667,9 @@ static inline void __d_add(struct dentry *dentry, > > > struct inode *inode) > > > { > > > struct inode *dir = NULL; > > > unsigned n; > > > +#ifdef CONFIG_PREEMPT_RT > > > + preempt_disable(); > > > +#endif > > > spin_lock(&dentry->d_lock); > > > if (unlikely(d_in_lookup(dentry))) { > > > dir = dentry->d_parent->d_inode; > > > @@ -2685,6 +2688,9 @@ static inline void __d_add(struct dentry *dentry, > > > struct inode *inode) > > > if (dir) > > > end_dir_add(dir, n); > > > spin_unlock(&dentry->d_lock); > > > +#ifdef CONFIG_PREEMPT_RT > > > + preempt_enable(); > > > +#endif > > > if (inode) > > > spin_unlock(&inode->i_lock); > > > } > > > -- > > > 1.9.1 > > > > > > > -- > > - Thou shalt not follow the NULL pointer, for chaos and madness await > > thee at its end > > - "Use the force Harry" - Gandalf, Star Trek II > > > -- > Thanks, > Wenlin Kang >
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#9899): https://lists.yoctoproject.org/g/linux-yocto/message/9899 Mute This Topic: https://lists.yoctoproject.org/mt/82928225/21656 Group Owner: [email protected] Unsubscribe: https://lists.yoctoproject.org/g/linux-yocto/unsub [[email protected]] -=-=-=-=-=-=-=-=-=-=-=-
