Hi Jürgen,

Jürgen Groß writes:
> On 13.06.20 00:27, Volodymyr Babchuk wrote:
>> On Fri, 2020-06-12 at 17:29 +0200, Dario Faggioli wrote:
>>> On Fri, 2020-06-12 at 14:41 +0200, Jürgen Groß wrote:
>>>> On 12.06.20 14:29, Julien Grall wrote:
>>>>> On 12/06/2020 05:57, Jürgen Groß wrote:
>>>>>> On 12.06.20 02:22, Volodymyr Babchuk wrote:
>>>>>>> @@ -994,9 +998,22 @@ s_time_t sched_get_time_correction(struct sched_unit *u)
>>>>>>>              break;
>>>>>>>      }
>>>>>>> +    spin_lock_irqsave(&sched_stat_lock, flags);
>>>>>>> +    sched_stat_irq_time += irq;
>>>>>>> +    sched_stat_hyp_time += hyp;
>>>>>>> +    spin_unlock_irqrestore(&sched_stat_lock, flags);
>>>>>>
>>>>>> Please don't use a lock. Just use add_sized() instead which will add
>>>>>> atomically.
>>>>>
>>>>> If we expect sched_get_time_correction to be called concurrently then
>>>>> we would need to introduce atomic64_t or a spin lock.
>>>>
>>>> Or we could use percpu variables and add the cpu values up when
>>>> fetching the values.
>>>>
>>> Yes, either percpu or atomic looks much better than locking, to me, for
>>> this.
>>
>> Looks like we are going to have atomic64_t after all. So, I'll prefer to
>> use atomics there.
>
> Performance would be better using percpu variables, as those would avoid
> the cacheline being moved between cpus a lot.

I see. But don't we need locking in this case? I can see a scenario where
one pCPU updates its own counters while another pCPU is reading them.

IIRC, ARMv8 guarantees that a 64-bit read of aligned data is consistent.
"Consistent" in the sense that, for example, we would not see the lower 32
bits of the new value together with the upper 32 bits of the old value. I
can't say for sure about ARMv7 and about x86.
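To make sure I read the percpu suggestion correctly, here is a minimal
sketch of what I think is meant. The two variables and both helpers are
names I made up for illustration; only DEFINE_PER_CPU(), this_cpu(),
per_cpu() and for_each_online_cpu() are existing Xen interfaces:

#include <xen/cpumask.h>
#include <xen/percpu.h>
#include <xen/time.h>

/* Per-pCPU accumulators; each pCPU only ever writes its own copy. */
static DEFINE_PER_CPU(s_time_t, sched_stat_irq_time);
static DEFINE_PER_CPU(s_time_t, sched_stat_hyp_time);

/* Writer: runs on the local pCPU only, so no lock and no atomics. */
static void sched_stat_account(s_time_t irq, s_time_t hyp)
{
    this_cpu(sched_stat_irq_time) += irq;
    this_cpu(sched_stat_hyp_time) += hyp;
}

/* Reader: add the per-cpu values up only when the totals are fetched. */
static void sched_stat_fetch(s_time_t *irq_total, s_time_t *hyp_total)
{
    unsigned int cpu;

    *irq_total = 0;
    *hyp_total = 0;

    for_each_online_cpu ( cpu )
    {
        /*
         * These cross-cpu 64-bit loads are where my question applies:
         * on a 32-bit arch they may tear against a concurrent update
         * by the owning pCPU.
         */
        *irq_total += per_cpu(sched_stat_irq_time, cpu);
        *hyp_total += per_cpu(sched_stat_hyp_time, cpu);
    }
}

If those cross-cpu loads can indeed tear on 32-bit, I assume the reader
side would need read_atomic() per field (assuming it handles 64-bit
quantities on that arch), or we are back to atomic64_t as discussed above.

-- 
Volodymyr Babchuk at EPAM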