sched_clock: Fix atomicity/continuity bug by using cmpxchg64()
author Eric Dumazet <eric.dumazet@gmail.com>
Wed, 30 Sep 2009 18:36:19 +0000 (20:36 +0200)
committer Ingo Molnar <mingo@elte.hu>
Wed, 30 Sep 2009 20:56:10 +0000 (22:56 +0200)
Commit def0a9b2573 (sched_clock: Make it NMI safe) assumed
cmpxchg() of 64bit values was available on X86_32.

That is not so - and causes some subtle scheduler misbehavior due
to incorrect timestamps that are off by up to ~4 seconds.
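
For illustration only, the retry pattern that the patched code relies on
can be sketched in userspace C11. This is not the kernel code: clock_ns
and clock_update() are hypothetical names, and
atomic_compare_exchange_weak() stands in for the kernel's cmpxchg64().
The point is that the compare-and-exchange must cover all 64 bits of the
timestamp, so the value read and the value written are always consistent.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-CPU clock value; not the kernel's struct. */
static _Atomic uint64_t clock_ns;

/*
 * Advance clock_ns to at least `now` without ever moving it backwards.
 * The loop retries whenever another updater changed the value between
 * our read and our compare-and-exchange.
 */
static uint64_t clock_update(uint64_t now)
{
    uint64_t old = atomic_load(&clock_ns);
    uint64_t val;

    do {
        /* Keep the larger of the two timestamps (monotonic clock). */
        val = now > old ? now : old;
        /* On failure, `old` is refreshed with the current contents. */
    } while (!atomic_compare_exchange_weak(&clock_ns, &old, val));

    return val;
}
```

Calling clock_update() with a value above 2^32 and then with a smaller
value should return the larger value both times, which only holds if the
full 64-bit quantity is compared and stored atomically.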

Two symptoms are known right now:

 - interactivity problems seen by Arjan: latencies of up to
   600 msecs instead of the expected 20-40 msecs. These
   latencies are very visible on the desktop.

 - incorrect CPU stats: occasionally too high percentages in 'top',
   and crazy CPU usage stats.

Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090930170754.0886ff2e@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/kernel/sched_clock.c b/kernel/sched_clock.c
index ac2e1dc..479ce56 100644
--- a/kernel/sched_clock.c
+++ b/kernel/sched_clock.c
@@ -127,7 +127,7 @@ again:
        clock = wrap_max(clock, min_clock);
        clock = wrap_min(clock, max_clock);
 
-       if (cmpxchg(&scd->clock, old_clock, clock) != old_clock)
+       if (cmpxchg64(&scd->clock, old_clock, clock) != old_clock)
                goto again;
 
        return clock;
@@ -163,7 +163,7 @@ again:
                val = remote_clock;
        }
 
-       if (cmpxchg(ptr, old_val, val) != old_val)
+       if (cmpxchg64(ptr, old_val, val) != old_val)
                goto again;
 
        return val;