[kernel] r14448 - in dists/lenny/linux-2.6/debian: . patches/bugfix/all patches/bugfix/s390 patches/series
Ben Hutchings
benh at alioth.debian.org
Sun Oct 25 00:33:58 UTC 2009
Author: benh
Date: Sun Oct 25 00:33:55 2009
New Revision: 14448
Log:
nohz: Fix two bugs that can keep a processor idle and lead to a system hang (may fix #538158 and others)
Added:
dists/lenny/linux-2.6/debian/patches/bugfix/all/nohz-dont-stop-if-softirqs-pending.patch
dists/lenny/linux-2.6/debian/patches/bugfix/all/nohz-dont-stop-outside-idle-loop.patch
dists/lenny/linux-2.6/debian/patches/bugfix/s390/nohz-dont-stop-outside-idle-loop-s390.patch
Modified:
dists/lenny/linux-2.6/debian/changelog
dists/lenny/linux-2.6/debian/patches/series/21
Modified: dists/lenny/linux-2.6/debian/changelog
==============================================================================
--- dists/lenny/linux-2.6/debian/changelog Sun Oct 25 00:32:59 2009 (r14447)
+++ dists/lenny/linux-2.6/debian/changelog Sun Oct 25 00:33:55 2009 (r14448)
@@ -2,6 +2,8 @@
[ Ben Hutchings ]
* Fix false soft lockup reports for the nohz idle loop
+ * nohz: Fix two bugs that can keep a processor idle and lead to a
+ system hang (may fix #538158 and others)
-- Ben Hutchings <ben at decadent.org.uk> Sat, 24 Oct 2009 23:45:45 +0100
Added: dists/lenny/linux-2.6/debian/patches/bugfix/all/nohz-dont-stop-if-softirqs-pending.patch
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/lenny/linux-2.6/debian/patches/bugfix/all/nohz-dont-stop-if-softirqs-pending.patch Sun Oct 25 00:33:55 2009 (r14448)
@@ -0,0 +1,45 @@
+From 857f3fd7a496ddf4329345af65a4a2b16dd25fe8 Mon Sep 17 00:00:00 2001
+From: Heiko Carstens <heiko.carstens at de.ibm.com>
+Date: Fri, 11 Jul 2008 11:09:22 +0200
+Subject: [PATCH] nohz: don't stop idle tick if softirqs are pending.
+
+In case a cpu goes idle but softirqs are pending only an error message is
+printed to the console. It may take a very long time until the pending
+softirqs will finally be executed. Worst case would be a hanging system.
+
+With this patch the timer tick just continues and the softirqs will be
+executed after the next interrupt. Still a delay but better than a
+hanging system.
+
+Currently we have at least two device drivers on s390 which under certain
+circumstances schedule a tasklet from process context. This is a reason
+why we can end up with pending softirqs when going idle. Fixing these
+drivers seems to be non-trivial.
+However there is no question that the drivers should be fixed.
+This patch shouldn't be considered as a bug fix. It just is intended to
+keep a system running even if device drivers are buggy.
+
+Signed-off-by: Heiko Carstens <heiko.carstens at de.ibm.com>
+Cc: Jan Glauber <jan.glauber at de.ibm.com>
+Cc: Stefan Weinhuber <wein at de.ibm.com>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Signed-off-by: Ingo Molnar <mingo at elte.hu>
+---
+ kernel/time/tick-sched.c | 1 +
+ 1 files changed, 1 insertions(+), 0 deletions(-)
+
+diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
+index cb75394..86baa4f 100644
+--- a/kernel/time/tick-sched.c
++++ b/kernel/time/tick-sched.c
+@@ -235,6 +235,7 @@ void tick_nohz_stop_sched_tick(void)
+ local_softirq_pending());
+ ratelimit++;
+ }
++ goto end;
+ }
+
+ ts->idle_calls++;
+--
+1.6.4.3
+
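Note on nohz-dont-stop-if-softirqs-pending.patch: the following is a minimal user-space sketch of the control flow the one-line change affects, not the kernel code itself. The names struct idle_model, model_try_stop_tick() and the patched flag are hypothetical and exist only for illustration; the real tick_nohz_stop_sched_tick() evaluates further conditions before it actually stops the tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU state touched by this code path. */
struct idle_model {
    bool tick_stopped;        /* true once the periodic tick has been stopped */
    unsigned long idle_calls; /* mirrors the ts->idle_calls counter */
};

/*
 * Simplified model of the softirq check in tick_nohz_stop_sched_tick().
 * softirq_pending plays the role of local_softirq_pending(); patched
 * selects the behaviour with (true) or without (false) the added goto.
 * Returns whether the tick is stopped after the call.
 */
static bool model_try_stop_tick(struct idle_model *m,
                                unsigned int softirq_pending, bool patched)
{
    assert(m != NULL);

    if (softirq_pending) {
        /* The kernel prints a rate-limited "NOHZ: local_softirq_pending"
         * message at this point; the model omits it. */
        if (patched)
            goto end; /* the line added by the patch */
    }

    m->idle_calls++;
    m->tick_stopped = true; /* assumption: no other condition prevents the stop */
end:
    return m->tick_stopped;
}
```

With patched set, a call with a non-zero pending mask leaves tick_stopped false, so the tick keeps running and the pending softirqs get a chance to run after the next interrupt. Without it, the model stops the tick even though work is still queued, which is the situation the commit message describes.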
Added: dists/lenny/linux-2.6/debian/patches/bugfix/all/nohz-dont-stop-outside-idle-loop.patch
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/lenny/linux-2.6/debian/patches/bugfix/all/nohz-dont-stop-outside-idle-loop.patch Sun Oct 25 00:33:55 2009 (r14448)
@@ -0,0 +1,316 @@
+From b8f8c3cf0a4ac0632ec3f0e15e9dc0c29de917af Mon Sep 17 00:00:00 2001
+From: Thomas Gleixner <tglx at linutronix.de>
+Date: Fri, 18 Jul 2008 17:27:28 +0200
+Subject: [PATCH] nohz: prevent tick stop outside of the idle loop
+
+Jack Ren and Eric Miao tracked down the following long standing
+problem in the NOHZ code:
+
+ scheduler switch to idle task
+ enable interrupts
+
+Window starts here
+
+ ----> interrupt happens (does not set NEED_RESCHED)
+ irq_exit() stops the tick
+
+ ----> interrupt happens (does set NEED_RESCHED)
+
+ return from schedule()
+
+ cpu_idle(): preempt_disable();
+
+Window ends here
+
+The interrupts can happen at any point inside the race window. The
+first interrupt stops the tick, the second one causes the scheduler to
+rerun and switch away from idle again and we end up with the tick
+disabled.
+
+The fact that it needs two interrupts where the first one does not set
+NEED_RESCHED and the second one does made the bug obscure and extremely
+hard to reproduce and analyse. Kudos to Jack and Eric.
+
+Solution: Limit the NOHZ functionality to the idle loop to make sure
+that we can not run into such a situation ever again.
+
+cpu_idle()
+{
+ preempt_disable();
+
+ while(1) {
+ tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
+ are in the idle loop
+
+ while (!need_resched())
+ halt();
+
+ tick_nohz_restart_sched_tick(); <- disables NOHZ mode
+ preempt_enable_no_resched();
+ schedule();
+ preempt_disable();
+ }
+}
+
+In hindsight we should have done this forever, but ...
+
+/me grabs a large brown paperbag.
+
+Debugged-by: Jack Ren <jack.ren at marvell.com>,
+Debugged-by: eric miao <eric.y.miao at gmail.com>
+Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+---
+ arch/arm/kernel/process.c | 2 +-
+ arch/avr32/kernel/process.c | 2 +-
+ arch/blackfin/kernel/process.c | 2 +-
+ arch/mips/kernel/process.c | 2 +-
+ arch/powerpc/kernel/idle.c | 2 +-
+ arch/powerpc/platforms/iseries/setup.c | 4 ++--
+ arch/sh/kernel/process_32.c | 2 +-
+ arch/sparc64/kernel/process.c | 2 +-
+ arch/um/kernel/process.c | 2 +-
+ arch/x86/kernel/process_32.c | 2 +-
+ arch/x86/kernel/process_64.c | 2 +-
+ include/linux/tick.h | 5 +++--
+ kernel/softirq.c | 2 +-
+ kernel/time/tick-sched.c | 12 ++++++++++--
+ 14 files changed, 26 insertions(+), 17 deletions(-)
+
+diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
+index 46bf2ed..84f5a4c 100644
+--- a/arch/arm/kernel/process.c
++++ b/arch/arm/kernel/process.c
+@@ -164,7 +164,7 @@ void cpu_idle(void)
+ if (!idle)
+ idle = default_idle;
+ leds_event(led_idle_start);
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched())
+ idle();
+ leds_event(led_idle_end);
+diff --git a/arch/avr32/kernel/process.c b/arch/avr32/kernel/process.c
+index 6cf9df1..ff820a9 100644
+--- a/arch/avr32/kernel/process.c
++++ b/arch/avr32/kernel/process.c
+@@ -31,7 +31,7 @@ void cpu_idle(void)
+ {
+ /* endless idle loop with no priority at all */
+ while (1) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched())
+ cpu_idle_sleep();
+ tick_nohz_restart_sched_tick();
+diff --git a/arch/blackfin/kernel/process.c b/arch/blackfin/kernel/process.c
+index 53c2cd2..77800dd 100644
+--- a/arch/blackfin/kernel/process.c
++++ b/arch/blackfin/kernel/process.c
+@@ -105,7 +105,7 @@ void cpu_idle(void)
+ #endif
+ if (!idle)
+ idle = default_idle;
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched())
+ idle();
+ tick_nohz_restart_sched_tick();
+diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
+index 2c09a44..bdead3a 100644
+--- a/arch/mips/kernel/process.c
++++ b/arch/mips/kernel/process.c
+@@ -53,7 +53,7 @@ void __noreturn cpu_idle(void)
+ {
+ /* endless idle loop with no priority at all */
+ while (1) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched()) {
+ #ifdef CONFIG_SMTC_IDLE_HOOK_DEBUG
+ extern void smtc_idle_loop_hook(void);
+diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c
+index c3cf0e8..d308a9f 100644
+--- a/arch/powerpc/kernel/idle.c
++++ b/arch/powerpc/kernel/idle.c
+@@ -60,7 +60,7 @@ void cpu_idle(void)
+
+ set_thread_flag(TIF_POLLING_NRFLAG);
+ while (1) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched() && !cpu_should_die()) {
+ ppc64_runlatch_off();
+
+diff --git a/arch/powerpc/platforms/iseries/setup.c b/arch/powerpc/platforms/iseries/setup.c
+index b721207..70b688c 100644
+--- a/arch/powerpc/platforms/iseries/setup.c
++++ b/arch/powerpc/platforms/iseries/setup.c
+@@ -561,7 +561,7 @@ static void yield_shared_processor(void)
+ static void iseries_shared_idle(void)
+ {
+ while (1) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched() && !hvlpevent_is_pending()) {
+ local_irq_disable();
+ ppc64_runlatch_off();
+@@ -591,7 +591,7 @@ static void iseries_dedicated_idle(void)
+ set_thread_flag(TIF_POLLING_NRFLAG);
+
+ while (1) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ if (!need_resched()) {
+ while (!need_resched()) {
+ ppc64_runlatch_off();
+diff --git a/arch/sh/kernel/process_32.c b/arch/sh/kernel/process_32.c
+index b98e37a..921892c 100644
+--- a/arch/sh/kernel/process_32.c
++++ b/arch/sh/kernel/process_32.c
+@@ -86,7 +86,7 @@ void cpu_idle(void)
+ if (!idle)
+ idle = default_idle;
+
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched())
+ idle();
+ tick_nohz_restart_sched_tick();
+diff --git a/arch/sparc64/kernel/process.c b/arch/sparc64/kernel/process.c
+index 2084f81..0798928 100644
+--- a/arch/sparc64/kernel/process.c
++++ b/arch/sparc64/kernel/process.c
+@@ -97,7 +97,7 @@ void cpu_idle(void)
+ set_thread_flag(TIF_POLLING_NRFLAG);
+
+ while(1) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+
+ while (!need_resched() && !cpu_is_offline(cpu))
+ sparc64_yield(cpu);
+diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
+index 83603cf..a1c6d07 100644
+--- a/arch/um/kernel/process.c
++++ b/arch/um/kernel/process.c
+@@ -243,7 +243,7 @@ void default_idle(void)
+ if (need_resched())
+ schedule();
+
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ nsecs = disable_timer();
+ idle_sleep(nsecs);
+ tick_nohz_restart_sched_tick();
+diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
+index f8476df..1f5fa1c 100644
+--- a/arch/x86/kernel/process_32.c
++++ b/arch/x86/kernel/process_32.c
+@@ -166,7 +166,7 @@ void cpu_idle(void)
+
+ /* endless idle loop with no priority at all */
+ while (1) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched()) {
+ void (*idle)(void);
+
+diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
+index e2319f3..c0a5c2a 100644
+--- a/arch/x86/kernel/process_64.c
++++ b/arch/x86/kernel/process_64.c
+@@ -148,7 +148,7 @@ void cpu_idle(void)
+ current_thread_info()->status |= TS_POLLING;
+ /* endless idle loop with no priority at all */
+ while (1) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched()) {
+ void (*idle)(void);
+
+diff --git a/include/linux/tick.h b/include/linux/tick.h
+index a881c65..d3c0269 100644
+--- a/include/linux/tick.h
++++ b/include/linux/tick.h
+@@ -49,6 +49,7 @@ struct tick_sched {
+ unsigned long check_clocks;
+ enum tick_nohz_mode nohz_mode;
+ ktime_t idle_tick;
++ int inidle;
+ int tick_stopped;
+ unsigned long idle_jiffies;
+ unsigned long idle_calls;
+@@ -105,14 +106,14 @@ static inline int tick_check_oneshot_change(int allow_nohz) { return 0; }
+ #endif /* !CONFIG_GENERIC_CLOCKEVENTS */
+
+ # ifdef CONFIG_NO_HZ
+-extern void tick_nohz_stop_sched_tick(void);
++extern void tick_nohz_stop_sched_tick(int inidle);
+ extern void tick_nohz_restart_sched_tick(void);
+ extern void tick_nohz_update_jiffies(void);
+ extern ktime_t tick_nohz_get_sleep_length(void);
+ extern void tick_nohz_stop_idle(int cpu);
+ extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
+ # else
+-static inline void tick_nohz_stop_sched_tick(void) { }
++static inline void tick_nohz_stop_sched_tick(int inidle) { }
+ static inline void tick_nohz_restart_sched_tick(void) { }
+ static inline void tick_nohz_update_jiffies(void) { }
+ static inline ktime_t tick_nohz_get_sleep_length(void)
+diff --git a/kernel/softirq.c b/kernel/softirq.c
+index 36e0617..05f2480 100644
+--- a/kernel/softirq.c
++++ b/kernel/softirq.c
+@@ -312,7 +312,7 @@ void irq_exit(void)
+ #ifdef CONFIG_NO_HZ
+ /* Make sure that timer wheel updates are propagated */
+ if (!in_interrupt() && idle_cpu(smp_processor_id()) && !need_resched())
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(0);
+ rcu_irq_exit();
+ #endif
+ preempt_enable_no_resched();
+diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
+index 86baa4f..ee962d1 100644
+--- a/kernel/time/tick-sched.c
++++ b/kernel/time/tick-sched.c
+@@ -195,7 +195,7 @@ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
+ * Called either from the idle loop or from irq_exit() when an idle period was
+ * just interrupted by an interrupt which did not cause a reschedule.
+ */
+-void tick_nohz_stop_sched_tick(void)
++void tick_nohz_stop_sched_tick(int inidle)
+ {
+ unsigned long seq, last_jiffies, next_jiffies, delta_jiffies, flags;
+ struct tick_sched *ts;
+@@ -224,6 +224,11 @@ void tick_nohz_stop_sched_tick(void)
+ if (unlikely(ts->nohz_mode == NOHZ_MODE_INACTIVE))
+ goto end;
+
++ if (!inidle && !ts->inidle)
++ goto end;
++
++ ts->inidle = 1;
++
+ if (need_resched())
+ goto end;
+
+@@ -372,11 +377,14 @@ void tick_nohz_restart_sched_tick(void)
+ local_irq_disable();
+ tick_nohz_stop_idle(cpu);
+
+- if (!ts->tick_stopped) {
++ if (!ts->inidle || !ts->tick_stopped) {
++ ts->inidle = 0;
+ local_irq_enable();
+ return;
+ }
+
++ ts->inidle = 0;
++
+ rcu_exit_nohz();
+
+ /* Update jiffies first */
+--
+1.6.4.3
+
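Note on nohz-dont-stop-outside-idle-loop.patch: the inidle gate can be sketched as a small user-space state machine. This is an illustrative model under simplifying assumptions, not the kernel implementation: struct tick_model, model_stop_tick(), model_restart_tick() and replay_race_window() are hypothetical names, and the model ignores every condition other than inidle and need_resched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct tick_sched fields involved. */
struct tick_model {
    bool inidle;       /* set while the CPU is inside the idle loop */
    bool tick_stopped; /* set once the periodic tick has been stopped */
};

/*
 * Models the gate added to tick_nohz_stop_sched_tick(int inidle).
 * The idle loop passes true, irq_exit() passes false.
 * Returns true if this call leaves the tick stopped.
 */
static bool model_stop_tick(struct tick_model *ts, bool inidle,
                            bool need_resched)
{
    assert(ts != NULL);

    if (!inidle && !ts->inidle)
        return false; /* called from irq_exit() outside the idle loop */

    ts->inidle = true;

    if (need_resched)
        return false;

    ts->tick_stopped = true; /* assumption: nothing else blocks the stop */
    return true;
}

/* Models tick_nohz_restart_sched_tick(): always clears the idle state. */
static void model_restart_tick(struct tick_model *ts)
{
    assert(ts != NULL);
    ts->inidle = false;
    ts->tick_stopped = false;
}

/*
 * Replays the race window from the commit message: the idle task has been
 * switched in, but cpu_idle() has not yet called the stop function with 1.
 * Returns whether the tick ends up stopped.
 */
static bool replay_race_window(void)
{
    struct tick_model ts = { false, false };

    /* First interrupt: irq_exit() calls the stop function with 0. */
    model_stop_tick(&ts, false, false);

    /* A second interrupt would now set NEED_RESCHED and switch away from
     * idle; what matters is whether the tick was left stopped. */
    return ts.tick_stopped;
}
```

In this model replay_race_window() returns false, because the first interrupt arrives before the idle loop has set inidle and the call from irq_exit() therefore returns early. Once the idle loop has called model_stop_tick() with true, later calls from irq_exit() are allowed to stop the tick again until model_restart_tick() clears the flag.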
Added: dists/lenny/linux-2.6/debian/patches/bugfix/s390/nohz-dont-stop-outside-idle-loop-s390.patch
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/lenny/linux-2.6/debian/patches/bugfix/s390/nohz-dont-stop-outside-idle-loop-s390.patch Sun Oct 25 00:33:55 2009 (r14448)
@@ -0,0 +1,26 @@
+From e338125b8a886923ba8367207c144764dc352584 Mon Sep 17 00:00:00 2001
+From: Thomas Gleixner <tglx at linutronix.de>
+Date: Sat, 19 Jul 2008 09:33:21 +0200
+Subject: [PATCH] nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well
+
+Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+---
+ arch/s390/kernel/process.c | 2 +-
+ 1 files changed, 1 insertions(+), 1 deletions(-)
+
+diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
+index 85defd0..9839767 100644
+--- a/arch/s390/kernel/process.c
++++ b/arch/s390/kernel/process.c
+@@ -142,7 +142,7 @@ static void default_idle(void)
+ void cpu_idle(void)
+ {
+ for (;;) {
+- tick_nohz_stop_sched_tick();
++ tick_nohz_stop_sched_tick(1);
+ while (!need_resched())
+ default_idle();
+ tick_nohz_restart_sched_tick();
+--
+1.6.4.3
+
Modified: dists/lenny/linux-2.6/debian/patches/series/21
==============================================================================
--- dists/lenny/linux-2.6/debian/patches/series/21 Sun Oct 25 00:32:59 2009 (r14447)
+++ dists/lenny/linux-2.6/debian/patches/series/21 Sun Oct 25 00:33:55 2009 (r14448)
@@ -1 +1,4 @@
+ bugfix/all/softlockup-fix-false-positives-on-nohz-idle.patch
++ bugfix/all/nohz-dont-stop-if-softirqs-pending.patch
++ bugfix/all/nohz-dont-stop-outside-idle-loop.patch
++ bugfix/s390/nohz-dont-stop-outside-idle-loop-s390.patch