[Pkg-lustre-maintainers] Add fix for bug 13614
Niklas Edmundsson
nikke at acc.umu.se
Thu Sep 27 14:28:43 UTC 2007
Hi again!
I would suggest adding the fix for bug 13614. Really, it's a kludge
until the core problem is fixed, but it mitigates the problem enough
to be useful.
We triggered it easily by rebooting both our OSTs at the same time.
When Lustre went into recovery, it more or less hammered itself to
death, resulting in soft lockups and related nastiness. With the patch
applied, we haven't been able to trigger it.
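For anyone who hasn't looked at the patch: it sets rq_no_resend on the
blocking and completion AST requests. As a rough illustration of the
intended effect, here is a toy model. It is not Lustre code; the struct,
the function names and the retry loop are all made up for the example,
and the assumption is simply that a request with the flag set is sent
once and not retried when no reply comes back.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request; NOT Lustre's struct ptlrpc_request. */
struct toy_request {
	bool no_resend;   /* plays the role of rq_no_resend */
	int  send_count;  /* number of times the request was put on the wire */
};

/* Hypothetical transport that never gets a reply, as if the peer
 * were down or still busy with recovery. */
bool toy_send(struct toy_request *req)
{
	req->send_count++;
	return false;
}

/* Send a request, retrying up to max_attempts times on failure,
 * unless the request is flagged no_resend. Returns true on success. */
bool toy_send_with_retry(struct toy_request *req, int max_attempts)
{
	assert(max_attempts > 0);

	for (int i = 0; i < max_attempts; i++) {
		if (toy_send(req))
			return true;
		if (req->no_resend)
			break;  /* give up after the first failed attempt */
	}
	return false;
}
```

In this model an unflagged request that never gets a reply is sent
max_attempts times, while a flagged one is sent exactly once, which is
the kind of load reduction the kludge is after.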
Attached is a version adapted for pkg-lustre trunk (mainly
formatting).
/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | nikke at acc.umu.se
---------------------------------------------------------------------------
"Mr Garibaldi would be delighted."--Garibaldi
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
-------------- next part --------------
Index: trunk/debian/patches/bug13614-kludge-stop-resend.dpatch
===================================================================
--- trunk/debian/patches/bug13614-kludge-stop-resend.dpatch (revision 0)
+++ trunk/debian/patches/bug13614-kludge-stop-resend.dpatch (revision 0)
@@ -0,0 +1,25 @@
+#! /bin/sh /usr/share/dpatch/dpatch-run
+##
+## All lines beginning with `## DP:' are a description of the patch.
+## DP: CFS bug 13614 - Reduce problem by stopping resend.
+
+@DPATCH@
+diff -u -p -r1.189.34.11 ldlm_lockd.c
+--- ./lustre/ldlm/ldlm_lockd.c 16 Aug 2007 01:22:37 -0000 1.189.34.11
++++ ./lustre/ldlm/ldlm_lockd.c 12 Sep 2007 01:35:58 -0000
+@@ -566,6 +566,7 @@ int ldlm_server_blocking_ast(struct ldlm
+ req->rq_async_args.pointer_arg[0] = arg;
+ req->rq_async_args.pointer_arg[1] = lock;
+ req->rq_interpret_reply = ldlm_cb_interpret;
++ req->rq_no_resend = 1;
+
+ lock_res(lock->l_resource);
+ if (lock->l_granted_mode != lock->l_req_mode) {
+@@ -664,6 +665,7 @@ int ldlm_server_completion_ast(struct ld
+ req->rq_async_args.pointer_arg[0] = arg;
+ req->rq_async_args.pointer_arg[1] = lock;
+ req->rq_interpret_reply = ldlm_cb_interpret;
++ req->rq_no_resend = 1;
+
+ body = lustre_msg_buf(req->rq_reqmsg, DLM_LOCKREQ_OFF, sizeof(*body));
+ body->lock_handle[0] = lock->l_remote_handle;