Possible race in device mapper

Jeroen van Disseldorp jeroen at dizzl.com
Wed Oct 12 06:30:20 UTC 2005


Neil,

Thanks. It looks like a patch for a very similar bug. However, its
reporter states that 2.6.10 did not have any problems and that the
bug was only introduced in 2.6.11-rc2:

http://bugzilla.kernel.org/show_bug.cgi?id=4946

The problem I'm facing was first noticed in 2.6.7 and continued up
until last week (2.6.11-1-686-smp from Debian testing). I have had no
more problems since I removed the crypto layer and run plain RAID-5.

Reapplying the crypto takes a while, since I have to back up my
entire raidsets first (roughly 500 GB). If you think that testing this
patch would still be useful, let me know.

Cheers, Jeroen



> On Wednesday October 5, jdizzl at xs4all.nl wrote:
>> Hi all,
>>
>> I'm mailing this directly as I'd like an opinion of you guys on whether
>> my hunch could be correct:
>>
>> I've filed a kernel bug report early this year and have been gathering
>> more and more info on it ever since. (Bug #295657.) In the beginning it
>> seemed an EXT3 bug on RAID, that messed up the filesystem. Later I
>> suspected that it might be a problem with 4K stacks, but last week I've
>> come to the conclusion that there might be some sort of race condition
>> in the dm crypto-layer that messes up when decrypting. It's especially
>> easily triggered during heavy disk-io.
>>
>> I'm running 2.6.11-1-686-smp on a dual P2-400 with 2 raidsets. See the
>> bug report for more info:
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=295657
>>
>> Since I removed the crypto-layer from the filesystem I have no more
>> filesystem inconsistencies.
>>
>> Could you please shed your light on this matter to see whether I should
>> file a formal dm-bug? Thanks.
>>
>> Cheers,
>> Jeroen van Disseldorp
>>
>> PS. Neil: Although you're not listed as maintainer, I got a tip from
>> Wichert Akkerman that you might be able to help in this.
>
> Does that kernel contain this patch (2.6.11 didn't)?
>
> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blobdiff;h=249dd6bb66c843ee9e487f124dbe37afab095581;hp=ca8f7a850fe30aa27cb6e3b524464197bd76f3c8;hb=a5453be48e8def75a9c1b2177b82fa0e692c6e3a;f=fs/bio.c
>
> --- fs/bio.c
> +++ fs/bio.c
> @@ -261,6 +261,7 @@ inline void __bio_clone(struct bio *bio,
> 	 */
> 	bio->bi_vcnt = bio_src->bi_vcnt;
>  	bio->bi_size = bio_src->bi_size;
> +	bio->bi_idx = bio_src->bi_idx;
> 	bio_phys_segments(q, bio);
> 	bio_hw_segments(q, bio);
> 	}
>
> If not, could you try a kernel that does and let me know if it still
> fails?
>
> NeilBrown
>
>
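For readers following the thread, the one-line change quoted above makes
a cloned bio inherit the source's current segment index. The C sketch
below is a simplified model of that idea, not kernel code: toy_bio,
clone_without_idx and clone_with_idx are hypothetical names, the struct
keeps only the three fields the quoted hunk touches, and it assumes that
a freshly allocated clone starts with an index of 0.

```c
/* Simplified stand-in for the kernel's struct bio (illustration only). */
struct toy_bio {
    unsigned short vcnt; /* number of segments in the I/O vector */
    unsigned short idx;  /* first segment that still has to be processed */
    unsigned int size;   /* bytes remaining, counted from segment idx */
};

/* Clone as in the hunk before the patch: idx keeps its initial value 0. */
static struct toy_bio clone_without_idx(const struct toy_bio *src)
{
    struct toy_bio clone = {0}; /* assumption: new clones start zeroed */

    clone.vcnt = src->vcnt;
    clone.size = src->size;
    return clone;
}

/* Clone with the added line: the current segment index is copied too. */
static struct toy_bio clone_with_idx(const struct toy_bio *src)
{
    struct toy_bio clone = clone_without_idx(src);

    clone.idx = src->idx;
    return clone;
}
```

In this model, cloning a source that has already advanced to segment 2
gives a clone that starts at segment 0 without the extra line, so the
remaining byte count is applied to the wrong segments. Whether that is
what happens in the dm-crypt setup discussed in this thread is not
established here.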




