Bug#692053: [PATCH] ia64 Iceweasel 10.0 (and above?) randomly stops responding, eating 100% CPU
Stephan Schreiber
info at fs-driver.org
Sun Dec 16 08:20:01 UTC 2012
found 692053 10.0.11esr-1
notfound 692053 10.0~b2-1
# that version is over; nobody wants to patch it anymore, I guess
severity 692053 serious
tags 692053 + patch
thanks
I could reproduce the bug on Wheezy.
Iceweasel also needs a patch for the address range problem on ia64
again since the upstream development has removed it in the meantime. I
filed the separate bug#696041 for that.
After applying the patch of bug#696041, Iceweasel still used to hang.
It turned out that the garbage collector has two bugs that can cause
an endless loop.
The problem is in js/src/jsgc.cpp:2385 in the function DecommitFreePages():
static void
DecommitFreePages(JSContext *cx)
{
JSRuntime *rt = cx->runtime;
for (GCChunkSet::Range r(rt->gcChunkSet.all()); !r.empty();
r.popFront()) {
Chunk *chunk = r.front();
while (chunk) {
ArenaHeader *aheader =
static_cast<ArenaHeader*>(chunk->info.freeArenasHead);
/*
* In the non-failure case, the list will be gone at the end of
* the loop. In the case where we fail, we relink all failed
* decommits into a new list on freeArenasHead.
*/
chunk->info.freeArenasHead = NULL;
while (aheader) {
/* Store aside everything we will need after decommit. */
ArenaHeader *next = aheader->next;
bool success = DecommitMemory(aheader, ArenaSize);
if (!success) {
aheader->next = chunk->info.freeArenasHead;
chunk->info.freeArenasHead = aheader;
continue;
}
size_t arenaOffset =
Chunk::arenaIndex(reinterpret_cast<uintptr_t>(aheader));
chunk->decommittedArenas.set(arenaOffset);
--chunk->info.numArenasFreeCommitted;
--rt->gcNumFreeArenas;
aheader = next;
}
chunk = chunk->info.next;
}
}
}
This code calls a function DecommitMemory(); ArenaSize is constant
4096, defined in js/src/jsgc.h:
const size_t ArenaShift = 12;
const size_t ArenaSize = size_t(1) << ArenaShift;
const size_t ArenaMask = ArenaSize - 1;
The DecommitMemory() is in js/src/jsgcchunk.cpp:
bool
DecommitMemory(void *addr, size_t size)
{
JS_ASSERT(uintptr_t(addr) % 4096UL == 0);
int result = madvise(addr, size, MADV_DONTNEED);
return result != -1;
}
When I debugged the hanging Iceweasel, a thread was in the most inner
while loop in DecommitFreePages()
ArenaHeader *aheader =
static_cast<ArenaHeader*>(chunk->info.freeArenasHead);
/*
* In the non-failure case, the list will be gone at the end of
* the loop. In the case where we fail, we relink all failed
* decommits into a new list on freeArenasHead.
*/
chunk->info.freeArenasHead = NULL;
while (aheader) {
/* Store aside everything we will need after decommit. */
ArenaHeader *next = aheader->next;
bool success = DecommitMemory(aheader, ArenaSize);
if (!success) {
aheader->next = chunk->info.freeArenasHead;
chunk->info.freeArenasHead = aheader;
continue;
}
size_t arenaOffset =
Chunk::arenaIndex(reinterpret_cast<uintptr_t>(aheader));
chunk->decommittedArenas.set(arenaOffset);
--chunk->info.numArenasFreeCommitted;
--rt->gcNumFreeArenas;
aheader = next;
}
aheader was always 0x70011e0100 on each loop iteration since
aheader->next was 0x70011e0100, too.
The question was how this state could occur.
Assume that we are in the first iteration of the loop. aheader is
initialized with the ptr of chunk->info.freeArenasHead.
DecommitMemory is called, for example, with the address 0x70011e0100
and the size 4096 (which is ArenaSize).
DecommitMemory() will fail since madvise() requires a page aligned
addr and length
int madvise(void *addr, size_t length, int advice);
(madvise() returns the EINVAL error code.)
Neither the seen address 0x70011e0100 nor the length 4096 is page
aligned here. Remember that Linux on ia64 can have 4K, 8K, 16K, or 64K
page dependant on the configuration the Kernel was compiled with.
Debian uses 16K on ia64.
So the subsequent if statement executes its body:
if (!success) {
aheader->next = chunk->info.freeArenasHead;
chunk->info.freeArenasHead = aheader;
continue;
}
Here happens what the above comment explains: "...we relink all failed
decommits into a new list on freeArenasHead.".
aheader is inserted as the new head item in the singly linked list
which starts with chunk->info.freeArenasHead. Please note that
chunk->info.freeArenasHead has been cleared above (initialized with
NULL).
aheader->next is initialized with chunk->info.freeArenasHead, which is
NULL this time; chunk->info.freeArenasHead is aheader subsequently.
Not any problem has occurred so far.
The continue statement enters the next iteration of the while loop -
without running the final
aheader = next;
DecommitMemory() is called with the same aheader again and of course
fails again. The body of the if statement runs for the aheader again.
This time aheader->next is initialized with chunk->info.freeArenasHead
which isn't NULL this time; it is the former aheader.
aheader->next is equal to aheader after that. The hang up is complete.
So the considered code has two bugs:
- The garbage collector code gives madvise() non-page-aligned
addresses and lengths since the code assumes 4K page size in
hard-coded manner. Thus, the problem doesn' occur, for example, on x86
or amd64.
- The while loop doesn't correctly iterate forward in the case of a
fail of the memory decommit.
Initially I wanted the garbage collector code use the actual memory
page size, for example, the constants ...
const size_t ArenaShift = 12;
const size_t ArenaSize = size_t(1) << ArenaShift;
const size_t ArenaMask = ArenaSize - 1;
/*
* This is the maximum number of arenas we allow in the FreeCommitted state
* before we trigger a GC_SHRINK to release free arenas to the OS.
*/
const static uint32 MaxFreeCommittedArenas = (32 << 20) / ArenaSize;
/*
* The mark bitmap has one bit per each GC cell. For multi-cell GC things this
* wastes space but allows to avoid expensive devisions by thing's size when
* accessing the bitmap. In addition this allows to use some bits for colored
* marking during the cycle GC.
*/
const size_t ArenaCellCount = size_t(1) << (ArenaShift - Cell::CellShift);
const size_t ArenaBitmapBits = ArenaCellCount;
const size_t ArenaBitmapBytes = ArenaBitmapBits / 8;
const size_t ArenaBitmapWords = ArenaBitmapBits / JS_BITS_PER_WORD;
... should no longer be constants. I think it would be much work.
Furthermore, I found the following code in js/src/jsgc.h:152
/*
* To minimize the size of the arena header the first span is encoded
* there as offsets from the arena start.
*/
static size_t encodeOffsets(size_t firstOffset, size_t lastOffset) {
/* Check that we can pack the offsets into uint16. */
JS_STATIC_ASSERT(ArenaShift < 16);
JS_ASSERT(firstOffset <= ArenaSize);
JS_ASSERT(lastOffset < ArenaSize);
JS_ASSERT(firstOffset <= ((lastOffset + 1) & ~size_t(1)));
return firstOffset | (lastOffset << 16);
}
The
JS_STATIC_ASSERT(ArenaShift < 16);
means that it wouldn't work with 64K memory page size. I guess my idea
would mean a complete rewrite of the code.
Thus, a simplier fix:
- The while loop is fixed in order to iterate correctly in the case of
a fail of the memory decommit.
- The code continues using 4K page size in hard-coded manner. Since
all memory decommits would fail, the chunk->info.freeArenasHead singly
linked list would contain all the items it had before. Thus, we insert
at the begin of the DecommitFreePages() function (the one with the
while loop) the following code which skips the loops when the memory
page size is larger than 4K:
/*
* If the ArenaSize is smaller than a memory page, we don't attempt to
* decommit anything because DecommitMemory() would always fail.
*/
if (ArenaSize < pageSize())
return;
The drawback is that Iceweasel would keep more memory (just as earlier
versions of Iceweasel) on ia64. I think it isn't that bad.
You also need the patches of bug#696041.
Stephan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix-gc-hang-on-large-pages.patch
Type: application/octet-stream
Size: 2441 bytes
Desc: fix-gc-hang-on-large-pages.patch
URL: <http://lists.alioth.debian.org/pipermail/pkg-mozilla-maintainers/attachments/20121216/16c4dcdb/attachment-0001.obj>
More information about the pkg-mozilla-maintainers
mailing list