[Debootloaders-silo] Bug#688521: SILO first boot after power-on or reset fails on Netra T1 200

Mark Morgan Lloyd markMLl.debian at telemetry.co.uk
Mon Sep 24 09:49:45 UTC 2012


 > Looking at the output you see, I have doubts that it has anything
 > to do with SILO though. SILO prints letters 'S', 'I', 'L' and 'O'
 > (appearing before the prompt) after it completes execution of
 > different parts of first-stage loader. As you can see in the code
 > (first/first.S), printing 'S' is the first thing first-stage loader
 > does upon startup. The fact that it is not seen in the console output
 > suggests that even first-stage loader never got to run. The line
 >
 > Boot device: /pci@1f,0/pci@1/scsi@8/disk@0,0:a  File and args:
 >
 > which is normally printed by OBP before control is passed to SILO
 > does not appear in the watchdog-reset case either, which, again,
 > is a strong sign that failure happens before SILO has a chance to run.

OK, but it still boots Squeeze without complaint, and complains when
booting Lenny.

 > In a failure case, how long does it take between you typing 'boot' and
 > "watchdog reset" message being displayed?

About a second.

 > This doc
 > http://docs.oracle.com/cd/E19102-01/n240.srvr/817-5481-11/understanding_wdtimer.html
 >
 > appears to suggest that stuck watchdog would initiate a XIR after 60
 > seconds by default, is it consistent with what you see? What are the
 > values of various variables mentioned there on your system(s)? Does
 > increasing the timeout help?

As far as I can see that's applicable to Solaris and ALOM. The T1 200 
uses the lomlite2 chip.

 > I really can't come up with any reason why it would work for Squeeze
 > but not other releases, so testing all suspect SILO versions on the
 > same machine would be an interesting experiment.

Working backwards through silo_1.4.14+git20120819-1_sparc.deb,
silo_1.4.14+git20100228-1+b1_sparc.deb,
silo_1.4.13a+git20070930-3_sparc.deb and silo_1.4.13-1_sparc.deb resulted
in no change in symptoms. Trying silo_1.4.9-1_sparc.deb resulted in a
system which dumped me straight into BusyBox. Putting the Squeeze disc
back into the system at that point still worked without complaint.

In case I was doing anything obviously wrong: I was fetching each .deb
with wget and then installing it with dpkg -i.
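
For anyone wanting to repeat the same sequence, here is a rough sketch of
it in Python. It assumes the five package files named above have already
been downloaded into the current directory. The helper names
install_command and install are made up for illustration, and dry_run
defaults to True so nothing is installed unless asked for explicitly.

```python
import subprocess

# Package files named in this message, newest first. Assumption: they have
# already been fetched (e.g. with wget) into the current directory.
SILO_DEBS = [
    "silo_1.4.14+git20120819-1_sparc.deb",
    "silo_1.4.14+git20100228-1+b1_sparc.deb",
    "silo_1.4.13a+git20070930-3_sparc.deb",
    "silo_1.4.13-1_sparc.deb",
    "silo_1.4.9-1_sparc.deb",
]


def install_command(deb_path):
    """Return the dpkg command line used to install one package file."""
    return ["dpkg", "-i", deb_path]


def install(deb_path, dry_run=True):
    """Install one package, or only print the command when dry_run is set."""
    cmd = install_command(deb_path)
    if dry_run:
        print(" ".join(cmd))
        return 0
    # check=False: hand the exit status back instead of raising on failure.
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Dry run: list the commands in the order they were tried.
    for deb in SILO_DEBS:
        install(deb)
```

Each install would be followed by a power cycle and a note of whether the
first boot attempt hit the watchdog reset.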

I take Richard's point about it not being caused directly by the LOM 
chip (nothing in its log). The fact that Squeeze (still) works suggests 
that OBP and its variables including nvramrc aren't directly involved. I 
take your point about SILO not being displayed.

Observation (manual transcript follows):

# Booting Squeeze:

OpenBoot 4.0 [...] Ethernet address [...]
ok boot
Bad magic number in disk label
Can't open disk label package
Boot device: disk  File and args:
SILO version 1.4.14
Boot:

# i.e. that works without complaint. Booting Wheezy:

OpenBoot 4.0 [...] Ethernet address [...]
ok boot
[Hex dump here]
Watchdog Reset
Externally Initiated Reset
ok boot
Boot device: /pci@1f,0/pci@1/scsi@8/disk@0,0:a  File and args:
SILO Version 1.4.[...]
boot:

I need to go back and check Lenny (which fails to boot) again; that
'Can't open disk label package' message might be significant.
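
To compare transcripts like the ones above mechanically, something along
these lines might help. It is only a sketch: it relies on the points made
earlier in the thread (OBP prints "Boot device:" before handing over, and
the first-stage loader then prints the letters S, I, L, O as it
progresses), and it treats each "ok boot" line as the start of a new
attempt. The function names are invented for illustration.

```python
def split_attempts(console_text):
    """Split a console transcript into one list of lines per 'ok boot'."""
    attempts, current = [], None
    for line in console_text.splitlines():
        stripped = line.strip()
        if stripped == "ok boot":
            if current is not None:
                attempts.append(current)
            current = []
        elif current is not None:
            current.append(stripped)
    if current is not None:
        attempts.append(current)
    return attempts


def silo_letters(line):
    """Count how many leading characters of line match 'SILO' (any case)."""
    count = 0
    for got, want in zip(line.upper(), "SILO"):
        if got != want:
            break
        count += 1
    return count


def classify_attempt(lines):
    """Say how far one boot attempt got, judging only by console output."""
    boot_idx = next(
        (i for i, ln in enumerate(lines) if ln.startswith("Boot device:")),
        None,
    )
    saw_reset = any(ln.lower().startswith("watchdog reset") for ln in lines)
    if boot_idx is None:
        if saw_reset:
            return "reset before 'Boot device:'"
        return "no 'Boot device:' line"
    for ln in lines[boot_idx + 1:]:
        n = silo_letters(ln)
        if n == 4:
            return "first stage printed SILO"
        # A line consisting only of S, SI or SIL means the loader stalled.
        if 0 < n == len(ln):
            return "first stage stopped after " + ln
    return "'Boot device:' printed, no SILO letters"


if __name__ == "__main__":
    transcript = """ok boot
[Hex dump here]
Watchdog Reset
Externally Initiated Reset
ok boot
Boot device: disk  File and args:
SILO Version 1.4.
boot:
"""
    for number, attempt in enumerate(split_attempts(transcript), start=1):
        print(number, classify_attempt(attempt))
```

Fed the Wheezy transcript, the first attempt lands in the "reset before
'Boot device:'" case and the second in the "first stage printed SILO"
case, which matches the reading that the failure happens before SILO runs.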


