I've been trying to get old versions of SunOS to load under qemu. In
doing so, I've encountered a number of bugs in OBP. I'm not always
certain of the best fix, but I can at least provide a quick hack that
will get people farther along.
1) Error message: "kmem_alloc failed, nbytes 680"
Bug: obp_dumb_memalloc is a bit too dumb. It needs to pick an address
if passed a null address. (According to the comment in the allocator
in OpenSolaris prom_alloc.c (see
<http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/psm/promif/ieee1275/sun4/prom_alloc.c>),
"If virthint is zero, a suitable virt is chosen.")
Quick fix: If passed a null address, start doling out addresses at
10MB and increment by size.
Shortcomings: The quick fix ignores the issue of free() and doesn't
remove memory from the virtual-memory/available node.
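A minimal sketch of the quick fix for #1, assuming a single global
cursor. The names HINTLESS_ALLOC_BASE, next_free_va and pick_alloc_va
are hypothetical and are not taken from the OpenBIOS source; the real
change would live inside obp_dumb_memalloc.

```c
#include <stdint.h>

/* Hypothetical constant: 10MB, the starting address named in the quick fix. */
#define HINTLESS_ALLOC_BASE 0x00A00000u

/* Next address to hand out when the caller passes a null hint. */
static uint32_t next_free_va = HINTLESS_ALLOC_BASE;

/*
 * Honour a non-null hint unchanged; otherwise return the current cursor
 * and advance it by size. Nothing is ever freed, and the
 * virtual-memory/available node is not updated (the shortcomings noted above).
 */
uint32_t pick_alloc_va(uint32_t va_hint, uint32_t size)
{
    uint32_t va;

    if (va_hint != 0)
        return va_hint;

    va = next_free_va;
    next_free_va += size;
    return va;
}
```

With this sketch, a first hintless request for 680 bytes would get
0x00A00000 and the next one would start 680 bytes higher.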
After the quick fix, the boot gets farther, leading us to:
2) Error message: "Unhandled Exception 0x00000080"
Bug: Trap 0 (entry 0x80 in the table, i.e. syscall_trap_4x) is
undefined. This is because the SunOS bootloader installs the trap by
writing code in the trap table, but the trap table is in the .text
section of OpenBIOS. Thus the trap 0 handler simply jumps to "bug".
Quick fix: Move the trap table to the .data section. Insert a "b
entry; nop; nop; nop;" before "bug:".
Shortcomings: Requires the extra "b entry" code. Allows the only VM
copy of the trap table to be permanently changed. OpenBIOS should
copy the read-only trap table to read-write memory (and update %tbr)
upon reset/entry.
3) #2 above actually exposes another bug. The write to the read-only
trap table does not cause an access violation -- instead, it silently
fails. The "std" instruction at 0x403e6c in the bootloader has no
effect.
Bug: Uncertain. It could be a systemic bug in qemu, but it appears
that the VM's MMU believes that the page is writable. That means that
the VM's MMU is not having the access protection flags set for pages
mapped to ROM. It thinks everything is rwx.
Fix?: The VM's MMU should have the access protection flags properly
set for each ROM section. This should probably be done within
OpenBIOS. E.g., .text should be r-x, .data should probably be rwx,
etc.
This is the one fix I'm really not sure how to implement. Any
suggestions? This may be a problem that only affects this bootloader,
so fixing #2 above may be all that's strictly necessary. But I'm not
positive that this bug doesn't have other ill effects I haven't found
yet.
At any rate, fixing #2 gets us still further, to:
4) Error messages:
"obp_devopen(sd(0,0,0):d) = 0xffd8e270
obp_inst2pkg(fd 0xffd8e270) = 0xffd57f44
obp_getprop(0xffd57f44, device_type) (not found)"
Bug: The OpenBIOS "interpose" implementation is not transparent to
non-interposition-aware code (in violation of the interposition spec).
The inst2pkg call in this sequence returns the phandle for
/packages/misc-files, instead of the proper phandle.
Quick fix: Comment out the "interpose disk-label" lines in ob_sd_open.
Shortcomings: It disables disk-label. The correct fix is to fix the
underlying problem with interposition, but I'm not sure exactly what
it is. Could someone help?
Fixing #4 gets us quite a bit further, until:
5) Error message:
"Unhandled Exception 0x00000009
PC = 0xf0138b20 NPC = 0xf0138b24
Stopping execution"
Bug: The instruction is trying to read from 0xfd020000+4, which is an
invalid address. This address isn't mapped by OBP by default on Sun
hardware, so the bootloader must be either (a) trying to map this
address and failing silently or (b) skipping the mapping for some
reason. The instruction is hard-coded to look at this absolute address.
Fix: Unknown. This may be another instance of writes silently
failing, hence my interest in #3 above. It could also be a
side-effect of the quick fix for #4.
6) Error message:
"BAD TRAP: cpu=0 type=9 rp=fd008f0c addr=feff8008 mmu_fsr=3a6 rw=2
MMU sfsr=3a6: Invalid Address on supv data store at level 3
regs at fd008f0c:
psr=4400fc7 pc=f00053f4 npc=f00053f8
..."
Bug: Real sun4m hardware registers 4 CPU-specific interrupts followed
by a system-wide interrupt, regardless of the number of CPUs
installed. The same is true of counters. SunOS looks at the 5th
interrupt for the system-wide interrupt. OBP, since there's only one
CPU, just sets up one CPU-specific interrupt followed by the
system-wide interrupt, so there is no 5th interrupt. See the comment
on "NCPU" at
<http://stuff.mit.edu/afs/athena/astaff/project/opssrc/sys.sunos/sun4m/devaddr.h>.
Fix: in obp_interrupt_init() and obp_counter_init() register 4
CPU-specific interrupts before allocating the system-wide interrupt.
The kernel will then map the 5th interrupt to the system-wide
interrupt.
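The layout described in #6 can be sketched as follows. This only
illustrates the ordering (4 CPU-specific slots, then the system-wide
slot); build_intr_regs, its parameters, and the base/stride values a
caller would pass are all hypothetical and do not reflect the actual
obp_interrupt_init() or obp_counter_init() code.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of per-CPU slots SunOS expects, regardless of CPUs installed. */
#define SUN4M_NCPU 4

/*
 * Fill regs[] with SUN4M_NCPU per-CPU register bases followed by the
 * system-wide base, so the system-wide entry is always the 5th one.
 * Returns the number of entries written.
 */
size_t build_intr_regs(uint32_t cpu_base, uint32_t cpu_stride,
                       uint32_t sys_base, uint32_t regs[SUN4M_NCPU + 1])
{
    size_t i;

    for (i = 0; i < SUN4M_NCPU; i++)
        regs[i] = cpu_base + (uint32_t)i * cpu_stride;

    regs[SUN4M_NCPU] = sys_base;
    return SUN4M_NCPU + 1;
}
```

The same ordering would apply to the counters.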
7) Error message:
"BAD TRAP: cpu=0 type=9 rp=fd008d8c addr=7ff000 mmu_fsr=126 rw=1
MMU sfsr=126: Invalid Address on supv data fetch at level 1
regs at fd008d8c:
psr=4000cc4 pc=f01339a4 npc=f01339a8
..."
Bug: The command-line arguments passed to the kernel are fixed at
address 0x7FF000 (CMDLINE_ADDR, passed from qemu via nv_info.cmdline),
which is no longer mapped by the time the kernel looks at the boot
arguments. A regular Sun boot ROM will copy this into mapped memory.
Fix: Copy the string in nv_info.cmdline to an OpenBIOS global (since
OpenBIOS continues to be mapped) in ob_nvram_init().
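A rough sketch of the copy for #7, assuming a fixed-size global
buffer. The buffer name, its 256-byte size, and save_cmdline are
hypothetical; in OpenBIOS the copy would be done inside
ob_nvram_init() with the string from nv_info.cmdline as the source.

```c
#include <stddef.h>

/* Hypothetical size; long command lines are truncated to fit. */
#define SAVED_CMDLINE_MAX 256

/* Lives in OpenBIOS data, which stays mapped after the kernel takes over. */
static char saved_cmdline[SAVED_CMDLINE_MAX];

/*
 * Copy src into the global buffer, always NUL-terminating, and return
 * a pointer to the copy so the boot arguments can point there instead
 * of at the original fixed address.
 */
const char *save_cmdline(const char *src)
{
    size_t i;

    for (i = 0; i < SAVED_CMDLINE_MAX - 1 && src[i] != '\0'; i++)
        saved_cmdline[i] = src[i];

    saved_cmdline[i] = '\0';
    return saved_cmdline;
}
```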
8) Error message:
"BAD TRAP: cpu=0 type=9 rp=fd008dec addr=1019000 mmu_fsr=126 rw=1
MMU sfsr=126: Invalid Address on supv data fetch at level 1
regs at fd008dec:
psr=4400cc5 pc=f0131680 npc=f0131684
..."
Bug: The dumb memory allocator from bug #1 was allocating a range that
the SunOS 4 kernel doesn't like.
Fix: Mimic the Sun boot ROM allocator: the top of the heap should be
at 0xFFEDA000 and allocations should return descending addresses. So,
for example, if asking for 0x1000 bytes, the first returned pointer
should be 0xFFED9000.
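The descending allocation in #8 can be sketched like this. SUN_HEAP_TOP
comes from the address given above; heap_cursor and alloc_descending
are hypothetical names, and the sketch does no alignment, bounds
checking, or freeing.

```c
#include <stdint.h>

/* Top of the heap, as used by the Sun boot ROM allocator described above. */
#define SUN_HEAP_TOP 0xFFEDA000u

/* Lowest address handed out so far; starts at the top of the heap. */
static uint32_t heap_cursor = SUN_HEAP_TOP;

/* Move the cursor down by size and return the new (lower) address. */
uint32_t alloc_descending(uint32_t size)
{
    heap_cursor -= size;
    return heap_cursor;
}
```

The first call with 0x1000 returns 0xFFED9000, matching the example
above; each later call returns an address below the previous one.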
9) Error message:
"BAD TRAP: cpu=0 type=9 rp=fd008d2c addr=b1b91000 mmu_fsr=126 rw=1
MMU sfsr=126: Invalid Address on supv data fetch at level 1
regs at fd008d2c:
psr=4900cc3 pc=f0142c04 npc=f0142c08
..."
Bug: The precise underlying cause isn't clear. The bug appears to be
due to a difference between OBP's behavior and stock Sun behavior.
Fix: Add the "cache-physical?" property to the CPU node in
ob_nvram_init() and bump the "mmu-nctx" property up to 4096 (from
256).
git-svn-id: svn://coreboot.org/openbios/openbios-devel@114 f158a5a8-5612-0410-a976-696ce0be7e32
The current esp code will perform partial reads on devices with sector sizes
> 512. The attached patch makes it read the whole sector from the device, and
uses the additional data if multiple blocks are requested.
I'm not sure if the old behavior is technically wrong, but it confused me when
debugging the qemu device emulation :-)
Paul
git-svn-id: svn://coreboot.org/openbios/openbios-devel@87 f158a5a8-5612-0410-a976-696ce0be7e32
This patch fixes SMP booting:
Entering SMP Mode...
Starting CPU 1 at f01e46e4
Calibrating delay loop... 180.63 BogoMIPS
Starting CPU 2 at f01e46f0
Calibrating delay loop... 181.86 BogoMIPS
Starting CPU 3 at f01e46fc
Calibrating delay loop... 182.68 BogoMIPS
Total of 4 Processors activated (721.71 BogoMIPS).
Though, depending on the version, Linux usually hangs later.
git-svn-id: svn://coreboot.org/openbios/openbios-devel@82 f158a5a8-5612-0410-a976-696ce0be7e32
The ESP SCSI driver currently doesn't check whether a DMA request has
completed before checking its status. On older qemu versions this
works ok because DMA happens instantly. On newer qemu, DMA can take an
indeterminate amount of time to complete, just like on real
hardware.
The patch waits for the controller to raise the DMA interrupt after
initiating a DMA request.
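The wait can be sketched as a simple polling loop. This is only an
illustration: wait_for_dma_irq, the status register pointer, the
interrupt mask, and the poll limit are all hypothetical, and the
actual patch may wait differently (for example, without a limit).

```c
#include <stdint.h>

/*
 * Poll a status register until any bit in irq_mask is set, meaning the
 * controller has raised its DMA interrupt. Returns 0 once the interrupt
 * is seen, or -1 if max_polls reads pass without it (a guard added in
 * this sketch so a missing interrupt cannot hang the caller forever).
 */
int wait_for_dma_irq(volatile const uint32_t *status_reg,
                     uint32_t irq_mask, unsigned long max_polls)
{
    while (max_polls-- > 0) {
        if (*status_reg & irq_mask)
            return 0;
    }
    return -1;
}
```

A caller would start the DMA request, call this, and only then read
the transfer status.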
git-svn-id: svn://coreboot.org/openbios/openbios-devel@73 f158a5a8-5612-0410-a976-696ce0be7e32