Commit 76fee95 ("slof: Only close stdout for virtio-serial devices")
says that commit cf28264 ("virtio-serial: Rework shutdown sequence")
fixed a hang. The problem was believed to be that it was necessary to
close stdout to shut down the underlying virtio device.
Commit cf28264 ("virtio-serial: Rework shutdown sequence") closed stdout
on quiesce. This meant that when prom_init() called write on stdout after
quiesce, there was a use after free, which is unreliable and can also
hang (especially after reboots).
Quiescing is intended to put hardware into a safe state for the client
to take over. It is incorrect for SLOF to close ihandles that the client
could still be using, even after a quiesce.
Rather than closing the stdout device, all that needs to happen is to
ensure virtio-serial-shutdown gets called. On quiesce, close the virtio
device, but leave the stdout device itself open.
Commit 8174acd ("virtio-serial: Close device completely") handles reads
and writes as no-ops if the underlying virtio device is closed so there
is no problem with the client calling "write" on stdout after this, but
no output will be displayed.
Fixes: cf28264 ("virtio-serial: Rework shutdown sequence")
Debugged-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
Co-developed-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The read and write methods return successfully even if the virtio device
is closed (virtiodev is 0) and it is not able to send or receive any
characters.
Make the read and write methods return 0 to indicate they did not
succeed in this case.
This also fixes an invalid stack access in the read method.
Fixes: 8174acd ("virtio-serial: Close device completely")
Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Currently, go-64 is used for booting a kernel from qemu (i.e. -kernel).
However, there is an expectation from users that this should be able to
boot not just vmlinux kernels but things like zImages too.
The bootwrapper of a BE zImage is a 32-bit ELF. Attempting to load that
with go-64 means that it will be run with MSR_SF set (64-bit mode). This
crashes early in boot (usually due to what should be 32-bit operations
being done with 64-bit registers eventually leading to an incorrect
address being generated and branched to).
Note that our 64-bit payloads are prepared to enter with MSR_SF cleared
and set it themselves very early.
Add a new word named go-direct that will execute any simple payload
in-place and will enter with MSR_SF cleared. This allows booting a BE
zImage from qemu with -machine kernel-addr=0.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Add a new firmware API call with the name 2HASH-EXT-LOG that will be used
by trusted grub for measuring, logging, and extending TPM PCRs.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The #define in version.h does not match the #ifndef in the line before
due to a typo in the suffix ("_F" instead of "_H"). Fix it, and while
we're at it, also remove the underscore at the beginning to avoid that
we're using an identifier here that is reserved by the C standard.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
These likely were a blind copy-n-paste from hvterm.fs, but they
simply do not make any sense in virtio-serial.fs. The hvterm.fs is
always included from OF.fs, so the serial-* words are globally there.
virtio-serial.fs is only used within the virtio-serial device tree
nodes, so adding the serial-* words there is just superfluous.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
A loop over peers does not need recursion, which becomes a problem with
hundreds of devices.
This was discovered with "-smp 2048,cores=512,threads=4".
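A minimal sketch in Python (not the Forth used by SLOF) of why looping over
peers helps: the call depth then follows the depth of the tree rather than
the number of siblings. The Node class and all function names here are
hypothetical and only illustrate the idea.

```python
class Node:
    """Hypothetical device-tree node with first-child / next-peer links."""

    def __init__(self, name):
        self.name = name
        self.child = None   # first child
        self.peer = None    # next sibling


def count_nodes_recursive(node):
    """Recurses over peers too: call depth grows with the number of siblings."""
    if node is None:
        return 0
    return 1 + count_nodes_recursive(node.child) + count_nodes_recursive(node.peer)


def count_nodes_iterative_peers(node):
    """Loops over peers; recursion is used only to descend into children."""
    total = 0
    while node is not None:
        total += 1 + count_nodes_iterative_peers(node.child)
        node = node.peer
    return total


def make_wide_tree(n_peers):
    """Build a root whose n_peers children are chained as peers."""
    root = Node("root")
    prev = None
    for i in range(n_peers):
        cpu = Node(f"cpu@{i}")
        if prev is None:
            root.child = cpu
        else:
            prev.peer = cpu
        prev = cpu
    return root
```

With thousands of peers under one parent, the looping variant only ever
nests two calls deep, while the fully recursive variant nests one call per
peer.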
Suggested-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
On QEMU pseries (and similar environments) the PC starts at 0x100, hence SLOF
starts at address 0x100, not at 0x0 as the current comment states. With
this fix the comment also matches the comment above it about the __start
load position, which is correct.
Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Recent commit cf28264196 fixed an issue where a virtio-serial device
wouldn't shut down properly during quiesce. The fix is to close stdout
just before quiesce. As expected this causes some messages to not
appear anymore, like the well known ones from prom_init():
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000002000000 ...
Actually all messages are discarded until the OS driver finally takes
control of the device, which may represent a fair amount of logging.
This is suboptimal but this is still better than hanging in SLOF.
The hammer is a bit too big though because the change also affects
spapr-vty based consoles, which have no reason to stop working
after quiesce.
Move the hack from the common code to the virtio-serial code so that
it doesn't affect other device types anymore. Register a quiesce hook
that closes stdout in virtio-serial.fs.
While here, as suggested by Segher, bring back some robustness in the
shutdown method.
Reported-by: Fabiano Rosas <farosas@linux.ibm.com>
Fixes: cf28264196 "virtio-serial: Rework shutdown sequence"
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This reverts commit 674d0d0cf6 ("rtas: Reserve space for FWNMI log")
which expanded the RTAS blob size to match the QEMU expectation about
the RTAS area available for FWNMI logs.
Instead, it relies on QEMU passing the "rtas-size" property and passes it
through untouched. This adds a check that QEMU allocated enough for the
RTAS blob. This adds a fallback to the default "rtas-size" of 20 bytes if
none is specified by QEMU.
While we are here, replace 's" /rtas" find-node' with 'rtas-node' which
we just set above.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Linux closes stdout at the end of prom_init which triggers the FW quiesce
code which closes the virtio-serial instance. This misses stopping the
virtio queues. However, this appeared to work for a little longer (until the
Linux driver took over) until 300384f3dc moved the VQ descriptors
around, which caused use-after-free corruption.
This adds virtio_queue_term_vq(), cleanup in the forth driver and a few
checks.
Fixes: 300384f3dc ("virtio: Store queue descriptors in virtio_device")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[groug: - fix changelog
- don't restore emit]
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The "io" word of term-io.fs opens two separate instances of the device
for stdin and stdout. The prom_init() function in Linux closes stdin at
some point, which internally calls quiesce and shuts the device down
through a quiesce hook.
When the "open-count" variable in virtio-serial.fs reaches 0, i.e. when
closing the last instance, we call "close" two times, which is clearly
wrong. This never hits, however, because the stdout instance is never
closed, which prevents "open-count" from reaching 0.
It would make more sense to shut down the device when closing the last
instance, for symmetry with the first open that initializes the device.
Change the shutdown sequence to do that rather than relying on a quiesce
hook.
Have quiesce explicitly close stdout, which is supposedly the last
instance, and shut down the device.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
term-io.fs already overrides "emit", "key" and "key?" with its own version:
- "term-io-emit" calls the "write" method of the "stdout" instance, which
then calls "virtio-serial-putchar"
- "term-io-key" calls the "read" method of the "stdin" instance, which then
calls "virtio-serial-getchar"
- "term-io-key?" calls "serial-key?" if the device is a serial device, which
is the case here and we already override "serial-key?" with
"virtio-serial-term-key?".
It thus looks weird to rely on these shortcuts. Especially, when IOMMU is
enabled, we need a valid instance in "dmap-map-in" and going through
"term-io-emit" buys us that.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Measure and log the GPT table including LBA1 and all GPT table entries
with a non-zero Type GUID.
We follow the specification "TCG PC Client Platform Firmware Profile
Specification" for the format of what needs to be logged and measured.
See section "Event Logging" subsection "Measuring UEFI Variables" for
the UEFI_GPT_DATA structure.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Implement a TPM 2 menu and enable the user to clear the TPM
and activate its PCR banks.
The main TPM menu is activated by pressing the 't' key during
firmware startup.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This patch adds TPM 2.0 support along with the firmware API that Linux
uses to transfer the firmware log.
The firmware API follows the "PFW Virtual TPM Driver" specification.
The API has callers in existing Linux code (prom_init.c) from TPM 1.2
times but the API also works for TPM 2.0 without modifications.
The TPM 2.0 support logs PCR extensions of measurements of code and data.
For this part we follow the TCG specification "TCG PC Client
Platform Firmware Profile Specification" (section "Event Logging").
Other relevant specs for the construction of TPM commands are:
- Trusted Platform Module Library; Part 2 Structures
- Trusted Platform Module Library; Part 3 Commands
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
[aik: removed new blank lines at EOF]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This patch adds a TPM driver for the CRQ interface as used by
the QEMU PAPR implementation.
Also add a Readme that explains the benefits and installation procedure
for the vTPM.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Make the print_version global variable accessible so that
we can measure the firmware version.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
We recently fixed node creation at CAS in order to support early hotplug
of devices between boot and CAS. Let's handle node removal now to support
early hot *un*plug of devices.
This is achieved by associating a generation number to each FDT received
from QEMU and tagging all nodes with this number in a "slof,from-fdt"
property. The generation number is kept in the fdt-generation# variable.
It starts at 0 for the initial boot time FDT, and it is incremented at
each subsequent CAS. All boot time nodes hence get "slof,from-fdt" == 0,
all nodes present at CAS get "slof,from-fdt" == 1 and so on in case the
guest calls CAS again. If a device gets hot unplugged before quiesce, we
hence can detect it doesn't have the right generation number and thus
delete the node from the DT. Note that this only affects nodes coming
from the FDT. Nodes created by SLOF don't have the "slof,from-fdt"
property, and therefore cannot be candidates to deletion.
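The generation scheme above can be sketched as follows. This is a Python
model under stated assumptions, not the SLOF implementation: the tree is
represented as a plain dict from node path to properties, and apply_fdt and
prune_stale_nodes are hypothetical names.

```python
FROM_FDT = "slof,from-fdt"


def apply_fdt(tree, fdt_nodes, generation):
    """Tag every node present in this FDT with the current generation number."""
    for path, props in fdt_nodes.items():
        node = tree.setdefault(path, {})
        node.update(props)
        node[FROM_FDT] = generation


def prune_stale_nodes(tree, generation):
    """Delete FDT-originated nodes whose tag differs from `generation`.

    Nodes without the tag were created by the firmware itself and are kept.
    """
    stale = [path for path, props in tree.items()
             if FROM_FDT in props and props[FROM_FDT] != generation]
    for path in stale:
        del tree[path]
    return stale
```

For example, a node tagged 0 at boot that is absent from the CAS-time FDT
keeps its old tag and is removed when pruning with generation 1, while a
firmware-created node without the property is never touched.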
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Now that QEMU passes a full FDT at CAS without rebooting, a guest that
has switched from XICS to XIVE ends up being presented a malformed
"interrupt-controller" node in the DT:
# dtc -I fs -O dts /proc/device-tree
<stdout>: Warning (unit_address_vs_reg): /interrupt-controller: node has a reg or ranges property, but no unit name
...
interrupt-controller {
ibm,xive-eq-sizes = <0x10>;
device_type = "power-ivpe";
ibm,interrupt-server-ranges = <0x00 0x03>;
compatible = "ibm,power-ivpe";
#interrupt-cells = <0x02>;
reg = <0x60302 0x31b0000 0x00 0x10000 0x60302 0x31a0000 0x00 0x10000>;
phandle = <0xe7448a8>;
ibm,xive-lisn-ranges = <0x00 0x03>;
interrupt-controller;
};
The node should have its unit set to "60302031b0000" as reported by dtc.
Also the node still has an "ibm,interrupt-server-ranges" property which
only makes sense with XICS.
This happens because we find an existing "interrupt-controller" node,
which describes a XICS controller, and we _wrongly_ decide to copy
all the properties from the new node into it. Delete the existing node
instead so that we create a new node with the appropriate properties
and unit name.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
So far we only allowed new ibm,dynamic-reconfiguration-memory and memory
nodes in the FDT update blob at ibm,client-architecture-support (CAS).
DRC nodes do not have unit addresses and are easy; for memory nodes we use
an address from the node name.
For early hot plugged PCI devices (plugged after reset but before CAS)
we have to have a similar hack as for memory@ but parse the address
differently because of different binding.
Instead, this changes new nodes creation. At pass#0 when we copy phandles
from the FDT update blob to SLOF, we create new nodes with all
new properties and call "finish-device" only after all properties are
copied to the new nodes. At this point we particularly care about "reg"
as this is the unit address which SLOF parses for us and sets the unit
address in "finish-device"; we could skip other properties for later
passes.
Note this creates naked nodes with no methods normally added to the nodes
as this bypasses normal discovery which SLOF performs at start. So
if pass#1 does not find the node created in pass#0, this points to
missing "decode-unit" at the new node parent (happens when adding bridge-
under-bridge) and this prints a message and resets.
While at this, fix a few trailing spaces and comments.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[groug: - use fdt-reg-unit to set the unit name
- consolidate finish-device and unit name for nodes and subnodes
with a new fdt-cas-finish-device word ]
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The previous approach to merge the QEMU FDT into the existing tree and
then patch it turned out to be broken as we patch properties based on their
names only so we patch not just what QEMU provides (which was
the intention) but also all properties SLOF created. This breaks one of
them - "interrupt-map" - it is created by QEMU for a PHB but SLOF creates
it for PCI bridges and since they have different sizes, patching phandles
at fixed offset fails.
Rather than skipping certain nodes in the SLOF tree, this uses a different
approach: now we read the QEMU FDT in 3 passes:
1. find all phandle/linux-phandle properties and store these in the SLOF
internal tree to allow phandle->node lookup later;
2. walk through all FDT properties, patch them if needed using
phandles from the SLOF tree and save patched values in SLOF properties;
3. delete phandle/linux-phandle properties created in 1. This is safe
as SLOF does not create these properties anyway.
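The three passes above can be sketched roughly as below. This is a
simplified Python model, not the Forth code: trees are dicts from node path
to properties, only "interrupt-parent" is treated as a phandle reference
(an assumption made to keep the example short), and merge_fdt and
slof_phandle_of are hypothetical names.

```python
PHANDLE_PROPS = ("phandle", "linux,phandle")
# Assumption: properties whose whole value is a single phandle reference.
REFERENCE_PROPS = ("interrupt-parent",)


def merge_fdt(slof_tree, fdt, slof_phandle_of):
    """Merge `fdt` into `slof_tree` in three passes.

    slof_tree / fdt: dict mapping node path -> dict of properties.
    slof_phandle_of: callable mapping a node path to the firmware's phandle.
    """
    # Pass 1: record QEMU phandles in the firmware tree for later lookup.
    for path, props in fdt.items():
        node = slof_tree.setdefault(path, {})
        for name in PHANDLE_PROPS:
            if name in props:
                node[name] = props[name]

    # Build the QEMU phandle -> firmware phandle map from the firmware tree.
    lookup = {}
    for path, props in slof_tree.items():
        for name in PHANDLE_PROPS:
            if name in props:
                lookup[props[name]] = slof_phandle_of(path)

    # Pass 2: copy only FDT properties, patching references on the way.
    for path, props in fdt.items():
        for name, value in props.items():
            if name in PHANDLE_PROPS:
                continue
            if name in REFERENCE_PROPS:
                value = lookup.get(value, value)
            slof_tree[path][name] = value

    # Pass 3: drop the temporary phandle properties created in pass 1.
    for props in slof_tree.values():
        for name in PHANDLE_PROPS:
            props.pop(name, None)
```

Because pass 2 walks the FDT rather than the firmware tree, properties that
the firmware created on its own nodes are never patched, even when they
share a name with a QEMU-provided property.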
Fixes: 44d06f9e68 ("fdt: Update phandles after H_CAS")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
fdt-fix-cas-node returns the end address after it's finished which
the caller (ibm,client-architecture-support) does not use or drop.
This renames fdt-fix-cas-node to (fdt-fix-cas-node) and adds a wrapper
on top of that which does the drop. This will be used later for 2-pass
tree patching.
While at this, exit the function if memory allocation failed.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
They call parent node (which is a device) methods.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The immediate problem with the code is that it relies on memory allocator
aligning addresses to the size. This is true for SLOF but not for GRUB
and in unaligned situations we end up mapping more pages than bm-alloc
allocated.
This fixes the problem by calculating aligned DMA size before calling
bm-alloc.
While at this, simplify the code by removing global variables. Also
replace 1000/fff (the default 4K IOMMU page size) with tce-ps/mask.
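As an illustration of the size calculation, here is a minimal Python sketch
of rounding a possibly unaligned buffer out to whole IOMMU pages. The
function name is hypothetical and the 4K default mirrors the page size
mentioned above; the actual change is in Forth and may compute this
differently.

```python
TCE_PAGE_SIZE = 0x1000  # default 4K IOMMU page size


def aligned_dma_size(addr, size, page_size=TCE_PAGE_SIZE):
    """Bytes of whole IOMMU pages needed to cover [addr, addr + size)."""
    mask = page_size - 1
    start = addr & ~mask                  # round the start down to a page
    end = (addr + size + mask) & ~mask    # round the end up to a page
    return end - start
```

A 4K buffer that starts in the middle of a page spans two pages, so the
allocation has to be sized for two pages rather than one.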
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
Changes:
v4:
* fixed code comments, tab/spaces
* fixed bm-alloc failure handling
At the moment SLOF generates phandles, with a few exceptions such as
an interrupt controller (XICS/XIVE) and NVLink-related nodes. For these
nodes QEMU generates phandles which SLOF later detects and replaces with
the node addresses (which are phandles in SLOF).
However we are missing these updates when processing
the ibm,client-architecture-support client interface call - SLOF calls
QEMU with H_CAS to get an update for the device tree, and if that blob
contains phandles, they make it to the final tree unchanged with
undefined results.
This calls fdt-fix-phandles for the H_CAS update blob.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
We implement RTAS as a simple binary blob which calls directly into QEMU
via a custom hcall. So far we were relying on QEMU putting the RTAS blob
to the guest memory with its location in linux,rtas-base/rtas-size.
The problems with this are:
1. we need to pick a location in the guest ram in addition to slof, FDT
and sometimes kernel and init ram disk; having one less image makes QEMU's
life easier.
2. for secure VMs, it is yet another image which needs to be signed and
verified.
This implements "instantiate-rtas" completely in SLOF, including KVM PR
support ("broken sc1").
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
The spapr-vscsi device of QEMU supports multiple channels (a.k.a. buses).
But when QEMU is started with a device on a bus > 0, SLOF fails to detect
the device, so that the boot fails. For example:
qemu-system-ppc64 -nodefaults -nographic -serial stdio -device spapr-vscsi \
-blockdev driver=file,filename=/path/to/cdrom.iso,node-name=d1,read-only=on \
-device scsi-cd,id=cd1,drive=d1,channel=6,scsi-id=5,lun=1
Thus SLOF should scan the various channels for bootable SCSI devices, too.
Since the common SLOF code for scanning SCSI devices has no notion of
"channels" or "bus", we simply fake the bus ID to be part of the target
ID, so instead of supporting 64 targets = 64 devices, we now support
8 channels * 64 targets = 512 devices.
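One way to picture folding the channel into the target ID is the Python
sketch below. The exact encoding used by the patch is not shown here, so
the multiply-and-add layout and the function names are assumptions.

```python
TARGETS_PER_CHANNEL = 64
CHANNELS = 8


def combine(channel, target):
    """Fold a (channel, target) pair into one flat ID in the range 0..511."""
    if not (0 <= channel < CHANNELS and 0 <= target < TARGETS_PER_CHANNEL):
        raise ValueError("channel or target out of range")
    return channel * TARGETS_PER_CHANNEL + target


def split(combined):
    """Recover (channel, target) from a flat ID."""
    return divmod(combined, TARGETS_PER_CHANNEL)
```

The generic scanning code can then iterate over 512 flat IDs without
knowing about channels, and only the vscsi layer splits them again.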
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1663160
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
QEMU supports up to 64 SCSI IDs on the vscsi "bus", see the string
"max_target = 63" in the source file hw/scsi/spapr_vscsi.c of QEMU.
However, SLOF currently only checks the first 9 IDs on the vscsi adaptor,
so when you try to boot from a CD-ROM like this, the boot fails:
qemu-system-ppc64 ... -device spapr-vscsi,id=scsi0,reg=0x2000 \
-drive file=/path/to/cdrom.iso,format=raw,if=none,id=dr1,readonly=on \
-device scsi-cd,bus=scsi0.0,channel=0,scsi-id=63,lun=1,drive=dr1,id=scd1
Thus let's change the number of IDs that we scan in SLOF to 64, too, to
match the ID range that QEMU provides.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The NVIDIA driver for NVLink2-capable GPU NVIDIA V100 discovers topology
between GPU/NPUs/GPURAM via the device tree which needs to have cross
references between device tree nodes.
This adds patching of the nodes needed for the driver to initialize.
As all these properties only contain phandles and nothing else, there is
no risk of accidental damage.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
* added the commit log
We generate a fake XICS phandle in QEMU and SLOF replaces that phandle
with the real one (i.e. SLOF's node address) in interrupt-parent and
interrupt-map properties. These properties are handled differently -
the interrupt-map is fixed in place while interrupt-parent is
decoded+encoded+set as a property.
This changes interrupt-parent fixing code to do what the interrupt-map
code does because soon we are going to have more phandles to fix and some
contain an array of phandles (such as "ibm,npu").
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v2:
* removed fdt-replace-l,
When compiling with a very recent toolchain, I get these warnings:
../../llfw/boot_abort.S: Assembler messages:
../../llfw/boot_abort.S:76: Warning: invalid register expression
and:
stage2_head.S: Assembler messages:
stage2_head.S:57: Warning: invalid register expression
The first one is using the wrong opcode, we should use "and" instead of
"andi" here. The second one is using a register instead of a constant
for load-immediate, which is nonsense, too. Fix it to use the right
constant instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Linux kernel commit 2a9d832cc9aae21ea827520fef635b6c49a06c6d
(of: Add bindings for chosen node, stdout-path) deprecated the chosen properties
"linux,stdout-path" and "stdout".
Check for new property "stdout-path" first and as a fallback check
"linux,stdout-path". This older property can be deprecated after 5 years.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
At the moment we count on the guest kernel to update or create device
tree properties pointing to the instantiated RTAS copy which is not
very reliable.
This stores rtas-base and rtas-size in the DT at the instantiation
point so later on the H_UPDATE_DT hcall can supply QEMU with an updated
location of RTAS.
This supersedes f9a60de30 "Add private HCALL to inform updated
RTAS base and entry".
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v5:
* ditched rtas-entry, added rtas-size (which is always 20 bytes though)
The existing code hardcodes the length of /openprom/model to 10 characters
even though it is less than that - len("aik")==3. All 10 chars go to
the device tree blob and DTC complains on such a property as there are
characters after terminating null:
aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb
Warning (model_is_string): "model" property in /openprom is not a string
This uses the real length and limits it by 10 to avoid breaking something.
Since the same code parses the build id field, this moves from-cstring
to a common place for both js2x and qemu boards.
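The length handling can be illustrated with a short Python sketch. It only
mirrors the idea of "real length, capped at 10"; the actual from-cstring
word is Forth and its stack signature is not reproduced here.

```python
MAX_MODEL_LEN = 10


def from_cstring(buf, max_len=MAX_MODEL_LEN):
    """Return the bytes before the first NUL, looking at no more than max_len bytes."""
    field = buf[:max_len]
    end = field.find(b"\x00")
    return field if end < 0 else field[:end]
```

With a 10-byte field holding "aik" followed by NUL padding, only the three
real characters end up in the property, so no bytes follow the terminator.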
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This creates a flattened device tree and passes it to QEMU via a custom
hypercall right before jumping to RTAS.
This preloads strings with 40 property names from CPU and PCI device nodes
and the strings lookup only searches within these.
Test results on a guest with 256 CPUs and 256 virtual Intel E1000 devices
running on a POWER8 box:
FDTsize=366024 Strings=15888 Struct=350080 Reused str=12457 242 ms
A simple guest (one CPU, no PCI) with this patch as is:
FDTsize=15940 Strings=3148 Struct=12736 Reused str=84 7 ms
While we are here, fix the version handling in fdt-init. It only matters
a little for the fdt-debug==1 case though.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v6:
* fix memory sizes for free-mem
* store correct chosen-cpu to the header (used to be just 0)
* fdt-skip-string uses zcount now and works 30% faster
* moved to a new file - fdt-fl.fs
v5:
* applied latest comments from Segher
* s/fdt-property/fdt-copy-property/, s/fdt-properties/fdt-copy-properties/
* reduced the temporary buffers to 1MB each as the guest uses 1MB in total
anyway
* do not pass root phandle to fdt-flatten-tree, it fetches it from
device-tree itself
* reworked fdt-copy-properties to use for-all-words proposed by Segher
v4:
* reworked fdt-properties, works lot faster
* do not store "name" properties as nodes have names already
v3:
* fixed stack handling after hcall returned
* fixed format versions in both rendering and parsing paths
* rebased on top of removed unused hvcalls
* renamed used variables to have fdtfl- prefixes as there are already
some for parsing the initial dt
v2:
* fixed comments from review
* added strings cache
* changed last_compat_vers from 0x17 to 0x16 as suggested by dwg
---
I tested the blob by storing it from QEMU to a file and decompiling it.
This fixes the version handling in fdt-init. It only matters
a little for the fdt-debug==1 case though.
This defines /fdth word for the FDT header size; this will be used
in the next patch.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
This replaces current set-chosen-cpu with a cleaner and faster
implementation which does not clobber the current node and stores
the chosen CPU phandle/ihandle.
This adds a helper to get the chosen CPU unit address.
This moves chosen cpu words to root.fs as otherwise it is quite hard
to maintain dependencies.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
This reverts commits:
604d28cc3 "board-qemu: add private hcall to inform host on "phandle" update"
089fc18a9 "libhvcall: drop unused KVMPPC_H_REPORT_MC_ERR and KVMPPC_H_NMI_MCE defines"
1c17c13a5 "rtas: Improve error handling in instantiate-rtas"
f9a60de30 "Add private HCALL to inform updated RTAS base and entry"
A bigger hammer is coming soon which will pass the entire device
tree to QEMU, not just some random bits.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
QEMU can now set environment variables from the command line (with -prom-env).
By this means, we can set the output-device and input-device variables,
and SLOF can read it and set stdout and stdin accordingly.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
We found that virtio-net uses a receive buffer of around 200K per device. If we
connect more than 40 virtio-net devices, the heap (8MB) is exhausted, so
allocations start failing and the VM does not boot.
Moreover, the driver did not support opening multiple devices, which is possible
using the OF client interface, because it was using globals to store the state
information of the driver.
Now the driver allocates a virtio_net structure during device open stage and
fills in the state information. These details are used during various device
functions and finally for cleaning up on close operation.
Now as the buffer memory is allocated during open and freed during the close
operations the heap usage is contained.
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The "interrupt-map" property in each PHB node references the "phandle"
property of the "interrupt-controller" node. This is used by the guest
OS to setup IRQs for any PCI device plugged into the PHB. QEMU sets this
property to an arbitrary value in the flattened DT passed to SLOF.
Since commit 82954d4c10, SLOF has some generic code to convert all
references to any "phandle" property to a SLOF specific value.
This is perfectly okay for coldplug devices, since the guest OS only
sees the converted value in "interrupt-map". It is a problem though for
hotplug devices. Since they don't go through SLOF, the guest OS receives
the arbitrary value set by QEMU and fails to setup IRQs.
In order to support PHB hotplug, this patch introduces a new private
hcall, which allows SLOF to tell QEMU that a "phandle" was converted
from an old value to a new value.
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
PCI bridges can only have one prefetchable memory area. If we are
already using 64-bit prefetchable memory regions, we can not use
a dedicated 32-bit prefetchable memory region anymore. In that
case the 32-bit BARs should all be located in the 32-bit non-
prefetchable memory space instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
It's much easier to do this when we create the node instead of
looking up the device node again later in each of the boards.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The virtio-scsi device expects LUNs according to a "Single level LUN
structure" as defined in the "SCSI Architecture Model" specification.
SLOF currently only uses the "Single level LUN structure using
peripheral device addressing method" which provides the possibility
to specify up to 256 different LUNs.
To be able to use LUNs greater than 255, the "Single level LUN structure
using flat space addressing method" has to be used instead. This can
be done by setting the top-most bits to "01" instead of "00" in the first
byte of the two LUN bytes.
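A small Python sketch of the two-byte encoding described above, assuming
the usual layout where the top two bits of the first byte select the
addressing method and flat space addressing carries a 14-bit LUN. The
function name is hypothetical and this is not the SLOF code itself.

```python
def encode_single_level_lun(lun):
    """Encode a LUN into the two LUN bytes of a single level LUN structure."""
    if lun < 0:
        raise ValueError("LUN must not be negative")
    if lun < 256:
        # Peripheral device addressing: top bits 00, LUN in the second byte.
        return bytes([0x00, lun])
    if lun < 0x4000:
        # Flat space addressing: top bits 01, followed by a 14-bit LUN.
        return bytes([0x40 | (lun >> 8), lun & 0xFF])
    raise ValueError("LUN does not fit into 14 bits")
```

LUNs below 256 keep their previous encoding, so existing configurations
are unaffected by the switch.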
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1431584
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This patch moves the code that actually alters the device tree to a
separate word, for improved readability. While here, it also makes
the comment of fdt-replace-all-phandles more accurate.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
These two values are pushed on the stack by decode-int and stay unused
until the 2drop line. Let's drop them right away to make it obvious.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Print all related variables, using the code from phb-parse-ranges in
board-qemu/slof/pci-phb.fs, so that we can easily check all the values
from the SLOF prompt, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>