There are two small bugs in the pxelinux.cfg parser:
1. If the file does not end with a '\n', the code set 'eol = cfg + cfgsize'
and later wrote a NUL character to *eol, i.e. it wrote the NUL character
beyond the end of the buffer. We've got to use 'eol = cfg + cfgsize - 1'
instead.
2. The code always replaced the last byte of the buffer with a NUL character
to get a proper termination. If the config file ends with a required character
(e.g. the last line is a KERNEL or INITRD line and the file does not have
a '\n' at the end), the last character got lost. Move the obligation for the
terminating NUL character to the caller instead so that we can be sure to
have a properly terminated buffer in pxelinux_parse_cfg() without the need to
blindly overwrite the last character here.
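Both fixes can be illustrated with a minimal sketch. The helper name terminate_line() is hypothetical and not taken from the SLOF sources. It assumes the caller has already stored a NUL at cfg[cfgsize - 1], so the end-of-line pointer never has to go past the last valid byte:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: terminate the line starting at 'line' inside the
 * buffer cfg[0 .. cfgsize-1] and return the start of the next line, or
 * NULL when this was the last one. Assumes cfg[cfgsize - 1] already
 * holds the NUL that the caller is responsible for.
 */
static char *terminate_line(char *cfg, size_t cfgsize, char *line)
{
    char *end = cfg + cfgsize - 1;   /* last valid byte, never beyond it */
    char *eol = memchr(line, '\n', (size_t)(end - line));

    if (!eol)
        eol = end;                   /* no newline: stop at the terminator */
    *eol = '\0';                     /* rewrites either '\n' or the NUL */
    return (eol < end) ? eol + 1 : NULL;
}
```

With a buffer such as "KERNEL vmlinux\nINITRD initrd" (no trailing newline), the last line keeps its final character because only the caller's NUL is rewritten.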
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The pxelinux_load_cfg() function always tried to load one byte less than
its parameter said (so that we've got space for a terminating NUL-character
later). This is not very intuitive, so let's rather ask for one byte less
when we call the function. While we're at it, add a sanity check that
the function really did not load more bytes than requested.
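A rough sketch of the calling convention described above, under the assumption that the loader returns the number of bytes it stored or a negative value on error. The names load_fn, load_terminated() and demo_load() are made up for illustration, and demo_load() merely stands in for the real TFTP transfer:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical loader type: stores at most 'maxlen' bytes, returns count or < 0. */
typedef int (*load_fn)(char *buf, size_t maxlen);

/* Stand-in for the real transfer: copies from a fixed string. */
static int demo_load(char *buf, size_t maxlen)
{
    static const char src[] = "LABEL linux";
    size_t n = strlen(src);

    if (n > maxlen)
        n = maxlen;
    memcpy(buf, src, n);
    return (int)n;
}

/* The caller asks for one byte less and adds the terminating NUL itself. */
static int load_terminated(load_fn load, char *buf, size_t bufsize)
{
    int rc;

    if (bufsize == 0)
        return -1;
    rc = load(buf, bufsize - 1);     /* reserve one byte for the NUL */
    if (rc < 0)
        return rc;
    if ((size_t)rc > bufsize - 1)    /* sanity check: more than requested */
        return -1;
    buf[rc] = '\0';
    return rc;
}
```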
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
... useful for "this should never happen" situations, where
you want to make sure that it really never happens.
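For illustration only, an assert-style macro could look roughly like the sketch below. The name SKETCH_ASSERT and the failure counter are assumptions; the macro actually added by this patch may report or react differently:

```c
#include <stdio.h>

/* Counts failures instead of halting, so the sketch stays easy to try out. */
static int sketch_assert_failures;

#define SKETCH_ASSERT(cond)                                        \
    do {                                                           \
        if (!(cond)) {                                             \
            fprintf(stderr, "Assertion failed: %s (%s:%d)\n",      \
                    #cond, __FILE__, __LINE__);                    \
            sketch_assert_failures++;                              \
        }                                                          \
    } while (0)
```

Wrapping the body in do { ... } while (0) without a trailing semicolon lets the macro be used like a normal statement, including inside if/else.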
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[aik: removed extra ';' and empty line]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Retrieve the UUID from the device tree and pass it to the pxelinux.cfg
function, so that we can look there for UUID-based file names, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
[aik: removed trailing space]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
We will need to retrieve the UUID of the VM in the libnet code, so we
need a function to get the contents from a device tree property.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
There are two dedicated DHCP options for loading PXELINUX config files,
option 209 (config file name) and 210 (path prefix). We should support
them, too, in case some users want to configure their boot flow this way.
See RFC 5071 and the following URL for more details:
https://www.syslinux.org/wiki/index.php?title=PXELINUX#DHCP_options
Unlike most other strings in libnet, I've chosen to not use fixed-size
arrays for these two strings, but to allocate the memory via malloc here.
We always have to make sure not to overflow the stack in Paflof, so
adding 2 * 256 byte arrays to struct filename_ip sounded just too
dangerous to me.
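A minimal sketch of the heap-based approach, assuming the option payload arrives as a length-prefixed byte string without a terminating NUL, as is usual for DHCP options. The helper name dup_option_string() is hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy a DHCP option payload into a freshly allocated, NUL-terminated
 * string. The caller owns the result and must free() it. A DHCP option
 * carries at most 255 payload bytes, so this needs at most 256 bytes of
 * heap instead of a fixed 256-byte array on the stack.
 */
static char *dup_option_string(const uint8_t *data, size_t len)
{
    char *s = malloc(len + 1);

    if (!s)
        return NULL;
    memcpy(s, data, len);
    s[len] = '\0';
    return s;
}
```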
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
In case the normal network loading failed, try to load a pxelinux.cfg
config file. If that succeeds, load the kernel and initrd with the
information that could be found in this file.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Booting a kernel via pxelinux.cfg files is common on x86 and also with
ppc64 bootloaders like petitboot, so it would be nice to support this
in SLOF, too. This patch adds functions for downloading and parsing
such pxelinux.cfg files. See this URL for more details on pxelinux.cfg:
https://www.syslinux.org/wiki/index.php?title=PXELINUX
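As background, the PXELINUX documentation describes config file names derived from the client IPv4 address in upper-case hex, shortened one digit at a time. The sketch below only illustrates that naming scheme; the function name cfg_name_from_ip() is hypothetical, and whether SLOF probes exactly these names is an assumption:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Build "pxelinux.cfg/<hex>" from the first 'digits' (1..8) upper-case
 * hex digits of an IPv4 address given in host byte order.
 * Returns the length of the name, or -1 for an invalid digit count.
 */
static int cfg_name_from_ip(char *buf, size_t size, uint32_t ip, int digits)
{
    char hex[9];

    if (digits < 1 || digits > 8)
        return -1;
    snprintf(hex, sizeof(hex), "%08lX", (unsigned long)ip);
    hex[digits] = '\0';              /* keep only the leading digits */
    return snprintf(buf, size, "pxelinux.cfg/%s", hex);
}
```

For 192.168.2.89 (0xC0A80259), calling it with 8, 7, 6, ... digits yields pxelinux.cfg/C0A80259, pxelinux.cfg/C0A8025, pxelinux.cfg/C0A802 and so on.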
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This way we can easily re-use the rc --> string translation in later
patches.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Code has been taken from the sprintf() function (which is almost the same,
except that snprintf calls vsnprintf instead of vsprintf internally).
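The pattern described above can be sketched as follows. This version forwards to the hosting libc's vsnprintf(), and the name sketch_snprintf() is only used here to avoid clashing with the real function:

```c
#include <stdarg.h>
#include <stdio.h>

/* Same shape as sprintf(), but bounded: forwards to vsnprintf(). */
static int sketch_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return len;
}
```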
Signed-off-by: Thomas Huth <thuth@redhat.com>
[aik: fixed trailing spaces]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
When we support loading of pxelinux.cfg files later, we will have to call
the tftp load function multiple times from different places. To avoid
having to pass the ip_version information around via function parameters
to all those spots, let's put it into struct filename_ip instead, since
this struct is already available everywhere.
While we're at it, also drop the __attribute__((packed)) from the struct.
The struct is only used internally, without exchanging it with the outside
world, so the attribute is certainly not necessary here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
We can select the console output, but it does not really work.
Implement term-io-emit, as we have term-io-key, to really
send characters to the output selected by stdout.
Resolve xt and ihandle in the output command.
Use them in the new term-io-emit function.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[aik: fixed commit log]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The obp-tftp package is currently using an arbitrarily large value
as the maximum load size. If the downloaded file is big enough, we
can easily overwrite Paflof in memory this way. Let's make sure that
this cannot happen by limiting the size to the amount of memory
below the Paflof binary (which is close to the end of the RAM)
in case of board-qemu, or the amount of memory between the minimum
RAM size and the load-base on board-js2x.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The blocksize is hard-coded to 1428 bytes in obp-tftp.fs, so instead of
hardcoding this in the Forth code, we could also move this into tftp.c
directly instead. A similar condition exists with the huge-tftp-load
parameter. While this non-standard variable could still be changed in the
obp-tftp package, it does not make much sense to set it to zero since you
only lose the possibility to do huge TFTP loads with index wrap-around in
that case.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
POSIX says that the free() function should simply do nothing if a NULL
pointer argument has been specified. So let's be a little bit more
compliant in our libc and add a NULL pointer check here, too.
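The check itself is tiny. A sketch, using a hypothetical wrapper name and a counter that exists only so the behaviour can be observed:

```c
#include <stdlib.h>

static int released;   /* number of blocks actually handed back */

/* Do nothing for NULL, otherwise release the block. */
static void sketch_free(void *ptr)
{
    if (ptr == NULL)
        return;
    released++;
    free(ptr);
}
```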
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This function will be used in one of the next patches to find the last
slash in a file name string.
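Assuming the function in question behaves like the standard strrchr(), a straightforward implementation could look like this sketch (the name sketch_strrchr() is only used to avoid clashing with the libc symbol):

```c
#include <stddef.h>

/* Return a pointer to the last occurrence of c in s, or NULL if absent. */
static char *sketch_strrchr(const char *s, int c)
{
    char *last = NULL;

    do {
        if (*s == (char)c)
            last = (char *)s;   /* remember the most recent match */
    } while (*s++ != '\0');     /* the terminator itself is also checked */
    return last;
}
```

Everything up to and including the returned pointer is then the directory part of a file name such as "tftp/boot/pxelinux.0".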
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
For some strange reasons, the libnet code is using int8_t arrays for
strings in a couple of places where it really does not make any sense.
Therefore a lot of "(char *)" casts are needed when the code is using
the string functions from the libc. Let's change the strings to use
"char" instead of "int8_t" so we can get rid of a lot of these casts.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Recently we found that when DAWR was disabled by the Linux kernel, the hcall
started returning H_UNSUPPORTED, and the VM did not boot up because broken_sc1
patched up the SC calls falsely.
Instead of checking for various return values, check whether it is not in
privileged mode and patch sc1 in that case.
CC: Michael Ellerman <michael@ellerman.id.au>
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Linux kernel commit 2a9d832cc9aae21ea827520fef635b6c49a06c6d
(of: Add bindings for chosen node, stdout-path) deprecated the chosen
properties "linux,stdout-path" and "stdout".
Check for new property "stdout-path" first and as a fallback check
"linux,stdout-path". This older property can be deprecated after 5 years.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The catpad is 1K in size, which can easily overflow with around 20 devices
having a bootindex. Replace the usage of $cat with a dynamically allocated
buffer (16K) here. Introduce new words to work on the buffer (allocate, free
and concatenate).
Reported here: https://github.com/qemu/SLOF/issues/3
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
We were concatenating the word " parse-load" and the $bootdev list that was
passed to evaluate. Open-code the EVALUATE work so that concatenation is not
required. "load" and "load-next" do not use $cat anymore.
Reported here: https://github.com/qemu/SLOF/issues/3
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The struct contains an odd number of bytes, so we should use
the "packed" attribute to avoid padding problems here. So far the
problems have not shown up since the struct is filled by Forth
code only and QEMU seems to be quite forgiving about the length of
the descriptor, but anyway, better safe than sorry here.
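The effect of the attribute can be seen with a made-up 9-byte descriptor. The field names below are hypothetical and are not the struct touched by this patch. The sketch relies on the GCC/Clang __attribute__((packed)) extension:

```c
#include <stdint.h>

/* Natural layout: the compiler may insert padding before 'addr' and at the end. */
struct desc_unpacked {
    uint8_t  length;
    uint8_t  type;
    uint16_t total;
    uint8_t  count;
    uint32_t addr;
};

/* Packed layout: exactly 1 + 1 + 2 + 1 + 4 = 9 bytes, no padding. */
struct desc_packed {
    uint8_t  length;
    uint8_t  type;
    uint16_t total;
    uint8_t  count;
    uint32_t addr;
} __attribute__((packed));
```

On common ABIs sizeof(struct desc_unpacked) is 12, so code that passes sizeof() as the descriptor length would report three bytes too many.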
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The guest kernel fetches the device tree via the client interface,
calling it for every node and property, and traversing the entire tree
twice: first to build the strings blob, then to build the struct blob.
On top of that, the implementation of the "getprop" method is not very
efficient either: it calls the slow "get-property", which does a full
search for a property.
As a result, on a guest with 256 CPUs and 256 Intel E1000 virtual devices,
the guest's flatten_device_tree() takes roughly 8.5sec.
However now we have a FDT rendering helper in SLOF which takes about 350ms
to render the FDT. This implements a client interface call to allow
the guest to read it during early boot and save time.
The produced DTB is almost the same as the guest kernel would have
produced itself - the differences are:
1. SLOF creates an empty reserved map; the guest can easily fix
it up later;
2. SLOF only reuses the 40 most popular strings; the guest reuses everything
it can - on a 256CPU + 256 PCI devices guest, the difference is about
20KB for 350KB FDT blob.
Note that the guest also ditches the "name" property just like SLOF
does:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/prom_init.c?h=v4.13#n2302
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
If the guest tries "fdt-fetch" and SLOF does not have it, then SLOF
prints an error:
===
copying OF device tree...
fdt-fetch NOT FOUNDBuilding dt strings...
Building dt structure...
===
and the guest continues with the old method. We could suppress the SLOF error
for such an unlikely situation, though.
At the moment we count on the guest kernel to update or create device
tree properties pointing to the instantiated RTAS copy which is not
very reliable.
This stores rtas-base and rtas-size in the DT at the instantiation
point so later on the H_UPDATE_DT hcall can supply QEMU with an updated
location of RTAS.
This supersedes f9a60de30 "Add private HCALL to inform updated
RTAS base and entry".
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v5:
* ditched rtas-entry, added rtas-size (which is always 20 bytes though)
The existing code hardcodes the length of /openprom/model to 10 characters
even though it is less than that - len("aik")==3. All 10 chars go to
the device tree blob and DTC complains about such a property as there are
characters after the terminating null:
aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb
Warning (model_is_string): "model" property in /openprom is not a string
This uses the real length and limits it to 10 to avoid breaking something.
Since the same code parses the build id field, this moves from-cstring
to a common place for both js2x and qemu boards.
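The actual change is in Forth, but the idea of "real length, capped at the field size" can be sketched in C. The function name bounded_strlen() is hypothetical:

```c
#include <stddef.h>

/* Length of the C string in s, but never more than 'max' bytes. */
static size_t bounded_strlen(const char *s, size_t max)
{
    size_t n = 0;

    while (n < max && s[n] != '\0')
        n++;
    return n;
}
```

For a 10-byte field holding "aik" followed by zero bytes this gives 3, so only the real characters would end up in the property.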
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This creates a flattened device tree and passes it to QEMU via a custom
hypercall right before jumping to RTAS.
This preloads strings with 40 property names from CPU and PCI device nodes
and the strings lookup only searches within these.
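The strings handling described here can be approximated by the C sketch below (the real code is Forth). All names and the 256-byte table size are assumptions for illustration. Names added via strtab_preload() are searchable; anything else is simply appended on every use:

```c
#include <stddef.h>
#include <string.h>

struct strtab {
    char   data[256];
    size_t used;        /* bytes in use */
    size_t preloaded;   /* only data[0 .. preloaded-1] is searched */
};

/* Append a name unconditionally; returns its offset or -1 when full. */
static long strtab_append(struct strtab *t, const char *name)
{
    size_t len = strlen(name) + 1;
    size_t off = t->used;

    if (off + len > sizeof(t->data))
        return -1;
    memcpy(t->data + off, name, len);
    t->used += len;
    return (long)off;
}

/* Add a name to the searchable part; call before any strtab_offset(). */
static long strtab_preload(struct strtab *t, const char *name)
{
    long off = strtab_append(t, name);

    if (off >= 0)
        t->preloaded = t->used;
    return off;
}

/* Reuse a preloaded name if possible, otherwise append a new copy. */
static long strtab_offset(struct strtab *t, const char *name)
{
    size_t off = 0;

    while (off < t->preloaded) {
        if (strcmp(t->data + off, name) == 0)
            return (long)off;
        off += strlen(t->data + off) + 1;
    }
    return strtab_append(t, name);
}
```

Limiting the search to the preloaded region keeps each lookup cheap, at the cost of storing duplicates of less common names.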
Test results on a guest with 256 CPUs and 256 virtual Intel E1000 devices
running on a POWER8 box:
FDTsize=366024 Strings=15888 Struct=350080 Reused str=12457 242 ms
A simple guest (one CPU, no PCI) with this patch as is:
FDTsize=15940 Strings=3148 Struct=12736 Reused str=84 7 ms
While we are here, fix the version handling in fdt-init. It only matters
a little for the fdt-debug==1 case though.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v6:
* fix memory sizes for free-mem
* store correct chosen-cpu to the header (used to be just 0)
* fdt-skip-string uses zcount now and works 30% faster
* moved to a new file - fdt-fl.fs
v5:
* applied latest comments from Segher
* s/fdt-property/fdt-copy-property/, s/fdt-properties/fdt-copy-properties/
* reduced the temporary buffers to 1MB each as the guest uses 1MB in total
anyway
* do not pass root phandle to fdt-flatten-tree, it fetches it from
device-tree itself
* reworked fdt-copy-properties to use for-all-words proposed by Segher
v4:
* reworked fdt-properties, works a lot faster
* do not store "name" properties as nodes have names already
v3:
* fixed stack handling after hcall returned
* fixed format versions in both rendering and parsing paths
* rebased on top of removed unused hvcalls
* renamed used variables to have fdtfl- prefixes as there are already
some for parsing the initial dt
v2:
* fixed comments from review
* added strings cache
* changed last_compat_vers from 0x17 to 0x16 as suggested by dwg
---
I tested the blob by storing it from QEMU to a file and decompiling it.
This fixes the version handling in fdt-init. It only matters
a little for the fdt-debug==1 case though.
This defines /fdth word for the FDT header size; this will be used
in the next patch.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
This replaces current set-chosen-cpu with a cleaner and faster
implementation which does not clobber the current node and stores
the chosen CPU phandle/ihandle.
This adds a helper to get the chosen CPU unit address.
This moves chosen cpu words to root.fs as otherwise it is quite hard
to maintain dependencies.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
This adds some internal structure commented definitions so they won't
break things now and grep can find them.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This reverts commits:
604d28cc3 "board-qemu: add private hcall to inform host on "phandle" update"
089fc18a9 "libhvcall: drop unused KVMPPC_H_REPORT_MC_ERR and KVMPPC_H_NMI_MCE defines"
1c17c13a5 "rtas: Improve error handling in instantiate-rtas"
f9a60de30 "Add private HCALL to inform updated RTAS base and entry"
A bigger hammer is coming soon which will pass the entire device
tree to QEMU, not just some random bits.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
QEMU can now set environment variables from the command line (with -prom-env).
By this means, we can set the output-device and input-device variables,
and SLOF can read it and set stdout and stdin accordingly.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
According to the TFTP Booting extension, after BOOTP succeeds, the BOOTREPLY
packet should be copied to the bootp-response property under "/chosen".
Currently, however, bootp-response was being set even when DHCP was used. So
set bootp-response when BOOTP is used and dhcp-response for DHCP.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The times_asked value remains the same since the structure is zeroed, but it
makes more sense to assign it directly instead of adding to the previous value.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
When shutting down the adapter, SLOF writes 0 to the Event Ring Segment
Table Base Address Register (ERSTBA) but does not reset the Event Ring
Segment Table Size Register (ERSTSZ), which makes QEMU do a DMA access
at address zero, which fails in unassigned_mem_accepts.
This resets ERSTSZ right before resetting ERSTBA so these 2 registers
can stay in sync.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
We found that virtio-net uses a receive buffer of around 200K per device; if we
connect more than 40 virtio-net devices, the heap (8MB) is exhausted. Because of
this, allocations start failing and the VM does not boot.
Moreover, the driver did not support opening multiple devices, which is possible
using the OF client interface, because it was using globals to store the state
information of the driver.
Now the driver allocates a virtio_net structure during the device open stage and
fills in the state information. These details are used by the various device
functions and finally for cleaning up on the close operation.
Since the buffer memory is now allocated during open and freed during close,
the heap usage is contained.
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The "interrupt-map" property in each PHB node references the "phandle"
property of the "interrupt-controller" node. This is used by the guest
OS to setup IRQs for any PCI device plugged into the PHB. QEMU sets this
property to an arbitrary value in the flattened DT passed to SLOF.
Since commit 82954d4c10, SLOF has some generic code to convert all
references to any "phandle" property to a SLOF specific value.
This is perfectly okay for coldplug devices, since the guest OS only
sees the converted value in "interrupt-map". It is a problem though for
hotplug devices. Since they don't go through SLOF, the guest OS receives
the arbitrary value set by QEMU and fails to setup IRQs.
In order to support PHB hotplug, this patch introduces a new private
hcall, which allows SLOF to tell QEMU that a "phandle" was converted
from an old value to a new value.
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The functions used a bogus mixture between programming the registers
with pci-next-mem64 and pci-next-mem - the upper register bits were
filled with the value from the 64-bit memory space while the lower
bits were filled with the bits from the 32-bit memory space variable.
This separates handling of pci-{next|max}-mem64 from pci-{next|max}-mem.
This zeroes the bottom 4 bits of the prefetchable memory limit and
prefetchable memory base registers for 32bit windows to enforce
that the resources are marked as 32bit.
This simplifies updating of the prefetchable memory limit in
ci-bridge-set-mem-limit.
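The zeroing of the bottom 4 bits for 32-bit windows can be sketched in C (the real code is Forth). In the PCI bridge prefetchable memory base/limit registers, the upper 12 bits hold address bits 31:20 and the bottom 4 bits indicate the addressing capability, with 0 meaning 32-bit. The function name pref_mem_reg32() is made up:

```c
#include <stdint.h>

/*
 * Encode a 32-bit window address for the 16-bit prefetchable memory
 * base or limit register: address bits 31:20 go to register bits 15:4,
 * and the bottom 4 bits stay zero to mark the window as 32-bit.
 */
static uint16_t pref_mem_reg32(uint32_t addr)
{
    return (uint16_t)((addr >> 16) & 0xFFF0u);
}
```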
Signed-off-by: Thomas Huth <thuth@redhat.com>
[aik: extended commit log with 32bit window changes]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
PCI bridges can only have one prefetchable memory area. If we are
already using 64-bit prefetchable memory regions, we can not use
a dedicated 32-bit prefetchable memory region anymore. In that
case the 32-bit BARs should all be located in the 32-bit non-
prefetchable memory space instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
They are completely unused, and ishexdigit even seems to be implemented
incorrectly, thus let's simply remove them.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Currently, it is not possible to use VGA devices attached to a
PCI bridge on board-qemu, e.g. by starting QEMU like this:
qemu-system-ppc64 -nodefaults -device pci-bridge,id=br1,chassis_nr=1 \
-serial mon:stdio -device VGA,id=video,bus=br1,addr=1
One of the problems is the missing translate-address at the end
of the map-in function of the bridge - which was already marked
as a TODO, but apparently has never been enabled. So let's do
that now!
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
It's much easier to do this when we create the node instead of
looking up the device node again later in each of the boards.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The virtio-scsi device expects LUNs according to a "Single level LUN
structure" as defined in the "SCSI Architecture Model" specification.
SLOF currently only uses the "Single level LUN structure using
peripheral device addressing method" which provides the possibility
to specify up to 256 different LUNs.
To be able to use LUNs greater than 255, the "Single level LUN structure
using flat space addressing method" has to be used instead. This can
be done by setting the top-most bits to "01" instead of "00" in the first
byte of the two LUN bytes.
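The two addressing methods can be sketched as below. The function name encode_lun() is hypothetical, and keeping the peripheral device addressing method for LUNs below 256 is an assumption made for this example, not necessarily what the patch does:

```c
#include <stdint.h>

/*
 * Encode a LUN into the two LUN bytes of a single level LUN structure.
 * Top bits 00b: peripheral device addressing (LUN 0..255).
 * Top bits 01b: flat space addressing (14-bit LUN, up to 16383).
 * Returns 0 on success, -1 if the LUN does not fit into 14 bits.
 */
static int encode_lun(uint16_t lun, uint8_t out[2])
{
    if (lun < 256) {
        out[0] = 0x00;
        out[1] = (uint8_t)lun;
    } else if (lun < 16384) {
        out[0] = (uint8_t)(0x40 | (lun >> 8));
        out[1] = (uint8_t)(lun & 0xFF);
    } else {
        return -1;
    }
    return 0;
}
```

For example, LUN 300 (0x12C) is encoded as the bytes 0x41 0x2C.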
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1431584
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The SLOF stack pointers - dp/rp - point to the top used element which
means for an empty stack they point to an element below the stack.
This means that for pushing to the stack we can use a store-with-update
instruction (stdu). This generates good code for most primitives,
better than the other stack pointer offsets.
However, with -Warray-bounds enabled, gcc produces a warning like the one below:
/home/aik/p/slof/slof/paflof.c: In function ‘engine’:
/home/aik/p/slof/slof/paflof.c:84:23: warning: array subscript is below array bounds [-Warray-bounds]
dp = the_data_stack - 1;
~~~~~~~~~~~~~~~^~~
This silences gcc by doing a C cast.
Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
---
uintptr_t is not used anywhere in SLOF, hence type_u.
This patch moves the code that actually alters the device tree to a
separate word, for improved readability. While here, it also makes
the comment of fdt-replace-all-phandles more accurate.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
These two values are pushed on the stack by decode-int and stay unused
until the 2drop line. Let's drop them right away to make it obvious.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Print all related variables, using the code from phb-parse-ranges in
board-qemu/slof/pci-phb.fs, so that we can easily check all the values
from the SLOF prompt, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>