Compare commits


376 Commits

Author SHA1 Message Date
fde35ff003 [pci] Disable decoding while setting a BAR value
Setting the base address for a 64-bit BAR requires two separate 32-bit
writes to configuration space, and so will necessarily result in the
BAR temporarily holding an invalid partially written address.

Some hypervisors (observed on an AWS EC2 c7a.medium instance in
eu-west-2) will assume that guests will write BAR values only while
decoding is disabled, and may not rebuild MMIO mappings for the guest
if the BAR registers are written while decoding is enabled.  The
effect of this is that MMIO accesses are not routed through to the
device even though inspection from within the guest shows that every
single PCI configuration register has the correct value.  Writes to
the device will be ignored, and reads will return the all-ones pattern
that typically indicates a nonexistent device.

With the ENA network driver now using low latency transmit queues,
this results in the transmit descriptors being lost (since the MMIO
writes to BAR2 never reach the device), which in turn causes the
device to lock up as soon as the transmit doorbell is rung for the
first time.

Fix by disabling decoding of memory and I/O cycles while setting a BAR
address (as we already do while sizing a BAR), so that the invalid
partial address can never be decoded and so that hypervisors will
rebuild MMIO mappings as expected.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-29 23:30:52 +00:00
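
For illustration, a minimal sketch of the decode-disable sequence described above, assuming iPXE's pci_read_config_word()/pci_write_config_dword() accessors and the standard PCI_COMMAND, PCI_COMMAND_MEM and PCI_COMMAND_IO definitions (the function name and 64-bit-only handling are simplifications, not the actual pci_bar_set() implementation):

  #include <stdint.h>
  #include <ipxe/pci.h>

  /* Sketch: update a (presumed 64-bit) BAR with decoding disabled */
  static void pci_bar_set_sketch ( struct pci_device *pci, unsigned int reg,
                                   uint64_t addr ) {
          uint16_t cmd;

          /* Disable memory and I/O decoding while the BAR is invalid */
          pci_read_config_word ( pci, PCI_COMMAND, &cmd );
          pci_write_config_word ( pci, PCI_COMMAND,
                                  ( cmd & ~( PCI_COMMAND_MEM |
                                             PCI_COMMAND_IO ) ) );

          /* Two 32-bit writes: the BAR briefly holds a partial address */
          pci_write_config_dword ( pci, reg, ( addr & 0xffffffffUL ) );
          pci_write_config_dword ( pci, ( reg + 4 ), ( addr >> 32 ) );

          /* Restore decoding; the hypervisor rebuilds MMIO mappings here */
          pci_write_config_word ( pci, PCI_COMMAND, cmd );
  }
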
606e87ec7a [cloud] Display instance type in AWS EC2
Experiments suggest that the instance type is exposed via the SMBIOS
product name.  Include this information within the default output,
since it is often helpful in debugging.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-29 13:26:50 +00:00
0336e2987c [ena] Leave queue base address empty when creating a low latency queue
The queue base address is meaningless for a low latency queue, since
the queue entries are written directly to the on-device memory.  Any
non-zero queue base address will be safely ignored by the hardware,
but leaves open the possibility that future revisions could treat it
as an error.

Leave this field as zero, to match the behaviour of the Linux driver.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-28 12:27:06 +00:00
0ddd830693 [riscv] Correct page table stride calculation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-27 14:22:16 +00:00
426c721e32 [librm] Correct page table stride calculation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-27 14:22:16 +00:00
c8f088d4e1 [cloud] Display build architecture in AWS EC2
On some newer (7th and 8th generation) instance types, the 32-bit
build of iPXE cannot access PCI configuration space since the ECAM is
placed outside of the 32-bit address space.  The visible symptom is
that iPXE fails to detect any network devices.

The public AMIs are all now built as 64-bit binaries, but there is
nothing that prevents the building and importing of a 32-bit AMI.
There are still potentially valid use cases for 32-bit AMIs (e.g. if
planning to use the AMI only for older instance types), and so we
cannot sensibly prevent this error at build time.

Display the build architecture as part of the AWS EC2 embedded script,
to at least allow for easy identification of this particular failure
mode at run time.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-20 12:58:03 +01:00
416a2143af [cloud] Remove AWS public image access block only if not already unblocked
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-20 12:58:03 +01:00
ba1846a0d3 [cloud] Remove AWS public image access block automatically if needed
Making images public is blocked by default in new AWS regions.  Remove
this block automatically whenever creating a public image.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-17 14:22:21 +01:00
b2e8468219 [ena] Limit receive queue size to work around hardware bugs
Commit a801244 ("[ena] Increase receive ring size to 128 entries")
increased the receive ring size to 128 entries (while leaving the fill
level at 16), since using a smaller receive ring caused unexplained
failures on some instance types.

The original hardware bug that resulted in that commit seems to have
been fixed: experiments suggest that the original failure (observed on
a c6i.large instance in eu-west-2) will no longer reproduce when using
a receive ring containing only 16 entries (as was the case prior to
that commit).

Newer generations of the ENA hardware (observed on an m8i.large
instance in eu-south-2) seem to have a new and exciting hardware bug:
these instance types appear to use a hash of the received packet
header to determine which portion of the (out-of-order) receive ring
to use.  If that portion of the ring happens to be empty (e.g. because
only 32 entries of the 128-entry ring are filled at any one time),
then the packet will be silently dropped.

Work around this new hardware bug by reducing the receive ring size
down to the current fill level of 32 entries.  This appears to work on
all current instance types (but has not been exhaustively tested).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-17 13:25:05 +01:00
846c505ae9 [ena] Increase transmit queue size to match receive fill level
Avoid running out of transmit descriptors when sending TCP ACKs by
increasing the transmit queue size to match the increased receive
fill level.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-17 13:25:05 +01:00
0ae5e25de2 [ena] Add memory barrier after writing to on-device memory
Ensure that writes to on-device memory have taken place before writing
to the doorbell register.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-17 12:35:23 +01:00
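
A sketch of the ordering this enforces (the helper and parameter names below are placeholders, not the ENA driver's actual symbols):

  #include <stdint.h>
  #include <string.h>
  #include <ipxe/io.h>

  /* Sketch: push an entry into on-device (BAR2) memory, then ring */
  static void llq_push_sketch ( void *llq_entry, const void *sqe, size_t len,
                                void *doorbell, uint32_t prod ) {
          /* Write descriptor and inlined header into device memory */
          memcpy ( llq_entry, sqe, len );

          /* Ensure the device-memory writes complete before the doorbell */
          wmb();

          /* Only then write the new producer counter to the doorbell */
          writel ( prod, doorbell );
  }
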
c296747d0e [ena] Increase receive fill level
Experiments suggest that at least some instance types (observed with
c6i.large in eu-west-2) experience high packet drop rates with only 16
receive buffers allocated.  Increase the fill level to 32 buffers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-16 16:36:29 +01:00
c1badf71ca [ena] Add support for low latency transmit queues
Newer generations of the ENA hardware require the use of low latency
transmit queues, where the submission queues and the initial portion
of the transmitted packet are written to on-device memory via BAR2
instead of being read from host memory.

Detect support for low latency queues and set the placement policy
appropriately.  We attempt the use of low latency queues only if the
device reports that it supports inline headers, 128-byte entries, and
two descriptors prior to the inlined header, on the basis that we
don't care about using low latency queues on older versions of the
hardware since those versions will support normal host memory
submission queues anyway.

We reuse the redundant memory allocated for the submission queue as
the bounce buffer for constructing the descriptors and inlined packet
data, since this avoids needing a separate allocation just for the
bounce buffer.

We construct a metadata submission queue entry prior to the actual
submission queue entry, since experimentation suggests that newer
generations of the hardware require this to be present even though it
conveys no information beyond its own existence.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-16 16:36:29 +01:00
0d15d7f0a5 [ena] Record supported device features
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-16 16:36:29 +01:00
e5e371f485 [ena] Cancel uncompleted transmit buffers on close
Avoid spurious assertion failures by ensuring that references to
uncompleted transmit buffers are not retained after the device has
been closed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-16 16:36:29 +01:00
dcc5d36ce5 [ena] Map the on-device memory, if present
Newer generations of the ENA hardware require the use of low latency
transmit queues, where the submission queues and the initial portion
of the transmitted packet are written to on-device memory via BAR2
instead of being read from host memory.

Prepare for this by mapping the on-device memory BAR.  As with the
register BAR, we may need to steal a base address from the upstream
PCI bridge since the BIOS on some instance types (observed with an
m8i.metal-48xl instance in eu-south-2) will fail to assign an address
to the device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-15 15:55:57 +01:00
510f3e5e17 [ena] Add descriptive messages for any admin queue command failures
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-15 12:00:42 +01:00
3538e9c39a [pci] Record prefetchable memory window for PCI bridges
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-14 18:38:08 +01:00
04a61c413d [ena] Use pci_bar_set() to place device within bridge memory window
Use pci_bar_set() when we need to set a device base address (on
instance types such as c6i.metal where the BIOS fails to do so), so
that 64-bit BARs will be handled automatically.

This particular issue has so far been observed only on 6th generation
instances.  These use 32-bit BARs, and so the lack of support for
handling 64-bit BARs has not caused any observable issue.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-14 15:57:02 +01:00
94902ae187 [pci] Handle sizing of 64-bit BARs
Provide pci_bar_set() to handle setting the base address for a
potentially 64-bit BAR, and rewrite pci_bar_size() to correctly handle
sizing of 64-bit BARs.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-14 14:43:50 +01:00
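
A sketch of the classic sizing sequence for an assumed 64-bit memory BAR (with decoding already disabled, as noted above; this is illustrative, not the actual pci_bar_size() code):

  #include <stdint.h>
  #include <ipxe/pci.h>

  /* Sketch: size a 64-bit memory BAR via the all-ones write-back method */
  static uint64_t pci_bar_size_sketch ( struct pci_device *pci,
                                        unsigned int reg ) {
          uint32_t orig_lo, orig_hi;
          uint32_t size_lo, size_hi;
          uint64_t mask;

          /* Save original value, write all-ones, read back the size mask */
          pci_read_config_dword ( pci, reg, &orig_lo );
          pci_read_config_dword ( pci, ( reg + 4 ), &orig_hi );
          pci_write_config_dword ( pci, reg, 0xffffffffUL );
          pci_write_config_dword ( pci, ( reg + 4 ), 0xffffffffUL );
          pci_read_config_dword ( pci, reg, &size_lo );
          pci_read_config_dword ( pci, ( reg + 4 ), &size_hi );

          /* Restore original value */
          pci_write_config_dword ( pci, reg, orig_lo );
          pci_write_config_dword ( pci, ( reg + 4 ), orig_hi );

          /* Mask out the low flag bits and compute the size */
          mask = ( ( ( ( uint64_t ) size_hi ) << 32 ) |
                   ( size_lo & ~0xfULL ) );
          return ( ~mask + 1 );
  }
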
e80818e4f6 [tls] Disable renegotiation unless extended master secret is used
RFC 7627 states that renegotiation becomes no longer secure under
various circumstances when the non-extended master secret is used.
The description of the precise set of circumstances is spread across
various points within the document and is not entirely clear.

Avoid a superset of the circumstances in which renegotiation
apparently becomes insecure by refusing renegotiation completely
unless the extended master secret is used.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-12 23:25:09 +01:00
57504353fe [tls] Refuse to resume sessions with mismatched master secret methods
RFC 7627 section 5.3 states that the client must abort the handshake
if the server attempts to resume a session where the master secret
calculation method stored in the session does not match the method
used for the connection being resumed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-12 23:25:09 +01:00
ab64bc5b8d [tls] Add support for the Extended Master Secret
RFC 7627 defines the Extended Master Secret (EMS) as an alternative
calculation that uses the digest of all handshake messages rather than
just the client and server random bytes.

Add support for negotiating the Extended Master Secret extension and
performing the relevant calculation of the master secret.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-12 23:25:04 +01:00
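
For reference, the only difference lies in the PRF label and seed; a schematic fragment (tls_prf_sketch() and all variable names are placeholders, not iPXE's internal API):

  /* RFC 5246 master secret:
   *   PRF ( pre_master_secret, "master secret",
   *         client_random || server_random ) -> 48 bytes
   *
   * RFC 7627 extended master secret:
   *   PRF ( pre_master_secret, "extended master secret",
   *         session_hash ) -> 48 bytes
   *
   * where session_hash is the digest of all handshake messages up to
   * and including the Client Key Exchange.
   */
  if ( use_ems ) {
          tls_prf_sketch ( pre_master, master, "extended master secret",
                           session_hash, hash_len );
  } else {
          tls_prf_sketch ( pre_master, master, "master secret",
                           randoms, sizeof ( randoms ) );
  }
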
d6656106e9 [tls] Generate master secret only after sending Client Key Exchange
The calculation for the extended master secret as defined in RFC 7627
relies upon the digest of all handshake messages up to and including
the Client Key Exchange.

Facilitate this calculation by generating the master secret only after
sending the Client Key Exchange message.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-12 22:20:13 +01:00
4f44f62402 [gve] Rearm interrupts unconditionally on every poll
Experimentation suggests that rearming the interrupt once per observed
completion is not sufficient: we still see occasional delays during
which the hardware fails to write out completions.

As described in commit d2e1e59 ("[gve] Use dummy interrupt to trigger
completion writeback in DQO mode"), there is no documentation around
the precise semantics of the interrupt rearming mechanism, and so
experimentation is the only available guide.  Switch to rearming both
TX and RX interrupts unconditionally on every poll, since this
produces better experimental results.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-10 13:12:19 +01:00
f5ca1de738 [gve] Use raw DMA addresses in descriptors in DQO-QPL mode
The DQO-QPL operating mode uses registered queue page lists but still
requires the raw DMA address (rather than the linear offset within the
QPL) to be provided in transmit and receive descriptors.

Set the queue page list base device address appropriately.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-10 12:49:26 +01:00
1cc1f1cd4f [gve] Report only packet completions for the transmit ring
The hardware reports descriptor and packet completions separately for
the transmit ring.  We currently ignore descriptor completions (since
we cannot free up the transmit buffers in the queue page list and
advance the consumer counter until the packet has also completed).

Now that transmit completions are written out immediately (instead of
being delayed until 128 bytes of completions are available), there is
no value in retaining the descriptor completions.

Omit descriptor completions entirely, and reduce the transmit fill
level back down to its original value.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-09 17:29:20 +01:00
d2e1e591ab [gve] Use dummy interrupt to trigger completion writeback in DQO mode
When operating in the DQO operating mode, the device will defer
writing transmit and receive completions until an entire internal
cacheline (128 bytes) is full, or until an associated interrupt is
asserted.  Since each receive descriptor is 32 bytes, this will cause
received packets to be effectively delayed until up to three further
packets have arrived.  When network traffic volumes are very low (such
as during DHCP, DNS lookups, or TCP handshakes), this typically
induces delays of up to 30 seconds and results in a very poor user
experience.

Work around this hardware problem in the same way as for the Intel
40GbE and 100GbE NICs: by enabling dummy MSI-X interrupts to trick the
hardware into believing that it needs to write out completions to host
memory.

There is no documentation around the interrupt rearming mechanism.
The value written to the interrupt doorbell does not include a
consumer counter value, and so must be relying on some undocumented
ordering constraints.  Comments in the Linux driver source suggest
that the authors believe that the device will automatically and
atomically mask an MSI-X interrupt at the point of asserting it, that
any further interrupts arriving before the doorbell is written will be
recorded in the pending bit array, and that writing the doorbell will
therefore immediately assert a new interrupt if needed.

In the absence of any documentation, choose to rearm the interrupt
once per observed completion.  This is overkill, but is less impactful
than the alternative of rearming the interrupt unconditionally on
every poll.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-09 17:12:20 +01:00
c2d7ddd0c2 [gve] Add missing memory barriers
Ensure that the remainder of each completion record is read only after
verifying the generation bit (or sequence number).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-09 16:42:20 +01:00
5438299649 [intelxl] Use default dummy MSI-X target address
Use the default dummy MSI-X target address that is now allocated and
configured automatically by pci_msix_enable().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-09 16:37:14 +01:00
4224f574da [pci] Map all MSI-X interrupts to a dummy target address by default
Interrupts as such are not used in iPXE, which operates in polling
mode.  However, some network cards (such as the Intel 40GbE and 100GbE
NICs) will defer writing out completions until the point of asserting
an MSI-X interrupt.

From the point of view of the PCI device, asserting an MSI-X interrupt
is just a 32-bit DMA write of an opaque value to an opaque target
address.  The PCI device has no way to know whether or not the target
address corresponds to a real APIC.

We can therefore trick the PCI device into believing that it is
asserting an MSI-X interrupt, by configuring it to write an opaque
32-bit value to a dummy target address in host memory.  This is
sufficient to trigger the associated write of the completions to host
memory.

Allocate a dummy target address when enabling MSI-X on a PCI device,
and map all interrupts to this target address by default.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-09 16:29:29 +01:00
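
A schematic of the trick: each 16-byte MSI-X table entry is pointed at a scratch dword in host memory. The table layout below is the standard MSI-X format; the function name and parameters are illustrative:

  #include <stdint.h>
  #include <ipxe/io.h>

  /* Sketch: aim one MSI-X table entry at a dummy host-memory address */
  static void msix_map_dummy_sketch ( void *entry, uint64_t dummy_dma ) {
          writel ( ( ( uint32_t ) dummy_dma ), ( entry + 0 ) );  /* Address low */
          writel ( ( ( uint32_t ) ( dummy_dma >> 32 ) ),
                   ( entry + 4 ) );                              /* Address high */
          writel ( 0, ( entry + 8 ) );                           /* Opaque data */
          writel ( 0, ( entry + 12 ) );                          /* Unmask vector */
  }
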
ce30ba14fc [gve] Select preferred operating mode
Select a preferred operating mode from those advertised as supported
by the device, falling back to the oldest known mode (GQI-QPL) if
no modes are advertised.

Since there are devices in existence that support only QPL addressing,
and since we want to minimise code size, we choose to always use a
single fixed ring buffer even when using raw DMA addressing.  Having
paid this penalty, we therefore choose to prefer QPL over RDA since
this allows the (virtual) hardware to minimise the number of page
table manipulations required.  We similarly prefer GQI over DQO since
this minimises the amount of work we have to do: in particular, the RX
descriptor ring contents can remain untouched for the lifetime of the
device and refills require only a doorbell write.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-06 14:04:18 +01:00
74c9fd72cf [gve] Add support for out-of-order queues
Add support for the "DQO" out-of-order transmit and receive queue
formats.  These are almost entirely different in format and usage (and
even endianness) from the original "GQI" in-order transmit and receive
queues, and arguably should belong to a completely different device
with a different PCI ID.  However, Google chose to essentially crowbar
two unrelated device models into the same virtual hardware, and so we
must handle both of these device models within the same driver.

Most of the new code exists solely to handle the differences in
descriptor sizes and formats.  Out-of-order completions are handled
via a buffer ID ring (as with other devices supporting out-of-order
completions, such as the Xen, Hyper-V, and Amazon virtual NICs).  A
slight twist is that on the transmit datapath (but not the receive
datapath) the Google NIC provides only one completion per packet
instead of one completion per descriptor, and so we must record the
list of chained buffer IDs in a separate array at the time of
transmission.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-06 14:04:12 +01:00
0d1ddfe42c [gve] Cancel pending transmissions when closing device
We cancel any pending transmissions when (re)starting the device since
any transmissions that were initiated before the admin queue reset
will not complete.

The network device core will also cancel any pending transmissions
after the device is closed.  If the device is closed with some
transmissions still pending and is then reopened, this will therefore
result in a stale I/O buffer being passed to netdev_tx_complete_err()
when the device is restarted.

This error has not been observed in practice since transmissions
generally complete almost immediately and it is therefore unlikely
that the device will ever be closed with transmissions still pending.
With out-of-order queues, the device seems to delay transmit
completions (with no upper time limit) until a complete batch is
available to be written out as a block of 128 bytes.  It is therefore
very likely that the device will be closed with transmissions still
pending.

Fix by ensuring that we have dropped all references to transmit I/O
buffers before returning from gve_close().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-06 13:16:22 +01:00
cf53497541 [bnxt] Handle link related async events
Handle async events related to link speed change, link speed config
change, and port phy config changes.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-10-01 16:20:23 +01:00
4508e10233 [gve] Allow for descriptor and completion lengths to vary by mode
The descriptors and completions in the DQO operating mode are not the
same sizes as the equivalent structures in the GQI operating mode.
Allow the queue stride size to vary by operating mode (and therefore
to be known only after reading the device descriptor and selecting the
operating mode).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-30 12:17:22 +01:00
20a489253c [gve] Rename GQI-specific data structures and constants
Rename data structures and constants that are specific to the GQI
operating mode, to allow for a cleaner separation from other operating
modes.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-30 11:10:20 +01:00
86b322d999 [gve] Allow for out-of-order buffer consumption
We currently assume that the buffer index is equal to the descriptor
ring index, which is correct only for in-order queues.

Out-of-order queues will include a buffer tag value that is copied
from the descriptor to the completion.  Redefine the data buffers as
being indexed by this tag value (rather than by the descriptor ring
index), and add a circular ring buffer to allow for tags to be reused
in whatever order they are released by the hardware.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-30 11:09:45 +01:00
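
A minimal sketch of such a free-tag ring (names and sizes are illustrative, not the driver's actual structures):

  #include <stdint.h>

  #define NUM_BUFFERS 64  /* illustrative ring size (power of two) */

  /* Circular ring of free buffer tags */
  struct tag_ring {
          uint16_t tags[NUM_BUFFERS];
          unsigned int prod;
          unsigned int cons;
  };

  /* Initialise with every tag free */
  static void tag_init ( struct tag_ring *ring ) {
          unsigned int i;

          for ( i = 0 ; i < NUM_BUFFERS ; i++ )
                  ring->tags[i] = i;
          ring->prod = NUM_BUFFERS;
          ring->cons = 0;
  }

  /* Allocate a tag when posting a descriptor */
  static uint16_t tag_alloc ( struct tag_ring *ring ) {
          return ring->tags[ ring->cons++ % NUM_BUFFERS ];
  }

  /* Return a tag in whatever order the hardware completes it */
  static void tag_free ( struct tag_ring *ring, uint16_t tag ) {
          ring->tags[ ring->prod++ % NUM_BUFFERS ] = tag;
  }
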
b8dd3c384b [gve] Add support for raw DMA addressing
Raw DMA addressing allows the transmit and receive descriptors to
provide the DMA address of the data buffer directly, without requiring
the use of a pre-registered queue page list.  It is modelled in the
device as a magic "raw DMA" queue page list (with QPL ID 0xffffffff)
covering the whole of the DMA address space.

When using raw DMA addressing, the transmit and receive datapaths
could use the normal pattern of mapping I/O buffers directly, and
avoid copying packet data into and out of the fixed queue page list
ring buffer.  However, since we must retain support for queue page
list addressing (which requires this additional copying), we choose to
minimise code size by continuing to use the fixed ring buffer even
when using raw DMA addressing.

Add support for using raw DMA addressing by setting the queue page
list base device address appropriately, omitting the commands to
register and unregister the queue page lists, and specifying the raw
DMA QPL ID when creating the TX and RX queues.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-29 15:13:55 +01:00
9f554ec9d0 [gve] Add concept of a queue page list base device address
Allow for the existence of a queue page list where the base device
address is non-zero, as will be the case for the raw DMA addressing
(RDA) operating mode.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-29 15:13:55 +01:00
91db5b68ff [gve] Set descriptor and completion ring sizes when creating queues
The "create TX queue" and "create RX queue" commands have fields for
the descriptor and completion ring sizes, which are currently left
unpopulated since they are not required for the original GQI-QPL
operating mode.

Populate these fields, and allow for the possibility that a transmit
completion ring exists (which will be the case when using the DQO
operating mode).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-29 15:13:55 +01:00
048a346705 [gve] Add concept of operating mode
The GVE family supports two incompatible descriptor queue formats:

  * GQI: in-order descriptor queues
  * DQO: out-of-order descriptor queues

and two addressing modes:

  * QPL: pre-registered queue page list addressing
  * RDA: raw DMA addressing

All four combinations (GQI-QPL, GQI-RDA, DQO-QPL, and DQO-RDA) are
theoretically supported by the Linux driver, which is essentially the
only public reference provided by Google.  The original versions of
the GVE NIC supported only GQI-QPL mode, and so the iPXE driver is
written to target this mode, on the assumption that it would continue
to be supported by all models of the GVE NIC.

This assumption turns out to be incorrect: Google does not deem it
necessary to retain backwards compatibility.  Some newer machine types
(such as a4-highgpu-8g) support only the DQO-RDA operating mode.

Add a definition of operating mode, and pass this as an explicit
parameter to the "configure device resources" admin queue command.  We
choose a representation that subtracts one from the value passed in
this command, since this happens to allow us to decompose the mode
into two independent bits (one representing the use of DQO descriptor
format, one representing the use of QPL addressing).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-29 15:13:55 +01:00
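
The decomposition described above works out neatly, assuming the admin-command values follow the Linux driver's convention (GQI-RDA = 1, GQI-QPL = 2, DQO-RDA = 3, DQO-QPL = 4); the macro and function names below are illustrative:

  /* Internal mode = (admin command value - 1):
   *   GQI-RDA -> 0b00, GQI-QPL -> 0b01, DQO-RDA -> 0b10, DQO-QPL -> 0b11
   */
  #define GVE_MODE_QPL 0x1        /* uses queue page list addressing */
  #define GVE_MODE_DQO 0x2        /* uses out-of-order (DQO) descriptors */

  static inline unsigned int gve_mode_sketch ( unsigned int cmd_value ) {
          return ( cmd_value - 1 );
  }
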
610089b98e [gve] Remove separate concept of "packet descriptor"
The Linux driver occasionally uses the terminology "packet descriptor"
to refer to the portion of the descriptor excluding the buffer
address.  This is not a helpful separation, and merely adds
complexity.

Simplify the code by removing this artificial separation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-29 15:12:54 +01:00
ee9aea7893 [gve] Parse option list returned in device descriptor
Provide space for the device to return its list of supported options.
Parse the option list and record the existence of each option in a
support bitmask.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-09-26 12:02:03 +01:00
6464f2edb8 [bnxt] Add error recovery support
Add support to advertise adapter error recovery support to the
firmware.  Implement error recovery operations if adapter fault is
detected.  Refactor memory allocation to better align with probe and
open functions.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-09-18 13:25:07 +01:00
969ce2c559 [efi] Use current boot option as a fallback for obtaining the boot URI
Some systems (observed with a Lenovo X1) fail to populate the loaded
image device path with a Uri() component when performing a UEFI HTTP
boot, instead creating a broken loaded image device path that
represents a DHCP+TFTP boot that has not actually taken place.

If no URI is found within the loaded image device path, then fall back
to looking for a URI within the current boot option.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-29 12:34:17 +01:00
c10da8b53c [efi] Add ability to extract device path from an EFI load option
An EFI boot option (stored in a BootXXXX variable) comprises an
EFI_LOAD_OPTION structure, which includes some undefined number of EFI
device paths.  (The structure is extremely messy and awkward to parse
in C, but that's par for the course with EFI.)

Add a function to extract the first device path from an EFI load
option, along with wrapper functions to read and extract the first
device path from an EFI boot variable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-29 12:34:17 +01:00
5bec2604a3 [libc] Add wcsnlen()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-28 15:12:41 +01:00
61b4585e2a [efi] Drag in MNP driver whenever SNP driver is present
The chainloaded-device-only "snponly" driver already drags in support
for driving SNP, NII, and MNP devices, on the basis that the user
generally doesn't care which UEFI API is used and just wants to boot
from the same network device that was used to load iPXE.

The multi-device "snp" driver already drags in support for driving SNP
and NII devices, but does not drag in support for MNP devices.

There is essentially zero code size overhead to dragging in support
for MNP devices, since this support is always present in any iPXE
application build anyway (as part of the code to download
"autoexec.ipxe" prior to installing our own drivers).

Minimise surprise by dragging in support for MNP devices whenever
using the "snp" driver, following the same reasoning used for the
"snponly" driver.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-27 13:12:11 +01:00
a53ec44932 [bnxt] Update CQ doorbell type
Update completion queue doorbell to a non-arming type, since polling
is used.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-08-13 12:36:20 +01:00
8460dc4e8f [dwgpio] Use fdt_reg() to get GPIO port numbers
DesignWare GPIO port numbers are represented as unsized single-entry
regions.  Use fdt_reg() to obtain the GPIO port number, rather than
requiring access to a region cell size specification stored in the
port group structure.

This allows the field name "regs" in the port group structure to be
repurposed to hold the I/O register base address, which then matches
the common usage in other drivers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-07 15:49:12 +01:00
88ba011764 [fdt] Provide fdt_reg() for unsized single-entry regions
Many region types (e.g. I2C bus addresses) can only ever contain a
single region with no size cells specified.  Provide fdt_reg() to
reduce boilerplate in this common use case.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-07 15:49:09 +01:00
9d4a2ee353 [cmdline] Show commands in alphabetical order
Commands were originally ordered by functional group (e.g. keeping the
image management commands together), with arrays used to impose a
functionally meaningful order within the group.

As the number of commands and functional groups has expanded over the
years, this has become essentially useless as an organising principle.
Switch to sorting commands alphabetically (using the linker table
mechanism).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-06 16:34:45 +01:00
332241238e [digest] Treat inability to acquire an image as a fatal error
The "md5sum" and "sha1sum" commands were originally intended solely as
debugging utilities, and would return success (with a warning message)
even if the specified images did not exist.

To minimise surprise and to be consistent with other commands, treat
the inability to acquire an image as a fatal error.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-06 15:21:14 +01:00
6fa901530a [digest] Add "--set" option to store digest value in a setting
Allow the result of a digest calculation to be stored in a named
setting.  This allows for digest verification in scripts using e.g.:

  set expected:hexraw cb05def203386f2b33685d177d9f04e3e3d70dd4
  sha1sum --set actual 1mb
  iseq ${expected} ${actual} || goto checksum_bad

Note that digest verification alone cannot be used to set the trusted
execution status of an image.  The only way to mark an image as
trusted is to use the "imgverify" command.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-06 14:07:00 +01:00
f5467d69db [github] Extend sponsorship link
Add Christian Nilsson <nikize@gmail.com> as a project sponsorship
recipient, to reflect the enormous amount of time invested in
responding to issues and pull requests.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-06 13:31:00 +01:00
f45782f9f3 [digest] Add commands for all enabled digest algorithms
Add "sha256sum", "sha512sum", and similar commands.  Include these new
commands only when DIGEST_CMD is enabled in config/general.h and the
corresponding algorithm is enabled in config/crypto.h.

Leave "mdsum" and "sha1sum" included whenever only DIGEST_CMD is
enabled, to avoid potentially breaking backwards compatibility with
builds that disabled MD5 or SHA-1 as a TLS or X.509 digest algorithm,
but would still have expected those commands to be present.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-06 13:17:25 +01:00
2e4e1f7e9e [dwgpio] Add driver for the DesignWare GPIO controller
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-05 14:39:56 +01:00
90fe3a2924 [gpio] Add a framework for GPIO controllers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-05 13:54:27 +01:00
5f10b74555 [fdt] Use phandle as device location
Consumption of phandles will be in the form of locating a functional
device (e.g. a GPIO device, or an I2C device, or a reset controller)
by phandle, rather than locating the device tree node to which the
phandle refers.

Repurpose fdt_phandle() to obtain the phandle value (instead of
searching by phandle), and record this value as the bus location
within the generic device structure.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-04 14:52:00 +01:00
f7a1e9ef8e [dwmac] Show core version in debug messages
Read and display the core version immediately after mapping the MMIO
registers, to provide a basic sanity check that the registers have
been correctly mapped and the core is not held in reset.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-30 15:59:38 +01:00
01b1028d4e [bnxt] Remove unnecessary test_if macro
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-30 14:08:25 +01:00
6ca7a560a4 [bnxt] Remove unnecessary I/O macros
Remove unnecessary driver-specific macros.  Use the standard
pci_read_config_xxx(), pci_write_config_xxx(), and writel()/writeq() calls.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-07-30 14:03:51 +01:00
be551d420e [serial] Explicitly initialise serial console UART to NULL
When debugging is enabled for the device tree or memory map parsing
code, the active serial console UART variable will be accessed during
early initialisation, before the .bss section has been zeroed.

Place this variable in the .data section (by providing an explicit
initialiser), so that reading this variable is well defined even
during early initialisation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-30 13:40:36 +01:00
a814c46059 [riscv] Place explicitly zero-initialised variables in the .data section
Variables in the .bss section cannot be relied upon to have zero
values during early initialisation, before we have relocated ourselves
to somewhere suitable in RAM and zeroed the .bss section.

Place any explicitly zero-initialised variables in the .data section
rather than in .bss, so that we can rely on their values even during
this early initialisation stage.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-30 13:15:11 +01:00
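
An illustration of the distinction, assuming the build is arranged (e.g. via -fno-zero-initialized-in-bss or equivalent) so that explicit zero initialisers are honoured; the variable names are hypothetical:

  #include <stddef.h>

  /* Explicit initialiser: lives in .data, readable before .bss is zeroed */
  static void *early_pointer = NULL;

  /* No initialiser: lives in .bss, must not be touched during early init */
  static void *late_pointer;
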
5bda1727b4 [riscv] Allow for poisoning .bss section before early initialisation
On startup, we may be running from read-only memory, and therefore
cannot zero the .bss section (or write to the .data section) until we
have parsed the system memory map and relocated ourselves to somewhere
suitable in RAM.  The code that runs during this early initialisation
stage must be carefully written to avoid writing to the .data section
and to avoid reading from or writing to the .bss section.

Detecting code that erroneously writes to the .data or .bss sections
is relatively easy since running from read-only memory (e.g. via
QEMU's -pflash option) will immediately reveal the bug.  Detecting
code that erroneously reads from the .bss section is harder, since in
a freshly powered-on machine (or in a virtual machine) there is a high
probability that the contents of the memory will be zero even before
we explicitly zero out the section.

Add the ability to fill the .bss section with an invalid non-zero
value to expose bugs in early initialisation code that erroneously
relies upon variables in .bss before the section has been zeroed.  We
use the value 0xeb55eb55eb55eb55 ("EBSS") since this is immediately
recognisable as a value in a crash dump, and will trigger a page fault
if dereferenced since the address is in a non-canonical form.

Poisoning the .bss can be done only when the image is known to already
reside in writable memory.  It will overwrite the relocation records,
and so can be done only on a system where relocation is known to be
unnecessary (e.g. because paging is supported).  We therefore do not
enable this behaviour by default, but leave it as a configurable
option via the config/fault.h header.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-30 12:31:15 +01:00
e3a6e9230c [undi] Assume that legacy interrupts are broken for any PCIe device
PCI Express devices do not have physical INTx output signals, and on
modern motherboards there is unlikely to be any interrupt controller
with physical interrupt input signals.  There are multiple levels of
abstraction involved in emulating the legacy INTx interrupt mechanism:
the PCIe device sends Assert_INTx and Deassert_INTx messages, PCIe
bridges and switches must collate these virtual wires, and the root
complex must map the virtual wires into messages that can be
understood by the host's emulated 8259 PIC.

This complex chain of emulations is rarely tested on modern hardware,
since operating systems will invariably use MSI-X for PCI devices and
the I/O APIC for non-PCI devices such as the real-time clock.  Since
the legacy interrupt emulation mechanism is rarely tested, it is
frequently unreliable.  We have encountered many issues over the years
in which legacy interrupts are simply not raised as expected, even
when inspection shows that the device believes it is asserting an
interrupt and the controller believes that the interrupt is enabled.

We already maintain a list of devices that are known to fail to
generate legacy interrupts correctly.  This list is based on the PCI
vendor and device IDs, which is not necessarily a fair test since the
root cause may be a board-level misconfiguration rather than a
device-level fault.

Assume that any PCI Express device has a high chance of not being able
to raise legacy interrupts reliably.  This is a relatively intrusive
change since it will affect essentially all modern network devices,
but should hopefully fix all future issues with non-functional legacy
interrupts, without needing to constantly grow the list of known
broken devices.

If some PCI Express devices are found to fail when operated in polling
mode, then this change will need to be revisited.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-24 14:14:41 +01:00
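
A sketch of the check this implies, assuming iPXE's pci_find_capability() helper and the standard PCI Express capability ID of 0x10:

  /* Any device with a PCI Express capability is assumed to have
   * unreliable legacy INTx emulation, so fall back to polling.
   */
  if ( pci_find_capability ( pci, 0x10 /* PCI Express capability ID */ ) ) {
          /* ... treat legacy interrupts as unreliable and poll instead ... */
  }
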
65b8a6e459 [pxeprefix] Display PCI vendor and device ID in PXE startup banner
In the case of a misbehaving PXE stack, it is often useful to know the
PCI vendor and device IDs (e.g. for adding the device to the list of
devices with known broken support for generating interrupts).

The PCI vendor and device IDs are already available to the prefix code,
and so can trivially be printed out.  Add this information to the PXE
prefix startup banner.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-23 16:11:09 +01:00
fb082bd4cd [fdt] Add ability to locate node by phandle
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-22 13:39:13 +01:00
e01e5ff7c6 [dwusb] Add driver for DesignWare USB3 host controller
Add a basic driver for the DesignWare USB3 host controller as found in
the Lichee Pi 4A.

This driver covers only the DesignWare host controller hardware.  On
the Lichee Pi 4A, this is sufficient to get the single USB root hub
port (exposed internally via the SODIMM connector) up and running.

The driver does not yet handle the various GPIOs that control power
and signal routing for the Lichee Pi 4A's onboard VL817 USB hub and
the four physical USB-A ports.  This therefore leaves the USB hub and
the USB-A ports unpowered, and the USB2 root hub port routed to the
physical USB-C port.  Devices plugged in to the USB-A ports will not
be powered up, and a device plugged in to the USB-C port will
enumerate as a USB2 device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-21 15:55:13 +01:00
6c42ea1275 [xhci] Allow for non-PCI xHCI host controllers
Allow for the existence of xHCI host controllers where the underlying
hardware is not a PCI device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-21 15:33:58 +01:00
eca97c2ee2 [xhci] Use root hub port number to determine slot type
We currently use the downstream hub's port number to determine the
xHCI slot type for a newly connected USB device.  The downstream hub
port number is irrelevant to the xHCI controller's supported protocols
table: the relevant value is the number of the root hub port through
which the device is attached.

Fix by using the root hub port number instead of the immediate parent
hub's port number.

This bug has not previously been detected since the slot type for the
first N root hub ports will invariably be zero to indicate that these
are USB ports.  For any xHCI controller with a sufficiently large
number of root hub ports, the code would therefore end up happening to
calculate the correct slot type value despite using an incorrect port
number.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-18 14:58:56 +01:00
8a8904aadd [efi] Check only the non-extended WaitForKey event
The WaitForKeyEx event in EFI_SIMPLE_TEXT_INPUT_EX_PROTOCOL is
redundant: by definition it has to signal under exactly the same
conditions as the WaitForKey event in EFI_SIMPLE_TEXT_INPUT_PROTOCOL
and cannot provide any "extended" information since EFI events do not
convey any information beyond their own occurrence.

UEFI keyboard drivers such as Ps2KeyboardDxe and UsbKbDxe invariably
use a single notification function to implement both events.  The
console multiplexer driver ConSplitterDxe uses a single notification
function for both events, which ends up checking only the WaitForKey
event on the underlying console devices.  (Since all console input is
routed through the console multiplexer, this means that in practice
nothing will ever check the underlying devices' WaitForKeyEx events.)

UEFI console consumers such as the UEFI shell tend to use only the
EFI_SIMPLE_TEXT_INPUT_PROTOCOL instance provided as ConIn in the EFI
system table.  With the exception of the UEFI text editor (the "edit"
command in the UEFI shell), almost nothing bothers to open the
EFI_SIMPLE_TEXT_INPUT_EX_PROTOCOL instance on the same handle.

The Lenovo ThinkPad T14s Gen 5 has a very peculiar firmware bug.
Enabling the "UEFI Wi-Fi Network Boot" feature in the BIOS setup will
cause the completely unrelated WaitForKeyEx event pointer to be
overwritten with a pointer to a FAT_DIRENT structure representing the
"BOOT" directory in the EFI system partition.  This happens with 100%
repeatability.  It is not necessary to attempt to boot from Wi-Fi: it
is only necessary to have the feature enabled.  The root cause is
unknown, but is presumably an uninitialised pointer or similar
memory-related bug in Lenovo's UEFI Wi-Fi driver.

Work around this Lenovo firmware bug by checking only the WaitForKey
event, ignoring the WaitForKeyEx event even if we will subsequently
use ReadKeyStrokeEx() to read the keypress.  Since almost all other
UEFI console consumers use only WaitForKey, this ensures that we will
be using code paths that the firmware vendor is likely to have tested
at least once.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-16 13:20:29 +01:00
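
A sketch of the workaround pattern, using only standard EFI types (the function name and parameter passing are assumptions, not iPXE's actual code):

  #include <ipxe/efi/efi.h>
  #include <ipxe/efi/Protocol/SimpleTextIn.h>

  /* Sketch: poll for input via the plain WaitForKey event only, even
   * if the key itself will later be read via ReadKeyStrokeEx().
   */
  static int key_ready_sketch ( EFI_BOOT_SERVICES *bs,
                                EFI_SIMPLE_TEXT_INPUT_PROTOCOL *conin ) {
          /* CheckEvent() returns EFI_SUCCESS iff the event is signalled */
          return ( bs->CheckEvent ( conin->WaitForKey ) == EFI_SUCCESS );
  }
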
8701863a17 [efi] Allow compiler to perform type checks on EFI_EVENT
As with EFI_HANDLE, the EFI headers define EFI_EVENT as a void
pointer, rendering EFI_EVENT compatible with a pointer to itself and
hence guaranteeing that pointer type bugs will be introduced.

Redefine EFI_EVENT as a pointer to an anonymous structure (as we
already do for EFI_HANDLE) to allow the compiler to perform type
checking as expected.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-15 16:57:25 +01:00
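
The change amounts to something like the following (the tag name is illustrative):

  /* EDK2 defines EFI_EVENT as "VOID *", which the compiler will happily
   * interchange with any other pointer type.  Redefining it as a pointer
   * to an incomplete structure turns such mixups into compile-time
   * errors while remaining ABI-compatible:
   */
  typedef struct efi_event_opaque *EFI_EVENT;
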
1e3fb1b37e [init] Show initialisation function names in debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-15 14:10:33 +01:00
7ac4b3c6f1 [efi] Assume that vendor wireless drivers are unusable via SNP
The UEFI model for wireless network boot cannot sensibly be described
without cursing.  Commit 758a504 ("[efi] Inhibit calls to Shutdown()
for wireless SNP devices") attempts to work around some of the known
issues.

Experimentation shows that on at least some platforms (observed with a
Lenovo ThinkPad T14s Gen 5) the vendor SNP driver is broken to the
point of being unusable in anything other than the single use case
envisioned by the firwmare authors.  Doing almost anything directly
via the SNP protocol interface has a greater than 50% chance of
locking up the system.

Assume, in the absence of any evidence to the contrary so far, that
vendor SNP drivers for wireless network devices are so badly written
as to be unusable.  Refuse to even attempt to interact with these
drivers via the SNP or NII protocol interfaces.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-15 09:12:54 +01:00
c3376f8645 [efi] Drop to external TPL for calls to ConnectController()
There is nothing in the current versions of the UEFI specification
that limits the TPL at which we may call ConnectController() or
DisconnectController().  However, at least some platforms (observed
with a Lenovo ThinkPad T14s Gen 5) will occasionally and unpredictably
lock up before returning from ConnectController() if called at a TPL
higher than TPL_APPLICATION.

Work around whatever defect is present on these systems by dropping to
the current external TPL for all calls to ConnectController() or
DisconnectController().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-14 12:19:15 +01:00
c01c3215dc [efi] Provide efi_tpl_name() for transcribing TPLs in debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-14 12:15:08 +01:00
434462a93e [riscv] Ensure coherent DMA allocations do not cross cacheline boundaries
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-11 13:50:41 +01:00
d539a420df [riscv] Support the standard Svpbmt extension for page-based memory types
Set the appropriate Svpbmt type bits within page table entries if the
extension is supported.  Tested only in QEMU so far, due to the lack
of availability of real hardware supporting Svpbmt.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-11 12:24:02 +01:00
2aacb346ca [riscv] Create coherent DMA mapping of 32-bit address space on demand
Reuse the code that creates I/O device page mappings to create the
coherent DMA mapping of the 32-bit address space on demand, instead of
constructing this mapping as part of the initial page table.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-11 12:23:51 +01:00
0611ddbd12 [riscv] Use 1GB pages for I/O device mappings
All 64-bit paging schemes support at least 1GB "gigapages".  Use these
to map I/O devices instead of 2MB "megapages".  This reduces the
number of consumed page table entries, increases the visual similarity
of I/O remapped addresses to the underlying physical addresses, and
opens up the possibility of reusing the code to create the coherent
DMA map of the 32-bit address space.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-11 12:05:52 +01:00
c2cdc1d31e [dwmac] Add driver for DesignWare Ethernet MAC
Add a basic driver for the DesignWare Ethernet MAC network interface
as found in the Lichee Pi 4A.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-10 14:39:07 +01:00
bbabde8ff8 [riscv] Invalidate data cache on completed RX DMA buffers
The data cache must be invalidated twice for RX DMA buffers: once
before passing ownership to the DMA device (in case the cache happens
to contain dirty data that will be written back at an undefined future
point), and once after receiving ownership from the DMA device (in
case the CPU happens to have speculatively accessed data in the buffer
while it was owned by the hardware).

Only the used portion of the buffer needs to be invalidated after
completion, since we do not care about data within the unused portion.

Update the DMA API to include the used length as an additional
parameter to dma_unmap(), and add the necessary second cache
invalidation pass to the RISC-V DMA API implementation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-10 14:39:07 +01:00
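
A sketch of the two invalidation points for a receive I/O buffer (cache_invalidate() is a placeholder for the architecture's invalidate primitive, and used_len is an assumed local; only the shape of the sequence is the point):

  /* Before passing the buffer to the device: invalidate so that no
   * dirty cachelines can later be written back over the DMA'd data.
   */
  cache_invalidate ( iobuf->data, iob_tailroom ( iobuf ) );

  /* ... hardware writes the received packet via DMA ... */

  /* After completion (now via dma_unmap() with the used length):
   * invalidate again, in case the CPU speculatively refilled lines
   * while the device owned the buffer.  Only the used portion matters.
   */
  cache_invalidate ( iobuf->data, used_len );
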
634d9abefb [riscv] Add optimised TCP/IP checksumming
Add a RISC-V assembly language implementation of TCP/IP checksumming,
which is around 50x faster than the generic algorithm.  The main loop
checksums aligned xlen-bit words, using almost entirely compressible
instructions and accumulating carries in a separate register to allow
folding to be deferred until after all loops have completed.

Experimentation on a C910 CPU suggests that this achieves around four
bytes per clock cycle, which is comparable to the x86 implementation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-10 13:32:45 +01:00
101ef74a6e [riscv] Provide a DMA API implementation for RISC-V bare-metal systems
Provide an implementation of dma_map() that performs cache clean or
invalidation as required, and an implementation of dma_alloc() that
returns virtual addresses within the coherent mapping of the 32-bit
physical address space.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-09 11:07:37 +01:00
22de0c4edf [dma] Use virtual addresses for dma_map()
Cache management operations must generally be performed on virtual
addresses rather than physical addresses.

Change the address parameter in dma_map() to be a virtual address, and
make dma() the API-level primitive instead of dma_phys().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-08 15:13:19 +01:00
06083d2676 [build] Handle isohybrid with xorrisofs
Generating an isohybrid image with `xorrisofs` is supposed to happen
with option `-isohybrid-gpt-basdat`, not command `isohybrid`.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-08 11:49:16 +01:00
e223b32511 [riscv] Support explicit cache management operations on I/O buffers
On platforms where DMA devices are not in the same coherency domain as
the CPU cache, it is necessary to be able to explicitly clean the
cache (i.e. force data to be written back to main memory) and
invalidate the cache (i.e. discard any cached data and force a
subsequent read from main memory).

Add support for cache management via the standard Zicbom extension or
the T-Head cache management operations extension, with the supported
extension detected on first use.

Support cache management operations only on I/O buffers, since these
are guaranteed to not share cachelines with other data.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-07 16:38:23 +01:00
6a75115a74 [riscv] Add support for detecting T-Head vendor extensions
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-07 16:38:23 +01:00
19f1407ad9 [iobuf] Ensure I/O buffer data sits within unshared cachelines
On platforms where DMA devices are not in the same coherency domain as
the CPU cache, we must ensure that DMA I/O buffers do not share
cachelines with other data.

Align the start and end of I/O buffers to IOB_ZLEN, which is larger
than any cacheline size we expect to encounter.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-07 16:18:04 +01:00
c21443f0b9 [uaccess] Allow for coherent DMA mapping of the 32-bit address space
On platforms where DMA devices are not in the same coherency domain as
the CPU cache, it is necessary to create page table entries where the
translations are marked as uncacheable.

We choose to place iPXE within the low 4GB of memory (since 32-bit DMA
devices are still reasonably common even on systems with 64-bit CPUs).
We therefore need to cover only the low 4GB of memory with these page
table entries.

Update virt_to_phys() to allow for the existence of such a mapping,
assuming that iPXE itself will always reside within the top 4GB of the
64-bit virtual address space (and therefore that the DMA mapping must
lie somewhere below this in the negative virtual address space).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-04 16:10:51 +01:00
d75d10df16 [riscv] Create coherent DMA mapping for low 4GB of address space
Use PTEs 256-259 to create a mapping of the 32-bit physical address
space with attributes suitable for coherent DMA mappings.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-04 16:10:51 +01:00
3fd54e4f3a [riscv] Construct invariant portions of page table outside the loop
The page table entries for the identity map vary according to the
paging level in use, and so must be constructed within the loop used
to detect the maximum supported paging level.  Other page table
entries are invariant between paging levels, and so may be constructed
just once before entering the loop.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-04 16:10:51 +01:00
6bc55d65b1 [bnxt] Update supported devices array
Add support for new device IDs. Remove device IDs which were never
in use.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-07-02 16:18:33 -07:00
0020627777 [bnxt] Update device descriptions
Use human-readable strings for dev_description in the PCI_ROM array.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-07-01 16:05:34 -07:00
126366ac47 [bnxt] Remove VLAN stripping logic
Remove logic that programs the hardware to strip out VLAN from RX
packets.  Do not drop packets due to VLAN mismatch and allow the upper
layer to decide whether to discard the packets.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-06-29 14:21:51 +01:00
4262328c13 [github] Add sponsorship link
iPXE is released under the GNU GPL and is 100% open source software.
There are no "premium editions", no in-app advertisements, and no
hidden costs.  The fully public version published to GitHub is and
always will be the definitive and only version of iPXE.

Many large features in iPXE have been commercially funded within this
open source model, with features being published upstream as soon as
they are complete and made available for the whole world to use, not
restricted for use only by the customer funding that particular piece
of development work.

There has not to date been any funding model for smaller pieces of
work, such as occasional code review or guaranteed attention to bug
reports.  The overhead of establishing a commercial relationship is
usually too high to be worthwhile for very small units of work.

The GitHub sponsorship mechanism provides a framework for efficiently
handling small commercial requests (or individual tokens of thanks).
Add a FUNDING.yml file to provide a convenient way for anyone who
wants to support the ongoing open source development of iPXE to do so.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-26 16:33:58 +01:00
54392f0d70 [bnxt] Increase Tx descriptors
Increase TX and CMP descriptor counts.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-06-25 14:05:33 +01:00
e5953ed7e6 [build] Disable use of common symbols
We no longer have any requirement for common symbols.  Disable common
symbols via the -fno-common compiler option, and simplify the test for
support of -fdata-sections (which can return a false negative when
common symbols are enabled).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-24 14:40:57 +01:00
8df3b96402 [build] Allow for the existence of small-data sections
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-24 14:40:57 +01:00
d3e10ebd35 [legacy] Allocate legacy driver .bss-like segments at probe time
Some legacy drivers use large static allocations for transmit and
receive buffers.  To avoid bloating the .bss segment, we currently
implement these as a single common symbol named "_shared_bss" (which
is permissible since only one legacy driver may be active at any one
time).

Switch to dynamic allocation of these .bss-like segments, to avoid the
requirement for using common symbols.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-24 13:41:51 +01:00
6ea800ab54 [legacy] Rename the global legacy NIC to "legacy_nic"
We currently have contexts in which the local variable "nic" is a
pointer to the global variable also called "nic".  This complicates
the creation of macros.

Rename the global variable to "legacy_nic" to reduce pollution of the
global namespace and to allow for the creation of macros referring to
fields within this global variable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-24 13:41:51 +01:00
d0c02e0df8 [legacy] Allocate extra padding in receive buffers
Allow for legacy drivers that include VLAN tags or CRCs within their
received packets.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-24 13:41:51 +01:00
97f40c5fcc [pxe] Use a weak symbol for isapnp_read_port
Use a weak symbol for isapnp_read_port used in pxe_preboot.c, rather
than relying on a common symbol.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-24 13:34:41 +01:00
c33ff76d8d [fdtcon] Add basic support for FDT-based system serial console
Add support for probing a device based on the path or alias found in
the "/chosen/stdout-path" node, and using a consequently instantiated
UART as the default serial console.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-23 23:35:27 +01:00
9ada09c919 [dwuart] Read input clock frequency from the device tree
The 16550 design includes a programmable 16-bit clock divider for an
arbitrary input clock, requiring knowledge of the input clock
frequency in order to calculate the divider value for a given baud
rate.  The 16550 UARTs in an x86 PC will always have a 1.8432 MHz
input clock.  Non-x86 systems may have other input clock frequencies.

Define the input clock frequency as a property of a 16550 UART, and
read the value from the device tree "clock-frequency" property.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-23 22:56:38 +01:00
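
The divider calculation this enables (standard 16550 behaviour):

  /* 16550 divisor latch value for a given baud rate */
  static unsigned int uart_divisor ( unsigned int clock_hz,
                                     unsigned int baud ) {
          return ( clock_hz / ( 16 * baud ) );
  }

  /* e.g. 1843200 Hz / ( 16 * 115200 ) = 1 on a PC, whereas an assumed
   * 24 MHz SoC clock gives 24000000 / ( 16 * 115200 ) = 13
   */
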
0ed1dea7f4 [uart] Wait for 16550 UART to become idle before modifying LCR
Some implementations of 16550-compatible UARTs (e.g. the DesignWare
UART) are known to ignore writes to the line control register while
the transmitter is active.

Wait for the transmitter to become empty before attempting to write to
the line control register.
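
A minimal sketch of the workaround, using the standard 16550 register
layout and hypothetical uart_read()/uart_write() accessors rather
than the actual driver code:

    #include <stdint.h>

    #define UART_LCR 3              /* Line control register offset */
    #define UART_LSR 5              /* Line status register offset */
    #define UART_LSR_TEMT 0x40      /* Transmitter empty */

    /* Hypothetical register accessors standing in for the driver's own */
    extern uint8_t uart_read ( unsigned int reg );
    extern void uart_write ( unsigned int reg, uint8_t value );

    static void uart_set_lcr ( uint8_t lcr ) {
            /* Wait for the transmitter to be completely idle, since
             * some 16550 clones ignore LCR writes while transmitting.
             */
            while ( ! ( uart_read ( UART_LSR ) & UART_LSR_TEMT ) )
                    /* spin */;
            uart_write ( UART_LCR, lcr );
    }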

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-23 22:56:09 +01:00
2ce1b185b2 [serial] Allow platform to specify mechanism for identifying console
Allow the platform configuration to provide a mechanism for
identifying the serial console UART.  Provide two globally available
mechanisms: "null" (i.e. no serial console), and "fixed" (i.e. use
whatever is specified by COMCONSOLE in config/serial.h).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-23 16:53:13 +01:00
5d9f20bbd6 [dwuart] Add "ns16550a" compatible device ID
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-23 15:10:55 +01:00
d1823eb677 [riscv] Inhibit SBI console when a serial console is active
When a native serial driver is enabled for the system console device
specified via "/chosen/stdout-path", it is very likely that this will
correspond to the same physical serial port used for the SBI debug
console.

Inhibit input and output via the SBI console whenever a serial console
is active, to avoid duplicated output characters and unpredictable
input behaviour.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-23 15:07:07 +01:00
25fa01822b [riscv] Serialise MMIO accesses with respect to each other
iPXE drivers have been written with the implicit assumption that MMIO
writes are allowed to be posted but that an MMIO register read or
write after another MMIO register write will always observe the
effects of the first write.

For example: after having written a byte to the transmit holding
register (THR) of a 16550 UART, it is expected that any subsequent
read of the line status register (LSR) will observe a value consistent
with the occurrence of the write.

RISC-V does not seem to provide any ordering guarantees between
accesses to different registers within the same MMIO device.  Add
fences as part of the MMIO accessors to provide the assumed
guarantees.

Use "fence io, io" before each MMIO read or write to enforce full
serialisation of MMIO accesses with respect to each other.  This is
almost certainly more conservative than is strictly necessary.
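
A rough sketch of the accessor pattern described (simplified to byte
accesses; the names here are illustrative rather than the real
macros):

    #include <stdint.h>

    /* Order this access against all other I/O accesses, and prevent
     * the compiler from reordering around the barrier.
     */
    #define iofence() __asm__ __volatile__ ( "fence io, io" : : : "memory" )

    static inline uint8_t readb_fenced ( volatile uint8_t *addr ) {
            iofence();
            return *addr;
    }

    static inline void writeb_fenced ( uint8_t data, volatile uint8_t *addr ) {
            iofence();
            *addr = data;
    }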

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-22 09:45:09 +01:00
53a3befb69 [dwuart] Add a basic driver for the Synopsys DesignWare UART
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-21 23:34:32 +01:00
cca1cfd49e [uart] Allow for dynamically registered 16550 UARTs
Use the generic UART driver-private data pointer, rather than
embedding the generic UART within the 16550 UART structure.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-21 23:34:32 +01:00
71b4bfb6b2 [uart] Add support for MMIO-accessible 16550 UARTs
16550 UARTs exist on non-x86 platforms but will be accessible via MMIO
rather than port I/O.  It is possible to encounter MMIO-mapped 16550
UARTs on x86 platforms, but there is no real requirement to support
them in iPXE since the standard COM1, COM2, etc. ports have been
present on every PC-compatible machine since 1981.

Assume for now that accessing 16550 UART registers requires
inb()/outb() on x86 and readb()/writeb() on other architectures.

Allow for the existence of a register shift on MMIO-mapped 16550
UARTs, since modern SoCs tend to treat register addresses as being
aligned to either 32-bit or 64-bit boundaries.
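
As a sketch of the register shift handling (the structure and helper
names are illustrative, not the actual driver definitions):

    #include <stdint.h>

    struct ns16550_mmio {
            volatile uint8_t *base; /* MMIO base address */
            unsigned int shift;     /* log2 of the register stride */
    };

    /* Registers on many SoCs are spaced on 32-bit (shift=2) or 64-bit
     * (shift=3) boundaries rather than being byte-packed (shift=0).
     */
    static inline uint8_t ns16550_readb ( struct ns16550_mmio *uart,
                                          unsigned int reg ) {
            return uart->base[ reg << uart->shift ];
    }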

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-20 12:52:04 +01:00
6c8fb4b89d [uart] Allow for the existence of non-16550 UARTs
Remove the assumption that all platforms use a fixed number of 16550
UARTs identifiable by a simple numeric index.  Create an abstraction
allowing for dynamic instantiation and registration of any number of
arbitrary UART models.

The common case of the serial console on x86 uses a single fixed UART
specified at compile time.  Avoid unnecessarily dragging in the
dynamic instantiation code in this use case by allowing COMCONSOLE to
refer to a single static UART object representing the relevant port.

When selecting a UART by command-line argument (as used in the
"gdbstub serial <port>" command), allow the UART to be specified as
either a numeric index (to retain backwards compatibility) or a
case-insensitive port name such as "COM2".

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-20 12:52:04 +01:00
60e167c00b [uart] Remove ability to use frame formats other than 8n1
In the context of serial consoles, the use of any frame formats other
than the standard 8 data bits, no parity, and one stop bit is so rare
as to be nonexistent.

Remove the almost certainly unused support for custom frame formats.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-17 15:44:12 +01:00
5783a10f72 [riscv] Write SBI console output to early UART, if enabled
The early UART is an optional feature used to obtain debug output from
the prefix before iPXE is able to parse the device tree.

Extend this feature to also cover any console output that iPXE
attempts to send to the SBI console, on the basis that the purpose of
the early UART is to provide an output-only device for situations in
which there is no functional SBI console.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-12 12:57:26 +01:00
41e65df19d [riscv] Maximise barrier effects of memory fences
The RISC-V "fence" instruction encoding includes bits for predecessor
and successor input and output operations, separate from read and
write operations.  It is up to the CPU implementation to decide what
counts as I/O space rather than memory space for the purposes of this
instruction.

Since we do not expect fencing to be performance-critical, keep
everything as simple and reliable as possible by using the unadorned
"fence" instruction (equivalent to "fence iorw, iorw").

Add a memory clobber to ensure that the compiler does not reorder the
barrier.  (The volatile qualifier seems to already prevent reordering
in practice, but this is not guaranteed according to the compiler
documentation.)
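
In sketch form, the barrier described amounts to:

    /* "fence" with no operands is equivalent to "fence iorw, iorw";
     * the memory clobber stops the compiler itself from reordering
     * memory accesses across the barrier.
     */
    #define mb() __asm__ __volatile__ ( "fence" : : : "memory" )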

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-12 12:33:46 +01:00
7e96e5f2ef [fdt] Allow paths and aliases to be terminated with separator characters
Non-permitted name characters such as a colon are sometimes used to
separate alias names or paths from additional metadata, such as the
baud rate for a UART in the "/chosen/stdout-path" property.

Support the use of such alias names and paths by allowing any
character not permitted in a property name to terminate a property or
node name match.  (This is a very relaxed matching rule that will
produce false positive matches on invalid input, but this is unlikely
to cause problems in practice.)
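
As a rough illustration only (the helper and the permitted character
set here are simplified assumptions, not the actual parser), matching
an alias within a value such as "serial0:115200" might look like:

    #include <ctype.h>
    #include <string.h>

    /* Approximate the characters permitted in a property name as
     * alphanumerics plus ",._+-" (the real rules differ slightly).
     */
    static int fdt_name_char ( char c ) {
            return ( ( c != '\0' ) &&
                     ( isalnum ( ( unsigned char ) c ) ||
                       ( strchr ( ",._+-", c ) != NULL ) ) );
    }

    /* Accept "name" as a match if the remainder of the path has
     * ended or continues with any non-name character (such as the
     * ':' before a baud rate).
     */
    static int fdt_match_name ( const char *path, const char *name ) {
            size_t len = strlen ( name );
            return ( ( strncmp ( path, name, len ) == 0 ) &&
                     ( ! fdt_name_char ( path[len] ) ) );
    }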

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-11 16:18:36 +01:00
1de3aef78c [bnxt] Remove TX padding
Remove unnecessary TX padding.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-06-11 15:07:40 +01:00
3e8909cf5f [fdtmem] Limit relocation to 32-bit address space
Devices with only 32-bit DMA addressing are relatively common even on
systems with 64-bit CPUs.  Limit relocation of iPXE to 32-bit address
space so that I/O buffers and other DMA allocations will be accessible
by 32-bit devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-11 13:49:08 +01:00
c4a3d438e6 [dt] Allow for creation of standalone devices
We will want to be able to create the console device as early as
possible.  Refactor devicetree probing to remove the assumption that a
devicetree device must have a devicetree parent, and expose functions
to allow a standalone device to be created given only the offset of a
node within the tree.

The full device path is no longer trivial to construct with this
assumption removed.  The full path is currently used only for debug
messages.  Remove the stored full path, use just the node name for
debug messages, and ensure that the topology information previously
visible in the full path is reconstructible from the combined debug
output if needed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-11 13:02:20 +01:00
b5fb7353fa [ipv4] Add support for classless static routes
Add support for RFC 3442 classless static routes provided via DHCP
option 121.
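
For reference, each route in option 121 is encoded as a destination
prefix length, the significant octets of the destination, and the
router address.  For example (addresses chosen purely for
illustration), 192.168.10.0/24 via 10.0.0.1 would be encoded as:

    #include <stdint.h>

    /* RFC 3442 encoding of "192.168.10.0/24 via 10.0.0.1": one byte
     * of prefix length, ceil(24/8)=3 significant destination octets,
     * then the four octets of the router address.
     */
    static const uint8_t classless_route[] = {
            24,                     /* destination prefix length */
            192, 168, 10,           /* destination 192.168.10.0 */
            10, 0, 0, 1,            /* router 10.0.0.1 */
    };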

Originally-implemented-by: Hazel Smith <hazel.smith@leicester.ac.uk>
Originally-implemented-by: Raphael Pour <raphael.pour@hetzner.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-10 18:22:32 +01:00
e648d23fba [ipv4] Extend routing mechanism to handle non-default routes
Extend the definition of an IPv4 routing table entry to allow for the
expression of non-default gateways for specified off-link subnets, and
of on-link secondary subnets (where we can send directly to the
destination address even though our source address is not within the
subnet).

This more precise definition also allows us to correctly handle
routing in the (uncommon for iPXE) case when multiple network
interfaces are open concurrently and more than one interface has a
default gateway.

The common case of a single IPv4 address/netmask and a default gateway
now results in two routing table entries.  To retain backwards
compatibility with existing documentation (and to avoid on-screen
clutter), the "route" command prints default gateways on the same line
as the locally assigned address.  There is therefore no change in
output from the "route" command unless explicit additional (off-link
or on-link) routes are present.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-10 13:54:15 +01:00
96f5864660 [ipv4] Add self-tests for IPv4 routing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-10 13:54:15 +01:00
1ae75a3bde [test] Add infrastructure for test network devices
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-10 13:39:57 +01:00
5b3ebf8b24 [riscv] Support T-Head CPUs using non-standard Memory Attribute Extension
Xuantie/T-Head processors such as the C910 (as used in the Sipeed
Lichee Pi 4A) use the high bits of the PTE in a very non-standard way
that is incompatible with the RISC-V specification.

As per the "Memory Attribute Extension (XTheadMae)", bits 62 and 61
represent cacheability and "bufferability" (write-back cacheability)
respectively.  If we do not enable these bits, then the processor gets
incredibly confused at the point that paging is enabled.  The symptom
is that cache lines will occasionally fail to fill, and so reads from
any address may return unrelated data from a previously read cache
line for a different address.

Work around these hardware flaws by detecting T-Head CPUs (via the
"get machine vendor ID" SBI call), then reading the vendor-specific
SXSTATUS register to determine whether or not the vendor-specific
Memory Attribute Extension has been enabled by the M-mode firmware.
If it has, then set bits 61 and 62 in each page table entry that is
used to access normal memory.
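
In sketch form (detection of the CPU and of the extension is omitted,
and the names here are illustrative):

    #include <stdint.h>

    /* XTheadMae attribute bits for normal memory */
    #define XTHEAD_PTE_BUFFERABLE ( 1ULL << 61 )  /* write-back cacheable */
    #define XTHEAD_PTE_CACHEABLE  ( 1ULL << 62 )  /* cacheable */

    /* Mark a PTE covering normal memory as cacheable and bufferable;
     * applied only when the Memory Attribute Extension is enabled.
     */
    static inline uint64_t xthead_pte_normal ( uint64_t pte ) {
            return ( pte | XTHEAD_PTE_CACHEABLE | XTHEAD_PTE_BUFFERABLE );
    }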

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-02 14:19:15 +01:00
817145fe01 [riscv] Do not set executable bit in early UART page mapping
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-02 08:59:54 +01:00
7df005c4c6 [riscv] Add fences around early UART writes
Add a fence between the write to the UART transmit register and the
subsequent read from the transmit status register, to ensure that the
status correctly reflects the occurrence of the write.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-02 08:36:22 +01:00
88cffd75a9 [riscv] Zero SATP after any failed attempt to enable paging
The RISC-V specification states that "if SATP is written with an
unsupported mode, the entire write has no effect; no fields in SATP
are modified".  We currently rely on this specified behaviour when
calculating the early UART base address: if SATP has a non-zero value
then we assume that paging must be enabled.

The XuanTie C910 CPU (as used in the Lichee Pi 4A) does not conform to
this specified behaviour.  Writing SATP with an unsupported mode will
leave SATP.MODE as zero (i.e. bare physical addressing) but the write
to SATP.PPN will still take effect, leaving SATP with an illegal
non-zero value.

Work around this misbehaviour by explicitly writing zero to SATP if we
detect that the mode change has not taken effect (e.g. because the CPU
does not support the requested paging mode).
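
A simplified sketch of the check on RV64 (generic CSR inline assembly
rather than the actual prefix code, which is written in assembly):

    /* SATP.MODE occupies bits 63:60 on RV64 */
    #define SATP_MODE_SHIFT 60

    static void satp_workaround ( unsigned long satp_value ) {
            unsigned long satp;

            __asm__ __volatile__ ( "csrw satp, %0" : : "r" ( satp_value ) );
            __asm__ __volatile__ ( "csrr %0, satp" : "=r" ( satp ) );

            /* A conforming CPU leaves SATP untouched if the mode was
             * rejected; a non-conforming CPU may still have latched
             * the PPN.  Force SATP back to a legal zero value.
             */
            if ( ( ( satp >> SATP_MODE_SHIFT ) == 0 ) && ( satp != 0 ) )
                    __asm__ __volatile__ ( "csrw satp, zero" );
    }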

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-02 08:09:15 +01:00
bb2011241f [dt] Locate parent node at point of use in dt_ioremap()
We currently rely on the recursive nature of devicetree bus probing to
obtain the region cell size specification from the parent device.
This blocks the possibility of creating a standalone console device
based on /chosen/stdout-path before probing the whole bus.

Fix by using fdt_parent() to locate the parent device at the point of
use within dt_ioremap().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-30 16:39:10 +01:00
1762568ec5 [fdt] Provide ability to locate the parent device node
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-30 16:38:39 +01:00
d64250918c [fdt] Add tests for device tree creation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-30 14:21:53 +01:00
3fe321c42a [riscv] Add support for a SiFive-compatible early UART
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-27 17:24:16 +01:00
2e27d772ca [riscv] Support mapping early UARTs outside of the identity map
Some platforms (such as the Sipeed Lichee Pi 4A) choose to make early
debugging entertainingly cumbersome for the programmer.  These
platforms not only fail to provide a functional SBI debug console, but
also choose to place the UART at a physical address that cannot be
identity-mapped under the only paging model supported by the CPU.

Support such platforms by creating a virtual address mapping for the
early UART (in the 2MB megapage immediately below iPXE itself), and
using this as the UART base address whenever paging is enabled.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-27 16:31:51 +01:00
98fdfdd255 [riscv] Add support for writing prefix debug messages direct to a UART
Some platforms (such as the Sipeed Lichee Pi 4A) do not provide a
functional SBI debug console.  We can obtain early debug messages on
these systems by writing directly to the UART used by the vendor
firmware.

There is no viable way to parse the UART address from the device tree,
since the prefix debug messages occur extremely early, before the C
runtime environment is available and therefore before any information
has been parsed from the device tree.  The early UART model and
register addresses must be configured by editing config/serial.h if
needed.  (This is an acceptable limitation, since prefix debugging is
an extremely specialised use case.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-27 14:49:18 +01:00
2e8d45aeef [riscv] Create macros for writing characters to the debug console
Abstract out the SBI debug console calls into macros that can be
shared between print_message and print_hex_value.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-26 23:36:02 +01:00
6eb51f1a6a [riscv] Ignore riscv,isa property in favour of direct CSR testing
The riscv,isa devicetree property appears not to be fully populated on
some real-world systems.  For example, the Sipeed Lichee Pi 4A
(running the vendor U-Boot) reports itself as "rv64imafdcvsu", which
does not include the "zicntr" extension even though the time CSR is
present and functional.

Ignore the riscv,isa property and rely solely on CSR testing to
determine whether or not extensions are present.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-26 21:34:11 +01:00
192cfc3cc5 [image] Use image name rather than pointer value in all debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-26 18:22:07 +01:00
eae9a27542 [riscv] Support mapping I/O devices outside of the identity map
With the 64-bit paging schemes (Sv39, Sv48, and Sv57), we identity-map
as much of the physical address space as is possible.  Experimentation
shows that this is not sufficient to provide access to all I/O
devices.  For example: the Sipeed Lichee Pi 4A includes a CPU that
supports only Sv39, but places I/O devices at the top of a 40-bit
address space.

Add support for creating I/O page table entries on demand to map I/O
devices, based on the existing design used for x86_64 BIOS.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-26 17:56:27 +01:00
6af4a022b2 [fdtmem] Ignore reservation regions with no fixed addresses
Do not print an error message for unused reservation regions that have
no fixed reserved address ranges.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-26 00:22:52 +01:00
56f5845b36 [riscv] Include carriage returns in libprefix.S debug messages
Support debug consoles that do not automatically convert LF to CRLF by
including the CR character within the debug message strings.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-26 00:10:30 +01:00
09140ab2c1 [memmap] Allow explicit colour selection for memory map debug messages
Provide DBGC_MEMMAP() as a replacement for memmap_dump(), allowing the
colour used to match other messages within the same message group.

Retain a dedicated colour for output from memmap_dump_all(), on the
basis that it is generally most useful to visually compare full memory
dumps against previous full memory dumps.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-25 12:06:53 +01:00
8d88870da5 [riscv] Support older SBI implementations
Fall back to attempting the legacy SBI console and shutdown calls if
the standard calls fail.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-25 10:43:39 +01:00
036e43334a [memmap] Rename addr/last fields to min/max for clarity
Use the terminology "min" and "max" for addresses covered by a memory
region descriptor, since this is sufficiently intuitive to generally
not require further explanation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-23 16:55:42 +01:00
cd38ed4fab [lkrn] Support initrd construction for RISC-V bare-metal kernels
Use the shared initrd reshuffling and CPIO header construction code
for RISC-V bare-metal kernels.  This allows for files to be injected
into the constructed ("magic") initrd image in exactly the same way as
is done for bzImage and UEFI kernels.

We append a dummy image encompassing the FDT to the end of the
reshuffle list, so that it ends up directly following the constructed
initrd in memory (but excluded from the initrd length, which was
recorded before constructing the FDT).

We also temporarily prepend the kernel binary itself to the reshuffle
list.  This is guaranteed to be safe (since reshuffling is designed to
be unable to fail), and avoids the requirement for the kernel segment
to be available before reshuffling.  This is useful since current
RISC-V bare-metal kernels tend to be distributed as EFI zboot images,
which require large temporary allocations from the external heap for
the intermediate images created during archive extraction.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-23 16:14:45 +01:00
c713ce5c7b [initrd] Squash and shuffle only initrds within the external heap
Any initrd images that are not within the external heap (e.g. embedded
images) do not need to be copied to the external heap for reshuffling,
and can just be left in their original locations.

Ignore any images that are not already within the external heap (or,
more precisely, that are wholly outside of the reshuffle region within
the external heap) when squashing and swapping images.

This reduces the maximum additional storage required by squashing and
swapping to zero, and so ensures that the reshuffling step is
guaranteed to succeed under all circumstances.  (This is unrelated to
the post-reshuffle load region check, which is still required.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-23 14:39:17 +01:00
4a39b877dd [initrd] Split out initrd construction from bzimage.c
Provide a reusable function initrd_load_all() to load all initrds
(including any constructed CPIO headers) into a contiguous memory
region, and support functions to find the constructed total length and
permissible post-reshuffling load address range.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-23 12:31:46 +01:00
11929389e4 [initrd] Allow for images straddling the top of the reshuffle region
It is hypothetically possible for external heap memory allocated
during driver startup to have been freed before an image was
downloaded, which could therefore leave an image straddling the
address recorded as the top of the reshuffle region.

Allow for this possibility by skipping squashing for any images
already straddling (or touching) the top of the reshuffle region.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:28:15 +01:00
029c7c4178 [initrd] Rename bzimage_align() to initrd_align()
Alignment of initrd lengths is applicable to all Linux kernels, not
just those in the x86 bzImage format.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:28:15 +01:00
9231d8c952 [initrd] Swap initrds entirely in-place via triple reversal
Eliminate the requirement for free space when reshuffling initrds by
swapping adjacent initrds using an in-place triple reversal.
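
The underlying trick, shown here as a generic sketch rather than the
reshuffler itself, is the classic rotation identity: reversing each
of two adjacent regions and then reversing the whole swaps them
without any scratch space.

    #include <stddef.h>
    #include <stdint.h>

    static void reverse ( uint8_t *data, size_t len ) {
            uint8_t tmp;
            size_t i;

            for ( i = 0 ; i < ( len / 2 ) ; i++ ) {
                    tmp = data[i];
                    data[i] = data[ len - 1 - i ];
                    data[ len - 1 - i ] = tmp;
            }
    }

    /* Transform [A][B] into [B][A] in place: reverse(A), reverse(B),
     * then reverse the concatenation.
     */
    static void swap_adjacent ( uint8_t *start, size_t a_len, size_t b_len ) {
            reverse ( start, a_len );
            reverse ( ( start + a_len ), b_len );
            reverse ( start, ( a_len + b_len ) );
    }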

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:28:15 +01:00
11e01f0652 [uheap] Expose external heap region directly
We currently rely on implicit detection of the external heap region.
The INT 15 memory map mangler relies on examining the corresponding
in-use memory region, and the initrd reshuffler relies on performing a
separate detection of the largest free memory block after startup has
completed.

Replace these with explicit public symbols to describe the external
heap region.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:28:15 +01:00
e056041074 [uheap] Prevent allocation of blocks with zero physical addresses
If the external heap ends up at the top of the system memory map then
leave a gap after the heap to ensure that no block ends up being
allocated with either a start or end address of zero, since this is
frequently confusing to both code and humans.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:16:14 +01:00
b9095a045a [fdtmem] Allow iPXE to be relocated to the top of the address space
Allow for relocation to a region at the very end of the physical
address space (where the next address wraps to zero).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:16:14 +01:00
a534563345 [riscv] Speed up memmove() when copying in forwards direction
Use the word-at-a-time variable-length memcpy() implementation when
performing an overlapping copy in the forwards direction, since this
is guaranteed to be safe and likely to be substantially faster than
the existing bytewise copy.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 16:12:56 +01:00
20d2c0f787 [lkrn] Shut down devices before jumping to kernel entry point
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 15:07:55 +01:00
969e8b5462 [lkrn] Allow a single initrd to be passed to the booted kernel
Allow a single initrd image to be passed verbatim to the booted RISC-V
kernel, as a proof of concept.

We do not yet support reshuffling to make optimal use of available
memory, or dynamic construction of CPIO headers, but this is
sufficient to allow iPXE to start up the Fedora 42 kernel with its
matching initrd image.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 14:56:10 +01:00
9bc559850c [fdt] Allow an initrd to be specified when creating a device tree
Allow an initrd location to be specified in our constructed device
tree via the "linux,initrd-start" and "linux,initrd-end" properties.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 14:31:18 +01:00
c1cd54ad74 [initrd] Move initrd reshuffling to be architecture-independent code
There is nothing x86-specific in initrd.c, and a variant of the
reshuffling logic will be required for executing bare-metal kernels on
RISC-V and AArch64.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 12:12:16 +01:00
d15a11f3a4 [image] Use image replacement when executing extracted images
Use image_replace() to transfer execution to the extracted image,
rather than calling image_exec() directly.  This allows the original
archive image to be freed immediately if it was marked as an
automatically freeable image (e.g. via "chain --autofree").

In particular, this ensures that in the case of an archive image
containing another archive image (such as an EFI zboot kernel wrapper
image containing a gzip-compressed kernel image), the intermediate
extracted image will be freed as early as possible, since extracted
images are always marked as automatically freeable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-20 15:34:49 +01:00
e2f4dba2b7 [lkrn] Add support for EFI zboot compressed kernel images
Current RISC-V and AArch64 kernels found in the wild tend not to be in
the documented kernel format, but are instead "EFI zboot" kernels
comprising a small EFI executable that decompresses and executes the
inner payload (which is a kernel in the expected format).

The EFI zboot header includes a recognisable magic value "zimg" along
with two fields describing the offset and length of the compressed
payload.  We can therefore treat this as an archive image format,
extracting the payload as-is and then relying on our existing ability
to execute compressed images.

This is sufficient to allow iPXE to execute the Fedora 42 RISC-V
kernel binary as currently published.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-20 14:29:57 +01:00
ecac4a34c7 [lkrn] Add basic support for the RISC-V Linux kernel image format
The RISC-V and AArch64 bare-metal kernel images share a common header
format, and require essentially the same execution environment: loaded
close to the start of RAM, entered with paging disabled, and passed a
pointer to a flattened device tree that describes the hardware and any
boot arguments.

Implement basic support for executing bare-metal RISC-V and AArch64
kernel images.  The (trivial) AArch64-specific code path is untested
since we do not yet have the ability to build for any bare-metal
AArch64 platforms.  Constructing and passing an initramfs image is not
yet supported.

Rename the IMAGE_BZIMAGE build configuration option to IMAGE_LKRN,
since "bzImage" is specific to x86.  To retain backwards compatibility
with existing local build configurations, we leave IMAGE_BZIMAGE as
the enabled option in config/default/pcbios.h and treat IMAGE_LKRN as
a synonym for IMAGE_BZIMAGE when building for x86 BIOS.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-20 13:08:38 +01:00
d0c35b6823 [bios] Use generic external heap based on the system memory map
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 20:47:21 +01:00
140ceeeb08 [riscv] Use generic external heap based on the system memory map
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 19:36:25 +01:00
4d560af2b0 [uheap] Add a generic external heap based on the system memory map
Add an implementation of umalloc() using the generalised model of a
heap, placing the external heap in the largest usable region obtained
from the system memory map.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 19:36:25 +01:00
490f1ecad8 [malloc] Allow heap to specify block and pointer alignments
Size-tracked pointers allocated via umalloc() have historically been
aligned to a page boundary, as have the edges of the hidden memory
region covering the external heap.

Allow the block and size-tracked pointer alignments to be specified as
heap configuration parameters.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 19:36:23 +01:00
c6ca3d3af8 [malloc] Allow for the existence of multiple heaps
Create a generic model of a heap as a list of free blocks with
optional methods for growing and shrinking the heap.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 19:35:56 +01:00
83449702e0 [memmap] Remove now-obsolete get_memmap()
All memory map users have been updated to use the new system memory
map API.  Remove get_memmap() and its associated definitions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 18:16:41 +01:00
624d76e26d [bios] Use memmap_describe() to find an external heap location
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 18:04:27 +01:00
79c30b92a3 [settings] Use memmap_describe() to construct memory map settings
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 17:39:36 +01:00
c8d64ecd87 [bios] Use memmap_describe() to find a relocation address
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 17:39:29 +01:00
dbc86458e5 [comboot] Use memmap_describe() to obtain available memory
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 17:02:55 +01:00
d0adf3b4cc [multiboot] Use memmap_describe() to construct Multiboot memory map
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 17:02:55 +01:00
25ab8f4629 [image] Use memmap_describe() to check loadable image segments
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:55:35 +01:00
a353e70800 [memmap] Use memmap_dump_all() to dump debug memory maps
There are several places where get_memmap() is called solely to
produce debug output.  Replace these with calls to memmap_dump_all()
(which will be a no-op unless debugging is enabled).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:18:36 +01:00
3812860e39 [bios] Describe umalloc() heap as an in-use memory area
Use the concept of an in-use memory region defined as part of the
system memory map API to describe the umalloc() heap.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:18:36 +01:00
4c4c94ca09 [bios] Update to use the generic system memory map API
Provide an implementation of the system memory map API based on the
assorted BIOS INT 15 calls, and a temporary implementation of the
legacy get_memmap() function using the new API.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:18:36 +01:00
3f6ee95737 [fdtmem] Update to use the generic system memory map API
Provide an implementation of the system memory map API based on the
system device tree, excluding any memory outside the size of the
accessible physical address space and defining an in-use region to
cover the relocated copy of iPXE and the system device tree.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:18:36 +01:00
bab3d76717 [memmap] Define an API for managing the system memory map
Define a generic system memory map API, based on the abstraction
created for parsing the FDT memory map and adding a concept of hidden
in-use memory regions as required to support patching the BIOS INT 15
memory map.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:12:15 +01:00
f6f11c101c [tests] Remove prehistoric umalloc() test code
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-15 15:47:08 +01:00
e0c4cfa81e [fdtmem] Record size of accessible physical address space
The size of accessible physical address space will be required for the
runtime memory map, not just at relocation time.  Make this size an
additional parameter to fdt_register() (matching the prototype for
fdt_relocate()), and record the value for future reference.

Note that we cannot simply store the limit in fdt_relocate() since it
is called before .data is writable and before .bss is zeroed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-14 22:09:51 +01:00
64ad1d03c3 [bios] Rename memmap.c to int15.c
Create namespace for an architecture-independent memmap.c by renaming
the BIOS-specific memmap.c to int15.c.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-14 22:02:46 +01:00
1dd9ac13fd [bnxt] Use updated DMA APIs
Replace malloc_phys() with dma_alloc(), free_phys() with dma_free(),
alloc_iob() with alloc_rx_iob(), free_iob() with free_rx_iob(), and
virt_to_bus() with dma() or iob_dma().  Replace dma_addr_t with
physaddr_t.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-05-14 14:21:02 +01:00
08edad7ca3 [bnxt] Return proper error codes in probe
Return the proper error codes from bnxt_init_one(), so that the
return status correctly reflects the outcome: failure paths could
previously indicate success.  Also correct an assertion condition to
check for a non-NULL pointer.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-05-14 14:08:27 +01:00
4d39b2dcc6 [crypto] Remove redundant null pointer check
Coverity reports a spurious potential null pointer dereference in
cms_decrypt(), since the null pointer check takes place after the
pointer has already been dereferenced.  The pointer can never be null,
since it is initialised to point to cipher_null at the point that the
containing structure is allocated.

Remove the redundant null pointer check, and for symmetry ensure that
the digest and public-key algorithm pointers are similarly initialised
at the point of allocation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-14 12:46:23 +01:00
d1c1e578af [riscv] Add a .pf32 build target for padded parallel flash images
QEMU's -pflash option requires an image that has been padded to the
exact expected size (32MB for all of the supported RISC-V virtual
machines).

Add a .pf32 build target which is simply the equivalent .sbi target
padded to 32MB in size, to simplify testing.
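
A typical invocation might look like (other QEMU options elided):

  make bin-riscv64/ipxe.pf32

  qemu-system-riscv64 -M virt ... -pflash bin-riscv64/ipxe.pf32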

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-13 18:25:24 +01:00
6fd927f929 [riscv] Perform a writability test before applying relocations
If paging is not supported, then we will attempt to apply dynamic
relocations to fix up the runtime addresses.  If the image is
currently executing directly from flash memory, this can result in
effectively sending an undefined sequence of commands to the flash
device, which can cause unwanted side effects.

Perform an explicit writability test before applying relocations,
using a write value chosen to be safe for at least any devices
conforming to the JEDEC Common Flash Interface (CFI01).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-13 17:42:53 +01:00
4566f59757 [riscv] Avoid potentially overwriting the scratch area during relocation
We do not currently describe the temporary page table or the temporary
stack as areas to be avoided during relocation of the iPXE image to a
new physical address.

Perform the copy of the iPXE image and zeroing of the .bss within
libprefix.S, after we have no futher use for the temporary page table
or the temporary initial stack.  Perform the copy and registration of
the system device tree in C code after relocation is complete and the
new stack (within .bss) has been set up.

This provides a clean separation of responsibilities between the
RISC-V libprefix.S and the architecture-independent fdtmem.c.  The
prefix is responsible only for relocating iPXE to the new physical
address returned from fdtmem_relocate(), and doesn't need to know or
care where fdtmem.c is planning to place the copy of the device tree.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-13 14:00:34 +01:00
8e38af800b [riscv] Add a .lkrn build target resembling a Linux kernel binary
On x86 BIOS, it has been useful to be able to build iPXE to resemble a
Linux kernel, so that it can be loaded by programs such as syslinux
which already know how to handle Linux kernel binaries.

Add an equivalent .lkrn build target for RISC-V SBI, allowing for
build targets such as:

  make bin-riscv64/ipxe.lkrn

  make bin-riscv64/cgem.lkrn

The Linux kernel header format allows us to specify a required length
(including uninitialised-data portions) and defines that the image
will be loaded at a fixed offset from the start of RAM.  We can
therefore use known-safe areas of memory (within our own .bss) for the
initial temporary page table and stack.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-13 13:03:08 +01:00
17fd67ce03 [riscv] Relocate to a safe physical address on startup
On startup, we may be running from read-only memory.  We need to parse
the devicetree to obtain the system memory map, and identify a safe
location to which we can copy our own binary image along with a
stashed copy of the devicetree, and then transfer execution to this
new location.

Parsing the system memory map realistically requires running C code.
This in turn requires a small temporary stack, and some way to ensure
that symbol references are valid.

We first attempt to enable paging, to make the runtime virtual
addresses equal to the link-time virtual addresses.  If this fails,
then we attempt to apply the compressed relocation records.

Assuming that one of these has worked (i.e. that either the CPU
supports paging or that our image started execution in writable
memory), then we call fdtmem_relocate() to parse the system memory map
to find a suitable relocation target address.

After the copy we disable paging, jump to the relocated copy,
re-enable paging, and reapply relocation records (if needed).  At this
point, we have a full runtime environment, and can transfer control to
normal C code.

Provide this functionality as part of libprefix.S, since it is likely
to be shared by multiple prefixes.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-12 13:59:42 +01:00
3dfc88158c [riscv] Construct page tables based on link-time virtual addresses
Always construct the page tables based on the link-time address values
even if relocations have already been applied, on the assumption that
relocations will be reapplied after paging has been enabled.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-12 13:59:42 +01:00
c45dc4a55d [riscv] Allow apply_relocs() to use non-inline relocation records
The address of the compressed relocation records is currently
calculated implicitly relative to the program counter.  This requires
the relocation records to be copied as part of relocation to a new
physical address, so that they can be reapplied (if needed) after
copying iPXE to the new physical address.

Since the relocation destination will never overlap the original iPXE
image, and since the relocation records will not be needed further
after completing relocation, we can avoid the need to copy the records
by passing in a pointer to the relocation records present in the
original iPXE image.

Pass the compressed relocation record address as an explicit parameter
to apply_relocs(), rather than being implicit in the program counter.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-12 12:23:23 +01:00
420e475b11 [riscv] Return accessible physical address space size from enable_paging()
Relocation requires knowledge of the size of the accessible physical
address space, which for 64-bit CPUs will vary according to the paging
level supported by the processor.

Update enable_paging_64() and enable_paging_32() to calculate and
return the size of the accessible physical address space.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-12 11:47:25 +01:00
6fe9ce66ae [fdtmem] Add ability to parse FDT memory map for a relocation address
Add code to parse the devicetree memory nodes, memory reservations
block, and reserved memory nodes to construct an ordered and
non-overlapping description of the system memory map, and use this to
identify a suitable address to which iPXE may be relocated at runtime.

We choose to place iPXE on a superpage boundary (as required by the
paging code), and to use the highest available address within
accessible memory.  This mirrors the approach taken for x86 BIOS
builds, where we have long assumed that any image format that we might
need to support may require specific fixed addresses towards the
bottom of the memory map, but is very unlikely to require specific
fixed addresses towards the top of the memory map (since those
addresses may not exist, depending on the amount of installed RAM).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-11 18:23:08 +01:00
2e45106c0a [riscv] Ensure that prefix_virt is aligned on an xlen boundary
Ensure that the prefix_virt dynamic relocation ends up on a suitably
aligned boundary for a compressed relocation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-11 14:17:39 +01:00
95ede670bc [riscv] Hold virtual address offset in the thread pointer register
iPXE does not make use of any thread-local storage.  Use the otherwise
unused thread pointer register ("tp") to hold the current value of
the virtual address offset, rather than using a global variable.

This ensures that virt_offset can be made valid even during very early
initialisation (when iPXE may be executing directly from read-only
memory and so cannot update a global variable).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-11 13:46:21 +01:00
f988ec09e0 [fdt] Generalise access to "reg" property
The "reg" property is also used by non-device nodes, such as the nodes
describing the system memory map.

Provide generalised functionality for parsing the "#address-cells",
"#size-cells", and "reg" properties.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-09 19:09:57 +01:00
3027864f13 [riscv] Use load and store pseudo-instructions where possible
The pattern of "load address to register" followed by "load value from
address in register" generally results in three instructions: two to
load the address and one to load the value.

This can be reduced to two instructions by allowing the assembler to
incorporate the low bits of the address within the load (or store)
instruction itself.  In the case of a store, this requires specifying
a second register that can be temporarily used to hold the high bits
of the address.  (In the case of a load, the destination register is
reused for this purpose.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-09 15:23:41 +01:00
134d76379e [build] Formalise mechanism for accessing absolute symbols
In a position-dependent executable, where all addresses are fixed
at link time, we can use the standard technique as documented by
GNU ld to get the value of an absolute symbol, e.g.:

    extern char _my_symbol[];

    printf ( "Absolute symbol value is %x\n", ( ( int ) _my_symbol ) );

This technique may not work in a position-independent executable.
When dynamic relocations are applied, the runtime addresses will no
longer be equal to the link-time addresses.  If the code to obtain the
address of _my_symbol uses PC-relative addressing, then it will
calculate the runtime "address" of the absolute symbol, which will no
longer be equal to the link-time "address" (i.e. the correct value)
of the absolute symbol.

Define macros ABS_SYMBOL(), ABS_VALUE_INIT(), and ABS_VALUE() that
provide access to the correct values of absolute symbols even in
position-independent code, and use these macros wherever absolute
symbols are accessed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-09 15:14:03 +01:00
1d58d928fe [libc] Display assertion failure message before incrementing counter
During early initialisation on some platforms, the .data and .bss
sections may not yet be writable.

Display the assertion message before attempting to increment the
assertion failure counter, since writing to the assertion counter may
trigger a CPU exception that ends up resetting the system.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-09 14:36:00 +01:00
8fe3c68b31 [riscv] Add support for disabling 64-bit and 32-bit paging
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-08 16:17:21 +01:00
5b19ddbb3c [riscv] Return virtual address offset from enable_paging()
Once paging has been enabled, there is no direct way to determine the
virtual address offset without external knowledge.  (The paging mode,
if needed, can be read directly from the SATP CSR.)

Change the return value from enable_paging() to provide the virtual
address offset.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-08 14:37:30 +01:00
5e518c744e [riscv] Restore temporarily modified PTE within 32-bit transition code
If the virtual address offset is precisely one page (i.e. each virtual
address maps to a physical address one page higher), and if the 32-bit
transition code happens to end up at the end of a page (which would
require an unrealistic 2MB of content in .prefix), then it would be
possible for the program counter to cross into the portion of the
virtual address space still borrowed for use as the temporary physical
map.

Avoid this remote possibility by moving the restoration of the
temporarily modified PTE within the transition code block (which is
guaranteed to remain within a single page since it is aligned on its
own size).

This unfortunately requires increasing the alignment of the transition
code (and hence the maximum number of NOPs inserted).  The assembler
syntax theoretically allows us to avoid inserting any NOPs via a
directive such as:

   .balign PAGE_SIZE, , enable_paging_32_max_len

(i.e. relying on the fact that if the transition code is already
sufficiently far away from the end of a page, then no padding needs to
be inserted).  However, alignment on RISC-V is implemented using the
R_RISCV_ALIGN relaxing relocation, which doesn't encode any concept of
a maximum padding length, and so the maximum padding length value is
effectively ignored.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-08 12:45:37 +01:00
0279015d09 [uaccess] Generalise librm's virt_offset mechanism for RISC-V
The virtual offset memory model used for i386-pcbios and x86_64-pcbios
can be generalised to also cover riscv32-sbi and riscv64-sbi.  In both
architectures, the 32-bit builds will use a circular map of the 32-bit
address space, and the 64-bit builds will use an identity map for the
relevant portion of the physical address space, with iPXE itself
placed in the negative (kernel) address space.

Generalise and document the virt_offset mechanism, and set it as the
default for both PCBIOS and SBI platforms.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-08 00:12:33 +01:00
e8a6c26571 [build] Constrain PHYS_CODE() and REAL_CODE() to use i386 registers
Inline assembly using PHYS_CODE() or REAL_CODE() must use the "R"
constraint rather than the "r" constraint to ensure that the compiler
chooses registers that will be valid for the 32-bit or 16-bit assembly
code fragment.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-07 23:03:02 +01:00
12dee2dab2 [riscv] Add debug printing of hexadecimal values in libprefix.S
Add millicode routines to print hexadecimal values (with any number of
digits), and macros to print register contents or symbol addresses.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-07 14:23:56 +01:00
72c81419b1 [riscv] Move prefix system reset code to libprefix.S
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-07 13:10:40 +01:00
764183504c [riscv] Add basic debug progress messages in libprefix.S
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-07 13:08:49 +01:00
9445a9ff40 [riscv] Provide a millicode variant of print_message()
RISC-V has a millicode calling convention that allows for the use of
an alternative link register x5/t0.  With sufficient care, this allows
for two levels of subroutine call even when no stack is available.

Provide both standard and millicode entry points for print_message(),
and use the millicode entry point to allow for printing debug messages
from libprefix.S itself.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-07 13:08:49 +01:00
dc9e6f0edf [riscv] Move prefix debug message printing to libprefix.S
Create a prefix library function print_message() to print text to the
SBI debug console.  Use the "write byte" SBI call (rather than "write
string") so that the function remains usable even after enabling
paging.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-06 17:28:14 +01:00
b3cbdc86fc [riscv] Place prefix debug strings in .rodata
The GNU assembler does not seem to automatically assume alignment to
an instruction boundary for sections containing assembled code.

Place the prefix debug strings (if present) in .rodata rather than in
.prefix, to avoid potentially creating misaligned code sections.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-06 15:51:39 +01:00
4bef4c8069 [riscv] Use compressed relocation records
Use compressed relocation records instead of raw Elf_Rela records.
This saves around 15% of the total binary size for the all-drivers
image bin-riscv64/ipxe.sbi.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-06 15:01:45 +01:00
8f7aa292aa [riscv] Place .got and .got.plt in .data
Even though we build with -mno-plt, redundant .got and .got.plt
sections are still generated.

Include these redundant sections within .data (which has identical
section attributes) to simplify the section list.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-06 13:58:38 +01:00
e37e3f17e5 [riscv] Discard ELF hash tables
The ELF hash table is generated when building a position-independent
executable even though it is not required (since we have no dynamic
linker).

Explicitly discard these unneeded sections.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-06 13:44:44 +01:00
70bb5e5e63 [zbin] Allow for constructing compressed dynamic relocation records
Define a new "ZREL" compressor information block, describing a block
of Elf_Rel or Elf_Rela runtime relocations to be converted to an
iPXE-specific compressed relocation format.

The compressed relocation format is based loosely on the Elf_Relr
bitmap+offset format, with some optimisations for use in iPXE.  In
particular:

  - a relative "skip" value is used instead of an absolute offset

  - the width of the skip value is reduced to 19 bits (when present)

  - an explicit skip value of zero is used to terminate the list

  - unaligned relocations are prohibited

The layout of bits within the compressed relocation record is also
adjusted to make assembly code implementations simpler: the skip flag
bit is placed in the MSB so that it can be tested using "bltz" or
similar instructions, and the skip value is placed above the
relocation flag bits so that a typical shifting implementation will
naturally end up with a zero value in its accumulator if and only if
the record was a terminator.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-06 12:11:56 +01:00
98646b9f01 [build] Allow for 32-bit and 64-bit versions of util/zbin
Parsing ELF data is simpler if we don't have to build a single binary
to handle both 32-bit and 64-bit ELF formats.

Allow for separate 32-bit and 64-bit binaries built from util/zbin.c
(as is already done for util/elf2efi.c).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-06 12:11:02 +01:00
4c11737d5d [riscv] Add support for enabling 32-bit paging
Add code to construct a 32-bit page table to map the whole of the
32-bit address space with a fixed offset selected to map iPXE itself
at its link-time address, and to return with paging enabled and the
program counter updated to a virtual address.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-04 21:40:32 +01:00
a32f3c2bc4 [riscv] Add support for enabling 64-bit paging
Paging provides an alternative to using relocations: instead of
applying relocation fixups to the runtime addresses, we can set up
virtual addressing so that the runtime addresses match the link-time
addresses.

This opens up the possibility of running portions of iPXE directly
from read-only memory (such as a memory-mapped flash device), subject
to the caveats that .data is not yet writable and .bss is not yet
zeroed.  This should allow us to run enough code to parse the memory
map from the FDT, identify a suitable RAM block, and physically
relocate ourselves there.

Add code to construct a 64-bit page table (in a single 4kB buffer) to
identity-map as much of the physical address space as possible, to map
iPXE itself at its link-time address, and to return with paging
enabled and the program counter updated to a virtual address.  We use
the highest paging level supported by the CPU, to maximise the amount
of the physical address space covered by the identity map.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-02 14:33:43 +01:00
dad2060260 [riscv] Allow for a non-zero link-time address
Using paging (rather than relocation records) will be easier on 64-bit
RISC-V if we place iPXE within the negative (kernel) virtual address
space.

Allow the link-time address to be non-zero and to vary between 32-bit
and 64-bit builds.  Choose addresses that are expected to be amenable
to the use of paging.

There is no particular need to use a non-zero address in the 32-bit
builds, but doing so allows us to validate that the relocation code is
handling this case correctly.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-01 14:49:53 +01:00
a4b5dd63c5 [riscv] Split out runtime relocator to libprefix.S
Split out the runtime relocation logic from sbiprefix.S to a new
library libprefix.S.

Since this logically decouples the process of runtime relocation from
the _sbi_start symbol (currently used to determine the base address
for applying relocations), provide an alternative mechanism for the
relocator to determine the base address.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-01 14:36:26 +01:00
1534b0a6e9 [uaccess] Remove redundant virt_to_user() and userptr_t
Remove the last remaining traces of the concept of a user pointer,
leaving iPXE with a simpler and cleaner memory model that implicitly
assumes that all memory locations can be reached through pointer
dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-30 16:26:16 +01:00
a169d73593 [uaccess] Reduce scope of included uaccess.h header
The uaccess.h header is no longer required for any code that touches
external ("user") memory, since such memory accesses are now performed
through pointer dereferences.  Reduce the number of files including
this header.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-30 16:16:02 +01:00
05ad7833c5 [image] Make image data read-only to most consumers
Almost all image consumers do not need to modify the content of the
image.  Now that the image data is a pointer type (rather than the
opaque userptr_t type), we can rely on the compiler to enforce this at
build time.

Change the .data field to be a const pointer, so that the compiler can
verify that image consumers do not modify the image content.  Provide
a transparent .rwdata field for consumers who have a legitimate (and
now explicit) reason to modify the image content.
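
One way to express such a transparent alias (a sketch of the idea,
not necessarily the exact definition used):

    #include <stddef.h>

    struct image {
            /* ... other fields ... */
            union {
                    const void *data;       /* read-only view for consumers */
                    void *rwdata;           /* writable view, for consumers
                                             * with an explicit need to
                                             * modify the image content */
            };
            size_t len;
            /* ... other fields ... */
    };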

We do not attempt to impose any runtime restriction on checking
whether or not an image is writable.  The only existing instances of
genuinely read-only images are the various unit test images, and it is
acceptable for defective test cases to result in a segfault rather
than a runtime error.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-30 15:38:15 +01:00
cd803ff2e2 [image] Add the concept of a static image
Not all images are allocated via alloc_image().  For example: embedded
images, the static images created to hold a runtime command line, and
the images used by unit tests are all static structures.

Using image_set_cmdline() (via e.g. the "imgargs" command) to set the
command-line arguments of a static image will succeed but will leak
memory, since nothing will ever free the allocated command line.
There are no code paths that can lead to calling image_set_len() on a
static image, but there is no safety check against future code paths
attempting this.

Define a flag IMAGE_STATIC to mark an image as statically allocated,
generalise free_image() to also handle freeing dynamically allocated
portions of static images (such as the command line), and expose
free_image() for use by static images.

Define a related flag IMAGE_STATIC_NAME to mark the name as statically
allocated.  Allow a statically allocated name to be replaced with a
dynamically allocated name since this is a potentially valid use case
(e.g. if "imgdecrypt --name <name>" is used on an embedded image).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-30 15:38:15 +01:00
3303910010 [image] Move embedded images from .rodata to .data
Decrypting a CMS-encrypted image will overwrite the existing image
data in place, and using an encrypted embedded image is a valid use
case.

Move embedded images from .rodata to .data to reflect the fact that
they are intended to be writable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-30 15:38:15 +01:00
2d9a6369dd [test] Separate read-only and writable CMS test images
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-30 15:38:15 +01:00
b6f9e4bab0 [uaccess] Remove redundant copy_from_user() and copy_to_user()
Remove the now-redundant copy_from_user() and copy_to_user() wrapper
functions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-30 15:32:03 +01:00
a69c42dd9f [image] Clear recorded replacement image immediately after consuming
If an embedded script uses "chain --replace", the embedded image will
retain a reference to the replacement image in perpetuity.

Fix by clearing any recorded replacement image immediately in
image_exec(), instead of relying upon image_free() to drop the
reference.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 16:32:01 +01:00
9962c0a58f [bofm] Remove userptr_t from BOFM table parsing and updating
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 13:42:42 +01:00
0800723845 [bofm] Allow BOFM tests to be run without a BOFM-capable device driver
The BOFM tests are not part of the standard unit test suite, since
they are designed to allow for exercising real BOFM driver code
outside of the context of a real IBM blade server.

Allow for the BOFM tests to be run without a real BOFM driver, by
providing a dummy driver for the specified PCI test device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 13:39:12 +01:00
4e909cc2b0 [build] Remove some long-obsolete unused header files
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 12:17:16 +01:00
6c9dc063f6 [peerdist] Remove never-used peerdist_msg_blk() macro
The peerdist_msg_blk() macro seems to have been introduced in the
original commit that added pccrr.h, but this macro was never used by
the version of the code present in that commit.

Remove this unused macro, along with the declaration of the
corresponding external function (which was never implemented).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 12:08:33 +01:00
54c4217bdd [peerdist] Remove userptr_t from PeerDist content information parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 11:28:45 +01:00
837b77293b [xferbuf] Simplify and generalise data transfer buffers
Since all data transfer buffer contents are now accessible via direct
pointer dereferences, remove the unnecessary abstractions for read and
write operations and create two new data transfer buffer types: a
fixed-size buffer, and a void buffer that records its size but can
never receive non-zero lengths of data.  These replace the custom data
buffer types currently implemented for EFI PXE TFTP downloads and for
block device translations.

A new operation xferbuf_detach() is required to take ownership of the
data accumulated in the data transfer buffer, since we no longer rely
on the existence of an independently owned external data pointer for
data transfer buffers allocated via umalloc().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 11:27:22 +01:00
43fc516298 [prefix] Remove userptr_t from command line image construction
Simplify cmdline_init() by assuming that the externally provided
command line is directly accessible via pointer dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 00:30:34 +01:00
c9fb94dbaa [comboot] Remove userptr_t from COM32 API implementation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-29 00:24:55 +01:00
f001e61a68 [comboot] Remove userptr_t from COMBOOT API implementation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 22:50:23 +01:00
ef97119589 [comboot] Remove userptr_t from COMBOOT image parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 22:31:18 +01:00
0b45db3972 [uaccess] Remove redundant UNULL definition
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 17:36:18 +01:00
6ccb6bcfc8 [bzimage] Remove userptr_t from bzImage parsing
Simplify bzImage parsing by assuming that the various headers are
directly accessible via pointer dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 16:30:35 +01:00
412ad56012 [initrd] Use physical addresses for calculations on initrd locations
Commit ef03849 ("[uaccess] Remove redundant userptr_add() and
userptr_diff()") exposed a signedness bug in the comparison of initrd
locations, since the expression (initrd->data - current) was
effectively no longer coerced to a signed type.

In particular, the common case will be that the top of the initrd
region is the start of the iPXE .textdata region, which has virtual
address zero.  This causes initrd->data to compare as being above the
top of the initrd region for all images, whereas the bug would
previously have affected only initrds placed 2GB or more below the
start of .textdata.

Fix by using physical addresses for all comparisons on initrd
locations.

Reported-by: Sven Dreyer <sven@dreyer-net.de>
Reported-by: Harald Jensås <hjensas@redhat.com>
Reported-by: Jan ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 15:35:55 +01:00
ef3827cf14 [bzimage] Use image name in debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 14:43:19 +01:00
083e273bbc [efi] Add ability to reboot to firmware setup menu
Add the ability to reboot to the firmware setup menu (if supported) by
setting the relevant value in the OsIndications variable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 14:01:17 +01:00
7eaa2daf6f [reboot] Generalise warm reboot indicator to a flags bitmask
Allow for the possibility of additional reboot types by extending the
reboot() function to use a flags bitmask rather than a single flag.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 13:44:53 +01:00
ba2135d0fd [multiboot] Remove userptr_t from Multiboot and ELF image parsing
Simplify Multiboot and ELF image parsing by assuming that the
Multiboot and ELF headers are directly accessible via pointer
dereferences, and add some missing header validations.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 13:06:18 +01:00
c8c5cd685f [multiboot] Use image name in Multiboot and ELF debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-28 12:59:25 +01:00
3befb5eb57 [linux] Enable compiler warnings when building the linux_api.o object
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-27 23:36:34 +01:00
024439f339 [linux] Add missing return statement to linux_poll()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-27 23:28:51 +01:00
bd4ca67cf4 [build] Disable gcc unterminated-string-initializer warnings
GCC 15 generates a warning when a string initializer is too large to
allow for a trailing NUL terminator byte.  This type of initializer is
fairly common in signature strings such as ACPI table identifiers.
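
A minimal example of the pattern in question (illustrative, not taken
from the iPXE source):

  /* The array has room for exactly the four signature characters and
   * deliberately omits the trailing NUL, which is what GCC 15 now
   * warns about.
   */
  static const char acpi_signature[4] = "RSDT";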

Fix by disabling the warning.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-27 18:40:52 +01:00
15c1111c78 [build] Remove unsafe disable function wrapper from legacy NIC drivers
The legacy NIC drivers do not consistently take a second parameter in
their disable function.  We currently use an unsafe function wrapper
that declares no parameters, and rely on the ABI allowing a second
parameter to be silently ignored if not expected by the callee.  As of
GCC 15, this hack results in an incompatible pointer type warning.

Fix by removing the hack, and instead updating all relevant legacy NIC
drivers to take an unused second parameter in their disable function.
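
A self-contained sketch of the pattern (the type and function names
are illustrative, not the real legacy driver API):

  struct nic;

  /* Old, unsafe approach: declare the pointer without a prototype and
   * rely on the calling convention to discard the extra argument.
   */
  typedef void ( * old_disable_fn ) ();

  /* Fixed approach: every disable function declares both parameters,
   * even if the second goes unused.
   */
  typedef void ( * new_disable_fn ) ( struct nic *nic, void *hwdev );

  void sample_disable ( struct nic *nic __attribute__ (( unused )),
                        void *hwdev __attribute__ (( unused )) ) {
          /* ... shut down the hardware ... */
  }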

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-27 18:40:52 +01:00
7741756afc [build] Prevent the use of reserved words in C23
GCC 15 defaults to C23, which reserves bool, true, and false as
keywords.  Avoid using these as parameter or variable names.

Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-27 18:40:33 +01:00
b816b816ab [build] Fix old-style function definition
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-27 18:40:03 +01:00
58e6729cb6 [build] Fix typo in xenver.h header guard
GCC 15 helpfully reports mismatched #ifdef and #define lines in header
guards.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-27 18:40:03 +01:00
4c8bf666f4 [pnm] Remove userptr_t from PNM image parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-25 17:23:37 +01:00
d29651ddec [png] Remove userptr_t from PNG image parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-25 16:43:11 +01:00
76a17b0986 [fbcon] Avoid redrawing unchanged characters when scrolling
Scrolling currently involves redrawing every character cell, which can
be frustratingly slow on large framebuffer consoles.  Accelerate this
operation by skipping the redraw for any unchanged character cells.

In the common case that large areas of the screen contain whitespace,
this optimises away the vast majority of the redrawing operations.
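
A rough sketch of the idea (cell layout and function names are
assumptions, not the actual fbcon code):

  #include <string.h>

  struct cell_sketch {
          unsigned int character;
          unsigned int colour;
  };

  extern void draw_cell ( unsigned int x, unsigned int y,
                          const struct cell_sketch *cell );

  /* After scrolling, redraw a character cell only when its new
   * contents differ from what is already displayed at that position,
   * keeping a shadow copy of what is on screen.
   */
  void redraw_changed ( struct cell_sketch *shadow,
                        const struct cell_sketch *cells,
                        unsigned int width, unsigned int height ) {
          unsigned int x, y, i;

          for ( y = 0 ; y < height ; y++ ) {
                  for ( x = 0 ; x < width ; x++ ) {
                          i = ( ( y * width ) + x );
                          if ( memcmp ( &cells[i], &shadow[i],
                                        sizeof ( cells[i] ) ) != 0 ) {
                                  draw_cell ( x, y, &cells[i] );
                                  shadow[i] = cells[i];
                          }
                  }
          }
  }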

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-25 13:44:18 +01:00
aa3cc56ab2 [fbcon] Remove userptr_t from framebuffer console drivers
Simplify the framebuffer console drivers by assuming that the raw
framebuffer, character cell array, background picture, and glyph data
are all directly accessible via pointer dereferences.

In particular, this avoids the need to copy each glyph during drawing:
the VESA framebuffer driver can simply return a pointer to the glyph
data stored in the video ROM.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-25 12:44:28 +01:00
4cca1cadf8 [efi] Remove userptr_t from EFI PE image parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-25 00:49:27 +01:00
338cebfeef [pxe] Remove userptr_t from PXE file API implementation
Simplify the PXE file API implementation by assuming that all string
buffers are directly accessible via pointer dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-25 00:43:30 +01:00
8b3b4f2454 [pxe] Remove userptr_t from PXE API call dispatcher
Simplify the PXE API call dispatcher code by assuming that the PXE
parameter block is accessible via a direct pointer dereference.  This
avoids the need for the API call dispatcher to know the size of the
parameter block.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 23:38:50 +01:00
c1b558f59e [cmdline] Remove userptr_t from "digest" command
Simplify the implementation of the "digest" command by assuming that
the entire image data can be passed directly to digest_update().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 23:24:29 +01:00
0edbc4c082 [nbi] Remove userptr_t from NBI image parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 23:17:16 +01:00
3cb33435f5 [sdi] Remove userptr_t from SDI image parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 23:01:25 +01:00
d7c94c4aa5 [pxe] Remove userptr_t from PXE NBP image parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 22:46:50 +01:00
2f11f466e6 [block] Remove userptr_t from block device abstraction
Simplify the block device code by assuming that all read/write buffers
are directly accessible via pointer dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 17:11:30 +01:00
2742ed5d77 [uaccess] Remove now-obsolete memchr_user()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 16:35:49 +01:00
4f4f6c33ec [script] Remove userptr_t from script image parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 16:28:32 +01:00
8923a216b0 [ucode] Remove userptr_t from microcode image parsing
Simplify microcode image parsing by assuming that all image content is
directly accessible via pointer dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 14:25:00 +01:00
605cff4c84 [ucode] Remove userptr_t from microcode update mechanism
Simplify the microcode update mechanism by assuming that status
reports are accessible via direct pointer dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 13:48:57 +01:00
f18c1472e3 [thunderx] Replace uses of userptr_t with direct pointer dereferences
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 13:16:50 +01:00
8ac03b4a73 [exanic] Replace uses of userptr_t with direct pointer dereferences
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 10:56:21 +01:00
e8ffe2cd64 [uaccess] Remove trivial uses of userptr_t
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-24 01:40:05 +01:00
945df9b429 [gve] Replace uses of userptr_t with direct pointer dereferences
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-23 23:12:50 +01:00
839540cb95 [umalloc] Remove userptr_t from user memory allocations
Use standard void pointers for umalloc(), urealloc(), and ufree(),
with the "u" prefix retained to indicate that these allocations are
made from external ("user") memory rather than from the internal heap.
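
A sketch of the resulting prototypes (the real declarations may carry
additional attributes):

  #include <stddef.h>

  /* Allocate, resize, and free external ("user") memory.  The "u"
   * prefix distinguishes these from malloc()/realloc()/free(), which
   * operate on the internal heap.
   */
  void * umalloc ( size_t size );
  void * urealloc ( void *ptr, size_t new_size );
  void ufree ( void *ptr );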

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-23 14:43:04 +01:00
0bf0f8716a [smbios] Remove userptr_t from SMBIOS structure parsing
Simplify the SMBIOS structure parsing code by assuming that all
structure content is fully accessible via pointer dereferences.

In particular, this allows the convoluted find_smbios_structure() and
read_smbios_structure() to be combined into a single function
smbios_structure() that just returns a direct pointer to the SMBIOS
structure, with smbios_string() similarly now returning a direct
pointer to the relevant string.
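
A hedged usage sketch (the function names come from the commit, but
the signatures and structure type here are assumptions):

  #include <stddef.h>

  struct smbios_header;
  extern const struct smbios_header *
  smbios_structure ( unsigned int type, unsigned int instance );
  extern const char *
  smbios_string ( const struct smbios_header *structure, unsigned int index );

  /* Both calls now return direct pointers into the SMBIOS table,
   * rather than copying data out via an intermediate userptr_t.
   */
  const char * first_system_string ( void ) {
          const struct smbios_header *system;

          /* Type 1 is the System Information structure */
          system = smbios_structure ( 1, 0 );
          return ( system ? smbios_string ( system, 1 ) : NULL );
  }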

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-23 10:08:16 +01:00
0b3fc48fef [acpi] Remove userptr_t from ACPI table parsing
Simplify the ACPI table parsing code by assuming that all table
content is fully accessible via pointer dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-22 14:21:06 +01:00
c059b34170 [deflate] Remove userptr_t from decompression code
Simplify the deflate, zlib, and gzip decompression code by assuming
that all content is fully accessible via pointer dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-22 12:32:12 +01:00
b89a34b07f [image] Remove userptr_t from image definition
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-22 12:21:26 +01:00
e98b84f1b9 [crypto] Remove userptr_t from CMS verification and decryption
Simplify the CMS code by assuming that all content is fully accessible
via pointer dereferences.  This avoids the need to use fragment loops
for calculating digests and decrypting (or reencrypting) data.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-22 00:28:07 +01:00
3f8937d2f3 [crypto] Remove userptr_t from ASN.1 parsers
Simplify the ASN.1 code by assuming that all objects are fully
accessible via pointer dereferences.  This allows the concept of
"additional data beyond the end of the cursor" to be removed, and
simplifies parsing of all ASN.1 image formats.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-21 23:30:13 +01:00
04d0b2fdf9 [uaccess] Remove redundant read_user()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-21 18:55:30 +01:00
050df80bbc [uaccess] Replace real_to_user() with real_to_virt()
Remove the intermediate concept of a user pointer from real address
conversion, leaving real_to_virt() as the directly implemented
function.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-21 18:28:56 +01:00
8c31270a21 [uaccess] Remove user_to_phys() and phys_to_user()
Remove the intermediate concept of a user pointer from physical
address conversions, leaving virt_to_phys() and phys_to_virt() as the
directly implemented functions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-21 16:17:19 +01:00
4535548cba [uaccess] Remove redundant user_to_virt()
The user_to_virt() function is now a straightforward wrapper around
addition, with the addend almost invariably being zero.

Remove this redundant wrapper.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-21 00:15:52 +01:00
89fe788689 [uaccess] Remove redundant memcpy_user() and related string functions
The memcpy_user(), memmove_user(), memcmp_user(), memset_user(), and
strlen_user() functions are now just straightforward wrappers around
the corresponding standard library functions.

Remove these redundant wrappers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-20 23:00:13 +01:00
ef03849185 [uaccess] Remove redundant userptr_add() and userptr_diff()
The userptr_add() and userptr_diff() functions are now just
straightforward wrappers around addition and subtraction.

Remove these redundant wrappers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-20 22:31:29 +01:00
b65f67d443 [uaccess] Change userptr_t to be a pointer type
The original motivation for the userptr_t type was to be able to
support a pure 16-bit real-mode memory model in which a segment:offset
value could be encoded as an unsigned long, with corresponding
copy_from_user() and copy_to_user() functions used to perform
real-mode segmented memory accesses.

Since this memory model was first created almost twenty years ago, no
serious effort has been made to support a pure 16-bit mode of
operation for iPXE.  The constraints imposed by the memory model are
becoming increasingly cumbersome to work within: for example, the
parsing of devicetree structures is hugely simplified by being able to
use and return direct pointers to the names and property values.  The
devicetree code therefore relies upon virt_to_user(), which is
nominally illegal under the userptr_t memory model.

Drop support for the concept of a memory location that cannot be
reached through a straightforward pointer dereference, by redefining
userptr_t to be a simple pointer type.
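
The end state is deliberately trivial; a sketch of what the
redefinition amounts to (the actual declaration may differ in detail):

  /* A user pointer is now just an ordinary pointer: any external
   * memory location can be reached by dereferencing it directly.
   */
  typedef void * userptr_t;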

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-20 17:28:33 +01:00
71174e19d8 [uaccess] Add explicit casts to and from userptr_t where needed
Allow for the possibility of userptr_t becoming a pointer type by
adding explicit casts where necessary.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-20 17:21:53 +01:00
63d27c6311 [uaccess] Rename userptr_sub() to userptr_diff()
Clarify the intended usage of userptr_sub() by renaming it to
userptr_diff() (to avoid confusion with userptr_add()), and fix the
existing call sites that erroneously use userptr_sub() to subtract an
offset from a userptr_t value.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-20 17:20:30 +01:00
453acba7dc [time] Use currticks() to provide the null system time
For platforms with no real-time clock (such as RISC-V SBI) we use the
null time source, which currently just returns a constant zero.

Switch to using currticks() to provide a clock that does not represent
the real current time, but does at least advance at approximately the
correct rate.  In conjunction with the "ntp" command, this allows
these platforms to use time-dependent features such as X.509
certificate verification for HTTPS connections.
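
A minimal sketch of the approach (the helper and constant names follow
iPXE's timer API, but treat the details as assumptions):

  #include <time.h>
  #include <ipxe/timer.h>  /* currticks() and TICKS_PER_SEC (assumed) */

  /* Return a "time" derived from the tick counter.  It does not
   * correspond to the real date, but advances at roughly one second
   * per second, which is enough for "ntp" to compute a usable offset
   * for X.509 validity checks.
   */
  static time_t nulltime_now ( void ) {
          return ( currticks() / TICKS_PER_SEC );
  }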

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-19 13:35:23 +01:00
423cdbeb39 [riscv] Map DEL to backspace on the SBI debug console
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-19 12:20:59 +01:00
1291dc39fd [cgem] Add a driver for the Cadence GEM NIC
Add a basic driver for the Cadence GEM network interface as emulated
by QEMU when using the RISC-V "sifive_u" machine type.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-19 11:54:08 +01:00
0c482060d5 [undi] Work around broken ASUSTeK KNPA-U16 server PXE ROM
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-17 15:53:28 +01:00
758a504860 [efi] Inhibit calls to Shutdown() for wireless SNP devices
The UEFI model for wireless network configuration is somewhat
underdefined.  At the time of writing, the EDK2 "UEFI WiFi Connection
Manager" driver provides only one way to configure wireless network
credentials, which is to enter them interactively via an HII form.
Credentials are not stored (or exposed via any protocol interface),
and so any temporary disconnection from the wireless network will
inevitably leave the interface in an unusable state that cannot be
recovered without user intervention.

Experimentation shows that at least some wireless network drivers
(observed with an HP Elitebook 840 G10) will disconnect from the
wireless network when the SNP Shutdown() method is called, or if the
device is not polled sufficiently frequently to maintain its
association to the network.  We therefore inhibit calls to Shutdown()
and Stop() for any such SNP protocol interfaces, and mark our network
device as insomniac so that it will be polled even when closed.

Note that we need to inhibit not only our own calls to Shutdown() and
Stop(), but also those that will be attempted by MnpDxe when we
disconnect it from the SNP handle.  We do this by patching the
installed SNP protocol interface structure to modify the Shutdown()
and Stop() method pointers, which is ugly but unavoidable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-17 14:42:18 +01:00
b07cc851f0 [netdevice] Add the concept of an insomniac network device
Some network devices (observed with the SNP interface to the wireless
network card on an HP Elitebook 840 G10) will stop working if they are
left for too long without being polled.

Add the concept of an insomniac network device, that must continue to
be polled even when closed.

Note that drivers are already permitted to call netdev_rx() et al even
when closed: this will already be happening for USB devices since
polling operates at the level of the whole USB bus, rather than at the
level of individual USB devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-17 10:42:22 +01:00
c88ebf2ac6 [efi] Allow for custom methods for disconnecting existing drivers
Allow for greater control over the process used to disconnect existing
drivers from a device handle, by converting the "exclude" field from a
simple protocol GUID to a per-driver method.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-17 10:08:54 +01:00
eeec6442d9 [dt] Provide dt_ioremap() to map device registers
Devicetree devices encode register address ranges within the "reg"
property, with the number of cells used for addresses and for sizes
determined by the #address-cells and #size-cells properties of the
immediate parent device.

Record the number of address and size cells for each device, and
provide a dt_ioremap() function to allow drivers to map a specified
range without having to directly handle the "reg" property.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-15 20:39:28 +01:00
99322fd3b3 [fdt] Add fdt_cells() to read cell-based properties such as "reg"
Add fdt_cells() to read scalar values encoded within a cell array,
reimplement fdt_u64() as a wrapper around this, and add fdt_u32() for
completeness.
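
For background, a self-contained sketch of what reading a value that
spans multiple big-endian 32-bit cells involves (an illustration of
the concept, not the actual fdt_cells() implementation):

  #include <stdint.h>

  /* Assemble a value from "count" big-endian 32-bit cells, where count
   * comes from #address-cells or #size-cells.  __builtin_bswap32()
   * stands in for a be32-to-CPU conversion and assumes a little-endian
   * host.
   */
  static uint64_t read_cells ( const uint32_t *cells, unsigned int count ) {
          uint64_t value = 0;
          unsigned int i;

          for ( i = 0 ; i < count ; i++ ) {
                  value = ( ( value << 32 ) |
                            __builtin_bswap32 ( cells[i] ) );
          }
          return value;
  }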

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-15 20:24:19 +01:00
2c406ec0b1 [netdevice] Add missing bus type identifier for devicetree devices
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-15 14:02:14 +01:00
424839c58a [crypto] Allow for explicit control of external trust sources
We currently disable all external trust sources (such as the UEFI
TlsCaCertificate variable) if an explicit TRUST=... parameter is
provided on the build command line.

Define an explicit TRUST_EXT build parameter that can be used to
explicitly disable external trust sources even if no TRUST=...
parameter is provided, or to explicitly enable external trust sources
even if an explicit TRUST=... parameter is provided.  For example:

   # Default trusted root certificate, disable external sources
   make TRUST_EXT=0

   # Explicit trusted root certificate, enable external sources
   make TRUST=custom.crt TRUST_EXT=1

If no TRUST_EXT parameter is specified, then continue to default to
disabling external trust sources if an explicit TRUST=... parameter is
provided, to maintain backwards compatibility with existing build
command lines.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-15 13:22:00 +01:00
37e9f785ba [dt] Add basic concept of a devicetree bus
Add a basic model for devices instantiated by parsing the system
flattened device tree, with drivers matched via the "compatible"
property for any non-root node.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-14 14:52:51 +01:00
d462aeb0ca [fdt] Remove concept of a device tree cursor
Refactor device tree traversal to operate on the basis of describing
the token at a given offset, with no separate notion of a device tree
cursor.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-14 14:38:40 +01:00
b1125007ca [fdt] Add basic tests for reading values from a flattened device tree
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-14 14:20:31 +01:00
db49346177 [fdt] Avoid temporarily modifying path during path lookup
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-14 13:53:09 +01:00
c887de208f [fdt] Provide fdt_strings() to read string list properties
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-14 11:32:17 +01:00
69af6f0c30 [fdt] Allow for trailing slashes in path lookups
Using fdt_path() to find the root node "/" currently fails, since it
will attempt to find a child node with the empty name "" within the
root node.

Fix by changing fdt_path() to ignore any trailing slashes in a device
tree path.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-14 11:26:49 +01:00
96dfaa7e7a [crypto] Switch to using python-asn1crypto instead of python-asn1
Version 3.0.0 of python-asn1 has a serious defect that causes it to
generate invalid DER.

Fix by switching to the asn1crypto module, which also allows for
simpler code to be used.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-11 12:40:22 +01:00
7e64e9b670 [fdt] Populate boot arguments in constructed device tree
When creating a device tree to pass to a booted operating system,
ensure that the "chosen" node exists, and populate the "bootargs"
property with the image command line.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-01 16:55:28 +01:00
d853448887 [fdt] Identify free space (if any) at end of parsed tree
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-04-01 13:08:41 +01:00
0a48bb3214 [x509] Ensure certificate remains valid during x509_append()
The allocation of memory for the certificate chain link may cause the
certificate itself to be freed by the cache discarder, if the only
current reference to the certificate is held by the certificate store
and the system runs out of memory during the call to malloc().

Ensure that this cannot happen by taking out a temporary additional
reference to the certificate within x509_append(), rather than
requiring the caller to do so.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-31 18:05:11 +01:00
a289b4b8c2 [tls] Support fragmentation of transmitted records
Large transmitted records may arise if we have long client certificate
chains or if a client sends a large block of data (such as a large
HTTP POST payload).  Fragment records as needed to comply with the
value that we advertise via the max_fragment_length extension.
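
A self-contained sketch of the splitting logic (the real code operates
on iPXE I/O buffers and the negotiated fragment length; the names here
are illustrative):

  #include <stddef.h>

  extern int tx_record ( const void *data, size_t len );

  /* Hand the plaintext to the record transmitter in fragments no
   * longer than the advertised max_fragment_length value.
   */
  int tx_fragmented ( const void *data, size_t len, size_t max_frag ) {
          const unsigned char *bytes = data;
          size_t frag_len;
          int rc;

          while ( len ) {
                  frag_len = ( ( len > max_frag ) ? max_frag : len );
                  if ( ( rc = tx_record ( bytes, frag_len ) ) != 0 )
                          return rc;
                  bytes += frag_len;
                  len -= frag_len;
          }
          return 0;
  }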

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-31 16:36:33 +01:00
f115cfcf99 [tls] Send an empty client certificate chain if we have no certificate
RFC5246 states that "a client MAY send no certificates if it does not
have an appropriate certificate to send in response to the server's
authentication request".  This use case may arise when the server is
using optional client certificate verification and iPXE has not been
provided with a client certificate to use.

Treat the absence of a suitable client certificate as a non-fatal
condition and send a Certificate message containing no certificates as
permitted by RFC5246.

Reported-by: Alexandre Ravey <alexandre@voilab.ch>
Originally-implemented-by: Alexandre Ravey <alexandre@voilab.ch>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-31 14:33:16 +01:00
5818529f39 [iobuf] Limit automatic I/O buffer alignment to page size
Without any explicit alignment requirement, we will currently allocate
I/O buffers on their own size rounded up to the nearest power of two.
This is done to simplify driver transmit code paths, which can assume
that a standard Ethernet frame lies within a single physical page and
therefore does not need to be split even for devices with DMA engines
that cannot cross page boundaries.

Limit this automatic alignment to a maximum of the page size, to avoid
requiring excessive alignment for unusually large buffers (such as a
buffer allocated for an HTTP POST with a large parameter list).
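
A self-contained sketch of the alignment calculation described above
(the constant and helper names are illustrative):

  #include <stddef.h>

  #define SKETCH_PAGE_SIZE 4096

  /* Round up to the nearest power of two, so that a buffer no larger
   * than a page cannot cross a page boundary.
   */
  static size_t pow2_roundup ( size_t len ) {
          size_t align = 1;

          while ( align < len )
                  align <<= 1;
          return align;
  }

  /* Default alignment: the buffer's own size rounded up to a power of
   * two, capped at the page size.
   */
  static size_t default_align ( size_t len ) {
          size_t align = pow2_roundup ( len );

          return ( ( align > SKETCH_PAGE_SIZE ) ? SKETCH_PAGE_SIZE : align );
  }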

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-31 13:39:58 +01:00
7fe467a46d [tls] Encrypt data in place to reduce memory usage
Provide a custom xfer_alloc_iob() handler to ensure that transmit I/O
buffers contain sufficient headroom for the TLS record header and
record initialisation vector, and sufficient tailroom for the MAC,
block cipher padding, and authentication tag.  This allows us to use
in-place encryption for the actual data within the I/O buffer, which
essentially halves the amount of memory that needs to be allocated for
a TLS data transmission.
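
A hedged sketch of the buffer layout handling (alloc_iob() and
iob_reserve() are existing iPXE helpers, but the structure and sizes
here are placeholders rather than the real TLS code):

  #include <ipxe/iobuf.h>

  /* Allocate a transmit buffer with headroom for the record header and
   * IV, and tailroom for the MAC, padding, and authentication tag, so
   * that the payload can then be encrypted in place.
   */
  static struct io_buffer * tls_alloc_sketch ( size_t len, size_t headroom,
                                               size_t tailroom ) {
          struct io_buffer *iobuf;

          iobuf = alloc_iob ( headroom + len + tailroom );
          if ( iobuf )
                  iob_reserve ( iobuf, headroom );
          return iobuf;
  }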

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-31 12:42:07 +01:00
d92551a320 [xfer] Use xfer_alloc_iob() for transmit I/O buffers on stream sockets
Datagram sockets such as UDP, ICMP, and Fibre Channel tend to provide
a custom xfer_alloc_iob() handler to ensure that transmit I/O buffers
contain sufficient headroom to accommodate any required protocol
headers.

Stream sockets such as TCP and TLS do not typically provide a custom
xfer_alloc_iob() handler at present.  The default handler simply calls
alloc_iob(), and so stream socket consumers can get away
with using alloc_iob() rather than xfer_alloc_iob().

Fix the HTTP and ONC RPC protocols to use xfer_alloc_iob() where
relevant, in order to operate correctly if the underlying stream
socket chooses to provide a custom xfer_alloc_iob() handler.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-30 21:47:34 +01:00
3937c893ae [isa] Disable legacy ISA device probing by default
Legacy ISA device probing involves poking at various I/O addresses to
guess whether or not a particular device is present.

Actual legacy ISA cards are essentially nonexistent by now, but the
probed I/O addresses have a habit of being reused for various
OEM-specific functions.  This can cause some very undesirable side
effects.  For example, probing for the "ne2k_isa" driver on an HP
Elitebook 840 G10 will cause the system to lock up in a way that
requires two cold reboots to recover.

Enable ISA_PROBE_ONLY in config/isa.h by default.  This limits ISA
probing to use only the addresses specified in ISA_PROBE_ADDRS, which
is empty by default, and so effectively disables ISA probing.  The
vanishingly small number of users who require ISA probing can simply
adjust this configuration in config/local/isa.h.
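
For the users who do still need it, a hedged example of a local
configuration override (the option names come from the commit; check
config/isa.h in the tree for the exact syntax):

  /* config/local/isa.h: either probe only explicitly chosen addresses
   * (these values are placeholders) ...
   */
  #undef ISA_PROBE_ADDRS
  #define ISA_PROBE_ADDRS 0x300, 0x320

  /* ... or remove the restriction entirely to restore full-range
   * probing.
   */
  #undef ISA_PROBE_ONLY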

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 23:01:21 +00:00
4a7f64bf4f [efi] Allow for fact that SNP device may be removed by executed image
The executed image may call DisconnectController() to remove our
network device.  This will leave the net device unregistered but not
yet freed (since our installed PXE base code protocol retains a
reference to the net device).

Unregistration will cause the network upper-layer driver removal
functions to be called, which will free the SNP device structure.
When the image returns from StartImage(), the snpdev pointer may
therefore no longer be valid.

The SNP device structure is not reference counted, and so we cannot
simply take out a reference to ensure that it remains valid across the
call to StartImage().  However, the code path following the call to
StartImage() doesn't actually require the SNP device pointer, only the
EFI device handle.

Store the device handle in a local variable and ensure that snpdev is
invalidated before the call to StartImage() so that future code cannot
accidentally reintroduce this issue.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 22:07:13 +00:00
18dbd05ed5 [efi] Check correct return value from efi_pxe_find()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 22:03:32 +00:00
4bcaa3d380 [efi] Disconnect existing drivers on a per-protocol basis
UEFI does not provide a direct method to disconnect the existing
driver of a specific protocol from a handle.  We currently use
DisconnectController() to remove all drivers from a handle that we
want to drive ourselves, and then rely on recursion in the call to
ConnectController() to reconnect any drivers that did not need to be
disconnected in the first place.

Experience shows that OEMs tend not to ever test the disconnection
code paths in their UEFI drivers, and it is common to find drivers
that refuse to disconnect, fail to close opened handles, fail to
function correctly after reconnection, or lock up the entire system.

Implement a more selective form of disconnection, in which we use
OpenProtocolInformation() to identify the driver associated with a
specific protocol, and then disconnect only that driver.

Perform disconnections in reverse order of attachment priority, since
this is the order likely to minimise the number of cascaded implicit
disconnections.

This allows our MNP driver to avoid performing any disconnections at
all, since it does not require exclusive access to the MNP protocol.
It also avoids performing unnecessary disconnections and reconnections
of unrelated drivers such as the "UEFI WiFi Connection Manager" that
attaches to wireless network interfaces in order to manage wireless
network associations.
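
A hedged sketch of the mechanism in terms of the standard boot
services calls (this is not the iPXE implementation, and error
handling is minimal):

  #include <ipxe/efi/efi.h>

  /* Find any driver that has the protocol opened BY_DRIVER on this
   * handle, and disconnect only that driver.
   */
  static EFI_STATUS disconnect_protocol_driver ( EFI_BOOT_SERVICES *bs,
                                                 EFI_HANDLE handle,
                                                 EFI_GUID *protocol ) {
          EFI_OPEN_PROTOCOL_INFORMATION_ENTRY *info;
          UINTN count;
          UINTN i;
          EFI_STATUS efirc;

          efirc = bs->OpenProtocolInformation ( handle, protocol, &info,
                                                &count );
          if ( EFI_ERROR ( efirc ) )
                  return efirc;
          for ( i = 0 ; i < count ; i++ ) {
                  if ( info[i].Attributes & EFI_OPEN_PROTOCOL_BY_DRIVER ) {
                          bs->DisconnectController ( handle,
                                                     info[i].AgentHandle,
                                                     NULL );
                  }
          }
          bs->FreePool ( info );
          return EFI_SUCCESS;
  }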

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 20:26:06 +00:00
7737fec5c6 [efi] Define an attachment priority order for EFI drivers
Define an ordering for internal EFI drivers on the basis of how close
the driver is to the hardware, and attempt to start drivers in this
order.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 18:44:34 +00:00
be33224754 [efi] Show all drivers claiming support for a handle in debug messages
UEFI assumes in several places that an image installs only a single
driver binding protocol instance, and that this is installed on the
image handle itself.  We therefore provide a single driver binding
protocol instance, which delegates to the various internal drivers
(for EFI_PCI_IO_PROTOCOL, EFI_USB_IO_PROTOCOL, etc) as appropriate.

The debug messages produced by our Supported() method can end up
slightly misleading, since they will report only the first internal
driver that claims support for a device.  In the common case of the
all-drivers build, there may be multiple drivers that claim support
for the same handle: for example, the PCI, NII, SNP, and MNP drivers
are all likely to initially find the protocols that they need on the
same device handle.

Report all internal drivers that claim support for a device, to avoid
confusing debug messages.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 18:44:34 +00:00
ea5762d9d0 [efi] Return success from Stop() if driver is already stopped
Return success if asked to stop driving a device that we are not
currently driving.  This avoids propagating spurious errors to an
external caller of DisconnectController().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 18:44:34 +00:00
7adce3a13e [efi] Add various well-known GUIDs encountered in WiFi boot
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-28 21:01:42 +00:00
b20f506a72 [efi] Install a device tree for the booted OS, if available
If we have a device tree available (e.g. because the user has
explicitly downloaded a device tree using the "fdt" command), then
provide it to the booted operating system as an EFI configuration
table.

Since x86 does not typically use device trees, we create weak symbols
for efi_fdt_install() and efi_fdt_uninstall() to avoid dragging FDT
support into all x86 UEFI binaries.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-28 15:29:53 +00:00
761f43ce12 [fdt] Provide the ability to create a device tree for a booted OS
Provide fdt_create() to create a device tree to be passed to a booted
operating system.  The device tree will be created from the FDT image
(if present), falling back to the system device tree (if present).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-28 15:29:51 +00:00
666929e311 [efi] Create a copy of the system flattened device tree, if present
EFI configuration tables may be freed at any time, and there is no way
to be notified when the table becomes invalidated.  Create a copy of
the system flattened device tree (if present), so that we do not risk
being left with an invalid pointer.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-28 15:29:20 +00:00
3860313dd5 [fdt] Allow for parsing device trees where the length is known in advance
Allow for parsing device trees where an external factor (such as a
downloaded image length) determines the maximum length, which must be
validated against the length within the device tree header.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-28 15:11:39 +00:00
2399c79980 [fdt] Allow for the existence of multiple device trees
When running on a platform that uses FDT as its hardware description
mechanism, we are likely to have multiple device tree structures.  At
a minimum, there will be the device tree passed to us from the
previous boot stage (e.g. OpenSBI), and the device tree that we
construct to be passed to the booted operating system.

Update the internal FDT API to include an FDT pointer in all function
parameter lists.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-28 14:14:32 +00:00
09fbebc084 [fdt] Add the "fdt" command
Allow a Flattened Device Tree blob (DTB) to be provided to a booted
operating system using a script such as:

  #!ipxe
  kernel /images/vmlinuz console=ttyAMA0
  initrd /images/initrd.img
  fdt /images/rk3566-radxa-zero-3e.dtb
  boot

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-27 15:36:39 +00:00
cfd93465ec [fdt] Add the concept of an FDT image
Define the concept of an "FDT" image, representing a Flattened Device
Tree blob that has been downloaded in order to be provided to a kernel
or other executable image.  FDT images are represented using an image
tag (as with other special-purpose images such as the UEFI shim), and
are similarly marked as hidden so that they will not be included in a
generated magic initrd or show up in a virtual filesystem directory
listing.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-27 15:36:39 +00:00
98f86b4d0a [efi] Add support for installing EFI configuration tables
Add the ability to install and uninstall arbitrary EFI configuration
tables.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-27 15:36:39 +00:00
f0caf90a72 [efi] Add flattened device tree header and GUID definitions
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-27 14:48:04 +00:00
ec8c5a5fbb [efi] Add ACPI and SMBIOS tables as well-known GUIDs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-27 14:48:04 +00:00
0b606221cb [undi] Ensure forward progress is made even if UNDI IRQ is stuck
If the UNDI interrupt remains constantly asserted (e.g. because the
BIOS has enabled interrupts for an unrelated device sharing the same
IRQ, or because of bugs in the OEM UNDI driver), then we may get stuck
in an interrupt storm.

We cannot safely chain to the previous interrupt handler (which could
plausibly handle an unrelated device interrupt) since there is no
well-defined behaviour for previous interrupt handlers.  We have
observed BIOSes to provide default interrupt handlers that variously
do nothing, send EOI, disable the IRQ, or crash the system.

Fix by disabling the UNDI interrupt whenever our handler is triggered,
and rearm it as needed when polling the network device.  This ensures
that forward progress continues to be made even if something causes
the interrupt to be constantly asserted.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-26 14:56:20 +00:00
4134280bcd [pxeprefix] Ensure that UNDI IRQ is disabled before starting iPXE
When using the undionly.kkpxe binary (which is never recommended), the
UNDI interrupt may still be enabled when iPXE starts up.  If the PXE
base code interrupt handler is not well-behaved, this can result in
undefined behaviour when interrupts are first enabled (e.g. for
entropy gathering, or for allowing the timer tick to occur).

Fix by detecting and disabling the UNDI interrupt during the prefix
code.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-26 14:56:13 +00:00
e8365f7a51 [pxeprefix] Work around missing type values from PXENV_UNDI_GET_NIC_TYPE
The implementation of PXENV_UNDI_GET_NIC_TYPE in some PXE ROMs
(observed with an Intel X710 ROM in a Dell PowerEdge R6515) will fail
to write the NicType byte, leaving it uninitialised.

Prepopulate the NicType byte with a highly unlikely value as a
sentinel to allow us to detect this, and assume that any such devices
are overwhelmingly likely to be PCI devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-26 12:02:27 +00:00
32a9408217 [efi] Allow use of typed pointers for efi_open() et al
Provide wrapper macros to allow efi_open() and related functions to
accept a pointer to any pointer type as the "interface" argument, in
order to allow a substantial amount of type adjustment boilerplate to
be removed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 15:43:56 +00:00
37897fbd40 [efi] Eliminate uses of HandleProtocol()
It is now simpler to use efi_open() than to use HandleProtocol() to
obtain an ephemeral protocol instance.  Remove all remaining uses of
HandleProtocol() to simplify the code.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 14:25:10 +00:00
bac3187439 [efi] Use efi_open() for all ephemeral protocol opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 13:19:26 +00:00
5a5e2a1dae [efi] Use efi_open_unsafe() for all explicitly unsafe protocol opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 13:19:26 +00:00
9dd30f11f7 [efi] Use efi_open_by_driver() for all by-driver protocol opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 13:19:26 +00:00
4561a03766 [efi] Use efi_open_by_child() for all by-child protocol opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 13:19:26 +00:00
358db15612 [efi] Create safe wrappers for OpenProtocol() and CloseProtocol()
The UEFI model for opening and closing protocols is broken by design
and cannot be repaired.

Calling OpenProtocol() to obtain a protocol interface pointer does
not, in general, provide any guarantees about the lifetime of that
pointer.  It is theoretically possible that the pointer has already
become invalid by the time that OpenProtocol() returns the pointer to
its caller.  (This can happen when a USB device is physically removed,
for example.)

Various UEFI design flaws make it occasionally necessary to hold on to
a protocol interface pointer despite the total lack of guarantees that
the pointer will remain valid.

The UEFI driver model overloads the semantics of OpenProtocol() to
accommodate the use cases of recording a driver attachment (which is
modelled as opening a protocol with EFI_OPEN_PROTOCOL_BY_DRIVER
attributes) and recording the existence of a related child controller
(which is modelled as opening a protocol with
EFI_OPEN_PROTOCOL_BY_CHILD_CONTROLLER attributes).

The parameters defined for CloseProtocol() are not sufficient to allow
the implementation to precisely identify the matching call to
OpenProtocol().  While the UEFI model appears to allow for matched
open and close pairs, this is merely an illusion.  Calling
CloseProtocol() will delete *all* matching records in the protocol
open information tables.

Since the parameters defined for CloseProtocol() do not include the
attributes passed to OpenProtocol(), this means that a matched
open/close pair using EFI_OPEN_PROTOCOL_GET_PROTOCOL can inadvertently
end up deleting the record that defines a driver attachment or the
existence of a child controller.  This in turn can cause some very
unexpected side effects, such as allowing other UEFI drivers to start
controlling hardware to which iPXE believes it has exclusive access.
This rarely ends well.

To prevent this kind of inadvertent deletion, we establish a
convention for four different types of protocol opening:

- ephemeral opens: always opened with ControllerHandle = NULL

- unsafe opens: always opened with ControllerHandle = AgentHandle

- by-driver opens: always opened with ControllerHandle = Handle

- by-child opens: always opened with ControllerHandle != Handle

This convention ensures that the four types of open never overlap
within the set of parameters defined for CloseProtocol(), and so a
close of one type cannot inadvertently delete the record corresponding
to a different type.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 13:19:23 +00:00
48d1680127 [efi] Remove the efipci_open() and efipci_close() wrappers
In preparation for formalising the way that EFI protocols are opened
across the codebase, remove the efipci_open() and efipci_close()
wrappers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 12:05:30 +00:00
3283885326 [efi] Avoid function name near-collision
We currently have both efipci_info() and efi_pci_info() serving
different but related purposes.  Rename the latter to reduce
confusion.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-23 22:29:30 +00:00
331bbf5075 [efi] Remove spurious close of SNP device parent's device path
Commit e727f57 ("[efi] Include a copy of the device path within struct
efi_device") neglected to delete the closure of the parent's device
path from the success code path in efi_snp_probe().

Reduce confusion by removing this (harmless) additional close.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-23 18:24:10 +00:00
8249bbc098 [efi] Use driver name only from driver binding handles in debug messages
Some non-driver handles may have an installed component name protocol.
In particular, iPXE itself installs these protocols on its SNP device
handles, to simplify the process of delegating GetControllerName()
from our single-instance driver binding protocol to whatever child
controllers the relevant EFI driver may have installed.

For non-driver handles, the device path is more useful as debugging
information than the driver name.  Limit the use of the component name
protocols to handles with a driver binding protocol installed, so that
we will end up using the device path for non-driver handles such as
the SNP device.

Continue to prefer the driver name to the device path for handles with
a driver binding protocol installed, since these will generally map to
things we are likely to conceptualise as drivers rather than as
devices.

Note that we deliberately do not use GetControllerName() to attempt to
get a human-readable name for a controller handle.  In the normal
course of events, iPXE is likely to disconnect at least some existing
drivers from their controller handles.  This would cause the name
obtained via GetControllerName() to change.  By using the device path
instead, we ensure that the debug message name remains the same even
when the driver controlling the handle is changed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-21 17:15:38 +00:00
02ecb23d10 [efi] Get veto candidate driver name via either component name protocol
Attempt to get the veto candidate driver name from both the current
and obsolete versions of the component name protocol.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-20 15:17:08 +00:00
756e3907fd [efi] Get veto candidate driver name from image handle
Allow for drivers that do not install the driver binding protocol on
the image handle by opening the component name protocol on the driver
binding's ImageHandle rather than on the driver handle itself.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-20 14:39:52 +00:00
be5bf0aa7a [efi] Show image address range in veto debug messages
When hunting down a misbehaving OEM driver to add it to the veto list,
it can be very useful to know the address ranges used by each driver.
Add this information to the verbose debug messages.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-20 14:30:34 +00:00
5d64469a9e [efi] Prefer driver name to device path for debug messages
The driver name is usually more informative for debug messages than
the device path from which a driver was loaded.  Try using the various
mechanisms for obtaining a driver name before trying the device path.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-20 14:20:57 +00:00
7cda3dbf94 [efi] Attempt to retrieve driver name from image handle for debug messages
Not all drivers will install the driver binding protocol on the image
handle.  Accommodate these drivers by attempting to retrieve the
driver name via the component name protocol(s) located on the driver
binding's ImageHandle, as well as on the driver handle itself.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-20 14:20:36 +00:00
1a602c92ac [efi] Allow wrapping the global boot services table in situ
When DEBUG=efi_wrap is enabled, we construct a patched copy of the
boot services table and patch the global system table to point to this
copy.  This ensures that any subsequently loaded EFI binaries will
call our wrappers.

Previously loaded EFI binaries will typically have cached the boot
services table pointer (in the gBS variable used by EDK2 code), and
therefore will not pick up the updated pointer and so will not call
our wrappers.  In most cases, this is what we want to happen: we are
interested in tracing the calls issued by the newly loaded binary and
we do not want to be distracted by the high volume of boot services
calls issued by existing UEFI drivers.

In some circumstances (such as when a badly behaved OEM driver is
causing the system to lock up during the ExitBootServices() call), it
can be very useful to be able to patch the global boot services table
in situ, so that we can trace calls issued by existing drivers.

Restructure the wrapping code to allow wrapping to be enabled or
disabled at any time, and to allow for patching the global boot
services table in situ.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-20 12:35:42 +00:00
f68c8b09e3 [efi] Fix debug wrappers for CloseEvent() and CheckEvent()
The debug wrappers for CloseEvent() and CheckEvent() are currently
both calling SignalEvent() instead (presumably due to copy-paste
errors).  Astonishingly, this has generally not prevented a successful
boot in the (very rare) case that DEBUG=efi_wrap is enabled.

Fix the wrappers to call the intended functions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-19 16:20:27 +00:00
37ea181d8b [efi] Ignore path separator characters in virtual filenames
The virtual filesystem that we provide to expose downloaded images
will erroneously interpret filenames with redundant path separators
such as ".\filename" as an attempt to open the directory, rather than
an attempt to open "filename".

This shows up most obviously when chainloading from one iPXE into
another iPXE, when the inner iPXE may end up attempting to open
".\autoexec.ipxe" from the outer iPXE's virtual filesystem.  (The
erroneously opened file will have a zero length and will therefore be
ignored, but is still confusing.)

Fix by discarding any dot or backslash characters after a potential
initial backslash.  This is very liberal and will accept some
syntactically invalid paths, but this is acceptable since our virtual
filesystem does not implement directories anyway.
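
A self-contained sketch of the filtering described above (not the
actual implementation):

  /* Skip an optional leading backslash, then discard any further dot
   * or backslash characters, so that names such as ".\filename" and
   * "\filename" both resolve to "filename".
   */
  static const char * strip_redundant_separators ( const char *name ) {
          if ( *name == '\\' )
                  name++;
          while ( ( *name == '.' ) || ( *name == '\\' ) )
                  name++;
          return name;
  }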

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-18 16:21:10 +00:00
6e4196baff [efi] Prescroll the display after a failed wrapped ExitBootServices() call
On some systems (observed with an HP Elitebook 840 G10), writing
console output that happens to cause the display to scroll will modify
the system memory map.  This causes builds with DEBUG=efi_wrap to
typically fail to boot, since the debug output from the wrapped
ExitBootServices() call itself is sufficient to change the memory map
and therefore cause ExitBootServices() to fail due to an invalid
memory map key.

Work around these UEFI firmware bugs by prescrolling the display after
a failed ExitBootServices() attempt, in order to minimise the chance
that further scrolling will happen during the subsequent attempt.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-18 14:13:56 +00:00
8ea8411f0d [efi] Add EFI_RNG_PROTOCOL_GUID as a well-known GUID
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-18 12:49:19 +00:00
42a29d5681 [crypto] Update cmsdetach to work with python-asn1 version 3.0.0
The python-asn1 documentation indicates that end of file may be
detected either by obtaining a True value from .eof() or by obtaining
a None value from .peek(), but does not mention any way to detect the
end of a constructed tag (rather than the end of the overall file).
We currently use .eof() to detect the end of a constructed tag, based
on the observed behaviour of the library.

The behaviour of .eof() changed between versions 2.8.0 and 3.0.0, such
that .eof() no longer returns True at the end of a constructed tag.

Switch to testing for a None value returned from .peek() to determine
when we have reached the end of a constructed tag, since this works on
both newer and older versions.

Continue to treat .eof() as a necessary but not sufficient condition
for reaching the overall end of file, to maintain compatibility with
older versions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-17 11:48:06 +00:00
829e2d1f29 [rng] Restore state of IRQ 8 and PIE when disabling entropy gathering
Legacy IRQ 8 appears to be enabled by default on some platforms.  If
iPXE selects the RTC entropy source, this will currently result in the
RTC IRQ 8 being unconditionally disabled.  This can break assumptions
made by BIOSes or subsequent bootloaders: in particular, the FreeBSD
loader may lock up at the point of starting its default 10-second
countdown when it calls INT 15,86.

Fix by restoring the previous state of IRQ 8 instead of disabling it
unconditionally.  Note that we do not need to disable IRQ 8 around the
point of hooking (or unhooking) the ISR, since this code will be
executing in iPXE's normal state of having interrupts disabled anyway.

Also restore the previous state of the RTC periodic interrupt enable,
rather than disabling it unconditionally.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-14 15:08:05 +00:00
8840de4096 [pic8259] Return previous state when enabling or disabling IRQs
Return the previous interrupt enabled state from enable_irq() and
disable_irq(), to allow callers to more easily restore this state.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-14 14:09:26 +00:00
d1133956d1 [contrib] Update bochsrc.txt to work with current versions
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-14 12:46:02 +00:00
ddc2d928d2 [efi] Accept and trust CA certificates in the TlsCaCertificates variable
UEFI's built-in HTTPS boot mechanism requires the trusted CA
certificates to be provided via the TlsCaCertificates variable.
(There is no equivalent of the iPXE cross-signing mechanism, so it is
not possible for UEFI to automatically use public CA certificates.)

Users who have configured UEFI HTTPS boot to use a custom root of
trust (e.g. a private CA certificate) may find it useful to have iPXE
automatically pick up and use this same root of trust, so that iPXE
can seamlessly fetch files via HTTPS from the same servers that were
trusted by UEFI HTTPS boot, in addition to servers that iPXE can
validate through other means such as cross-signed certificates.

Parse the TlsCaCertificates variable at startup, add any certificates
to the certificate store, and mark these certificates as trusted.

There are no access restrictions on modifying the TlsCaCertificates
variable: anybody with access to write UEFI variables is permitted to
change the root of trust.  The UEFI security model assumes that anyone
with access to run code prior to ExitBootServices() or with access to
modify UEFI variables from within a loaded operating system is
supposed to be able to change the system's root of trust for TLS.

Any certificates parsed from TlsCaCertificates will show up in the
output of "certstat", and may be discarded using "certfree" if
unwanted.

Support for parsing TlsCaCertificates is enabled by default in EFI
builds, but may be disabled in config/general.h if needed.

As with the ${trust} setting, the contents of the TlsCaCertificates
variable will be ignored if iPXE has been compiled with an explicit
root of trust by specifying TRUST=... on the build command line.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-13 15:54:43 +00:00
aa49ce5b1d [efi] Add TLS authentication header and GUID definitions
Add the TlsAuthentication.h header from EDK2's NetworkPkg, along with
a GUID definition for EFI_TLS_CA_CERTIFICATE_GUID.

It is unclear whether or not the TlsCaCertificate variable is intended
to be a UEFI standard.  Its presence in NetworkPkg (rather than
MdePkg) suggests not, but the choice of EFI_TLS_CA_CERTIFICATE_GUID
(rather than e.g. EDKII_TLS_CA_CERTIFICATE_GUID) suggests that it is
intended to be included in future versions of the standard.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-13 14:04:41 +00:00
2a901a33df [efi] Add EFI_GLOBAL_VARIABLE as a well-known GUID
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-13 14:04:40 +00:00
da3024d257 [cpio] Allow for the construction of pure directories
Allow for the possibility of creating empty directories (without
having to include a dummy file inside the directory) using a
zero-length image and a CPIO filename with a trailing slash, such as:

  initrd emptyfile /usr/share/oem/

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-12 14:32:41 +00:00
d6ee9a9242 [cpio] Fix calculation of name lengths in CPIO headers
Commit 12ea8c4 ("[cpio] Allow for construction of parent directories
as needed") introduced a regression in constructing CPIO archive
headers for relative paths (e.g. simple filenames with no leading
slash).

Fix by counting the number of path components rather than the number
of path separators, and add some test cases to cover CPIO header
construction.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-12 14:27:44 +00:00
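To make the "components rather than separators" distinction concrete,
here is a stand-alone sketch (not the iPXE code itself) that counts
path components, so that "vmlinuz", "/vmlinuz" and "vmlinuz/" all
count as a single component:

  /* Sketch only: count path components, ignoring leading, trailing
   * and repeated separators.
   */
  static unsigned int cpio_name_components ( const char *name ) {
          unsigned int count = 0;

          while ( *name ) {
                  /* Skip any run of separators */
                  while ( *name == '/' )
                          name++;
                  if ( ! *name )
                          break;
                  /* Count this component and skip over it */
                  count++;
                  while ( *name && ( *name != '/' ) )
                          name++;
          }
          return count;
  }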
5f3ecbde5a [crypto] Support extracting certificates from EFI signature list images
Add support for the EFI signature list image format (as produced by
tools such as efisecdb).

The parsing code does not require any EFI boot services functions and
so may be enabled even in non-EFI builds.  We default to enabling it
only for EFI builds.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-11 12:58:19 +00:00
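For reference, the format being parsed is the standard UEFI
EFI_SIGNATURE_LIST layout.  A simplified sketch of walking such an
image follows; handle_cert() is a hypothetical callback, and the real
parser performs further validation:

  /* Sketch only: iterate over the signature lists in an image and
   * extract each signature body (a DER certificate, in the case of
   * EFI_CERT_X509_GUID lists).
   */
  static void walk_siglists ( const void *data, size_t len ) {
          const EFI_SIGNATURE_LIST *list;
          const uint8_t *sig;
          size_t hdrlen;
          size_t remaining;

          while ( len >= sizeof ( *list ) ) {
                  list = data;
                  hdrlen = ( sizeof ( *list ) + list->SignatureHeaderSize );
                  if ( ( list->SignatureListSize > len ) ||
                       ( list->SignatureListSize < hdrlen ) ||
                       ( list->SignatureSize <= sizeof ( EFI_GUID ) ) )
                          break;
                  /* Signature entries follow the optional header area */
                  sig = ( ( ( const uint8_t * ) list ) + hdrlen );
                  remaining = ( list->SignatureListSize - hdrlen );
                  while ( remaining >= list->SignatureSize ) {
                          /* Each EFI_SIGNATURE_DATA entry is a 16-byte
                           * SignatureOwner GUID followed by the
                           * signature body itself.
                           */
                          handle_cert ( ( sig + sizeof ( EFI_GUID ) ),
                                        ( list->SignatureSize -
                                          sizeof ( EFI_GUID ) ) );
                          sig += list->SignatureSize;
                          remaining -= list->SignatureSize;
                  }
                  /* Advance to the next signature list */
                  data += list->SignatureListSize;
                  len -= list->SignatureListSize;
          }
  }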
26a8fed710 [crypto] Allow for parsing of DER data separate from DER images
We currently provide pem_asn1() to allow for parsing of PEM data that
is not necessarily contained in an image.  Provide an equivalent
function der_asn1() to allow for similar parsing of DER data.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-11 12:36:23 +00:00
011c778f06 [efi] Allow efi_guid_ntoa() to be used in non-EFI builds
The debug message transcription of well-known EFI GUIDs does not
require any EFI boot services calls.  Move this code from efi_debug.c
to efi_guid.c, to allow it to be linked in to non-EFI builds.

We continue to rely on linker garbage collection to ensure that the
code is omitted completely from any non-debug builds.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-11 11:52:37 +00:00
8706ae36d3 [efi] Add EFI_SIGNATURE_LIST header and GUID definitions
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-10 12:34:35 +00:00
a3ede10788 [efi] Update to current EDK2 headers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-10 12:34:35 +00:00
32d706a9ff [build] Use -fshort-wchar when building EFI host utilities
The EFI host utilities (such as elf2efi64, efirom, etc) include the
EDK2 headers, which include static assertions to ensure that they are
built with -fshort-wchar enabled.  When building the host utilities,
we currently bypass these assertions by defining MDE_CPU_EBC.  The EBC
compiler apparently does not support static assertions, and defining
MDE_CPU_EBC therefore causes EDK2's Base.h to define STATIC_ASSERT()
as a no-op.

Newer versions of the EDK2 headers omit the check for MDE_CPU_EBC (and
will presumably therefore fail to build with the EBC compiler).  This
causes our host utility builds to fail since the static assertion now
detects that we are building with the host's default ABI (i.e. without
enabling -fshort-wchar).

Fix by enabling -fshort-wchar when building EFI host utilities.  This
produces binaries that are technically incompatible with the host ABI.
However, since our host utilities never handle any wide-character
strings, this nominal ABI incompatibility has no effect.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-10 12:34:35 +00:00
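The assertions in question are along these lines (paraphrased from
EDK2's Base.h), and can only hold when wchar_t is two bytes wide,
i.e. when -fshort-wchar is in effect:

  /* Paraphrased: fails to compile unless CHAR16 literals are 2 bytes */
  STATIC_ASSERT ( sizeof ( L'A' ) == 2,
                  "sizeof (L'A') does not meet UEFI Specification Data Type requirements" );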
82fac51626 [efi] Mark UsbHostController.h as a non-imported header
The UsbHostController.h header has been removed from the EDK2 codebase
since it was never defined in a released UEFI specification.  However,
we may still encounter it in the wild and so it is useful to retain
the GUID and the corresponding protocol name for debug messages.

Add an iPXE include guard to this file so that the EDK2 header import
script will no longer attempt to import it from the EDK2 tree.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-10 11:15:04 +00:00
be3a78eaf8 [lkrnprefix] Support a longer version string
The bzImage specification allows two bytes for the setup code jump
instruction at offset 0x200, which limits its relative offset to +0x7f
bytes.  This imposes an upper limit on the length of the version
string, which currently precedes the setup code.

Fix by moving the version string to the .prefix.data section, so that
it no longer affects the placement of the setup code.

Originally-fixed-by: Miao Wang <shankerwangmiao@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-02-28 11:32:42 +00:00
12ea8c4074 [cpio] Allow for construction of parent directories as needed
iPXE allows individual raw files to be automatically wrapped with
suitable CPIO headers and injected into the magic initrd image as
exposed to a booted Linux kernel.  This feature is currently limited
to placing files within directories that already exist in the initrd
filesystem.

Remove this limitation by adding the ability for iPXE to construct
CPIO headers for parent directories as needed, under control of the
"mkdir=<n>" command-line argument.  For example:

  initrd config.ign /usr/share/oem/config.ign mkdir=1

will create CPIO headers for the "/usr/share/oem" directory as well as
for the "/usr/share/oem/config.ign" file itself.

This simplifies the process of booting operating systems such as
Flatcar Linux, which otherwise require the single "config.ign" file to
be manually wrapped up as a CPIO archive solely in order to create the
relevant parent directory entries.

The value <n> may be used to control the number of parent directory
entries that are created.  For example, "mkdir=2" would cause up to
two parent directories to be created (i.e. "/usr/share" and
"/usr/share/oem" in the above example).  A negative value such as
"mkdir=-1" may be used to create all parent directories up to the root
of the tree.

Do not create any parent directory entries by default, since doing so
would potentially cause the modes and ownership information for
existing directories to be overwritten.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-02-24 14:37:26 +00:00
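As a stand-alone illustration of which parent entries a given
"mkdir=<n>" value would generate (a sketch only, not the iPXE
implementation):

  #include <stdio.h>
  #include <string.h>

  /* Sketch only: print the parent directories that would gain CPIO
   * headers for a given path and "mkdir=<n>" limit.  A negative limit
   * means "all parents up to the root".
   */
  static void cpio_show_parents ( const char *path, int limit ) {
          size_t len = strlen ( path );

          /* Walk backwards: each non-leading '/' terminates one
           * parent directory name, deepest first.
           */
          while ( len && limit ) {
                  len--;
                  if ( ( path[len] == '/' ) && ( len > 0 ) ) {
                          printf ( "%.*s\n", ( int ) len, path );
                          if ( limit > 0 )
                                  limit--;
                  }
          }
  }

For example, cpio_show_parents ( "/usr/share/oem/config.ign", 2 )
prints "/usr/share/oem" followed by "/usr/share", matching the
behaviour described above.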
571 changed files with 25865 additions and 9250 deletions

.github/FUNDING.yml vendored Normal file
View File

@ -0,0 +1 @@
github: [mcb30, NiKiZe]

View File

@ -104,6 +104,9 @@ def import_image(region, name, family, architecture, image, public, overwrite,
image_id = image['ImageId']
client.get_waiter('image_available').wait(ImageIds=[image_id])
if public:
image_block = client.get_image_block_public_access_state()
if image_block['ImageBlockPublicAccessState'] != 'unblocked':
client.disable_image_block_public_access()
resource.Image(image_id).modify_attribute(Attribute='launchPermission',
OperationType='add',
UserGroups=['all'])

View File

@ -7,8 +7,9 @@ message into a separate file.
"""
import argparse
from pathlib import Path
import asn1
from asn1crypto.cms import ContentInfo, AuthEnvelopedData, EnvelopedData
# Parse command-line arguments
#
@ -16,65 +17,45 @@ parser = argparse.ArgumentParser(
description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument("-d", "--data", metavar="FILE",
parser.add_argument("-d", "--data", metavar="FILE", type=Path,
help="Write detached data (without envelope) to FILE")
parser.add_argument("-e", "--envelope", metavar="FILE",
parser.add_argument("-e", "--envelope", metavar="FILE", type=Path,
help="Write envelope (without data) to FILE")
parser.add_argument("-o", "--overwrite", action="store_true",
help="Overwrite output files")
parser.add_argument("file", help="Input envelope file")
parser.add_argument("file", type=Path, help="Input envelope file")
args = parser.parse_args()
if args.data is None and args.envelope is None:
parser.error("at least one of --data and --envelope is required")
outmode = "wb" if args.overwrite else "xb"
# Create decoder
# Read input envelope
#
decoder = asn1.Decoder()
with open(args.file, mode="rb") as fh:
decoder.start(fh.read())
envelope = ContentInfo.load(args.file.read_bytes())
# Create encoder
# Locate encrypted content info
#
encoder = asn1.Encoder()
encoder.start()
# Detach encrypted data
#
data = None
datastack = [
asn1.Numbers.Sequence, 0, asn1.Numbers.Sequence, asn1.Numbers.Sequence
]
stack = []
while stack or not decoder.eof():
if decoder.eof():
encoder.leave()
decoder.leave()
stack.pop()
else:
tag = decoder.peek()
if tag.typ == asn1.Types.Constructed:
encoder.enter(nr=tag.nr, cls=tag.cls)
decoder.enter()
stack.append(tag.nr)
else:
(tag, value) = decoder.read()
if stack == datastack and tag.nr == 0:
data = value
else:
encoder.write(value, nr=tag.nr, cls=tag.cls)
envelope = encoder.output()
if data is None:
content = envelope["content"]
if type(content) is AuthEnvelopedData:
encinfo = content["auth_encrypted_content_info"]
elif type(content) is EnvelopedData:
encinfo = content["encrypted_content_info"]
else:
parser.error("Input file does not contain any encrypted data")
# Detach encrypted content data
#
data = encinfo["encrypted_content"]
del encinfo["encrypted_content"]
# Write envelope (without data), if applicable
#
if args.envelope:
with open(args.envelope, mode=outmode) as fh:
fh.write(envelope)
with args.envelope.open(mode=outmode) as fh:
fh.write(envelope.dump())
# Write data (without envelope), if applicable
#
if args.data:
with open(args.data, mode=outmode) as fh:
fh.write(data)
with args.data.open(mode=outmode) as fh:
fh.write(data.contents)

View File

@ -25,12 +25,12 @@ plugin_ctrl: unmapped=1, biosdev=1, speaker=1, e1000=1, parallel=1, serial=1
# allows you to change all the settings that control Bochs's behavior.
# Depending on the platform there are up to 3 choices of configuration
# interface: a text mode version called "textconfig" and two graphical versions
# called "win32config" and "wx". The text mode version uses stdin/stdout and
# is always compiled in, unless Bochs is compiled for wx only. The choice
# "win32config" is only available on win32 and it is the default there.
# The choice "wx" is only available when you use "--with-wx" on the configure
# command. If you do not write a config_interface line, Bochs will
# choose a default for you.
# called "win32config" and "wx". The text mode version uses stdin/stdout or
# gui console (if available / runtime config) and is always compiled in, unless
# Bochs is compiled for wx only. The choice "win32config" is only available on
# win32/win64 and it is the default on these platforms. The choice "wx" is only
# available when Bochs is compiled with wxWidgets support. If you do not write
# a config_interface line, Bochs will choose a default for you.
#
# NOTE: if you use the "wx" configuration interface, you must also use
# the "wx" display library.
@ -73,12 +73,14 @@ plugin_ctrl: unmapped=1, biosdev=1, speaker=1, e1000=1, parallel=1, serial=1
# "cmdmode" - call a headerbar button handler after pressing F7 (sdl, sdl2,
# win32, x)
# "fullscreen" - startup in fullscreen mode (sdl, sdl2)
# "gui_debug" - use GTK debugger gui (sdl, sdl2, x) / Win32 debugger gui (sdl,
# sdl2, win32)
# "hideIPS" - disable IPS output in status bar (rfb, sdl, sdl2, term, vncsrv,
# win32, wx, x)
# "nokeyrepeat" - turn off host keyboard repeat (sdl, sdl2, win32, x)
# "no_gui_console" - use system console instead of builtin gui console
# (rfb, sdl, sdl2, vncsrv, x)
# "timeout" - time (in seconds) to wait for client (rfb, vncsrv)
# "gui_debug" - This option is DEPRECATED, use command line option '-dbg_gui'
# instead. It also supports the 'globalini' extension
#
# See the examples below for other currently supported options.
# Setting up options without specifying display library is also supported.
@ -113,9 +115,12 @@ plugin_ctrl: unmapped=1, biosdev=1, speaker=1, e1000=1, parallel=1, serial=1
#
# CPU configurations that can be selected:
# -----------------------------------------------------------------
# i386 Intel 386SX
# i486dx4 Intel 486DX4
# pentium Intel Pentium (P54C)
# pentium_mmx Intel Pentium MMX
# amd_k6_2_chomper AMD-K6(tm) 3D processor (Chomper)
# athlon_xp AMD Athlon(tm) XP Processor
# p2_klamath Intel Pentium II (Klamath)
# p3_katmai Intel Pentium III (Katmai)
# p4_willamette Intel(R) Pentium(R) 4 (Willamette)
@ -136,6 +141,26 @@ plugin_ctrl: unmapped=1, biosdev=1, speaker=1, e1000=1, parallel=1, serial=1
# corei7_ivy_bridge_3770k Intel(R) Core(TM) i7-3770K CPU (Ivy Bridge)
# corei7_haswell_4770 Intel(R) Core(TM) i7-4770 CPU (Haswell)
# broadwell_ult Intel(R) Processor 5Y70 CPU (Broadwell)
# corei7_skylake_x Intel(R) Core(TM) i7-7800X CPU (Skylake)
# corei3_cnl Intel(R) Core(TM) i3-8121U CPU (Cannonlake)
# corei7_icelake_u QuadCore Intel Core i7-1065G7 (IceLake)
# tigerlake 11th Gen Intel(R) Core(TM) i5-1135G7 (TigerLake)
# sapphire_rapids Intel(R) Xeon(R) w9-3475X (Sapphire Rapids)
# arrow_lake 15th Gen Intel(R) Core(TM) Ultra 5 245K (ArrowLake)
#
# ADD_FEATURES:
# Enable one or more CPU features in the CPU configuration selected by MODEL.
# Could be useful for testing the CPU with newer imaginary configurations by
# adding a specific feature or set of features to an existing MODEL. The list
# of features to add is supplied as a space- or comma-separated string.
#
# EXCLUDE_FEATURES:
# Disable one or more CPU features from the CPU configuration selected by
# MODEL. Could be useful for testing the CPU without a specific feature or set
# of features. When experiencing issues booting a modern OS it can be useful
# to disable CPU feature(s) to see if they are responsible for the failures.
# The list of features to exclude is supplied as a space- or comma-separated
# string.
#
# COUNT:
# Set the number of processors:cores per processor:threads per core when
@ -196,151 +221,6 @@ plugin_ctrl: unmapped=1, biosdev=1, speaker=1, e1000=1, parallel=1, serial=1
cpu: model=core2_penryn_t9600, count=1, ips=50000000, reset_on_triple_fault=1, ignore_bad_msrs=1, msrs="msrs.def"
cpu: cpuid_limit_winnt=0
#=======================================================================
# CPUID:
#
# This defines features and functionality supported by Bochs emulated CPU.
# The option has no effect if a CPU model was selected in the CPU option.
#
# MMX:
# Select MMX instruction set support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 5.
#
# APIC:
# Select APIC configuration (LEGACY/XAPIC/XAPIC_EXT/X2APIC).
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 5.
#
# SEP:
# Select SYSENTER/SYSEXIT instruction set support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# SIMD:
# Select SIMD instructions support.
# Any of NONE/SSE/SSE2/SSE3/SSSE3/SSE4_1/SSE4_2/AVX/AVX2/AVX512
# could be selected.
#
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
# The AVX choices exist only if Bochs is compiled with the --enable-avx option.
#
# SSE4A:
# Select AMD SSE4A instructions support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# MISALIGNED_SSE:
# Select AMD Misaligned SSE mode support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# AES:
# Select AES instruction set support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# SHA:
# Select SHA instruction set support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# MOVBE:
# Select MOVBE Intel(R) Atom instruction support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# ADX:
# Select ADCX/ADOX instructions support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# XSAVE:
# Select XSAVE extensions support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# XSAVEOPT:
# Select XSAVEOPT instruction support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# AVX_F16C:
# Select AVX float16 convert instructions support.
# This option exists only if Bochs compiled with --enable-avx option.
#
# AVX_FMA:
# Select AVX fused multiply add (FMA) instructions support.
# This option exists only if Bochs compiled with --enable-avx option.
#
# BMI:
# Select BMI1/BMI2 instructions support.
# This option exists only if Bochs compiled with --enable-avx option.
#
# XOP:
# Select AMD XOP instructions support.
# This option exists only if Bochs compiled with --enable-avx option.
#
# FMA4:
# Select AMD four operand FMA instructions support.
# This option exists only if Bochs compiled with --enable-avx option.
#
# TBM:
# Select AMD Trailing Bit Manipulation (TBM) instructions support.
# This option exists only if Bochs compiled with --enable-avx option.
#
# X86-64:
# Enable x86-64 and long mode support.
# This option exists only if Bochs compiled with x86-64 support.
#
# 1G_PAGES:
# Enable 1G page size support in long mode.
# This option exists only if Bochs compiled with x86-64 support.
#
# PCID:
# Enable Process-Context Identifiers (PCID) support in long mode.
# This option exists only if Bochs compiled with x86-64 support.
#
# FSGSBASE:
# Enable GS/GS BASE access instructions support in long mode.
# This option exists only if Bochs compiled with x86-64 support.
#
# SMEP:
# Enable Supervisor Mode Execution Protection (SMEP) support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# SMAP:
# Enable Supervisor Mode Access Prevention (SMAP) support.
# This option exists only if Bochs compiled with BX_CPU_LEVEL >= 6.
#
# MWAIT:
# Select MONITOR/MWAIT instructions support.
# This option exists only if Bochs compiled with --enable-monitor-mwait.
#
# VMX:
# Select VMX extensions emulation support.
# This option exists only if Bochs compiled with --enable-vmx option.
#
# SVM:
# Select AMD SVM (Secure Virtual Machine) extensions emulation support.
# This option exists only if Bochs compiled with --enable-svm option.
#
# VENDOR_STRING:
# Set the CPUID vendor string returned by CPUID(0x0). This should be a
# twelve-character ASCII string.
#
# BRAND_STRING:
# Set the CPUID vendor string returned by CPUID(0x80000002 .. 0x80000004).
# This should be at most a forty-eight-character ASCII string.
#
# LEVEL:
# Set emulated CPU level information returned by CPUID. Default value is
# determined by configure option --enable-cpu-level. Currently supported
# values are 5 (for Pentium and similar processors) and 6 (for P6 and
# later processors).
#
# FAMILY:
# Set model information returned by CPUID. Default family value determined
# by configure option --enable-cpu-level.
#
# MODEL:
# Set model information returned by CPUID. Default model value is 3.
#
# STEPPING:
# Set stepping information returned by CPUID. Default stepping value is 3.
#=======================================================================
#cpuid: x86_64=1, mmx=1, sep=1, simd=sse4_2, apic=xapic, aes=1, movbe=1, xsave=1
#cpuid: family=6, model=0x1a, stepping=5
#=======================================================================
# MEMORY
# Set the amount of physical memory you want to emulate.
@ -357,8 +237,14 @@ cpu: cpuid_limit_winnt=0
# memory pool. You will be warned (by FATAL PANIC) in case guest already
# used all allocated host memory and wants more.
#
# BLOCK_SIZE:
# Memory block size selects the granularity of host memory allocation. Very
# large memory configurations might require larger memory blocks, whereas
# configurations with small memory might want a smaller memory block.
# Default memory block size is 128K.
#
#=======================================================================
memory: guest=512, host=256
memory: guest=512, host=256, block_size=512
#=======================================================================
# ROMIMAGE:
@ -368,28 +254,50 @@ memory: guest=512, host=256
# starting at address 0xfffe0000, and it is exactly 128k long. The legacy
# version of the Bochs BIOS is usually loaded starting at address 0xffff0000,
# and it is exactly 64k long.
# You can use the environment variable $BXSHARE to specify the location
# of the BIOS.
# The usage of external large BIOS images (up to 512k) at memory top is
# now supported, but we still recommend to use the BIOS distributed with Bochs.
# The start address is optional, since it can be calculated from image size.
# The Bochs BIOS currently supports only the option "fastboot" to skip the
# boot menu delay.
#
# FILE
# Name of the BIOS image file. You can use the environment variable $BXSHARE
# to specify the location of the BIOS.
#
# ADDRESS
# The start address is optional, since it can be calculated from image size.
#
# OPTIONS
# The Bochs BIOS currently only supports the option "fastboot" to skip the
# boot menu delay.
#
# FLASH_DATA
# This parameter defines the file name for the flash BIOS config space loaded
# at startup if existing and saved on exit if modified. The Bochs BIOS doesn't
# use this feature yet.
#
# Please note that if you use the BIOS-bochs-legacy romimage BIOS option,
# you cannot use a PCI enabled VGA ROM BIOS.
# Please note that if you use a SeaBIOS binary in romimage BIOS option,
# you must use a PCI enabled VGA ROM BIOS.
#=======================================================================
#romimage: file=$BXSHARE/BIOS-bochs-latest, options=fastboot
#romimage: file=$BXSHARE/BIOS-bochs-legacy
#romimage: file=$BXSHARE/bios.bin-1.13.0 # http://www.seabios.org/SeaBIOS
#romimage: file=$BXSHARE/i440fx.bin, flash_data=escd.bin
#romimage: file=asus_p6np5.bin, flash_data=escd.bin
#romimage: file=mybios.bin, address=0xfff80000 # 512k at memory top
romimage: file=bochs/bios/BIOS-bochs-latest
#=======================================================================
# VGAROMIMAGE
# You now need to load a VGA ROM BIOS into C0000.
# Please note that if you use the BIOS-bochs-legacy romimage BIOS option,
# you cannot use a PCI enabled VGA ROM BIOS option such as the cirrus
# option shown below.
#=======================================================================
#vgaromimage: file=$BXSHARE/VGABIOS-lgpl-latest
#vgaromimage: file=bios/VGABIOS-lgpl-latest-cirrus
#vgaromimage: file=$BXSHARE/VGABIOS-lgpl-latest.bin
#vgaromimage: file=bios/VGABIOS-lgpl-latest-cirrus.bin
#vgaromimage: file=$BXSHARE/vgabios-cirrus.bin-1.13.0 # http://www.seabios.org/SeaVGABIOS
#vgaromimage: file=bios/VGABIOS-elpin-2.40
vgaromimage: file=bochs/bios/VGABIOS-lgpl-latest
vgaromimage: file=bochs/bios/VGABIOS-lgpl-latest.bin
#=======================================================================
# OPTROMIMAGE[1-4]:
@ -406,7 +314,7 @@ vgaromimage: file=bochs/bios/VGABIOS-lgpl-latest
#optromimage2: file=optionalrom.bin, address=0xd1000
#optromimage3: file=optionalrom.bin, address=0xd2000
#optromimage4: file=optionalrom.bin, address=0xd3000
optromimage1: file=../../src/bin/intel.rom, address=0xcb000
optromimage1: file=../../src/bin/intel.rom, address=0xc8000
#optramimage1: file=/path/file1.img, address=0x0010000
#optramimage2: file=/path/file2.img, address=0x0020000
@ -426,7 +334,9 @@ optromimage1: file=../../src/bin/intel.rom, address=0xcb000
# UPDATE_FREQ
# This parameter specifies the number of display updates per second.
# The VGA update timer by default uses the realtime engine with a value
# of 5. This parameter can be changed at runtime.
# of 10 (valid: 1 to 75). This parameter can be changed at runtime.
# The special value 0 enables support for using the frame rate of the
# emulated graphics device.
#
# REALTIME
# If set to 1 (default), the VGA timer is based on realtime, otherwise it
@ -440,6 +350,10 @@ optromimage1: file=../../src/bin/intel.rom, address=0xcb000
# the monitor EDID data. By default the 'builtin' values for 'Bochs Screen'
# are used. Other choices are 'disabled' (no DDC emulation) and 'file'
# (read monitor EDID from file / path name separated with a colon).
#
# VBE_MEMSIZE
# With this parameter the size of the memory for the Bochs VBE extension
# can be defined. Valid values are 4, 8, 16 and 32 MB (default is 16 MB).
# Examples:
# vga: extension=cirrus, update_freq=10, ddc=builtin
#=======================================================================
@ -488,6 +402,8 @@ optromimage1: file=../../src/bin/intel.rom, address=0xcb000
# KEYMAP:
# This enables a remap of a physical localized keyboard to a
# virtualized us keyboard, as the PC architecture expects.
# Using a language specifier instead of a file name is also supported.
# A keymap is also required by the paste feature.
#
# USER_SHORTCUT:
# This defines the keyboard shortcut to be sent when you press the "user"
@ -504,7 +420,7 @@ optromimage1: file=../../src/bin/intel.rom, address=0xcb000
# keyboard: keymap=gui/keymaps/x11-pc-de.map
# keyboard: user_shortcut=ctrl-alt-del
#=======================================================================
#keyboard: type=mf, serial_delay=250
#keyboard: type=mf, serial_delay=150
#=======================================================================
# MOUSE:
@ -529,8 +445,8 @@ optromimage1: file=../../src/bin/intel.rom, address=0xcb000
# TOGGLE:
# The default method to toggle the mouse capture at runtime is to press the
# CTRL key and the middle mouse button ('ctrl+mbutton'). This option allows
# to change the method to 'ctrl+f10' (like DOSBox), 'ctrl+alt' (like QEMU)
# or 'f12'.
# to change the method to 'ctrl+f10' (like DOSBox), 'ctrl+alt' (legacy QEMU),
# 'ctrl+alt+g' (QEMU current) or 'f12'.
#
# Examples:
# mouse: enabled=1
@ -567,7 +483,8 @@ mouse: enabled=0
# PCI chipset. These options can be specified as comma-separated values.
# By default the "Bochs i440FX" chipset enables the ACPI and HPET devices, but
# original i440FX doesn't support them. The options 'noacpi' and 'nohpet' make
# it possible to disable them.
# it possible to disable them. The option 'noagp' disables the incomplete AGP
# subsystem of the i440BX chipset.
#
# Example:
# pci: enabled=1, chipset=i440fx, slot1=pcivga, slot2=ne2k, advopts=noacpi
@ -772,10 +689,11 @@ ata3: enabled=0, ioaddr1=0x168, ioaddr2=0x360, irq=9
# This defines the boot sequence. Now you can specify up to 3 boot drives,
# which can be 'floppy', 'disk', 'cdrom' or 'network' (boot ROM).
# Legacy 'a' and 'c' are also supported.
# The new boot choice 'usb' is only supported by the i440fx.bin BIOS.
# Examples:
# boot: floppy
# boot: cdrom, disk
# boot: network, disk
# boot: network, usb, disk
# boot: cdrom, floppy, disk
#=======================================================================
#boot: floppy
@ -929,15 +847,15 @@ parport1: enabled=1, file="parport.out"
# waveoutdrv:
# This defines the driver to be used for the waveout feature.
# Possible values are 'file' (all wave data sent to file), 'dummy' (no
# output) and the platform-dependent drivers 'alsa', 'oss', 'osx', 'sdl'
# and 'win'.
# output), 'pulse', 'sdl' (both cross-platform) and the platform-dependent
# drivers 'alsa', 'oss', 'osx' and 'win'.
# waveout:
# This defines the device to be used for wave output (if necessary) or
# the output file for the 'file' driver.
# waveindrv:
# This defines the driver to be used for the wavein feature.
# Possible values are 'dummy' (recording silence) and platform-dependent
# drivers 'alsa', 'oss', 'sdl' and 'win'.
# Possible values are 'dummy' (recording silence), 'pulse', 'sdl' (both
# cross-platform) and platform-dependent drivers 'alsa', 'oss' and 'win'.
# wavein:
# This defines the device to be used for wave input (if necessary).
# midioutdrv:
@ -967,6 +885,7 @@ parport1: enabled=1, file="parport.out"
# the Beep() function. The 'gui' mode forwards the beep to the related
# gui methods (currently only used by the Carbon gui).
#=======================================================================
#speaker: enabled=1, mode=sound, volume=15
speaker: enabled=1, mode=system
#=======================================================================
@ -1000,14 +919,15 @@ speaker: enabled=1, mode=system
# log: The file to write the sb16 emulator messages to.
# dmatimer:
# microseconds per second for a DMA cycle. Make it smaller to fix
# non-continuous sound. 750000 is usually a good value. This needs a
# reasonably correct setting for the IPS parameter of the CPU option.
# non-continuous sound. 1000000 is usually a good value. This needs a
# reasonably correct setting for the IPS parameter of the CPU option
# and also depends on the clock sync setting.
#
# Examples for output modes:
# sb16: midimode=2, midifile="output.mid", wavemode=1 # MIDI to file
# sb16: midimode=1, wavemode=3, wavefile="output.wav" # wave to file and device
#=======================================================================
#sb16: midimode=1, wavemode=1, loglevel=2, log=sb16.log, dmatimer=600000
#sb16: midimode=1, wavemode=1, loglevel=2, log=sb16.log, dmatimer=900000
#=======================================================================
# ES1370:
@ -1069,7 +989,8 @@ speaker: enabled=1, mode=system
#
# BOOTROM: The bootrom value is optional, and is the name of the ROM image
# to load. Note that this feature is only implemented for the PCI version of
# the NE2000.
# the NE2000. For the ISA version, one of the 'optromimage[1-4]' options
# must be used instead of this one.
#
# If you don't want to make connections to any physical networks,
# you can use the following 'ethmod's to simulate a virtual network.
@ -1137,34 +1058,40 @@ e1000: enabled=1, mac=52:54:00:12:34:56, ethmod=tuntap, ethdev=/dev/net/tun:tap0
# the numeric keypad to the USB device instead of the PS/2 keyboard. If the
# keyboard is selected, all key events are sent to the USB device.
#
# To connect a 'flat' mode image as a USB hard disk you can use the 'disk' device
# with the path to the image separated with a colon. To use other disk image modes
# similar to ATA disks the syntax 'disk:mode:filename' must be used (see below).
# To connect a disk image as a USB hard disk you can use the 'disk' device. Use
# the 'path' option in the optionsX parameter to specify the path to the image
# separated with a colon. To use other disk image modes similar to ATA disks
# the syntax 'path:mode:filename' must be used (see below).
#
# To emulate a USB cdrom you can use the 'cdrom' device name and the path to
# an ISO image or raw device name also separated with a colon. An option to
# insert/eject media is available in the runtime configuration.
# To emulate a USB cdrom you can use the 'cdrom' device and the path to an
# ISO image or raw device name can be set with the 'path' option in the
# optionsX parameter also separated with a colon. An option to insert/eject
# media is available in the runtime configuration.
#
# To emulate a USB floppy you can use the 'floppy' device with the path to the
# image separated with a colon. To use the VVFAT image mode similar to the
# legacy floppy the syntax 'floppy:vvfat:directory' must be used (see below).
# To emulate a USB floppy you can use the 'floppy' device and the path to a
# floppy image can be set with the 'path' option in the optionsX parameter
# separated with a colon. To use the VVFAT image mode similar to the legacy
# floppy the syntax 'path:vvfat:directory' must be used (see below).
# An option to insert/eject media is available in the runtime configuration.
#
# The device name 'hub' connects an external hub with max. 8 ports (default: 4)
# to the root hub. To specify the number of ports you have to add the value
# separated with a colon. Connecting devices to the external hub ports is only
# available in the runtime configuration.
# to the root hub. To specify the number of ports you have to use the 'ports'
# option in the optionsX parameter with the value separated with a colon.
# Connecting devices to the external hub ports is only available in the runtime
# configuration.
#
# The device 'printer' emulates the HP Deskjet 920C printer. The PCL data is
# sent to a file specified in bochsrc.txt. The current code appends the PCL
# code to the file if the file already existed. The output file can be
# changed at runtime.
# sent to a file specified in the 'file' option with the optionsX parameter.
# The current code appends the PCL code to the file if the file already existed.
# The output file can be changed at runtime.
#
# The optionsX parameter can be used to assign specific options to the device
# connected to the corresponding USB port. Currently this feature is used to
# set the speed reported by device ('low', 'full', 'high' or 'super'). The
# available speed choices depend on both HC and device. The option 'debug' turns
# on debug output for the device at connection time.
# connected to the corresponding USB port. The option 'speed' can be used to set
# the speed reported by device ('low', 'full', 'high' or 'super'). The available
# speed choices depend on both HC and device. The option 'debug' turns on debug
# output for the device at connection time. The option 'pcap' turns on packet
# logging in PCAP format.
#
# For the USB 'disk' device the optionsX parameter can be used to specify an
# alternative redolog file (journal) of some image modes. For 'vvfat' mode USB
# disks the optionsX parameter can be used to specify the disk size (range
@ -1174,15 +1101,23 @@ e1000: enabled=1, mac=52:54:00:12:34:56, ethmod=tuntap, ethdev=/dev/net/tun:tap0
# supported (can fix hw detection in some guest OS). The USB floppy also
# accepts the parameter "write_protected" with valid values 0 and 1 to select
# the access mode (default is 0).
#
# For a high- or super-speed USB 'disk' device the optionsX parameter can include
# the 'proto:bbb' or 'proto:uasp' parameter specifying to use either the bulk-only
# Protocol (default) or the USB Attached SCSI Protocol. If no such parameter
# is given, the 'bbb' protocol is used. A Guest that doesn't support UASP
# should revert to bbb even if the 'uasp' attribute is given. See the usb_ehci:
# or usb_xhci: section below for an example. (Only 1 LUN is available at this time)
#=======================================================================
#usb_uhci: enabled=1
#usb_uhci: enabled=1, port1=mouse, port2=disk:usbstick.img
#usb_uhci: enabled=1, port1=hub:7, port2=disk:growing:usbdisk.img
#usb_uhci: enabled=1, port2=disk:undoable:usbdisk.img, options2=journal:redo.log
#usb_uhci: enabled=1, port2=disk:usbdisk2.img, options2=sect_size:1024
#usb_uhci: enabled=1, port2=disk:vvfat:vvfat, options2="debug,speed:full"
#usb_uhci: enabled=1, port1=printer:printdata.bin, port2=cdrom:image.iso
#usb_uhci: enabled=1, port2=floppy:vvfat:diskette, options2="model:teac"
#usb_uhci: port1=mouse, port2=disk, options2="path:usbstick.img"
#usb_uhci: port1=hub, options1="ports:6, pcap:outfile.pcap"
#usb_uhci: port2=disk, options2="path:undoable:usbdisk.img, journal:u.redolog"
#usb_uhci: port2=disk, options2=""path:usbdisk2.img, sect_size:1024"
#usb_uhci: port2=disk, options2="path:vvfat:vvfat, debug, speed:full"
#usb_uhci: port2=cdrom, options2="path:image.iso"
#usb_uhci: port1=printer, options1="file:printdata.bin"
#usb_uhci: port2=floppy, options2="path:vvfat:diskette, model:teac"
#=======================================================================
# USB_OHCI:
@ -1200,19 +1135,61 @@ e1000: enabled=1, mac=52:54:00:12:34:56, ethmod=tuntap, ethdev=/dev/net/tun:tap0
# 6-port hub. The portX parameter accepts the same device types with the
# same syntax as the UHCI controller (see above). The optionsX parameter is
# also available on EHCI.
# The HC will default to three UHCI companion controllers, but you can specify
# either UHCI or OHCI. The ports will be divided evenly across the companion
# controllers, 2 ports each: the first 2 on the first companion, and so on.
#=======================================================================
#usb_ehci: enabled=1
#usb_ehci: enabled=1, companion=uhci
#usb_ehci: enabled=1, companion=ohci
#usb_ehci: port1=disk, options1="speed:high, path:hdd.img, proto:bbb"
#usb_ehci: port1=disk, options1="speed:high, path:hdd.img, proto:uasp"
#=======================================================================
# USB_XHCI:
# This option controls the presence of the USB xHCI host controller with a
# 4-port hub. The portX parameter accepts the same device types with the
# same syntax as the UHCI controller (see above). The optionsX parameter is
# also available on xHCI. NOTE: port 1 and 2 are USB3 and only support
# super-speed devices, but port 3 and 4 are USB2 and support speed settings
# low, full and high.
# default 4-port hub. The portX parameter accepts the same device types
# with the same syntax as the UHCI controller (see above). The optionsX
# parameter is also available on xHCI.
#
# The xHCI emulation allows you to set the number of ports used with a range
# of 2 to 10, requiring an even numbered count.
#
# NOTE: The first half of the ports (ports 1 and 2 on a 4-port hub) are
# USB3 only and support super-speed devices. The second half ports (ports
# 3 and 4) are USB2 and support speed settings of low, full, or high.
# The xHCI also allows for different host controllers using the model=
# parameter. Currently, the two allowed options are "uPD720202" and
# "uPD720201". The first defaults to 2 sockets (4 ports) and the later
# defaults to 4 sockets (8 ports).
#=======================================================================
#usb_xhci: enabled=1
#usb_xhci: enabled=1 # defaults to the uPD720202 w/4 ports
#usb_xhci: enabled=1, n_ports=6 # defaults to the uPD720202 w/6 ports
#usb_xhci: enabled=1, model=uPD720202 # defaults to 4 ports
#usb_xhci: enabled=1, model=uPD720202, n_ports=6 # change to 6 ports
#usb_xhci: enabled=1, model=uPD720201 # defaults to 8 ports
#usb_xhci: enabled=1, model=uPD720201, n_ports=10 # change to 10 ports
#usb_xhci: port1=disk, options1="speed:super, path:hdd.img, proto:bbb"
#usb_xhci: port1=disk, options1="speed:super, path:hdd.img, proto:uasp"
#usb_xhci: port3=disk, options3="speed:high, path:hdd.img, proto:uasp"
#=======================================================================
# USB Debugger:
# This is the experimental USB Debugger for the Windows Platform.
# Specify a type (none, uhci, ohci, ehci, xhci) and one or more triggers.
# (Currently, only xhci is supported, with some uhci in the process)
# Triggers:
# reset: will break and load the debugger on a port reset
# enable: will break and load the debugger on a port enable
# doorbell: will break and load the debugger on an xHCI Command Ring addition
# event: will break and load the debugger on an xHCI Event Ring addition
# data: will break and load the debugger on an xHCI Data Ring addition
# start_frame: will break and load the debugger on start of frame
# (this is different for each controller type)
# non_exist: will break and load the debugger on a non-existent port access
# (experimental and is under development)
#=======================================================================
#usb_debug: type=xhci, reset=1, enable=1, start_frame=1, doorbell=1, event=1, data=1, non_exist=1
#=======================================================================
# PCIDEV:
@ -1232,11 +1209,20 @@ e1000: enabled=1, mac=52:54:00:12:34:56, ethmod=tuntap, ethdev=/dev/net/tun:tap0
#=======================================================================
# MAGIC_BREAK:
# This enables the "magic breakpoint" feature when using the debugger.
# The useless cpu instruction XCHG BX, BX causes Bochs to enter the
# The useless cpu instructions XCHG %REGW, %REGW cause Bochs to enter the
# debugger mode. This might be useful for software development.
#
# Example:
# You can specify multiple at once:
#
# cx dx bx sp bp si di
#
# Example for breaking on "XCHGW %DI, %DI" or "XCHG %SP, %SP" execution
# magic_break: enabled=1 di sp
#
# If nothing is specified, the default will be used: XCHGW %BX, %BX
# magic_break: enabled=1
#
# Note: Windows XP ntldr can cause problems with XCHGW %BX, %BX
#=======================================================================
magic_break: enabled=1
@ -1262,12 +1248,27 @@ magic_break: enabled=1
# very early when writing BIOS or OS code for example, without having to
# bother with setting up a serial port, etc. Reading from port 0xE9
# will return 0xe9 to let you know if the feature is available.
# Leave this 0 unless you have a reason to use it.
# Leave this 0 unless you have a reason to use it. By enabling the
# 'all_rings' option, you can utilize the port e9 hack from ring3.
#
# Example:
# port_e9_hack: enabled=1
# port_e9_hack: enabled=1, all_rings=1
#=======================================================================
port_e9_hack: enabled=1
port_e9_hack: enabled=1, all_rings=1
#=======================================================================
# IODEBUG:
# I/O Interface to Bochs Debugger plugin allows the code running inside
# Bochs to monitor memory ranges, trace individual instructions, and
# observe register values during execution. By enabling the 'all_rings'
# option, you can utilize the iodebug ports from ring3. For more
# information, refer to "Advanced debugger usage" documentation.
#
# Example:
# iodebug: all_rings=1
#=======================================================================
#iodebug: all_rings=1
#=======================================================================
# fullscreen: ONLY IMPLEMENTED ON AMIGA

View File

@ -26,6 +26,7 @@ PRINTF := printf
PERL := perl
PYTHON := python
TRUE := true
TRUNCATE := truncate
CC = $(CROSS_COMPILE)gcc
CPP = $(CC) -E
AS = $(CROSS_COMPILE)as
@ -45,7 +46,8 @@ SORTOBJDUMP := ./util/sortobjdump.pl
PADIMG := ./util/padimg.pl
LICENCE := ./util/licence.pl
NRV2B := ./util/nrv2b
ZBIN := ./util/zbin
ZBIN32 := ./util/zbin32
ZBIN64 := ./util/zbin64
ELF2EFI32 := ./util/elf2efi32
ELF2EFI64 := ./util/elf2efi64
EFIROM := ./util/efirom
@ -81,6 +83,7 @@ SRCDIRS += drivers/net/marvell
SRCDIRS += drivers/block
SRCDIRS += drivers/nvs
SRCDIRS += drivers/bitbash
SRCDIRS += drivers/gpio
SRCDIRS += drivers/infiniband
SRCDIRS += drivers/infiniband/mlx_utils_flexboot/src
SRCDIRS += drivers/infiniband/mlx_utils/src/public
@ -92,6 +95,7 @@ SRCDIRS += drivers/infiniband/mlx_utils/mlx_lib/mlx_link_speed
SRCDIRS += drivers/infiniband/mlx_utils/mlx_lib/mlx_mtu
SRCDIRS += drivers/infiniband/mlx_nodnic/src
SRCDIRS += drivers/usb
SRCDIRS += drivers/uart
SRCDIRS += interface/pxe interface/efi interface/smbios
SRCDIRS += interface/bofm
SRCDIRS += interface/xen

View File

@ -368,6 +368,15 @@ WNAPM_FLAGS := $(shell $(WNAPM_TEST) && \
WORKAROUND_CFLAGS += $(WNAPM_FLAGS)
endif
# gcc 15 generates warnings for fixed-length character array
# initializers that lack a terminating NUL. Inhibit the warnings.
#
WNUSI_TEST = $(CC) -Wunterminated-string-initialization -x c -c /dev/null \
-o /dev/null >/dev/null 2>&1
WNUSI_FLAGS := $(shell $(WNUSI_TEST) && \
$(ECHO) '-Wno-unterminated-string-initialization')
WORKAROUND_CFLAGS += $(WNUSI_FLAGS)
# Some versions of gas choke on division operators, treating them as
# comment markers. Specifying --divide will work around this problem,
# but isn't available on older gas versions.
@ -474,7 +483,7 @@ CFLAGS += -Os
CFLAGS += -g
ifeq ($(CCTYPE),gcc)
CFLAGS += -ffreestanding
CFLAGS += -fcommon
CFLAGS += -fno-common
CFLAGS += -Wall -W -Wformat-nonliteral
CFLAGS += -Wno-array-bounds -Wno-dangling-pointer
HOST_CFLAGS += -Wall -W -Wformat-nonliteral
@ -484,6 +493,7 @@ CFLAGS += $(WORKAROUND_CFLAGS) $(EXTRA_CFLAGS)
ASFLAGS += $(WORKAROUND_ASFLAGS) $(EXTRA_ASFLAGS)
LDFLAGS += $(WORKAROUND_LDFLAGS) $(EXTRA_LDFLAGS)
HOST_CFLAGS += -O2 -g
HOST_EFI_CFLAGS += -fshort-wchar
# Inhibit -Werror if NO_WERROR is specified on make command line
#
@ -493,17 +503,14 @@ ASFLAGS += --fatal-warnings
HOST_CFLAGS += -Werror
endif
# Enable per-item sections and section garbage collection. Note that
# some older versions of gcc support -fdata-sections but treat it as
# implying -fno-common, which would break our build. Some other older
# versions issue a spurious and uninhibitable warning if
# -ffunction-sections is used with -g, which would also break our
# build since we use -Werror.
# Enable per-item sections and section garbage collection. Some older
# versions of gcc issue a spurious and uninhibitable warning if
# -ffunction-sections is used with -g, which would break our build
# since we use -Werror.
#
ifeq ($(CCTYPE),gcc)
DS_TEST = $(ECHO) 'char x;' | \
$(CC) -fdata-sections -S -x c - -o - 2>/dev/null | \
grep -E '\.comm' > /dev/null
DS_TEST = $(CC) -fdata-sections -c -x c /dev/null \
-o /dev/null 2>/dev/null
DS_FLAGS := $(shell $(DS_TEST) && $(ECHO) '-fdata-sections')
FS_TEST = $(CC) -ffunction-sections -g -c -x c /dev/null \
-o /dev/null 2>/dev/null
@ -605,7 +612,7 @@ embedded_DEPS += $(EMBEDDED_FILES) $(EMBEDDED_LIST)
CFLAGS_embedded = -DEMBED_ALL="$(EMBED_ALL)"
# List of trusted root certificates
# List of trusted root certificate configuration
#
TRUSTED_LIST := $(BIN)/.trusted.list
ifeq ($(wildcard $(TRUSTED_LIST)),)
@ -613,8 +620,9 @@ TRUST_OLD := <invalid>
else
TRUST_OLD := $(shell cat $(TRUSTED_LIST))
endif
ifneq ($(TRUST_OLD),$(TRUST))
$(shell $(ECHO) "$(TRUST)" > $(TRUSTED_LIST))
TRUST_CFG := $(TRUST) $(TRUST_EXT)
ifneq ($(TRUST_OLD),$(TRUST_CFG))
$(shell $(ECHO) "$(TRUST_CFG)" > $(TRUSTED_LIST))
endif
$(TRUSTED_LIST) : $(MAKEDEPS)
@ -631,7 +639,8 @@ TRUSTED_FPS := $(foreach CERT,$(TRUSTED_CERTS),\
rootcert_DEPS += $(TRUSTED_FILES) $(TRUSTED_LIST)
CFLAGS_rootcert = $(if $(TRUSTED_FPS),-DTRUSTED="$(TRUSTED_FPS)")
CFLAGS_rootcert += $(if $(TRUST_EXT),-DALLOW_TRUST_OVERRIDE=$(TRUST_EXT))
CFLAGS_rootcert += $(if $(TRUSTED_FPS),-DTRUSTED="$(TRUSTED_FPS)")
# List of embedded certificates
#
@ -1437,10 +1446,15 @@ endif # defined(BIN)
ZBIN_LDFLAGS := -llzma
$(ZBIN) : util/zbin.c $(MAKEDEPS)
$(ZBIN32) : util/zbin.c $(MAKEDEPS)
$(QM)$(ECHO) " [HOSTCC] $@"
$(Q)$(HOST_CC) $(HOST_CFLAGS) $< $(ZBIN_LDFLAGS) -o $@
CLEANUP += $(ZBIN)
$(Q)$(HOST_CC) $(HOST_CFLAGS) $< $(ZBIN_LDFLAGS) -DELF32 -o $@
CLEANUP += $(ZBIN32)
$(ZBIN64) : util/zbin.c $(MAKEDEPS)
$(QM)$(ECHO) " [HOSTCC] $@"
$(Q)$(HOST_CC) $(HOST_CFLAGS) $< $(ZBIN_LDFLAGS) -DELF64 -o $@
CLEANUP += $(ZBIN64)
###############################################################################
#
@ -1449,22 +1463,22 @@ CLEANUP += $(ZBIN)
$(ELF2EFI32) : util/elf2efi.c $(MAKEDEPS)
$(QM)$(ECHO) " [HOSTCC] $@"
$(Q)$(HOST_CC) $(HOST_CFLAGS) -idirafter include -DEFI_TARGET32 $< -o $@
$(Q)$(HOST_CC) $(HOST_CFLAGS) $(HOST_EFI_CFLAGS) -idirafter include -DEFI_TARGET32 $< -o $@
CLEANUP += $(ELF2EFI32)
$(ELF2EFI64) : util/elf2efi.c $(MAKEDEPS)
$(QM)$(ECHO) " [HOSTCC] $@"
$(Q)$(HOST_CC) $(HOST_CFLAGS) -idirafter include -DEFI_TARGET64 $< -o $@
$(Q)$(HOST_CC) $(HOST_CFLAGS) $(HOST_EFI_CFLAGS) -idirafter include -DEFI_TARGET64 $< -o $@
CLEANUP += $(ELF2EFI64)
$(EFIROM) : util/efirom.c util/eficompress.c $(MAKEDEPS)
$(QM)$(ECHO) " [HOSTCC] $@"
$(Q)$(HOST_CC) $(HOST_CFLAGS) -idirafter include -o $@ $<
$(Q)$(HOST_CC) $(HOST_CFLAGS) $(HOST_EFI_CFLAGS) -idirafter include -o $@ $<
CLEANUP += $(EFIROM)
$(EFIFATBIN) : util/efifatbin.c $(MAKEDEPS)
$(QM)$(ECHO) " [HOSTCC] $@"
$(Q)$(HOST_CC) $(HOST_CFLAGS) -idirafter include -o $@ $<
$(Q)$(HOST_CC) $(HOST_CFLAGS) $(HOST_EFI_CFLAGS) -idirafter include -o $@ $<
CLEANUP += $(EFIFATBIN)
###############################################################################

View File

@ -29,9 +29,13 @@ NON_AUTO_MEDIA = linux
# Compiler flags for building host API wrapper
#
LINUX_CFLAGS += -Os -idirafter include -DSYMBOL_PREFIX=$(SYMBOL_PREFIX)
LINUX_CFLAGS += -Wall -W
ifneq ($(SYSROOT),)
LINUX_CFLAGS += --sysroot=$(SYSROOT)
endif
ifneq ($(NO_WERROR),1)
LINUX_CFLAGS += -Werror
endif
# Check for libslirp
#

View File

@ -1,3 +1,7 @@
# Specify compressor
#
ZBIN = $(ZBIN32)
# ARM32-specific directories containing source files
#
SRCDIRS += arch/arm32/core

View File

@ -1,3 +1,7 @@
# Specify compressor
#
ZBIN = $(ZBIN64)
# ARM64-specific directories containing source files
#
SRCDIRS += arch/arm64/core

View File

@ -0,0 +1,30 @@
#ifndef _BITS_LKRN_H
#define _BITS_LKRN_H
/** @file
*
* Linux kernel image invocation
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
/** Header magic value */
#define LKRN_MAGIC_ARCH LKRN_MAGIC_AARCH64
/**
* Jump to kernel entry point
*
* @v entry Kernel entry point
* @v fdt Device tree
*/
static inline __attribute__ (( noreturn )) void
lkrn_jump ( physaddr_t entry, physaddr_t fdt ) {
register unsigned long x0 asm ( "x0" ) = fdt;
__asm__ __volatile__ ( "br %1"
: : "r" ( x0 ), "r" ( entry ) );
__builtin_unreachable();
}
#endif /* _BITS_LKRN_H */

View File

@ -1,3 +1,7 @@
# Specify compressor
#
ZBIN = $(ZBIN32)
# Force i386-only instructions
#
CFLAGS += -march=i386

View File

@ -88,6 +88,8 @@ SECTIONS {
__rodata16 = .;
*(.rodata16)
*(.rodata16.*)
*(.srodata)
*(.srodata.*)
*(.rodata)
*(.rodata.*)
}
@ -95,6 +97,8 @@ SECTIONS {
__data16 = .;
*(.data16)
*(.data16.*)
*(.sdata)
*(.sdata.*)
*(.data)
*(.data.*)
KEEP(*(SORT(.tbl.*))) /* Various tables. See include/tables.h */
@ -107,6 +111,8 @@ SECTIONS {
_bss16 = .;
*(.bss16)
*(.bss16.*)
*(.sbss)
*(.sbss.*)
*(.bss)
*(.bss.*)
*(COMMON)

View File

@ -1,3 +1,7 @@
# Specify compressor
#
ZBIN = $(ZBIN64)
# Assembler section type character
#
ASM_TCHAR := @

View File

@ -5,7 +5,14 @@
# prefix code.
#
CFLAGS += -mcmodel=medany -fpie
LDFLAGS += -pie --no-dynamic-linker
LDFLAGS += -pie --no-dynamic-linker -z combreloc
# Place explicitly zero-initialised variables in the .data section
# rather than in .bss, so that we can rely on their values even during
# parsing of the system memory map prior to relocation (and therefore
# prior to explicit zeroing of the .bss section).
#
CFLAGS += -fno-zero-initialized-in-bss
# Linker script
#
@ -14,3 +21,11 @@ LDSCRIPT = arch/riscv/scripts/sbi.lds
# Media types
#
MEDIA += sbi
MEDIA += lkrn
# Padded flash device images (e.g. for QEMU's -pflash option)
#
NON_AUTO_MEDIA += pf32
%.pf32 : %.sbi $(MAKEDEPS)
$(Q)$(CP) $< $@
$(Q)$(TRUNCATE) -s 32M $@

View File

@ -55,7 +55,7 @@ static int hart_node ( unsigned int *offset ) {
snprintf ( path, sizeof ( path ), "/cpus/cpu@%lx", boot_hart );
/* Find node */
if ( ( rc = fdt_path ( path, offset ) ) != 0 ) {
if ( ( rc = fdt_path ( &sysfdt, path, offset ) ) != 0 ) {
DBGC ( colour, "HART could not find %s: %s\n",
path, strerror ( rc ) );
return rc;
@ -81,7 +81,7 @@ int hart_supported ( const char *ext ) {
return rc;
/* Get ISA description */
isa = fdt_string ( offset, "riscv,isa" );
isa = fdt_string ( &sysfdt, offset, "riscv,isa" );
if ( ! isa ) {
DBGC ( colour, "HART could not identify ISA\n" );
return -ENOENT;

View File

@ -0,0 +1,190 @@
/*
* Copyright (C) 2025 Michael Brown <mbrown@fensystems.co.uk>.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
* 02110-1301, USA.
*
* You can also choose to distribute this program under the terms of
* the Unmodified Binary Distribution Licence (as given in the file
* COPYING.UBDL), provided that you have satisfied its requirements.
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <assert.h>
#include <ipxe/zicbom.h>
#include <ipxe/iomap.h>
#include <ipxe/dma.h>
/** @file
*
* iPXE DMA API for RISC-V
*
*/
/** Minimum alignment for coherent DMA allocations
*
* We set this sufficiently high to ensure that we do not end up with
* both cached and uncached uses in the same cacheline.
*/
#define RISCV_DMA_ALIGN 256
/**
* Map buffer for DMA
*
* @v dma DMA device
* @v map DMA mapping to fill in
* @v addr Buffer address
* @v len Length of buffer
* @v flags Mapping flags
* @ret rc Return status code
*/
static int riscv_dma_map ( struct dma_device *dma,
struct dma_mapping *map,
void *addr, size_t len, int flags ) {
/* Sanity check: we cannot support bidirectional mappings */
assert ( ! ( ( flags & DMA_TX ) && ( flags & DMA_RX ) ) );
/* Populate mapping */
map->dma = dma;
map->offset = 0;
map->token = NULL;
/* Flush cached data to transmit buffers */
if ( flags & DMA_TX )
cache_clean ( addr, len );
/* Invalidate cached data in receive buffers and record address */
if ( flags & DMA_RX ) {
cache_invalidate ( addr, len );
map->token = addr;
}
/* Increment mapping count (for debugging) */
if ( DBG_LOG )
dma->mapped++;
return 0;
}
/**
* Unmap buffer
*
* @v map DMA mapping
* @v len Used length
*/
static void riscv_dma_unmap ( struct dma_mapping *map, size_t len ) {
struct dma_device *dma = map->dma;
void *addr = map->token;
/* Invalidate cached data in receive buffers */
if ( addr )
cache_invalidate ( addr, len );
/* Clear mapping */
map->dma = NULL;
/* Decrement mapping count (for debugging) */
if ( DBG_LOG )
dma->mapped--;
}
/**
* Allocate and map DMA-coherent buffer
*
* @v dma DMA device
* @v map DMA mapping to fill in
* @v len Length of buffer
* @v align Physical alignment
* @ret addr Buffer address, or NULL on error
*/
static void * riscv_dma_alloc ( struct dma_device *dma,
struct dma_mapping *map,
size_t len, size_t align ) {
physaddr_t phys;
void *addr;
void *caddr;
/* Round up length and alignment */
len = ( ( len + RISCV_DMA_ALIGN - 1 ) & ~( RISCV_DMA_ALIGN - 1 ) );
if ( align < RISCV_DMA_ALIGN )
align = RISCV_DMA_ALIGN;
/* Allocate from heap */
addr = malloc_phys ( len, align );
if ( ! addr )
return NULL;
/* Invalidate any existing cached data */
cache_invalidate ( addr, len );
/* Record mapping */
map->dma = dma;
map->token = addr;
/* Calculate coherently-mapped virtual address */
phys = virt_to_phys ( addr );
assert ( phys == ( ( uint32_t ) phys ) );
caddr = ( ( void * ) ( intptr_t ) ( phys + svpage_dma32() ) );
assert ( phys == virt_to_phys ( caddr ) );
DBGC ( dma, "DMA allocated [%#08lx,%#08lx) via %p\n",
phys, ( phys + len ), caddr );
/* Increment allocation count (for debugging) */
if ( DBG_LOG )
dma->allocated++;
return caddr;
}
/**
* Unmap and free DMA-coherent buffer
*
* @v dma DMA device
* @v map DMA mapping
* @v addr Buffer address
* @v len Length of buffer
*/
static void riscv_dma_free ( struct dma_mapping *map,
void *addr, size_t len ) {
struct dma_device *dma = map->dma;
/* Sanity check */
assert ( virt_to_phys ( addr ) == virt_to_phys ( map->token ) );
/* Round up length to match allocation */
len = ( ( len + RISCV_DMA_ALIGN - 1 ) & ~( RISCV_DMA_ALIGN - 1 ) );
/* Free original allocation */
free_phys ( map->token, len );
/* Clear mapping */
map->dma = NULL;
map->token = NULL;
/* Decrement allocation count (for debugging) */
if ( DBG_LOG )
dma->allocated--;
}
PROVIDE_DMAAPI ( riscv, dma_map, riscv_dma_map );
PROVIDE_DMAAPI ( riscv, dma_unmap, riscv_dma_unmap );
PROVIDE_DMAAPI ( riscv, dma_alloc, riscv_dma_alloc );
PROVIDE_DMAAPI ( riscv, dma_free, riscv_dma_free );
PROVIDE_DMAAPI ( riscv, dma_umalloc, riscv_dma_alloc );
PROVIDE_DMAAPI ( riscv, dma_ufree, riscv_dma_free );
PROVIDE_DMAAPI_INLINE ( riscv, dma_set_mask );
PROVIDE_DMAAPI_INLINE ( riscv, dma );

View File

@ -195,42 +195,13 @@ void riscv_memset ( void *dest, size_t len, int character ) {
}
/**
* Copy (possibly overlapping) memory region forwards
* Copy (possibly overlapping) memory region
*
* @v dest Destination region
* @v src Source region
* @v len Length
*/
void riscv_memmove_forwards ( void *dest, const void *src, size_t len ) {
unsigned long discard_data;
/* Do nothing if length is zero */
if ( ! len )
return;
/* Assume memmove() is not performance-critical, and perform a
* bytewise copy for simplicity.
*/
__asm__ __volatile__ ( "\n1:\n\t"
"lb %2, (%1)\n\t"
"sb %2, (%0)\n\t"
"addi %1, %1, 1\n\t"
"addi %0, %0, 1\n\t"
"bne %0, %3, 1b\n\t"
: "+r" ( dest ), "+r" ( src ),
"=&r" ( discard_data )
: "r" ( dest + len )
: "memory" );
}
/**
* Copy (possibly overlapping) memory region backwards
*
* @v dest Destination region
* @v src Source region
* @v len Length
*/
void riscv_memmove_backwards ( void *dest, const void *src, size_t len ) {
void riscv_memmove ( void *dest, const void *src, size_t len ) {
void *orig_dest = dest;
unsigned long discard_data;
@ -238,8 +209,14 @@ void riscv_memmove_backwards ( void *dest, const void *src, size_t len ) {
if ( ! len )
return;
/* Use memcpy() if copy direction is forwards */
if ( dest <= src ) {
memcpy ( dest, src, len );
return;
}
/* Assume memmove() is not performance-critical, and perform a
* bytewise copy for simplicity.
* bytewise copy backwards for simplicity.
*/
dest += len;
src += len;
@ -254,19 +231,3 @@ void riscv_memmove_backwards ( void *dest, const void *src, size_t len ) {
: "r" ( orig_dest )
: "memory" );
}
/**
* Copy (possibly overlapping) memory region
*
* @v dest Destination region
* @v src Source region
* @v len Length
*/
void riscv_memmove ( void *dest, const void *src, size_t len ) {
if ( dest <= src ) {
riscv_memmove_forwards ( dest, src, len );
} else {
riscv_memmove_backwards ( dest, src, len );
}
}

View File

@ -0,0 +1,138 @@
/*
* Copyright (C) 2025 Michael Brown <mbrown@fensystems.co.uk>.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
* 02110-1301, USA.
*
* You can also choose to distribute this program under the terms of
* the Unmodified Binary Distribution Licence (as given in the file
* COPYING.UBDL), provided that you have satisfied its requirements.
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL )
/** @file
*
* TCP/IP checksum
*
*/
.section ".note.GNU-stack", "", @progbits
.text
/**
* Calculate continued TCP/IP checksum
*
* @v partial Checksum of already-summed data, in network byte order
* @v data Data buffer
* @v len Length of data buffer
* @ret cksum Updated checksum, in network byte order
*
* In practice, this routine will only ever be called with a data
* pointer aligned to a 16-bit boundary. We optimise for this case,
* while ensuring that the code still gives correct output if called
* with a misaligned pointer.
*/
.section ".text.tcpip_continue_chksum", "ax", @progbits
.globl tcpip_continue_chksum
tcpip_continue_chksum:
/* Set up register usage:
*
* a0: checksum low xlen bits
* a1: data pointer
* a2: end of data pointer
* a3: end of data pointer minus a constant offset of interest
* a4: checksum high bits (guaranteed to never carry) / constant 0xffff
* a5: temporary register
*/
not a0, a0
add a2, a2, a1
addi a3, a2, -( __riscv_xlen / 8 )
mv a4, zero
/* Skip aligned checksumming if data is too short */
bgtu a1, a3, post_aligned
/* Checksum 16-bit words until we reach xlen-bit alignment (or
* one byte past xlen-bit alignment).
*/
j 2f
1: lhu a5, (a1)
addi a1, a1, 2
add a4, a4, a5
2: andi a5, a1, ( ( ( __riscv_xlen / 8 ) - 1 ) & ~1 )
bnez a5, 1b
/* Checksum aligned xlen-bit words */
j 2f
1: LOADN a5, (a1)
addi a1, a1, ( __riscv_xlen / 8 )
add a0, a0, a5
sltu a5, a0, a5
add a4, a4, a5
2: bleu a1, a3, 1b
post_aligned:
/* Checksum remaining 16-bit words */
addi a3, a2, -2
j 2f
1: lhu a5, (a1)
addi a1, a1, 2
add a4, a4, a5
2: bleu a1, a3, 1b
/* Checksum final byte if present */
beq a1, a2, 1f
lbu a5, (a1)
add a4, a4, a5
1:
/* Fold down to xlen+1 bits */
add a0, a0, a4
sltu a4, a0, a4
/* Fold down to (xlen/2)+2 bits */
slli a5, a0, ( __riscv_xlen / 2 )
srli a0, a0, ( __riscv_xlen / 2 )
srli a5, a5, ( __riscv_xlen / 2 )
add a0, a0, a4
add a0, a0, a5
/* Load constant 0xffff for use in subsequent folding */
li a4, 0xffff
#if __riscv_xlen >= 64
/* Fold down to (xlen/4)+3 bits (if xlen >= 64) */
and a5, a0, a4
srli a0, a0, ( __riscv_xlen / 4 )
add a0, a0, a5
#endif
/* Fold down to 16+1 bits */
and a5, a0, a4
srli a0, a0, 16
add a0, a0, a5
/* Fold down to 16 bits */
srli a5, a0, 16
add a0, a0, a5
srli a5, a0, 17
add a0, a0, a5
and a0, a0, a4
/* Negate and return */
xor a0, a0, a4
ret
.size tcpip_continue_chksum, . - tcpip_continue_chksum
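For reference, the ones-complement continuation sum implemented above can be sketched in C as follows; the alignment fast path is omitted, and the helper name is illustrative only. As in the assembly, the partial checksum and the result stay in network byte order (the accumulation is endian-agnostic on this little-endian architecture):
#include <stdint.h>
#include <stddef.h>
/* Illustrative sketch of the calculation performed by the assembly above */
static uint16_t tcpip_chksum_sketch ( uint16_t partial, const void *data,
                                      size_t len ) {
        const uint8_t *bytes = data;
        uint64_t sum = ( ( ~partial ) & 0xffff );
        size_t i;
        /* Sum 16-bit little-endian words, plus any trailing odd byte */
        for ( i = 0 ; ( i + 1 ) < len ; i += 2 )
                sum += ( bytes[i] | ( bytes[ i + 1 ] << 8 ) );
        if ( len & 1 )
                sum += bytes[ len - 1 ];
        /* Fold carries back into 16 bits, then negate */
        while ( sum >> 16 )
                sum = ( ( sum & 0xffff ) + ( sum >> 16 ) );
        return ( ~sum & 0xffff );
}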


@ -0,0 +1,287 @@
/*
* Copyright (C) 2025 Michael Brown <mbrown@fensystems.co.uk>.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
* 02110-1301, USA.
*
* You can also choose to distribute this program under the terms of
* the Unmodified Binary Distribution Licence (as given in the file
* COPYING.UBDL), provided that you have satisfied its requirements.
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#include <strings.h>
#include <assert.h>
#include <ipxe/hart.h>
#include <ipxe/iomap.h>
/** @file
*
* Supervisor page table management
*
* With the 64-bit paging schemes (Sv39, Sv48, and Sv57) we choose to
* identity-map as much as possible of the physical address space via
* PTEs 0-255, and place a recursive page table entry in PTE 511 which
* allows PTEs 256-510 to be used to map 1GB "gigapages" within the
* top 256GB of the 64-bit address space. At least one of these PTEs
* will already be in use to map iPXE itself. The remaining PTEs may
* be used to map I/O devices.
*/
/** A page table */
struct page_table {
/** Page table entry */
uint64_t pte[512];
};
/** Page table entry flags */
enum pte_flags {
/** Page table entry is valid */
PTE_V = 0x01,
/** Page is readable */
PTE_R = 0x02,
/** Page is writable */
PTE_W = 0x04,
/** Page has been accessed */
PTE_A = 0x40,
/** Page is dirty */
PTE_D = 0x80,
/** Page is the last page in an allocation
*
* This bit is ignored by the hardware. We use it to track
* the size of allocations made by ioremap().
*/
PTE_LAST = 0x100,
};
/** Page-based memory type (Svpbmt) */
#define PTE_SVPBMT( x ) ( ( ( unsigned long long ) (x) ) << 61 )
/** Page is non-cacheable memory (Svpbmt) */
#define PTE_SVPBMT_NC PTE_SVPBMT ( 1 )
/** Page maps I/O addresses (Svpbmt) */
#define PTE_SVPBMT_IO PTE_SVPBMT ( 2 )
/** Page table entry address */
#define PTE_PPN( addr ) ( (addr) >> 2 )
/** The page table */
extern struct page_table page_table;
/** I/O page size
*
* We choose to use 1GB "gigapages", since these are supported by all
* paging levels.
*/
#define MAP_PAGE_SIZE 0x40000000UL
/** I/O page base address
*
* The recursive page table entry maps the high 512GB of the 64-bit
* address space as 1GB "gigapages".
*/
#define MAP_BASE ( ( void * ) ( intptr_t ) ( -1ULL << 39 ) )
/** Coherent DMA mapping of the 32-bit address space */
static void *svpage_dma32_base;
/** Size of the coherent DMA mapping */
#define DMA32_LEN ( ( size_t ) 0x100000000ULL )
/**
* Map pages
*
* @v phys Physical address
* @v len Length
* @v attrs Page attributes
* @ret virt Mapped virtual address, or NULL on error
*/
static void * svpage_map ( physaddr_t phys, size_t len, unsigned long attrs ) {
unsigned long satp;
unsigned long start;
unsigned int count;
unsigned int stride;
unsigned int first;
unsigned int i;
size_t offset;
void *virt;
DBGC ( &page_table, "SVPAGE mapping %#08lx+%#zx attrs %#016lx\n",
phys, len, attrs );
/* Sanity checks */
if ( ! len )
return NULL;
assert ( attrs & PTE_V );
/* Use physical address directly if paging is disabled */
__asm__ ( "csrr %0, satp" : "=r" ( satp ) );
if ( ! satp ) {
virt = phys_to_virt ( phys );
DBGC ( &page_table, "SVPAGE mapped %#08lx+%#zx to %p (no "
"paging)\n", phys, len, virt );
return virt;
}
/* Round down start address to a page boundary */
start = ( phys & ~( MAP_PAGE_SIZE - 1 ) );
offset = ( phys - start );
assert ( offset < MAP_PAGE_SIZE );
/* Calculate number of pages required */
count = ( ( offset + len + MAP_PAGE_SIZE - 1 ) / MAP_PAGE_SIZE );
assert ( count != 0 );
assert ( count < ( sizeof ( page_table.pte ) /
sizeof ( page_table.pte[0] ) ) );
/* Round up number of pages to a power of two */
stride = ( 1 << fls ( count - 1 ) );
assert ( count <= stride );
/* Allocate pages */
for ( first = 0 ; first < ( sizeof ( page_table.pte ) /
sizeof ( page_table.pte[0] ) ) ;
first += stride ) {
/* Calculate virtual address */
virt = ( MAP_BASE + ( first * MAP_PAGE_SIZE ) + offset );
/* Check that page table entries are available */
for ( i = first ; i < ( first + count ) ; i++ ) {
if ( page_table.pte[i] & PTE_V ) {
virt = NULL;
break;
}
}
if ( ! virt )
continue;
/* Create page table entries */
for ( i = first ; i < ( first + count ) ; i++ ) {
page_table.pte[i] = ( PTE_PPN ( start ) | attrs );
start += MAP_PAGE_SIZE;
}
/* Mark last page as being the last in this allocation */
page_table.pte[ i - 1 ] |= PTE_LAST;
/* Synchronise page table updates */
__asm__ __volatile__ ( "sfence.vma" );
/* Return virtual address */
DBGC ( &page_table, "SVPAGE mapped %#08lx+%#zx to %p using "
"PTEs [%d-%d]\n", phys, len, virt, first,
( first + count - 1 ) );
return virt;
}
DBGC ( &page_table, "SVPAGE could not map %#08lx+%#zx\n",
phys, len );
return NULL;
}
/**
* Unmap pages
*
* @v virt Virtual address
*/
static void svpage_unmap ( const volatile void *virt ) {
unsigned long satp;
unsigned int first;
unsigned int i;
int is_last;
DBGC ( &page_table, "SVPAGE unmapping %p\n", virt );
/* Do nothing if paging is disabled */
__asm__ ( "csrr %0, satp" : "=r" ( satp ) );
if ( ! satp )
return;
/* Calculate first page table entry */
first = ( ( virt - MAP_BASE ) / MAP_PAGE_SIZE );
/* Clear page table entries */
for ( i = first ; ; i++ ) {
/* Sanity check */
assert ( page_table.pte[i] & PTE_V );
/* Check if this is the last page in this allocation */
is_last = ( page_table.pte[i] & PTE_LAST );
/* Clear page table entry */
page_table.pte[i] = 0;
/* Terminate if this was the last page */
if ( is_last )
break;
}
/* Synchronise page table updates */
__asm__ __volatile__ ( "sfence.vma" );
DBGC ( &page_table, "SVPAGE unmapped %p using PTEs [%d-%d]\n",
virt, first, i );
}
/**
* Map pages for I/O
*
* @v bus_addr Bus address
* @v len Length of region
* @ret io_addr I/O address
*/
static void * svpage_ioremap ( unsigned long bus_addr, size_t len ) {
unsigned long attrs = ( PTE_V | PTE_R | PTE_W | PTE_A | PTE_D );
int rc;
/* Add Svpbmt attributes if applicable */
if ( ( rc = hart_supported ( "_svpbmt" ) ) == 0 )
attrs |= PTE_SVPBMT_IO;
/* Map pages for I/O */
return svpage_map ( bus_addr, len, attrs );
}
/**
* Get 32-bit address space coherent DMA mapping address
*
* @ret base Coherent DMA mapping base address
*/
void * svpage_dma32 ( void ) {
unsigned long attrs = ( PTE_V | PTE_R | PTE_W | PTE_A | PTE_D );
int rc;
/* Add Svpbmt attributes if applicable */
if ( ( rc = hart_supported ( "_svpbmt" ) ) == 0 )
attrs |= PTE_SVPBMT_NC;
/* Create mapping, if necessary */
if ( ! svpage_dma32_base )
svpage_dma32_base = svpage_map ( 0, DMA32_LEN, attrs );
/* Sanity check */
assert ( virt_to_phys ( svpage_dma32_base ) == 0 );
return svpage_dma32_base;
}
PROVIDE_IOMAP_INLINE ( svpage, io_to_bus );
PROVIDE_IOMAP ( svpage, ioremap, svpage_ioremap );
PROVIDE_IOMAP ( svpage, iounmap, svpage_unmap );
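The core of the map/unmap logic above is the arithmetic tying a gigapage PTE index to its virtual address through the recursive entry in PTE 511. A minimal sketch using the MAP_BASE and MAP_PAGE_SIZE definitions from this file (helper names are illustrative only; the same expressions appear inline in svpage_map() and svpage_unmap()):
/* Virtual address served by gigapage PTE "index" (plus sub-page offset) */
static void * svpage_index_to_virt ( unsigned int index, size_t offset ) {
        return ( MAP_BASE + ( index * MAP_PAGE_SIZE ) + offset );
}
/* Gigapage PTE index serving virtual address "virt" */
static unsigned int svpage_virt_to_index ( const volatile void *virt ) {
        return ( ( virt - MAP_BASE ) / MAP_PAGE_SIZE );
}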


@ -0,0 +1,65 @@
/*
* Copyright (C) 2025 Michael Brown <mbrown@fensystems.co.uk>.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
* 02110-1301, USA.
*
* You can also choose to distribute this program under the terms of
* the Unmodified Binary Distribution Licence (as given in the file
* COPYING.UBDL), provided that you have satisfied its requirements.
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
/** @file
*
* T-Head vendor extensions
*
*/
#include <ipxe/sbi.h>
#include <ipxe/xthead.h>
/** Colour for debug messages */
#define colour THEAD_MVENDORID
/**
* Check for a T-Head feature via SXSTATUS register
*
* @v feature Feature bit
* @ret supported Feature is supported
*/
int xthead_supported ( unsigned long feature ) {
struct sbi_return ret;
unsigned long sxstatus;
/* Check for a T-Head CPU */
ret = sbi_ecall_0 ( SBI_BASE, SBI_BASE_MVENDORID );
if ( ret.error )
return 0;
if ( ret.value != THEAD_MVENDORID ) {
DBGC ( colour, "THEAD vendor ID mismatch: %#08lx\n",
ret.value );
return 0;
}
/* Read SXSTATUS CSR */
__asm__ ( "csrr %0, %1"
: "=r" ( sxstatus ) : "i" ( THEAD_SXSTATUS ) );
DBGC ( colour, "THEAD sxstatus %#08lx\n", sxstatus );
/* Check feature bit */
return ( !! ( sxstatus & feature ) );
}


@ -0,0 +1,255 @@
/*
* Copyright (C) 2025 Michael Brown <mbrown@fensystems.co.uk>.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
* 02110-1301, USA.
*
* You can also choose to distribute this program under the terms of
* the Unmodified Binary Distribution Licence (as given in the file
* COPYING.UBDL), provided that you have satisfied its requirements.
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
/** @file
*
* Cache-block management operations (Zicbom)
*
* We support explicit cache management operations on I/O buffers.
* These are guaranteed to be aligned on their own size and at least
* as large as a (reasonable) cacheline, and therefore never share a
* cacheline with unrelated data.
*/
#include <stdint.h>
#include <ipxe/hart.h>
#include <ipxe/xthead.h>
#include <ipxe/zicbom.h>
/** Minimum supported cacheline size
*
* We assume that cache management operations will ignore the least
* significant address bits, and so we are safe to assume a cacheline
* size that is smaller than the size actually used by the CPU.
*
* Cache clean and invalidate loops could be made faster by detecting
* the actual cacheline size.
*/
#define CACHE_STRIDE 32
/** A cache management extension */
struct cache_extension {
/**
* Clean data cache (i.e. write cached content back to memory)
*
* @v first First byte
* @v last Last byte
*/
void ( * clean ) ( const void *first, const void *last );
/**
* Invalidate data cache (i.e. discard any cached content)
*
* @v first First byte
* @v last Last byte
*/
void ( * invalidate ) ( void *first, void *last );
};
/** Define an operation to clean the data cache */
#define CACHE_CLEAN( extension, insn ) \
static void extension ## _clean ( const void *first, \
const void *last ) { \
\
__asm__ __volatile__ ( ".option arch, +" #extension "\n\t" \
"\n1:\n\t" \
insn "\n\t" \
"addi %0, %0, %2\n\t" \
"bltu %0, %1, 1b\n\t" \
: "+r" ( first ) \
: "r" ( last ), "i" ( CACHE_STRIDE ) ); \
}
/** Define an operation to invalidate the data cache */
#define CACHE_INVALIDATE( extension, insn ) \
static void extension ## _invalidate ( void *first, \
void *last ) { \
\
__asm__ __volatile__ ( ".option arch, +" #extension "\n\t" \
"\n1:\n\t" \
insn "\n\t" \
"addi %0, %0, %2\n\t" \
"bltu %0, %1, 1b\n\t" \
: "+r" ( first ) \
: "r" ( last ), "i" ( CACHE_STRIDE ) \
: "memory" ); \
}
/** Define a cache management extension */
#define CACHE_EXTENSION( extension, clean_insn, invalidate_insn ) \
CACHE_CLEAN ( extension, clean_insn ); \
CACHE_INVALIDATE ( extension, invalidate_insn ); \
static struct cache_extension extension = { \
.clean = extension ## _clean, \
.invalidate = extension ## _invalidate, \
};
/** The standard Zicbom extension */
CACHE_EXTENSION ( zicbom, "cbo.clean (%0)", "cbo.inval (%0)" );
/** The T-Head cache management extension */
CACHE_EXTENSION ( xtheadcmo, "th.dcache.cva %0", "th.dcache.iva %0" );
/**
* Clean data cache (with fully coherent memory)
*
* @v first First byte
* @v last Last byte
*/
static void cache_coherent_clean ( const void *first __unused,
const void *last __unused ) {
/* Nothing to do */
}
/**
* Invalidate data cache (with fully coherent memory)
*
* @v first First byte
* @v last Last byte
*/
static void cache_coherent_invalidate ( void *first __unused,
void *last __unused ) {
/* Nothing to do */
}
/** Dummy cache management extension for fully coherent memory */
static struct cache_extension cache_coherent = {
.clean = cache_coherent_clean,
.invalidate = cache_coherent_invalidate,
};
static void cache_auto_detect ( void );
static void cache_auto_clean ( const void *first, const void *last );
static void cache_auto_invalidate ( void *first, void *last );
/** The autodetect cache management extension */
static struct cache_extension cache_auto = {
.clean = cache_auto_clean,
.invalidate = cache_auto_invalidate,
};
/** Active cache management extension */
static struct cache_extension *cache_extension = &cache_auto;
/**
* Clean data cache (i.e. write cached content back to memory)
*
* @v start Start address
* @v len Length
*/
void cache_clean ( const void *start, size_t len ) {
const void *first;
const void *last;
/* Do nothing for zero-length buffers */
if ( ! len )
return;
/* Construct address range */
first = ( ( const void * )
( ( ( intptr_t ) start ) & ~( CACHE_STRIDE - 1 ) ) );
last = ( start + len - 1 );
/* Clean cache lines */
cache_extension->clean ( first, last );
}
/**
* Invalidate data cache (i.e. discard any cached content)
*
* @v start Start address
* @v len Length
*/
void cache_invalidate ( void *start, size_t len ) {
void *first;
void *last;
/* Do nothing for zero-length buffers */
if ( ! len )
return;
/* Construct address range */
first = ( ( void * )
( ( ( intptr_t ) start ) & ~( CACHE_STRIDE - 1 ) ) );
last = ( start + len - 1 );
/* Invalidate cache lines */
cache_extension->invalidate ( first, last );
}
/**
* Autodetect and clean data cache
*
* @v first First byte
* @v last Last byte
*/
static void cache_auto_clean ( const void *first, const void *last ) {
/* Detect cache extension */
cache_auto_detect();
/* Clean data cache */
cache_extension->clean ( first, last );
}
/**
* Autodetect and invalidate data cache
*
* @v first First byte
* @v last Last byte
*/
static void cache_auto_invalidate ( void *first, void *last ) {
/* Detect cache extension */
cache_auto_detect();
/* Invalidate data cache */
cache_extension->invalidate ( first, last );
}
/**
* Autodetect cache
*
*/
static void cache_auto_detect ( void ) {
int rc;
/* Check for standard Zicbom extension */
if ( ( rc = hart_supported ( "_zicbom" ) ) == 0 ) {
DBGC ( &cache_extension, "CACHE detected Zicbom\n" );
cache_extension = &zicbom;
return;
}
/* Check for T-Head cache management extension */
if ( xthead_supported ( THEAD_SXSTATUS_THEADISAEE ) ) {
DBGC ( &cache_extension, "CACHE detected XTheadCmo\n" );
cache_extension = &xtheadcmo;
return;
}
/* Assume coherent memory if no supported extension detected */
DBGC ( &cache_extension, "CACHE assuming coherent memory\n" );
cache_extension = &cache_coherent;
}
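As a usage illustration (hypothetical driver code, not part of this patch): with non-coherent DMA, the cache is cleaned before a freshly written buffer is handed to the device, and invalidated before reading back data that the device has written:
#include <string.h>
#include <ipxe/zicbom.h>
/* Hypothetical call pattern; "devbuf" stands in for a driver-owned
 * DMA buffer.
 */
static void example_dma_round_trip ( void *devbuf, const void *txdata,
                                     void *rxdata, size_t len ) {
        /* Outbound: write dirty cachelines back so the device sees them */
        memcpy ( devbuf, txdata, len );
        cache_clean ( devbuf, len );
        /* ... device reads from, and then DMAs new data into, devbuf ... */
        /* Inbound: discard stale cachelines before reading what the
         * device wrote
         */
        cache_invalidate ( devbuf, len );
        memcpy ( rxdata, devbuf, len );
}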


@ -32,7 +32,7 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <string.h>
#include <errno.h>
#include <ipxe/fdt.h>
#include <ipxe/hart.h>
#include <ipxe/csr.h>
#include <ipxe/timer.h>
/** Timer increment per microsecond */
@ -145,15 +145,15 @@ static int zicntr_probe ( void ) {
} u;
int rc;
/* Check if Zicntr extension is supported */
if ( ( rc = hart_supported ( "_zicntr" ) ) != 0 ) {
DBGC ( colour, "ZICNTR not supported: %s\n", strerror ( rc ) );
return rc;
/* Check if time CSR can be read */
if ( ! csr_can_read ( "time" ) ) {
DBGC ( colour, "ZICNTR cannot read TIME CSR\n" );
return -ENOTSUP;
}
/* Get timer frequency */
if ( ( ( rc = fdt_path ( "/cpus", &offset ) ) != 0 ) ||
( ( rc = fdt_u64 ( offset, "timebase-frequency",
if ( ( ( rc = fdt_path ( &sysfdt, "/cpus", &offset ) ) != 0 ) ||
( ( rc = fdt_u64 ( &sysfdt, offset, "timebase-frequency",
&u.freq ) ) != 0 ) ) {
DBGC ( colour, "ZICNTR could not determine frequency: %s\n",
strerror ( rc ) );


@ -30,7 +30,6 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
*/
#include <errno.h>
#include <ipxe/hart.h>
#include <ipxe/csr.h>
#include <ipxe/entropy.h>
#include <ipxe/drbg.h>
@ -53,13 +52,6 @@ struct entropy_source zkr_entropy __entropy_source ( ENTROPY_PREFERRED );
* @ret rc Return status code
*/
static int zkr_entropy_enable ( void ) {
int rc;
/* Check if Zkr extension is supported */
if ( ( rc = hart_supported ( "_zkr" ) ) != 0 ) {
DBGC ( colour, "ZKR not supported: %s\n", strerror ( rc ) );
return rc;
}
/* Check if seed CSR is accessible in S-mode */
if ( ! csr_can_write ( "seed", 0 ) ) {


@ -0,0 +1,14 @@
#ifndef _BITS_DMA_H
#define _BITS_DMA_H
/** @file
*
* RISCV-specific DMA API implementations
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <ipxe/riscv_dma.h>
#endif /* _BITS_DMA_H */


@ -0,0 +1,14 @@
#ifndef _BITS_IOMAP_H
#define _BITS_IOMAP_H
/** @file
*
* RISCV-specific I/O mapping API implementations
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <ipxe/svpage.h>
#endif /* _BITS_IOMAP_H */


@ -0,0 +1,34 @@
#ifndef _BITS_LKRN_H
#define _BITS_LKRN_H
/** @file
*
* Linux kernel image invocation
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <ipxe/hart.h>
/** Header magic value */
#define LKRN_MAGIC_ARCH LKRN_MAGIC_RISCV
/**
* Jump to kernel entry point
*
* @v entry Kernel entry point
* @v fdt Device tree
*/
static inline __attribute__ (( noreturn )) void
lkrn_jump ( physaddr_t entry, physaddr_t fdt ) {
register unsigned long a0 asm ( "a0" ) = boot_hart;
register unsigned long a1 asm ( "a1" ) = fdt;
__asm__ __volatile__ ( "call disable_paging\n\t"
"jr %2\n\t"
: : "r" ( a0 ), "r" ( a1 ), "r" ( entry ) );
__builtin_unreachable();
}
#endif /* _BITS_LKRN_H */


@ -12,8 +12,6 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
extern void riscv_bzero ( void *dest, size_t len );
extern void riscv_memset ( void *dest, size_t len, int character );
extern void riscv_memcpy ( void *dest, const void *src, size_t len );
extern void riscv_memmove_forwards ( void *dest, const void *src, size_t len );
extern void riscv_memmove_backwards ( void *dest, const void *src, size_t len );
extern void riscv_memmove ( void *dest, const void *src, size_t len );
/**
@ -68,17 +66,12 @@ static inline __attribute__ (( always_inline )) void *
memmove ( void *dest, const void *src, size_t len ) {
ssize_t offset = ( dest - src );
/* If required direction of copy is known at build time, then
* use the appropriate forwards/backwards copy directly.
/* If direction of copy is known to be forwards at build time,
* then use variable-length memcpy().
*/
if ( __builtin_constant_p ( offset ) ) {
if ( offset <= 0 ) {
riscv_memmove_forwards ( dest, src, len );
return dest;
} else {
riscv_memmove_backwards ( dest, src, len );
return dest;
}
if ( __builtin_constant_p ( offset ) && ( offset <= 0 ) ) {
riscv_memcpy ( dest, src, len );
return dest;
}
/* Otherwise, use ambidirectional copy */
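As an illustration of when the build-time test above fires (hypothetical caller, not part of the patch): with optimisation enabled the compiler can typically fold dest - src here to the constant -16, so the inline memmove() reduces to a riscv_memcpy() call.
/* Hypothetical caller: shift the tail of a buffer down by 16 bytes.
 * The offset ( buf - ( buf + 16 ) ) folds to -16 at compile time, so
 * the copy direction is known to be forwards.
 */
static void shift_down ( char *buf, size_t used ) {
        memmove ( buf, ( buf + 16 ), ( used - 16 ) );
}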


@ -0,0 +1,15 @@
#ifndef _BITS_TCPIP_H
#define _BITS_TCPIP_H
/** @file
*
* Transport-network layer interface
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
extern uint16_t tcpip_continue_chksum ( uint16_t partial, const void *data,
size_t len );
#endif /* _BITS_TCPIP_H */


@ -1,14 +0,0 @@
#ifndef _BITS_UMALLOC_H
#define _BITS_UMALLOC_H
/** @file
*
* RISCV-specific user memory allocation API implementations
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <ipxe/sbi_umalloc.h>
#endif /* _BITS_UMALLOC_H */


@ -0,0 +1,33 @@
#ifndef _BITS_VIRT_OFFSET_H
#define _BITS_VIRT_OFFSET_H
/** @file
*
* RISCV-specific virtual address offset
*
* We use the thread pointer register (tp) to hold the virtual address
* offset, so that virtual-to-physical address translations work as
* expected even while we are executing directly from read-only memory
* (and so cannot store a value in a global virt_offset variable).
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
/**
* Read virtual address offset held in thread pointer register
*
* @ret virt_offset Virtual address offset
*/
static inline __attribute__ (( const, always_inline )) unsigned long
tp_virt_offset ( void ) {
register unsigned long tp asm ( "tp" );
__asm__ ( "" : "=r" ( tp ) );
return tp;
}
/** Always read thread pointer register to get virtual address offset */
#define virt_offset tp_virt_offset()
#endif /* _BITS_VIRT_OFFSET_H */
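A rough sketch of how this offset is consumed, assuming the usual iPXE convention that virtual-to-physical conversion adds virt_offset and the inverse subtracts it (the real definitions live in the generic address-handling code, not in this header; the helper names are illustrative only):
/* Sketch only: because virt_offset expands to tp_virt_offset(), these
 * conversions never need a writable global variable.
 */
static inline physaddr_t sketch_virt_to_phys ( volatile const void *virt ) {
        return ( ( ( physaddr_t ) virt ) + virt_offset );
}
static inline void * sketch_phys_to_virt ( physaddr_t phys ) {
        return ( ( void * ) ( phys - virt_offset ) );
}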


@ -0,0 +1,45 @@
#ifndef _IPXE_RISCV_DMA_H
#define _IPXE_RISCV_DMA_H
/** @file
*
* iPXE DMA API for RISC-V
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#ifdef DMAAPI_RISCV
#define DMAAPI_PREFIX_riscv
#else
#define DMAAPI_PREFIX_riscv __riscv_
#endif
/**
* Set addressable space mask
*
* @v dma DMA device
* @v mask Addressable space mask
*/
static inline __always_inline void
DMAAPI_INLINE ( riscv, dma_set_mask ) ( struct dma_device *dma __unused,
physaddr_t mask __unused ) {
/* Nothing to do */
}
/**
* Get DMA address from virtual address
*
* @v map DMA mapping
* @v addr Address within the mapped region
* @ret addr Device-side DMA address
*/
static inline __always_inline physaddr_t
DMAAPI_INLINE ( riscv, dma ) ( struct dma_mapping *map __unused, void *addr ) {
/* Use physical address as device address */
return virt_to_phys ( addr );
}
#endif /* _IPXE_RISCV_DMA_H */


@ -47,7 +47,8 @@ IOAPI_INLINE ( riscv, bus_to_phys ) ( unsigned long bus_addr ) {
static inline __always_inline _type \
IOAPI_INLINE ( riscv, read ## _suffix ) ( volatile _type *io_addr ) { \
unsigned long data; \
__asm__ __volatile__ ( "l" _insn_suffix " %0, %1" \
__asm__ __volatile__ ( "fence io, io\n\t" \
"l" _insn_suffix " %0, %1\n\t" \
: "=r" ( data ) : "m" ( *io_addr ) ); \
return data; \
}
@ -57,7 +58,8 @@ IOAPI_INLINE ( riscv, read ## _suffix ) ( volatile _type *io_addr ) { \
static inline __always_inline void \
IOAPI_INLINE ( riscv, write ## _suffix ) ( _type data, \
volatile _type *io_addr ) { \
__asm__ __volatile__ ( "s" _insn_suffix " %0, %1" \
__asm__ __volatile__ ( "fence io, io\n\t" \
"s" _insn_suffix " %0, %1\n\t" \
: : "r" ( data ), "m" ( *io_addr ) ); \
}
@ -69,7 +71,8 @@ IOAPI_INLINE ( riscv, read ## _suffix ) ( volatile _type *io_addr ) { \
unsigned long half[2]; \
_type data; \
} u; \
__asm__ __volatile__ ( "l" _insn_suffix " %0, 0(%2)\n\t" \
__asm__ __volatile__ ( "fence io, io\n\t" \
"l" _insn_suffix " %0, 0(%2)\n\t" \
"l" _insn_suffix " %1, %3(%2)\n\t" \
: "=&r" ( u.half[0] ), \
"=&r" ( u.half[1] ) \
@ -87,7 +90,8 @@ IOAPI_INLINE ( riscv, write ## _suffix ) ( _type data, \
unsigned long half[2]; \
_type data; \
} u = { .data = data }; \
__asm__ __volatile__ ( "s" _insn_suffix " %0, 0(%2)\n\t" \
__asm__ __volatile__ ( "fence io, io\n\t" \
"s" _insn_suffix " %0, 0(%2)\n\t" \
"s" _insn_suffix " %1, %3(%2)\n\t" : \
: "r" ( u.half[0] ), \
"r" ( u.half[1] ), \
@ -128,7 +132,7 @@ RISCV_WRITEX ( w, uint16_t, "h" );
*/
static inline __always_inline void
IOAPI_INLINE ( riscv, mb ) ( void ) {
__asm__ __volatile__ ( "fence rw, rw" );
__asm__ __volatile__ ( "fence" : : : "memory" );
}
/* Dummy PIO */


@ -73,7 +73,7 @@ sbi_ecall_0 ( int eid, int fid ) {
*
* @v eid Extension ID
* @v fid Function ID
* @v param0 Parameter 0
* @v p0 Parameter 0
* @ret ret Return value
*/
static inline __attribute__ (( always_inline )) struct sbi_return
@ -98,8 +98,8 @@ sbi_ecall_1 ( int eid, int fid, unsigned long p0 ) {
*
* @v eid Extension ID
* @v fid Function ID
* @v param0 Parameter 0
* @v param1 Parameter 1
* @v p0 Parameter 0
* @v p1 Parameter 1
* @ret ret Return value
*/
static inline __attribute__ (( always_inline )) struct sbi_return
@ -124,9 +124,9 @@ sbi_ecall_2 ( int eid, int fid, unsigned long p0, unsigned long p1 ) {
*
* @v eid Extension ID
* @v fid Function ID
* @v param0 Parameter 0
* @v param1 Parameter 1
* @v param2 Parameter 2
* @v p0 Parameter 0
* @v p1 Parameter 1
* @v p2 Parameter 2
* @ret ret Return value
*/
static inline __attribute__ (( always_inline )) struct sbi_return
@ -148,9 +148,55 @@ sbi_ecall_3 ( int eid, int fid, unsigned long p0, unsigned long p1,
return ret;
}
/**
* Call supervisor with no parameters
*
* @v fid Legacy function ID
* @ret ret Return value
*/
static inline __attribute__ (( always_inline )) long
sbi_legacy_ecall_0 ( int fid ) {
register unsigned long a7 asm ( "a7" ) = ( ( long ) fid );
register unsigned long a0 asm ( "a0" );
__asm__ __volatile__ ( "ecall"
: "=r" ( a0 )
: "r" ( a7 )
: "memory" );
return a0;
}
/**
* Call supervisor with one parameter
*
* @v fid Legacy function ID
* @v p0 Parameter 0
* @ret ret Return value
*/
static inline __attribute__ (( always_inline )) long
sbi_legacy_ecall_1 ( int fid, unsigned long p0 ) {
register unsigned long a7 asm ( "a7" ) = ( ( long ) fid );
register unsigned long a0 asm ( "a0" ) = p0;
__asm__ __volatile__ ( "ecall"
: "+r" ( a0 )
: "r" ( a7 )
: "memory" );
return a0;
}
/** Convert an SBI error code to an iPXE status code */
#define ESBI( error ) EPLATFORM ( EINFO_EPLATFORM, error )
/** Legacy extensions */
#define SBI_LEGACY_PUTCHAR 0x01 /**< Console Put Character */
#define SBI_LEGACY_GETCHAR 0x02 /**< Console Get Character */
#define SBI_LEGACY_SHUTDOWN 0x08 /**< System Shutdown */
/** Base extension */
#define SBI_BASE 0x10
#define SBI_BASE_MVENDORID 0x04 /**< Get machine vendor ID */
/** System reset extension */
#define SBI_SRST SBI_EID ( 'S', 'R', 'S', 'T' )
#define SBI_SRST_SYSTEM_RESET 0x00 /**< Reset system */
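The legacy wrappers above follow the legacy SBI calling convention: the function ID is passed in a7, with a single argument and the return value in a0. A hypothetical helper built directly on them, for illustration only:
/* Hypothetical helper: write a string via the legacy console, one
 * byte at a time, using the wrapper declared above.
 */
static void legacy_console_puts ( const char *msg ) {
        while ( *msg )
                sbi_legacy_ecall_1 ( SBI_LEGACY_PUTCHAR, *(msg++) );
}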


@ -1,18 +0,0 @@
#ifndef _IPXE_SBI_UMALLOC_H
#define _IPXE_SBI_UMALLOC_H
/** @file
*
* External memory allocation
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#ifdef UMALLOC_SBI
#define UMALLOC_PREFIX_sbi
#else
#define UMALLOC_PREFIX_sbi __sbi_
#endif
#endif /* _IPXE_SBI_UMALLOC_H */


@ -0,0 +1,28 @@
#ifndef _IPXE_SVPAGE_H
#define _IPXE_SVPAGE_H
/** @file
*
* Supervisor page table management
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#ifdef IOMAP_SVPAGE
#define IOMAP_PREFIX_svpage
#else
#define IOMAP_PREFIX_svpage __svpage_
#endif
static inline __always_inline unsigned long
IOMAP_INLINE ( svpage, io_to_bus ) ( volatile const void *io_addr ) {
/* Not easy to do; just return the CPU address for debugging purposes */
return ( ( intptr_t ) io_addr );
}
extern void * svpage_dma32 ( void );
#endif /* _IPXE_SVPAGE_H */


@ -0,0 +1,21 @@
#ifndef _IPXE_XTHEAD_H
#define _IPXE_XTHEAD_H
/** @file
*
* T-Head vendor extensions
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
/** T-Head machine vendor ID */
#define THEAD_MVENDORID 0x5b7
/** T-Head SXSTATUS CSR */
#define THEAD_SXSTATUS 0x5c0
#define THEAD_SXSTATUS_THEADISAEE 0x00400000 /**< General ISA extensions */
extern int xthead_supported ( unsigned long feature );
#endif /* _IPXE_XTHEAD_H */


@ -0,0 +1,17 @@
#ifndef _IPXE_ZICBOM_H
#define _IPXE_ZICBOM_H
/** @file
*
* Cache-block management operations (Zicbom)
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
extern void cache_clean ( const void *start, size_t len );
extern void cache_invalidate ( void *start, size_t len );
#endif /* _IPXE_ZICBOM_H */


@ -31,6 +31,8 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <ipxe/sbi.h>
#include <ipxe/io.h>
#include <ipxe/keys.h>
#include <ipxe/serial.h>
#include <ipxe/console.h>
#include <config/console.h>
@ -40,6 +42,11 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#define CONSOLE_SBI ( CONSOLE_USAGE_ALL & ~CONSOLE_USAGE_LOG )
#endif
extern void early_uart_putchar ( int character );
/** Dummy serial console (if not present in build) */
struct uart *serial_console __attribute__ (( weak ));
/** Buffered input character (if any) */
static unsigned char sbi_console_input;
@ -49,9 +56,22 @@ static unsigned char sbi_console_input;
* @v character Character to be printed
*/
static void sbi_putchar ( int character ) {
struct sbi_return ret;
/* Do nothing if a real serial console has been enabled */
if ( serial_console )
return;
/* Write byte to early UART, if enabled */
early_uart_putchar ( character );
/* Write byte to console */
sbi_ecall_1 ( SBI_DBCN, SBI_DBCN_WRITE_BYTE, character );
ret = sbi_ecall_1 ( SBI_DBCN, SBI_DBCN_WRITE_BYTE, character );
if ( ! ret.error )
return;
/* Debug extension not supported: try legacy method */
sbi_legacy_ecall_1 ( SBI_LEGACY_PUTCHAR, character );
}
/**
@ -65,6 +85,11 @@ static int sbi_getchar ( void ) {
/* Consume and return buffered character, if any */
character = sbi_console_input;
sbi_console_input = 0;
/* Convert DEL to backspace */
if ( character == DEL )
character = BACKSPACE;
return character;
}
@ -76,20 +101,32 @@ static int sbi_getchar ( void ) {
*/
static int sbi_iskey ( void ) {
struct sbi_return ret;
long key;
/* Do nothing if we already have a buffered character */
if ( sbi_console_input )
return sbi_console_input;
/* Do nothing if a real serial console has been enabled */
if ( serial_console )
return 0;
/* Read and buffer byte from console, if any */
ret = sbi_ecall_3 ( SBI_DBCN, SBI_DBCN_READ,
sizeof ( sbi_console_input ),
virt_to_phys ( &sbi_console_input ), 0 );
if ( ret.error )
return 0;
if ( ! ret.error )
return ret.value;
/* Return number of characters read and buffered */
return ret.value;
/* Debug extension not supported: try legacy method */
key = sbi_legacy_ecall_0 ( SBI_LEGACY_GETCHAR );
if ( key > 0 ) {
sbi_console_input = key;
return key;
}
/* No character available */
return 0;
}
/** SBI console */


@ -37,13 +37,15 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
/**
* Reboot system
*
* @v warm Perform a warm reboot
* @v flags Reboot flags
*/
static void sbi_reboot ( int warm ) {
static void sbi_reboot ( int flags ) {
struct sbi_return ret;
int warm;
int rc;
/* Reboot system */
warm = ( flags & REBOOT_WARM );
ret = sbi_ecall_2 ( SBI_SRST, SBI_SRST_SYSTEM_RESET,
( warm ? SBI_RESET_WARM : SBI_RESET_COLD ), 0 );
@ -51,6 +53,10 @@ static void sbi_reboot ( int warm ) {
rc = -ESBI ( ret.error );
DBGC ( SBI_SRST, "SBI %s reset failed: %s\n",
( warm ? "warm" : "cold" ), strerror ( rc ) );
/* Try a legacy shutdown */
sbi_legacy_ecall_0 ( SBI_LEGACY_SHUTDOWN );
DBGC ( SBI_SRST, "SBI legacy shutdown failed\n" );
}
/**
@ -69,6 +75,11 @@ static int sbi_poweroff ( void ) {
/* Any return is an error */
rc = -ESBI ( ret.error );
DBGC ( SBI_SRST, "SBI shutdown failed: %s\n", strerror ( rc ) );
/* Try a legacy shutdown */
sbi_legacy_ecall_0 ( SBI_LEGACY_SHUTDOWN );
DBGC ( SBI_SRST, "SBI legacy shutdown failed\n" );
return rc;
}

File diff suppressed because it is too large


@ -0,0 +1,129 @@
/*
* Copyright (C) 2025 Michael Brown <mbrown@fensystems.co.uk>.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
* 02110-1301, USA.
*
* You can also choose to distribute this program under the terms of
* the Unmodified Binary Distribution Licence (as given in the file
* COPYING.UBDL), provided that you have satisfied its requirements.
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL )
/** @file
*
* Linux kernel prefix
*
*/
.section ".note.GNU-stack", "", @progbits
.text
/* Layout of kernel header */
.struct 0
hdr_code0: .space 4
hdr_code1: .space 4
hdr_text_offset: .space 8
hdr_image_size: .space 8
hdr_flags: .space 8
hdr_version: .space 4
hdr_res1: .space 4
hdr_res2: .space 8
hdr_magic: .space 8
hdr_magic2: .space 4
hdr_res3: .space 4
hdr_end:
.org 64
.previous
/* Header version */
#define HDR_VERSION( major, minor ) ( ( (major) << 16 ) | (minor) )
#define HDR_VERSION_0_2 HDR_VERSION ( 0, 2 )
/* Header flags */
#define HDR_FL_BIG_ENDIAN 0x00000001
/* Magic numbers */
#define HDR_MAGIC "RISCV\0\0\0"
#define HDR_MAGIC2 "RSC\x05"
/*
* Linux kernel header
*/
.section ".prefix", "ax", @progbits
/* Executable code / MZ header (for EFI-compatible binaries) */
.org hdr_code0
j _lkrn_start
/* Image load offset
*
* Must be set to the size of a single "megapage" (2MB for
* 64-bit, 4MB for 32-bit).
*/
.org hdr_text_offset
.dword _max_align
/* Image size (including uninitialised-data portions) */
.org hdr_image_size
.dword _memsz
/* Flags */
.org hdr_flags
.dword 0
/* Version */
.org hdr_version
.word HDR_VERSION_0_2
/* Magic numbers */
.org hdr_magic
.ascii HDR_MAGIC
.org hdr_magic2
.ascii HDR_MAGIC2
.org hdr_end
/*
* Linux kernel entry point
*/
.globl _lkrn_start
_lkrn_start:
/* Identify temporary page table and stack space
*
* Linux expects to be placed at the image load offset from
* the start of RAM. Assume that our loaded image is
* therefore already writable, and that we can use
* the page table and stack within our (not yet zeroed) .bss
* section.
*/
la a2, page_table
la sp, _estack
/* Install iPXE */
call install
/* Call main program */
call main
/* We have no return path, since the Linux kernel does not
* define that a valid return address exists.
*
* Attempt a system reset, since there is nothing else we can
* viably do at this point.
*/
j reset_system
.size _lkrn_start, . - _lkrn_start
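For clarity, the 64-byte header laid out by the .struct directives above corresponds to the following hypothetical C view (field names mirror the hdr_* labels; the actual header bytes are emitted by the assembler, not by any C structure):
#include <stdint.h>
/* Hypothetical C view of the 64-byte kernel image header */
struct lkrn_header_sketch {
        uint32_t code0;        /* jump instruction / MZ header */
        uint32_t code1;
        uint64_t text_offset;  /* image load offset from start of RAM */
        uint64_t image_size;   /* includes uninitialised data */
        uint64_t flags;
        uint32_t version;      /* HDR_VERSION_0_2 */
        uint32_t res1;
        uint64_t res2;
        char magic[8];         /* "RISCV\0\0\0" */
        char magic2[4];        /* "RSC\x05" */
        uint32_t res3;
} __attribute__ (( packed ));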


@ -32,45 +32,8 @@
.section ".note.GNU-stack", "", @progbits
.text
/* SBI debug console extension */
#define SBI_DBCN ( ( 'D' << 24 ) | ( 'B' << 16 ) | ( 'C' << 8 ) | 'N' )
#define SBI_DBCN_WRITE 0x00
/* SBI system reset extension */
#define SBI_SRST ( ( 'S' << 24 ) | ( 'R' << 16 ) | ( 'S' << 8 ) | 'T' )
#define SBI_SRST_SYSTEM_RESET 0x00
#define SBI_RESET_COLD 0x00000001
/* Relative relocation type */
#define R_RISCV_RELATIVE 3
/* Layout of a relocation record */
.struct 0
rela_offset: .space ( __riscv_xlen / 8 )
rela_type: .space ( __riscv_xlen / 8 )
rela_addend: .space ( __riscv_xlen / 8 )
rela_len:
.previous
/*
* Display progress message via debug console
*/
.macro progress message
#ifndef NDEBUG
.section ".prefix.data", "aw", @progbits
progress_\@:
.ascii "\message"
.equ progress_\@_len, . - progress_\@
.size progress_\@, . - progress_\@
.previous
li a7, SBI_DBCN
li a6, SBI_DBCN_WRITE
li a0, progress_\@_len
la a1, progress_\@
mv a2, zero
ecall
#endif
.endm
/* Page size */
#define PAGE_SIZE 4096
/*
* SBI entry point
@ -79,53 +42,21 @@ progress_\@:
.org 0
.globl _sbi_start
_sbi_start:
/* Preserve arguments */
mv s0, a0
mv s1, a1
progress "\nSBI->iPXE"
/* Identify temporary page table and stack space
*
* Assume that there is sufficient writable memory (~8kB)
* directly below the device tree.
*/
li t0, ~( PAGE_SIZE - 1 )
and sp, a1, t0
li t0, PAGE_SIZE
sub sp, sp, t0
mv a2, sp
/* Apply dynamic relocations */
la t0, _reloc
la t1, _ereloc
la t2, _sbi_start
1: /* Read relocation record */
LOADN t3, rela_offset(t0)
LOADN t4, rela_type(t0)
LOADN t5, rela_addend(t0)
/* Check relocation type */
addi t4, t4, -R_RISCV_RELATIVE
bnez t4, 2f
/* Apply relocation */
add t3, t3, t2
add t5, t5, t2
STOREN t5, (t3)
2: /* Loop */
addi t0, t0, rela_len
blt t0, t1, 1b
progress " .reloc"
/* Zero the bss */
la t0, _bss
la t1, _ebss
1: STOREN zero, (t0)
addi t0, t0, ( __riscv_xlen / 8 )
blt t0, t1, 1b
progress " .bss"
/* Set up stack */
la sp, _estack
progress " .stack"
/* Store boot hart */
la t0, boot_hart
STOREN s0, (t0)
/* Register device tree */
mv a0, s1
call register_fdt
/* Install iPXE */
call install
/* Call main program */
progress "\n\n"
call main
/* We have no return path, since the M-mode SBI implementation
@ -135,22 +66,5 @@ _sbi_start:
* Attempt a system reset, since there is nothing else we can
* viably do at this point.
*/
progress "\niPXE->SBI reset\n"
li a7, SBI_SRST
li a6, SBI_SRST_SYSTEM_RESET
li a0, SBI_RESET_COLD
mv a1, zero
ecall
/* If reset failed, lock the system */
progress "(reset failed)\n"
1: wfi
j 1b
j reset_system
.size _sbi_start, . - _sbi_start
/* File split information for the compressor */
.section ".zinfo", "a", @progbits
.ascii "COPY"
.word 0
.word _sbi_filesz
.word 1


@ -5,10 +5,8 @@
SECTIONS {
/* Start at virtual address zero */
. = 0;
/* Weak symbols that need zero values if not otherwise defined */
saved_pos = .;
.weak 0x0 : {
_weak = .;
*(.weak)
@ -16,6 +14,7 @@ SECTIONS {
_eweak = .;
}
_assert = ASSERT ( ( _weak == _eweak ), ".weak is non-zero length" );
_assert = ASSERT ( ( . == saved_pos ), ".weak altered current position" );
/* Prefix code */
.prefix : {
@ -39,6 +38,8 @@ SECTIONS {
/* Read-only data */
.rodata : {
_rodata = .;
*(.srodata)
*(.srodata.*)
*(.rodata)
*(.rodata.*)
_erodata = .;
@ -47,11 +48,17 @@ SECTIONS {
/* Writable data */
.data : {
_data = .;
*(.sdata)
*(.sdata.*)
*(.data)
*(.data.*)
KEEP(*(SORT(.tbl.*))) /* Various tables. See include/tables.h */
KEEP(*(.provided))
KEEP(*(.provided.*))
*(.got)
*(.got.plt)
/* Ensure compressed relocations end up aligned */
. = ALIGN ( 16 );
_edata = .;
}
@ -60,7 +67,6 @@ SECTIONS {
/* Runtime relocations (discarded after use) */
.rela.dyn : {
_reloc = .;
*(.rela)
*(.rela.dyn)
}
@ -76,6 +82,8 @@ SECTIONS {
/* Uninitialised data */
.bss : {
_bss = .;
*(.sbss)
*(.sbss.*)
*(.bss)
*(.bss.*)
*(COMMON)
@ -87,17 +95,21 @@ SECTIONS {
}
}
/* Calculate end of relocations
*
* This cannot be done by placing "_ereloc = .;" inside the
* .rela.dyn section, since the dynamic relocations are not
* present in the input sections but are instead generated during
* linking.
*/
_ereloc = ( _reloc + __load_stop_reladyn - __load_start_reladyn );
/* End virtual address */
_end = .;
/* Base virtual address */
_base = ABSOLUTE ( _prefix );
/* Relocations */
_reloc_offset = ( LOADADDR ( .rela.dyn ) - LOADADDR ( .prefix ) );
_reloc_filesz = SIZEOF ( .rela.dyn );
/* Length of initialised data */
_sbi_filesz = ABSOLUTE ( _ereloc );
_filesz = ( ABSOLUTE ( _edata ) - ABSOLUTE ( _prefix ) );
/* Length of in-memory image */
_memsz = ( ABSOLUTE ( _end ) - ABSOLUTE ( _prefix ) );
/* Unwanted sections */
/DISCARD/ : {
@ -110,6 +122,8 @@ SECTIONS {
*(.dynamic)
*(.dynsym)
*(.dynstr)
*(.hash)
*(.gnu.hash)
*(.einfo)
*(.einfo.*)
*(.discard)


@ -1,3 +1,7 @@
# Specify compressor
#
ZBIN = $(ZBIN32)
# RISCV32-specific directories containing source files
#
SRCDIRS += arch/riscv32/core


@ -1,5 +1,13 @@
# -*- makefile -*- : Force emacs to use Makefile mode
# Set base virtual address to 0xeb000000
#
# This is aligned to a 4MB boundary and so allows 4MB megapages to be
# used to map the iPXE binary. The address pattern is also easily
# recognisable if leaked to unexpected contexts.
#
LDFLAGS += --section-start=.prefix=0xeb000000
# Include generic SBI Makefile
#
MAKEDEPS += arch/riscv/Makefile.sbi


@ -1,3 +1,7 @@
# Specify compressor
#
ZBIN = $(ZBIN64)
# RISCV64-specific directories containing source files
#
SRCDIRS += arch/riscv64/core


@ -1,5 +1,13 @@
# -*- makefile -*- : Force emacs to use Makefile mode
# Set base virtual address to 0xffffffffeb000000
#
# This is aligned to a 2MB boundary and so allows 2MB megapages to be
# used to map the iPXE binary. The address pattern is also easily
# recognisable if leaked to unexpected contexts.
#
LDFLAGS += --section-start=.prefix=0xffffffffeb000000
# Include generic SBI Makefile
#
MAKEDEPS += arch/riscv/Makefile.sbi


@ -250,6 +250,7 @@ static void cpuid_settings_init ( void ) {
/** CPUID settings initialiser */
struct init_fn cpuid_settings_init_fn __init_fn ( INIT_NORMAL ) = {
.name = "cpuid",
.initialise = cpuid_settings_init,
};


@ -86,5 +86,6 @@ static void debugcon_init ( void ) {
* Debug port console initialisation function
*/
struct init_fn debugcon_init_fn __init_fn ( INIT_EARLY ) = {
.name = "debugcon",
.initialise = debugcon_init,
};


@ -44,5 +44,6 @@ static void pci_autoboot_init ( void ) {
/** PCI autoboot device initialisation function */
struct init_fn pci_autoboot_init_fn __init_fn ( INIT_NORMAL ) = {
.name = "autoboot",
.initialise = pci_autoboot_init,
};


@ -1,4 +1,5 @@
#include <ipxe/io.h>
#include <ipxe/uaccess.h>
#include <ipxe/memmap.h>
#include <registers.h>
/*
@ -41,82 +42,73 @@ extern char _etextdata[];
* to the prefix in %edi.
*/
__asmcall void relocate ( struct i386_all_regs *ix86 ) {
struct memory_map memmap;
uint32_t start, end, size, padded_size, max;
uint32_t new_start, new_end;
unsigned i;
struct memmap_region region;
physaddr_t start, end, max;
physaddr_t new_start, new_end;
physaddr_t r_start, r_end;
size_t size, padded_size;
/* Get memory map and current location */
get_memmap ( &memmap );
/* Show whole memory map (for debugging) */
memmap_dump_all ( 0 );
/* Get current location */
start = virt_to_phys ( _textdata );
end = virt_to_phys ( _etextdata );
size = ( end - start );
padded_size = ( size + ALIGN - 1 );
DBG ( "Relocate: currently at [%x,%x)\n"
"...need %x bytes for %d-byte alignment\n",
start, end, padded_size, ALIGN );
DBGC ( &region, "Relocate: currently at [%#08lx,%#08lx)\n"
"...need %#zx bytes for %d-byte alignment\n",
start, end, padded_size, ALIGN );
/* Determine maximum usable address */
max = MAX_ADDR;
if ( ix86->regs.ebp < max ) {
max = ix86->regs.ebp;
DBG ( "Limiting relocation to [0,%x)\n", max );
DBGC ( &region, "Limiting relocation to [0,%#08lx)\n", max );
}
/* Walk through the memory map and find the highest address
* below 4GB that iPXE will fit into.
* above the current iPXE and below 4GB that iPXE will fit
* into.
*/
new_end = end;
for ( i = 0 ; i < memmap.count ; i++ ) {
struct memory_region *region = &memmap.regions[i];
uint32_t r_start, r_end;
for_each_memmap_from ( &region, end, 0 ) {
DBG ( "Considering [%llx,%llx)\n", region->start, region->end);
/* Truncate block to maximum address. This will be
* less than 4GB, which means that we can get away
* with using just 32-bit arithmetic after this stage.
* strictly less than 4GB, which means that we can get
* away with using just 32-bit arithmetic after this
* stage.
*/
if ( region->start > max ) {
DBG ( "...starts after max=%x\n", max );
DBGC_MEMMAP ( &region, &region );
if ( region.min > max ) {
DBGC ( &region, "...starts after max=%#08lx\n", max );
break;
}
r_start = region.min;
if ( ! memmap_is_usable ( &region ) ) {
DBGC ( &region, "...not usable\n" );
continue;
}
r_start = region->start;
if ( region->end > max ) {
DBG ( "...end truncated to max=%x\n", max );
r_end = ( r_start + memmap_size ( &region ) );
if ( ( r_end == 0 ) || ( r_end > max ) ) {
DBGC ( &region, "...end truncated to max=%#08lx\n",
max );
r_end = max;
} else {
r_end = region->end;
}
DBG ( "...usable portion is [%x,%x)\n", r_start, r_end );
/* If we have rounded down r_end below r_start, skip
* this block.
*/
if ( r_end < r_start ) {
DBG ( "...truncated to negative size\n" );
continue;
}
DBGC ( &region, "...usable portion is [%#08lx,%#08lx)\n",
r_start, r_end );
/* Check that there is enough space to fit in iPXE */
if ( ( r_end - r_start ) < size ) {
DBG ( "...too small (need %x bytes)\n", size );
if ( ( r_end - r_start ) < padded_size ) {
DBGC ( &region, "...too small (need %#zx bytes)\n",
padded_size );
continue;
}
/* If the start address of the iPXE we would
* place in this block is higher than the end address
* of the current highest block, use this block.
*
* Note that this avoids overlaps with the current
* iPXE, as well as choosing the highest of all viable
* blocks.
*/
if ( ( r_end - size ) > new_end ) {
new_end = r_end;
DBG ( "...new best block found.\n" );
}
/* Use highest block with enough space */
new_end = r_end;
DBGC ( &region, "...new best block found.\n" );
}
/* Calculate new location of iPXE, and align it to the
@ -126,9 +118,9 @@ __asmcall void relocate ( struct i386_all_regs *ix86 ) {
new_start += ( ( start - new_start ) & ( ALIGN - 1 ) );
new_end = new_start + size;
DBG ( "Relocating from [%x,%x) to [%x,%x)\n",
start, end, new_start, new_end );
DBGC ( &region, "Relocating from [%#08lx,%#08lx) to [%#08lx,%#08lx)\n",
start, end, new_start, new_end );
/* Let prefix know what to copy */
ix86->regs.esi = start;
ix86->regs.edi = new_start;

View File

@ -32,6 +32,7 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>
#include <assert.h>
@ -69,6 +70,7 @@ static void cmdline_image_free ( struct refcnt *refcnt ) {
struct image *image = container_of ( refcnt, struct image, refcnt );
DBGC ( image, "RUNTIME freeing command line\n" );
free_image ( refcnt );
free ( cmdline_copy );
}
@ -76,6 +78,7 @@ static void cmdline_image_free ( struct refcnt *refcnt ) {
static struct image cmdline_image = {
.refcnt = REF_INIT ( cmdline_image_free ),
.name = "<CMDLINE>",
.flags = ( IMAGE_STATIC | IMAGE_STATIC_NAME ),
.type = &script_image_type,
};
@ -114,9 +117,7 @@ static void cmdline_strip ( char *cmdline, const char *cruft ) {
* @ret rc Return status code
*/
static int cmdline_init ( void ) {
userptr_t cmdline_user;
char *cmdline;
size_t len;
int rc;
/* Do nothing if no command line was specified */
@ -124,19 +125,15 @@ static int cmdline_init ( void ) {
DBGC ( colour, "RUNTIME found no command line\n" );
return 0;
}
cmdline_user = phys_to_user ( cmdline_phys );
len = ( strlen_user ( cmdline_user, 0 ) + 1 /* NUL */ );
/* Allocate and copy command line */
cmdline_copy = malloc ( len );
cmdline_copy = strdup ( phys_to_virt ( cmdline_phys ) );
if ( ! cmdline_copy ) {
DBGC ( colour, "RUNTIME could not allocate %zd bytes for "
"command line\n", len );
DBGC ( colour, "RUNTIME could not allocate command line\n" );
rc = -ENOMEM;
goto err_alloc_cmdline_copy;
}
cmdline = cmdline_copy;
copy_from_user ( cmdline, cmdline_user, 0, len );
DBGC ( colour, "RUNTIME found command line \"%s\" at %08x\n",
cmdline, cmdline_phys );
@ -151,7 +148,7 @@ static int cmdline_init ( void ) {
DBGC ( colour, "RUNTIME using command line \"%s\"\n", cmdline );
/* Prepare and register image */
cmdline_image.data = virt_to_user ( cmdline );
cmdline_image.data = cmdline;
cmdline_image.len = strlen ( cmdline );
if ( cmdline_image.len ) {
if ( ( rc = register_image ( &cmdline_image ) ) != 0 ) {
@ -193,7 +190,7 @@ static int initrd_init ( void ) {
initrd_phys, ( initrd_phys + initrd_len ) );
/* Create initrd image */
image = image_memory ( "<INITRD>", phys_to_user ( initrd_phys ),
image = image_memory ( "<INITRD>", phys_to_virt ( initrd_phys ),
initrd_len );
if ( ! image ) {
DBGC ( colour, "RUNTIME could not create initrd image\n" );


@ -109,5 +109,6 @@ struct console_driver vga_console __console_driver = {
};
struct init_fn video_init_fn __init_fn ( INIT_EARLY ) = {
.name = "video",
.initialise = video_init,
};


@ -23,6 +23,7 @@
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <string.h>
#include <ipxe/uaccess.h>
#include <ipxe/settings.h>
@ -47,12 +48,12 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
* @ret len Length of setting data, or negative error
*/
static int vram_fetch ( void *data, size_t len ) {
userptr_t vram = phys_to_user ( VRAM_BASE );
const void *vram = phys_to_virt ( VRAM_BASE );
/* Copy video RAM */
if ( len > VRAM_LEN )
len = VRAM_LEN;
copy_from_user ( data, vram, 0, len );
memcpy ( data, vram, len );
return VRAM_LEN;
}


@ -1,5 +1,5 @@
/*
* Copyright (C) 2014 Michael Brown <mbrown@fensystems.co.uk>.
* Copyright (C) 2025 Michael Brown <mbrown@fensystems.co.uk>.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
@ -29,41 +29,47 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
*
*/
#include <errno.h>
#include <ipxe/uart.h>
#include <string.h>
#include <ipxe/serial.h>
#include <ipxe/ns16550.h>
/** UART port bases */
static uint16_t uart_base[] = {
[COM1] = 0x3f8,
[COM2] = 0x2f8,
[COM3] = 0x3e8,
[COM4] = 0x2e8,
};
/** Define a fixed ISA UART */
#define ISA_UART( NAME, BASE ) \
static struct ns16550_uart ns16550_ ## NAME = { \
.base = ( ( void * ) (BASE) ), \
.clock = NS16550_CLK_DEFAULT, \
}; \
struct uart NAME = { \
.refcnt = REF_INIT ( ref_no_free ), \
.name = #NAME, \
.op = &ns16550_operations, \
.priv = &ns16550_ ## NAME, \
}
/* Fixed ISA UARTs */
ISA_UART ( com1, COM1_BASE );
ISA_UART ( com2, COM2_BASE );
ISA_UART ( com3, COM3_BASE );
ISA_UART ( com4, COM4_BASE );
/**
* Select UART port
* Register fixed ISA UARTs
*
* @v uart UART
* @v port Port number, or 0 to disable
* @ret rc Return status code
*/
int uart_select ( struct uart *uart, unsigned int port ) {
int uart_register_fixed ( void ) {
static struct uart *ports[] = { COM1, COM2, COM3, COM4 };
unsigned int i;
int rc;
/* Set new UART base */
if ( port >= ( sizeof ( uart_base ) / sizeof ( uart_base[0] ) ) ) {
rc = -ENODEV;
goto err;
/* Register all fixed ISA UARTs */
for ( i = 0 ; i < ( sizeof ( ports ) / sizeof ( ports[0] ) ) ; i++ ) {
if ( ( rc = uart_register ( ports[i] ) ) != 0 ) {
DBGC ( ports[i], "UART could not register %s: %s\n",
ports[i]->name, strerror ( rc ) );
return rc;
}
}
uart->base = ( ( void * ) ( intptr_t ) uart_base[port] );
/* Check that UART exists */
if ( ( rc = uart_exists ( uart ) ) != 0 )
goto err;
return 0;
err:
uart->base = NULL;
return rc;
}


@ -33,8 +33,16 @@ undiisr:
/* Check that we have an UNDI entry point */
cmpw $0, undinet_entry_point
je chain
/* Mask interrupt and set rearm flag */
movw undiisr_imr, %dx
inb %dx, %al
orb undiisr_bit, %al
outb %al, %dx
movb %al, undiisr_rearm
/* Issue UNDI API call */
movw %ds, %ax
movw %ax, %es
movw $undinet_params, %di
movw $PXENV_UNDI_ISR, %bx


@ -373,6 +373,18 @@ extern void undiisr ( void );
uint8_t __data16 ( undiisr_irq );
#define undiisr_irq __use_data16 ( undiisr_irq )
/** IRQ mask register */
uint16_t __data16 ( undiisr_imr );
#define undiisr_imr __use_data16 ( undiisr_imr )
/** IRQ mask bit */
uint8_t __data16 ( undiisr_bit );
#define undiisr_bit __use_data16 ( undiisr_bit )
/** IRQ rearm flag */
uint8_t __data16 ( undiisr_rearm );
#define undiisr_rearm __use_data16 ( undiisr_rearm )
/** IRQ chain vector */
struct segoff __data16 ( undiisr_next_handler );
#define undiisr_next_handler __use_data16 ( undiisr_next_handler )
@ -395,6 +407,9 @@ static void undinet_hook_isr ( unsigned int irq ) {
assert ( undiisr_irq == 0 );
undiisr_irq = irq;
undiisr_imr = IMR_REG ( irq );
undiisr_bit = IMR_BIT ( irq );
undiisr_rearm = 0;
hook_bios_interrupt ( IRQ_INT ( irq ), ( ( intptr_t ) undiisr ),
&undiisr_next_handler );
}
@ -588,6 +603,14 @@ static void undinet_poll ( struct net_device *netdev ) {
* support interrupts.
*/
if ( ! undinet_isr_triggered() ) {
/* Rearm interrupt if needed */
if ( undiisr_rearm ) {
undiisr_rearm = 0;
assert ( undinic->irq != 0 );
enable_irq ( undinic->irq );
}
/* Allow interrupt to occur */
profile_start ( &undinet_irq_profiler );
__asm__ __volatile__ ( "sti\n\t"
@ -838,15 +861,19 @@ static const struct undinet_irq_broken undinet_irq_broken_list[] = {
{ 0x8086, 0x1503, PCI_ANY_ID, PCI_ANY_ID },
/* HP 745 G3 laptop */
{ 0x14e4, 0x1687, PCI_ANY_ID, PCI_ANY_ID },
/* ASUSTeK KNPA-U16 server */
{ 0x8086, 0x1521, 0x1043, PCI_ANY_ID },
};
/**
* Check for devices with broken support for generating interrupts
*
* @v desc Device description
* @v netdev Net device
* @ret irq_is_broken Interrupt support is broken; no interrupts are generated
*/
static int undinet_irq_is_broken ( struct device_description *desc ) {
static int undinet_irq_is_broken ( struct net_device *netdev ) {
struct undi_nic *undinic = netdev->priv;
struct device_description *desc = &netdev->dev->desc;
const struct undinet_irq_broken *broken;
struct pci_device pci;
uint16_t subsys_vendor;
@ -872,9 +899,25 @@ static int undinet_irq_is_broken ( struct device_description *desc ) {
( broken->pci_subsys_vendor == PCI_ANY_ID ) ) &&
( ( broken->pci_subsys == subsys ) ||
( broken->pci_subsys == PCI_ANY_ID ) ) ) {
DBGC ( undinic, "UNDINIC %p %04x:%04x subsys "
"%04x:%04x has broken interrupts\n",
undinic, desc->vendor, desc->device,
subsys_vendor, subsys );
return 1;
}
}
/* Check for a PCI Express capability. Given the number of
* issues found with legacy INTx emulation on PCIe systems, we
* assume that there is a high chance of interrupts not
* working on any PCIe device.
*/
if ( pci_find_capability ( &pci, PCI_CAP_ID_EXP ) ) {
DBGC ( undinic, "UNDINIC %p is PCI Express: assuming "
"interrupts are unreliable\n", undinic );
return 1;
}
return 0;
}
@ -972,6 +1015,10 @@ int undinet_probe ( struct undi_device *undi, struct device *dev ) {
}
DBGC ( undinic, "UNDINIC %p has MAC address %s and IRQ %d\n",
undinic, eth_ntoa ( netdev->hw_addr ), undinic->irq );
if ( undinic->irq ) {
/* Sanity check - prefix should have disabled the IRQ */
assert ( ! irq_enabled ( undinic->irq ) );
}
/* Get interface information */
memset ( &undi_iface, 0, sizeof ( undi_iface ) );
@ -993,7 +1040,7 @@ int undinet_probe ( struct undi_device *undi, struct device *dev ) {
undinic );
undinic->hacks |= UNDI_HACK_EB54;
}
if ( undinet_irq_is_broken ( &dev->desc ) ) {
if ( undinet_irq_is_broken ( netdev ) ) {
DBGC ( undinic, "UNDINIC %p forcing polling mode due to "
"broken interrupts\n", undinic );
undinic->irq_supported = 0;


@ -25,6 +25,7 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <ipxe/malloc.h>
#include <ipxe/pci.h>

View File

@ -95,7 +95,4 @@ static int cpuid_exec ( int argc, char **argv ) {
}
/** x86 CPU feature detection command */
struct command cpuid_command __command = {
.name = "cpuid",
.exec = cpuid_exec,
};
COMMAND ( cpuid, cpuid_exec );

View File

@ -105,13 +105,5 @@ static int stoppxe_exec ( int argc __unused, char **argv __unused ) {
}
/** PXE commands */
struct command pxe_commands[] __command = {
{
.name = "startpxe",
.exec = startpxe_exec,
},
{
.name = "stoppxe",
.exec = stoppxe_exec,
},
};
COMMAND ( startpxe, startpxe_exec );
COMMAND ( stoppxe, stoppxe_exec );
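Judging from the call sites converted above, the new COMMAND() helper presumably expands to the same struct command registration that the removed blocks spelled out by hand. A hedged sketch of the likely shape (the real definition is assumed to live in include/ipxe/command.h):
/* Plausible expansion of the new registration macro (sketch only) */
#define COMMAND( _name, _exec )						\
	struct command _name ## _command __command = {			\
		.name = #_name,						\
		.exec = _exec,						\
	}

/* Usage, as seen above:  COMMAND ( cpuid, cpuid_exec ); */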

View File

@ -32,12 +32,13 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <assert.h>
#include <realmode.h>
#include <bzimage.h>
#include <initrd.h>
#include <ipxe/initrd.h>
#include <ipxe/uaccess.h>
#include <ipxe/image.h>
#include <ipxe/segment.h>
@ -56,7 +57,7 @@ struct bzimage_context {
/** Real-mode kernel portion load segment address */
unsigned int rm_kernel_seg;
/** Real-mode kernel portion load address */
userptr_t rm_kernel;
void *rm_kernel;
/** Real-mode kernel portion file size */
size_t rm_filesz;
/** Real-mode heap top (offset from rm_kernel) */
@ -68,7 +69,7 @@ struct bzimage_context {
/** Real-mode kernel portion total memory size */
size_t rm_memsz;
/** Non-real-mode kernel portion load address */
userptr_t pm_kernel;
void *pm_kernel;
/** Non-real-mode kernel portion file and memory size */
size_t pm_sz;
/** Video mode */
@ -76,14 +77,9 @@ struct bzimage_context {
/** Memory limit */
uint64_t mem_limit;
/** Initrd address */
physaddr_t ramdisk_image;
void *initrd;
/** Initrd size */
physaddr_t ramdisk_size;
/** Command line magic block */
struct bzimage_cmdline cmdline_magic;
/** bzImage header */
struct bzimage_header bzhdr;
physaddr_t initrd_size;
};
/**
@ -91,35 +87,31 @@ struct bzimage_context {
*
* @v image bzImage file
* @v bzimg bzImage context
* @v src bzImage to parse
* @ret rc Return status code
*/
static int bzimage_parse_header ( struct image *image,
struct bzimage_context *bzimg,
userptr_t src ) {
struct bzimage_context *bzimg ) {
const struct bzimage_header *bzhdr;
unsigned int syssize;
int is_bzimage;
/* Initialise context */
memset ( bzimg, 0, sizeof ( *bzimg ) );
/* Sanity check */
if ( image->len < ( BZI_HDR_OFFSET + sizeof ( bzimg->bzhdr ) ) ) {
DBGC ( image, "bzImage %p too short for kernel header\n",
image );
if ( image->len < ( BZI_HDR_OFFSET + sizeof ( *bzhdr ) ) ) {
DBGC ( image, "bzImage %s too short for kernel header\n",
image->name );
return -ENOEXEC;
}
/* Read in header structures */
memset ( bzimg, 0, sizeof ( *bzimg ) );
copy_from_user ( &bzimg->cmdline_magic, src, BZI_CMDLINE_OFFSET,
sizeof ( bzimg->cmdline_magic ) );
copy_from_user ( &bzimg->bzhdr, src, BZI_HDR_OFFSET,
sizeof ( bzimg->bzhdr ) );
bzhdr = ( image->data + BZI_HDR_OFFSET );
/* Calculate size of real-mode portion */
bzimg->rm_filesz = ( ( ( bzimg->bzhdr.setup_sects ?
bzimg->bzhdr.setup_sects : 4 ) + 1 ) << 9 );
bzimg->rm_filesz = ( ( ( bzhdr->setup_sects ?
bzhdr->setup_sects : 4 ) + 1 ) << 9 );
if ( bzimg->rm_filesz > image->len ) {
DBGC ( image, "bzImage %p too short for %zd byte of setup\n",
image, bzimg->rm_filesz );
DBGC ( image, "bzImage %s too short for %zd byte of setup\n",
image->name, bzimg->rm_filesz );
return -ENOEXEC;
}
bzimg->rm_memsz = BZI_ASSUMED_RM_SIZE;
@ -129,13 +121,14 @@ static int bzimage_parse_header ( struct image *image,
syssize = ( ( bzimg->pm_sz + 15 ) / 16 );
/* Check for signatures and determine version */
if ( bzimg->bzhdr.boot_flag != BZI_BOOT_FLAG ) {
DBGC ( image, "bzImage %p missing 55AA signature\n", image );
if ( bzhdr->boot_flag != BZI_BOOT_FLAG ) {
DBGC ( image, "bzImage %s missing 55AA signature\n",
image->name );
return -ENOEXEC;
}
if ( bzimg->bzhdr.header == BZI_SIGNATURE ) {
if ( bzhdr->header == BZI_SIGNATURE ) {
/* 2.00+ */
bzimg->version = bzimg->bzhdr.version;
bzimg->version = bzhdr->version;
} else {
/* Pre-2.00. Check that the syssize field is correct,
* as a guard against accepting arbitrary binary data,
@ -145,20 +138,21 @@ static int bzimage_parse_header ( struct image *image,
* check this field.
*/
bzimg->version = 0x0100;
if ( bzimg->bzhdr.syssize != syssize ) {
DBGC ( image, "bzImage %p bad syssize %x (expected "
"%x)\n", image, bzimg->bzhdr.syssize, syssize );
if ( bzhdr->syssize != syssize ) {
DBGC ( image, "bzImage %s bad syssize %x (expected "
"%x)\n", image->name, bzhdr->syssize,
syssize );
return -ENOEXEC;
}
}
/* Determine image type */
is_bzimage = ( ( bzimg->version >= 0x0200 ) ?
( bzimg->bzhdr.loadflags & BZI_LOAD_HIGH ) : 0 );
( bzhdr->loadflags & BZI_LOAD_HIGH ) : 0 );
/* Calculate load address of real-mode portion */
bzimg->rm_kernel_seg = ( is_bzimage ? 0x1000 : 0x9000 );
bzimg->rm_kernel = real_to_user ( bzimg->rm_kernel_seg, 0 );
bzimg->rm_kernel = real_to_virt ( bzimg->rm_kernel_seg, 0 );
/* Allow space for the stack and heap */
bzimg->rm_memsz += BZI_STACK_SIZE;
@ -169,24 +163,24 @@ static int bzimage_parse_header ( struct image *image,
bzimg->rm_memsz += BZI_CMDLINE_SIZE;
/* Calculate load address of protected-mode portion */
bzimg->pm_kernel = phys_to_user ( is_bzimage ? BZI_LOAD_HIGH_ADDR
bzimg->pm_kernel = phys_to_virt ( is_bzimage ? BZI_LOAD_HIGH_ADDR
: BZI_LOAD_LOW_ADDR );
/* Extract video mode */
bzimg->vid_mode = bzimg->bzhdr.vid_mode;
bzimg->vid_mode = bzhdr->vid_mode;
/* Extract memory limit */
bzimg->mem_limit = ( ( bzimg->version >= 0x0203 ) ?
bzimg->bzhdr.initrd_addr_max : BZI_INITRD_MAX );
bzhdr->initrd_addr_max : BZI_INITRD_MAX );
/* Extract command line size */
bzimg->cmdline_size = ( ( bzimg->version >= 0x0206 ) ?
bzimg->bzhdr.cmdline_size : BZI_CMDLINE_SIZE );
bzhdr->cmdline_size : BZI_CMDLINE_SIZE );
DBGC ( image, "bzImage %p version %04x RM %#lx+%#zx PM %#lx+%#zx "
"cmdlen %zd\n", image, bzimg->version,
user_to_phys ( bzimg->rm_kernel, 0 ), bzimg->rm_filesz,
user_to_phys ( bzimg->pm_kernel, 0 ), bzimg->pm_sz,
DBGC ( image, "bzImage %s version %04x RM %#lx+%#zx PM %#lx+%#zx "
"cmdlen %zd\n", image->name, bzimg->version,
virt_to_phys ( bzimg->rm_kernel ), bzimg->rm_filesz,
virt_to_phys ( bzimg->pm_kernel ), bzimg->pm_sz,
bzimg->cmdline_size );
return 0;
@ -197,49 +191,44 @@ static int bzimage_parse_header ( struct image *image,
*
* @v image bzImage file
* @v bzimg bzImage context
* @v dst bzImage to update
*/
static void bzimage_update_header ( struct image *image,
struct bzimage_context *bzimg,
userptr_t dst ) {
struct bzimage_context *bzimg ) {
struct bzimage_header *bzhdr = ( bzimg->rm_kernel + BZI_HDR_OFFSET );
struct bzimage_cmdline *cmdline;
/* Set loader type */
if ( bzimg->version >= 0x0200 )
bzimg->bzhdr.type_of_loader = BZI_LOADER_TYPE_IPXE;
bzhdr->type_of_loader = BZI_LOADER_TYPE_IPXE;
/* Set heap end pointer */
if ( bzimg->version >= 0x0201 ) {
bzimg->bzhdr.heap_end_ptr = ( bzimg->rm_heap - 0x200 );
bzimg->bzhdr.loadflags |= BZI_CAN_USE_HEAP;
bzhdr->heap_end_ptr = ( bzimg->rm_heap - 0x200 );
bzhdr->loadflags |= BZI_CAN_USE_HEAP;
}
/* Set command line */
if ( bzimg->version >= 0x0202 ) {
bzimg->bzhdr.cmd_line_ptr = user_to_phys ( bzimg->rm_kernel,
bzimg->rm_cmdline );
bzhdr->cmd_line_ptr = ( virt_to_phys ( bzimg->rm_kernel )
+ bzimg->rm_cmdline );
} else {
bzimg->cmdline_magic.magic = BZI_CMDLINE_MAGIC;
bzimg->cmdline_magic.offset = bzimg->rm_cmdline;
cmdline = ( bzimg->rm_kernel + BZI_CMDLINE_OFFSET );
cmdline->magic = BZI_CMDLINE_MAGIC;
cmdline->offset = bzimg->rm_cmdline;
if ( bzimg->version >= 0x0200 )
bzimg->bzhdr.setup_move_size = bzimg->rm_memsz;
bzhdr->setup_move_size = bzimg->rm_memsz;
}
/* Set video mode */
bzimg->bzhdr.vid_mode = bzimg->vid_mode;
bzhdr->vid_mode = bzimg->vid_mode;
DBGC ( image, "bzImage %s vidmode %d\n",
image->name, bzhdr->vid_mode );
/* Set initrd address */
if ( bzimg->version >= 0x0200 ) {
bzimg->bzhdr.ramdisk_image = bzimg->ramdisk_image;
bzimg->bzhdr.ramdisk_size = bzimg->ramdisk_size;
bzhdr->ramdisk_image = virt_to_phys ( bzimg->initrd );
bzhdr->ramdisk_size = bzimg->initrd_size;
}
/* Write out header structures */
copy_to_user ( dst, BZI_CMDLINE_OFFSET, &bzimg->cmdline_magic,
sizeof ( bzimg->cmdline_magic ) );
copy_to_user ( dst, BZI_HDR_OFFSET, &bzimg->bzhdr,
sizeof ( bzimg->bzhdr ) );
DBGC ( image, "bzImage %p vidmode %d\n", image, bzimg->vid_mode );
}
/**
@ -270,8 +259,9 @@ static int bzimage_parse_cmdline ( struct image *image,
} else {
bzimg->vid_mode = strtoul ( vga, &end, 0 );
if ( *end ) {
DBGC ( image, "bzImage %p strange \"vga=\" "
"terminator '%c'\n", image, *end );
DBGC ( image, "bzImage %s strange \"vga=\" "
"terminator '%c'\n",
image->name, *end );
}
}
if ( sep )
@ -298,8 +288,8 @@ static int bzimage_parse_cmdline ( struct image *image,
case ' ':
break;
default:
DBGC ( image, "bzImage %p strange \"mem=\" "
"terminator '%c'\n", image, *end );
DBGC ( image, "bzImage %s strange \"mem=\" "
"terminator '%c'\n", image->name, *end );
break;
}
bzimg->mem_limit -= 1;
@ -317,76 +307,13 @@ static int bzimage_parse_cmdline ( struct image *image,
static void bzimage_set_cmdline ( struct image *image,
struct bzimage_context *bzimg ) {
const char *cmdline = ( image->cmdline ? image->cmdline : "" );
size_t cmdline_len;
char *rm_cmdline;
/* Copy command line down to real-mode portion */
cmdline_len = ( strlen ( cmdline ) + 1 );
if ( cmdline_len > bzimg->cmdline_size )
cmdline_len = bzimg->cmdline_size;
copy_to_user ( bzimg->rm_kernel, bzimg->rm_cmdline,
cmdline, cmdline_len );
DBGC ( image, "bzImage %p command line \"%s\"\n", image, cmdline );
}
/**
* Align initrd length
*
* @v len Length
* @ret len Length rounded up to INITRD_ALIGN
*/
static inline size_t bzimage_align ( size_t len ) {
return ( ( len + INITRD_ALIGN - 1 ) & ~( INITRD_ALIGN - 1 ) );
}
/**
* Load initrd
*
* @v image bzImage image
* @v initrd initrd image
* @v address Address at which to load, or UNULL
* @ret len Length of loaded image, excluding zero-padding
*/
static size_t bzimage_load_initrd ( struct image *image,
struct image *initrd,
userptr_t address ) {
const char *filename = cpio_name ( initrd );
struct cpio_header cpio;
size_t offset;
size_t pad_len;
/* Skip hidden images */
if ( initrd->flags & IMAGE_HIDDEN )
return 0;
/* Create cpio header for non-prebuilt images */
offset = cpio_header ( initrd, &cpio );
/* Copy in initrd image body (and cpio header if applicable) */
if ( address ) {
memmove_user ( address, offset, initrd->data, 0, initrd->len );
if ( offset ) {
memset_user ( address, 0, 0, offset );
copy_to_user ( address, 0, &cpio, sizeof ( cpio ) );
copy_to_user ( address, sizeof ( cpio ), filename,
cpio_name_len ( initrd ) );
}
DBGC ( image, "bzImage %p initrd %p [%#08lx,%#08lx,%#08lx)"
"%s%s\n", image, initrd, user_to_phys ( address, 0 ),
user_to_phys ( address, offset ),
user_to_phys ( address, ( offset + initrd->len ) ),
( filename ? " " : "" ), ( filename ? filename : "" ) );
DBGC2_MD5A ( image, user_to_phys ( address, offset ),
user_to_virt ( address, offset ), initrd->len );
}
offset += initrd->len;
/* Zero-pad to next INITRD_ALIGN boundary */
pad_len = ( ( -offset ) & ( INITRD_ALIGN - 1 ) );
if ( address )
memset_user ( address, offset, 0, pad_len );
return offset;
rm_cmdline = ( bzimg->rm_kernel + bzimg->rm_cmdline );
snprintf ( rm_cmdline, bzimg->cmdline_size, "%s", cmdline );
DBGC ( image, "bzImage %s command line \"%s\"\n",
image->name, rm_cmdline );
}
/**
@ -398,48 +325,52 @@ static size_t bzimage_load_initrd ( struct image *image,
*/
static int bzimage_check_initrds ( struct image *image,
struct bzimage_context *bzimg ) {
struct image *initrd;
userptr_t bottom;
size_t len = 0;
struct memmap_region region;
physaddr_t min;
physaddr_t max;
physaddr_t dest;
int rc;
/* Calculate total loaded length of initrds */
for_each_image ( initrd ) {
bzimg->initrd_size = initrd_len();
/* Calculate length */
len += bzimage_load_initrd ( image, initrd, UNULL );
len = bzimage_align ( len );
/* Succeed if there are no initrds */
if ( ! bzimg->initrd_size )
return 0;
DBGC ( image, "bzImage %p initrd %p from [%#08lx,%#08lx)%s%s\n",
image, initrd, user_to_phys ( initrd->data, 0 ),
user_to_phys ( initrd->data, initrd->len ),
( initrd->cmdline ? " " : "" ),
( initrd->cmdline ? initrd->cmdline : "" ) );
DBGC2_MD5A ( image, user_to_phys ( initrd->data, 0 ),
user_to_virt ( initrd->data, 0 ), initrd->len );
}
/* Calculate lowest usable address */
bottom = userptr_add ( bzimg->pm_kernel, bzimg->pm_sz );
/* Check that total length fits within space available for
* reshuffling. This is a conservative check, since CPIO
* headers are not present during reshuffling, but this
* doesn't hurt and keeps the code simple.
*/
if ( ( rc = initrd_reshuffle_check ( len, bottom ) ) != 0 ) {
DBGC ( image, "bzImage %p failed reshuffle check: %s\n",
image, strerror ( rc ) );
/* Calculate available load region after reshuffling */
if ( ( rc = initrd_region ( bzimg->initrd_size, &region ) ) != 0 ) {
DBGC ( image, "bzImage %s no region for initrds: %s\n",
image->name, strerror ( rc ) );
return rc;
}
/* Check that total length fits within kernel's memory limit */
if ( user_to_phys ( bottom, len ) > bzimg->mem_limit ) {
DBGC ( image, "bzImage %p not enough space for initrds\n",
image );
/* Limit region to avoiding kernel itself */
min = virt_to_phys ( bzimg->pm_kernel + bzimg->pm_sz );
if ( min < region.min )
min = region.min;
/* Limit region to kernel's memory limit */
max = region.max;
if ( max > bzimg->mem_limit )
max = bzimg->mem_limit;
/* Calculate installation address */
if ( max < ( bzimg->initrd_size - 1 ) ) {
DBGC ( image, "bzImage %s not enough space for initrds\n",
image->name );
return -ENOBUFS;
}
dest = ( ( max + 1 - bzimg->initrd_size ) & ~( INITRD_ALIGN - 1 ) );
if ( dest < min ) {
DBGC ( image, "bzImage %s not enough space for initrds\n",
image->name );
return -ENOBUFS;
}
bzimg->initrd = phys_to_virt ( dest );
DBGC ( image, "bzImage %s loading initrds from %#08lx downwards\n",
image->name, max );
return 0;
}
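The placement arithmetic above picks the highest INITRD_ALIGN-aligned base that still fits the combined initrd length below both the memory-map ceiling and the kernel's initrd_addr_max (and then rejects the result if it would fall below the protected-mode kernel). A small self-contained worked example with made-up numbers; the 4 kB alignment is assumed purely for illustration:
/* Standalone illustration of the "align down from the top" placement */
#include <stdio.h>
#include <stdint.h>

#define EXAMPLE_ALIGN 0x1000ULL		/* assumed 4 kB alignment */

int main ( void ) {
	uint64_t max = 0x7fffffffULL;		/* highest usable byte */
	uint64_t size = 5 * 1024 * 1024;	/* total initrd length */
	uint64_t dest = ( ( max + 1 - size ) & ~( EXAMPLE_ALIGN - 1 ) );

	/* Prints 0x7fb00000: the highest aligned base that keeps the
	 * whole 5 MiB at or below the limit.
	 */
	printf ( "initrds load at %#llx\n", ( unsigned long long ) dest );
	return 0;
}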
@ -451,65 +382,21 @@ static int bzimage_check_initrds ( struct image *image,
*/
static void bzimage_load_initrds ( struct image *image,
struct bzimage_context *bzimg ) {
struct image *initrd;
struct image *highest = NULL;
struct image *other;
userptr_t top;
userptr_t dest;
size_t offset;
size_t len;
/* Reshuffle initrds into desired order */
initrd_reshuffle ( userptr_add ( bzimg->pm_kernel, bzimg->pm_sz ) );
/* Find highest initrd */
for_each_image ( initrd ) {
if ( ( highest == NULL ) ||
( userptr_sub ( initrd->data, highest->data ) > 0 ) ) {
highest = initrd;
}
}
/* Do nothing if there are no initrds */
if ( ! highest )
if ( ! bzimg->initrd )
return;
/* Find highest usable address */
top = userptr_add ( highest->data, bzimage_align ( highest->len ) );
if ( user_to_phys ( top, -1 ) > bzimg->mem_limit ) {
top = phys_to_user ( ( bzimg->mem_limit + 1 ) &
~( INITRD_ALIGN - 1 ) );
}
DBGC ( image, "bzImage %p loading initrds from %#08lx downwards\n",
image, user_to_phys ( top, -1 ) );
/* Reshuffle initrds into desired order */
initrd_reshuffle();
/* Load initrds in order */
for_each_image ( initrd ) {
/* Calculate cumulative length of following
* initrds (including padding).
*/
offset = 0;
for_each_image ( other ) {
if ( other == initrd )
offset = 0;
offset += bzimage_load_initrd ( image, other, UNULL );
offset = bzimage_align ( offset );
}
/* Load initrd at this address */
dest = userptr_add ( top, -offset );
len = bzimage_load_initrd ( image, initrd, dest );
/* Record initrd location */
if ( ! bzimg->ramdisk_image )
bzimg->ramdisk_image = user_to_phys ( dest, 0 );
bzimg->ramdisk_size = ( user_to_phys ( dest, len ) -
bzimg->ramdisk_image );
}
DBGC ( image, "bzImage %p initrds at [%#08lx,%#08lx)\n",
image, bzimg->ramdisk_image,
( bzimg->ramdisk_image + bzimg->ramdisk_size ) );
/* Load initrds */
DBGC ( image, "bzImage %s initrds at [%#08lx,%#08lx)\n",
image->name, virt_to_phys ( bzimg->initrd ),
( virt_to_phys ( bzimg->initrd ) + bzimg->initrd_size ) );
len = initrd_load_all ( bzimg->initrd );
assert ( len == bzimg->initrd_size );
}
/**
@ -523,21 +410,20 @@ static int bzimage_exec ( struct image *image ) {
int rc;
/* Read and parse header from image */
if ( ( rc = bzimage_parse_header ( image, &bzimg,
image->data ) ) != 0 )
if ( ( rc = bzimage_parse_header ( image, &bzimg ) ) != 0 )
return rc;
/* Prepare segments */
if ( ( rc = prep_segment ( bzimg.rm_kernel, bzimg.rm_filesz,
bzimg.rm_memsz ) ) != 0 ) {
DBGC ( image, "bzImage %p could not prepare RM segment: %s\n",
image, strerror ( rc ) );
DBGC ( image, "bzImage %s could not prepare RM segment: %s\n",
image->name, strerror ( rc ) );
return rc;
}
if ( ( rc = prep_segment ( bzimg.pm_kernel, bzimg.pm_sz,
bzimg.pm_sz ) ) != 0 ) {
DBGC ( image, "bzImage %p could not prepare PM segment: %s\n",
image, strerror ( rc ) );
DBGC ( image, "bzImage %s could not prepare PM segment: %s\n",
image->name, strerror ( rc ) );
return rc;
}
@ -553,10 +439,9 @@ static int bzimage_exec ( struct image *image ) {
unregister_image ( image_get ( image ) );
/* Load segments */
memcpy_user ( bzimg.rm_kernel, 0, image->data,
0, bzimg.rm_filesz );
memcpy_user ( bzimg.pm_kernel, 0, image->data,
bzimg.rm_filesz, bzimg.pm_sz );
memcpy ( bzimg.rm_kernel, image->data, bzimg.rm_filesz );
memcpy ( bzimg.pm_kernel, ( image->data + bzimg.rm_filesz ),
bzimg.pm_sz );
/* Store command line */
bzimage_set_cmdline ( image, &bzimg );
@ -570,10 +455,10 @@ static int bzimage_exec ( struct image *image ) {
bzimage_load_initrds ( image, &bzimg );
/* Update kernel header */
bzimage_update_header ( image, &bzimg, bzimg.rm_kernel );
bzimage_update_header ( image, &bzimg );
DBGC ( image, "bzImage %p jumping to RM kernel at %04x:0000 "
"(stack %04x:%04zx)\n", image, ( bzimg.rm_kernel_seg + 0x20 ),
DBGC ( image, "bzImage %s jumping to RM kernel at %04x:0000 (stack "
"%04x:%04zx)\n", image->name, ( bzimg.rm_kernel_seg + 0x20 ),
bzimg.rm_kernel_seg, bzimg.rm_heap );
/* Jump to the kernel */
@ -609,8 +494,7 @@ int bzimage_probe ( struct image *image ) {
int rc;
/* Read and parse header from image */
if ( ( rc = bzimage_parse_header ( image, &bzimg,
image->data ) ) != 0 )
if ( ( rc = bzimage_parse_header ( image, &bzimg ) ) != 0 )
return rc;
return 0;

View File

@ -39,7 +39,7 @@ FILE_LICENCE ( GPL2_OR_LATER );
#include <ipxe/image.h>
#include <ipxe/segment.h>
#include <ipxe/init.h>
#include <ipxe/io.h>
#include <ipxe/memmap.h>
#include <ipxe/console.h>
/**
@ -49,8 +49,7 @@ FILE_LICENCE ( GPL2_OR_LATER );
* @ret rc Return status code
*/
static int com32_exec_loop ( struct image *image ) {
struct memory_map memmap;
unsigned int i;
struct memmap_region region;
int state;
uint32_t avail_mem_top;
@ -59,21 +58,12 @@ static int com32_exec_loop ( struct image *image ) {
switch ( state ) {
case 0: /* First time through; invoke COM32 program */
/* Get memory map */
get_memmap ( &memmap );
/* Find end of block covering COM32 image loading area */
for ( i = 0, avail_mem_top = 0 ; i < memmap.count ; i++ ) {
if ( (memmap.regions[i].start <= COM32_START_PHYS) &&
(memmap.regions[i].end > COM32_START_PHYS + image->len) ) {
avail_mem_top = memmap.regions[i].end;
break;
}
}
DBGC ( image, "COM32 %p: available memory top = 0x%x\n",
image, avail_mem_top );
memmap_describe ( COM32_START_PHYS, 1, &region );
assert ( memmap_is_usable ( &region ) );
avail_mem_top = ( COM32_START_PHYS + memmap_size ( &region ) );
DBGC ( image, "COM32 %s: available memory top = 0x%x\n",
image->name, avail_mem_top );
assert ( avail_mem_top != 0 );
/* Hook COMBOOT API interrupts */
@ -114,32 +104,32 @@ static int com32_exec_loop ( struct image *image ) {
/* Restore registers */
"popal\n\t" )
:
: "r" ( avail_mem_top ),
"r" ( virt_to_phys ( com32_cfarcall_wrapper ) ),
"r" ( virt_to_phys ( com32_farcall_wrapper ) ),
"r" ( get_fbms() * 1024 - ( COM32_BOUNCE_SEG << 4 ) ),
: "R" ( avail_mem_top ),
"R" ( virt_to_phys ( com32_cfarcall_wrapper ) ),
"R" ( virt_to_phys ( com32_farcall_wrapper ) ),
"R" ( get_fbms() * 1024 - ( COM32_BOUNCE_SEG << 4 ) ),
"i" ( COM32_BOUNCE_SEG << 4 ),
"r" ( virt_to_phys ( com32_intcall_wrapper ) ),
"r" ( virt_to_phys ( image->cmdline ?
"R" ( virt_to_phys ( com32_intcall_wrapper ) ),
"R" ( virt_to_phys ( image->cmdline ?
image->cmdline : "" ) ),
"i" ( COM32_START_PHYS )
: "memory" );
DBGC ( image, "COM32 %p: returned\n", image );
DBGC ( image, "COM32 %s: returned\n", image->name );
break;
case COMBOOT_EXIT:
DBGC ( image, "COM32 %p: exited\n", image );
DBGC ( image, "COM32 %s: exited\n", image->name );
break;
case COMBOOT_EXIT_RUN_KERNEL:
assert ( image->replacement );
DBGC ( image, "COM32 %p: exited to run kernel %s\n",
image, image->replacement->name );
DBGC ( image, "COM32 %s: exited to run kernel %s\n",
image->name, image->replacement->name );
break;
case COMBOOT_EXIT_COMMAND:
DBGC ( image, "COM32 %p: exited after executing command\n",
image );
DBGC ( image, "COM32 %s: exited after executing command\n",
image->name );
break;
default:
@ -162,17 +152,15 @@ static int com32_exec_loop ( struct image *image ) {
static int com32_identify ( struct image *image ) {
const char *ext;
static const uint8_t magic[] = { 0xB8, 0xFF, 0x4C, 0xCD, 0x21 };
uint8_t buf[5];
if ( image->len >= 5 ) {
if ( image->len >= sizeof ( magic ) ) {
/* Check for magic number
* mov eax,21cd4cffh
* B8 FF 4C CD 21
*/
copy_from_user ( buf, image->data, 0, sizeof(buf) );
if ( ! memcmp ( buf, magic, sizeof(buf) ) ) {
DBGC ( image, "COM32 %p: found magic number\n",
image );
if ( memcmp ( image->data, magic, sizeof ( magic ) ) == 0 ) {
DBGC ( image, "COM32 %s: found magic number\n",
image->name );
return 0;
}
}
@ -182,16 +170,16 @@ static int com32_identify ( struct image *image ) {
ext = strrchr( image->name, '.' );
if ( ! ext ) {
DBGC ( image, "COM32 %p: no extension\n",
image );
DBGC ( image, "COM32 %s: no extension\n",
image->name );
return -ENOEXEC;
}
++ext;
if ( strcasecmp( ext, "c32" ) ) {
DBGC ( image, "COM32 %p: unrecognized extension %s\n",
image, ext );
DBGC ( image, "COM32 %s: unrecognized extension %s\n",
image->name, ext );
return -ENOEXEC;
}
@ -206,20 +194,20 @@ static int com32_identify ( struct image *image ) {
*/
static int com32_load_image ( struct image *image ) {
size_t filesz, memsz;
userptr_t buffer;
void *buffer;
int rc;
filesz = image->len;
memsz = filesz;
buffer = phys_to_user ( COM32_START_PHYS );
buffer = phys_to_virt ( COM32_START_PHYS );
if ( ( rc = prep_segment ( buffer, filesz, memsz ) ) != 0 ) {
DBGC ( image, "COM32 %p: could not prepare segment: %s\n",
image, strerror ( rc ) );
DBGC ( image, "COM32 %s: could not prepare segment: %s\n",
image->name, strerror ( rc ) );
return rc;
}
/* Copy image to segment */
memcpy_user ( buffer, 0, image->data, 0, filesz );
memcpy ( buffer, image->data, filesz );
return 0;
}
@ -230,22 +218,20 @@ static int com32_load_image ( struct image *image ) {
* @ret rc Return status code
*/
static int com32_prepare_bounce_buffer ( struct image * image ) {
unsigned int seg;
userptr_t seg_userptr;
void *seg;
size_t filesz, memsz;
int rc;
seg = COM32_BOUNCE_SEG;
seg_userptr = real_to_user ( seg, 0 );
seg = real_to_virt ( COM32_BOUNCE_SEG, 0 );
/* Ensure the entire 64k segment is free */
memsz = 0xFFFF;
filesz = 0;
/* Prepare, verify, and load the real-mode segment */
if ( ( rc = prep_segment ( seg_userptr, filesz, memsz ) ) != 0 ) {
DBGC ( image, "COM32 %p: could not prepare bounce buffer segment: %s\n",
image, strerror ( rc ) );
if ( ( rc = prep_segment ( seg, filesz, memsz ) ) != 0 ) {
DBGC ( image, "COM32 %s: could not prepare bounce buffer segment: %s\n",
image->name, strerror ( rc ) );
return rc;
}
@ -261,8 +247,6 @@ static int com32_prepare_bounce_buffer ( struct image * image ) {
static int com32_probe ( struct image *image ) {
int rc;
DBGC ( image, "COM32 %p: name '%s'\n", image, image->name );
/* Check if this is a COMBOOT image */
if ( ( rc = com32_identify ( image ) ) != 0 ) {
return rc;

View File

@ -35,7 +35,6 @@ FILE_LICENCE ( GPL2_OR_LATER );
#include <realmode.h>
#include <basemem.h>
#include <comboot.h>
#include <ipxe/uaccess.h>
#include <ipxe/image.h>
#include <ipxe/segment.h>
#include <ipxe/init.h>
@ -67,62 +66,53 @@ struct comboot_psp {
*
* @v image COMBOOT image
*/
static void comboot_copy_cmdline ( struct image * image, userptr_t seg_userptr ) {
static void comboot_copy_cmdline ( struct image * image, void *seg ) {
const char *cmdline = ( image->cmdline ? image->cmdline : "" );
int cmdline_len = strlen ( cmdline );
uint8_t *psp_cmdline;
/* Limit length of command line */
if( cmdline_len > COMBOOT_MAX_CMDLINE_LEN )
cmdline_len = COMBOOT_MAX_CMDLINE_LEN;
uint8_t len_byte = cmdline_len;
char spc = ' ', cr = '\r';
/* Copy length to byte before command line */
copy_to_user ( seg_userptr, COMBOOT_PSP_CMDLINE_OFFSET - 1,
&len_byte, 1 );
psp_cmdline = ( seg + COMBOOT_PSP_CMDLINE_OFFSET );
psp_cmdline[-1] = cmdline_len;
/* Command line starts with space */
copy_to_user ( seg_userptr,
COMBOOT_PSP_CMDLINE_OFFSET,
&spc, 1 );
psp_cmdline[0] = ' ';
/* Copy command line */
copy_to_user ( seg_userptr,
COMBOOT_PSP_CMDLINE_OFFSET + 1,
cmdline, cmdline_len );
memcpy ( &psp_cmdline[1], cmdline, cmdline_len );
/* Command line ends with CR */
copy_to_user ( seg_userptr,
COMBOOT_PSP_CMDLINE_OFFSET + cmdline_len + 1,
&cr, 1 );
psp_cmdline[ 1 + cmdline_len ] = '\r';
}
/**
* Initialize PSP
*
* @v image COMBOOT image
* @v seg_userptr segment to initialize
* @v seg segment to initialize
*/
static void comboot_init_psp ( struct image * image, userptr_t seg_userptr ) {
struct comboot_psp psp;
static void comboot_init_psp ( struct image * image, void *seg ) {
struct comboot_psp *psp;
/* Fill PSP */
psp = seg;
/* INT 20h instruction, byte order reversed */
psp.int20 = 0x20CD;
psp->int20 = 0x20CD;
/* get_fbms() returns BIOS free base memory counter, which is in
* kilobytes; x * 1024 / 16 == x * 64 == x << 6 */
psp.first_non_free_para = get_fbms() << 6;
psp->first_non_free_para = get_fbms() << 6;
DBGC ( image, "COMBOOT %p: first non-free paragraph = 0x%x\n",
image, psp.first_non_free_para );
/* Copy the PSP to offset 0 of segment.
* The rest of the PSP was already zeroed by
* comboot_prepare_segment. */
copy_to_user ( seg_userptr, 0, &psp, sizeof( psp ) );
DBGC ( image, "COMBOOT %s: first non-free paragraph = 0x%x\n",
image->name, psp->first_non_free_para );
/* Copy the command line to the PSP */
comboot_copy_cmdline ( image, seg_userptr );
comboot_copy_cmdline ( image, seg );
}
/**
@ -132,7 +122,7 @@ static void comboot_init_psp ( struct image * image, userptr_t seg_userptr ) {
* @ret rc Return status code
*/
static int comboot_exec_loop ( struct image *image ) {
userptr_t seg_userptr = real_to_user ( COMBOOT_PSP_SEG, 0 );
void *seg = real_to_virt ( COMBOOT_PSP_SEG, 0 );
int state;
state = rmsetjmp ( comboot_return );
@ -141,7 +131,7 @@ static int comboot_exec_loop ( struct image *image ) {
case 0: /* First time through; invoke COMBOOT program */
/* Initialize PSP */
comboot_init_psp ( image, seg_userptr );
comboot_init_psp ( image, seg );
/* Hook COMBOOT API interrupts */
hook_comboot_interrupts();
@ -181,23 +171,23 @@ static int comboot_exec_loop ( struct image *image ) {
"xorw %%di, %%di\n\t"
"xorw %%bp, %%bp\n\t"
"lret\n\t" )
: : "r" ( COMBOOT_PSP_SEG ) : "eax" );
DBGC ( image, "COMBOOT %p: returned\n", image );
: : "R" ( COMBOOT_PSP_SEG ) : "eax" );
DBGC ( image, "COMBOOT %s: returned\n", image->name );
break;
case COMBOOT_EXIT:
DBGC ( image, "COMBOOT %p: exited\n", image );
DBGC ( image, "COMBOOT %s: exited\n", image->name );
break;
case COMBOOT_EXIT_RUN_KERNEL:
assert ( image->replacement );
DBGC ( image, "COMBOOT %p: exited to run kernel %s\n",
image, image->replacement->name );
DBGC ( image, "COMBOOT %s: exited to run kernel %s\n",
image->name, image->replacement->name );
break;
case COMBOOT_EXIT_COMMAND:
DBGC ( image, "COMBOOT %p: exited after executing command\n",
image );
DBGC ( image, "COMBOOT %s: exited after executing command\n",
image->name );
break;
default:
@ -223,16 +213,16 @@ static int comboot_identify ( struct image *image ) {
ext = strrchr( image->name, '.' );
if ( ! ext ) {
DBGC ( image, "COMBOOT %p: no extension\n",
image );
DBGC ( image, "COMBOOT %s: no extension\n",
image->name );
return -ENOEXEC;
}
++ext;
if ( strcasecmp( ext, "cbt" ) ) {
DBGC ( image, "COMBOOT %p: unrecognized extension %s\n",
image, ext );
DBGC ( image, "COMBOOT %s: unrecognized extension %s\n",
image->name, ext );
return -ENOEXEC;
}
@ -246,12 +236,12 @@ static int comboot_identify ( struct image *image ) {
*/
static int comboot_prepare_segment ( struct image *image )
{
userptr_t seg_userptr;
void *seg;
size_t filesz, memsz;
int rc;
/* Load image in segment */
seg_userptr = real_to_user ( COMBOOT_PSP_SEG, 0 );
seg = real_to_virt ( COMBOOT_PSP_SEG, 0 );
/* Allow extra 0x100 bytes before image for PSP */
filesz = image->len + 0x100;
@ -260,17 +250,17 @@ static int comboot_prepare_segment ( struct image *image )
memsz = 0xFFFF;
/* Prepare, verify, and load the real-mode segment */
if ( ( rc = prep_segment ( seg_userptr, filesz, memsz ) ) != 0 ) {
DBGC ( image, "COMBOOT %p: could not prepare segment: %s\n",
image, strerror ( rc ) );
if ( ( rc = prep_segment ( seg, filesz, memsz ) ) != 0 ) {
DBGC ( image, "COMBOOT %s: could not prepare segment: %s\n",
image->name, strerror ( rc ) );
return rc;
}
/* Zero PSP */
memset_user ( seg_userptr, 0, 0, 0x100 );
memset ( seg, 0, 0x100 );
/* Copy image to segment:0100 */
memcpy_user ( seg_userptr, 0x100, image->data, 0, image->len );
memcpy ( ( seg + 0x100 ), image->data, image->len );
return 0;
}
@ -284,9 +274,6 @@ static int comboot_prepare_segment ( struct image *image )
static int comboot_probe ( struct image *image ) {
int rc;
DBGC ( image, "COMBOOT %p: name '%s'\n",
image, image->name );
/* Check if this is a COMBOOT image */
if ( ( rc = comboot_identify ( image ) ) != 0 ) {
@ -307,8 +294,8 @@ static int comboot_exec ( struct image *image ) {
/* Sanity check for filesize */
if( image->len >= 0xFF00 ) {
DBGC( image, "COMBOOT %p: image too large\n",
image );
DBGC( image, "COMBOOT %s: image too large\n",
image->name );
return -ENOEXEC;
}

View File

@ -23,8 +23,10 @@
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <string.h>
#include <errno.h>
#include <elf.h>
#include <librm.h>
#include <ipxe/image.h>
#include <ipxe/elf.h>
#include <ipxe/features.h>
@ -52,8 +54,8 @@ static int elfboot_exec ( struct image *image ) {
/* Load the image using core ELF support */
if ( ( rc = elf_load ( image, &entry, &max ) ) != 0 ) {
DBGC ( image, "ELF %p could not load: %s\n",
image, strerror ( rc ) );
DBGC ( image, "ELF %s could not load: %s\n",
image->name, strerror ( rc ) );
return rc;
}
@ -63,14 +65,15 @@ static int elfboot_exec ( struct image *image ) {
shutdown_boot();
/* Jump to OS with flat physical addressing */
DBGC ( image, "ELF %p starting execution at %lx\n", image, entry );
DBGC ( image, "ELF %s starting execution at %lx\n",
image->name, entry );
__asm__ __volatile__ ( PHYS_CODE ( "pushl %%ebp\n\t" /* gcc bug */
"call *%%edi\n\t"
"popl %%ebp\n\t" /* gcc bug */ )
: : "D" ( entry )
: "eax", "ebx", "ecx", "edx", "esi", "memory" );
DBGC ( image, "ELF %p returned\n", image );
DBGC ( image, "ELF %s returned\n", image->name );
/* It isn't safe to continue after calling shutdown() */
while ( 1 ) {}
@ -86,13 +89,13 @@ static int elfboot_exec ( struct image *image ) {
* @v dest Destination address
* @ret rc Return status code
*/
static int elfboot_check_segment ( struct image *image, Elf_Phdr *phdr,
static int elfboot_check_segment ( struct image *image, const Elf_Phdr *phdr,
physaddr_t dest ) {
/* Check that ELF segment uses flat physical addressing */
if ( phdr->p_vaddr != dest ) {
DBGC ( image, "ELF %p uses virtual addressing (phys %x, "
"virt %x)\n", image, phdr->p_paddr, phdr->p_vaddr );
DBGC ( image, "ELF %s uses virtual addressing (phys %x, virt "
"%x)\n", image->name, phdr->p_paddr, phdr->p_vaddr );
return -ENOEXEC;
}
@ -106,7 +109,7 @@ static int elfboot_check_segment ( struct image *image, Elf_Phdr *phdr,
* @ret rc Return status code
*/
static int elfboot_probe ( struct image *image ) {
Elf32_Ehdr ehdr;
const Elf32_Ehdr *ehdr;
static const uint8_t e_ident[] = {
[EI_MAG0] = ELFMAG0,
[EI_MAG1] = ELFMAG1,
@ -121,16 +124,22 @@ static int elfboot_probe ( struct image *image ) {
int rc;
/* Read ELF header */
copy_from_user ( &ehdr, image->data, 0, sizeof ( ehdr ) );
if ( memcmp ( ehdr.e_ident, e_ident, sizeof ( e_ident ) ) != 0 ) {
DBGC ( image, "Invalid ELF identifier\n" );
if ( image->len < sizeof ( *ehdr ) ) {
DBGC ( image, "ELF %s too short for ELF header\n",
image->name );
return -ENOEXEC;
}
ehdr = image->data;
if ( memcmp ( ehdr->e_ident, e_ident, sizeof ( e_ident ) ) != 0 ) {
DBGC ( image, "ELF %s invalid identifier\n", image->name );
return -ENOEXEC;
}
/* Check that this image uses flat physical addressing */
if ( ( rc = elf_segments ( image, &ehdr, elfboot_check_segment,
if ( ( rc = elf_segments ( image, ehdr, elfboot_check_segment,
&entry, &max ) ) != 0 ) {
DBGC ( image, "Unloadable ELF image\n" );
DBGC ( image, "ELF %s is not loadable: %s\n",
image->name, strerror ( rc ) );
return rc;
}
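The same access pattern recurs in each converted loader: bounds-check image->len first, then read the header in place through a const pointer rather than bouncing it through copy_from_user(). A minimal generic sketch (the header type and function name are placeholders, not iPXE types):
/* Generic shape of the in-place header access used above (sketch).
 * The header type and function name are placeholders for illustration.
 */
struct some_header {
	uint32_t magic;
	uint32_t flags;
};

static const struct some_header *
image_header ( struct image *image, size_t offset ) {
	/* Reject images too short to contain the header at all */
	if ( ( offset + sizeof ( struct some_header ) ) > image->len )
		return NULL;
	/* image->data is now a directly addressable const pointer */
	return ( image->data + offset );
}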

View File

@ -1,306 +0,0 @@
/*
* Copyright (C) 2012 Michael Brown <mbrown@fensystems.co.uk>.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
* 02110-1301, USA.
*
* You can also choose to distribute this program under the terms of
* the Unmodified Binary Distribution Licence (as given in the file
* COPYING.UBDL), provided that you have satisfied its requirements.
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <errno.h>
#include <initrd.h>
#include <ipxe/image.h>
#include <ipxe/uaccess.h>
#include <ipxe/init.h>
#include <ipxe/memblock.h>
#include <ipxe/cpio.h>
/** @file
*
* Initial ramdisk (initrd) reshuffling
*
*/
/** Maximum address available for initrd */
userptr_t initrd_top;
/** Minimum address available for initrd */
userptr_t initrd_bottom;
/**
* Squash initrds as high as possible in memory
*
* @v top Highest possible address
* @ret used Lowest address used by initrds
*/
static userptr_t initrd_squash_high ( userptr_t top ) {
userptr_t current = top;
struct image *initrd;
struct image *highest;
size_t len;
/* Squash up any initrds already within or below the region */
while ( 1 ) {
/* Find the highest image not yet in its final position */
highest = NULL;
for_each_image ( initrd ) {
if ( ( userptr_sub ( initrd->data, current ) < 0 ) &&
( ( highest == NULL ) ||
( userptr_sub ( initrd->data,
highest->data ) > 0 ) ) ) {
highest = initrd;
}
}
if ( ! highest )
break;
/* Move this image to its final position */
len = ( ( highest->len + INITRD_ALIGN - 1 ) &
~( INITRD_ALIGN - 1 ) );
current = userptr_sub ( current, len );
DBGC ( &images, "INITRD squashing %s [%#08lx,%#08lx)->"
"[%#08lx,%#08lx)\n", highest->name,
user_to_phys ( highest->data, 0 ),
user_to_phys ( highest->data, highest->len ),
user_to_phys ( current, 0 ),
user_to_phys ( current, highest->len ) );
memmove_user ( current, 0, highest->data, 0, highest->len );
highest->data = current;
}
/* Copy any remaining initrds (e.g. embedded images) to the region */
for_each_image ( initrd ) {
if ( userptr_sub ( initrd->data, top ) >= 0 ) {
len = ( ( initrd->len + INITRD_ALIGN - 1 ) &
~( INITRD_ALIGN - 1 ) );
current = userptr_sub ( current, len );
DBGC ( &images, "INITRD copying %s [%#08lx,%#08lx)->"
"[%#08lx,%#08lx)\n", initrd->name,
user_to_phys ( initrd->data, 0 ),
user_to_phys ( initrd->data, initrd->len ),
user_to_phys ( current, 0 ),
user_to_phys ( current, initrd->len ) );
memcpy_user ( current, 0, initrd->data, 0,
initrd->len );
initrd->data = current;
}
}
return current;
}
/**
* Swap position of two adjacent initrds
*
* @v low Lower initrd
* @v high Higher initrd
* @v free Free space
* @v free_len Length of free space
*/
static void initrd_swap ( struct image *low, struct image *high,
userptr_t free, size_t free_len ) {
size_t len = 0;
size_t frag_len;
size_t new_len;
DBGC ( &images, "INITRD swapping %s [%#08lx,%#08lx)<->[%#08lx,%#08lx) "
"%s\n", low->name, user_to_phys ( low->data, 0 ),
user_to_phys ( low->data, low->len ),
user_to_phys ( high->data, 0 ),
user_to_phys ( high->data, high->len ), high->name );
/* Round down length of free space */
free_len &= ~( INITRD_ALIGN - 1 );
assert ( free_len > 0 );
/* Swap image data */
while ( len < high->len ) {
/* Calculate maximum fragment length */
frag_len = ( high->len - len );
if ( frag_len > free_len )
frag_len = free_len;
new_len = ( ( len + frag_len + INITRD_ALIGN - 1 ) &
~( INITRD_ALIGN - 1 ) );
/* Swap fragments */
memcpy_user ( free, 0, high->data, len, frag_len );
memmove_user ( low->data, new_len, low->data, len, low->len );
memcpy_user ( low->data, len, free, 0, frag_len );
len = new_len;
}
/* Adjust data pointers */
high->data = low->data;
low->data = userptr_add ( low->data, len );
}
/**
* Swap position of any two adjacent initrds not currently in the correct order
*
* @v free Free space
* @v free_len Length of free space
* @ret swapped A pair of initrds was swapped
*/
static int initrd_swap_any ( userptr_t free, size_t free_len ) {
struct image *low;
struct image *high;
size_t padded_len;
userptr_t adjacent;
/* Find any pair of initrds that can be swapped */
for_each_image ( low ) {
/* Calculate location of adjacent image (if any) */
padded_len = ( ( low->len + INITRD_ALIGN - 1 ) &
~( INITRD_ALIGN - 1 ) );
adjacent = userptr_add ( low->data, padded_len );
/* Search for adjacent image */
for_each_image ( high ) {
/* Stop search if all remaining potential
* adjacent images are already in the correct
* order.
*/
if ( high == low )
break;
/* If we have found the adjacent image, swap and exit */
if ( high->data == adjacent ) {
initrd_swap ( low, high, free, free_len );
return 1;
}
}
}
/* Nothing swapped */
return 0;
}
/**
* Dump initrd locations (for debug)
*
*/
static void initrd_dump ( void ) {
struct image *initrd;
/* Do nothing unless debugging is enabled */
if ( ! DBG_LOG )
return;
/* Dump initrd locations */
for_each_image ( initrd ) {
DBGC ( &images, "INITRD %s at [%#08lx,%#08lx)\n",
initrd->name, user_to_phys ( initrd->data, 0 ),
user_to_phys ( initrd->data, initrd->len ) );
DBGC2_MD5A ( &images, user_to_phys ( initrd->data, 0 ),
user_to_virt ( initrd->data, 0 ), initrd->len );
}
}
/**
* Reshuffle initrds into desired order at top of memory
*
* @v bottom Lowest address available for initrds
*
* After this function returns, the initrds have been rearranged in
* memory and the external heap structures will have been corrupted.
* Reshuffling must therefore take place immediately prior to jumping
* to the loaded OS kernel; no further execution within iPXE is
* permitted.
*/
void initrd_reshuffle ( userptr_t bottom ) {
userptr_t top;
userptr_t used;
userptr_t free;
size_t free_len;
/* Calculate limits of available space for initrds */
top = initrd_top;
if ( userptr_sub ( initrd_bottom, bottom ) > 0 )
bottom = initrd_bottom;
/* Debug */
DBGC ( &images, "INITRD region [%#08lx,%#08lx)\n",
user_to_phys ( bottom, 0 ), user_to_phys ( top, 0 ) );
initrd_dump();
/* Squash initrds as high as possible in memory */
used = initrd_squash_high ( top );
/* Calculate available free space */
free = bottom;
free_len = userptr_sub ( used, free );
/* Bubble-sort initrds into desired order */
while ( initrd_swap_any ( free, free_len ) ) {}
/* Debug */
initrd_dump();
}
/**
* Check that there is enough space to reshuffle initrds
*
* @v len Total length of initrds (including padding)
* @v bottom Lowest address available for initrds
* @ret rc Return status code
*/
int initrd_reshuffle_check ( size_t len, userptr_t bottom ) {
userptr_t top;
size_t available;
/* Calculate limits of available space for initrds */
top = initrd_top;
if ( userptr_sub ( initrd_bottom, bottom ) > 0 )
bottom = initrd_bottom;
available = userptr_sub ( top, bottom );
/* Allow for a sensible minimum amount of free space */
len += INITRD_MIN_FREE_LEN;
/* Check for available space */
return ( ( len < available ) ? 0 : -ENOBUFS );
}
/**
* initrd startup function
*
*/
static void initrd_startup ( void ) {
size_t len;
/* Record largest memory block available. Do this after any
* allocations made during driver startup (e.g. large host
* memory blocks for Infiniband devices, which may still be in
* use at the time of rearranging if a SAN device is hooked)
* but before any allocations for downloaded images (which we
* can safely reuse when rearranging).
*/
len = largest_memblock ( &initrd_bottom );
initrd_top = userptr_add ( initrd_bottom, len );
}
/** initrd startup function */
struct startup_fn startup_initrd __startup_fn ( STARTUP_LATE ) = {
.name = "initrd",
.startup = initrd_startup,
};

View File

@ -31,14 +31,14 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
*/
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <assert.h>
#include <realmode.h>
#include <multiboot.h>
#include <ipxe/uaccess.h>
#include <ipxe/image.h>
#include <ipxe/segment.h>
#include <ipxe/io.h>
#include <ipxe/memmap.h>
#include <ipxe/elf.h>
#include <ipxe/init.h>
#include <ipxe/features.h>
@ -59,6 +59,9 @@ FEATURE ( FEATURE_IMAGE, "MBOOT", DHCP_EB_FEATURE_MULTIBOOT, 1 );
*/
#define MAX_MODULES 8
/** Maximum number of memory map entries */
#define MAX_MEMMAP 8
/**
* Maximum combined length of command lines
*
@ -87,14 +90,6 @@ FEATURE ( FEATURE_IMAGE, "MBOOT", DHCP_EB_FEATURE_MULTIBOOT, 1 );
*/
#define MB_UNSUPPORTED_FLAGS ( MB_COMPULSORY_FLAGS & ~MB_SUPPORTED_FLAGS )
/** A multiboot header descriptor */
struct multiboot_header_info {
/** The actual multiboot header */
struct multiboot_header mb;
/** Offset of header within the multiboot image */
size_t offset;
};
/** Multiboot module command lines */
static char __bss16_array ( mb_cmdlines, [MB_MAX_CMDLINE] );
#define mb_cmdlines __use_data16 ( mb_cmdlines )
@ -114,32 +109,43 @@ static void multiboot_build_memmap ( struct image *image,
struct multiboot_info *mbinfo,
struct multiboot_memory_map *mbmemmap,
unsigned int limit ) {
struct memory_map memmap;
unsigned int i;
/* Get memory map */
get_memmap ( &memmap );
struct memmap_region region;
unsigned int remaining;
/* Translate into multiboot format */
memset ( mbmemmap, 0, sizeof ( *mbmemmap ) );
for ( i = 0 ; i < memmap.count ; i++ ) {
if ( i >= limit ) {
DBGC ( image, "MULTIBOOT %p limit of %d memmap "
"entries reached\n", image, limit );
remaining = limit;
for_each_memmap ( &region, 0 ) {
/* Ignore any non-memory regions */
if ( ! ( region.flags & MEMMAP_FL_MEMORY ) )
continue;
DBGC_MEMMAP ( image, &region );
/* Check Multiboot memory map limit */
if ( ! remaining ) {
DBGC ( image, "MULTIBOOT %s limit of %d memmap "
"entries reached\n", image->name, limit );
break;
}
mbmemmap[i].size = ( sizeof ( mbmemmap[i] ) -
sizeof ( mbmemmap[i].size ) );
mbmemmap[i].base_addr = memmap.regions[i].start;
mbmemmap[i].length = ( memmap.regions[i].end -
memmap.regions[i].start );
mbmemmap[i].type = MBMEM_RAM;
mbinfo->mmap_length += sizeof ( mbmemmap[i] );
if ( memmap.regions[i].start == 0 )
mbinfo->mem_lower = ( memmap.regions[i].end / 1024 );
if ( memmap.regions[i].start == 0x100000 )
mbinfo->mem_upper = ( ( memmap.regions[i].end -
0x100000 ) / 1024 );
/* Populate Multiboot memory map entry */
mbmemmap->size = ( sizeof ( *mbmemmap ) -
sizeof ( mbmemmap->size ) );
mbmemmap->base_addr = region.min;
mbmemmap->length = memmap_size ( &region );
mbmemmap->type = MBMEM_RAM;
/* Update Multiboot information */
mbinfo->mmap_length += sizeof ( *mbmemmap );
if ( mbmemmap->base_addr == 0 )
mbinfo->mem_lower = ( mbmemmap->length / 1024 );
if ( mbmemmap->base_addr == 0x100000 )
mbinfo->mem_upper = ( mbmemmap->length / 1024 );
/* Move to next Multiboot memory map entry */
mbmemmap++;
remaining--;
}
}
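The rewritten map builder walks iPXE's generic memmap regions rather than the old fixed-size struct memory_map snapshot. The same iterator lends itself to simple whole-system queries; a hedged sketch using only the names visible in the diff (the second for_each_memmap() argument is assumed from context):
/* Sketch: total usable RAM according to the generic memory map */
static uint64_t total_usable_memory ( void ) {
	struct memmap_region region;
	uint64_t total = 0;

	for_each_memmap ( &region, 0 ) {
		/* Skip anything that is not plain usable memory */
		if ( ! ( region.flags & MEMMAP_FL_MEMORY ) )
			continue;
		total += memmap_size ( &region );
	}
	return total;
}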
@ -199,8 +205,8 @@ static int multiboot_add_modules ( struct image *image, physaddr_t start,
for_each_image ( module_image ) {
if ( mbinfo->mods_count >= limit ) {
DBGC ( image, "MULTIBOOT %p limit of %d modules "
"reached\n", image, limit );
DBGC ( image, "MULTIBOOT %s limit of %d modules "
"reached\n", image->name, limit );
break;
}
@ -212,18 +218,18 @@ static int multiboot_add_modules ( struct image *image, physaddr_t start,
start = ( ( start + 0xfff ) & ~0xfff );
/* Prepare segment */
if ( ( rc = prep_segment ( phys_to_user ( start ),
if ( ( rc = prep_segment ( phys_to_virt ( start ),
module_image->len,
module_image->len ) ) != 0 ) {
DBGC ( image, "MULTIBOOT %p could not prepare module "
"%s: %s\n", image, module_image->name,
DBGC ( image, "MULTIBOOT %s could not prepare module "
"%s: %s\n", image->name, module_image->name,
strerror ( rc ) );
return rc;
}
/* Copy module */
memcpy_user ( phys_to_user ( start ), 0,
module_image->data, 0, module_image->len );
memcpy ( phys_to_virt ( start ), module_image->data,
module_image->len );
/* Add module to list */
module = &modules[mbinfo->mods_count++];
@ -231,8 +237,8 @@ static int multiboot_add_modules ( struct image *image, physaddr_t start,
module->mod_end = ( start + module_image->len );
module->string = multiboot_add_cmdline ( module_image );
module->reserved = 0;
DBGC ( image, "MULTIBOOT %p module %s is [%x,%x)\n",
image, module_image->name, module->mod_start,
DBGC ( image, "MULTIBOOT %s module %s is [%x,%x)\n",
image->name, module_image->name, module->mod_start,
module->mod_end );
start += module_image->len;
}
@ -255,8 +261,7 @@ static char __bss16_array ( mb_bootloader_name, [32] );
#define mb_bootloader_name __use_data16 ( mb_bootloader_name )
/** The multiboot memory map */
static struct multiboot_memory_map
__bss16_array ( mbmemmap, [MAX_MEMORY_REGIONS] );
static struct multiboot_memory_map __bss16_array ( mbmemmap, [MAX_MEMMAP] );
#define mbmemmap __use_data16 ( mbmemmap )
/** The multiboot module list */
@ -267,94 +272,101 @@ static struct multiboot_module __bss16_array ( mbmodules, [MAX_MODULES] );
* Find multiboot header
*
* @v image Multiboot file
* @v hdr Multiboot header descriptor to fill in
* @ret rc Return status code
* @ret offset Offset to Multiboot header, or negative error
*/
static int multiboot_find_header ( struct image *image,
struct multiboot_header_info *hdr ) {
uint32_t buf[64];
static int multiboot_find_header ( struct image *image ) {
const struct multiboot_header *mb;
size_t offset;
unsigned int buf_idx;
uint32_t checksum;
/* Scan through first 8kB of image file 256 bytes at a time.
* (Use the buffering to avoid the overhead of a
* copy_from_user() for every dword.)
*/
for ( offset = 0 ; offset < 8192 ; offset += sizeof ( buf[0] ) ) {
/* Scan through first 8kB of image file */
for ( offset = 0 ; offset < 8192 ; offset += 4 ) {
/* Check for end of image */
if ( offset > image->len )
if ( ( offset + sizeof ( *mb ) ) > image->len )
break;
/* Refill buffer if applicable */
buf_idx = ( ( offset % sizeof ( buf ) ) / sizeof ( buf[0] ) );
if ( buf_idx == 0 ) {
copy_from_user ( buf, image->data, offset,
sizeof ( buf ) );
}
mb = ( image->data + offset );
/* Check signature */
if ( buf[buf_idx] != MULTIBOOT_HEADER_MAGIC )
if ( mb->magic != MULTIBOOT_HEADER_MAGIC )
continue;
/* Copy header and verify checksum */
copy_from_user ( &hdr->mb, image->data, offset,
sizeof ( hdr->mb ) );
checksum = ( hdr->mb.magic + hdr->mb.flags +
hdr->mb.checksum );
checksum = ( mb->magic + mb->flags + mb->checksum );
if ( checksum != 0 )
continue;
/* Record offset of multiboot header and return */
hdr->offset = offset;
return 0;
/* Return header */
return offset;
}
/* No multiboot header found */
DBGC ( image, "MULTIBOOT %s has no multiboot header\n",
image->name );
return -ENOEXEC;
}
/**
* Load raw multiboot image into memory
*
* @v image Multiboot file
* @v hdr Multiboot header descriptor
* @v image Multiboot image
* @v offset Offset to Multiboot header
* @ret entry Entry point
* @ret max Maximum used address
* @ret rc Return status code
*/
static int multiboot_load_raw ( struct image *image,
struct multiboot_header_info *hdr,
static int multiboot_load_raw ( struct image *image, size_t offset,
physaddr_t *entry, physaddr_t *max ) {
size_t offset;
const struct multiboot_header *mb = ( image->data + offset );
size_t filesz;
size_t memsz;
userptr_t buffer;
void *buffer;
int rc;
/* Sanity check */
if ( ! ( hdr->mb.flags & MB_FLAG_RAW ) ) {
DBGC ( image, "MULTIBOOT %p is not flagged as a raw image\n",
image );
if ( ! ( mb->flags & MB_FLAG_RAW ) ) {
DBGC ( image, "MULTIBOOT %s is not flagged as a raw image\n",
image->name );
return -EINVAL;
}
/* Verify and prepare segment */
offset = ( hdr->offset - hdr->mb.header_addr + hdr->mb.load_addr );
filesz = ( hdr->mb.load_end_addr ?
( hdr->mb.load_end_addr - hdr->mb.load_addr ) :
/* Calculate starting offset within file */
if ( ( mb->load_addr > mb->header_addr ) ||
( ( mb->header_addr - mb->load_addr ) > offset ) ) {
DBGC ( image, "MULTIBOOT %s has misplaced header\n",
image->name );
return -EINVAL;
}
offset -= ( mb->header_addr - mb->load_addr );
assert ( offset < image->len );
/* Calculate length of initialized data */
filesz = ( mb->load_end_addr ?
( mb->load_end_addr - mb->load_addr ) :
( image->len - offset ) );
memsz = ( hdr->mb.bss_end_addr ?
( hdr->mb.bss_end_addr - hdr->mb.load_addr ) : filesz );
buffer = phys_to_user ( hdr->mb.load_addr );
if ( filesz > image->len ) {
DBGC ( image, "MULTIBOOT %s has overlength data\n",
image->name );
return -EINVAL;
}
/* Calculate length of uninitialised data */
memsz = ( mb->bss_end_addr ?
( mb->bss_end_addr - mb->load_addr ) : filesz );
DBGC ( image, "MULTIBOOT %s loading [%zx,%zx) to [%x,%zx,%zx)\n",
image->name, offset, ( offset + filesz ), mb->load_addr,
( mb->load_addr + filesz ), ( mb->load_addr + memsz ) );
/* Verify and prepare segment */
buffer = phys_to_virt ( mb->load_addr );
if ( ( rc = prep_segment ( buffer, filesz, memsz ) ) != 0 ) {
DBGC ( image, "MULTIBOOT %p could not prepare segment: %s\n",
image, strerror ( rc ) );
DBGC ( image, "MULTIBOOT %s could not prepare segment: %s\n",
image->name, strerror ( rc ) );
return rc;
}
/* Copy image to segment */
memcpy_user ( buffer, 0, image->data, offset, filesz );
memcpy ( buffer, ( image->data + offset ), filesz );
/* Record execution entry point and maximum used address */
*entry = hdr->mb.entry_addr;
*max = ( hdr->mb.load_addr + memsz );
*entry = mb->entry_addr;
*max = ( mb->load_addr + memsz );
return 0;
}
@ -373,8 +385,8 @@ static int multiboot_load_elf ( struct image *image, physaddr_t *entry,
/* Load ELF image*/
if ( ( rc = elf_load ( image, entry, max ) ) != 0 ) {
DBGC ( image, "MULTIBOOT %p ELF image failed to load: %s\n",
image, strerror ( rc ) );
DBGC ( image, "MULTIBOOT %s ELF image failed to load: %s\n",
image->name, strerror ( rc ) );
return rc;
}
@ -388,22 +400,24 @@ static int multiboot_load_elf ( struct image *image, physaddr_t *entry,
* @ret rc Return status code
*/
static int multiboot_exec ( struct image *image ) {
struct multiboot_header_info hdr;
const struct multiboot_header *mb;
physaddr_t entry;
physaddr_t max;
int offset;
int rc;
/* Locate multiboot header, if present */
if ( ( rc = multiboot_find_header ( image, &hdr ) ) != 0 ) {
DBGC ( image, "MULTIBOOT %p has no multiboot header\n",
image );
offset = multiboot_find_header ( image );
if ( offset < 0 ) {
rc = offset;
return rc;
}
mb = ( image->data + offset );
/* Abort if we detect flags that we cannot support */
if ( hdr.mb.flags & MB_UNSUPPORTED_FLAGS ) {
DBGC ( image, "MULTIBOOT %p flags %08x not supported\n",
image, ( hdr.mb.flags & MB_UNSUPPORTED_FLAGS ) );
if ( mb->flags & MB_UNSUPPORTED_FLAGS ) {
DBGC ( image, "MULTIBOOT %s flags %#08x not supported\n",
image->name, ( mb->flags & MB_UNSUPPORTED_FLAGS ) );
return -ENOTSUP;
}
@ -413,8 +427,10 @@ static int multiboot_exec ( struct image *image ) {
* behaviour.
*/
if ( ( ( rc = multiboot_load_elf ( image, &entry, &max ) ) != 0 ) &&
( ( rc = multiboot_load_raw ( image, &hdr, &entry, &max ) ) != 0 ))
( ( rc = multiboot_load_raw ( image, offset, &entry,
&max ) ) != 0 ) ) {
return rc;
}
/* Populate multiboot information structure */
memset ( &mbinfo, 0, sizeof ( mbinfo ) );
@ -444,8 +460,8 @@ static int multiboot_exec ( struct image *image ) {
( sizeof(mbmemmap) / sizeof(mbmemmap[0]) ) );
/* Jump to OS with flat physical addressing */
DBGC ( image, "MULTIBOOT %p starting execution at %lx\n",
image, entry );
DBGC ( image, "MULTIBOOT %s starting execution at %lx\n",
image->name, entry );
__asm__ __volatile__ ( PHYS_CODE ( "pushl %%ebp\n\t"
"call *%%edi\n\t"
"popl %%ebp\n\t" )
@ -454,7 +470,7 @@ static int multiboot_exec ( struct image *image ) {
"D" ( entry )
: "ecx", "edx", "esi", "memory" );
DBGC ( image, "MULTIBOOT %p returned\n", image );
DBGC ( image, "MULTIBOOT %s returned\n", image->name );
/* It isn't safe to continue after calling shutdown() */
while ( 1 ) {}
@ -469,17 +485,19 @@ static int multiboot_exec ( struct image *image ) {
* @ret rc Return status code
*/
static int multiboot_probe ( struct image *image ) {
struct multiboot_header_info hdr;
const struct multiboot_header *mb;
int offset;
int rc;
/* Locate multiboot header, if present */
if ( ( rc = multiboot_find_header ( image, &hdr ) ) != 0 ) {
DBGC ( image, "MULTIBOOT %p has no multiboot header\n",
image );
offset = multiboot_find_header ( image );
if ( offset < 0 ) {
rc = offset;
return rc;
}
DBGC ( image, "MULTIBOOT %p found header with flags %08x\n",
image, hdr.mb.flags );
mb = ( image->data + offset );
DBGC ( image, "MULTIBOOT %s found header at +%#x with flags %#08x\n",
image->name, offset, mb->flags );
return 0;
}

View File

@ -1,3 +1,4 @@
#include <string.h>
#include <errno.h>
#include <assert.h>
#include <realmode.h>
@ -106,12 +107,12 @@ struct ebinfo {
* @ret rc Return status code
*/
static int nbi_prepare_segment ( struct image *image, size_t offset __unused,
userptr_t dest, size_t filesz, size_t memsz ){
void *dest, size_t filesz, size_t memsz ) {
int rc;
if ( ( rc = prep_segment ( dest, filesz, memsz ) ) != 0 ) {
DBGC ( image, "NBI %p could not prepare segment: %s\n",
image, strerror ( rc ) );
DBGC ( image, "NBI %s could not prepare segment: %s\n",
image->name, strerror ( rc ) );
return rc;
}
@ -129,9 +130,9 @@ static int nbi_prepare_segment ( struct image *image, size_t offset __unused,
* @ret rc Return status code
*/
static int nbi_load_segment ( struct image *image, size_t offset,
userptr_t dest, size_t filesz,
void *dest, size_t filesz,
size_t memsz __unused ) {
memcpy_user ( dest, 0, image->data, offset, filesz );
memcpy ( dest, ( image->data + offset ), filesz );
return 0;
}
@ -144,22 +145,22 @@ static int nbi_load_segment ( struct image *image, size_t offset,
* @ret rc Return status code
*/
static int nbi_process_segments ( struct image *image,
struct imgheader *imgheader,
const struct imgheader *imgheader,
int ( * process ) ( struct image *image,
size_t offset,
userptr_t dest,
void *dest,
size_t filesz,
size_t memsz ) ) {
struct segheader sh;
const struct segheader *sh;
size_t offset = 0;
size_t sh_off;
userptr_t dest;
void *dest;
size_t filesz;
size_t memsz;
int rc;
/* Copy image header to target location */
dest = real_to_user ( imgheader->location.segment,
dest = real_to_virt ( imgheader->location.segment,
imgheader->location.offset );
filesz = memsz = NBI_HEADER_LENGTH;
if ( ( rc = process ( image, offset, dest, filesz, memsz ) ) != 0 )
@ -170,32 +171,32 @@ static int nbi_process_segments ( struct image *image,
sh_off = NBI_LENGTH ( imgheader->length );
do {
/* Read segment header */
copy_from_user ( &sh, image->data, sh_off, sizeof ( sh ) );
if ( sh.length == 0 ) {
sh = ( image->data + sh_off );
if ( sh->length == 0 ) {
/* Avoid infinite loop? */
DBGC ( image, "NBI %p invalid segheader length 0\n",
image );
DBGC ( image, "NBI %s invalid segheader length 0\n",
image->name );
return -ENOEXEC;
}
/* Calculate segment load address */
switch ( NBI_LOADADDR_FLAGS ( sh.flags ) ) {
switch ( NBI_LOADADDR_FLAGS ( sh->flags ) ) {
case NBI_LOADADDR_ABS:
dest = phys_to_user ( sh.loadaddr );
dest = phys_to_virt ( sh->loadaddr );
break;
case NBI_LOADADDR_AFTER:
dest = userptr_add ( dest, memsz + sh.loadaddr );
dest = ( dest + memsz + sh->loadaddr );
break;
case NBI_LOADADDR_BEFORE:
dest = userptr_add ( dest, -sh.loadaddr );
dest = ( dest - sh->loadaddr );
break;
case NBI_LOADADDR_END:
/* Not correct according to the spec, but
* maintains backwards compatibility with
* previous versions of Etherboot.
*/
dest = phys_to_user ( ( extmemsize() + 1024 ) * 1024
- sh.loadaddr );
dest = phys_to_virt ( ( extmemsize() + 1024 ) * 1024
- sh->loadaddr );
break;
default:
/* Cannot be reached */
@ -203,10 +204,11 @@ static int nbi_process_segments ( struct image *image,
}
/* Process this segment */
filesz = sh.imglength;
memsz = sh.memlength;
filesz = sh->imglength;
memsz = sh->memlength;
if ( ( offset + filesz ) > image->len ) {
DBGC ( image, "NBI %p segment outside file\n", image );
DBGC ( image, "NBI %s segment outside file\n",
image->name );
return -ENOEXEC;
}
if ( ( rc = process ( image, offset, dest,
@ -216,17 +218,18 @@ static int nbi_process_segments ( struct image *image,
offset += filesz;
/* Next segheader */
sh_off += NBI_LENGTH ( sh.length );
sh_off += NBI_LENGTH ( sh->length );
if ( sh_off >= NBI_HEADER_LENGTH ) {
DBGC ( image, "NBI %p header overflow\n", image );
DBGC ( image, "NBI %s header overflow\n",
image->name );
return -ENOEXEC;
}
} while ( ! NBI_LAST_SEGHEADER ( sh.flags ) );
} while ( ! NBI_LAST_SEGHEADER ( sh->flags ) );
if ( offset != image->len ) {
DBGC ( image, "NBI %p length wrong (file %zd, metadata %zd)\n",
image, image->len, offset );
DBGC ( image, "NBI %s length wrong (file %zd, metadata %zd)\n",
image->name, image->len, offset );
return -ENOEXEC;
}
@ -239,12 +242,13 @@ static int nbi_process_segments ( struct image *image,
* @v imgheader Image header information
* @ret rc Return status code, if image returns
*/
static int nbi_boot16 ( struct image *image, struct imgheader *imgheader ) {
static int nbi_boot16 ( struct image *image,
const struct imgheader *imgheader ) {
int discard_D, discard_S, discard_b;
int32_t rc;
DBGC ( image, "NBI %p executing 16-bit image at %04x:%04x\n", image,
imgheader->execaddr.segoff.segment,
DBGC ( image, "NBI %s executing 16-bit image at %04x:%04x\n",
image->name, imgheader->execaddr.segoff.segment,
imgheader->execaddr.segoff.offset );
__asm__ __volatile__ (
@ -277,7 +281,8 @@ static int nbi_boot16 ( struct image *image, struct imgheader *imgheader ) {
* @v imgheader Image header information
* @ret rc Return status code, if image returns
*/
static int nbi_boot32 ( struct image *image, struct imgheader *imgheader ) {
static int nbi_boot32 ( struct image *image,
const struct imgheader *imgheader ) {
struct ebinfo loaderinfo = {
product_major_version, product_minor_version,
0
@ -285,8 +290,8 @@ static int nbi_boot32 ( struct image *image, struct imgheader *imgheader ) {
int discard_D, discard_S, discard_b;
int32_t rc;
DBGC ( image, "NBI %p executing 32-bit image at %lx\n",
image, imgheader->execaddr.linear );
DBGC ( image, "NBI %s executing 32-bit image at %lx\n",
image->name, imgheader->execaddr.linear );
/* Jump to OS with flat physical addressing */
__asm__ __volatile__ (
@ -321,14 +326,15 @@ static int nbi_prepare_dhcp ( struct image *image ) {
boot_netdev = last_opened_netdev();
if ( ! boot_netdev ) {
DBGC ( image, "NBI %p could not identify a network device\n",
image );
DBGC ( image, "NBI %s could not identify a network device\n",
image->name );
return -ENODEV;
}
if ( ( rc = create_fakedhcpack ( boot_netdev, basemem_packet,
sizeof ( basemem_packet ) ) ) != 0 ) {
DBGC ( image, "NBI %p failed to build DHCP packet\n", image );
DBGC ( image, "NBI %s failed to build DHCP packet\n",
image->name );
return rc;
}
@ -342,15 +348,15 @@ static int nbi_prepare_dhcp ( struct image *image ) {
* @ret rc Return status code
*/
static int nbi_exec ( struct image *image ) {
struct imgheader imgheader;
const struct imgheader *imgheader;
int may_return;
int rc;
/* Retrieve image header */
copy_from_user ( &imgheader, image->data, 0, sizeof ( imgheader ) );
imgheader = image->data;
DBGC ( image, "NBI %p placing header at %hx:%hx\n", image,
imgheader.location.segment, imgheader.location.offset );
DBGC ( image, "NBI %s placing header at %hx:%hx\n", image->name,
imgheader->location.segment, imgheader->location.offset );
/* NBI files can have overlaps between segments; the bss of
* one segment may overlap the initialised data of another. I
@ -359,10 +365,10 @@ static int nbi_exec ( struct image *image ) {
* passes: first to initialise the segments, then to copy the
* data. This avoids zeroing out already-copied data.
*/
if ( ( rc = nbi_process_segments ( image, &imgheader,
if ( ( rc = nbi_process_segments ( image, imgheader,
nbi_prepare_segment ) ) != 0 )
return rc;
if ( ( rc = nbi_process_segments ( image, &imgheader,
if ( ( rc = nbi_process_segments ( image, imgheader,
nbi_load_segment ) ) != 0 )
return rc;
@ -371,25 +377,25 @@ static int nbi_exec ( struct image *image ) {
return rc;
/* Shut down now if NBI image will not return */
may_return = NBI_PROGRAM_RETURNS ( imgheader.flags );
may_return = NBI_PROGRAM_RETURNS ( imgheader->flags );
if ( ! may_return )
shutdown_boot();
/* Execute NBI image */
if ( NBI_LINEAR_EXEC_ADDR ( imgheader.flags ) ) {
rc = nbi_boot32 ( image, &imgheader );
if ( NBI_LINEAR_EXEC_ADDR ( imgheader->flags ) ) {
rc = nbi_boot32 ( image, imgheader );
} else {
rc = nbi_boot16 ( image, &imgheader );
rc = nbi_boot16 ( image, imgheader );
}
if ( ! may_return ) {
/* Cannot continue after shutdown() called */
DBGC ( image, "NBI %p returned %d from non-returnable image\n",
image, rc );
DBGC ( image, "NBI %s returned %d from non-returnable image\n",
image->name, rc );
while ( 1 ) {}
}
DBGC ( image, "NBI %p returned %d\n", image, rc );
DBGC ( image, "NBI %s returned %d\n", image->name, rc );
return rc;
}
@ -401,18 +407,19 @@ static int nbi_exec ( struct image *image ) {
* @ret rc Return status code
*/
static int nbi_probe ( struct image *image ) {
struct imgheader imgheader;
const struct imgheader *imgheader;
/* If we don't have enough data give up */
if ( image->len < NBI_HEADER_LENGTH ) {
DBGC ( image, "NBI %p too short for an NBI image\n", image );
DBGC ( image, "NBI %s too short for an NBI image\n",
image->name );
return -ENOEXEC;
}
imgheader = image->data;
/* Check image header */
copy_from_user ( &imgheader, image->data, 0, sizeof ( imgheader ) );
if ( imgheader.magic != NBI_MAGIC ) {
DBGC ( image, "NBI %p has no NBI signature\n", image );
if ( imgheader->magic != NBI_MAGIC ) {
DBGC ( image, "NBI %s has no NBI signature\n", image->name );
return -ENOEXEC;
}
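
The NBI hunks are typical of the wider userptr_t removal: headers and segment descriptors are no longer copied out with copy_from_user()/memcpy_user(), but read in place through const pointers into image->data, with memcpy() reserved for the payload itself. A rough standalone sketch of walking variable-length records in a flat buffer this way (the toy_record layout is invented and is not iPXE's struct segheader; arithmetic on void * uses the GNU extension the tree already relies on):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Invented fixed-size record header, standing in for struct segheader */
struct toy_record {
        uint8_t hdr_len;        /* header length in bytes */
        uint8_t last;           /* non-zero on the final record */
        uint16_t payload_len;   /* payload bytes following the header */
} __attribute__ (( packed ));

/* Walk records in place, as nbi_process_segments() now walks image->data */
static int walk ( const void *data, size_t len ) {
        const struct toy_record *rec;
        size_t offset = 0;

        while ( 1 ) {
                if ( ( offset + sizeof ( *rec ) ) > len )
                        return -1;      /* truncated header */
                rec = ( data + offset );
                if ( rec->hdr_len < sizeof ( *rec ) )
                        return -1;      /* avoid infinite loop */
                if ( ( offset + rec->hdr_len + rec->payload_len ) > len )
                        return -1;      /* record outside buffer */
                printf ( "record at +%#zx with %d payload bytes\n",
                         offset, rec->payload_len );
                if ( rec->last )
                        return 0;
                offset += ( rec->hdr_len + rec->payload_len );
        }
}

int main ( void ) {
        struct toy_record rec = { .hdr_len = sizeof ( rec ), .payload_len = 4 };
        uint8_t blob[32];

        memset ( blob, 0, sizeof ( blob ) );
        memcpy ( blob, &rec, sizeof ( rec ) );
        rec.last = 1;
        rec.payload_len = 0;
        memcpy ( ( blob + sizeof ( rec ) + 4 ), &rec, sizeof ( rec ) );
        return walk ( blob, sizeof ( blob ) );
}

The zero-length and out-of-range checks mirror the ones nbi_process_segments() keeps after the conversion.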

View File

@ -30,10 +30,10 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
*
*/
#include <string.h>
#include <pxe.h>
#include <pxe_call.h>
#include <pic8259.h>
#include <ipxe/uaccess.h>
#include <ipxe/image.h>
#include <ipxe/segment.h>
#include <ipxe/netdevice.h>
@ -54,24 +54,24 @@ const char *pxe_cmdline;
* @ret rc Return status code
*/
static int pxe_exec ( struct image *image ) {
userptr_t buffer = real_to_user ( 0, 0x7c00 );
void *buffer = real_to_virt ( 0, 0x7c00 );
struct net_device *netdev;
int rc;
/* Verify and prepare segment */
if ( ( rc = prep_segment ( buffer, image->len, image->len ) ) != 0 ) {
DBGC ( image, "IMAGE %p could not prepare segment: %s\n",
image, strerror ( rc ) );
DBGC ( image, "IMAGE %s could not prepare segment: %s\n",
image->name, strerror ( rc ) );
return rc;
}
/* Copy image to segment */
memcpy_user ( buffer, 0, image->data, 0, image->len );
memcpy ( buffer, image->data, image->len );
/* Arbitrarily pick the most recently opened network device */
if ( ( netdev = last_opened_netdev() ) == NULL ) {
DBGC ( image, "IMAGE %p could not locate PXE net device\n",
image );
DBGC ( image, "IMAGE %s could not locate PXE net device\n",
image->name );
return -ENODEV;
}
netdev_get ( netdev );
@ -142,7 +142,7 @@ int pxe_probe ( struct image *image ) {
* @ret rc Return status code
*/
int pxe_probe_no_mz ( struct image *image ) {
uint16_t magic;
const uint16_t *magic;
int rc;
/* Probe PXE image */
@ -152,11 +152,11 @@ int pxe_probe_no_mz ( struct image *image ) {
/* Reject image with an "MZ" signature which may indicate an
* EFI image incorrectly handed out to a BIOS system.
*/
if ( image->len >= sizeof ( magic ) ) {
copy_from_user ( &magic, image->data, 0, sizeof ( magic ) );
if ( magic == cpu_to_le16 ( EFI_IMAGE_DOS_SIGNATURE ) ) {
DBGC ( image, "IMAGE %p may be an EFI image\n",
image );
if ( image->len >= sizeof ( *magic ) ) {
magic = image->data;
if ( *magic == cpu_to_le16 ( EFI_IMAGE_DOS_SIGNATURE ) ) {
DBGC ( image, "IMAGE %s may be an EFI image\n",
image->name );
return -ENOTTY;
}
}
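
pxe_probe_no_mz() now checks the potential "MZ" signature through a const uint16_t * aimed at the start of image->data and compares it against a cpu_to_le16() constant. A small standalone sketch of the same idea, with a hand-rolled little-endian helper standing in for iPXE's byteswap macros (EFI_IMAGE_DOS_SIGNATURE is 0x5a4d, i.e. "MZ"):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DOS_SIGNATURE 0x5a4d    /* "MZ", stored little-endian on disk */

/* Hand-rolled stand-in for cpu_to_le16() */
static uint16_t to_le16 ( uint16_t value ) {
        uint8_t bytes[2] = { ( value & 0xff ), ( value >> 8 ) };
        uint16_t le;

        memcpy ( &le, bytes, sizeof ( le ) );
        return le;
}

/* Reject buffers that start with a DOS/EFI executable signature */
static int check_not_mz ( const void *data, size_t len ) {
        const uint16_t *magic;

        if ( len < sizeof ( *magic ) )
                return 0;               /* too short to carry "MZ" */
        magic = data;                   /* read the signature in place */
        if ( *magic == to_le16 ( DOS_SIGNATURE ) )
                return -1;              /* looks like an EFI/DOS image */
        return 0;
}

int main ( void ) {
        static const char pe[] __attribute__ (( aligned ( 2 ) )) = "MZ\x90";
        static const char raw[] __attribute__ (( aligned ( 2 ) )) = "\xeb\xfe";

        printf ( "pe: %d raw: %d\n", check_not_mz ( pe, sizeof ( pe ) ),
                 check_not_mz ( raw, sizeof ( raw ) ) );
        return 0;
}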

View File

@ -44,33 +44,6 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
FEATURE ( FEATURE_IMAGE, "SDI", DHCP_EB_FEATURE_SDI, 1 );
/**
* Parse SDI image header
*
* @v image SDI file
* @v sdi SDI header to fill in
* @ret rc Return status code
*/
static int sdi_parse_header ( struct image *image, struct sdi_header *sdi ) {
/* Sanity check */
if ( image->len < sizeof ( *sdi ) ) {
DBGC ( image, "SDI %p too short for SDI header\n", image );
return -ENOEXEC;
}
/* Read in header */
copy_from_user ( sdi, image->data, 0, sizeof ( *sdi ) );
/* Check signature */
if ( sdi->magic != SDI_MAGIC ) {
DBGC ( image, "SDI %p is not an SDI image\n", image );
return -ENOEXEC;
}
return 0;
}
/**
* Execute SDI image
*
@ -78,30 +51,30 @@ static int sdi_parse_header ( struct image *image, struct sdi_header *sdi ) {
* @ret rc Return status code
*/
static int sdi_exec ( struct image *image ) {
struct sdi_header sdi;
const struct sdi_header *sdi;
uint32_t sdiptr;
int rc;
/* Parse image header */
if ( ( rc = sdi_parse_header ( image, &sdi ) ) != 0 )
return rc;
/* Sanity check */
assert ( image->len >= sizeof ( *sdi ) );
sdi = image->data;
/* Check that image is bootable */
if ( sdi.boot_size == 0 ) {
DBGC ( image, "SDI %p is not bootable\n", image );
if ( sdi->boot_size == 0 ) {
DBGC ( image, "SDI %s is not bootable\n", image->name );
return -ENOTTY;
}
DBGC ( image, "SDI %p image at %08lx+%08zx\n",
image, user_to_phys ( image->data, 0 ), image->len );
DBGC ( image, "SDI %p boot code at %08lx+%llx\n", image,
user_to_phys ( image->data, sdi.boot_offset ), sdi.boot_size );
DBGC ( image, "SDI %s image at %08lx+%08zx\n",
image->name, virt_to_phys ( image->data ), image->len );
DBGC ( image, "SDI %s boot code at %08llx+%llx\n", image->name,
( virt_to_phys ( image->data ) + sdi->boot_offset ),
sdi->boot_size );
/* Copy boot code */
memcpy_user ( real_to_user ( SDI_BOOT_SEG, SDI_BOOT_OFF ), 0,
image->data, sdi.boot_offset, sdi.boot_size );
memcpy ( real_to_virt ( SDI_BOOT_SEG, SDI_BOOT_OFF ),
( image->data + sdi->boot_offset ), sdi->boot_size );
/* Jump to boot code */
sdiptr = ( user_to_phys ( image->data, 0 ) | SDI_WTF );
sdiptr = ( virt_to_phys ( image->data ) | SDI_WTF );
__asm__ __volatile__ ( REAL_CODE ( "ljmp %0, %1\n\t" )
: : "i" ( SDI_BOOT_SEG ),
"i" ( SDI_BOOT_OFF ),
@ -122,12 +95,22 @@ static int sdi_exec ( struct image *image ) {
* @ret rc Return status code
*/
static int sdi_probe ( struct image *image ) {
struct sdi_header sdi;
int rc;
const struct sdi_header *sdi;
/* Parse image */
if ( ( rc = sdi_parse_header ( image, &sdi ) ) != 0 )
return rc;
/* Sanity check */
if ( image->len < sizeof ( *sdi ) ) {
DBGC ( image, "SDI %s too short for SDI header\n",
image->name );
return -ENOEXEC;
}
sdi = image->data;
/* Check signature */
if ( sdi->magic != SDI_MAGIC ) {
DBGC ( image, "SDI %s is not an SDI image\n",
image->name );
return -ENOEXEC;
}
return 0;
}

View File

@ -31,6 +31,7 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <errno.h>
#include <ipxe/uaccess.h>
@ -149,41 +150,38 @@ static const char * ucode_vendor_name ( const union ucode_vendor_id *vendor ) {
*
* @v update Microcode update
* @v control Microcode update control
* @v status Microcode update status
* @v summary Microcode update summary
* @v id APIC ID
* @v optional Status report is optional
* @ret rc Return status code
*/
static int ucode_status ( struct ucode_update *update,
struct ucode_control *control,
static int ucode_status ( const struct ucode_update *update,
const struct ucode_control *control,
const struct ucode_status *status,
struct ucode_summary *summary,
unsigned int id, int optional ) {
struct ucode_status status;
struct ucode_descriptor *desc;
/* Sanity check */
assert ( id <= control->apic_max );
/* Read status report */
copy_from_user ( &status, phys_to_user ( control->status ),
( id * sizeof ( status ) ), sizeof ( status ) );
/* Ignore empty optional status reports */
if ( optional && ( ! status.signature ) )
if ( optional && ( ! status->signature ) )
return 0;
DBGC ( update, "UCODE %#08x signature %#08x ucode %#08x->%#08x\n",
id, status.signature, status.before, status.after );
id, status->signature, status->before, status->after );
/* Check CPU signature */
if ( ! status.signature ) {
if ( ! status->signature ) {
DBGC2 ( update, "UCODE %#08x has no signature\n", id );
return -ENOENT;
}
/* Check APIC ID is correct */
if ( status.id != id ) {
if ( status->id != id ) {
DBGC ( update, "UCODE %#08x wrong APIC ID %#08x\n",
id, status.id );
id, status->id );
return -EINVAL;
}
@ -195,29 +193,29 @@ static int ucode_status ( struct ucode_update *update,
}
/* Check microcode was not downgraded */
if ( status.after < status.before ) {
if ( status->after < status->before ) {
DBGC ( update, "UCODE %#08x was downgraded %#08x->%#08x\n",
id, status.before, status.after );
id, status->before, status->after );
return -ENOTTY;
}
/* Check that expected updates (if any) were applied */
for ( desc = update->desc ; desc->signature ; desc++ ) {
if ( ( desc->signature == status.signature ) &&
( status.after < desc->version ) ) {
if ( ( desc->signature == status->signature ) &&
( status->after < desc->version ) ) {
DBGC ( update, "UCODE %#08x failed update %#08x->%#08x "
"(wanted %#08x)\n", id, status.before,
status.after, desc->version );
"(wanted %#08x)\n", id, status->before,
status->after, desc->version );
return -EIO;
}
}
/* Update summary */
summary->count++;
if ( status.before < summary->low )
summary->low = status.before;
if ( status.after > summary->high )
summary->high = status.after;
if ( status->before < summary->low )
summary->low = status->before;
if ( status->after > summary->high )
summary->high = status->after;
return 0;
}
@ -231,13 +229,13 @@ static int ucode_status ( struct ucode_update *update,
* @ret rc Return status code
*/
static int ucode_update_all ( struct image *image,
struct ucode_update *update,
const struct ucode_update *update,
struct ucode_summary *summary ) {
struct ucode_control control;
struct ucode_vendor *vendor;
userptr_t status;
struct ucode_status *status;
unsigned int max;
unsigned int i;
unsigned int id;
size_t len;
int rc;
@ -248,7 +246,7 @@ static int ucode_update_all ( struct image *image,
/* Allocate status reports */
max = mp_max_cpuid();
len = ( ( max + 1 ) * sizeof ( struct ucode_status ) );
len = ( ( max + 1 ) * sizeof ( *status ) );
status = umalloc ( len );
if ( ! status ) {
DBGC ( image, "UCODE %s could not allocate %d status reports\n",
@ -256,12 +254,12 @@ static int ucode_update_all ( struct image *image,
rc = -ENOMEM;
goto err_alloc;
}
memset_user ( status, 0, 0, len );
memset ( status, 0, len );
/* Construct control structure */
memset ( &control, 0, sizeof ( control ) );
control.desc = virt_to_phys ( update->desc );
control.status = user_to_phys ( status, 0 );
control.status = virt_to_phys ( status );
vendor = update->vendor;
if ( vendor ) {
control.ver_clear = vendor->ver_clear;
@ -274,8 +272,9 @@ static int ucode_update_all ( struct image *image,
/* Update microcode on boot processor */
mp_exec_boot ( ucode_update, &control );
if ( ( rc = ucode_status ( update, &control, summary,
mp_boot_cpuid(), 0 ) ) != 0 ) {
id = mp_boot_cpuid();
if ( ( rc = ucode_status ( update, &control, &status[id],
summary, id, 0 ) ) != 0 ) {
DBGC ( image, "UCODE %s failed on boot processor: %s\n",
image->name, strerror ( rc ) );
goto err_boot;
@ -293,9 +292,9 @@ static int ucode_update_all ( struct image *image,
/* Check status reports */
summary->count = 0;
for ( i = 0 ; i <= max ; i++ ) {
if ( ( rc = ucode_status ( update, &control, summary,
i, 1 ) ) != 0 ) {
for ( id = 0 ; id <= max ; id++ ) {
if ( ( rc = ucode_status ( update, &control, &status[id],
summary, id, 1 ) ) != 0 ) {
goto err_status;
}
}
@ -359,24 +358,22 @@ static void ucode_describe ( struct image *image, size_t start,
* @ret rc Return status code
*/
static int ucode_verify ( struct image *image, size_t start, size_t len ) {
uint32_t checksum = 0;
uint32_t dword;
size_t offset;
const uint32_t *dword;
uint32_t checksum;
unsigned int count;
/* Check length is a multiple of dwords */
if ( ( len % sizeof ( dword ) ) != 0 ) {
if ( ( len % sizeof ( *dword ) ) != 0 ) {
DBGC ( image, "UCODE %s+%#04zx invalid length %#zx\n",
image->name, start, len );
return -EINVAL;
}
dword = ( image->data + start );
/* Calculate checksum */
for ( offset = start ; len ;
offset += sizeof ( dword ), len -= sizeof ( dword ) ) {
copy_from_user ( &dword, image->data, offset,
sizeof ( dword ) );
checksum += dword;
}
count = ( len / sizeof ( *dword ) );
for ( checksum = 0 ; count ; count-- )
checksum += *(dword++);
if ( checksum != 0 ) {
DBGC ( image, "UCODE %s+%#04zx bad checksum %#08x\n",
image->name, start, checksum );
@ -396,9 +393,9 @@ static int ucode_verify ( struct image *image, size_t start, size_t len ) {
*/
static int ucode_parse_intel ( struct image *image, size_t start,
struct ucode_update *update ) {
struct intel_ucode_header hdr;
struct intel_ucode_ext_header exthdr;
struct intel_ucode_ext ext;
const struct intel_ucode_header *hdr;
const struct intel_ucode_ext_header *exthdr;
const struct intel_ucode_ext *ext;
struct ucode_descriptor desc;
size_t remaining;
size_t offset;
@ -409,27 +406,27 @@ static int ucode_parse_intel ( struct image *image, size_t start,
/* Read header */
remaining = ( image->len - start );
if ( remaining < sizeof ( hdr ) ) {
if ( remaining < sizeof ( *hdr ) ) {
DBGC ( image, "UCODE %s+%#04zx too small for Intel header\n",
image->name, start );
return -ENOEXEC;
}
copy_from_user ( &hdr, image->data, start, sizeof ( hdr ) );
hdr = ( image->data + start );
/* Determine lengths */
data_len = hdr.data_len;
data_len = hdr->data_len;
if ( ! data_len )
data_len = INTEL_UCODE_DATA_LEN;
len = hdr.len;
len = hdr->len;
if ( ! len )
len = ( sizeof ( hdr ) + data_len );
len = ( sizeof ( *hdr ) + data_len );
/* Verify a selection of fields */
if ( ( hdr.hver != INTEL_UCODE_HVER ) ||
( hdr.lver != INTEL_UCODE_LVER ) ||
( len < sizeof ( hdr ) ) ||
if ( ( hdr->hver != INTEL_UCODE_HVER ) ||
( hdr->lver != INTEL_UCODE_LVER ) ||
( len < sizeof ( *hdr ) ) ||
( len > remaining ) ||
( data_len > ( len - sizeof ( hdr ) ) ) ||
( data_len > ( len - sizeof ( *hdr ) ) ) ||
( ( data_len % sizeof ( uint32_t ) ) != 0 ) ||
( ( len % INTEL_UCODE_ALIGN ) != 0 ) ) {
DBGC2 ( image, "UCODE %s+%#04zx is not an Intel update\n",
@ -444,48 +441,46 @@ static int ucode_parse_intel ( struct image *image, size_t start,
return rc;
/* Populate descriptor */
desc.signature = hdr.signature;
desc.version = hdr.version;
desc.address = user_to_phys ( image->data,
( start + sizeof ( hdr ) ) );
desc.signature = hdr->signature;
desc.version = hdr->version;
desc.address = ( virt_to_phys ( image->data ) + start +
sizeof ( *hdr ) );
/* Add non-extended descriptor, if applicable */
ucode_describe ( image, start, &ucode_intel, &desc, hdr.platforms,
ucode_describe ( image, start, &ucode_intel, &desc, hdr->platforms,
update );
/* Construct extended descriptors, if applicable */
offset = ( sizeof ( hdr ) + data_len );
if ( offset <= ( len - sizeof ( exthdr ) ) ) {
offset = ( sizeof ( *hdr ) + data_len );
if ( offset <= ( len - sizeof ( *exthdr ) ) ) {
/* Read extended header */
copy_from_user ( &exthdr, image->data, ( start + offset ),
sizeof ( exthdr ) );
offset += sizeof ( exthdr );
exthdr = ( image->data + start + offset );
offset += sizeof ( *exthdr );
/* Read extended signatures */
for ( i = 0 ; i < exthdr.count ; i++ ) {
for ( i = 0 ; i < exthdr->count ; i++ ) {
/* Read extended signature */
if ( offset > ( len - sizeof ( ext ) ) ) {
if ( offset > ( len - sizeof ( *ext ) ) ) {
DBGC ( image, "UCODE %s+%#04zx extended "
"signature overrun\n",
image->name, start );
return -EINVAL;
}
copy_from_user ( &ext, image->data, ( start + offset ),
sizeof ( ext ) );
offset += sizeof ( ext );
ext = ( image->data + start + offset );
offset += sizeof ( *ext );
/* Avoid duplicating non-extended descriptor */
if ( ( ext.signature == hdr.signature ) &&
( ext.platforms == hdr.platforms ) ) {
if ( ( ext->signature == hdr->signature ) &&
( ext->platforms == hdr->platforms ) ) {
continue;
}
/* Construct descriptor, if applicable */
desc.signature = ext.signature;
desc.signature = ext->signature;
ucode_describe ( image, start, &ucode_intel, &desc,
ext.platforms, update );
ext->platforms, update );
}
}
@ -502,10 +497,10 @@ static int ucode_parse_intel ( struct image *image, size_t start,
*/
static int ucode_parse_amd ( struct image *image, size_t start,
struct ucode_update *update ) {
struct amd_ucode_header hdr;
struct amd_ucode_equivalence equiv;
struct amd_ucode_patch_header phdr;
struct amd_ucode_patch patch;
const struct amd_ucode_header *hdr;
const struct amd_ucode_equivalence *equiv;
const struct amd_ucode_patch_header *phdr;
const struct amd_ucode_patch *patch;
struct ucode_descriptor desc;
size_t remaining;
size_t offset;
@ -515,91 +510,85 @@ static int ucode_parse_amd ( struct image *image, size_t start,
/* Read header */
remaining = ( image->len - start );
if ( remaining < sizeof ( hdr ) ) {
if ( remaining < sizeof ( *hdr ) ) {
DBGC ( image, "UCODE %s+%#04zx too small for AMD header\n",
image->name, start );
return -ENOEXEC;
}
copy_from_user ( &hdr, image->data, start, sizeof ( hdr ) );
hdr = ( image->data + start );
/* Check header */
if ( hdr.magic != AMD_UCODE_MAGIC ) {
if ( hdr->magic != AMD_UCODE_MAGIC ) {
DBGC2 ( image, "UCODE %s+%#04zx is not an AMD update\n",
image->name, start );
return -ENOEXEC;
}
DBGC2 ( image, "UCODE %s+%#04zx is an AMD update\n",
image->name, start );
if ( hdr.type != AMD_UCODE_EQUIV_TYPE ) {
if ( hdr->type != AMD_UCODE_EQUIV_TYPE ) {
DBGC ( image, "UCODE %s+%#04zx unsupported equivalence table "
"type %d\n", image->name, start, hdr.type );
"type %d\n", image->name, start, hdr->type );
return -ENOTSUP;
}
if ( hdr.len > ( remaining - sizeof ( hdr ) ) ) {
if ( hdr->len > ( remaining - sizeof ( *hdr ) ) ) {
DBGC ( image, "UCODE %s+%#04zx truncated equivalence table\n",
image->name, start );
return -EINVAL;
}
/* Count number of equivalence table entries */
offset = sizeof ( hdr );
for ( count = 0 ; offset < ( sizeof ( hdr ) + hdr.len ) ;
count++, offset += sizeof ( equiv ) ) {
copy_from_user ( &equiv, image->data, ( start + offset ),
sizeof ( equiv ) );
if ( ! equiv.signature )
offset = sizeof ( *hdr );
equiv = ( image->data + start + offset );
for ( count = 0 ; offset < ( sizeof ( *hdr ) + hdr->len ) ;
count++, offset += sizeof ( *equiv ) ) {
if ( ! equiv[count].signature )
break;
}
DBGC2 ( image, "UCODE %s+%#04zx has %d equivalence table entries\n",
image->name, start, count );
/* Parse available updates */
offset = ( sizeof ( hdr ) + hdr.len );
offset = ( sizeof ( *hdr ) + hdr->len );
used = 0;
while ( used < count ) {
/* Read patch header */
if ( ( offset + sizeof ( phdr ) ) > remaining ) {
if ( ( offset + sizeof ( *phdr ) ) > remaining ) {
DBGC ( image, "UCODE %s+%#04zx truncated patch "
"header\n", image->name, start );
return -EINVAL;
}
copy_from_user ( &phdr, image->data, ( start + offset ),
sizeof ( phdr ) );
offset += sizeof ( phdr );
phdr = ( image->data + start + offset );
offset += sizeof ( *phdr );
/* Validate patch header */
if ( phdr.type != AMD_UCODE_PATCH_TYPE ) {
if ( phdr->type != AMD_UCODE_PATCH_TYPE ) {
DBGC ( image, "UCODE %s+%#04zx unsupported patch type "
"%d\n", image->name, start, phdr.type );
"%d\n", image->name, start, phdr->type );
return -ENOTSUP;
}
if ( phdr.len < sizeof ( patch ) ) {
if ( phdr->len < sizeof ( *patch ) ) {
DBGC ( image, "UCODE %s+%#04zx underlength patch\n",
image->name, start );
return -EINVAL;
}
if ( phdr.len > ( remaining - offset ) ) {
if ( phdr->len > ( remaining - offset ) ) {
DBGC ( image, "UCODE %s+%#04zx truncated patch\n",
image->name, start );
return -EINVAL;
}
/* Read patch and construct descriptor */
copy_from_user ( &patch, image->data, ( start + offset ),
sizeof ( patch ) );
desc.version = patch.version;
desc.address = user_to_phys ( image->data, ( start + offset ) );
offset += phdr.len;
patch = ( image->data + start + offset );
desc.version = patch->version;
desc.address = ( virt_to_phys ( image->data ) +
start + offset );
offset += phdr->len;
/* Parse equivalence table to find matching signatures */
for ( i = 0 ; i < count ; i++ ) {
copy_from_user ( &equiv, image->data,
( start + sizeof ( hdr ) +
( i * ( sizeof ( equiv ) ) ) ),
sizeof ( equiv ) );
if ( patch.id == equiv.id ) {
desc.signature = equiv.signature;
if ( patch->id == equiv[i].id ) {
desc.signature = equiv[i].signature;
ucode_describe ( image, start, &ucode_amd,
&desc, 0, update );
used++;
@ -744,19 +733,19 @@ static int ucode_exec ( struct image *image ) {
* @ret rc Return status code
*/
static int ucode_probe ( struct image *image ) {
union {
const union {
struct intel_ucode_header intel;
struct amd_ucode_header amd;
} header;
} *header;
/* Sanity check */
if ( image->len < sizeof ( header ) ) {
if ( image->len < sizeof ( *header ) ) {
DBGC ( image, "UCODE %s too short\n", image->name );
return -ENOEXEC;
}
/* Read first microcode image header */
copy_from_user ( &header, image->data, 0, sizeof ( header ) );
header = image->data;
/* Check for something that looks like an Intel update
*
@ -769,19 +758,19 @@ static int ucode_probe ( struct image *image ) {
* the image, and do not want to have a microcode image
* erroneously treated as a PXE boot executable.
*/
if ( ( header.intel.hver == INTEL_UCODE_HVER ) &&
( header.intel.lver == INTEL_UCODE_LVER ) &&
( ( header.intel.date.century == 0x19 ) ||
( ( header.intel.date.century >= 0x20 ) &&
( header.intel.date.century <= 0x29 ) ) ) ) {
if ( ( header->intel.hver == INTEL_UCODE_HVER ) &&
( header->intel.lver == INTEL_UCODE_LVER ) &&
( ( header->intel.date.century == 0x19 ) ||
( ( header->intel.date.century >= 0x20 ) &&
( header->intel.date.century <= 0x29 ) ) ) ) {
DBGC ( image, "UCODE %s+%#04zx looks like an Intel update\n",
image->name, ( ( size_t ) 0 ) );
return 0;
}
/* Check for AMD update signature */
if ( ( header.amd.magic == AMD_UCODE_MAGIC ) &&
( header.amd.type == AMD_UCODE_EQUIV_TYPE ) ) {
if ( ( header->amd.magic == AMD_UCODE_MAGIC ) &&
( header->amd.type == AMD_UCODE_EQUIV_TYPE ) ) {
DBGC ( image, "UCODE %s+%#04zx looks like an AMD update\n",
image->name, ( ( size_t ) 0 ) );
return 0;
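
ucode_verify() above now walks a const uint32_t * straight over the update data instead of pulling one dword at a time through copy_from_user(); the region is valid only if all 32-bit words, including the stored checksum field, sum to zero. A minimal standalone version of that check, with a toy buffer in place of a real microcode update:

#include <stdint.h>
#include <stdio.h>

/* Sum all dwords in [start, start+len); a valid region sums to zero */
static int verify_dwords ( const void *data, size_t start, size_t len ) {
        const uint32_t *dword;
        uint32_t checksum;
        unsigned int count;

        if ( ( len % sizeof ( *dword ) ) != 0 )
                return -1;              /* not a whole number of dwords */
        dword = ( data + start );
        count = ( len / sizeof ( *dword ) );
        for ( checksum = 0 ; count ; count-- )
                checksum += *(dword++);
        return ( ( checksum == 0 ) ? 0 : -1 );
}

int main ( void ) {
        uint32_t buf[4] = { 0x11111111, 0x22222222, 0x33333333, 0 };

        /* Store the two's complement of the running sum as the final dword */
        buf[3] = -( buf[0] + buf[1] + buf[2] );
        printf ( "verify: %d\n", verify_dwords ( buf, 0, sizeof ( buf ) ) );
        return 0;
}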

View File

@ -27,9 +27,4 @@ static inline unsigned int get_fbms ( void ) {
extern void set_fbms ( unsigned int new_fbms );
/* Actually in hidemem.c, but putting it here avoids polluting the
* architecture-independent include/hidemem.h.
*/
extern void hide_basemem ( void );
#endif /* _BASEMEM_H */

View File

@ -1,69 +0,0 @@
#ifndef BIOS_DISKS_H
#define BIOS_DISKS_H
#include "dev.h"
/*
* Constants
*
*/
#define BIOS_DISK_MAX_NAME_LEN 6
struct bios_disk_sector {
char data[512];
};
/*
* The location of a BIOS disk
*
*/
struct bios_disk_loc {
uint8_t drive;
};
/*
* A physical BIOS disk device
*
*/
struct bios_disk_device {
char name[BIOS_DISK_MAX_NAME_LEN];
uint8_t drive;
uint8_t type;
};
/*
* A BIOS disk driver, with a valid device ID range and naming
* function.
*
*/
struct bios_disk_driver {
void ( *fill_drive_name ) ( char *buf, uint8_t drive );
uint8_t min_drive;
uint8_t max_drive;
};
/*
* Define a BIOS disk driver
*
*/
#define BIOS_DISK_DRIVER( _name, _fill_drive_name, _min_drive, _max_drive ) \
static struct bios_disk_driver _name = { \
.fill_drive_name = _fill_drive_name, \
.min_drive = _min_drive, \
.max_drive = _max_drive, \
}
/*
* Functions in bios_disks.c
*
*/
/*
* bios_disk bus global definition
*
*/
extern struct bus_driver bios_disk_driver;
#endif /* BIOS_DISKS_H */

View File

@ -8,8 +8,7 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
* @{
*/
#define ERRFILE_memtop_umalloc ( ERRFILE_ARCH | ERRFILE_CORE | 0x00000000 )
#define ERRFILE_memmap ( ERRFILE_ARCH | ERRFILE_CORE | 0x00010000 )
#define ERRFILE_int15 ( ERRFILE_ARCH | ERRFILE_CORE | 0x00010000 )
#define ERRFILE_pnpbios ( ERRFILE_ARCH | ERRFILE_CORE | 0x00020000 )
#define ERRFILE_bios_smbios ( ERRFILE_ARCH | ERRFILE_CORE | 0x00030000 )
#define ERRFILE_biosint ( ERRFILE_ARCH | ERRFILE_CORE | 0x00040000 )
@ -42,7 +41,6 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#define ERRFILE_comboot_resolv ( ERRFILE_ARCH | ERRFILE_IMAGE | 0x00090000 )
#define ERRFILE_comboot_call ( ERRFILE_ARCH | ERRFILE_IMAGE | 0x000a0000 )
#define ERRFILE_sdi ( ERRFILE_ARCH | ERRFILE_IMAGE | 0x000b0000 )
#define ERRFILE_initrd ( ERRFILE_ARCH | ERRFILE_IMAGE | 0x000c0000 )
#define ERRFILE_pxe_call ( ERRFILE_ARCH | ERRFILE_IMAGE | 0x000d0000 )
#define ERRFILE_ucode ( ERRFILE_ARCH | ERRFILE_IMAGE | 0x000e0000 )

View File

@ -0,0 +1,14 @@
#ifndef _BITS_MEMMAP_H
#define _BITS_MEMMAP_H
/** @file
*
* x86-specific system memory map API implementations
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <ipxe/int15.h>
#endif /* _BITS_MEMMAP_H */

View File

@ -0,0 +1,60 @@
#ifndef _BITS_NS16550_H
#define _BITS_NS16550_H
/** @file
*
* 16550-compatible UART
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#include <ipxe/io.h>
/**
* Write to UART register
*
* @v ns16550 16550 UART
* @v address Register address
* @v data Data
*/
static inline __attribute__ (( always_inline )) void
ns16550_write ( struct ns16550_uart *ns16550, unsigned int address,
uint8_t data ) {
outb ( data, ( ns16550->base + address ) );
}
/**
* Read from UART register
*
* @v ns16550 16550 UART
* @v address Register address
* @ret data Data
*/
static inline __attribute__ (( always_inline )) uint8_t
ns16550_read ( struct ns16550_uart *ns16550, unsigned int address ) {
return inb ( ns16550->base + address );
}
/* Fixed ISA serial port base addresses */
#define COM1_BASE 0x3f8
#define COM2_BASE 0x2f8
#define COM3_BASE 0x3e8
#define COM4_BASE 0x2e8
/* Fixed ISA serial ports */
extern struct uart com1;
extern struct uart com2;
extern struct uart com3;
extern struct uart com4;
/* Fixed ISA serial port names */
#define COM1 &com1
#define COM2 &com2
#define COM3 &com3
#define COM4 &com4
#endif /* _BITS_NS16550_H */

View File

@ -1,14 +0,0 @@
#ifndef _BITS_UACCESS_H
#define _BITS_UACCESS_H
/** @file
*
* x86-specific user access API implementations
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <librm.h>
#endif /* _BITS_UACCESS_H */

View File

@ -1,41 +0,0 @@
#ifndef _BITS_UART_H
#define _BITS_UART_H
/** @file
*
* 16550-compatible UART
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#include <ipxe/io.h>
/**
* Write to UART register
*
* @v uart UART
* @v addr Register address
* @v data Data
*/
static inline __attribute__ (( always_inline )) void
uart_write ( struct uart *uart, unsigned int addr, uint8_t data ) {
outb ( data, ( uart->base + addr ) );
}
/**
* Read from UART register
*
* @v uart UART
* @v addr Register address
* @ret data Data
*/
static inline __attribute__ (( always_inline )) uint8_t
uart_read ( struct uart *uart, unsigned int addr ) {
return inb ( uart->base + addr );
}
extern int uart_select ( struct uart *uart, unsigned int port );
#endif /* _BITS_UART_H */

View File

@ -1,14 +0,0 @@
#ifndef _BITS_UMALLOC_H
#define _BITS_UMALLOC_H
/** @file
*
* x86-specific user memory allocation API implementations
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <ipxe/memtop_umalloc.h>
#endif /* _BITS_UMALLOC_H */

View File

@ -1,23 +0,0 @@
#ifndef _INITRD_H
#define _INITRD_H
/** @file
*
* Initial ramdisk (initrd) reshuffling
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <ipxe/uaccess.h>
/** Minimum free space required to reshuffle initrds
*
* Chosen to avoid absurdly long reshuffling times
*/
#define INITRD_MIN_FREE_LEN ( 512 * 1024 )
extern void initrd_reshuffle ( userptr_t bottom );
extern int initrd_reshuffle_check ( size_t len, userptr_t bottom );
#endif /* _INITRD_H */

View File

@ -0,0 +1,21 @@
#ifndef _IPXE_INT15_H
#define _IPXE_INT15_H
/** @file
*
* INT15-based memory map
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#ifdef MEMMAP_INT15
#define MEMMAP_PREFIX_int15
#else
#define MEMMAP_PREFIX_int15 __int15_
#endif
extern void int15_intercept ( int intercept );
extern void hide_basemem ( void );
#endif /* _IPXE_INT15_H */

View File

@ -1,18 +0,0 @@
#ifndef _IPXE_MEMTOP_UMALLOC_H
#define _IPXE_MEMTOP_UMALLOC_H
/** @file
*
* External memory allocation
*
*/
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#ifdef UMALLOC_MEMTOP
#define UMALLOC_PREFIX_memtop
#else
#define UMALLOC_PREFIX_memtop __memtop_
#endif
#endif /* _IPXE_MEMTOP_UMALLOC_H */

View File

@ -20,9 +20,9 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
*
* @v signature Requested table signature
* @v index Requested index of table with this signature
* @ret table Table, or UNULL if not found
* @ret table Table, or NULL if not found
*/
static inline __attribute__ (( always_inline )) userptr_t
static inline __attribute__ (( always_inline )) const struct acpi_header *
ACPI_INLINE ( rsdp, acpi_find ) ( uint32_t signature, unsigned int index ) {
return acpi_find_via_rsdt ( signature, index );

View File

@ -194,7 +194,7 @@ copy_from_user ( void *dest, userptr_t buffer, off_t offset, size_t len ) {
* @ret buffer User buffer
*/
static inline __attribute__ (( always_inline )) userptr_t
real_to_user ( unsigned int segment, unsigned int offset ) {
real_to_virt ( unsigned int segment, unsigned int offset ) {
return ( ( segment << 16 ) | offset );
}
@ -210,7 +210,7 @@ real_to_user ( unsigned int segment, unsigned int offset ) {
*/
static inline __attribute__ (( always_inline )) userptr_t
virt_to_user ( void * virtual ) {
return real_to_user ( rm_ds, ( intptr_t ) virtual );
return real_to_virt ( rm_ds, ( intptr_t ) virtual );
}
/* TEXT16_CODE: declare a fragment of code that resides in .text16 */

View File

@ -64,12 +64,6 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#else /* ASSEMBLY */
#ifdef UACCESS_LIBRM
#define UACCESS_PREFIX_librm
#else
#define UACCESS_PREFIX_librm __librm_
#endif
/**
* Call C function from real-mode code
*
@ -79,114 +73,6 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
"pushl $( " _S2 ( VIRTUAL ( function ) ) " )\n\t" \
"call virt_call\n\t"
/* Variables in librm.S */
extern const unsigned long virt_offset;
/**
* Convert physical address to user pointer
*
* @v phys_addr Physical address
* @ret userptr User pointer
*/
static inline __always_inline userptr_t
UACCESS_INLINE ( librm, phys_to_user ) ( unsigned long phys_addr ) {
/* In a 64-bit build, any valid physical address is directly
* usable as a virtual address, since the low 4GB is
* identity-mapped.
*/
if ( sizeof ( physaddr_t ) > sizeof ( uint32_t ) )
return phys_addr;
/* In a 32-bit build, subtract virt_offset */
return ( phys_addr - virt_offset );
}
/**
* Convert user buffer to physical address
*
* @v userptr User pointer
* @v offset Offset from user pointer
* @ret phys_addr Physical address
*/
static inline __always_inline unsigned long
UACCESS_INLINE ( librm, user_to_phys ) ( userptr_t userptr, off_t offset ) {
unsigned long addr = ( userptr + offset );
/* In a 64-bit build, any virtual address in the low 4GB is
* directly usable as a physical address, since the low 4GB is
* identity-mapped.
*/
if ( ( sizeof ( physaddr_t ) > sizeof ( uint32_t ) ) &&
( addr <= 0xffffffffUL ) )
return addr;
/* In a 32-bit build or in a 64-bit build with a virtual
* address above 4GB: add virt_offset
*/
return ( addr + virt_offset );
}
static inline __always_inline userptr_t
UACCESS_INLINE ( librm, virt_to_user ) ( volatile const void *addr ) {
return trivial_virt_to_user ( addr );
}
static inline __always_inline void *
UACCESS_INLINE ( librm, user_to_virt ) ( userptr_t userptr, off_t offset ) {
return trivial_user_to_virt ( userptr, offset );
}
static inline __always_inline userptr_t
UACCESS_INLINE ( librm, userptr_add ) ( userptr_t userptr, off_t offset ) {
return trivial_userptr_add ( userptr, offset );
}
static inline __always_inline off_t
UACCESS_INLINE ( librm, userptr_sub ) ( userptr_t userptr,
userptr_t subtrahend ) {
return trivial_userptr_sub ( userptr, subtrahend );
}
static inline __always_inline void
UACCESS_INLINE ( librm, memcpy_user ) ( userptr_t dest, off_t dest_off,
userptr_t src, off_t src_off,
size_t len ) {
trivial_memcpy_user ( dest, dest_off, src, src_off, len );
}
static inline __always_inline void
UACCESS_INLINE ( librm, memmove_user ) ( userptr_t dest, off_t dest_off,
userptr_t src, off_t src_off,
size_t len ) {
trivial_memmove_user ( dest, dest_off, src, src_off, len );
}
static inline __always_inline int
UACCESS_INLINE ( librm, memcmp_user ) ( userptr_t first, off_t first_off,
userptr_t second, off_t second_off,
size_t len ) {
return trivial_memcmp_user ( first, first_off, second, second_off, len);
}
static inline __always_inline void
UACCESS_INLINE ( librm, memset_user ) ( userptr_t buffer, off_t offset,
int c, size_t len ) {
trivial_memset_user ( buffer, offset, c, len );
}
static inline __always_inline size_t
UACCESS_INLINE ( librm, strlen_user ) ( userptr_t buffer, off_t offset ) {
return trivial_strlen_user ( buffer, offset );
}
static inline __always_inline off_t
UACCESS_INLINE ( librm, memchr_user ) ( userptr_t buffer, off_t offset,
int c, size_t len ) {
return trivial_memchr_user ( buffer, offset, c, len );
}
/******************************************************************************
*
* Access to variables in .data16 and .text16
@ -244,8 +130,8 @@ extern const uint16_t __text16 ( rm_cs );
extern const uint16_t __text16 ( rm_ds );
#define rm_ds __use_text16 ( rm_ds )
extern uint16_t copy_user_to_rm_stack ( userptr_t data, size_t size );
extern void remove_user_from_rm_stack ( userptr_t data, size_t size );
extern uint16_t copy_to_rm_stack ( const void *data, size_t size );
extern void remove_from_rm_stack ( void *data, size_t size );
/* CODE_DEFAULT: restore default .code32/.code64 directive */
#ifdef __x86_64__
@ -479,7 +365,8 @@ extern char __text16_array ( sipi, [] );
#define sipi __use_text16 ( sipi )
/** Length of startup IPI real-mode handler */
extern char sipi_len[];
extern size_t ABS_SYMBOL ( sipi_len );
#define sipi_len ABS_VALUE ( sipi_len )
/** Startup IPI real-mode handler copy of real-mode data segment */
extern uint16_t __text16 ( sipi_ds );

View File

@ -47,9 +47,6 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
/* Macros to enable/disable IRQs */
#define IMR_REG(x) ( (x) < IRQ_PIC_CUTOFF ? PIC1_IMR : PIC2_IMR )
#define IMR_BIT(x) ( 1 << ( (x) % IRQ_PIC_CUTOFF ) )
#define irq_enabled(x) ( ( inb ( IMR_REG(x) ) & IMR_BIT(x) ) == 0 )
#define enable_irq(x) outb ( inb( IMR_REG(x) ) & ~IMR_BIT(x), IMR_REG(x) )
#define disable_irq(x) outb ( inb( IMR_REG(x) ) | IMR_BIT(x), IMR_REG(x) )
/* Macros for acknowledging IRQs */
#define ICR_REG( irq ) ( (irq) < IRQ_PIC_CUTOFF ? PIC1_ICR : PIC2_ICR )
@ -63,6 +60,50 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#define IRQ_MAX 15
#define IRQ_NONE -1U
/**
* Check if interrupt is enabled
*
* @v irq Interrupt number
* @ret enabled Interrupt is currently enabled
*/
static inline __attribute__ (( always_inline )) int
irq_enabled ( unsigned int irq ) {
int imr = inb ( IMR_REG ( irq ) );
int mask = IMR_BIT ( irq );
return ( ( imr & mask ) == 0 );
}
/**
* Enable interrupt
*
* @v irq Interrupt number
* @ret enabled Interrupt was previously enabled
*/
static inline __attribute__ (( always_inline )) int
enable_irq ( unsigned int irq ) {
int imr = inb ( IMR_REG ( irq ) );
int mask = IMR_BIT ( irq );
outb ( ( imr & ~mask ), IMR_REG ( irq ) );
return ( ( imr & mask ) == 0 );
}
/**
* Disable interrupt
*
* @v irq Interrupt number
* @ret enabled Interrupt was previously enabled
*/
static inline __attribute__ (( always_inline )) int
disable_irq ( unsigned int irq ) {
int imr = inb ( IMR_REG ( irq ) );
int mask = IMR_BIT ( irq );
outb ( ( imr | mask ), IMR_REG ( irq ) );
return ( ( imr & mask ) == 0 );
}
/* Function prototypes
*/
void send_eoi ( unsigned int irq );
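
The pic8259.h hunk replaces the irq_enabled/enable_irq/disable_irq macros with inline functions that read the 8259 interrupt mask register once and report whether the line was previously enabled, which lets a caller mask an IRQ temporarily and restore the old state afterwards. A standalone sketch of that save-and-restore pattern against a simulated mask register (no port I/O; the toy_* names are invented):

#include <stdint.h>
#include <stdio.h>

/* Simulated 8259 interrupt mask register (a set bit means masked) */
static uint8_t imr = 0xff;

/* Disable an IRQ line, returning whether it was previously enabled */
static int toy_disable_irq ( unsigned int irq ) {
        uint8_t mask = ( 1 << ( irq % 8 ) );
        int was_enabled = ( ( imr & mask ) == 0 );

        imr |= mask;
        return was_enabled;
}

/* Enable an IRQ line, returning whether it was previously enabled */
static int toy_enable_irq ( unsigned int irq ) {
        uint8_t mask = ( 1 << ( irq % 8 ) );
        int was_enabled = ( ( imr & mask ) == 0 );

        imr &= ~mask;
        return was_enabled;
}

int main ( void ) {
        int was_enabled;

        toy_enable_irq ( 3 );           /* start with IRQ 3 enabled */

        /* Temporarily mask IRQ 3, then restore its previous state */
        was_enabled = toy_disable_irq ( 3 );
        printf ( "critical section, IMR %#02x\n", imr );
        if ( was_enabled )
                toy_enable_irq ( 3 );
        printf ( "restored, IMR %#02x\n", imr );
        return 0;
}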

View File

@ -85,8 +85,6 @@ struct pxe_api_call {
* @ret exit PXE API call exit code
*/
PXENV_EXIT_t ( * entry ) ( union u_PXENV_ANY *params );
/** Length of parameters */
uint16_t params_len;
/** Opcode */
uint16_t opcode;
};
@ -112,7 +110,6 @@ struct pxe_api_call {
( union u_PXENV_ANY *params ) ) _entry ) \
: ( ( PXENV_EXIT_t ( * ) \
( union u_PXENV_ANY *params ) ) _entry ) ), \
.params_len = sizeof ( _params_type ), \
.opcode = _opcode, \
}

View File

@ -2,7 +2,9 @@
#define REALMODE_H
#include <stdint.h>
#include <string.h>
#include <registers.h>
#include <librm.h>
#include <ipxe/uaccess.h>
/*
@ -65,15 +67,15 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
*/
/**
* Convert segment:offset address to user buffer
* Convert segment:offset address to virtual address
*
* @v segment Real-mode segment
* @v offset Real-mode offset
* @ret buffer User buffer
* @ret virt Virtual address
*/
static inline __always_inline userptr_t
real_to_user ( unsigned int segment, unsigned int offset ) {
return ( phys_to_user ( ( segment << 4 ) + offset ) );
static inline __always_inline void *
real_to_virt ( unsigned int segment, unsigned int offset ) {
return ( phys_to_virt ( ( segment << 4 ) + offset ) );
}
/**
@ -87,7 +89,7 @@ real_to_user ( unsigned int segment, unsigned int offset ) {
static inline __always_inline void
copy_to_real ( unsigned int dest_seg, unsigned int dest_off,
void *src, size_t n ) {
copy_to_user ( real_to_user ( dest_seg, dest_off ), 0, src, n );
memcpy ( real_to_virt ( dest_seg, dest_off ), src, n );
}
/**
@ -101,7 +103,7 @@ copy_to_real ( unsigned int dest_seg, unsigned int dest_off,
static inline __always_inline void
copy_from_real ( void *dest, unsigned int src_seg,
unsigned int src_off, size_t n ) {
copy_from_user ( dest, real_to_user ( src_seg, src_off ), 0, n );
memcpy ( dest, real_to_virt ( src_seg, src_off ), n );
}
/**
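
real_to_virt() in realmode.h now goes straight from a segment:offset pair to a virtual address via phys_to_virt ( ( segment << 4 ) + offset ), and copy_to_real()/copy_from_real() collapse to plain memcpy() wrappers. The underlying arithmetic is the usual 20-bit real-mode address calculation; a standalone illustration that stops at the linear address (no virtual mapping involved):

#include <stdio.h>

/* Linear address of a real-mode segment:offset pair */
static unsigned long real_to_linear ( unsigned int segment,
                                      unsigned int offset ) {
        return ( ( ( unsigned long ) segment << 4 ) + offset );
}

int main ( void ) {
        /* Conventional PXE/boot-sector load address */
        printf ( "0000:7c00 -> %05lx\n", real_to_linear ( 0x0000, 0x7c00 ) );
        /* BIOS data area warm-reboot flag */
        printf ( "0040:0072 -> %05lx\n", real_to_linear ( 0x0040, 0x0072 ) );
        /* System reset vector */
        printf ( "f000:fff0 -> %05lx\n", real_to_linear ( 0xf000, 0xfff0 ) );
        return 0;
}

The example addresses are ones that appear elsewhere in this changeset: the 0000:7c00 PXE load buffer, the 0040:0072 BIOS reboot flag, and the f000:fff0 reset vector.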

View File

@ -102,20 +102,19 @@ static void acpi_udelay ( unsigned long usecs ) {
* @ret rc Return status code
*/
static int acpi_timer_probe ( void ) {
struct acpi_fadt fadtab;
userptr_t fadt;
const struct acpi_fadt *fadt;
unsigned int pm_tmr_blk;
/* Locate FADT */
fadt = acpi_table ( FADT_SIGNATURE, 0 );
fadt = container_of ( acpi_table ( FADT_SIGNATURE, 0 ),
struct acpi_fadt, acpi );
if ( ! fadt ) {
DBGC ( &acpi_timer, "ACPI could not find FADT\n" );
return -ENOENT;
}
/* Read FADT */
copy_from_user ( &fadtab, fadt, 0, sizeof ( fadtab ) );
pm_tmr_blk = le32_to_cpu ( fadtab.pm_tmr_blk );
pm_tmr_blk = le32_to_cpu ( fadt->pm_tmr_blk );
if ( ! pm_tmr_blk ) {
DBGC ( &acpi_timer, "ACPI has no timer\n" );
return -ENOENT;

View File

@ -23,6 +23,7 @@
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <byteswap.h>
@ -62,8 +63,8 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
* uglier hacks I have ever implemented, but it's still prettier than
* the ACPI specification itself.
*/
static int acpi_extract_sx ( userptr_t zsdt, size_t len, size_t offset,
void *data ) {
static int acpi_extract_sx ( const struct acpi_header *zsdt, size_t len,
size_t offset, void *data ) {
unsigned int *sx = data;
uint8_t bytes[4];
uint8_t *byte;
@ -77,7 +78,8 @@ static int acpi_extract_sx ( userptr_t zsdt, size_t len, size_t offset,
}
/* Read first four bytes of value */
copy_from_user ( bytes, zsdt, offset, sizeof ( bytes ) );
memcpy ( bytes, ( ( ( const void * ) zsdt ) + offset ),
sizeof ( bytes ) );
DBGC ( colour, "ACPI found \\_Sx containing %02x:%02x:%02x:%02x\n",
bytes[0], bytes[1], bytes[2], bytes[3] );
@ -111,8 +113,7 @@ static int acpi_extract_sx ( userptr_t zsdt, size_t len, size_t offset,
* @ret rc Return status code
*/
int acpi_poweroff ( void ) {
struct acpi_fadt fadtab;
userptr_t fadt;
const struct acpi_fadt *fadt;
unsigned int pm1a_cnt_blk;
unsigned int pm1b_cnt_blk;
unsigned int pm1a_cnt;
@ -123,16 +124,16 @@ int acpi_poweroff ( void ) {
int rc;
/* Locate FADT */
fadt = acpi_table ( FADT_SIGNATURE, 0 );
fadt = container_of ( acpi_table ( FADT_SIGNATURE, 0 ),
struct acpi_fadt, acpi );
if ( ! fadt ) {
DBGC ( colour, "ACPI could not find FADT\n" );
return -ENOENT;
}
/* Read FADT */
copy_from_user ( &fadtab, fadt, 0, sizeof ( fadtab ) );
pm1a_cnt_blk = le32_to_cpu ( fadtab.pm1a_cnt_blk );
pm1b_cnt_blk = le32_to_cpu ( fadtab.pm1b_cnt_blk );
pm1a_cnt_blk = le32_to_cpu ( fadt->pm1a_cnt_blk );
pm1b_cnt_blk = le32_to_cpu ( fadt->pm1b_cnt_blk );
pm1a_cnt = ( pm1a_cnt_blk + ACPI_PM1_CNT );
pm1b_cnt = ( pm1b_cnt_blk + ACPI_PM1_CNT );
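
Both ACPI call sites above stop copying the FADT into a local structure and instead wrap the const struct acpi_header * returned by acpi_table() in container_of() to reach the enclosing struct acpi_fadt. Checking the result against NULL is only meaningful because the header is the first member of the FADT (offset zero), so a NULL header maps back to a NULL FADT pointer. A generic standalone sketch of that idiom with an invented record type and a locally defined toy_container_of() (not iPXE's container_of()):

#include <stddef.h>
#include <stdio.h>

/* Locally defined equivalent of the container_of() idiom */
#define toy_container_of( ptr, type, member ) \
        ( ( type * ) ( ( ( void * ) ( ptr ) ) - offsetof ( type, member ) ) )

struct toy_header {
        char signature[4];
};

struct toy_fadt {
        struct toy_header acpi; /* must remain the first member for the
                                 * NULL check below to be meaningful */
        unsigned int pm_tmr_blk;
};

/* Stand-in for acpi_table(): return the embedded header, or NULL */
static const struct toy_header * find_table ( const struct toy_fadt *fadt ) {
        return ( fadt ? &fadt->acpi : NULL );
}

int main ( void ) {
        static const struct toy_fadt table = { { { 'F', 'A', 'C', 'P' } },
                                               0x608 };
        const struct toy_fadt *fadt;
        const struct toy_fadt *missing;

        fadt = toy_container_of ( find_table ( &table ), struct toy_fadt,
                                  acpi );
        if ( fadt )
                printf ( "PM timer block at %#x\n", fadt->pm_tmr_blk );

        missing = toy_container_of ( find_table ( NULL ), struct toy_fadt,
                                     acpi );
        if ( ! missing )
                printf ( "no table found\n" );
        return 0;
}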

View File

@ -27,7 +27,7 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <realmode.h>
#include <bios.h>
#include <basemem.h>
#include <ipxe/hidemem.h>
#include <ipxe/memmap.h>
/** @file
*

View File

@ -24,6 +24,7 @@
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#include <string.h>
#include <ipxe/init.h>
#include <ipxe/cachedhcp.h>
#include <realmode.h>
@ -60,7 +61,7 @@ static void cachedhcp_init ( void ) {
/* Record cached DHCPACK */
if ( ( rc = cachedhcp_record ( &cached_dhcpack, 0,
phys_to_user ( cached_dhcpack_phys ),
phys_to_virt ( cached_dhcpack_phys ),
sizeof ( BOOTPLAYER_t ) ) ) != 0 ) {
DBGC ( colour, "CACHEDHCP could not record DHCPACK: %s\n",
strerror ( rc ) );
@ -73,5 +74,6 @@ static void cachedhcp_init ( void ) {
/** Cached DHCPACK initialisation function */
struct init_fn cachedhcp_init_fn __init_fn ( INIT_NORMAL ) = {
.name = "cachedhcp",
.initialise = cachedhcp_init,
};

View File

@ -30,6 +30,7 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
*/
#include <registers.h>
#include <librm.h>
#include <ipxe/uaccess.h>
#include <ipxe/timer.h>
#include <ipxe/msr.h>
@ -92,9 +93,9 @@ static void bios_mp_exec_boot ( mp_func_t func, void *opaque ) {
"pushl %k1\n\t"
"call *%k0\n\t"
"addl $8, %%esp\n\t" )
: : "r" ( mp_address ( mp_call ) ),
"r" ( mp_address ( func ) ),
"r" ( mp_address ( opaque ) ) );
: : "R" ( mp_address ( mp_call ) ),
"R" ( mp_address ( func ) ),
"R" ( mp_address ( opaque ) ) );
}
/**

View File

@ -29,6 +29,7 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
*
*/
#include <string.h>
#include <ipxe/reboot.h>
#include <realmode.h>
#include <bios.h>
@ -38,14 +39,14 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
/**
* Reboot system
*
* @v warm Perform a warm reboot
* @v flags Reboot flags
*/
static void bios_reboot ( int warm ) {
uint16_t flag;
static void bios_reboot ( int flags ) {
uint16_t type;
/* Configure BIOS for cold/warm reboot */
flag = ( warm ? BDA_REBOOT_WARM : 0 );
put_real ( flag, BDA_SEG, BDA_REBOOT );
type = ( ( flags & REBOOT_WARM ) ? BDA_REBOOT_WARM : 0 );
put_real ( type, BDA_SEG, BDA_REBOOT );
/* Jump to system reset vector */
__asm__ __volatile__ ( REAL_CODE ( "ljmp $0xf000, $0xfff0" ) : );

View File

@ -45,19 +45,18 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
* @ret rc Return status code
*/
static int bios_find_smbios2 ( struct smbios *smbios ) {
struct smbios_entry entry;
int rc;
const struct smbios_entry *entry;
/* Scan through BIOS segment to find SMBIOS 32-bit entry point */
if ( ( rc = find_smbios_entry ( real_to_user ( BIOS_SEG, 0 ), 0x10000,
&entry ) ) != 0 )
return rc;
entry = find_smbios_entry ( real_to_virt ( BIOS_SEG, 0 ), 0x10000 );
if ( ! entry )
return -ENOENT;
/* Fill in entry point descriptor structure */
smbios->address = phys_to_user ( entry.smbios_address );
smbios->len = entry.smbios_len;
smbios->count = entry.smbios_count;
smbios->version = SMBIOS_VERSION ( entry.major, entry.minor );
smbios->address = phys_to_virt ( entry->smbios_address );
smbios->len = entry->smbios_len;
smbios->count = entry->smbios_count;
smbios->version = SMBIOS_VERSION ( entry->major, entry->minor );
return 0;
}
@ -69,26 +68,25 @@ static int bios_find_smbios2 ( struct smbios *smbios ) {
* @ret rc Return status code
*/
static int bios_find_smbios3 ( struct smbios *smbios ) {
struct smbios3_entry entry;
int rc;
const struct smbios3_entry *entry;
/* Scan through BIOS segment to find SMBIOS 64-bit entry point */
if ( ( rc = find_smbios3_entry ( real_to_user ( BIOS_SEG, 0 ), 0x10000,
&entry ) ) != 0 )
return rc;
entry = find_smbios3_entry ( real_to_virt ( BIOS_SEG, 0 ), 0x10000 );
if ( ! entry )
return -ENOENT;
/* Check that address is accessible */
if ( entry.smbios_address > ~( ( physaddr_t ) 0 ) ) {
if ( entry->smbios_address > ~( ( physaddr_t ) 0 ) ) {
DBG ( "SMBIOS3 at %08llx is inaccessible\n",
( ( unsigned long long ) entry.smbios_address ) );
( ( unsigned long long ) entry->smbios_address ) );
return -ENOTSUP;
}
/* Fill in entry point descriptor structure */
smbios->address = phys_to_user ( entry.smbios_address );
smbios->len = entry.smbios_len;
smbios->address = phys_to_virt ( entry->smbios_address );
smbios->len = entry->smbios_len;
smbios->count = 0;
smbios->version = SMBIOS_VERSION ( entry.major, entry.minor );
smbios->version = SMBIOS_VERSION ( entry->major, entry->minor );
return 0;
}
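
The SMBIOS hunks change find_smbios_entry()/find_smbios3_entry() from filling in a caller-supplied structure and returning a status code to returning a const pointer into the scanned region, or NULL on failure, so the caller reads the entry fields in place. A small standalone sketch of scanning a region on 16-byte boundaries for an anchor string and returning a pointer to the match (the toy_entry layout is invented, not the real SMBIOS entry point):

#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Invented entry-point structure located by a 4-byte anchor string */
struct toy_entry {
        char anchor[4];
        unsigned char major;
        unsigned char minor;
} __attribute__ (( packed ));

/* Scan [start, start+len) on 16-byte boundaries; return entry or NULL */
static const struct toy_entry * find_entry ( const void *start, size_t len ) {
        const struct toy_entry *entry;
        size_t offset;

        for ( offset = 0 ; ( offset + sizeof ( *entry ) ) <= len ;
              offset += 16 ) {
                entry = ( start + offset );
                if ( memcmp ( entry->anchor, "_SM_", 4 ) == 0 )
                        return entry;
        }
        return NULL;
}

int main ( void ) {
        unsigned char region[64];
        const struct toy_entry *entry;

        memset ( region, 0, sizeof ( region ) );
        memcpy ( ( region + 32 ), "_SM_\x03\x02", 6 );  /* fake v3.2 entry */

        entry = find_entry ( region, sizeof ( region ) );
        if ( ! entry )
                return 1;
        printf ( "found entry at +%td, version %d.%d\n",
                 ( ( const char * ) entry - ( const char * ) region ),
                 entry->major, entry->minor );
        return 0;
}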

View File

@ -1,3 +1,4 @@
#include <string.h>
#include <errno.h>
#include <realmode.h>
#include <biosint.h>

View File

@ -564,6 +564,8 @@ int15_88:
int15:
/* See if we want to intercept this call */
pushfw
cmpb $0, %cs:int15_intercept_flag
je 3f
cmpw $0xe820, %ax
jne 1f
cmpl $SMAP, %edx
@ -587,3 +589,9 @@ int15:
int15_vector:
.long 0
.size int15_vector, . - int15_vector
.section ".text16.data", "aw", @progbits
.globl int15_intercept_flag
int15_intercept_flag:
.byte 1
.size int15_intercept_flag, . - int15_intercept_flag

View File

@ -22,6 +22,7 @@
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <string.h>
#include <assert.h>
#include <realmode.h>
#include <biosint.h>
@ -29,7 +30,8 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <fakee820.h>
#include <ipxe/init.h>
#include <ipxe/io.h>
#include <ipxe/hidemem.h>
#include <ipxe/uheap.h>
#include <ipxe/memmap.h>
/** Set to true if you want to test a fake E820 map */
#define FAKE_E820 0
@ -72,13 +74,17 @@ extern void int15();
extern struct segoff __text16 ( int15_vector );
#define int15_vector __use_text16 ( int15_vector )
/** INT 15 interception flag */
extern uint8_t __text16 ( int15_intercept_flag );
#define int15_intercept_flag __use_text16 ( int15_intercept_flag )
/* The linker defines these symbols for us */
extern char _textdata[];
extern char _etextdata[];
extern char _text16_memsz[];
#define _text16_memsz ( ( size_t ) _text16_memsz )
extern char _data16_memsz[];
#define _data16_memsz ( ( size_t ) _data16_memsz )
extern size_t ABS_SYMBOL ( _text16_memsz );
#define _text16_memsz ABS_VALUE ( _text16_memsz )
extern size_t ABS_SYMBOL ( _data16_memsz );
#define _data16_memsz ABS_VALUE ( _data16_memsz )
/**
* Hide region of memory from system memory map
@ -112,15 +118,6 @@ void hide_basemem ( void ) {
hidemem_base.start = ( get_fbms() * 1024 );
}
/**
* Hide umalloc() region
*
*/
void hide_umalloc ( physaddr_t start, physaddr_t end ) {
assert ( end <= virt_to_phys ( _textdata ) );
hide_region ( &hidemem_umalloc, start, end );
}
/**
* Hide .text and .data
*
@ -130,6 +127,37 @@ void hide_textdata ( void ) {
virt_to_phys ( _etextdata ) );
}
/**
* Synchronise in-use regions with the externally visible system memory map
*
*/
static void int15_sync ( void ) {
physaddr_t start;
physaddr_t end;
/* Besides our fixed base memory and textdata regions, we
* support hiding only a single in-use memory region (the
* umalloc region), which must be placed before the hidden
* textdata region (even if zero-length).
*/
start = uheap_start;
end = uheap_end;
if ( start == end )
start = end = virt_to_phys ( _textdata );
hide_region ( &hidemem_umalloc, start, end );
}
/**
* Set INT 15 interception flag
*
* @v intercept Intercept INT 15 calls to modify memory map
*/
void int15_intercept ( int intercept ) {
/* Set flag for INT 15 handler */
int15_intercept_flag = intercept;
}
/**
* Hide Etherboot
*
@ -137,26 +165,25 @@ void hide_textdata ( void ) {
* returned by the BIOS.
*/
static void hide_etherboot ( void ) {
struct memory_map memmap;
unsigned int rm_ds_top;
unsigned int rm_cs_top;
unsigned int fbms;
/* Dump memory map before mangling */
DBG ( "Hiding iPXE from system memory map\n" );
get_memmap ( &memmap );
memmap_dump_all ( 1 );
/* Hook in fake E820 map, if we're testing one */
if ( FAKE_E820 ) {
DBG ( "Hooking in fake E820 map\n" );
fake_e820();
get_memmap ( &memmap );
memmap_dump_all ( 1 );
}
/* Initialise the hidden regions */
hide_basemem();
hide_umalloc ( virt_to_phys ( _textdata ), virt_to_phys ( _textdata ) );
hide_textdata();
int15_sync();
/* Some really moronic BIOSes bring up the PXE stack via the
* UNDI loader entry point and then don't bother to unload it
@ -183,7 +210,7 @@ static void hide_etherboot ( void ) {
/* Dump memory map after mangling */
DBG ( "Hidden iPXE from system memory map\n" );
get_memmap ( &memmap );
memmap_dump_all ( 1 );
}
/**
@ -193,7 +220,6 @@ static void hide_etherboot ( void ) {
* possible.
*/
static void unhide_etherboot ( int flags __unused ) {
struct memory_map memmap;
int rc;
/* If we have more than one hooked interrupt at this point, it
@ -224,7 +250,7 @@ static void unhide_etherboot ( int flags __unused ) {
/* Dump memory map after unhiding */
DBG ( "Unhidden iPXE from system memory map\n" );
get_memmap ( &memmap );
memmap_dump_all ( 1 );
}
/** Hide Etherboot startup function */
@ -233,3 +259,5 @@ struct startup_fn hide_etherboot_startup_fn __startup_fn ( STARTUP_EARLY ) = {
.startup = hide_etherboot,
.shutdown = unhide_etherboot,
};
PROVIDE_MEMMAP ( int15, memmap_sync, int15_sync );

View File

@ -25,17 +25,18 @@ FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <byteswap.h>
#include <errno.h>
#include <assert.h>
#include <ipxe/blockdev.h>
#include <ipxe/io.h>
#include <ipxe/acpi.h>
#include <ipxe/sanboot.h>
#include <ipxe/device.h>
#include <ipxe/pci.h>
#include <ipxe/eltorito.h>
#include <ipxe/memmap.h>
#include <realmode.h>
#include <bios.h>
#include <biosint.h>
@ -181,8 +182,7 @@ static int int13_parse_eltorito ( struct san_device *sandev, void *scratch ) {
int rc;
/* Read boot record volume descriptor */
if ( ( rc = sandev_read ( sandev, ELTORITO_LBA, 1,
virt_to_user ( boot ) ) ) != 0 ) {
if ( ( rc = sandev_read ( sandev, ELTORITO_LBA, 1, boot ) ) != 0 ) {
DBGC ( sandev->drive, "INT13 drive %02x could not read El "
"Torito boot record volume descriptor: %s\n",
sandev->drive, strerror ( rc ) );
@ -228,7 +228,7 @@ static int int13_guess_geometry_hdd ( struct san_device *sandev, void *scratch,
int rc;
/* Read partition table */
if ( ( rc = sandev_read ( sandev, 0, 1, virt_to_user ( mbr ) ) ) != 0 ) {
if ( ( rc = sandev_read ( sandev, 0, 1, mbr ) ) != 0 ) {
DBGC ( sandev->drive, "INT13 drive %02x could not read "
"partition table to guess geometry: %s\n",
sandev->drive, strerror ( rc ) );
@ -517,12 +517,12 @@ static int int13_rw_sectors ( struct san_device *sandev,
int ( * sandev_rw ) ( struct san_device *sandev,
uint64_t lba,
unsigned int count,
userptr_t buffer ) ) {
void *buffer ) ) {
struct int13_data *int13 = sandev->priv;
unsigned int cylinder, head, sector;
unsigned long lba;
unsigned int count;
userptr_t buffer;
void *buffer;
int rc;
/* Validate blocksize */
@ -549,7 +549,7 @@ static int int13_rw_sectors ( struct san_device *sandev,
lba = ( ( ( ( cylinder * int13->heads ) + head )
* int13->sectors_per_track ) + sector - 1 );
count = ix86->regs.al;
buffer = real_to_user ( ix86->segs.es, ix86->regs.bx );
buffer = real_to_virt ( ix86->segs.es, ix86->regs.bx );
DBGC2 ( sandev->drive, "C/H/S %d/%d/%d = LBA %08lx <-> %04x:%04x "
"(count %d)\n", cylinder, head, sector, lba, ix86->segs.es,
@ -710,12 +710,12 @@ static int int13_extended_rw ( struct san_device *sandev,
int ( * sandev_rw ) ( struct san_device *sandev,
uint64_t lba,
unsigned int count,
userptr_t buffer ) ) {
void *buffer ) ) {
struct int13_disk_address addr;
uint8_t bufsize;
uint64_t lba;
unsigned long count;
userptr_t buffer;
void *buffer;
int rc;
/* Extended reads are not allowed on floppy drives.
@ -743,11 +743,11 @@ static int int13_extended_rw ( struct san_device *sandev,
if ( ( addr.count == 0xff ) ||
( ( addr.buffer.segment == 0xffff ) &&
( addr.buffer.offset == 0xffff ) ) ) {
buffer = phys_to_user ( addr.buffer_phys );
buffer = phys_to_virt ( addr.buffer_phys );
DBGC2 ( sandev->drive, "%08llx",
( ( unsigned long long ) addr.buffer_phys ) );
} else {
buffer = real_to_user ( addr.buffer.segment,
buffer = real_to_virt ( addr.buffer.segment,
addr.buffer.offset );
DBGC2 ( sandev->drive, "%04x:%04x", addr.buffer.segment,
addr.buffer.offset );
@ -1058,7 +1058,7 @@ static int int13_cdrom_read_boot_catalog ( struct san_device *sandev,
/* Read from boot catalog */
if ( ( rc = sandev_read ( sandev, start, command.count,
phys_to_user ( command.buffer ) ) ) != 0 ) {
phys_to_virt ( command.buffer ) ) ) != 0 ) {
DBGC ( sandev->drive, "INT13 drive %02x could not read boot "
"catalog: %s\n", sandev->drive, strerror ( rc ) );
return -INT13_STATUS_READ_ERROR;
@ -1455,8 +1455,8 @@ static int int13_load_eltorito ( unsigned int drive, struct segoff *address ) {
"catalog (status %04x)\n", drive, status );
return -EIO;
}
copy_from_user ( &catalog, phys_to_user ( eltorito_cmd.buffer ), 0,
sizeof ( catalog ) );
memcpy ( &catalog, phys_to_virt ( eltorito_cmd.buffer ),
sizeof ( catalog ) );
/* Sanity checks */
if ( catalog.valid.platform_id != ELTORITO_PLATFORM_X86 ) {
@@ -1523,7 +1523,6 @@ static int int13_load_eltorito ( unsigned int drive, struct segoff *address ) {
*/
static int int13_boot ( unsigned int drive,
struct san_boot_config *config __unused ) {
struct memory_map memmap;
struct segoff address;
int rc;
@@ -1537,7 +1536,7 @@ static int int13_boot ( unsigned int drive,
* many problems that turn out to be memory-map related that
* it's worth doing.
*/
get_memmap ( &memmap );
memmap_dump_all ( 1 );
/* Jump to boot sector */
if ( ( rc = call_bootsector ( address.segment, address.offset,

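The hunks above all apply the same mechanical conversion: opaque userptr_t buffers become plain void * pointers, real_to_user()/phys_to_user() become real_to_virt()/phys_to_virt(), and copy_from_user() becomes a plain memcpy(). A minimal sketch of the pattern (the helper below is hypothetical and not code from this diff; it uses only the calls visible in the hunks above):

	#include <stddef.h>
	#include <string.h>
	#include <realmode.h>

	/* Hypothetical helper: read a caller-supplied descriptor addressed
	 * by a real-mode segment:offset pair, using the flat void * pointers
	 * that replace userptr_t throughout this file.
	 */
	static void example_read_from_caller ( unsigned int segment,
					       unsigned int offset,
					       void *dest, size_t len ) {
		void *buffer;

		/* Previously: buffer = real_to_user ( segment, offset );
		 *             copy_from_user ( dest, buffer, 0, len );
		 */
		buffer = real_to_virt ( segment, offset );
		memcpy ( dest, buffer, len );
	}

The same substitution applies to physical addresses: phys_to_virt ( addr ) yields a pointer that can be passed directly to memcpy() or to sandev_read().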

@@ -288,6 +288,7 @@ static void int13con_init ( void ) {
* INT13 console initialisation function
*/
struct init_fn int13con_init_fn __init_fn ( INIT_CONSOLE ) = {
.name = "int13con",
.initialise = int13con_init,
};
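The hunk above simply gives the console initialisation function a name for debug output. For illustration, a hypothetical feature registering its own initialisation function in the same style (the function and its name are invented; INIT_CONSOLE is reused from the hunk above, and <ipxe/init.h> is assumed to provide struct init_fn and __init_fn()):

	#include <ipxe/init.h>

	/* Hypothetical one-time setup routine */
	static void example_init ( void ) {
		/* ... perform one-time initialisation here ... */
	}

	/* Registered via the init function linker table, with a debug name */
	struct init_fn example_init_fn __init_fn ( INIT_CONSOLE ) = {
		.name = "example",
		.initialise = example_init,
	};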


@@ -24,11 +24,14 @@
FILE_LICENCE ( GPL2_OR_LATER_OR_UBDL );
#include <stdint.h>
#include <string.h>
#include <errno.h>
#include <assert.h>
#include <realmode.h>
#include <bios.h>
#include <memsizes.h>
#include <ipxe/io.h>
#include <ipxe/memmap.h>
/**
* @file
@@ -151,7 +154,7 @@ static unsigned int extmemsize_88 ( void ) {
* @ret extmem Extended memory size, in kB
*
* Note that this is only an approximation; for an accurate picture,
* use the E820 memory map obtained via get_memmap();
* use the E820 memory map obtained via memmap_describe();
*/
unsigned int extmemsize ( void ) {
unsigned int extmem_e801;
@@ -166,12 +169,13 @@ unsigned int extmemsize ( void ) {
/**
* Get e820 memory map
*
* @v memmap Memory map to fill in
* @v region Memory region of interest to be updated
* @ret rc Return status code
*/
static int meme820 ( struct memory_map *memmap ) {
struct memory_region *region = memmap->regions;
struct memory_region *prev_region = NULL;
static int meme820 ( struct memmap_region *region ) {
unsigned int count = 0;
uint64_t start = 0;
uint64_t len = 0;
uint32_t next = 0;
uint32_t smap;
uint32_t size;
@@ -225,13 +229,6 @@ static int meme820 ( struct memory_map *memmap ) {
break;
}
/* If first region is not RAM, assume map is invalid */
if ( ( memmap->count == 0 ) &&
( e820buf.type != E820_TYPE_RAM ) ) {
DBG ( "INT 15,e820 failed, first entry not RAM\n" );
return -EINVAL;
}
DBG ( "INT 15,e820 region [%llx,%llx) type %d",
e820buf.start, ( e820buf.start + e820buf.len ),
( int ) e820buf.type );
@@ -258,27 +255,36 @@ static int meme820 ( struct memory_map *memmap ) {
continue;
}
region->start = e820buf.start;
region->end = e820buf.start + e820buf.len;
/* Check for adjacent regions and merge them */
if ( prev_region && ( region->start == prev_region->end ) ) {
prev_region->end = region->end;
if ( e820buf.start == ( start + len ) ) {
len += e820buf.len;
} else {
prev_region = region;
region++;
memmap->count++;
start = e820buf.start;
len = e820buf.len;
}
if ( memmap->count >= ( sizeof ( memmap->regions ) /
sizeof ( memmap->regions[0] ) ) ) {
DBG ( "INT 15,e820 too many regions returned\n" );
/* Not a fatal error; what we've got so far at
* least represents valid regions of memory,
* even if we couldn't get them all.
*/
break;
/* Sanity check: first region (base memory) should
* start at address zero.
*/
if ( ( count == 0 ) && ( start != 0 ) ) {
DBG ( "INT 15,e820 region 0 starts at %llx (expected "
"0); assuming insane\n", start );
return -EINVAL;
}
/* Sanity check: second region (extended memory)
* should start at address 0x100000.
*/
if ( ( count == 1 ) && ( start != 0x100000 ) ) {
DBG ( "INT 15,e820 region 1 starts at %llx (expected "
"100000); assuming insane\n", start );
return -EINVAL;
}
/* Update region of interest */
memmap_update ( region, start, len, MEMMAP_FL_MEMORY, "e820" );
count++;
} while ( next != 0 );
/* Sanity checks. Some BIOSes report complete garbage via INT
@@ -287,19 +293,9 @@ static int meme820 ( struct memory_map *memmap ) {
* region (starting at 0) and at least one high memory region
* (starting at 0x100000).
*/
if ( memmap->count < 2 ) {
if ( count < 2 ) {
DBG ( "INT 15,e820 returned only %d regions; assuming "
"insane\n", memmap->count );
return -EINVAL;
}
if ( memmap->regions[0].start != 0 ) {
DBG ( "INT 15,e820 region 0 starts at %llx (expected 0); "
"assuming insane\n", memmap->regions[0].start );
return -EINVAL;
}
if ( memmap->regions[1].start != 0x100000 ) {
DBG ( "INT 15,e820 region 1 starts at %llx (expected 100000); "
"assuming insane\n", memmap->regions[0].start );
"insane\n", count );
return -EINVAL;
}
@@ -307,37 +303,52 @@ static int meme820 ( struct memory_map *memmap ) {
}
/**
* Get memory map
* Describe memory region from system memory map
*
* @v memmap Memory map to fill in
* @v min Minimum address
* @v hide Hide in-use regions from the memory map
* @v region Region descriptor to fill in
*/
void x86_get_memmap ( struct memory_map *memmap ) {
unsigned int basemem, extmem;
static void int15_describe ( uint64_t min, int hide,
struct memmap_region *region ) {
unsigned int basemem;
unsigned int extmem;
uint64_t inaccessible;
int rc;
DBG ( "Fetching system memory map\n" );
/* Initialise region */
memmap_init ( min, region );
/* Clear memory map */
memset ( memmap, 0, sizeof ( *memmap ) );
/* Mark addresses above 4GB as inaccessible: we have no way to
* access them either in a 32-bit build or in a 64-bit build
* (since the 64-bit build identity-maps only the 32-bit
* address space).
*/
inaccessible = ( 1ULL << 32 );
memmap_update ( region, inaccessible, -inaccessible,
MEMMAP_FL_INACCESSIBLE, NULL );
/* Get base and extended memory sizes */
basemem = basememsize();
DBG ( "FBMS base memory size %d kB [0,%x)\n",
basemem, ( basemem * 1024 ) );
extmem = extmemsize();
/* Try INT 15,e820 first */
if ( ( rc = meme820 ( memmap ) ) == 0 ) {
/* Enable/disable INT 15 interception as applicable */
int15_intercept ( hide );
/* Try INT 15,e820 first, falling back to constructing a map
* from basemem and extmem sizes
*/
if ( ( rc = meme820 ( region ) ) == 0 ) {
DBG ( "Obtained system memory map via INT 15,e820\n" );
return;
} else {
basemem = basememsize();
DBG ( "FBMS base memory size %d kB [0,%x)\n",
basemem, ( basemem * 1024 ) );
extmem = extmemsize();
memmap_update ( region, 0, ( basemem * 1024 ),
MEMMAP_FL_MEMORY, "basemem" );
memmap_update ( region, 0x100000, ( extmem * 1024 ),
MEMMAP_FL_MEMORY, "extmem" );
}
/* Fall back to constructing a map from basemem and extmem sizes */
DBG ( "INT 15,e820 failed; constructing map\n" );
memmap->regions[0].end = ( basemem * 1024 );
memmap->regions[1].start = 0x100000;
memmap->regions[1].end = 0x100000 + ( extmem * 1024 );
memmap->count = 2;
/* Restore INT 15 interception */
int15_intercept ( 1 );
}
PROVIDE_IOAPI ( x86, get_memmap, x86_get_memmap );
PROVIDE_MEMMAP ( int15, memmap_describe, int15_describe );
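The rewritten meme820() no longer fills in a fixed-size array of regions; it accumulates each run of contiguous e820 entries into a single (start, len) pair and reports that run via memmap_update(). A standalone simplification of just the merge rule (plain C, made-up addresses, no iPXE headers):

	#include <stdint.h>
	#include <stdio.h>

	/* Simplified stand-in for one INT 15,e820 entry */
	struct e820_entry {
		uint64_t start;
		uint64_t len;
	};

	int main ( void ) {
		/* Illustrative entries; the last two are adjacent */
		struct e820_entry entries[] = {
			{ 0x0000000000000000ULL, 0x000000000009fc00ULL },
			{ 0x0000000000100000ULL, 0x000000003fe00000ULL },
			{ 0x000000003ff00000ULL, 0x0000000000100000ULL },
		};
		uint64_t start = 0;
		uint64_t len = 0;
		unsigned int i;

		for ( i = 0 ; i < ( sizeof ( entries ) /
				    sizeof ( entries[0] ) ) ; i++ ) {
			if ( entries[i].start == ( start + len ) ) {
				/* Contiguous with the current run: extend it */
				len += entries[i].len;
			} else {
				/* Gap: start a new run */
				start = entries[i].start;
				len = entries[i].len;
			}
			printf ( "after entry %u: run is [%#llx,%#llx)\n", i,
				 ( unsigned long long ) start,
				 ( unsigned long long ) ( start + len ) );
		}
		return 0;
	}

In the real driver each merged run is then passed to memmap_update ( region, start, len, MEMMAP_FL_MEMORY, "e820" ), so a base memory map split across adjacent e820 entries still satisfies the "first region starts at zero" sanity check.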
