We currently require the variable CROSS (or CROSS_COMPILE) to be set
to specify the global cross-compilation prefix. This becomes
cumbersome when developing across multiple CPU architectures,
requiring frequent editing of build command lines and preventing
incompatible architectures from being built with a single command.
Allow a default cross-compilation prefix for each architecture to be
specified via the CROSS_COMPILE_<arch> variables. These may then be
provided as environment variables, e.g. using
export CROSS_COMPILE_arm32=arm-linux-gnu-
export CROSS_COMPILE_arm64=aarch64-linux-gnu-
export CROSS_COMPILE_loong64=loongarch64-linux-gnu-
export CROSS_COMPILE_riscv32=riscv64-linux-gnu-
export CROSS_COMPILE_riscv64=riscv64-linux-gnu-
This change requires some portions of the Makefile to be rearranged,
to allow for the fact that $(CROSS_COMPILE) may not have been set
until the build directory has been parsed to determine the CPU
architecture.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The seed CSR defined by the Zkr extension is accessible only in M-mode
by default. Older versions of OpenSBI (prior to version 1.4) do not
set mseccfg.sseed, with the result that attempts to access the seed
CSR from S-mode will raise an illegal instruction exception.
Add a facility for testing the accessibility of arbitrary CSRs, and
use it to check that the seed CSR is accessible before reporting the
seed CSR entropy source as being functional.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add basic support for running directly on top of SBI, with no UEFI
firmware present. Build as e.g.:
make CROSS=riscv64-linux-gnu- bin-riscv64/ipxe.sbi
The resulting binary can be tested in QEMU using e.g.:
qemu-system-riscv64 -M virt -cpu max -serial stdio \
-kernel bin-riscv64/ipxe.sbi
No drivers or executable binary formats are supported yet, but the
unit test suite may be run successfully.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Restructure the parsing of the build directory name from
bin[[-<arch>]-<platform>]
to
bin[-<arch>[-<platform>]]
and allow for a per-architecture default build platform.
For the sake of backwards compatibility, handle "bin-efi" as a special
case equivalent to "bin-i386-efi".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The timer and entropy seed CSRs will, by design, return different
values each time they are read.
Add the missing volatile qualifiers on the inline assembly to prevent
gcc from assuming that repeated invocations may be elided.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Zkr entropy source extension defines a potentially unprivileged
seed CSR that can be read to obtain 16 bits of entropy input, with a
mandated requirement that 256 entropy input bits read from the seed
CSR will contain at least 128 bits of min-entropy.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Zicntr extension defines an unprivileged wall-clock time CSR that
roughly matches the behaviour of an invariant TSC on x86. The nominal
frequency of this timer may be read from the "timebase-frequency"
property of the CPU node in the device tree.
Add a timer source using RDTIME to provide implementations of udelay()
and currticks(), modelled on the existing RDTSC-based timer for x86.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RISC-V seems to allow for direct discovery of CPU features only from
M-mode (e.g. by setting up a trap handler and then attempting to
access a CSR), with S-mode code expected to read the resulting
constructed ISA description from the device tree.
Add the ability to check for the presence of named extensions listed
in the "riscv,isa" property of the device tree node corresponding to
the boot hart.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow for the existence of platforms with no PCI bus by including the
PCI settings mechanism only if PCI bus support is included.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Running with flat physical addressing is a fairly common early boot
environment. Rename UACCESS_EFI to UACCESS_FLAT so that this code may
be reused in non-UEFI boot environments that also use flat physical
addressing.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the ability to issue Supervisor Binary Interface (SBI) calls via
the ECALL instruction, and use the SBI DBCN extension to implement a
debug console.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Montgomery multiplication requires calculating the inverse of the
modulus modulo a larger power of two.
Add bigint_mod_invert() to calculate the inverse of any (odd) big
integer modulo an arbitrary power of two, using a lightly modified
version of the algorithm presented in "A New Algorithm for Inversion
mod p^k (Koç, 2017)".
The power of two is taken to be 2^k, where k is the number of bits
available in the big integer representation of the invertend. The
inverse modulo any smaller power of two may be obtained simply by
masking off the relevant bits in the inverse.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow scripts to read basic information from USB device descriptors
via the settings mechanism. For example:
echo USB vendor ID: ${usb/${busloc}.8.2}
echo USB device ID: ${usb/${busloc}.10.2}
echo USB manufacturer name: ${usb/${busloc}.14.0}
The general syntax is
usb/<bus:dev>.<offset>.<length>
where bus:dev is the USB bus:device address (as obtained via the
"usbscan" command, or from e.g. ${net0/busloc} for a USB network
device), and <offset> and <length> select the required portion of the
USB device descriptor.
Following the usage of SMBIOS settings tags, a <length> of zero may be
used to indicate that the byte at <offset> contains a USB string
descriptor index, and an <offset> of zero may be used to indicate that
the <length> contains a literal USB string descriptor index.
Since the byte at offset zero can never contain a string index, and a
literal string index can never be zero, the combination of both
<length> and <offset> being zero may be used to indicate that the
entire device descriptor is to be read as a raw hex dump.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Implement a "usbscan" command as a direct analogy of the existing
"pciscan" command, allowing scripts to iterate over all detected USB
devices.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Faster modular multiplication algorithms such as Montgomery
multiplication will still require the ability to perform a single
direct modular reduction.
Neaten up the implementation of direct reduction and split it out into
a separate bigint_reduce() function, complete with its own unit tests.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Every architecture uses the same implementation for bigint_is_set(),
and there is no reason to suspect that a future CPU architecture will
provide a more efficient way to implement this operation.
Simplify the code by providing a single architecture-independent
implementation of bigint_is_set().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The big integer shift operations are misleadingly described as
rotations since the original x86 implementations are essentially
trivial loops around the relevant rotate-through-carry instruction.
The overall operation performed is a shift rather than a rotation.
Update the function names and descriptions to reflect this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An n-bit multiplication product may be added to up to two n-bit
integers without exceeding the range of a (2n)-bit integer:
(2^n - 1)*(2^n - 1) + (2^n - 1) + (2^n - 1) = 2^(2n) - 1
Exploit this to perform big integer multiplication in constant time
without requiring the caller to provide temporary carry space.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for building as a Linux userspace binary for AArch32.
This allows the self-test suite to be more easily run for the 32-bit
ARM code. For example:
make CROSS=arm-linux-gnu- bin-arm32-linux/tests.linux
qemu-arm -L /usr/arm-linux-gnu/sys-root/ \
./bin-arm32-linux/tests.linux
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Reading from PMCCNTR causes an undefined instruction exception when
running in PL0 (e.g. as a Linux userspace binary), unless the
PMUSERENR.EN bit is set.
Restructure profile_timestamp() for 32-bit ARM to perform an
availability check on the first invocation, with subsequent
invocations returning zero if PMCCNTR could not be enabled.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
All consumers of profile_timestamp() currently treat the value as an
unsigned long. Only the elapsed number of ticks is ever relevant: the
absolute value of the timestamp is not used. Profiling is used to
measure short durations that are generally fewer than a million CPU
cycles, for which an unsigned long is easily large enough.
Standardise the return type of profile_timestamp() as unsigned long
across all CPU architectures. This allows 32-bit architectures such
as i386 and riscv32 to omit all logic associated with retrieving the
upper 32 bits of the 64-bit hardware counter, which simplifies the
code and allows riscv32 and riscv64 to share the same implementation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Big integer multiplication currently performs immediate carry
propagation from each step of the long multiplication, relying on the
fact that the overall result has a known maximum value to minimise the
number of carries performed without ever needing to explicitly check
against the result buffer size.
This is not a constant-time algorithm, since the number of carries
performed will be a function of the input values. We could make it
constant-time by always continuing to propagate the carry until
reaching the end of the result buffer, but this would introduce a
large number of redundant zero carries.
Require callers of bigint_multiply() to provide a temporary carry
storage buffer, of the same size as the result buffer. This allows
the carry-out from the accumulation of each double-element product to
be accumulated in the temporary carry space, and then added in via a
single call to bigint_add() after the multiplication is complete.
Since the structure of big integer multiplication is identical across
all current CPU architectures, provide a single shared implementation
of bigint_multiply(). The architecture-specific operation then
becomes the multiplication of two big integer elements and the
accumulation of the double-element product.
Note that any intermediate carry arising from accumulating the lower
half of the double-element product may be added to the upper half of
the double-element product without risk of overflow, since the result
of multiplying two n-bit integers can never have all n bits set in its
upper half. This simplifies the carry calculations for architectures
such as RISC-V and LoongArch64 that do not have a carry flag.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The admin queue API requires us to tell the device how many event
counters we have provided via the "configure device resources" admin
queue command. There is, of course, absolutely no documentation
indicating how many event counters actually need to be provided.
We require only two event counters: one for the transmit queue, one
for the receive queue. (The receive queue doesn't seem to actually
make any use of its event counter, but the "create receive queue"
admin queue command will fail if it doesn't have an available event
counter to choose.)
In the absence of any documentation, we currently make the assumption
that allocating and configuring 16 counters (i.e. one whole cacheline)
will be sufficient to allow for the use of two counters.
This assumption turns out to be incorrect. On larger instance types
(observed with a c3d-standard-16 instance in europe-west4-a), we find
that creating the transmit or receive queues will each fail with a
probability of around 50% with the "failed precondition" error code.
Experimentation suggests that even though the device has accepted our
"configure device resources" command indicating that we are providing
only 16 event counters, it will attempt to choose any of its potential
32 event counters (and will then fail since the event counter that it
unilaterally chose is outside of the agreed range).
Work around this firmware bug by always allocating the maximum number
of event counters supported by the device. (This requires deferring
the allocation of the event counters until after issuing the "describe
device" command.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As of commit 79c0173 ("[build] Create util/genfsimg for building
filesystem-based images"), the EFI boot file name for each CPU
architecture is defined within the genfsimg script itself, rather than
being passed in as a Makefile parameter.
Remove the now-redundant Makefile definitions for EFI_BOOT_FILE.
Reported-by: Christian I. Nilsson <nikize@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for building iPXE as a 64-bit or 32-bit RISC-V binary, for
either UEFI or Linux userspace platforms. For example:
# RISC-V 64-bit UEFI
make CROSS=riscv64-linux-gnu- bin-riscv64-efi/ipxe.efi
# RISC-V 32-bit UEFI
make CROSS=riscv64-linux-gnu- bin-riscv32-efi/ipxe.efi
# RISC-V 64-bit Linux
make CROSS=riscv64-linux-gnu- bin-riscv64-linux/tests.linux
qemu-riscv64 -L /usr/riscv64-linux-gnu/sys-root \
./bin-riscv64-linux/tests.linux
# RISC-V 32-bit Linux
make CROSS=riscv64-linux-gnu- SYSROOT=/usr/riscv32-linux-gnu/sys-root \
bin-riscv32-linux/tests.linux
qemu-riscv32 -L /usr/riscv32-linux-gnu/sys-root \
./bin-riscv32-linux/tests.linux
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The cross-compiler will typically use the appropriate sysroot
directory automatically. This may not work for toolchains where a
single cross-compiler is used to produce output for multiple CPU
variants (e.g. 32-bit and 64-bit RISC-V).
Add a SYSROOT=... parameter that may be used to specify the relevant
sysroot directory, e.g.
make CROSS=riscv64-linux-gnu- SYSROOT=/usr/riscv32-linux-gnu/sys-root \
bin-riscv32-linux/tests.linux
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EDK2 header macros VA_START(), VA_ARG() etc produce build errors
on some CPU architectures (notably on 32-bit RISC-V, which is not yet
supported by EDK2).
Fix by using the standard variable argument list macros.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For some 32-bit CPUs, we need to provide implementations of 64-bit
shifts as libgcc helper functions. Add test cases to cover these.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Define a cpu_halt() function which is architecture-specific but
platform-independent, and merge the multiple architecture-specific
implementations of the EFI cpu_nap() function into a single central
efi_cpu_nap() that uses cpu_halt() if applicable.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The definitions of the setjmp() and longjmp() functions are common to
all architectures, with only the definition of the jump buffer
structure being architecture-specific.
Move the architecture-specific portions to bits/setjmp.h and provide a
common setjmp.h for the function definitions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the "--retain <N>" option to limit the number of retained old AMI
images (within the same family, architecture, and public visibility).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow for easier identification of images and snapshots created by the
aws-import script by adding tags for image family (e.g. "iPXE") and
architecture (e.g. "x86_64") to both.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As described in commit 3b81a4e ("[ena] Provide a host information
page"), we currently report an operating system type of "Linux" in
order to work around broken versions of the ENA firmware that will
fail to create a completion queue if we report the correct operating
system type.
As of September 2024, the ENA team at AWS assures us that the entire
AWS fleet has been upgraded to fix this bug, and that we are now safe
to report the correct operating system type value in the "type" field
of struct ena_host_info.
The ENA team has also clarified that at least some deployed versions
of the ENA firmware still have the defect that requires us to report
an operating system version number of 2 (regardless of operating
system type), and so we continue to report ENA_HOST_INFO_VERSION_WTF
in the "version" field of struct ena_host_info.
Add an explicit warning on the previous known failure path, in case
some deployed versions of the ENA firmware turn out to not have been
upgraded as expected.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Move the <gdbmach.h> file to <bits/gdbmach.h>, and provide a common
dummy implementation for all architectures that have not yet
implemented support for GDB.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the process of adding a new CPU architecture by providing
common implementations of typically empty architecture-specific header
files.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This patch adds support for the AQtion Ethernet controller, enabling
iPXE to recognize and utilize the specific models (AQC114, AQC113, and
AQC107).
Tested-by: Animesh Bhatt <animeshb@marvell.com>
Signed-off-by: Animesh Bhatt <animeshb@marvell.com>
The link status check in falcon_xaui_link_ok() reads from the
FCN_XX_CORE_STAT_REG_MAC register only on production hardware (where
the FPGA version reads as zero), but modifies the value and writes
back to this register unconditionally. This triggers an uninitialised
variable warning on newer versions of gcc.
Fix by assuming that the register exists only on production hardware,
and so moving the "modify-write" portion of the "read-modify-write"
operation to also be covered by the same conditional check.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the "imgdecrypt" command that can be used to decrypt a detached
encrypted data image using a cipher key obtained from a separate CMS
envelope image. For example:
# Create non-detached encrypted CMS messages
#
openssl cms -encrypt -binary -aes-256-gcm -recip client.crt \
-in vmlinuz -outform DER -out vmlinuz.cms
openssl cms -encrypt -binary -aes-256-gcm -recip client.crt \
-in initrd.img -outform DER -out initrd.img.cms
# Detach data from envelopes (using iPXE's contrib/crypto/cmsdetach)
#
cmsdetach vmlinuz.cms -d vmlinuz.dat -e vmlinuz.env
cmsdetach initrd.img.cms -d initrd.img.dat -e initrd.img.env
and then within iPXE:
#!ipxe
imgfetch http://192.168.0.1/vmlinuz.dat
imgfetch http://192.168.0.1/initrd.img.dat
imgdecrypt vmlinuz.dat http://192.168.0.1/vmlinuz.env
imgdecrypt initrd.img.dat http://192.168.0.1/initrd.img.env
boot vmlinuz
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for decrypting images containing detached encrypted data
using a cipher key obtained from a separate CMS envelope image (in DER
or PEM format).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The openssl toolchain does not currently seem to support creating CMS
envelopedData or authEnvelopedData messages with detached encrypted
data.
Add a standalone tool "cmsdetach" that can be used to detach the
encrypted data from a CMS message. For example:
openssl cms -encrypt -binary -aes-256-gcm -recip client.crt \
-in bootfile -outform DER -out bootfile.cms
cmsdetach bootfile.cms --data bootfile.dat --envelope bootfile.env
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise CMS self-test data structure and macro names to refer to
"messages" rather than "signatures", in preparation for adding image
decryption tests.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some ASN.1 OID-identified algorithms require additional parameters,
such as an initialisation vector for a block cipher. The structure of
the parameters is defined by the individual algorithm.
Extend asn1_algorithm() to allow these additional parameters to be
returned via a separate ASN.1 cursor.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Reduce the number of dynamic allocations required to parse a CMS
message by retaining the ASN.1 cursor returned from image_asn1() for
the lifetime of the CMS message. This allows embedded ASN.1 cursors
to be used for parsed objects within the message, such as embedded
signatures.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Instances of cipher and digest algorithms tend to get called
repeatedly to process substantial amounts of data. This is not true
for public-key algorithms, which tend to get called only once or twice
for a given key.
Simplify the public-key algorithm API so that there is no reusable
algorithm context. In particular, this allows callers to omit the
error handling currently required to handle memory allocation (or key
parsing) errors from pubkey_init(), and to omit the cleanup calls to
pubkey_final().
This change does remove the ability for a caller to distinguish
between a verification failure due to a memory allocation failure and
a verification failure due to a bad signature. This difference is not
material in practice: in both cases, for whatever reason, the caller
was unable to verify the signature and so cannot proceed further, and
the cause of the error will be visible to the user via the return
status code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The TLS connection structure has grown to become unmanageably large as
new features and support for new TLS protocol versions have been added
over time.
Split out the portions of struct tls_connection that are specific to
client and server operations into separate structures, and simplify
some structure field names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The TLS connection structure has grown to become unmanageably large as
new features and support for new TLS protocol versions have been added
over time.
Split out the portions of struct tls_connection that are specific to
transmit and receive operations into separate structures, and simplify
some structure field names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The rom-o-matic code does not form part of the iPXE codebase, has not
been maintained for over a decade, and does not appear to still be in
use anywhere in the world.
It does, however, result in a large number of false positive security
vulnerability reports from some low quality automated code analysis
tools such as Fortify SCA.
Remove this unused and obsolete code to reduce the burden of
responding to these false positives.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise the existing support for performing RSA public-key
encryption, decryption, signature, and verification tests, and update
the code to use okx() for neater reporting of test results.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Asymmetric keys are invariably encountered within ASN.1 structures
such as X.509 certificates, and the various large integers within an
RSA key are themselves encoded using ASN.1.
Simplify all code handling asymmetric keys by passing keys as a single
ASN.1 cursor, rather than separate data and length pointers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise the logic for identifying the matching PCI root bridge I/O
protocol to allow for identifying the closest matching PCI bus:dev.fn
address range, and use this to provide PCI address range discovery
(while continuing to inhibit automatic PCI bus probing).
This allows the "pciscan" command to work as expected under UEFI.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI device model requires us to not probe the PCI bus directly,
but instead to wait to be offered the opportunity to drive devices via
our driver service binding handle.
We currently inhibit PCI bus probing by having pci_discover() return
an empty range when using the EFI PCI I/O API. This has the unwanted
side effect that scanning the bus manually using the "pciscan" command
will also fail to discover any devices.
Separate out the concept of being allowed to probe PCI buses from the
mechanism for discovering PCI bus:dev.fn address ranges, so that this
limitation may be removed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An attempt to use a validator for an empty certificate chain will
correctly fail the overall validation with the "empty certificate
chain" error propagated from x509_auto_append().
In a debug build, the call to validator_name() will attempt to call
x509_name() on a non-existent certificate, resulting in garbage in the
debug message.
Fix by checking for the special case of an empty certificate chain.
This issue does not affect non-debug builds, since validator_name() is
(as per its description) called only for debug messages.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is some exploitable similarity between the data structures used
for representing CMS signatures and CMS encryption keys. In both
cases, the CMS message fundamentally encodes a list of participants
(either message signers or message recipients), where each participant
has an associated certificate and an opaque octet string representing
the signature or encrypted cipher key. The ASN.1 structures are not
identical, but are sufficiently similar to be worth exploiting: for
example, the SignerIdentifier and RecipientIdentifier data structures
are defined identically.
Rename data structures and functions, and add the concept of a CMS
message type.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Extend the definition of an ASN.1 OID-identified algorithm to include
a potential cipher suite, and add identifiers for AES-CBC and AES-GCM.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The cms_signature() and cms_verify() functions currently accept raw
data pointers. This will not be possible for cms_decrypt(), which
will need the ability to extract fragments of ASN.1 data from a
potentially large image.
Change cms_signature() and cms_verify() to accept an image as an input
parameter, and move the responsibility for setting the image trust
flag within cms_verify() since that now becomes a more natural fit.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow passing a NULL value for the certificate list to all functions
used for identifying an X.509 certificate from an existing set of
certificates, and rename function parameters to indicate that this
certificate list represents an unordered certificate store (rather
than an ordered certificate chain).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Centralise all current mechanisms for identifying an X.509 certificate
(by raw content, by subject, by issuer and serial number, and by
matching public key), and remove the certstore-specific and
CMS-specific variants of these functions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Handling large ASN.1 objects such as encrypted CMS files will require
the ability to use the asn1_enter() and asn1_skip() family of
functions on partial object cursors, where a defined additional length
is known to exist after the end of the data buffer pointed to by the
ASN.1 object cursor.
We already have support for partial object cursors in the underlying
asn1_start() operation used by both asn1_enter() and asn1_skip(), and
this is used by the DER image probe routine to check that the
potential DER file comprises a single ASN.1 SEQUENCE object.
Add asn1_enter_partial() to formalise the process of entering an ASN.1
partial object, and refactor the DER image probe routine to use this
instead of open-coding calls to the underlying asn1_start() operation.
There is no need for an equivalent asn1_skip_partial() function, since
only objects that are wholly contained within the partial cursor may
be successfully skipped.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Calling asn1_skip_if_exists() on a malformed ASN.1 object may
currently leave the cursor in a partially-updated state, where the tag
byte and one of the length bytes have been stripped. The cursor is
left with a valid data pointer and length and so no out-of-bounds
access can arise, but the cursor no longer points to the start of an
ASN.1 object.
Ensure that each ASN.1 cursor manipulation code path leads to the
cursor being either fully updated, left unmodified, or invalidated,
and update the function descriptions to reflect this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Successfully reaching the end of a well-formed ASN.1 object list is
arguably not an error, but the current code (dating back to the
original ASN.1 commit in 2007) will explicitly check for and report
this as an error condition.
Remove the explicit check for reaching the end of a well-formed ASN.1
object list, and instead return success along with a zero-length (and
hence implicitly invalidated) cursor.
Almost every existing caller of asn1_skip() or asn1_skip_if_exists()
currently ignores the return value anyway. Skipped objects are (by
definition) not of interest to the caller, and the invalidation
behaviour of asn1_skip() ensures that any errors will be safely caught
on a subsequent attempt to actually use the ASN.1 object content.
Since these existing callers ignore the return value, they cannot be
affected by this change.
There is one existing caller of asn1_skip_if_exists() that does check
the return value: in asn1_skip() itself, an error returned from
asn1_skip_if_exists() will cause the cursor to be invalidated. In the
case of an error indicating only that the cursor length is already
zero, invalidation is a no-op, and so this change affects only the
return value propagated from asn1_skip().
This leaves only a single call site within ocsp_request() where the
return value from asn1_skip() is currently checked. The return status
here is moot since there is no way for the code in question to fail
(absent a bug in the ASN.1 construction or parsing code).
There are therefore no callers of asn1_skip() or asn1_skip_if_exists()
that rely on an error being returned for successfully reaching the end
of a well-formed ASN.1 object list. Simplify the code by redefining
this as a successful outcome.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Redefine bit 30 of an SMBIOS numerical setting to be part of the
function number, in order to allow access to hypervisor CPUID leaves.
This technically breaks backwards compatibility with scripts
attempting to read more than 64 consecutive functions. Since there is
no meaningful block of 64 consecutive related functions, it is
vanishingly unlikely that this capability has ever been used.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hypervisors typically intercept CPUID leaves in the range 0x40000000
to 0x400000ff, with leaf 0x40000000 returning the maximum supported
function within this range in register %eax.
iPXE currently masks off bit 30 from the requested CPUID leaf when
checking to see if a function is supported, which causes this check to
read from leaf 0x00000000 instead of 0x40000000.
Fix by including bit 30 within the mask.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The general syntax for SMBIOS settings:
smbios/<instance>.<type>.<offset>.<length>
is currently extended such that a <length> of zero indicates that the
byte at <offset> contains a string index, and an <offset> of zero
indicates that the <length> contains a literal string index.
Since the byte at offset zero can never contain a string index, and a
literal string index can never have a zero value, the combination of
both <length> and <offset> being zero is currently invalid and will
always return "not found".
Extend the syntax such that the combination of both <length> and
<offset> being zero may be used to read the entire data structure.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Following the example of aws-int13con, add a utility that can be used
to read the INT13 console log from a used iPXE boot disk in Google
Compute Engine.
There seems to be no easy way to directly read the contents of either
a disk image or a snapshot in Google Cloud. Work around this
limitation by creating a snapshot and attaching this snapshot as a
data disk to a temporary Linux instance, which is then used to echo
the INT13 console log to the serial port.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Experiments suggest that using fewer than 64 receive buffers leads to
excessive packet drop rates on some instance types (observed with a
c3-standard-4 instance in europe-west4-a).
Fix by increasing the number of receive data buffers (and adjusting
the length of the registrable queue page address list to match).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Google Virtual Ethernet NIC (GVE or gVNIC) is found only in Google
Cloud instances. There is essentially zero documentation available
beyond the mostly uncommented source code in the Linux kernel.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Following the example of aws-import, add a utility that can be used to
upload an iPXE disk image to Google Compute Engine as a bootable
image. For example:
make CONFIG=cloud EMBED=config/cloud/gce.ipxe \
bin-x86_64-pcbios/ipxe.usb bin-x86_64-efi/ipxe.usb
make CONFIG=cloud EMBED=config/cloud/gce.ipxe \
CROSS=aarch64-linux-gnu- bin-arm64-efi/ipxe.usb
../contrib/cloud/gce-import -p \
bin-x86_64-pcbios/ipxe.usb \
bin-x86_64-efi/ipxe.usb \
bin-arm64-efi/ipxe.usb
The iPXE disk image is automatically wrapped into a tarball containing
a single file named "disk.raw", uploaded to a temporary bucket in
Google Cloud Storage, and used to create a bootable image. The
temporary bucket is deleted after use.
An appropriate image family name is identified automatically: "ipxe"
for BIOS images, "ipxe-uefi-x86-64" for x86_64 UEFI images, and
"ipxe-uefi-arm64" for AArch64 UEFI images. This allows the latest
image within each family to be launched within needing to know the
precise image name.
Google Compute Engine images are globally scoped and are available
(and cached upon first use) in all regions. The initial placement of
the image may be controlled indirectly by using the "--location"
option to specify the Google Cloud Storage location used for the
temporary upload bucket: the image will then be created in the closest
multi-region to the storage location.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The DHCPv6 protocol does not itself provide a router address or a
prefix length. This information is instead obtained from the router
advertisements.
Our IPv6 minirouting table construction logic will first construct an
entry for each advertised prefix, and later update the entry to
include an address assigned within that prefix via stateful DHCPv6 (if
applicable).
This logic fails if the address assigned via stateful DHCPv6 does not
fall within any of the advertised prefixes (e.g. if the network is
configured to use DHCPv6-assigned /128 addresses with no advertised
on-link prefixes). We will currently treat this situation as
equivalent to having a manually assigned address with no corresponding
router address or prefix length: the routing table entry will use the
default /64 prefix length and will not include the router address.
DHCPv6 is triggered only in response to a router advertisement with
the "Managed Address Configuration (M)" or "Other Configuration (O)"
flags set, and a router address is therefore available at the point
that we initiate DHCPv6.
Record the router address when initiating DHCPv6, and expose this
router address as part of the DHCPv6 settings block. This allows the
routing table entry for any address assigned via stateful DHCPv6 to
correctly include the router address, even if the assigned address
does not fall within an advertised prefix.
Also provide a fixed /128 prefix length as part of the DHCPv6 settings
block. When an address assigned via stateful DHCPv6 does not fall
within an advertised prefix, this will cause the routing table entry
to have a /128 prefix length as expected. (When such an address does
fall within an advertised prefix, it will continue to use the
advertised prefix length.)
Originally-fixed-by: Guvenc Gulce <guevenc.guelce@sap.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In a small subnet (with a /31 or /32 subnet mask), all addresses
within the subnet are valid host addresses: there is no separate
network address or directed broadcast address.
The logic used in iPXE to determine whether or not to use a link-layer
broadcast address will currently fail in these subnets. In a /31
subnet, the higher of the two host addresses (i.e. the address with
all host bits set) will be treated as a broadcast address. In a /32
subnet, the single valid host address will be treated as a broadcast
address.
Fix by adding the concept of a host mask, defined such that an address
in the local subnet with all of the mask bits set to zero represents
the network address, and an address in the local subnet with all of
the mask bits set to one represents the directed broadcast address.
For most subnets, this is simply the inverse of the subnet mask. For
small subnets (/31 or /32) we can obtain the desired behaviour by
setting the host mask to all ones, so that only the local broadcast
address 255.255.255.255 will be treated as a broadcast address.
Originally-fixed-by: Lukas Stockner <lstockner@genesiscloud.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove the now-unused generalised text widget user interface, along
with the associated concept of a widget set and the implementation of
a read-only label widget.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Rewrite the code implementing the "login" user interface to use a
predefined interactive form. The command "login" then becomes roughly
equivalent to:
#!ipxe
form
item username Username
item --secret password Password
present
with the result that login form customisations (e.g. to add a Windows
domain name) may be implemented within the scripting language.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for presenting a dynamic user interface as an interactive
form, alongside the existing support for presenting a dynamic user
interface as a menu.
An interactive form may be used to allow a user to input (or edit)
values for multiple settings on a single screen, as a user-friendly
alternative to prompting for setting values via the "read" command.
In the present implementation, all input fields must fit on a single
screen (with no scrolling), and the only supported widget type is an
editable text box.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For interactive forms, the concept of a secret value becomes
meaningful (e.g. for password fields).
Add a flag to indicate that an item represents a secret value, and
allow this flag to be set via the "--secret" option of the "item"
command.
This flag has no meaning for menu items, but is silently accepted
anyway to keep the code size minimal.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise the ability to look up a dynamic user interface item by
index or by shortcut key, to allow for reuse of this code for
interactive forms.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently have an abstract model of a dynamic menu as a list of
items, each of which has a name, a description, and assorted metadata
such as a shortcut key. The "menu" and "item" commands construct
representations in this abstract model, and the "choose" command then
presents the items as a single-choice menu, with the selected item's
name used as the output value.
This same abstraction may be used to model a dynamic form as a list of
editable items, each of which has a corresponding setting name, an
optional description label, and assorted metadata such as a shortcut
key. By defining a "form" command as an alias for the "menu" command,
we could construct and present forms using commands such as:
#!ipxe
form Login to ${url}
item username Username or email address
item --secret password Password
present
or
#!ipxe
form Configure IPv4 networking for ${netX/ifname}
item netX/ip IPv4 address
item netX/netmask Subnet mask
item netX/gateway Gateway address
item netX/dns DNS server address
present
Reusing the same abstract model for both menus and forms allows us to
minimise the increase in code size, since the implementation of the
"form" and "item" commands is essentially zero-cost.
Rename everything within the abstract data model from "menu" to
"dynamic user interface" to reflect this generalisation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for wraparound scrolling and allow the tab key to be used
to move forward through a list of elements, wrapping back around to
the beginning of the list on overflow.
This is mildly useful for a menu, and likely to be a strong user
expectation for an interactive form.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Switch terminology for the "item" command from "item <label> <text>"
to "item <name> <text>", in preparation for repurposing the "item"
command to cover interactive forms as well as menus.
Since this renaming affects only a positional parameter, it does not
break compatibility with any existing scripts.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The msg() and alert() functions currently defined in settings_ui.c
provide a general-purpose facility for printing messages centred on
the screen.
Split this out to a separate file to allow for reuse by the form
presentation code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The curses concept of a window has been supported but never actively
used in iPXE since the mucurses library was first implemented in 2006.
Simplify the code by removing the ability to place a widget set in a
specified window, and instead use the standard screen for all drawing
operations.
This simplification allows the widget set parameter to be omitted for
the draw_widget() and edit_widget() operations, since the only reason
for its inclusion was to provide access to the specified window.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Create a generic abstraction of a text widget, refactor the existing
editable text box widget to use this abstraction, add an
implementation of a non-editable text label widget, and generalise the
login user interface to use this generic widget abstraction.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The comments for replace_string() state that a successful return
status guarantees that the dynamically allocated string pointer is no
longer NULL (even if it was initially NULL and the replacement string
is NULL or empty). This is relied upon by readline() to guarantee
that it will always return a non-NULL string if successful.
The code behaviour does not currently match this comment: an empty
replacement string may result in a successful return status even if
the (single-byte) allocation fails.
Fix up the code behaviour to match the comments, and to additionally
ensure that the edit history is filled in even in the event of an
allocation failure.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The reference implementation of Dhcp6Dxe in EDK2 has a fatal flaw: the
code in EfiDhcp6Stop() will poll the network in a tight loop until
either a response is received or a timer tick (at TPL_CALLBACK)
occurs. When EfiDhcp6Stop() is called at TPL_CALLBACK or higher, this
will result in an endless loop and an apparently frozen system.
Since this is the reference implementation of Dhcp6Dxe, it is likely
that almost all platforms have the same problem.
Fix by vetoing the broken driver. If the upstream driver is ever
fixed and a new version number issued, then we could plausibly test
against the version number exposed via the driver binding protocol.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Editable strings currently require a fixed-size buffer, which is
inelegant and limits the potential for creating interactive forms with
a variable number of edit box widgets.
Remove this limitation by switching to using a dynamically allocated
buffer for editable strings and edit box widgets.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
If we do not have a current working URI (after applying the EFI device
path settings and any cached DHCP settings), then an attempt to
download autoexec.ipxe will fail since there is no base URI from which
to resolve the full autoexec.ipxe URI.
Avoid this potentially confusing error message by attempting the
download only if we have successfully obtained a current working URI.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a new setting to provide access to the link layer protocol type
from scripts. This can be useful in order to skip configuring
interfaces based on their link layer protocol or, conversely,
configure only selected interface types (Ethernet, IPoIB, etc.)
Example script:
set idx:int32 0
:loop
isset ${net${idx}/mac} || exit 0
iseq ${net${idx}/linktype} IPoIB && goto try_next ||
autoboot net${idx} ||
:try_next
inc idx && goto loop
Signed-off-by: Pavel Krotkiy <porsh@nebius.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently attempt to obtain the autoexec.ipxe script via early use
of the EFI_SIMPLE_FILE_SYSTEM_PROTOCOL or EFI_PXE_BASE_CODE_PROTOCOL
interfaces to obtain an opaque block of memory, which is then
registered as an image at an appropriate point during our startup
sequence. The early use of these existent interfaces allows us to
obtain the script even if our subsequent actions (e.g. disconnecting
drivers in order to connect up our own) may cause the script to become
inaccessible.
This mirrors the approach used under BIOS, where the autoexec.ipxe
script is provided by the prefix (e.g. as an initrd image when using
the .lkrn build of iPXE) and so must be copied into a normally
allocated image from wherever it happens to previously exist in
memory.
We do not currently have support for downloading an autoexec.ipxe
script if we were ourselves downloaded via UEFI HTTP boot.
There is an EFI_HTTP_PROTOCOL defined within the UEFI specification,
but it is so poorly designed as to be unusable for the simple purpose
of downloading an additional file from the same directory. It
provides almost nothing more than a very slim wrapper around
EFI_TCP4_PROTOCOL (or EFI_TCP6_PROTOCOL). It will not handle
redirection, content encoding, retries, or even fundamentals such as
the Content-Length header, leaving all of this up to the caller.
The UEFI HTTP Boot driver will install an EFI_LOAD_FILE_PROTOCOL
instance on the loaded image's device handle. This looks promising at
first since it provides the LoadFile() API call which is specified to
accept an arbitrary filename parameter. However, experimentation (and
inspection of the code in EDK2) reveals a multitude of problems that
prevent this from being usable. Calling LoadFile() will idiotically
restart the entire DHCP process (and potentially pop up a UI requiring
input from the user for e.g. a wireless network password). The
filename provided to LoadFile() will be ignored. Any downloaded file
will be rejected unless it happens to match one of the limited set of
types expected by the UEFI HTTP Boot driver. The list of design
failures and conceptual mismatches is fairly impressive.
Choose to bypass every possible aspect of UEFI HTTP support, and
instead use our own HTTP client and network stack to download the
autoexec.ipxe script over a temporary MNP network device. Since this
approach works for TFTP as well as HTTP, drop the direct use of
EFI_PXE_BASE_CODE_PROTOCOL. For consistency and simplicity, also drop
the direct use of EFI_SIMPLE_FILE_SYSTEM_PROTOCOL and rely upon our
existing support to access local files via "file:" URIs.
This approach results in console output during the "iPXE initialising
devices...ok" message that appears while startup is in progress.
Remove the trailing "ok" so that this intermediate output appears at a
sensible location on the screen. The welcome banner that will be
printed immediately afterwards provides an indication that startup has
completed successfully even absent the explicit "ok".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Retain a reference to the cached DHCPACK until the late startup phase,
and allow it to be recycled for reuse. This allows the cached DHCPACK
to be used for a temporary MNP network device and then subsequently
reused for the corresponding real network device.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An MNP network device may be temporarily and non-destructively
installed on top of an existing UEFI network stack without having to
disconnect existing drivers.
Add the ability to create such a temporary network device.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Split out the code that allocates our internal struct efi_device
representations, to allow for the creation of temporary MNP devices in
order to download the autoexec.ipxe script.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an abbreviated "Not found" error message for an HTTP 404 status
code, so that any automatic attempt to download a non-existent
autoexec.ipxe script produces only a minimal error message.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an abbreviated "Not found" error message for a TFTP "file not
found" error code, so that any automatic attempt to download a
non-existent autoexec.ipxe script produces only a minimal error
message.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an abbreviated "Not found" error message for an EFI_NOT_FOUND
error encountered when attempting to open a file on a local
filesystem, so that any automatic attempt to download a non-existent
autoexec.ipxe script produces only a minimal error message.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE is designed around fully asynchronous I/O, including asynchronous
connection opening. Almost all errors are therefore necessarily
reported as occurring during an in-progress download, rather than
occurring at the time that the URI is opened.
Local file access is currently an exception to this: errors such as
nonexistent files will be encountered while opening the URI. This
results in mildly unexpected error messages of the form "Could not
start download", rather than the usual pattern of showing the URI, the
initial progress dots, and then the error message.
Fix this inconsistency by deferring the local filesystem access until
the local file download process is running.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some URI schemes allow for a path name to be specified via the opaque
component of the URI (e.g. "file:/script.ipxe" to specify a path on
the filesystem from which iPXE itself was loaded). Files loaded from
such paths will currently fail to be assigned an appropriate name,
since only the path component of the URI will be used to construct a
default image name.
Fix by falling back to attempt deriving an image name from the opaque
component of a URI, if no path component is specified.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For unknown reasons, miscellaneous versions of gcc seem to struggle
with the static assertions used to ensure the correct layout of the
GCM structures.
Adjust the assertions to use offsetof() rather than direct pointer
comparison, on the basis that offsetof() must be a compile-time
constant value.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI HTTP boot mechanism is extraordinarily badly designed, even
by the standards of the UEFI specification in general. It has the
symptoms of a feature that has been designed entirely in terms of user
stories, without any consideration at all being given to the
underlying technical architecture. It does work, provided that you
are doing precisely and only what was envisioned by the product owner.
If you want to try anything outside the bounds of the product owner's
extremely limited imagination, then you are almost certainly about to
enter a world of pain.
As one very minor example of this: the cached DHCP packet is not
available when using HTTP boot. The UEFI HTTP boot code does perform
DHCP, but it pointlessly and unhelpfully throws away the DHCP packet
and trashes the network interface configuration before handing over to
the downloaded executable.
Work around this imbecility by parsing and applying the few network
configuration settings that are persisted into the loaded image's
device path. This is limited to very basic information such as the IP
address, gateway address, and DNS server address, but it does at least
provide enough for a functional routing table.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We want exclusive access to the network device, both for performance
reasons and because we perform operations such as EAPoL that affect
the entire link. We currently drive the network card via either a
native hardware driver or via the SNP or NII/UNDI interfaces, both of
which grant us this exclusive access.
Add an alternative driver that drives the network card non-exclusively
via the EFI_MANAGED_NETWORK_PROTOCOL interface. This can function as
a fallback for situations where neither SNP nor NII/UNDI interfaces
are functional, and also opens up the possibility of non-destructively
installing a temporary network device over which to download the
autoexec.ipxe script.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When using a service binding protocol, CreateChild() will create a new
protocol instance (and optionally a new handle). The caller will then
typically open this new protocol instance with BY_DRIVER attributes,
since the service binding mechanism has no equivalent of the driver
binding protocol's Stop() method, and there is therefore no other way
for the caller to be informed if the protocol instance is about to
become invalid (e.g. because the service driver wants to remove the
child).
The caller cannot ask CreateChild() to install the new protocol
instance on the original handle (i.e. the service binding handle),
since the whole point of the service binding protocol is to allow for
the existence of multiple children, and UEFI does not permit multiple
instances of the same protocol to be installed on a handle.
Our current drivers all open the original handle (as passed to our
driver binding's Start() method) with BY_DRIVER attributes, and so the
same handle will be passed to our Stop() method. This changes when
our driver must use a separate handle, as described above.
Add an optional "child handle" field to struct efi_device (on the
assumption that we will not have any drivers that need to create
multiple children), and generalise efidev_find() to match on either
the original handle or the child handle.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EFI service binding abstraction is used to add and remove child
handles for multiple different protocols. Provide a common interface
for doing so.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 4c5b794 ("[efi] Use the SNP protocol instance to match the SNP
chainloading device") switched the chainloaded device matching logic
to use a target protocol instance rather than the loaded image's
device handle, on the basis that we want to bind to the parent SNP
device rather than to a duplicate SNP protocol instance installed onto
an IPv4 or IPv6 child device handle.
It is possible that our calls to DisconnectController() and
ConnectController() will cause the target protocol instance to be
uninstalled and reinstalled, which may change the value of the
protocol instance pointer. Allow for this by identifying and matching
against the uppermost handle that initially has this target protocol
instance installed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When booted via HTTP, our loaded image's device path will include the
URI from which we were downloaded. Set this as the current working
URI, so that an embedded script may perform subsequent downloads
relative to the iPXE binary, or construct explicit relative paths via
the ${cwduri} setting.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE maintains a concept of a current working URI, which is used when
resolving relative URIs and allows scripts to download files using
URIs relative to the script itself.
There are situations in which it is valuable for a script to be able
to access the URI explicitly as a string, not just implicitly as a
base URI for subsequent downloads. For example, when booting a Fedora
installer, the "inst.repo" command-line parameter may be used to pass
the URI of the repository to the installer.
Expose the current working URI as ${cwuri}. Since relative URIs may
be constructed as strings only from a directory URI (not from a full
URI), also expose the current working directory URI as ${cwduri}.
This feature may be used as e.g.
#!ipxe
echo Booting from ${cwuri}
prompt -k 0x197e -t 2000 Press F12 to install Fedora... || exit
kernel images/pxeboot/vmlinux inst.repo=${cwduri}
initrd images/pxeboot/initrd.img
boot
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Mellanox/Nvidia UEFI driver is built from the same codebase as the
iPXE driver, and appears to contain the bug that was fixed in commit
c11734e ("[golan] Use ETH_HLEN for inline header size"). This results
in identical failures when using the SNP or NII interface (via
e.g. snponly.efi) to drive a Mellanox card while EAPoL is enabled.
Work around the underlying UEFI driver bug by padding transmit I/O
buffers to the minimum Ethernet frame length before passing them to
the underlying driver's transmit function.
This padding is not technically necessary, since almost all modern
hardware will insert transmit padding as necessary (and where the
hardware does not support doing so, the underlying UEFI driver is
responsible for adding any necessary padding). However, it is
guaranteed to be harmless (other than a miniscule performance impact):
the Ethernet specification requires zero padding up to the minimum
frame length for packets that are transmitted onto the wire, and so
the receiver will see the same packet whether or not we manually
insert this padding in software.
The additional padding causes the underlying Mellanox driver to avoid
its faulty code path, since it will never be asked to transmit a very
short packet.
Tested-by: Eric Hagberg <ehagberg@janestreet.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The driver does not correctly handle very short transmitted packets
such as EAPoL-Start where the entire DMA content lies within the
current send work queue entry inline header length of 18 bytes.
Fix by reducing the inline header length to the Ethernet frame header
length of 14 bytes.
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Older versions of gcc (observed with gcc 4.8.5 on CentOS 7) complain
about having the label "err_ioremap" at the end of a compound
statement in bios_mp_start_all(). The label is correctly placed,
since it immediately follows the iounmap() that would be required to
undo a successful ioremap() in the non-error case.
Fix by adding an explicit "return" immediately after the label.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some SNP implementations (observed with a wifi adapter in a Dell
Latitude 3440 laptop) seem to require additional space in the
allocated receive buffers, otherwise full-length packets will be
silently dropped.
The EDK2 MnpDxe driver happens to allocate an additional 8 bytes of
padding (4 for a VLAN tag, 4 for the Ethernet frame checksum). Match
this behaviour since drivers are very likely to have been tested
against MnpDxe.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Intel and AMD distribute microcode updates, which are typically
applied by the BIOS and/or the booted operating system.
BIOS updates can be difficult to obtain and cumbersome to apply, and
are often neglected. Operating system updates may be subject to
strict change control processes, particularly for production
workloads. There is therefore value in being able to update the
microcode at boot time using a freshly downloaded microcode update
file, particularly in scenarios where the physical hardware and the
installed operating system are controlled by different parties (such
as in a public cloud infrastructure).
Add support for parsing Intel and AMD microcode update images, and for
applying the updates to all CPUs in the system.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide an implementation of the iPXE multiprocessor API for BIOS,
based on sending broadcast INIT and SIPI interprocessor interrupts to
start up all application processors.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Application processors are started via INIT and SIPI interprocessor
interrupts: the INIT places the processor into a "wait for SIPI"
state, and the SIPI then starts the processor in real mode at a
page-aligned address derived from the SIPI vector number.
Add support for installing a real-mode SIPI handler that will switch
the CPU into protected mode with flat physical addressing, load
initial register contents, and then jump to the address of a
protected-mode SIPI handler. No stack pointer is set up, to avoid the
need to allocate stack space for each available processor.
We use 32-bit physical addressing in order to minimise the changes
required for a 64-bit build. The existing long mode transition code
relies on the existence of the stack, so we cannot easily switch the
application processor into long mode. We could use 32-bit virtual
addressing, but this runtime environment does not currently exist
outside of librm.S itself in a 64-bit build, and using it would
complicate the implementation of the protected-mode SIPI handler.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide an implementation of the iPXE multiprocessor API for EFI,
based on using EFI_MP_SERVICES to start up a wrapper function on all
application processors.
Note that the processor numbers used by EFI_MP_SERVICES are opaque
integers that bear no relation to the underlying CPU identity
(e.g. the APIC ID), and so we must rely on our own (architecture-
specific) implementation to determine the relevant CPU identifiers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Define an API for executing very limited functions on application
processors in a multiprocessor system, along with an x86-only
implementation.
The normal iPXE runtime environment is effectively non-existent on
application processors. There is no ability to make firmware calls
(e.g. to write to a console), and there may be no stack space
available.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The return status from efi_block_local() indicates whether or not the
handle is eligible to be assigned a local virtual drive number. There
will always be several enumerated EFI_BLOCK_IO_PROTOCOL handles that
are not eligible for a local virtual drive number (e.g. the handles
corresponding to partitions, rather than to complete disks), and this
is not an interesting error to report.
Do not report errors from efi_block_local() as the overall error
status for a SAN boot, since doing so would be likely to mask a much
more relevant error from having previously attempted to scan for a
matching filesystem within an eligible block device handle.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a "--label" option that can be used to specify a filesystem label,
to be matched against the FAT volume label.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an "--extra" option that can be used to specify an extra
(non-boot) filename that must exist within the booted filesystem.
Note that only files within the FAT-formatted bootable partition will
be visible to this filter. Files within the operating system's root
disk (e.g. "/etc/redhat-release") are not generally accessible to the
firmware and so cannot be used as the existence check filter filename.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a "--uuid" option which may be used to specify a boot device UUID,
to be matched against the GPT partition GUID.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
EFI provides no API for determining the partition GUID (if any) for a
specified device handle. The partition GUID appears to be exposed
only as part of the device path.
Add efi_path_guid() to extract the partition GUID (if any) from a
device path.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The drive specification alone does not necessarily contain enough
information to perform a SAN boot (or local disk boot) under UEFI. If
the next-stage bootloader is installed in the EFI system partition
under a non-standard name (e.g. "\EFI\debian\grubx64.efi") then this
explicit boot filename must also be specified.
Generalise this concept to use a "SAN boot configuration parameters"
structure (currently containing only the optional explicit boot
filename), to allow for easy expansion to provide other parameters
such as the partition UUID or volume label.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Extend the EFI SAN boot code to allow for booting from a local disk,
as is already possible with the BIOS SAN boot code.
There is unfortunately no direct UEFI equivalent of the BIOS drive
number. The UEFI shell does provide numbered mappings fs0:, blk0:,
etc, but these numberings exist only while the UEFI shell is running
and are not necessarily stable between shell invocations or across
reboots.
A substantial amount of existing third-party documentation for iPXE
will suggest using "sanboot --drive 0x80" to boot from a local disk
(when no SAN drives are present), since this suggestion has been
present in the official documentation for the "sanboot" command for
almost thirteen years. We therefore aim to ensure that this
instruction will also work for UEFI, i.e. that in a situation where
there are local disks but no SAN disks, then the first local disk will
be treated as being drive 0x80.
We therefore assign local disks the virtual drive numbers 0x80, 0x81,
etc, matching the numbering typically used in a BIOS environment.
Where a SAN disk is already occupying one of these drive numbers, the
local disks' virtual drive numbers will be incremented as necessary.
This provides a rough approximation of the equivalent functionality
under BIOS, where existing local disks' drive numbers are remapped to
make way for SAN disks.
We do not make any attempt to sort the list of local disks: the order
used for allocating virtual drive numbers will be whatever order is
returned by LocateHandle(). This will typically match the creation
order of the EFI handles, which will typically match the hardware
enumeration order of the devices, which will typically match user
expectations as to which local disk is first, second, etc.
We explicitly do not attempt to match the numbering used by the UEFI
shell (which initially sorts in increasing order of device path, but
does not renumber when new devices are added or removed). We can
never guarantee matching this partly transient UEFI shell numbering,
so it is best not to set any expectation that it will be matched.
(Using local drive numbers starting at 0x80 helps to avoid setting up
this impossible expectation, since the UEFI shell uses local drive
numbers starting at zero.)
Since floppy disks are essentially non-existent in any plausible UEFI
system, overload "--drive 0" to mean "boot from any drive containing
the specified (or default) boot filename".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Maintain the SAN device list in order of drive number, and provide
sandev_next() to locate the first SAN device at or above a given drive
number.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
SAN devices created by iPXE are visible to the firmware, and may be
accessed using the firmware's standard block I/O device interface
(e.g. INT 13 for BIOS, or EFI_BLOCK_IO_PROTOCOL for UEFI). The iPXE
code to perform a SAN boot acts as a client of this standard block I/O
device interface, even when the underlying block I/O is being
performed by iPXE itself.
We rely on this separation to allow the "sanboot" command to be used
to boot from a local disk: since the code to perform a SAN boot does
not need direct access to an underlying iPXE SAN device, it may be
used to boot from any device providing the firmware's standard block
I/O device interface.
Clean up the EFI SAN boot code to require only a drive number and an
EFI_BLOCK_IO_PROTOCOL handle, in preparation for adding support for
booting from a local disk under UEFI.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The "sanboot" command allows a custom boot filename to be specified
via the "--filename" option. We currently rely on LoadImage() to
perform both the existence check and to load the image ready for
execution. This may give a false negative result if Secure Boot is
enabled and the boot file is not correctly signed.
Carry out the existence check using EFI_SIMPLE_FILE_SYSTEM_PROTOCOL
separately from loading the image via LoadImage().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently use the SAN device pointer as the debug message stream
identifier. This pointer is not always available: for example, when
booting from a local disk there is no underlying SAN device.
Switch to using the drive number as the debug message colour stream
identifier, so that all block device debug messages may be colourised
consistently.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently call ConvertDevicePathToText() with DisplayOnly=TRUE when
constructing a device path to appear within a debug message. For
ATAPI device paths, this will unfortunately omit some key information:
the textual representation will not indicate which ATA bus or drive is
represented. This can lead to misleading debug messages that appear
to refer to identical devices.
Fix by setting DisplayOnly=FALSE to select the long form of device
path textual representations.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The ":uuid" and ":guid" settings types are currently format-only: it
is possible to format a setting as a UUID (via e.g. "show foo:uuid")
but it is not currently possible to parse a string into a UUID setting
(via e.g. "set foo:uuid 406343fe-998b-44be-8a28-44ca38cb202b").
Use uuid_aton() to implement parsing of these settings types, and add
appropriate test cases for both.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add uuid_aton() to parse a UUID value from a string (analogous to
inet_aton(), inet6_aton(), sock_aton(), etc), treating it as a
32-digit hex string with optional hyphen separators. The placement of
the separators is not checked: each byte within the hex string may be
separated by a hyphen, or not separated at all.
Add dedicated self-tests for UUID parsing and formatting (already
partially covered by the ":uuid" and ":guid" settings self-tests).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI shim installs wrappers around several boot services functions
before invoking its next stage bootloader, in an attempt to enforce
its desired behaviour upon the aforementioned bootloader. For
example, shim checks that the bootloader has either invoked
StartImage() or has called into the "shim lock protocol" before
allowing an ExitBootServices() call to proceed.
When invoking a shim, iPXE will also install boot services function
wrappers in order to work around assorted bugs in the UEFI shim code
that would otherwise prevent it from being used to boot a kernel. For
details on these workarounds, see commits 28184b7 ("[efi] Add support
for executing images via a shim") and 5b43181 ("[efi] Support versions
of shim that perform SBAT verification").
Using boot services function wrappers in this way is not intrinsically
problematic, provided that wrappers are installed before starting the
wrapped program, and uninstalled only after the wrapped program exits.
This strict ordering requirement ensures that all layers of wrappers
are called in the expected order, and that no calls are issued through
a no-longer-valid function pointer.
Unfortunately, the UEFI shim does not respect this strict ordering
requirement, and will instead uninstall (and reinstall) its wrappers
midway through the execution of the wrapped program. This leaves the
wrapped program with an inconsistent view of the boot services table,
leading to incorrect behaviour.
This results in a boot failure when a first shim is used to boot iPXE,
which then uses a second shim to boot a Linux kernel:
- First shim installs StartImage() and ExitBootServices() wrappers
- First shim invokes iPXE via its own PE loader
- iPXE installs ExitBootServices() wrapper
- iPXE invokes second shim via StartImage()
At this point, the first shim's StartImage() wrapper will illegally
uninstall its ExitBootServices() wrapper, without first checking that
nothing else has modified the ExitBootServices function pointer. This
effectively bypasses iPXE's own ExitBootServices() wrapper, which
causes a boot failure since the code within that wrapper does not get
called.
A proper fix would be for shim to install its wrappers before starting
the image and uninstall its wrappers only after the started image has
exited. Instead of repeatedly uninstalling and reinstalling its
wrappers while the wrapped program is running, shim should simply use
a flag to keep track of whether or not it needs to modify the
behaviour of the wrapped calls.
Experience shows that there is unfortunately no point in trying to get
a fix for this upstreamed into shim. We therefore work around the
shim bug by removing our ExitBootServices() wrapper and moving the
relevant code into our GetMemoryMap() wrapper.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for EAP-MSCHAPv2 (note that this is not the same as
PEAP-MSCHAPv2), controllable via the build configuration option
EAP_METHOD_MSCHAPV2 in config/general.h.
Our model for EAP does not encompass mutual authentication: we will
starting sending plaintext packets (e.g. DHCP requests) over the link
even before EAP completes, and our only use for an EAP success is to
mark the link as unblocked.
We therefore ignore the content of the EAP-MSCHAPv2 success request
(containing the MS-CHAPv2 authenticator response) and just send back
an EAP-MSCHAPv2 success response, so that the EAP authenticator will
complete the process and send through the real EAP success packet
(which will, in turn, cause us to unblock the link).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC 3748 states that implementations must support the MD5-Challenge
method. However, some network environments may wish to disable it as
a matter of policy.
Allow support for MD5-Challenge to be controllable via the build
configuration option EAP_METHOD_MD5 in config/general.h.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add debug messages for each EAP Request and Response, and to show the
list of methods offered when sending a Nak.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Several new relocations types have been added in LoongArch ABI version
2.10. In particular:
- R_LARCH_B16 (18-bit PC-relative jump)
- R_LARCH_B21 (23-bit PC-relative jump)
- R_LARCH_PCREL20_S2 (22-bit PC-relative offset)
Also relocation relaxations have been introduced. Recent GCC (13.2)
and binutils 2.41+ use these types of relocations, which confuses
elf2efi tool. As a result, iPXE EFI images for LoongArch fail to
build with the following error:
Unrecognised relocation type 103
Fix by ignoring R_LARCH_B{16,21} and R_LARCH_PCREL20_S2 (as with other
PC-relative relocations), and by ignoring relaxations (R_LARCH_RELAX).
Relocation relaxations are basically optimizations: ignoring them
results in a correct binary (although it might be suboptimal).
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Done with the help of this Perl script:
$MARKER = 'PCI_ROM'; # a regex
$AB = 1; # At Begin
@HEAD = ();
@ITEMS = ();
@TAIL = ();
foreach $fn (@ARGV) {
open(IN, $fn) or die "Can't open file '$fn': $!\n";
while (<IN>) {
if (/$MARKER/) {
push @ITEMS, $_;
$AB = 0; # not anymore at begin
}
else {
if ($AB) {
push @HEAD, $_;
}
else {
push @TAIL, $_;
}
}
}
} continue {
close IN;
open(OUT, ">$fn") or die "Can't open file '$fn' for output: $!\n";
print OUT @HEAD;
print OUT sort @ITEMS;
print OUT @TAIL;
close OUT;
# For a next file
$AB = 1;
@HEAD = ();
@ITEMS = ();
@TAIL = ();
}
Executed that script while src/drivers/ as current working directory,
provided '$(grep -rl PCI_ROM)' as argument.
Signed-off-by: Geert Stappers <stappers@stappers.it>
Inspection of the generated assembly shows that gcc will often emit
standalone implementations of frequently invoked functions such as
digest_update(), which contain no logic and exist only as syntactic
sugar.
Force inlining of these functions to reduce the overall binary size.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an implementation of the authentication portions of the MS-CHAPv2
algorithm as defined in RFC 2759, along with the single test vector
provided therein.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Certificates issued by Let's Encrypt have two options for their chain
of trust: the chain can either terminate in the self-signed ISRG Root
X1 root certificate, or in an intermediate ISRG Root X1 certificate
that is signed in turn by the self-signed DST Root CA X3 root
certificate. This is a historical artifact: when Let's Encrypt first
launched as a project, the chain ending in DST Root CA X3 was used
since existing clients would not have recognised the ISRG Root X1
certificate as a trusted root certificate.
The DST Root CA X3 certificate expired in September 2021, and so is no
longer trusted by clients (such as iPXE) that validate the expiry
times of all certificates in the certificate chain.
In order to maintain usability of certificates on older Android
devices, the default certificate chain provided by Let's Encrypt still
terminates in DST Root CA X3, even though that certificate has now
expired. On newer devices which include ISRG Root X1 as a trusted
root certificate, the intermediate version of ISRG Root X1 in the
certificate chain is ignored and validation is performed as though the
chain had terminated in the self-signed ISRG Root X1 root certificate.
On older Android devices which do not include ISRG Root X1 as a
trusted root certificate, the validation succeeds since Android
chooses to ignore expiry times for root certificates and so continues
to trust the DST Root CA X3 root certificate.
This backwards compatibility hack unfortunately breaks the cross-
signing mechanism used by iPXE, which assumes that the certificate
chain will always terminate in a non-expired root certificate.
Generalise the validator's cross-signed certificate download mechanism
to walk up the certificate chain in the event of a failure, attempting
to find a replacement cross-signed certificate chain starting from the
next level up. This allows the validator to step over the expired
(and hence invalidatable) DST Root CA X3 certificate, and instead
download the cross-signed version of the ISRG Root X1 certificate.
This generalisation also gives us the ability to handle servers that
provide a full certificate chain including their root certificate:
iPXE will step over the untrusted public root certificate and attempt
to find a cross-signed version of it instead.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Downloading a cross-signed certificate chain to partially replace
(rather than simply extend) an existing chain will require the ability
to discard all certificates after a specified link in the chain.
Extract the relevant logic from x509_free_chain() and expose it
separately as x509_truncate().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some versions of gcc (observed with gcc 4.8.5 in CentOS 7) will report
spurious build_assert() failures for some assertions about structure
layouts. There is no clear pattern as to what causes these spurious
failures, and the build assertion does succeed in that no unresolvable
symbol reference is generated in the compiled code.
Adjust the assertions to work around these apparent compiler issues.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We build with -Werror by default so that any warning is treated as an
error and aborts the build. The build system allows NO_WERROR=1 to be
used to override this behaviour, in order to allow builds to succeed
when spurious warnings occur (e.g. when using a newer compiler that
includes checks for which the codebase is not yet prepared).
Some versions of gcc (observed with gcc 4.8.5 in CentOS 7) will report
spurious build_assert() failures: the compilation will fail due to an
allegedly unelided call to the build assertion's external function
declared with __attribute__((error)) even though the compiler does
manage to successfully elide the call (as verified by the fact that
there are no unresolvable symbol references in the compiler output).
Change build_assert() to declare __attribute__((warning)) instead of
__attribute__((error)) on its extern function. This will still abort
a normal build if the assertion fails, but may be overridden using
NO_WERROR=1 if necessary to work around a spurious assertion failure.
Note that if the build assertion has genuinely failed (i.e. if the
compiler has genuinely not been able to elide the call) then the
object will still contain an unresolvable symbol reference that will
cause the link to fail (which matches the behaviour of the old
linker_assert() mechanism).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The DES block cipher dates back to the 1970s. It is no longer
relevant for use in TLS cipher suites, but it is still used by the
MS-CHAPv2 authentication protocol which remains unfortunately common
for 802.1x port authentication.
Add an implementation of the DES block cipher, complete with the
extremely comprehensive test vectors published by NBS (the precursor
to NIST) in the form of an utterly adorable typewritten and hand-drawn
paper document.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
A block cipher in ECB mode has no concept of an initialisation vector,
and any data provided to cipher_setiv() for an ECB cipher will be
ignored. There is no requirement within our cipher algorithm
abstraction for a dummy initialisation vector to be provided.
Remove the entirely spurious dummy 16-byte initialisation vector from
the ECB test cases.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The CBC_CIPHER() macro contains some accidentally hardcoded references
to an underlying AES cipher, instead of using the cipher specified in
the macro parameters.
Fix by using the macro parameter as required.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Coverity reported that tls_send_plaintext() failed to check the return
status from tls_generate_random(), which could potentially result in
uninitialised random data being used as the block initialisation
vector (instead of intentionally random data).
Add the missing return status check, and separate out the error
handling code paths (since on the successful exit code path there will
be no need to free either the plaintext or the ciphertext anyway).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When ExitBootServices() invokes efi_shutdown_hook(), there may be
nothing to generate an interrupt since the timer is disabled in the
first step of ExitBootServices(). Additionally, for VMs OVMF masks
everything from the PIC (except the timer) by default. This means
that calling cpu_nap() may hang indefinitely. This was seen in
practice in netfront_reset() when running in a VM on XenServer.
Fix this by skipping the halt if an EFI shutdown is in progress.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the choice of key exchange algorithms to be controlled via build
configuration options in config/crypto.h, as is already done for the
choices of public-key algorithms, cipher algorithms, and digest
algorithms.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
DHE and ECDHE use essentially the same mechanism for verifying the
signature over the Diffie-Hellman parameters, though the format of the
parameters is different between the two methods.
Split out the verification of the parameter signature so that it may
be shared between the DHE and ECDHE key exchange algorithms.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The construction of the key material for the pending cipher suites
from the TLS master secret must happen regardless of which key
exchange algorithm is in use, and the key material is not required to
send the ClientKeyExchange handshake (which is sent before changing
cipher suites).
Centralise the call to tls_generate_keys() after performing key
exchange via the selected algorithm.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Define an individual local structure for each extension and a single
structure for the list of extensions. This makes it viable to add
extensions such as the Supported Elliptic Curves extension, which must
not be present if the list of curves is empty.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Define an abstraction of an elliptic curve with a fixed generator and
one supported operation (scalar multiplication of a curve point).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC7748 states that it is entirely optional for X25519 Diffie-Hellman
implementations to check whether or not the result is the all-zero
value (indicating that an attacker sent a malicious public key with a
small order). RFC8422 states that implementations in TLS must abort
the handshake if the all-zero value is obtained.
Return an error if the all-zero value is obtained, so that the TLS
code will not require knowledge specific to the X25519 curve.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an implementation of the X25519 key exchange algorithm as defined
in RFC7748.
This implementation is inspired by and partially based upon the paper
"Implementing Curve25519/X25519: A Tutorial on Elliptic Curve
Cryptography" by Martin Kleppmann, available for download from
https://www.cl.cam.ac.uk/teaching/2122/Crypto/curve25519.pdf
The underlying modular addition, subtraction, and multiplication
operations are completely redesigned for substantially improved
efficiency compared to the TweetNaCl implementation studied in that
paper (approximately 5x-10x faster and with 70% less memory usage).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The slightly incomprehensible LoongArch64 implementation for
bigint_subtract() is observed to produce incorrect results for some
input values.
Replace the suspicious LoongArch64 implementations of bigint_add(),
bigint_subtract(), bigint_rol() and bigint_ror(), and add a test case
for a subtraction that was producing an incorrect result with the
previous implementation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a helper function bigint_swap() that can be used to conditionally
swap a pair of big integers in constant time.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Big integers may be efficiently copied using bigint_shrink() (which
will always copy only the size of the destination integer), but this
is potentially confusing to a reader of the code.
Provide bigint_copy() as an alias for bigint_shrink() so that the
intention of the calling code may be more obvious.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Big integer multiplication is currently used only as part of modular
exponentiation, where both multiplicand and multiplier will be the
same size.
Relax this requirement to allow for the use of big integer
multiplication in other contexts.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently implement build-time assertions via a mechanism that
generates a call to an undefined external function that will cause the
link to fail unless the compiler can prove that the asserted condition
is true (and thereby eliminate the undefined function call).
This assertion mechanism can be used for conditions that are not
amenable to the use of static_assert(), since static_assert() will not
allow for proofs via dead code elimination.
Add __attribute__((error(...))) to the undefined external function, so
that the error is raised at compile time rather than at link time.
This allows us to provide a more meaningful error message (which will
include the file name and line number, as with any other compile-time
error), and avoids the need for the caller to specify a unique symbol
name for the external function.
Change the name from linker_assert() to build_assert(), since the
assertion now takes place at compile time rather than at link time.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Expose static_assert() via assert.h and migrate link-time assertions
to build-time assertions where possible.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Newer versions of the GNU assembler (observed with binutils 2.41) will
complain about the ".arch i386" in files assembled with "as --64",
with the message "Error: 64bit mode not supported on 'i386'".
In files such as stack.S that contain no instructions to be assembled,
the ".arch i386" is redundant and may be removed entirely.
In the remaining files, fix by moving ".arch i386" below the relevant
".code16" or ".code32" directive, so that the assembler is no longer
expecting 64-bit instructions to be used by the time that the ".arch
i386" directive is encountered.
Reported-by: Ali Mustakim <alim@forwardcomputers.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .text directive is entirely redundant when followed by a .section
directive giving an explicit section name and attributes.
Remove these unnecessary directives to simplify the code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC 3748 states that support for MD5-Challenge is mandatory for EAP
implementations. The MD5 and CHAP code is already included in the
default build since it is required by iSCSI, and so this does not
substantially increase the binary size.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the ${netX/username} setting to be used to specify an EAP
identity to be returned in response to a Request-Identity, and provide
a mechanism for responding with a NAK to indicate which authentication
types we support.
If no identity is specified then fall back to the current behaviour of
not sending any Request-Identity response, so that switches will time
out and switch to MAC Authentication Bypass (MAB) if applicable.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
EAP responses (including our own) may be broadcast by switches but are
not of interest to us and can be safely ignored if received.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Ensure that .gitignore rules do not cover any files that do exist
within the repository.
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Support scanning for the 64-bit SMBIOS3 entry point in addition to the
32-bit SMBIOS2 entry point.
Prefer use of the 32-bit entry point if present, since this is
guaranteed to be within accessible memory.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add definitions for relocation types that may be missing on older
versions of the host system's elf.h.
This mirrors wimboot commit 47f6298 ("[efi] Add potentially missing
relocation types").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The result of multiplying a uint16_t by another uint16_t will be a
signed int. Comparing this against a size_t will perform an unwanted
sign extension.
Fix by explicitly casting e_phnum to an unsigned int, thereby matching
the data type used for the loop index variable (and avoiding the
unwanted sign extension).
This mirrors wimboot commit 15f6162 ("[efi] Fix Coverity warning about
unintended sign extension").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add additional PC-relative relocation types that may be encountered
when converting binaries compiled with clang.
This mirrors the relevant elf2efi portions of wimboot commit 7910830
("[build] Support building with the clang compiler").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The clang compiler does not (and apparently will not ever) allow for
variable-length arrays within structs.
Work around this limitation by using a fixed-length array to hold the
PDB filename in the debug section.
This mirrors wimboot commit f52c3ff ("[efi] Allow compiling elf2efi
with clang").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The function efi_pecoff_debug_name() (called by efi_handle_name()) is
used to extract a filename from the debug data directory entry located
within a PE/COFF image. The name is copied into a temporary static
buffer to allow for modifications, but the code currently erroneously
modifies the original name within the loaded PE/COFF image.
Fix by performing the modification on the copy in the temporary
buffer, as originally intended.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid bzImage and UEFI binaries (such as wimboot) may place sections
at explicit offsets within the PE file, as described in commit b30a098
("[efi] Use load memory address as file offset for hybrid binaries").
This can leave a gap after the PE headers that is not covered by any
section. It is not entirely clear whether or not such gaps are
permitted in binaries submitted for Secure Boot signing.
To minimise potential problems, extend the PE header size to cover any
space before the first explicitly placed section.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE images are linked with a starting virtual address of zero. Other
images (such as wimboot) may use a non-zero starting virtual address.
There is no direct equivalent of the PE ImageBase address field within
ELF object files. Choose to use the highest possible address that
accommodates all sections and the PE header itself, since this will
minimise the memory allocated to hold the loaded image.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The BaseOfCode (and, in PE32, BaseOfData) fields imply an assumption
that binaries are laid out as code followed by initialised data
followed by uninitialised data. This assumption may not be valid for
complex binaries such as wimboot.
Remove this implicit assumption, and use arguably justifiable values
for the assorted summary start and size fields within the PE headers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid bzImage and UEFI binaries (such as wimboot) may include 16-bit
sections such as .bss16 that do not need to consume an entry in the PE
section list. Treat any such sections as hidden.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The PE debug information generated by elf2efi is used only to hold the
image filename, and the debug information is located via the relevant
data directory entry rather than via the section table.
Make the .debug section a hidden section in order to save one entry in
the PE section list. Choose to place the debug information in the
unused space at the end of the PE headers, since it no longer needs to
satisfy the general section alignment constraints.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 1e4c378 ("[efi] Shrink size of data directory in PE header")
reduced the number of entries used in the data directory and reduced
the recorded size of the NT "optional" header, but did not also adjust
the recorded overall size of the PE headers, resulting in unused space
between the PE headers and the first section.
Fix by reducing the initial recorded size of the PE headers by the
size of the omitted data directory entries.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid bzImage and UEFI binaries (such as wimboot) include a bzImage
header within a section starting at offset zero, with the PE header
effectively occupying unused space within this section.
Allow for this by treating a section placed at offset zero as hidden,
and by deferring the writing of the PE header until after the output
sections have been written.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid bzImage and UEFI binaries (such as wimboot) may be loaded as a
single contiguous blob without reference to the PE headers, and the
placement of sections within the PE file must therefore be known at
link time.
Use the load memory address (extracted from the ELF program headers)
to determine the physical placement of the section within the PE file
when generating a hybrid binary.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The images generated by elf2efi can be loaded anywhere in the address
space, and are not limited to the low 2GB.
Indicate this by setting the "large address aware" flag within the PE
header, for compatibility with EFI images generated by the EDK2 build
process. (The EDK2 PE loader does not ever check this flag, and it is
unlikely that any other EFI PE loader ever does so, but we may as well
report it accurately.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Indicate that the binary is compatible with W^X protections by setting
the NXCOMPAT bit in the DllCharacteristics field of the PE header.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid bzImage and UEFI binaries (such as wimboot) may include 16-bit
executable code that is opaque data from the perspective of a UEFI PE
binary, as described in wimboot commit fe456ca ("[efi] Use separate
.text and .data PE sections").
The ELF section will be marked as containing both executable code and
writable data. Choose to treat such a section as a data section
rather than a code section, since that matches the expected semantics
for ELF files that we expect to process.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some AWS instance types still do not support serial console output or
screenshots. For these instance types, the only viable way to extract
debugging information is to use the INT13 console (which is already
enabled via CONFIG=cloud for all AWS images).
Obtaining the INT13 console output can be very cumbersome, since there
is no direct way to read from an AWS volume. The simplest current
approach is to stop the instance under test, detach its root volume,
and reattach the volume to a Linux instance in the same region.
Add a utility script aws-int13con to retrieve the INT13 console output
by creating a temporary snapshot, reading the first block from the
snapshot, and extracting the INT13 console partition content.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
AMI names must be unique within a region. Add a --overwrite option
that allows an existing AMI of the same name to be deregistered (and
its underlying snapshot deleted).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
EAP exchanges may take a long time to reach a final status, especially
when relying upon MAC Authentication Bypass (MAB). Our current
behaviour of sending EAPoL-Start every few seconds until a final
status is obtained can prevent these exchanges from ever completing.
Fix by redefining the EAP supplicant state to allow EAPoL-Start to be
suppressed: either temporarily (while waiting for a full EAP exchange
to complete, in which case we need to eventually resend EAPoL-Start if
the final Success or Failure packet is lost), or permanently (while
waiting for the potentially very long MAC Authentication Bypass
timeout period).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The PCI cloud API (PCIAPI_CLOUD) currently selects the first PCI API
that successfully discovers a PCI device address range. The ECAM API
may discover an address range but subsequently be unable to map the
configuration space region, which would result in the selected PCI API
being unusable.
Fix by instead selecting the first PCI API that can be successfully
used to discover a PCI device.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some machines (observed with an AWS EC2 m7a.large instance) will place
the ECAM configuration space window above 4GB, thereby making it
unreachable from non-paged 32-bit code. This problem is currently
ignored by iPXE, since the address is silently truncated in the call
to ioremap(). (Note that other uses of ioremap() are not affected
since the PCI core will already have checked for unreachable 64-bit
BARs when retrieving the physical address to be mapped.)
Fix by adding an explicit check that the region to be mapped starts
within the reachable memory address space. (Assume that no machines
will be sufficiently peverse to provide a region that straddles the
4GB boundary.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When an error occurs during ECAM configuration space mapping, preserve
the error within the existing cached mapping (instead of invalidating
the cached mapping) in order to avoid flooding the debug log with
repeated identical mapping errors.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The base address provided in the PCI ECAM allocation within the ACPI
MCFG table is the base address for the segment as a whole, not for the
starting bus within that allocation. On machines that provide ECAM
allocations with a non-zero starting bus number (observed with an AWS
EC2 m7a.large instance), this will result in iPXE accessing the wrong
memory addresses within the ECAM region.
Fix by adding the appropriate starting bus offset to the base address.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The PCIe specification requires that "processor and host bridge
implementations must ensure that a method exists for the software to
determine when the write using the ECAM is completed by the completer"
but does not specify any particular method to be used. Some platforms
might treat writes to the ECAM region as non-posted, others might
require reading back from a dedicated (and implementation-specific)
completion register to determine when the configuration space write
has completed.
Since PCI configuration space writes will never be used for any
performance-critical datapath operations (on any sane hardware), a
simple and platform-independent solution is to always read back from
the written register in order to guarantee that the write must have
completed. This is safe to do, since the PCIe specification defines a
limited set of configuration register types, none of which have read
side effects.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The ipair_tx() function uses a va_list twice (first to calculate the
formatted string length before allocation, then to construct the
string in the allocated buffer) but is missing the va_start() and
va_end() around the second usage. This is undefined behaviour that
happens to work on some build platforms.
Fix by adding the missing va_start() and va_end() around the second
usage of the variadic argument list.
Reported-by: Andreas Hammarskjöld <andreas@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently use the number of timer ticks since power-on as a seed
for the non-cryptographic RNG implemented by random(). Since iPXE is
often executed directly after power-on, and since the timer tick
resolution is generally low, this can often result in identical seed
values being used on each cold boot attempt.
As of commit 41f786c ("[settings] Add "unixtime" builtin setting to
expose the current time"), the current wall-clock time is always
available within the default build of iPXE. Use this time instead, to
introduce variability between cold boot attempts on the same host.
(Note that variability between different hosts is obtained by using
the MAC address as an additional seed value.)
This has no effect on the separate DRBG used by cryptographic code.
Suggested-by: Heiko <heik0@xs4all.nl>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We have no way to force a link-layer restart in iPXE, and therefore no
way to explicitly trigger a restart of EAP authentication. If an iPXE
script has performed some action that requires such a restart
(e.g. registering a device such that the port VLAN assignment will be
changed), then the only means currently available to effect the
restart is to reboot the whole system. If iPXE is taking over a
physical link already used by a preceding bootloader, then even a
reboot may not work.
In the EAP model, the supplicant is a pure responder and never
initiates transmissions. EAPoL extends this to include an EAPoL-Start
packet type that may be sent by the supplicant to (re)trigger EAP.
Add support for sending EAPoL-Start packets at two-second intervals on
links that are open and have reached physical link-up, but for which
EAP has not yet completed. This allows "ifclose ; ifopen" to be used
to restart the EAP process.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Extend the EAP model to include a record of whether or not EAP
authentication has completed (successfully or otherwise), and to
provide a method for transmitting EAP responses.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the FCoE code by using driver-private data to hold the FCoE
port for each network device, instead of using a separate allocation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the per-netdevice GuestInfo settings code by using
driver-private data to hold the settings block, instead of using a
separate allocation.
The settings block (if existent) will be automatically unregistered
when the parent network device settings block is unregistered, and no
longer needs to be separately freed. The guestinfo_net_remove()
function may therefore be omitted completely.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the IPv6 link-local settings code by using driver-private
data to hold the settings block, instead of using a separate
allocation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the LLDP code by using driver-private data to hold the LLDP
settings block, instead of using a separate allocation. This avoids
the need to maintain a list of LLDP settings blocks (since the LLDP
settings block pointer can always be obtained using netdev_priv()) and
obviates several failure paths.
Any recorded LLDP data is now freed when the network device is
unregistered, since there is no longer a dedicated reference counter
for the LLDP settings block. To minimise surprise, we also now
explicitly unregister the settings block. This is not strictly
necessary (since the block will be automatically unregistered when the
parent network device settings block is unregistered), but it
maintains symmetry between lldp_probe() and lldp_remove().
The overall reduction in the size of the LLDP code is around 15%.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow network upper-layer drivers (such as LLDP, which attaches to
each network device in order to provide a corresponding LLDP settings
block) to specify a size for private data, which will be allocated as
part of the network device structure (as with the existing private
data allocated for the underlying device driver).
This will allow network upper-layer drivers to be simplified by
omitting memory allocation and freeing code. If the upper-layer
driver requires a reference counter (e.g. for interface
initialisation), then it may use the network device's existing
reference counter, since this is now the reference counter for the
containing block of memory.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some network device drivers use the trivial netdev_priv() helper
function while others use the netdev->priv pointer directly.
Standardise on direct use of netdev->priv, in order to free up the
function name netdev_priv() for reuse.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently use "push $1f" within inline assembly to push the address
of the real-mode code fragment, relying on the assembler to treat this
as "pushl" for 32-bit code or "pushq" for 64-bit code.
As of binutils commit 5cc0077 ("x86: further adjust extend-to-32bit-
address conditions"), first included in binutils-2.41, this implicit
operand size is no longer calculated as expected and 64-bit builds
will fail with
Error: operand size mismatch for `push'
Fix by adding an explicit operand size to the "push" instruction.
Originally-fixed-by: Justin Cano <jstncno@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The current implementation of vpm_ioread32() erroneously reads only 16
bits of data, which fails when used with the (stricter) virtio device
emulation in VirtualBox.
Fix by using the correct readl()/inl() I/O wrappers.
Reworded-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Define the IPv4 NTP server setting to simplify the use of a
DHCP-provided NTP server in scripts, using e.g.
#!ipxe
dhcp
ntp ${ntp}
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 3ef4f7e ("[console] Avoid overlap between special keys and
Unicode characters") renumbered the special key encoding to avoid
collisions with Unicode key values outside the ASCII range. This
change broke backwards compatibility with existing scripts that
specify key values using e.g. "prompt --key" or "menu --key".
Restore compatibility with existing scripts by tweaking the special
key encoding so that the relative key value (i.e. the delta from
KEY_MIN) is numerically equal to the old pre-Unicode key value, and by
modifying parse_key() to accept a relative key value.
Reported-by: Sven Dreyer <sven@dreyer-net.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Avoid the need to always specify a local MAC address on the command
line by setting a default hardware MAC address (using the same default
address as for slirp devices).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
A running link block timer holds a reference to the network device and
will prevent it from being freed until the timer expires. It is
impossible for free_netdev() to be called while the timer is still
running: the call to stop_timer() therein is therefore a no-op.
Stop the link block timer when the device is closed, to allow a
link-blocked device to be freed immediately upon unregistration of the
device. (Since link block state is updated in response to received
packets, the state is effectively undefined for a closed device: there
is therefore no reason to leave the timer running.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The interface debug message values constructed by INTF_DBG() et al
rely on the interface being embedded within a containing object. This
assumption is not valid for the temporary outbound-only interfaces
constructed on the stack by intf_shutdown() and xfer_vredirect().
Formalise the notion of a temporary outbound-only interface as having
a NULL interface descriptor, and overload the "original interface
descriptor" field to contain a pointer to the original interface that
the temporary interface is shadowing.
Originally-fixed-by: Vincent Fazio <vfazio@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The special key range (from KEY_MIN upwards) currently overlaps with
the valid range for Unicode characters, and therefore prohibits the
use of Unicode key values outside the ASCII range.
Create space for Unicode key values by moving the special keys to the
range immediately above the maximum valid Unicode character. This
allows the existing encoding of special keys as an efficiently packed
representation of the equivalent ANSI escape sequence to be maintained
almost as-is.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The keyboard remapping flags currently occupy bits 8 and upwards of
the to-be-mapped character value. This overlaps the range used for
special keys (KEY_MIN and upwards) and also overlaps the valid Unicode
character range.
No conflict is created by this overlap, since by design only ASCII
character values (as generated by an ASCII-only keyboard driver) are
subject to remapping, and so the to-be-remapped character values exist
in a conceptually separate namespace from either special keys or
non-ASCII Unicode characters. However, the overlap is potentially
confusing for readers of the code.
Minimise cognitive load by using bits 24 and upwards for the keyboard
remapping flags.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some versions of ld will complain that the automatically created (and
unused by our build process) ELF program headers include a "LOAD
segment with RWX permissions".
Silence this warning by adding "-z separate-code" to the linker
options, where supported.
For BIOS builds, where the prefix will generally require writable
access to its own (tiny) code segment, simply inhibit the warning
completely via "--no-warn-rwx-segments".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Multiple target patterns in pattern rules are treated as grouped
targets regardless of the separator character. Newer verions of make
will generate "warning: pattern recipe did not update peer target" to
warn that the rule was expected to update all of the (implicitly)
grouped targets.
Fix by splitting all multiple target pattern rules into single target
pattern rules.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is no common standard for I/O-space access for non-x86 CPU
families, and non-MMIO peripherals are vanishingly rare.
Generalise the existing ARM definitions for dummy PIO to allow for
reuse by other CPU architectures.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The PCI I/O API (supporting accesses to PCI configuration space) is
not related to the general I/O API (supporting accesses to
memory-mapped I/O peripherals).
Remove the spurious inclusion of ipxe/io.h from the PCI I/O header.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
While not guaranteed by the UEFI specification, the enumeration of
handles, protocols, and openers will generally return results in order
of creation. Processing these objects in reverse order (as is already
done when calling DisconnectController() on the list of all handles)
will generally therefore perform the forcible uninstallation
operations in reverse order of object creation, which minimises the
number of implicit operations performed (e.g. when disconnecting a
controller that itself still has existent child controllers).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI specification states that the AgentHandle may be either the
driving binding protocol handle or the image handle.
Check for both handles when searching for stale handles to be forcibly
closed on behalf of a vetoed driver.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In most cases, the driver handle will be the image handle itself.
However, this is not required by the UEFI specification, and some
images will install multiple driver binding handles.
Use the image handle (extracted from the driver binding protocol
instance) when attempting to unload the driver's image.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Pass the driver binding handle, the driver binding protocol instance,
the image handle, and the loaded image protocol instance to all veto
methods.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the process of adding new entries to the veto list by
including the manufacturer name within the standard debug output.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Polling for TX completions is arguably redundant when there are no
transmissions currently in progress. Commit c6c7e78 ("[efi] Poll for
TX completions only when there is an outstanding TX buffer") switched
to setting the PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS flag only when
there is an in-progress transmission awaiting completion, in order to
reduce reported TX errors and debug message noise from buggy NII
implementations that report spurious TX completions whenever the
transmit queue is empty.
Some other NII implementations (observed with the Realtek driver in a
Dell Latitude 3440) seem to have a bug in the transmit datapath
handling which results in the transmit ring freezing after sending a
few hundred packets under heavy load. The symptoms are that the
TPPoll register's NPQ bit remains set and the 256-entry transmit ring
contains a large number of uncompleted descriptors (with the OWN bit
set), the first two of which have identical data buffer addresses.
Though iPXE will submit at most one in-progress transmission via NII,
the Dell/Realtek driver seems to make a page-aligned copy of each
transmit data buffer and to report TX completions immediately without
waiting for the packet to actually be transmitted. These synthetic TX
completions continue even after the hardware transmit ring freezes.
Setting PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll reduces the
probability of this Dell/Realtek driver bug being triggered by a
factor of around 500, which brings the failure rate down to the point
that it can sensibly be managed by external logic such as the
"--timeout" option for image downloads. Closing and reopening the
interface (via "ifclose"/"ifopen") will clear the error condition and
allow transmissions to resume.
Revert to setting PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll,
and silently ignore situations in which the hardware reports a
completion when no transmission is in progress. This approximately
matches the behaviour of the SnpDxe driver, which will also generally
set PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
EFI variables do not map neatly to the iPXE settings mechanism, since
the EFI variable identifier includes a namespace GUID that cannot
cleanly be supplied as part of a setting name. Creating a new EFI
variable requires the variable's attributes to be specified, which
does not fit within iPXE's settings concept.
However, EFI variable names are generally unique even without the
namespace GUID, and EFI does provide a mechanism to iterate over all
existent variables. We can therefore provide read-only access to EFI
variables by comparing only the names and ignoring the namespace
GUIDs.
Provide an "efi" settings block that implements this mechanism using a
syntax such as:
echo Platform language is ${efi/PlatformLang:string}
show efi/SecureBoot:int8
Settings are returned as raw binary values by default since an EFI
variable may contain boolean flags, integer values, ASCII strings,
UCS-2 strings, EFI device paths, X.509 certificates, or any other
arbitrary blob of data.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EDK2 UefiPxeBcDxe driver includes some remarkably convoluted and
unsafe logic in its driver binding protocol Start() and Stop() methods
in order to support a pair of nominally independent driver binding
protocols (one for IPv4, one for IPv6) sharing a single dynamically
allocated data structure. This PXEBC_PRIVATE_DATA structure is
installed as a dummy protocol on the NIC handle in order to allow both
IPv4 and IPv6 driver binding protocols to locate it as needed.
The error handling code path in the UefiPxeBcDxe driver's Start()
method may attempt to uninstall the dummy protocol but fail to do so.
This failure is ignored and the containing memory is subsequently
freed anyway. On the next invocation of the driver binding protocol,
it will find and use this already freed block of memory. At some
point another memory allocation will occur, the PXEBC_PRIVATE_DATA
structure will be corrupted, and some undefined behaviour will occur.
The UEFI firmware used in VMware ESX 8 includes some proprietary
changes which attempt to install copies of the EFI_LOAD_FILE_PROTOCOL
and EFI_PXE_BASE_CODE_PROTOCOL instances from the IPv4 child handle
onto the NIC handle (along with a VMware-specific protocol with GUID
5190120d-453b-4d48-958d-f0bab3bc2161 and a NULL instance pointer).
This will inevitably fail with iPXE, since the NIC handle already
includes an EFI_LOAD_FILE_PROTOCOL instance.
These VMware proprietary changes end up triggering the unsafe error
handling code path described above. The typical symptom is that an
attempt to exit from iPXE back to the UEFI firmware will crash the VM
with a General Protection fault from within the UefiPxeBcDxe driver:
this happens when the UefiPxeBcDxe driver's Stop() method attempts to
call through a function pointer in the (freed) PXEBC_PRIVATE_DATA
structure, but the function pointer has by then been overwritten by
UCS-2 character data from an unrelated memory allocation.
Work around this failure by adding the VMware UefiPxeBcDxe driver to
the driver veto list.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The old IPv4-only IScsiDxe driver in MdeModulePkg/Universal/Network
was replaced by a dual-stack IScsiDxe driver in NetworkPkg.
Add the module GUID for this driver.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EDK2 headers may be included even in builds for non-EFI platforms.
Commits such as 9de6c45 ("[arm] Use -fno-short-enums for all 32-bit
ARM builds") have so far ensured that the compile-time checks within
the EDK2 headers will pass even when building for a non-EFI platform.
As a more general solution, temporarily disable static assertions
while including UefiBaseType.h if building on a non-EFI platform.
This avoids the need to modify the ABI on other platforms.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The "shim" command will skip downloading the shim binary (and is
therefore a conditional no-op) if there is already a selected EFI
image that can be executed directly via LoadImage()/StartImage().
This allows the same iPXE script to be used with Secure Boot either
enabled or disabled.
Generalise this further to provide a dummy "shim" command that is an
unconditional no-op on non-EFI platforms. This then allows the same
iPXE script to be used for BIOS, EFI with Secure Boot disabled, or EFI
with Secure Boot enabled.
The same effect could be achieved by using "iseq ${platform} efi"
within the script, but this would complicate end-user documentation.
To minimise the code size impact, the dummy "shim" command is a pure
no-op that does not call parse_options() and so will ignore even
standardised arguments such as "--help".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI shim implements a fairly nicely designed revocation mechanism
designed around the concept of security generations. Unfortunately
nobody in the shim community has thus far added the relevant metadata
to the Linux kernel, with the result that current versions of shim are
incapable of booting current versions of the Linux kernel.
Experience shows that there is unfortunately no point in trying to get
a fix for this upstreamed into shim. We therefore default to working
around this undesirable behaviour by patching data read from the
"SbatLevel" variable used to hold SBAT configuration.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow a shim to be used to facilitate booting a kernel using a script
such as:
kernel /images/vmlinuz console=ttyS0,115200n8
initrd /images/initrd.img
shim /images/shimx64.efi
boot
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for using a shim as a helper to execute an EFI image.
When a shim has been specified via shim(), the shim image will be
passed to LoadImage() instead of the selected EFI image and the
command line will be prepended with the name of the selected EFI
image. The selected EFI image will be accessible to the shim via the
virtual filesystem as a hidden file.
Reduce the Secure Boot attack surface by removing, where possible, the
spurious requirement for a third party second stage loader binary such
as GRUB to be used solely in order to call the "shim lock protocol"
entry point.
Do not install the EFI PXE APIs when using a shim, since if shim finds
EFI_PXE_BASE_CODE_PROTOCOL on the loaded image's device handle then it
will attempt to download files afresh instead of using the files
already downloaded by iPXE and exposed via the EFI_SIMPLE_FILE_SYSTEM
protocol. (Experience shows that there is no point in trying to get a
fix for this upstreamed into shim.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI shim includes a "shim lock protocol" that can be used by a
third party second stage loader such as GRUB to verify a kernel image.
Add definitions for the relevant portions of this protocol interface.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Most image flags are independent values: any combination of flags may
be set for any image, and the flags for one image are independent of
the flags for any other image. The "selected" flag does not follow
this pattern: at most one image may be marked as selected at any time.
When invoking a kernel via the UEFI shim, there will be multiple
"special" images: the selected kernel itself, the shim image, and
potentially a shim-signed GRUB binary to be used as a crutch to assist
shim in loading the kernel (since current versions of the UEFI shim
are not capable of directly loading a Linux kernel).
Remove the "selected" image flag and replace it with a general concept
of an image tag with the same semantics: a given tag may be assigned
to at most one image, an image may be found by its tag only while the
image is currently registered, and a tag will survive unregistration
and reregistration of an image (if it has not already been assigned to
a new image). For visual consistency, also replace the current image
pointer with a current image tag.
The image pointer stored within the image tag holds only a weak
reference to the image, since the selection of an image should not
prevent that image from being freed. (The strong reference to the
currently executing image is held locally within the execution scope
of image_exec(), and is logically separate from the current image
pointer.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An EFI image that is rejected by LoadImage() due to failing Secure
Boot verification is still an EFI image. Unfortunately, the extremely
broken UEFI Secure Boot model provides no way for us to unambiguously
determine that a valid EFI executable image was rejected only because
it failed signature verification. We must therefore use heuristics to
guess whether not an image that was rejected by LoadImage() could
still be loaded via a separate PE loader such as the UEFI shim.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The libc6-dbg:i386 package has spontaneously started failing to
install from the Azure package repositories used by the GitHub Actions
runners, with the somewhat recalcitrant error message:
libc6:i386: Depends: libgcc-s1:i386 but it is not going to be installed
Work around this unexplained issue by explicitly requesting
installation of the libgcc-s1:i386 package.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Versions 15.4 and earlier of the UEFI shim are incapable of correctly
parsing the command line in order to extract the second stage loader
filename, and will always attempt to load "grubx64.efi" or equivalent.
Versions 15.3 and later of the UEFI shim are currently incapable of
loading a Linux kernel directly anyway, since the kernel does not
include SBAT metadata. These versions will require a genuine
shim-signed GRUB binary to be used as a crutch to assist shim in
loading a Linux kernel.
This leaves versions 15.2 and earlier of the UEFI shim (as currently
used in e.g. RHEL7) as being capable of directly loading a Linux
kernel, but incorrectly attempting to load it using the filename
"grubx64.efi" or equivalent. To support the bugs in these older
versions of the UEFI shim, allow the currently selected image to be
opened via any filename of the form "grub*.efi".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When invoking a kernel via the UEFI shim, the kernel image must be
accessible via EFI_SIMPLE_FILE_SYSTEM_PROTOCOL but must not be present
in the magic initrd constructed from all registered images.
Re-register a currently executing EFI image and mark it as hidden,
thereby allowing it to be accessed via the virtual filesystem exposed
via EFI_SIMPLE_FILE_SYSTEM_PROTOCOL without appearing in the magic
initrd contents.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When invoking a kernel via the UEFI shim, the kernel (and potentially
also a helper binary such as GRUB) must be accessible via the virtual
filesystem exposed via EFI_SIMPLE_FILE_SYSTEM_PROTOCOL but must not be
present in the magic initrd constructed from all registered images.
Allow for images to be flagged as hidden, which will cause them to be
excluded from API-level lists of all images such as the virtual
filesystem directory contents, the magic initrd, or the Multiboot
module list. Hidden images remain visible to iPXE commands including
"imgstat", which will show a "[HIDDEN]" flag for such images.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Show the original filename as used by the consumer when calling our
EFI_SIMPLE_FILE_SYSTEM_PROTOCOL's Open() method.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Try searching for a matching registered image before checking for
fixed filenames (such as "initrd.magic" for the dynamically generated
magic initrd file). This minimises surprise by ensuring that an
explicitly downloaded image will always be used verbatim.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid bzImage and UEFI binaries (such as wimboot) include a bzImage
header within a section starting at offset zero, with the PE header
effectively occupying unused space within this section. This section
should not appear as a named section in the resulting PE file.
Allow for the existence of hidden sections that do not result in a
section header being written to the PE file.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid 32-bit BIOS and 64-bit UEFI binaries (such as wimboot) may
include R_X86_64_32 relocation records for the 32-bit BIOS portions.
These should be ignored when generating PE relocation records, since
they apply only to code that cannot be executed within the context of
the 64-bit UEFI binary, and creating a 4-byte relocation record is
invalid in a binary that may be relocated anywhere within the 64-bit
address space (see commit 907cffb "[efi] Disallow R_X86_64_32
relocations").
Add a "--hybrid" option to elf2efi, which will cause R_X86_64_32
relocation records to be silently discarded.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid bzImage and UEFI binaries (such as wimboot) require the PE
header to be kept as small as possible, since the bzImage header
starts at a fixed offset 0x1f1.
The EFI_IMAGE_OPTIONAL_HEADER structures in PeImage.h define an
optional header containing 16 data directory entries, of which the
last eight are unused in binaries that we create. Shrink the data
directory to contain only the first eight entries, to minimise the
overall size of the PE header.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hybrid bzImage and UEFI binaries (such as wimboot) require the PE
header to be kept as small as possible, since the bzImage header
starts at a fixed offset 0x1f1.
The PE header currently includes 128 bytes of zero padding between the
DOS and NT header portions. This padding has been present since
commit 81d92c6 ("[efi] Add EFI image format and basic runtime
environment") first added support for EFI images in iPXE, and was
included on the basis of matching the observed behaviour of the
Microsoft toolchain. There appears to be no requirement for this
padding to exist: EDK2 binaries built with gcc include only 64 bytes
of zero padding, Linux kernel binaries include 66 bytes of non-zero
padding, and wimboot binaries include no padding at all.
Remove the unnecessary padding between the DOS and NT header portions
to minimise the overall size of the PE header.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Prepare for the possibility that a record handler may choose not to
consume the entire record by passing the I/O buffer and requiring the
handler to mark consumed data using iob_pull().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As documented in commits 6a004be ("[efi] Support the initrd
autodetection mechanism in newer Linux kernels") and 04e60a2 ("[efi]
Omit EFI_LOAD_FILE2_PROTOCOL for a zero-length initrd"), the choice in
Linux of using a fixed device path requires bootloaders to allow for
the fact that a previous bootloader may have already installed a
handle with the fixed device path.
We currently deal with this situation by reusing the existing handle,
replacing the EFI_LOAD_FILE2_PROTOCOL instance with our own. Simplify
the code by instead uninstalling the EFI_DEVICE_PATH_PROTOCOL instance
from the existing handle (if present), thereby allowing the creation
of a new handle to succeed.
Create the new handle only if we have a non-empty initrd to provide.
This works around bugs in bootloaders such as the systemd EFI stub
that fail to allow for the existence of multiple-bootloader chains.
(The workaround is not comprehensive: if the user has downloaded other
images in iPXE before invoking the systemd Unified Kernel Image (UKI),
then the systemd EFI stub will still crash and burn since it fails to
allow for the fact that a previous bootloader has already installed a
handle with the fixed device path.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Intel I210's packet buffer size registers reset only on power up,
not when a reset signal is asserted. This can lead to the inability
to pass traffic in the event that the DMA TX Maximum Packet Size
(which does reset to its default value on reset) is bigger than the TX
Packet Buffer Size.
For example, an operating system may be using the time sensitive
networking features of the I210 and the registers may be programmed
correctly, but then a reset signal is asserted and iPXE on the next
boot will be unable to use the I210.
Mimic what Linux does and forcibly set the registers to their default
values.
Signed-off-by: Matt Parrella <parrella.matthew@gmail.com>
When a DHCP transaction does not result in the registration of a new
"proxydhcp" or "pxebs" settings block, any existing settings blocks
are currently left unaltered.
This can cause surprising behaviour. For example: when chainloading
iPXE, the "proxydhcp" and "pxebs" settings blocks may be prepopulated
using cached values from the previous PXE bootloader. If iPXE
performs a subsequent DHCP request, then the DHCP or ProxyDHCP servers
may choose to respond differently to iPXE. The response may choose to
omit the ProxyDHCP or PXEBS stages, in which case no new "proxydhcp"
or "pxebs" settings blocks may be registered. This will result in
iPXE using a combination of both old and new DHCP responses.
Fix by assuming that a successful DHCPACK effectively acquires
ownership of the "proxydhcp" and "pxebs" settings blocks, and that any
existing settings blocks should therefore be unregistered.
Reported-by: Henry Tung <htung@palantir.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We unregister script images during their execution, to prevent a
"boot" command from re-executing the containing script. This also has
the side effect of preventing executing scripts from showing up within
the Linux magic initrd image (or the Multiboot module list).
Additional logic in bzimage.c and efi_file.c prevents a currently
executing kernel from showing up within the magic initrd image.
Similar logic in multiboot.c prevents the Multiboot kernel from
showing up as a Multiboot module.
This still leaves some corner cases that are not covered correctly.
For example: when using a gzip-compressed kernel image, nothing will
currently hide the original compressed image from the magic initrd.
Fix by moving the logic that temporarily unregisters the current image
from script_exec() to image_exec(), so that it applies to all image
types, and simplify the magic initrd and Multiboot module list
construction logic on the basis that no further filtering of the
registered image list is necessary.
This change has the side effect of hiding currently executing EFI
images from the virtual filesystem exposed by iPXE. For example, when
using iPXE to boot wimboot, the wimboot binary itself will no longer
be visible within the virtual filesystem.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Prepare for the parameter mechanism to be generalised to specifying
request parameters that are passed via mechanisms other than an
application/x-www-form-urlencoded form.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An attempt to use an existent but empty form parameter list will
currently result in an invalid POST request since the Content-Length
header will be missing.
Fix by using GET instead of POST if the form parameter list is empty.
This is a non-breaking change (since the current behaviour produces an
invalid request), and simplifies the imminent generalisation of the
parameter list concept to handle both header and form parameters.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When the Linux kernel is being used with no initrd, iPXE will still
provide a zero-length initrd.magic file within the virtual filesystem.
As of commit 6a004be ("[efi] Support the initrd autodetection
mechanism in newer Linux kernels"), this zero-length file will also be
exposed via an EFI_LOAD_FILE2_PROTOCOL instance on a handle with a
fixed device path.
The correct handling of zero-length files via EFI_LOAD_FILE2_PROTOCOL
is unfortunately not well defined.
Linux expects the first call to LoadFile() to always fail with
EFI_BUFFER_TOO_SMALL. When the initrd is genuinely zero-length, iPXE
will return success since the buffer is not too small to hold the
(zero-length) file. This causes Linux to immediately report a
spurious EFI_LOAD_ERROR boot failure.
We could change the logic in iPXE's efi_file_load() to always return
EFI_BUFFER_TOO_SMALL if Buffer is NULL on entry. Since the correct
behaviour of LoadFile() in the corner case of a zero-length file is
left undefined by the UEFI specification, this would be permissible.
Unfortunately this approach would not fix the problem. If we return
EFI_BUFFER_TOO_SMALL and set the file length to zero, then Linux will
call the boot services AllocatePages() method with a zero length. In
at least the EDK2 implementation, this combination of parameters will
cause AllocatePages() to return EFI_OUT_OF_RESOURCES, and Linux will
again report a boot failure.
Another approach would be to install the initrd device path handle
only if we have a non-empty initrd to offer. Unfortunately this would
lead to a failure in yet another corner case: if a previous bootloader
has installed an initrd device path handle (e.g. to pass a boot script
to iPXE) then we must not leave that initrd in place, since then our
loaded kernel would end up seeing the wrong initrd content.
The cleanest fix seems to be to ensure that the initrd device path
handle is installed with the EFI_DEVICE_PATH_PROTOCOL instance present
but with the EFI_LOAD_FILE2_PROTOCOL instance absent (and forcibly
uninstalled if necessary), matching the state in which we leave the
handle after uninstalling our virtual filesystem. Linux will then not
find any handle that supports EFI_LOAD_FILE2_PROTOCOL within the fixed
device path, and so will fall through to trying other mechanisms to
locate the initrd.
Reported-by: Chris Bradshaw <cwbshaw@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 7ca801d ("[efi] Use the EFI_RNG_PROTOCOL as an entropy source
if available") added EFI_RNG_PROTOCOL as an alternative entropy source
via an ad-hoc mechanism specific to efi_entropy.c.
Split out EFI_RNG_PROTOCOL to a separate entropy source, and allow the
entropy core to handle the selection of RDRAND, EFI_RNG_PROTOCOL, or
timer ticks as the active source.
The fault detection logic added in commit a87537d ("[efi] Detect and
disable seriously broken EFI_RNG_PROTOCOL implementations") may be
removed completely, since the failure will already be detected by the
generic ANS X9.82-mandated repetition count test and will now be
handled gracefully by the entropy core.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide per-source state variables for the repetition count test and
adaptive proportion test, to allow for the situation in which an
entropy source can be enabled but then fails during the startup tests,
thereby requiring an alternative entropy source to be used.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As noted in commit 3c83843 ("[rng] Check for several functioning RTC
interrupts"), experimentation shows that Hyper-V cannot be trusted to
reliably generate RTC interrupts. (As noted in commit f3ba0fb
("[hyperv] Provide timer based on the 10MHz time reference count
MSR"), Hyper-V appears to suffer from a general problem in reliably
generating any legacy interrupts.) An alternative entropy source is
therefore required for an image that may be used in a Hyper-V Gen1
virtual machine.
The x86 RDRAND instruction provides a suitable alternative entropy
source, but may not be supported by all CPUs. We must therefore allow
for multiple entropy sources to be compiled in, with the single active
entropy source selected only at runtime.
Restructure the internal entropy API to allow a working entropy source
to be detected and chosen at runtime.
Enable the RDRAND entropy source for all x86 builds, since it is
likely to be substantially faster than any other source.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently specify only the iSCSI default value for MaxBurstLength
and ignore any negotiated value, since our internal block device API
allows only for receiving directly into caller-allocated buffers and
so we have no intrinsic limit on burst length.
A conscientious target may however refuse to attempt a transfer that
we request for a number of blocks that would exceed the negotiated
maximum burst length.
Fix by recording the negotiated maximum burst length and using it to
limit the maximum number of blocks per transfer as reported by the
SCSI layer.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Linux 5.7 added the ability to autodetect an initrd by searching for a
handle via a fixed vendor-specific "Linux initrd device path" and then
locating and using the EFI_LOAD_FILE2_PROTOCOL instance on that
handle.
This maps quite naturally onto our existing concept of a "magic
initrd" as introduced for EFI in commit e5f0255 ("[efi] Provide an
"initrd.magic" file for use by UEFI kernels").
Add an EFI_LOAD_FILE2_PROTOCOL instance to our EFI virtual files
(backed by simply calling the existing EFI_SIMPLE_FILE_SYSTEM_PROTOCOL
method to read from the file), and install the protocol instance for
the "initrd.magic" virtual file onto a new device handle that also
provides the Linux initrd device path.
The design choice in Linux of using a single fixed device path makes
this unfortunately messy to support, since device paths must be unique
within a system. When multiple bootloaders are used (e.g. GRUB
loading iPXE loading Linux) then only one bootloader can ever install
the device path onto a handle. Subsequent bootloaders must locate the
existing handle and replace the load file protocol instance with their
own.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Show the requested range when a caller reads from a virtual file via
the EFI_SIMPLE_FILE_SYSTEM_PROTOCOL interface.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Linux kernel bzImage image format and the CPIO archive constructor
will parse the image command line for certain arguments of the form
"key=value". This parsing is currently implemented using strstr() in
a way that can cause a false positive suffix match. For example, a
command line containing "highmem=<n>" would erroneously be treated as
containing a value for "mem=<n>".
Fix by centralising the logic used for parsing such arguments, and
including a check that the argument immediately follows a whitespace
delimiter (or is at the start of the string).
Reported-by: Filippo Giunchedi <filippo@esaurito.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 74222cd ("[rng] Check for functioning RTC interrupt") added a
check that the RTC is capable of generating interrupts via the legacy
PIC, since this mechanism appears to be broken in some Hyper-V virtual
machines.
Experimentation shows that the RTC is sometimes capable of generating
a single interrupt, but will then generate no subsequent interrupts.
This currently causes rtc_entropy_check() to falsely detect that the
entropy gathering mechanism is functional.
Fix by checking for several RTC interrupts before declaring that it is
a functional entropy source.
Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
EISA expansion slot I/O port addresses overlap space that may be
assigned to PCI devices, which can lead to register reads and writes
with unwanted side effects during EISA probing.
Reduce the chances of performing EISA probing on PCI devices by
probing EISA slot vendor and product ID registers only if the EISA
system board vendor ID register indicates that the motherboard
supports EISA.
Debugged-by: Václav Ovsík <vaclav.ovsik@gmail.com>
Tested-by: Václav Ovsík <vaclav.ovsik@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for building a LoongArch64 Linux userspace binary.
Signed-off-by: Xiaotian Wu <wuxiaotian@loongson.cn>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The test suites for the various architectures are often run back to
back, and there is currently nothing to visually distinguish one test
run from another.
Include the architecture name within the self-test startup banner, to
aid in visual identification of test results.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Speed up the "Install packages" step for each CI run by caching the
downloaded packages in /var/cache/apt.
Do not include libc6-dbg:i386 within the cache, since apt seems to
complain if asked to download both gcc-aarch64-linux-gnu and
libc6-dbg:i386 at the same time.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The PAGE_SHIFT definition is an architectural property, rather than an
aspect of a particular I/O API implementation (of which, in theory,
there may be more than one per architecture).
Reflect this by moving the definition to the top-level bits/io.h for
each architecture.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Over the years, the undocumented operand modifier used to produce the
unprefixed constant values in __einfo_error() has varied from "%c0" to
"%a0" in commit 1a77466 ("[build] Fix use of inline assembly on GCC
4.8 ARM64 builds") and back to "%c0" in commit 3fb3ffc ("[build] Fix
use of inline assembly on GCC 8 ARM64 builds"), according to the
evolving demands of the toolchain.
LoongArch64 suffers from a similar issue: GCC 13 will allow either,
but the currently released GCC 12 allows only the "%a0" form.
Introduce a macro ASM_NO_PREFIX, defined in bits/compiler.h, to
abstract away this difference and allow different architectures to use
different operand modifiers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Xen headers support only x86 and ARM. Allow for platforms such as
LoongArch64 to build despite the absence of Xen support by providing
an architecture-specific <bits/xen.h> that simply does:
#ifndef _BITS_XEN_H
#define _BITS_XEN_H
#include <ipxe/nonxen.h>
#endif /* _BITS_XEN_H */
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for recording LLDP packets and exposing TLV values via the
settings mechanism. LLDP settings are encoded as
${netX.lldp/<prefix>.<type>.<index>.<offset>.<length>}
where
<type> is the TLV type
<offset> is the starting offset within the TLV value
<length> is the length (or zero to read the from <offset> to the end)
<prefix>, if it has a non-zero value, is the subtype byte string of
length <offset> to match at the start of the TLV value, up to a
maximum matched length of 4 bytes
<index> is the index of the entry matching <type> and <prefix> to be
accessed, with zero indicating the first matching entry
The <prefix> is designed to accommodate both matching of the OUI
within an organization-specific TLV (e.g. 0x0080c2 for IEEE 802.1
TLVs) and of a subtype byte as found within many TLVs.
This encoding allows most LLDP values to be extracted easily. For
example
System name: ${netX.lldp/5.0.0.0:string}
System description: ${netX.lldp/6.0.0.0:string}
Port description: ${netX.lldp/4.0.0.0:string}
Port interface name: ${netX.lldp/5.2.0.1.0:string}
Chassis MAC address: ${netX.lldp/4.1.0.1.0:hex}
Management IPv4 address: ${netX.lldp/5.1.8.0.2.4:ipv4}
Port VLAN ID: ${netX.lldp/0x0080c2.1.127.0.4.2:int16}
Port VLAN name: ${netX.lldp/0x0080c2.3.127.0.7.0:string}
Maximum frame size: ${netX.lldp/0x00120f.4.127.0.4.2:uint16}
Originally-implemented-by: Marin Hannache <git@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
# -*- makefile -*- : Force emacs to use Makefile mode
# Starting virtual address
#
LDFLAGS+= -Ttext=0x120000000
# Include generic Linux Makefile
#
MAKEDEPS+= Makefile.linux
includeMakefile.linux
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.