Decompressing ELF Directly to Memory on Linux/x86
Copyright (C) 2000-2024 John F. Reiser  jreiser@BitWagon.com

References:
  <elf.h>                            definitions for the ELF file format
  /usr/src/linux/fs/binfmt_elf.c     what Linux execve() does with ELF
  objdump --private-headers a.elf    dump the Elf32_Phdr
  http://www.cygnus.com/pubs/gnupro/5_ut/b_Usingld/ldLinker_scripts.html
                                     how to construct unusual ELF using /bin/ld

There is exactly one immovable object: In all of the Linux kernel,
only the execve() system call sets the initial value of "the brk(0)",
the value that is manipulated by system call 45 (__NR_brk in
/usr/include/asm/unistd.h). For "direct to memory" decompression,
there will be no execve() except for the execve() of the decompressor
program itself. So, the decompressor program (which contains the
compressed version of the original executable) must have the same
brk() as the original executable. So, the first PT_LOAD
ELF "segment" of the compressed program is used only to set the brk(0).
See src/p_lx_elf.cpp, function PackLinuxElf32::generateElfHdr.
All of the decompressor's code, and all of the compressed image
of the original executable, reside in the first PT_LOAD of the
decompressor program.

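To make the brk() constraint concrete, here is a minimal sketch in C
of how the initial brk can be estimated from the program headers.
It assumes 4 KiB pages and no address-space randomization, and it
reflects one reading of fs/binfmt_elf.c (the break starts at the
page-aligned end of the highest PT_LOAD). The name initial_brk() is
hypothetical and is not part of UPX.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 0x1000u   /* assumption: 4 KiB pages, as on i386 */

/* Hypothetical helper: page-aligned end of the highest PT_LOAD.
 * Assumption: this is where execve() places the initial brk(0)
 * when brk randomization is not in effect. */
static uint32_t initial_brk(const Elf32_Phdr *phdr, size_t phnum)
{
    uint32_t end = 0;

    assert(phdr != NULL || phnum == 0);
    for (size_t i = 0; i < phnum; ++i) {
        if (phdr[i].p_type != PT_LOAD)
            continue;                       /* only PT_LOAD counts */
        uint32_t seg_end = phdr[i].p_vaddr + phdr[i].p_memsz;
        if (seg_end > end)
            end = seg_end;
    }
    /* Round up to the next page boundary. */
    return (end + PAGE_SIZE_ASSUMED - 1) & ~(PAGE_SIZE_ASSUMED - 1);
}
```

A compressor could compare initial_brk() of the original Phdrs with
initial_brk() of the Phdrs it is about to write; the two values must
agree for the scheme described above to work.
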
The decompressor program stub is just under 2K bytes when linked.
After linking, the decompressor code is converted to an initialized
array, and #included into the compilation of the compressor;
see stub/i386-linux.elf-entry.h. To make self-contained compressed
executables even smaller, the compressor also compresses all but the
startup and decompression subroutine of the decompressor itself,
saving a few hundred bytes. The startup code first decompresses the
rest of the decompressor, then jumps to it. A nonstandard linker
script src/stub/src/i386-linux.elf-entry.lds arranges the SECTIONS
so that PackLinuxElf32x86::buildLoader() and buildLinuxLoader()
generate the desired stub code, which goes into PT_LOAD[1].

At runtime, the decompressed stub lives close beyond the brk().
In order for the decompressed stub to work properly at an address
that is different from its link-time address, the compiled code must
contain no absolute addresses. So, the data items in stub code
must be only parameters and automatic (on-stack) local variables;
no global data, no static data, and no string constants. Also,
the '&' operator may not be used to take the address of a function.

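As an illustration of the "no string constants" rule, the sketch
below builds a pathname in a caller-supplied (typically on-stack)
buffer one character at a time, so that the source contains no string
literal. The function name put_proc_self_exe() is hypothetical and is
not taken from the UPX stub. Whether a given compiler really emits
only immediate stores, with no reference to a read-only data section,
has to be checked in the disassembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write "/proc/self/exe" into buf without using
 * a string literal.  Returns the length, not counting the final NUL. */
static size_t put_proc_self_exe(char *buf, size_t cap)
{
    size_t n = 0;

    assert(buf != NULL && cap >= 15);   /* 14 characters plus NUL */
    buf[n++] = '/'; buf[n++] = 'p'; buf[n++] = 'r'; buf[n++] = 'o';
    buf[n++] = 'c'; buf[n++] = '/'; buf[n++] = 's'; buf[n++] = 'e';
    buf[n++] = 'l'; buf[n++] = 'f'; buf[n++] = '/'; buf[n++] = 'e';
    buf[n++] = 'x'; buf[n++] = 'e';
    buf[n] = '\0';
    return n;
}
```

The caller passes an automatic array, for example char name[16], so
both the code and the data stay independent of the link-time address.
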
Decompression of the executable begins by decompressing the Elf32_Ehdr
and Elf32_Phdr, and then uses those Ehdr and Phdrs to control decompression
of the PT_LOAD segments. Subroutine do_xmap() of
src/stub/src/i386-linux.elf-main.c performs the
"virtual execve()" using the compressed data as source, and stores
the decompressed bytes directly into the appropriate virtual addresses.

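Two small calculations sit at the heart of any such "virtual
execve()": turning a PT_LOAD into a page-aligned address range, and
turning its p_flags into mmap() protection bits. The sketch below
shows both, assuming 4 KiB pages. The names page_span, phdr_span()
and phdr_prot() are hypothetical; this is not the code of do_xmap().

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE_MASK_ASSUMED 0xfffu    /* assumption: 4 KiB pages */

struct page_span { uint32_t lo, hi; };   /* half-open range [lo, hi) */

/* Page-aligned range of memory that one PT_LOAD occupies. */
static struct page_span phdr_span(const Elf32_Phdr *ph)
{
    struct page_span s;

    assert(ph != NULL && ph->p_type == PT_LOAD);
    s.lo = ph->p_vaddr & ~PAGE_MASK_ASSUMED;
    s.hi = (ph->p_vaddr + ph->p_memsz + PAGE_MASK_ASSUMED)
           & ~PAGE_MASK_ASSUMED;
    return s;
}

/* Translate ELF segment flags (PF_*) into mmap() bits (PROT_*). */
static int phdr_prot(uint32_t p_flags)
{
    int prot = PROT_NONE;

    if (p_flags & PF_R) prot |= PROT_READ;
    if (p_flags & PF_W) prot |= PROT_WRITE;
    if (p_flags & PF_X) prot |= PROT_EXEC;
    return prot;
}
```

A loader would typically map [lo, hi) writable, decompress into it,
and only then apply the protection computed by phdr_prot().
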
Before transferring control to the PT_INTERP "program interpreter",
minor tricks are required to set up the Elf32_auxv_t entries,
clear the free portion of the stack (to compensate for ld-linux.so.2
assuming that its automatic stack variables are initialized to zero),
and remove (all but 4 bytes of) the decompression program (and
compressed executable) from the address space.

As of upx-3.05, by default on Linux, one page of the compressed
executable remains mapped into the address space of the process
after decompression. If all of the pages of the compressed executable
are unmapped, then the Linux kernel erases the symlink /proc/self/exe,
and this can cause trouble for the runtime shared library loader
expanding $ORIGIN in -rpath, or for application code that relies on
/proc/self/exe. Use the compress-time command-line option
--unmap-all-pages to unmap every page at run time anyway. upx-3.04
and previous versions did this by default with no option. However,
too much other software erroneously assumes that /proc/self/exe
always exists. upx-4.3.0 made /proc/self/exe optional so that
chroot() and related environments can work.
For ELF formats, UPX adds an environment variable named "   " [three
spaces] which saves the result of readlink("/proc/self/exe",,).
If /proc/self/exe is ENOENT, then the variable instead holds the
literal string "/proc/self/exe".

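Application code that wants to benefit from the saved value could
consult the three-space variable first and fall back to readlink().
The following is a sketch only; original_exe_path() is a hypothetical
name, and the exact format of the stored value should be checked
against the UPX version in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return the pathname saved in the environment
 * variable whose name is three spaces, else ask the kernel directly.
 * buf is used only by the readlink() fallback. */
static const char *original_exe_path(char *buf, size_t cap)
{
    const char *saved;
    ssize_t n;

    assert(buf != NULL && cap > 1);
    saved = getenv("   ");              /* name is exactly three spaces */
    if (saved != NULL)
        return saved;

    n = readlink("/proc/self/exe", buf, cap - 1);
    if (n < 0)
        return NULL;                    /* e.g. ENOENT inside a chroot */
    buf[n] = '\0';                      /* readlink() does not add NUL */
    return buf;
}
```
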
All of the above documentation refers to ET_EXEC main programs,
which always use the same virtual addresses. An ET_DYN executable
(main program or shared library) follows much the same scheme,
re-using the address space that the kernel chose originally.

Linux stores the pathname argument that was specified to execve()
immediately after the '\0' which terminates the character string of the
last environment variable [as of execve()]. This is true for at least
all Linux 2.6, 2.4, and 2.2 kernels. Linux kernel 2.6.29 and later
records a pointer to that character string in Elf32_auxv[AT_EXECFN].
The pathname is not "bound" to the file as strongly as /proc/self/exe
(the file may be changed without affecting the pathname), but the
pathname does provide some information. The pathname may be relative
to the working directory, so look before performing any chdir().

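Both ways of finding the execve() pathname can be sketched in a few
lines. after_last_env() follows the layout described above and is
only meaningful on the environment exactly as execve() left it
(before any setenv() or putenv()); execfn_from_auxv() uses glibc's
getauxval(), which is an assumption about the C library in use. Both
function names are hypothetical.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/* Hypothetical helper: the byte just past the NUL of the
 * highest-addressed environment string. */
static const char *after_last_env(char *const *envp)
{
    const char *last = NULL;

    assert(envp != NULL);
    for (; *envp != NULL; ++envp)
        if (last == NULL || (uintptr_t)*envp > (uintptr_t)last)
            last = *envp;
    return last != NULL ? last + strlen(last) + 1 : NULL;
}

/* Pathname recorded by the kernel in AT_EXECFN; NULL if absent. */
static const char *execfn_from_auxv(void)
{
    return (const char *)getauxval(AT_EXECFN);
}
```

Either result may be a relative pathname, so it should be resolved
(or saved) before the process changes its working directory.
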
On any page, SELinux in strictest enforcing mode prohibits
simultaneous PROT_EXEC and PROT_WRITE, and also prohibits adding
PROT_EXEC if the kernel VMA struct (Virtual Memory Area struct)
that manages that page has ever had PROT_WRITE. This implies that
the only way to get PROT_EXEC is to map the page directly from a file.
Therefore, in late 2023 the various decompression stubs are being
rewritten to "bounce" the decompressed data through pages in a
memory-resident file created by the memfd_create() system call,
and subsequently mapped PROT_EXEC. Actual copying of the pages
can be avoided by a careful sequence of mmap() modes, but the overhead
of an additional system call is required.

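The "bounce" idea can be sketched as follows: put the bytes into a
memfd, then map that file with PROT_EXEC so that the resulting
mapping has never been writable. This sketch is not the UPX stub. It
uses write() and therefore does copy the data, which the stubs are
said to avoid, and the name map_exec_via_memfd() is hypothetical.
The raw syscall() form is used only to avoid depending on a
particular glibc version.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return a read+execute mapping that holds a
 * copy of bytes[0..len), or NULL on failure. */
static void *map_exec_via_memfd(const void *bytes, size_t len)
{
    void *p = NULL;
    int fd;

    assert(bytes != NULL && len > 0);
    fd = (int)syscall(SYS_memfd_create, "bounce", 0u);
    if (fd < 0)
        return NULL;                    /* kernel lacks memfd_create */

    /* Fill the memory-resident file, then map it without PROT_WRITE. */
    if (write(fd, bytes, len) == (ssize_t)len) {
        p = mmap(NULL, len, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED)
            p = NULL;
    }
    close(fd);                          /* the mapping keeps the file alive */
    return p;
}
```
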
The not-as-strict "targeted enforcing" mode of
SELinux seems not to demand this extra work, except for executables
that run with elevated privileges, such as various system daemons.
So "ordinary" user-mode apps can run in current "targeted enforcing"
mode. But because the actual runtime mode of SELinux is unknown
at compression time, the memfd_create method should be used
all the time.