3. To avoid custom linker scripts (which I hate with a passion), I embed my hand-crafted ELF within a regular ELF, and slice it out at the end (using a python script). The "container" ELF is a regular full-fat ELF, potentially including working debug symbols, but the inner ELF has none of the cruft.
Using this technique, I wrote a barely-functional TLS1.3 client that fits in ~3.5KB (see the rest of repo from the first link)
1vuio0pswjnm7 13 hours ago [-]
"This repo hosts my WIP entry to BGGP5. This README acts as a dev log of sorts (It's a bit of an un-edited stream of consciousness right now, I'll do a proper writeup later. hopefully).
The main goal of BGGP5 is to download the file at https://binary.golf/5/5 and display its contents, using less than 4KB of code (stored in whatever format you like).
Tiny disclaimer: As part of the BGGP staff team I knew about the theme in advance, and I absolutely could not resist getting started a few days early. This entry is more about being cool than being competitive, so I hope you can forgive me!"
What is size of busybox with ssl_client as the only applet and wolfssl as the TLS library
Retr0id 13 hours ago [-]
> Are we excluding the size of sh, wget and cat
Yes. It's not very interesting, but you can do that.
> What is size of busybox with ssl_client as the only applet and wolfssl as the TLS library
Larger than 4096 bytes.
1vuio0pswjnm7 11 hours ago [-]
Using kernel TLS would reduce size, but is it compiled into all Linux kernels by default? Alpine Linux, for example, used to disable it.
almostgotcaught 1 days ago [-]
> To avoid custom linker scripts (which I hate with a passion)
lol why? i mean the syntax sucks but this seems like howling into the wind...
Retr0id 19 hours ago [-]
Firstly, yes, the syntax sucks. But most of all it's a moving target. Every so often the compiler will decide to emit some fancy new segment or other metadata, and your linker script won't know what to do with it, and you have to re-learn linker script syntax to fix it.
saurik 16 hours ago [-]
This is also why I avoid them like the plague. I would be a lot less annoyed by the concept of linker scripts if the mechanism were more set up as a concatenative language where you could provide a pile of modifications to behavior that add up to something useful, rather than having to whole-hog replace all of the behavior the compiler starts with (and like, in a world using clang/lld, which supports linker scripts but doesn't internally use them--meaning there is no "default" linker script you can dump/patch--the situation is even worse than it was before... I honestly have a hard time understanding why anyone considers this system acceptable).
almostgotcaught 19 hours ago [-]
-Wl,--orphan-handling=error but hey whatever floats your boat
Retr0id 18 hours ago [-]
That doesn't solve the problem, though. orphan-handling lets you choose between loud breakage and quiet breakage. I want no breakage!
boricj 2 days ago [-]
The Linux kernel source tree has nolibc [1], a header-only C standard library implementation that is about as barebones and paper-thin as it gets and is the next step up from a pure freestanding environment as shown in this article. I've used it to create a tiny but working program that prints out the ASCII table [2] as part of my Ghidra extension test suite.
> from a pure freestanding environment as shown in this article
Isn't a freestanding environment one without an OS? The author in the article explicitly codes against Linux syscalls and is creating an ELF file (so a hosted executable).
saurik 1 days ago [-]
I think of "freestanding" as being related to the "-ffreestanding" flag of modern compilers, which merely means something similar to "don't assume that functions have their usual C standard definitions, as I don't have a normal libc".
perching_aix 21 hours ago [-]
I looked up the GCC docs, and it says both that and what I said. Bit confusing, but makes sense in hindsight.
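A small way to see which of the two meanings a given build is using: the C standard defines the `__STDC_HOSTED__` macro, and GCC sets it to 0 under `-ffreestanding`. This is only a sketch; the function name `is_hosted_build` is made up for illustration.

```c
/* Hypothetical helper: reports whether this translation unit was compiled
 * as a hosted or a freestanding implementation. GCC and Clang define
 * __STDC_HOSTED__ as 0 when -ffreestanding is passed, 1 otherwise. */
int is_hosted_build(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
    return 1;   /* hosted: a full standard library is assumed to exist */
#else
    return 0;   /* freestanding: only the minimal headers are guaranteed */
#endif
}
```

Note that the macro only reflects what the compiler was told. A program built with `-ffreestanding` can still run on top of an OS and make Linux syscalls, which is the situation in the article.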
jart 2 days ago [-]
I love articles like this. If you want to see a tutorial on how you can take this a step further, by creating a tiny ELF file that runs on Linux, FreeBSD, NetBSD, and OpenBSD 7.3 then check out https://justine.lol/sizetricks/#elf
matheusmoreira 2 days ago [-]
I would also recommend the legendary Teensy Files:
They sparked my interest in ELF and freestanding programs.
LegionMammal978 2 days ago [-]
If anyone's interested, last year I replicated this exercise for an x86-64 Linux executable [0], and also golfed a Hello World as small as I could. I ended up using a little-known pattern (an ET_DYN executable with no interpreter, normally only used for the ld.so binary) to shave off more bytes than anyone else who had tried it, to the best of my knowledge.
And Chris Wellons' "A Magnetized Needle and a Steady Hand," detailing how to build an ELF implementation of 'true' using nothing more than 'echo' or 'printf': https://nullprogram.com/blog/2016/11/17/
It's a webserver written in x86 assembler, which makes raw syscalls. It has no functions, and unmaps the stack so it uses only one 4KB page of memory at runtime.
nils-m-holm 23 hours ago [-]
My T3X/9 compiler generates ELF with no sections at all, there is just a code and data segment. A later version even gets rid of the data segment, but that is not ready for publication.
http://t3x.org/t3x/index.html#t3x9
ryukoposting 16 hours ago [-]
I keep a little book of "cursed things you can do with C." I'll definitely be adding "emojis in linker scripts." Good read.
I wrote this page for my own compiler that I'm working on, but I think it would be a good complement to this article. Note that the page is not that great on mobile, the extra real estate on desktop really helps.
ptspts 1 days ago [-]
For 32-bit x86 (i386 and i686), I've written a libc and a toolchain to automate this: https://github.com/pts/minilibc686 . It can use mainstream free C compilers (GCC, Clang, OpenWatcom cc386, TinyCC and PCC) and assemblers (GNU as and NASM) out of the box.
A printf-hello-world is about 1 KiB. A write-hello-world (syscalls only) is less than 200 bytes. Assembly programming skills not needed to use it.
compiler-guy 2 days ago [-]
If one properly specifies the input, output, and clobber constraints to the asm statement, there is no need for the volatile keyword in any of this.
jcalvinowens 1 days ago [-]
I don't think that's correct for the sys_exit() call with no outputs: the compiler doesn't know the syscall instruction has side effects, I think it would be within its rights to omit that asm statement without volatile. Adding an output and code to consume the result seems like a waste of space in .text, it doesn't return.
Well, neither have outputs, doh, so they both need volatile don't they?
Adding an output for the %rax result would prevent the call from being omitted without volatile (assuming it is actually consumed by something), but it could still be reordered, right? I suppose with general syscalls that might be okay, but certainly not with sys_exit().
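For reference, this is roughly what a wrapper looks like with everything discussed here spelled out. It is a sketch assuming x86-64 Linux and GCC or Clang, not the article's code: the names `raw_syscall3`, `raw_write` and `raw_exit` are made up, and 1 and 60 are the x86-64 numbers for write and exit.

```c
#include <stddef.h>

/* x86-64 Linux only. `volatile` stops the compiler from deleting the
 * statement when the result is unused; the "memory" clobber says the kernel
 * may read or write memory; rcx and r11 are overwritten by `syscall`. */
static inline long raw_syscall3(long nr, long a, long b, long c)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a), "S"(b), "d"(c)
                      : "rcx", "r11", "memory");
    return ret;   /* negative errno on failure */
}

static inline long raw_write(int fd, const void *buf, size_t len)
{
    return raw_syscall3(1 /* write */, fd, (long)buf, (long)len);
}

__attribute__((noreturn))
static inline void raw_exit(int code)
{
    for (;;)      /* exit does not return; the loop satisfies noreturn */
        raw_syscall3(60 /* exit */, code, 0, 0);
}
```

With the output operand present, dropping `volatile` would let the compiler remove `raw_exit`'s syscall entirely, since nothing consumes `ret` there.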
The custom entry points look wrong to me. Aren't they breaking the rules over stack alignment when calling functions? Specifically, that rsp is supposed to be congruent to 8 mod 16 at the beginning of a function, and supposed to be divisible by 16 right before a call instruction. The problem is that when code execution starts at the entry point, rsp is divisible by 16, but by writing it as a C function, the compiler will assume it's off by 8 from what it actually is.
fsmv 1 days ago [-]
This is from the SysV calling convention not x86 itself. The CPU can do unaligned just fine. You don't have to use the calling convention when not calling out to a library.
josephcsible 1 days ago [-]
You're right that it's not inherent to the architecture, but even if you're only calling your own code, if your own code is written in C, then GCC will assume it too, unless you use command-line arguments or attributes to tell it otherwise, neither of which is being done here.
oguz-ismail 1 days ago [-]
Does it matter unless you're reading a float from varargs? What else can it break?
josephcsible 1 days ago [-]
I don't know exactly what, but I know there is more than just that, because calling printf breaks with a misaligned stack even when you're not passing it any floating-point arguments. And even if it doesn't break anything for you today, you're basically committing UB by violating the compiler's assumptions.
ptspts 1 days ago [-]
Aren't there GCC command-line flags to specify alignment assumptions?
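There are, on x86: GCC has `-mstackrealign` and `-mincoming-stack-boundary`, and there is also a per-function attribute, `force_align_arg_pointer`, which makes the function realign the stack pointer in its prologue instead of trusting the caller. A sketch of the attribute follows; the macro and function names are made up, and the function only probes its own frame rather than acting as a real entry point.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK /* the attribute is x86-only */
#endif

/* Hypothetical helper: realigns the stack on entry, then checks that a
 * local declared with 16-byte alignment really sits on a 16-byte boundary. */
REALIGN_STACK
int aligned_local_ok(void)
{
    _Alignas(16) volatile unsigned char probe[16];
    probe[0] = 0;
    return ((uintptr_t)&probe[0] % 16) == 0;
}
```

Putting the same attribute on a C entry point is one way to stop the compiler from assuming the usual call-site alignment there.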
I must not be the target audience for this. What exactly is the purpose of this article? How to rewrite a simple C program in a complex combination of assembly and syscalls?
oguz-ismail 17 hours ago [-]
> rewrite a simple C program in a complex combination of assembly and syscalls
That'd be a good introduction to assembly for someone who already knows C well.
CaesarA 2 days ago [-]
I still don't understand how people were able to write software in the days when assembly was the only option for speedy execution.
throw-qqqqq 1 days ago [-]
You can define macros over the assembly to gain a high level language sort of similar to an untyped dialect of C.
For me it would be sort of like writing programs in C versus higher level languages: much more tedious, will take longer and require better planning/upfront design, but doable.
With practice you learn some tricks that can seem clever to anyone not writing a lot of asm. It’s “just” a very low level language IMO.
6SixTy 1 days ago [-]
Keeping things pretty simple in project scope and hardware helps quite a lot
matheusmoreira 2 days ago [-]
I would like to note that Linux is the only kernel which will allow you to do this! The Linux system call interface is stable and defined at the instruction set level. Linking against some system library is absolutely required on every other system.
You can get incredibly far with just this. I wrote a freestanding lisp interpreter with nothing but Linux system calls. It turned into a little framework for freestanding Linux programs. It's been incredibly fun.
Freestanding C is a much better language. A lot of legacy nonsense is in the standard library. The Linux system call interface is really nice to work with. Calling write is not that hard. It's the printf style string building and formatting that I sometimes miss.
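To give an idea of the formatting code you end up writing yourself, here is a sketch of decimal conversion that needs nothing from libc. The name `format_u64` is made up; it is not taken from any of the projects mentioned.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the decimal digits of v into out (no terminator) and returns the
 * digit count. out needs room for 20 bytes, the longest uint64_t value. */
size_t format_u64(uint64_t v, char *out)
{
    char tmp[20];
    size_t n = 0;

    /* Digits come out least significant first. */
    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    /* Reverse them into the caller's buffer. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    return n;
}
```

The buffer and the returned length can be passed straight to a write system call.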
LegionMammal978 2 days ago [-]
"Absolutely required" is some strong language. It's perfectly possible to, e.g., perform direct syscalls on Windows, and you'll occasionally see malware that does it to avoid certain forms of detection. You just have to switch on the OS version, and update your binary if you want it to be compatible with a newer version.
matheusmoreira 1 days ago [-]
I agree that it was too strong a claim. It's not supported by the developers and if you bypass their system libraries your program will break when they change things up.
Linux kernel is known to be able to run binaries compiled in the 90s. Breaking user space makes Linus yell at people until the breakage gets reverted. A platform that stable is worth building on top of. Updating executables is a lot of work, sometimes it's straight up impossible.
racingmars 23 hours ago [-]
> I would like to note that Linux is the only kernel which will allow you to do this!
I'm pretty sure that MVS syscalls (that is, the numbers you use with the SVC opcode) have remained backward-compatible at least as far back as MVS 3.8 in the 1970s and those binaries making those "raw" syscalls will still work on the latest z/OS releases.
There are a _lot_ more operating systems than Linux, Windows, and the BSDs... making a statement that the Linux kernel is the only kernel to do something a certain way is a risky proposition :-)
matheusmoreira 21 hours ago [-]
That's awesome. I didn't know about that system and never thought to look for it. Can you point me towards documentation where the vendor promises the interface will remain stable and backwards compatible? I'll remember it.
As a web developer, 90% of what you just wrote is nonsense to me. How did you learn this stuff? Do you use it for useful projects or just for fun?
matheusmoreira 1 days ago [-]
Curiosity and free time. You learn stuff like this by reading tens of thousands of lines of text and code for every line of code that you write.
I've always been all about the hidden fun stuff. The magical little programs that somehow configure audio cards. The ALSA mixer tool for example does it via special ioctls. I was reading its source code not too long ago. The manuals said those definitions were for the curious and that those ioctls were private, as though it was the library's author exclusive privilege to use those things. I seriously hate it when they say that. When they imply I'm some mere mortal who's better off using the libraries that were gifted to us by the gods of programming.
Good or bad, quite a bit of hubris is involved. Takes a certain audacity to think I can make a better wheel than people who are probably much smarter than I am. Sometimes I start projects just to prove to myself that I'm not clinically insane for thinking a better way is possible. Sometimes it works, sometimes it doesn't. Someone once called an idea I had schizophrenic. I'll never forget that day.
This Linux system call stuff started after I read an LWN article about glibc and Linux specific system call support, getrandom to be specific. Took glibc years to add support. I started a liblinux project because of that article. The idea was to get rid of libc and talk to Linux directly. In order to accomplish that, I was forced to learn a lot of compiler, linker and executable stuff. The musl libc source code taught me a lot.
It seems like the C library is doing a huge amount of stuff but it turns out you don't actually need most of it. Linux just puts your binary in memory and jumps into some address specified in the ELF header. Normally this is when the C library or dynamic linker takes over in order to prepare to call main(). Turns out I can just replace all that with some simple code that calls a function and then exits the process when it returns. It just works. I won't have init/fini section processing but I can live with that, that's harmful stuff that shouldn't even have been invented to begin with.
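The "address specified in the ELF header" is the `e_entry` field. A sketch of reading it, assuming a 64-bit ELF and the Linux `<elf.h>` header; it uses libc for file I/O purely for brevity, and the function name `read_elf_entry` is made up.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reads e_entry from a 64-bit ELF file. Returns 0 on success, -1 on error. */
int read_elf_entry(const char *path, uint64_t *entry)
{
    Elf64_Ehdr hdr;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t got = fread(&hdr, 1, sizeof hdr, f);
    fclose(f);
    if (got != sizeof hdr)
        return -1;

    /* Check the magic bytes and that this is a 64-bit ELF. */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    *entry = hdr.e_entry;   /* where execution starts (an offset for PIE) */
    return 0;
}
```

For a normal dynamically linked program the kernel starts at the interpreter's entry point first; for a static freestanding binary, `e_entry` is where your own code begins.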
oguz-ismail 2 days ago [-]
> Linking against some system library is absolutely required on every other system.
Not on FreeBSD, NetBSD, OpenBSD or Solaris.
The article you linked says this but it's not true:
> Sometimes it's not even possible to use system calls at all. OpenBSD has implemented system call origin verification, a security mechanism that only allows system calls originating from the system's libc. So not only is the kernel ABI unstable, normal programs are not even allowed to interface with the kernel at all.
You can still make system calls from normal programs, you just need to list the addresses of system call instructions in an ELF section named openbsd.syscalls.
matheusmoreira 1 days ago [-]
> Not on FreeBSD, NetBSD, OpenBSD or Solaris.
Can you cite any sources? I wasn't able to find any documentation that corroborates what you said when I wrote the article. The few texts I found actually suggested otherwise. Maybe things have changed since then?
> You can still make system calls from normal programs, you just need to list the addresses of system call instructions in an ELF section named openbsd.syscalls.
I see. So they have added a mechanism to list the sections allowed to perform system calls. That's news to me. Do they guarantee the system call numbers will remain stable though? That older system calls will remain available?
LegionMammal978 1 days ago [-]
> Can you cite any sources?
For one, the FreeBSD kernel specifically has a compatibility layer for Linux binaries to use their familiar syscalls [0]. For its ordinary syscalls, it also has a policy not to break binary compatibility without good reason [1]. Most other OSes just don't maintain quite the level of 'indefinite stability' that the Linux kernel does across different versions. And even Linux doesn't implement older versions of syscalls when the kernel is ported to new architectures, so eventually you have to rotate your implementation regardless, if you want people to run your code on new systems.
> The few texts I found actually suggested otherwise.
People often say "X is impossible" when the truth is "X is tricky and full of caveats, and I don't want to think about it, so stop asking". (Or if the devs themselves are saying it, it might be "I want to look like I'm 'tough on crime' toward users of undocumented behavior", as if that could stop Hyrum's law from running its course.) In this case, it's generally "If you do it on an OS other than Linux, you can run into big compatibility issues," not "It's impossible on OSes other than Linux."
As for compatibility issues, you're running into that the moment you do undocumented fun stuff like omitting ELF sections or overlapping headers, which future Linux versions could start rejecting on the basis of "no one needs to do that legitimately". So I wouldn't start drawing the line on syscall number compatibility.
> For one, the FreeBSD kernel specifically has a compatibility layer for Linux binaries to use their familiar syscalls [0].
I believe this strengthens my argument. Linux kernel-userspace interface is so stable other projects are implementing it. I remember Justine Tunney mentioning this before, the idea that the x86_64 Linux system call ABI is turning into some kind of lingua franca of systems programming.
> x86-64 Linux ABI Makes a Pretty Good Lingua Franca
Would be interesting if people started targeting Linux because of this, banking on the fact that other systems will just implement Linux. Even Windows has Linux built into it these days.
> For its ordinary syscalls, it also has a policy not to break binary compatibility without good reason.
Thank you for the source. I don't think that's a particularly strong guarantee. It's certainly stronger than OpenBSD's at least.
> Most other OSes just don't maintain quite the level of 'indefinite stability' that the Linux kernel does across different versions
Yeah. I think this is something that makes Linux unique.
> And even Linux doesn't implement older versions of syscalls when the kernel is ported to new architectures, so eventually you have to rotate your implementation regardless, if you want people to run your code on new systems.
That's true. Only new architectures are affected though. The old ones have all the old system calls, many with multiple versions, all supported. Porting to a new architecture doesn't invalidate the stability of existing ones.
> People often say "X is impossible" when the truth is "X is tricky and full of caveats, and I don't want to think about it, so stop asking".
> Or if the devs themselves are saying it, it might be "I want to look like I'm 'tough on crime' toward users of undocumented behavior"
I get what you're saying. I truly apologize if I came across that way. I did not mean to say that.
I got interested in this low level direct system call stuff because I literally got sick of reading "but you, mere mortal, are not meant to access these raw system interfaces, that's for us, you are meant to call the little library function we made for you" in the Linux and libc manuals. Last thing I want is to end up doing the same to others.
By "can't do this" I meant to say the developers maintaining the system don't want you bypassing their system libraries and won't take responsibility for it if you do so. If the program breaks because the kernel interfaces changed, they'll tell us it's our own fault and refuse to fix it.
Linux takes the opposite approach: breaking user space makes Linus Torvalds yell at the people until the breakage is reverted. I'm enthusiastic about it because it's the only system where this is supported.
> As for compatibility issues, you're running into that the moment you start doing undocumented fun stuff like omitting ELF sections or overlapping headers
I agree. Should be fine as long as the ELF specification is respected. It's okay though, ELF is flexible enough that even in 2024 it's possible to invent some new fun stuff.
Embedding arbitrary files into an existing ELF and patching it so that Linux automatically maps it in before the program even runs. Since Linux gives processes a pointer to the program headers, the file is in memory and reachable without issuing a single system call.
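The comment doesn't say how that pointer is obtained; the sketch below assumes it is the `AT_PHDR` entry of the auxiliary vector, which the kernel places on the initial stack. It uses glibc/musl's `getauxval()` for brevity (a freestanding program would walk the vector itself), and the function name `count_load_segments` is made up.

```c
#include <elf.h>
#include <link.h>      /* ElfW() picks Elf32_* or Elf64_* for this target */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval() */

/* Walks this process's own program headers, as handed over by the kernel
 * in the auxiliary vector, and counts the PT_LOAD segments. */
size_t count_load_segments(void)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    size_t phnum = (size_t)getauxval(AT_PHNUM);
    size_t loads = 0;

    if (phdr == NULL)
        return 0;

    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_LOAD)
            loads++;
    }
    return loads;
}
```

An embedded file mapped through an extra program header would show up in the same loop, with `p_vaddr` and `p_memsz` giving its location and size.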
oguz-ismail 1 days ago [-]
> Can you cite any sources?
Personal experience.
> Do they guarantee the system call numbers will remain stable though?
No. Doesn't mean you can't make system calls from outside the libc though.
matheusmoreira 21 hours ago [-]
Every process must be able to make system calls. This is after all the mechanism by which the system libraries will interface with the kernel.
The problem is the system's developers don't want us bypassing those libraries. We can do it but things can and probably will break in the future when they change things. It's not supported.
moonlion_eth 1 days ago [-]
Rich Hickey mentioned
sylware 20 hours ago [-]
The point: ELF is the issue.
I did design my own runtime binary executable/dynamic library format which I do embed in an ELF capsule to be loaded by legacy systems. The thing I need to port though is the core user level drivers:vulkan/drm & alsa-lib. The main issue would be the alsa-lib since some part of its API still "requires" a C runtime (you have to call free() on some returned data).
The issue with this "format": it is so simple, I wonder if it would not be better if each software "dynamic library/user level system interface" should design its own minimal and giga simple "dynamic library" format, tailored for its semantics.
Dunno yet.
On modern hardware architecture, you load position independent memory segment (code and data). You should only need its alignment requirement and you are good to go.
Basically, a magic with the alignment, then a table of offsets or re-entrant code (possible on modern hardware architecture which supports try-lock hardware semantics) right after the "header". I chose to use the re-entrant code guarded with a hardware try-lock mechanism, because it is more generic and will be cleaner in the long run than a table of offsets.
Bending the product of code generators (assemblers) into some runtime format was a good idea until most hardware architectures gained support for a hardware try-lock mechanism; then it became really nasty legacy.
einpoklum 2 days ago [-]
1. X86_64 assumed...
2. Why is it that exiting at the end of main() requires a system call? Wouldn't a `ret` instruction go "back" to someplace where the OS itself will do cleanup work?
boricj 2 days ago [-]
> Why is it that exiting at the end of main() requires a system call? Wouldn't a `ret` instruction go "back" to someplace where the OS itself will do cleanup work?
Usually that's done by the C runtime library, but there isn't one there since this is a freestanding environment. Had the program not exited through a syscall (or entered an infinite loop), it would most likely crash after veering off the main() function.
cesarb 1 days ago [-]
> Why is it that exiting at the end of main() requires a system call? Wouldn't a `ret` instruction go "back" to someplace where the OS itself will do cleanup work?
The only way for execution to cross the barrier between "user space" and "kernel space" is through a system call or an interrupt (we won't speak of call gates). Even if the OS had put an address on the stack, so that the "ret" would go there after returning from main(), the code there would still need to do a system call to go back to the OS.
While nowadays Linux has a shared page of code mapped on every process (the vDSO), that wasn't the case in the past; all code on the "user space" side had to come from either the executable itself, or a library it loaded. Given that, it's natural that it was left to the executable to call the "exit" system call at the end.
compiler-guy 2 days ago [-]
Not without libc doing the glue work.
A return instruction from main hands things back to libc which does some cleanup and then makes this same syscall.
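A quick way to convince yourself that the exit system call is what actually ends the process: fork a child, have it issue the syscall directly, and read the status in the parent. This is a hosted sketch using libc's generic `syscall()` wrapper so it stays architecture-independent; the function name `exit_status_via_syscall` is made up.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that terminates with a direct exit system call, bypassing
 * libc's exit() and its cleanup. Returns the child's exit status, or -1. */
int exit_status_via_syscall(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        syscall(SYS_exit, code);   /* the kernel tears the process down here */
        for (;;) { }               /* not reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

No atexit handlers or stdio flushing run in the child, which is exactly the part libc normally does between main() returning and the syscall.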
EGreg 2 days ago [-]
An ELF, and almost in time for Christmas!
quotemstr 1 days ago [-]
Christ, why couldn't PE have won?
boricj 1 days ago [-]
As in the Portable Executable file format? There are no tricks used in this article that rely on the specifics of ELF, unlike some more extreme examples [1] that abuse every trick in the book to shave off more bytes from executables.
If anything, PE piggybacks on top of COFF which is a complete mess of a file format. I'm currently writing a standalone library for reading and writing toolchain file formats [2] (to replace some messy bespoke code in my Ghidra extension) and this under-specified, fragmented into multiple dialects, weirdly contorted relic is a pain to deal with.
COFF was a stepping stone from a.out to ELF that should've lasted only a couple of years on Unix systems and somehow it managed to metastasize at a crucial point in time inside multiple software ecosystems, most notably Windows and indirectly .NET and UEFI through PE. Frankly, I'd ask instead why couldn't PE and COFF have lost.
1. hand-written minimal ELF headers, with enough asm to do `_exit(main(argc, argv))`: https://github.com/DavidBuchanan314/kurl/blob/main/golfed/el... (currently only implemented for aarch64)
2. "Linux Syscall Support" library for conveniently making raw syscalls from C: https://chromium.googlesource.com/linux-syscall-support/
3. To avoid custom linker scripts (which I hate with a passion), I embed my hand-crafted ELF within a regular ELF, and slice it out at the end (using a python script). The "container" ELF is a regular full-fat ELF, potentially including working debug symbols, but the inner ELF has none of the cruft.
Using this technique, I wrote a barely-functional TLS1.3 client that fits in ~3.5KB (see the rest of repo from the first link)
https://binary.golf/5/
"A valid submission will:
Be 4096 bytes or less
Download the text file at https://binary.golf/5/5
Display the file's contents in some way
Example Entry:
#!/bin/sh
wget https://binary.golf/5/5
cat 5 "
Are we excluding the size of sh, wget and cat
[1] https://github.com/torvalds/linux/tree/master/tools/include/...
[2] https://github.com/boricj/ghidra-delinker-extension/tree/mas...
https://www.muppetlabs.com/~breadbox/software/tiny/
[0] https://tmpout.sh/3/22.html
It reminds me of a funny little bug in ARM Linux, fixed by adding volatile to an asm statement: https://lore.kernel.org/lkml/92a00580828a1bdf96e7e36545f6d22...
They also need memory clobbers, but I don't think memory clobbers would necessarily prevent reordering? In the case of the ARM bug though, it did: https://lore.kernel.org/lkml/Zqa4SAyPKPuaXdgg@mozart.vkv.me/
I've written an article about this idea:
https://www.matheusmoreira.com/articles/linux-system-calls
The Linux promise:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
As for compatibility issues, you're running into that the moment you do undocumented fun stuff like omitting ELF sections or overlapping headers, which future Linux versions could start rejecting on the basis of "no one needs to do that legitimately". So I wouldn't start drawing the line on syscall number compatibility.
[0] https://docs.freebsd.org/en/books/handbook/linuxemu/
[1] https://wiki.freebsd.org/AddingSyscalls#Backward_compatibily
I believe this strengthens my argument. Linux kernel-userspace interface is so stable other projects are implementing it. I remember Justine Tunney mentioning this before, the idea that the x86_64 Linux system call ABI is turning into some kind of lingua franca of systems programming.
https://justine.lol/ape.html
> x86-64 Linux ABI Makes a Pretty Good Lingua Franca
Would be interesting if people started targeting Linux because of this, banking on the fact that other systems will just implement Linux. Even Windows has Linux built into it these days.
> For its ordinary syscalls, it also has a policy not to break binary compatibility without good reason.
Thank you for the source. I don't think that's a particularly strong guarantee. It's certainly stronger than OpenBSD's at least.
> Most other OSes just don't maintain quite the level of 'indefinite stability' that the Linux kernel does across different versions
Yeah. I think this is something that makes Linux unique.
> And even Linux doesn't implement older versions of syscalls when the kernel is ported to new architectures, so eventually you have to rotate your implementation regardless, if you want people to run your code on new systems.
That's true. Only new architectures are affected though. The old ones have all the old system calls, many with multiple versions, all supported. Porting to a new architecture doesn't invalidate the stability of existing ones.
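A small Python sketch of what that looks like in practice, using a handful of syscall numbers from the x86_64 and aarch64 tables. The dictionary and the syscall_number helper are made up for illustration and cover only a few entries.

```python
# A few Linux system call numbers per architecture. aarch64 uses the newer
# generic table, which never had the legacy open(); only openat() exists.
SYSCALL_NUMBERS = {
    "x86_64": {"write": 1, "open": 2, "getpid": 39, "exit": 60, "openat": 257},
    "aarch64": {"openat": 56, "write": 64, "exit": 93, "getpid": 172},
}

def syscall_number(arch: str, name: str) -> int:
    """Look up a syscall number; hypothetical helper for illustration."""
    table = SYSCALL_NUMBERS[arch]
    if name not in table:
        raise LookupError(f"{name} is not provided on {arch}")
    return table[name]
```

So code that hardcodes open() as number 2 keeps working on x86_64 indefinitely, but a port to aarch64 has to switch to openat() and use a different number for everything else too.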
> People often say "X is impossible" when the truth is "X is tricky and full of caveats, and I don't want to think about it, so stop asking".
> Or if the devs themselves are saying it, it might be "I want to look like I'm 'tough on crime' toward users of undocumented behavior"
I get what you're saying. I truly apologize if I came across that way. I did not mean to say that.
I got interested in this low level direct system call stuff because I literally got sick of reading "but you, mere mortal, are not meant to access these raw system interfaces, that's for us, you are meant to call the little library function we made for you" in the Linux and libc manuals. Last thing I want is to end up doing the same to others.
By "can't do this" I meant to say the developers maintaining the system don't want you bypassing their system libraries and won't take responsibility for it if you do so. If the program breaks because the kernel interfaces changed, they'll tell us it's our own fault and refuse to fix it.
Linux takes the opposite approach: breaking user space makes Linus Torvalds yell at the people until the breakage is reverted. I'm enthusiastic about it because it's the only system where this is supported.
> As for compatibility issues, you're running into that the moment you start doing undocumented fun stuff like omitting ELF sections or overlapping headers
I agree. Should be fine as long as the ELF specification is respected. It's okay though, ELF is flexible enough that even in 2024 it's possible to invent some new fun stuff.
https://www.matheusmoreira.com/articles/self-contained-lone-...
Embedding arbitrary files into an existing ELF and patching it so that Linux automatically maps it in before the program even runs. Since Linux gives processes a pointer to the program headers, the file is in memory and reachable without issuing a single system call.
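The pointer to the program headers arrives through the auxiliary vector, as the AT_PHDR entry. A rough Python sketch of decoding that vector, assuming a 64-bit little-endian system where each entry is a pair of 8-byte integers; parse_auxv is a made-up name for illustration.

```python
import struct

# Auxiliary vector keys, as defined in <elf.h>.
AT_NULL, AT_PHDR, AT_PHENT, AT_PHNUM = 0, 3, 4, 5

def parse_auxv(raw: bytes) -> dict:
    """Decode a raw auxiliary vector into {key: value}.

    Assumes 64-bit little-endian entries (an assumption of this sketch).
    """
    entries = {}
    for key, value in struct.iter_unpack("<QQ", raw):
        if key == AT_NULL:  # AT_NULL terminates the vector
            break
        entries[key] = value
    return entries
```

On Linux the raw bytes can be read from /proc/self/auxv, which of course takes system calls; a freestanding program finds the same vector on its initial stack, right after the environment pointers. AT_PHDR is the address of the program headers, AT_PHENT the size of one header and AT_PHNUM how many there are.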
Personal experience.
> Do they guarantee the system call numbers will remain stable though?
No. Doesn't mean you can't make system calls from outside the libc though.
The problem is the system's developers don't want us bypassing those libraries. We can do it but things can and probably will break in the future when they change things. It's not supported.
I did design my own runtime binary executable/dynamic library format which I do embed in an ELF capsule to be loaded by legacy systems. The thing I need to port though is the core user level drivers:vulkan/drm & alsa-lib. The main issue would be the alsa-lib since some part of its API still "requires" a C runtime (you have to call free() on some returned data).
The issue with this "format": it is so simple that I wonder whether it would be better for each piece of software (each "dynamic library/user level system interface") to design its own minimal and giga simple "dynamic library" format, tailored to its semantics.
Dunno yet.
On modern hardware architectures, you load position-independent memory segments (code and data). You should only need the alignment requirement and you are good to go.
Basically, a magic with the alignment, then a table of offsets or re-entrant code (possible on modern hardware architectures which support hardware try-lock semantics) right after the "header". I chose to use re-entrant code guarded by a hardware try-lock mechanism, because it is more generic and will be cleaner in the long run than a table of offsets.
Bending the product of code generators (assemblers) into some runtime format was a good idea until most hardware architectures support a hardware try-lock mechanism, then it became really nasty legacy.
2. Why is it that exiting at the end of main() requires a system call? Wouldn't a `ret` instruction go "back" to someplace where the OS itself will do cleanup work?
Usually that's done by the C runtime library, but there isn't one there since this is a freestanding environment. Had the program not exited through a syscall (or entered an infinite loop), it would most likely crash after veering off the main() function.
The only way for execution to cross the barrier between "user space" and "kernel space" is through a system call or an interrupt (we won't speak of call gates). Even if the OS had put an address on the stack, so that the "ret" would go there after returning from main(), the code there would still need to do a system call to go back to the OS.
While nowadays Linux has a shared page of code mapped on every process (the vDSO), that wasn't the case in the past; all code on the "user space" side had to come from either the executable itself, or a library it loaded. Given that, it's natural that it was left to the executable to call the "exit" system call at the end.
A return instruction from main hands things back to libc which does some cleanup and then makes this same syscall.
If anything, PE piggybacks on top of COFF which is a complete mess of a file format. I'm currently writing a standalone library for reading and writing toolchain file formats [2] (to replace some messy bespoke code in my Ghidra extension) and this under-specified, fragmented into multiple dialects, weirdly contorted relic is a pain to deal with.
COFF was a stepping stone from a.out to ELF that should've lasted only a couple of years on Unix systems and somehow it managed to metastasize at a crucial point in time inside multiple software ecosystems, most notably Windows and indirectly .NET and UEFI through PE. Frankly, I'd ask instead why couldn't PE and COFF have lost.
[1] https://nathanotterness.com/2021/10/tiny_elf_modernized.html
[2] https://github.com/boricj/binary-file-toolkit