Yes, you must use "C" if you use the libc (seems obvious)
However, you can issue your syscalls directly. Of course, to do that, you will have to rewrite your own libc .. nothing is free.
It should be noted, however, that this is not possible in every environment: I believe OpenBSD code must use the system libc. Likewise, on Windows, you must use the provided library. This does not mean the OS library is always C (but it does mean your new language must do FFI with whatever language is used by the system).
Maybe only Linux allows anybody to issue syscalls directly ?
> Maybe only Linux allows anybody to issue syscalls directly ?
That is correct, for the reason that Linux is a kernel and glibc is a separate, independent project, so syscalls are the system interface on Linux.
On other systems things range from syscalls being willfully broken on every update (Windows) to syscalls being ABI-stable but removed unless compat-built (FreeBSD); in the middle you have e.g. macOS, which makes no guarantees and will break the syscall ABI without warning (even if it's generally stable), and OpenBSD, which tries to lock syscalls down to the static libc only to mitigate gadgets.
Windows also allows you to issue syscalls directly (meaning without using ntdll), but that'll usually get your process killed&quarantined by pretty much any anti-malware suite, as that's highly suspicious behavior.
EDIT: You edited your post mentioning this, but I'll leave it here. :-)
> Windows also allows you to issue syscalls directly
Technically most systems do, OpenBSD is the only one I know of actively trying to prevent that.
The line is rather whether there is support and maintenance for doing that aka how likely is it that your program will break on updates.
For Windows, breakage is virtually guaranteed because IIRC the syscall numbers are just the index of the syscall name in a big sorted table, so if any syscall gets added or removed the table shifts and you get the wrong syscall.
We fully agree: it's technically possible, but it's not practical and really only done by malware to skip the userland hooks set up by antivirus software.
One interesting angle on this is that with io_uring, you start using fewer glibc-functions-calling-syscalls, and more and more of your hot inner loop is about the data structures in the ringbuffers.
C is the thin wrapper people use to talk to the machine. It's the primary interface between man and microprocessor. FFI is a pretty good convention for giving your latest, redundant, unnecessary, vanity language the capability to talk to the OS, which is almost universally written in C.
That's a common misconception. Advanced compilers like GCC and LLVM have optimizers working primarily on a common intermediate representation of the program, which is largely backend-independent and written for the semantics of the C abstract machine.
UB has such inexplicable, hard-to-explain side effects because all the implicit assumptions in the optimization passes, and the symbolic execution used to simplify expressions, follow the semantics of the C spec, not the details of the specific target hardware.
Programmers have an ideal obvious translation of C programs to machine instructions in their head, but there's no spec for that.
It creates impossible expectations for the compilers. My recent favorite paradox: in clang's optimizer, comparing the addresses of two variables always gives false, because they're obviously separate entities (the hardcoded assumption allows optimizing out the comparison, and doesn't stop the variables from being kept in registers).
But then clang is also expected to eliminate redundant memcpys and remove useless copies so the same memory location can be reused instead of copied, and then two variables on the stack can end up having the same address, contradicting the previously hardcoded result. You get a different result for the same expression depending on the order of optimizations. Clang won't resolve this paradox, because there are programs and benchmarks that rely on both behaviors.
And yet, in practice the C we have written over the last 20+ years mostly ended up with "undefined behaviour" = "what the CPU does"...
That may not be true in theory, but it ends up that way because "undefined behaviour" tends to be implemented "to work", and the behaviour of the common CPUs we're used to has fed into the expectation of what "to work" should mean...
That's not how compiler developers interpret undefined behaviour. Undefined behaviour is closer to a 'U', 'X' or "don't care" in VHDL. These things exist only in simulation, not in real hardware, so the synthesis tool simply assumes they never happen and optimizes accordingly. However, C does not have a simulation environment, and UB propagation is not a thing. It will simply do weird shit, like run an infinite loop because you forgot to write a return statement when you changed a function from void.
> "undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
> NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
> EXAMPLE An example of undefined behavior is the behavior on integer overflow." (The C99 standard, section 3.4.3)
translates into "whatever your CPU does" because, while no requirement is imposed, in general the compiler does make it work "in a manner characteristic of the environment".
I believe that memory accesses outside of array bounds, signed integer overflow and null pointer dereference are all examples of "undefined behaviour", which in practice all boil down to what the CPU does in those cases. E.g. a memory access outside of array bounds commonly returns whatever is at that address, as long as the address is valid, because there are no checks and that's what the CPU does when asked to load from an address. Integer overflow? If it's the result of adding/subtracting, it commonly wraps around because that's how the CPU behaves, etc.
And I believe this is all on purpose. C is an abstraction over assembly and I believe that people who were used to their CPU's behaviour wanted to keep it that way in C, and also compilers were simple.
For someone who's been writing "C for 20+ years" according to your other post, you come across as extremely ignorant of how modern optimizing C compilers work. I suggest you thoroughly read and understand Russ Cox's simple exposé [1] and the UB posts on John Regehr's blog [2].
The C security issues that plague us today are partly fueled by the attitude of guesswork and ignorance demonstrated in your posts.
You can write programs without so much as using the concept of functions. There are single-program embedded devices that don't have an OS. How can you look at something like Ada, for example, and say it is a wrapper around C?
C does have rules that must be obeyed, and strong concepts of data types or the lack thereof (like not having a string type). There are assembly language designs specific to C++, and until recently ARM had a Java bytecode-specific feature set (Jazelle). C is dominant, but I'd hesitate to say it is a machine code wrapper.
You have platforms with different [efficient] ways to pass arguments, operating systems and firmware with different conventions, and on top of that there are different compilers that will inevitably differ in fringe areas. All of this must be supported in a binary form.
What is the answer, and why is C's answer more problematic than any other? TFA reads like misdirected blame, or venting about "legacy".
Might sound cynical, but: design your own hardware with a stable ISA. Develop an operating system on top of it. Ship your language on top of that OS and boast about the ABI stability to the few users of your language.
Well, Mill is vaporware but its portals would be a (micro-kernel ready but Linux-usable) syscall boundary, defining at least the core of the ABI -- and they're pretty much just a flavor of the regular function call on the platform.
What Zig has done is actually build their own LibC equivalent from scratch so they don't have to deal with interfacing with that, but they also have famously good C compatibility good enough that you can just import a library almost like it's C and use it
I'd say the author's mental model of the nature of C as a programming language isn't accurate, not helped by efforts to standardise the language.
It could be argued that C is little more a defined programming language than assembly languages are. It's worth remembering that C was invented, in part, to abstract away architectural differences so that Unix could be ported to new platforms.
If you want 'standard' with a sane well-defined ABI that never changes, there are better choices in 2024.
It's a weakly-typed language that is rooted in the typeless, machine-word-oriented nature of BCPL. The "char" type in the early implementation for PDP-11 was just a feature to benefit from the byte-addressing capabilities of the machine. Everything else was an int. Pointers were compatible with ints without any conversion. Functions were implicitly returning int and taking an unspecified number of int parameters.
As the language grew and attempted to cover a larger set of architectures, "int", "short", "long" started to mean different things on different machines. C99 attempted to fix the mess with stdint.h.
People just live in bubbles, don't they? If I want to write a 100-line program for a tiny microcontroller, what should I do? Write it in assembler? I mean, I could, but I think the author would have even more esoteric problems with that! Meanwhile, I would just get the job done with a language that apparently doesn't exist anymore.
I think you missed his point. The problem is that C is not well-designed enough to be used the way it is today. The ABI checker showing failed tests even on Ubuntu is evidence of that.
You can write your program in any language you want, that's up to you, but using C as the OS-level standard is problematic according to the author.
C’s integers being wobbly-sized is the reason we have C compilers for pretty much every conceivable CPU architecture, while we only have rust/swift/other compilers for a handful of carefully picked architectures.
There are entire industries where C-as-a-protocol is fine enough (and actually appreciated, even though the author doesn't like it).
There are entire industries that do not use x86 or ARM, let alone RISC-V. There are many use cases for MIPS and weird/other architectures (PowerPC, 8051, m68k derivatives, PIC micros)... I had a relative who used to work on control software for space stuff; the CPUs used there are even older and definitely not x86/x86-64, ARM or RISC-V.
So yeah, complaining that the C language does not fit your own very specific use case is very shortsighted.
We should all remember that the world does not revolve around us and around our own use case.
Yeah, part of the problem is bias of computer scientists, which I've also seen on SV-centric communities like HN.
A while back there was a complaint that CAN communication should be explained as it's "too obscure". So much obscure tech is talked about on this site without explanation.
I think most of the developers on this site don't understand the wide world of embedded systems. C is a high-level language. It has a weird, clunky elegance of its own.
Once you leave the world of servers and natively built apps on an OS, things get really complicated. Just go on Digikey, Mouser or whatever is big now and see how many processors are out there.
Not true. At work we maintain 1 million+ line C/C++ applications and libraries that run large international companies all around the world. And we hire C/C++ developers all the time.
while I don't particularly mind C myself, I have on occasion passed messages between processes over nanomsg to avoid merging the incompatible contents of two conda environments.
I wonder if more platforms would consider having a stub that connects with the OS and talks to the rest of its kin via sockets of one sort or another. it does introduce a ton of lag and multiple kernel-userspace jumps I suppose.
Yes, exactly! In fact I wish software would expose most of their functionality through something like these, and use it themselves. E.g. a part that queries the backend, and can be called by the GUI. That way it would be easy to have alternative versions, e.g. one frontend to access multiple messaging services (like those that existed back in the days to do yahoo,msn,google, etc at the same time).
It's a slight dig at England, because English (UK) politics is designed for stability. Instead of doing new and innovative things, the politics and ceremony are rooted firmly in Empire. -- one could make the argument that UK politics is also lowest common denominator.
Regardless, the same is true of C, it can't change meaningfully because half the universe would break, so it's stuck with stability - stuck in time, warts and all.
"Designed" and "stability" are probably too strong words to use for a system that gave its people Boris Johnson and a cabbage.
But yes, only one round of "first past the post" is very simple, hard to go lower or commoner than that. :)
And it has a lot of added efficiency because there's no need to separately pick a figurehead - there's always one helpfully marked with a crown. (Of course the fact that changing the PM and the full cabinet can be done at any time without involving the electorate can help stability. But of course it also keeps inefficient coalitions going far too long, meaning eventually there will be a bigger correction event, which is bad for long-term stability.)
Yes, exactly. But "somehow" increasingly crazier groups got into power within the party (hence Brexit) and now the consequences are finally (?) catching up (hence the flip to Labour).
...wouldn't have mentioned it if I hadn't started reading and then remembered that I had already read it some time ago. Might have been https://news.ycombinator.com/item?id=33509223
Oh boo hoo. Yes, C really sucks, but it exports a crummy and limited abstract machine which is fundamentally a PDP-11. The author might just as well complain, in fact would better complain, that interfacing to machine code is painful.
Just suck it up and put the OS interface (with the appropriate paradigms and affordances for your language) into your language’s standard runtime. That’s part of the work of writing a language.
Whatever model you think should replace it will inevitably only be useful for some languages and as bad as a C interface (or worse) for others. The iAPX 432 was a notorious example — in hardware! Lispms, sadly, were another.
It is irksome that some will consider C the “lowest common denominator”. It ain’t. It’s just a shitty, path-dependent valley in the possibility space. And as for shitty, well, I use the toilet every day, and it’s not the high point of my day, but it’s where we ended up (so far) as part of the mechanism by which we power our bodies and also get to enjoy yummies. No point in complaining about it; it simply is what it is.
> Just suck it up and put the OS interface into your language’s standard runtime.
The OS interface is a huge and complex animal that provides crucial primitives for writing high-performing programs (async IO, memory management, timers, etc.). If you want your language to be practical for system-level programming, you need a 1:1 mapping between them.
Crafting a runtime that abstracts those calls and presents a new world to the programmer adds another layer of complexity and will probably worsen the performance.
> Crafting a runtime that abstracts those calls and presents a new world to the programmer adds another layer of complexity and will probably worsen the performance.
But that’s what the article is calling for — claiming (correctly IMHO) that the C interface is pretty cruddy.
Unfortunately whichever higher level interface you choose will be inappropriate for most languages that aren’t the implementation language. I didn’t mention the 432 by accident: its baked in “object-oriented” architecture turned out not only not to help but instead slowed down the whole processor for no gain.
The only exception is when only one language is intended and there’s a tight coupling between silicon and that language (e.g. the lispm case).
The article started it. The title's premise rapidly changes into "We must all speak C, and therefore C is not just a programming language anymore" (my italics), and is then abandoned, and the rest is complaining about ABIs. And it's full of memes. But the last few sentences try to reach back to the premise:
> But it’s not, it’s a protocol. A bad protocol certainly, but the protocol we have to use nonetheless!
> Sorry C, you conquered the world, maybe you don’t get to have nice things anymore.
> You don’t see England trying to improve itself, do you?
So, the complaint is that C is popular. Fair enough, I too reflexively hate popular things, but this bad habit hasn't turned me against C, yet. And then "maybe you don't get to have nice things any more" is a distorted Simpsons quote, and what's being implied here? That whatever is the lingua franca for interfaces should be taken away by ... the ABI police ... and replaced by an endless series of alternatives, each of which is itself also soon removed for having become too gnarly?
I don't understand the closing sentence at all. England? What? Something to do with the English language presumably. I am fond of that kludgy language, too. Should we all get aboard a bandwagon to replace it with a highly rational conlang?
Do you realize the article is full of "fuck" repeated over and over?
> This could have been an important comment, but the acid and hostile terms just got in the way of the message.
Is it so big of an issue to take the "important" (to use your words) point while ignoring the irrelevant parts?
> But well... I have to cope and seethe I guess?
The comment was really directed at the author of the article (not even at whoever posted it); not sure why you're taking it as if it were directed at you, and on a personal level.
> Do you realize the article is full of "fuck" repeated over and over?
So what? Does it make it ok to pollute the comment section, which is separate from the article, with hostile comments?
Let's get an overblown example: does it make it ok to be racist and antisemitic if you are in a group discussing Mein Kampf? I don't think so.
> Is it so big of an issue, getting an "important" (to use your words) point ignoring the irrelevant parts?
It would be, if the irrelevant parts didn't poison the rest of the discourse. I guess there are many intelligent racists out there. We don't need them, and I don't care what they have to say.
> The comment was really directed at the author of the article (not even whoever posted it) not sure why you're taking that as if it was directed to you, and on a personal level.
I am not. That last part is using a figure of speech called sarcasm. Could I have written my comment without it? Sure, but it was designed to redirect the kind of hostility towards the OP so that it becomes clear that it's not acceptable.
> It would be if the irrelevant part don't poison the rest of the discourse. I guess there are many intelligent racists out there. We don't need them and I don't care what they have to say.
"It is the mark of an educated mind to be able to entertain a thought without accepting it."
What can I say, this tells a lot about you.
Discussing with people you strongly disagree with is a necessary and unavoidable step to change other people's mind.
If you're not willing to get closer to people you disagree with and try to educate them towards a more open point of view (a not-racist point of view, to go along with your example), then you'll never make any change in the world: it really means you're okay with racism as it is.
At this point I'm really done arguing with you (I did my part).
Amusing that someone disagrees with you on an appraisal and your first reaction is to accuse them of calling for rulebreaking. There's no vote going on here. Whatever it was is flagged now. If you want to restate your opinion, at least add to the conversation.
Interestingly enough, my comment was flagged with a score of just -1.
This kind of rhetoric effectively shuts down discussion as long as it does not adhere to some pre-approved narrative.
What is said is:
> Shortsighted discussion.
> C’s integers being wobbly-sized is the reason we have C compilers for pretty much every conceivable CPU architecture, while we only have rust/swift/other compilers for a handful of carefully picked architectures.
I'll leave out the "controversial" (?) part (suggesting some ways to deal with other points of view, no racial slurs nor cussing involved).
Anyway: there are entire industries where C-as-a-protocol is fine enough (and actually appreciated, even though the author doesn't like it).
And then again... Some other people commented that "in 2024, only a handful of CPU architectures really matter anymore." which is even more shortsighted.
There are entire industries that do not use x86 or ARM, let alone RISC-V. There are many use cases for MIPS and weird/other architectures (PowerPC, 8051, m68k derivatives, PIC micros)... I had a relative who used to work on control software for space stuff; the CPUs used there are even older and definitely not x86/x86-64, ARM or RISC-V.
So yeah, complaining that the C language does not fit your own very specific use case is shortsighted (and frankly, I'm putting this very politely).
We should all remember that the world does not revolve around us and around our own use case.
> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.
OP was by one of the foremost experts in ABI and FFI issues, especially as pertains to Rust, who has studied this issue more than most programmers. I'd hardly call it short-sighted.
And in 2024, only a handful of CPU architectures really matter anymore.
If by "close to the machine" you mean "close to the PDP-11", then sure, but the machine you're likely to write for hasn't been at all like a PDP-11 for quite some time: https://queue.acm.org/detail.cfm?id=3212479
Maybe he would like to have a modern C that is actually close to the machine, one which exposes the L1-L3 caches to the programmer, who has to manage them himself.
I read until the "FFI" section