We can still get away with those sins today if we change C's implicit type from int to long. I modified chibicc to do just that; it was a 1 LOC patch. Suddenly I didn't need prototypes anymore and everything just worked without #include lines.
> C99 requires that you at least declare a function before calling it
Where is that written? Please quote the standard. My reading of ISO/IEC 9899:TC3 is that the Foreword says (paraphrasing) "hey, we removed implicit int and implicit function declarations from the standard". So they simply stopped specifying them, but as far as I can tell, the standard says nothing about forbidding them. From my point of view, that means compilers are free to still implement implicit types as a compiler extension to the language, and everyone does, because that extension is needed in order to support the older C89 standard. It would only impact people who want to be able to claim their codebase is C99 standard compliant, because you can't do that if you depend on compiler extensions.
ISO/IEC 9899:TC3 [1] §6.5.1 ¶2: "An identifier is a primary expression, provided it has been declared as designating an object (in which case it is an lvalue) or a function (in which case it is a function designator)."
There's even a footnote to underscore this point: "79) Thus, an undeclared identifier is a violation of the syntax."
Prototypes are required in C23 as part of the expungement of K&R syntax. This also brings harmonization with C++ where an empty parameter list is implicitly void.
But why does it have to be that way? Can't the compiler scan the rest of the code/files and see if a definition/declaration is present somewhere?
Java, for example, has no such requirement of a declaration before the first call.
I suppose you could just create a ".d" file standard that doesn't have that requirement but processes into a ".c" file that has the necessary prototypes. You could probably also auto-insert all the #if'n'def nonsense automatically and skip it in the ".d" files.
Kind of like how the JavaScript dudes all use this ".ts" stuff now for convenience that processes into ".js"
Just to be a little picky, but if you want “convenient” then just stick with pure JS - it’s overly forgiving and simple. TypeScript is lovely and I much prefer it over JS, but having to type _everything_ is far from convenient imo
It is! I’ve been working in an environment that essentially requires us to type as we code (rules against using the “any” and “unknown” types), so I’m just used to them being enforced now lol. So I suppose my point is moot, as the tediousness isn’t forced by the language necessarily.
The Arduino build system does this (preprocesses your source code to pull out prototypes and put them at the top). To make things easier for beginners.
Prior to C99 (i.e., in C90), there was a rule that any undeclared identifier id followed by a function-call parenthesis would be treated as though there were an implicit declaration, in the scope in which it was used, like this:
extern int id();
The empty parentheses meant it could take any number of arguments, since it was declared in the traditional style, where argument types weren't specified.
This implicit declaration meant that if the function was later defined in the same file as returning a type other than int, the compiler wasn't permitted to go back up in the file and treat the earlier call as though it returned that other type.
This requirement was removed in C99, but in practice, compilers kept doing it for backwards compatibility, even when not invoked in C90 mode.
My understanding is that the preprocessor causes issues: you could do that maybe 99.9% of the time, especially with well-written code, but it'd fail the other 0.1% of the time.
You do have tools like ctags. In theory the compiler could use ctags to find definitions in source code. My experience is that ctags can get confused: it's possible, in a set of files, to have multiple implementations of a function, and ctags can't tell which one is used. And I see cases where it can't find definitions at all.
Personally I'd be happy if the compiler tried to find a missing definition using ctags and issued a warning.
I have wondered if adding funct, public, and private keywords to the language might allow the compiler to reliably find function and struct definitions in source files.
The fact that I'm calling a function, it must exist, otherwise the compiler will throw an error ("undefined reference to function"). So forward declarations are just needless typing.
> The fact that I'm calling a function, it must exist
That's not necessarily true. It's possible that the symbol is a function pointer; calling a function pointer requires slightly different code generation. Compare the generated code between:
In practice, it's probably a function, and that's what the compiler assumes if there's no declaration to go off of. But we all know what happens when you make assumptions.
No, they are not.
But
>The fact that I'm calling a function, it must exist, otherwise the compiler will throw an error ("undefined reference to function")
You mean the linker will throw an error. The linker is trying to link together the "references", be they forward or backward, that the compiler has created, and the compiler needs to have generated the right references: round peg round hole, square peg square hole.
You don't want your linker throwing errors; it doesn't have the context the compiler does. And you don't want to turn the linker into ChatGPT, able to converse with you about your source code. Just use the C version of forward references, which are not particularly forward: they just say "I don't know where this is defined, it's just not defined here, but we know it's a square peg".
For example, there are architectures where space is tight (embedded for example) and nearby things can be called more efficiently than far away things, so the compiler needs to generate as many near calls as it can, falling back to far away when it has to. It doesn't know in advance how far away things are going to be, but it might uncover plenty of near things as it goes. Square pegs, round pegs.
When you recompile your project code, the library code is not necessarily around. When other people recompile their library code, your project code isn't around. What's the size of what's being put on the stack? You still gotta get the square/round pegs right.
In which case, it needs to keep parsing the include tree until it finds it. I know why it exists. I’m just not happy about duplicating code. Other compilers are smarter than this.
The thing I hate about this is that it really has no idea what a “public” header is. If I have it on I’ll use something like uintptr_t and get a random include for <__bits/std/uintptr.h> or something that I don’t want. I assume there’s like some manual auditing one can do to go “don’t peek through stdint.h” but either nobody does this reliably or it doesn’t work.
I haven't looked that deeply, but I don't see many sins in this source code, either, except for the K&R style, weird formatting and doing too much in one line, like this line in the file CAT.C:
if(*(Ptr = argv[i]) == '-') {
Older C compilers did let you get away with more, like using integers as pointers, and dereferencing pointers as though they were struct pointers when they aren't. But I don't see that in this code.
The "sin" I was referring to was calling undeclared functions -- which is how there are so few #include directives in much of this code. Most of the files I looked at include <stdio.h> (probably required for variadic arguments on printf-type functions) and <file.h> (for the FILE type), and call other functions blindly.
That does make sense. My bad, I should have looked closer and noticed the implicitly defined functions.
It's not even a good idea to do that on modern computers, because calls to implicitly declared functions apply the default argument promotions (float to double, char/short to int), and on the x86-64 System V ABI the caller has to pass the number of vector registers used for floating-point arguments in %al.
It's not a good idea to do that in production code. FTFY. If you're writing research, experimental, or one-off programs then it can be a real productivity boon. I say do it for fun. You have a right to enjoy yourself and it'll still come out 10x more readable than Perl at the end of the day, let's be real.
>Older C compilers did let you get away with more, like using integers as pointers
In older C compilers (maybe pre-ANSI, or later, can't remember), you could literally write i[a] instead of a[i], where a was an array and i was an int, and it would work equivalently, i.e. no compiler error, and give the same result as a[i]. This was because a was the address of the start of the array, and i was an offset, so a[i] actually meant *(a + i), which, by the commutative property of addition, was equivalent to *(i + a), which was equivalent to i[a].
I had read this in some C book, maybe K&R or the Waite Group C Microsoft Bible.
Never forgot it, because it was counter-intuitive, unless you knew the above reason.
And tried it out in one or more C compilers at that approximate period, and it worked as stated.
This is still true, and it's mandated by the ISO C standard that a[i] is equivalent to (*((a)+(i))) (yes, they specify it with that many parentheses). You're still able to compile code that treats a[i] and i[a] interchangeably.
Older C compilers will let you get away with a lot of sins which a modern C compiler will (correctly) call you out for.