We can still get away with those sins today if we change C's implicit type from int to long. I modified chibicc to do just that; it was a 1 LOC patch. Suddenly I didn't need prototypes anymore and everything just worked without #include lines.
> C99 requires that you at least declare a function before calling it
Where is that written? Please quote the standard. My reading of ISO/IEC 9899:TC3 is that the Foreword says (paraphrasing) "hey, we removed implicit int and implicit function declarations from the standard". So they simply stopped specifying them, but as far as I can tell, the standard says nothing about forbidding them. From my point of view, that means compilers are free to still implement implicit types as a compiler extension to the language, and everyone does, because that extension is needed in order to support the older C89 standard. It would only impact people who want to be able to claim their codebase is C99 standard compliant, because you can't do that if you depend on compiler extensions.
ISO/IEC 9899:TC3 [1] §6.5.1 ¶2: "An identifier is a primary expression, provided it has been declared as designating an object (in which case it is an lvalue) or a function (in which case it is a function designator)."
There's even a footnote to underscore this point: "79) Thus, an undeclared identifier is a violation of the syntax."
Prototypes are required in C23 as part of the expungement of K&R syntax. This also brings harmonization with C++ where an empty parameter list is implicitly void.
But why does it have to be that way? Can't the compiler scan the rest of the code/files and see if a definition/declaration is present somewhere?
Java, for example, has no such requirement of a declaration before the first call.
I suppose you could just create a ".d" file standard that doesn't have that requirement but processes into a ".c" file that has the necessary prototypes. You could probably also auto-insert all the #if'n'def nonsense automatically and skip it in the ".d" files.
Kind of like how the JavaScript dudes all use this ".ts" stuff now for convenience that processes into ".js"
Just to be a little picky, but if you want “convenient” then just stick with pure JS - it’s overly forgiving and simple. TypeScript is lovely and I much prefer it over JS, but having to type _everything_ is far from convenient imo
It is! I’ve been working in an environment that essentially requires us to type as we code (rules against using the “any” and “unknown” types), so I’m just used to them being enforced now lol. So I suppose my point is moot, as the tediousness isn’t forced by the language necessarily.
The Arduino build system does this (preprocesses your source code to pull out prototypes and put them at the top). To make things easier for beginners.
Prior to C99 (i.e., in C90), there was a rule that any undeclared identifier id followed by a function-call parenthesis would be treated as though there were an implicit declaration, in the scope in which it was used, like this:
extern int id();
The empty parentheses meant it could take any number of arguments, since it was declared in the traditional style, where argument types weren't specified.
This implicit declaration meant that if the function was later defined in the same file as returning a type other than int, the compiler wasn't permitted to go back up in the file and treat the earlier call as though it returned that other type.
This requirement was removed in C99, but in practice, compilers kept doing it for backwards compatibility, even when not invoked in C90 mode.
My understanding is that the preprocessor causes issues: you could do that maybe 99.9% of the time, especially with well-written code, but it'd fail the other 0.1% of the time.
You do have tools like ctags. In theory the compiler could use ctags to find definitions in source code. My experience is that ctags can get confused: it's possible, in a set of files, to have multiple implementations of a function, and ctags can't tell which one is used. And I see cases where it can't find definitions at all.
Personally I'd be happy if the compiler tried to find a missing definition using ctags and issued a warning.
I have wondered if adding funct, public, and private keywords to the language might allow the compiler to reliably find function and struct definitions in source files.
The fact that I'm calling a function, it must exist, otherwise the compiler will throw an error ("undefined reference to function"). So forward declarations are just needless typing.
> The fact that I'm calling a function, it must exist
That's not necessarily true. It's possible that the symbol is a function pointer; calling a function pointer requires slightly different code generation. Compare the generated code between:
In practice, it's probably a function, and that's what the compiler assumes if there's no declaration to go off of. But we all know what happens when you make assumptions.
No, they are not.
But
>The fact that I'm calling a function, it must exist, otherwise the compiler will throw an error ("undefined reference to function")
You mean the linker will throw an error. The linker is trying to link together the "references", be they forward or backward, that the compiler has created, and the compiler needs to have generated the right references: round peg round hole, square peg square hole.
You don't want your linker throwing errors; it doesn't have the context the compiler does. And you don't want to turn the linker into ChatGPT, able to converse with you about your source code. Just use the C version of forward references, which are not particularly forward: they just say "I don't know where this is defined, it's just not defined here, but we know it's a square peg".
For example, there are architectures where space is tight (embedded for example) and nearby things can be called more efficiently than far away things, so the compiler needs to generate as many near calls as it can, falling back to far away when it has to. It doesn't know in advance how far away things are going to be, but it might uncover plenty of near things as it goes. Square pegs, round pegs.
When you recompile your project code, the library code is not necessarily around. When other people recompile their library code, your project code isn't around. What's the size of what's being put on the stack? You still gotta get the square/round pegs right.
In which case, it needs to keep parsing the include tree until it finds it. I know why it exists. I’m just not happy about duplicating code. Other compilers are smarter than this.
The thing I hate about this is that it really has no idea what a “public” header is. If I have it on I’ll use something like uintptr_t and get a random include for <__bits/std/uintptr.h> or something that I don’t want. I assume there’s like some manual auditing one can do to go “don’t peek through stdint.h” but either nobody does this reliably or it doesn’t work.
I haven't looked that deeply, but I don't see many sins in this source code, either, except for the K&R style, weird formatting and doing too much in one line, like this line in the file CAT.C:
if(*(Ptr = argv[i]) == '-') {
Older C compilers did let you get away with more, like using integers as pointers, and dereferencing pointers as though they were struct pointers when they aren't. But I don't see that in this code.
The "sin" I was referring to was calling undeclared functions -- which is how there are so few #include directives in much of this code. Most of the files I looked at include <stdio.h> (probably required for variadic arguments on printf-type functions) and <file.h> (for the FILE type), and call other functions blindly.
That does make sense. My bad, I should have looked closer and noticed the implicitly defined functions.
It's not even a good idea to do that on modern computers, because calls to implicitly declared functions apply the default argument promotions (float to double, char/short to int), and on the x86-64 System V ABI the caller has to pass the number of vector registers used for floating-point arguments in %al.
It's not a good idea to do that in production code. FTFY. If you're writing research, experimental, or one-off programs then it can be a real productivity boon. I say do it for fun. You have a right to enjoy yourself and it'll still come out 10x more readable than Perl at the end of the day, let's be real.
>Older C compilers did let you get away with more, like using integers as pointers
In older C compilers (maybe pre-ANSI, or later, can't remember), you could literally write i[a] instead of a[i], where a was an array and i was an int, and it would work equivalently, i.e. no compiler error, and give the same result as a[i]. This was because a was the address of the start of the array, and i was an offset, so a[i] actually meant *(a + i), which, by the commutative property of addition, was equivalent to *(i + a), which was equivalent to i[a].
I had read this in some C book, maybe K&R or the Waite Group C Microsoft Bible.
Never forgot it, because it was counter-intuitive, unless you knew the above reason.
And tried it out in one or more C compilers at that approximate period, and it worked as stated.
This is still true, and it's mandated by the ISO C standard that a[i] is equivalent to (*((a)+(i))) (yes, they specify it with that many parentheses). You're still able to compile code that treats a[i] and i[a] interchangeably.
Older C compilers will let you get away with a lot of sins which a modern C compiler will (correctly) call you out for.