Hacker News

According to GitHub, the totals are:

backendA: 11 files, 1 directory, 799 lines (676 sloc), 23.56KB

backendB: 23 files, 5 directories, 1578 lines (1306 sloc), 42.26KB

It's approximately twice as big for the same functionality, and I had to spend a lot more time "digging" through the second one to get an overall idea of how everything works. Jumping around between lots of tiny files is a big waste of time and overhead, and it's one of my pet peeves with how a lot of "modern" software is organised. If you believe that the number of bugs is directly proportional to the number of lines of code, thus "less code, fewer bugs", then backendA is far superior.

> backendB required a bit more work

I'm not surprised that it did. This experiment reminds me of the "enterprise Hello World" parodies, and although backendB isn't quite as extreme, it has some indications of going in that direction. The excessive bureaucracy of Enterprise Java (and to a lesser extent, C#) leads to even simple changes requiring lots of "threading the data" through many layers. I've worked with codebases like that before, many years ago, and don't ever wish to do it again.

I really don't get this fetish for lots of tiny files and nested directories, which seems to be a recent trend; "maintainability" is often dogmatically quoted as the reason, but when it comes time to actually do something to the code, I much prefer a few larger files in a flat structure, where I can scroll through and search, instead of jumping around lots of tiny files nested several directories deep. It might look simpler at the micro level if each file is tiny, or the functions in them are also very short, but all that means is the complexity of the system has increased at the macro level and largely become hidden in the interaction of the parts.



> I really don't get this fetish for lots of tiny files and nested directories, which seems to be a recent trend;

I suspect it is the same kind of thinking that says all functions should be very small (without reference to whether each function provides a single meaningful behaviour). Locally, this keeps things relatively simple, but it ignores the global issue that now there are potentially many more connections to follow around and everything becomes less cohesive. As far as I’m aware, such research as we have available on this still tends to show worse results (in particular, higher bug frequencies) in very short and very long functions, but that doesn’t stop a lot of people from making an intuitive argument for keeping individual elements very small.

A similar issue comes up once again in designing APIs: do you go for minimal but complete, or do you also provide extra help in common cases even if it is technically redundant? The former is “cleaner”, but in practice the latter is often easier to use for those writing a client for that API. Smaller isn’t automatically better.


The book "A Philosophy of Software Design" should interest you then: https://www.amazon.com/t/dp/1732102201 It argues, among other things, that deep interfaces matter more than code complexity inside a module.


That was the first software book I'd read in a while where I got to the end and felt that, if I wrote a book myself, it would be very close to what I'd want it to say. I highly recommend it to anyone who has built up a bit of practical programming experience and wants to improve further.


> The excessive bureaucracy of Enterprise Java (and to a lesser extent, C#) leads to even simple changes requiring lots of "threading the data" through many layers. I've worked with codebases like that before, many years ago, and don't ever wish to do it again.

Yeah I tend to like something like a semantic compression approach: I'll start in a single file, and then split it into separate files organized by domain as the length of the file starts to get unwieldy. And so on into more files and later subdirectories as the program grows.

In my opinion it's much better to let the "needs of the program" dictate code and filesystem structure rather than some academic ideas about how a program should be organized. As you say, when I've worked on projects which are very strict about adopting a particular structure, a lot of time ends up being wasted figuring out how to map my intent to that structure rather than just writing the damn code.


> excessive bureaucracy

I like to call this mountain of abstractions forced on you (as opposed to coming from your domain): gratuitous object astronautics.


When we are talking about 500-1500 sloc I completely agree this kind of structure is overkill. But when dealing with medium to large codebases (anything beyond, say, 100kloc) I much prefer the second approach, bonus points if you can get a fractal-like hierarchy.

Digging through files manually (i.e. using a mouse) is painful, but your IDE is your friend. It takes me less than 3 seconds to search and open any file of the codebase I currently work in (it has a bit more than 2k files). And having a sane hierarchy means I type the folder / file name as I remember it, and filter the search results on-demand.


Splitting things up into multiple independent translation units enables incremental compilation. One function per file is the most extreme version of this. For example:

https://git.musl-libc.org/cgit/musl/tree/src/stdio


That seems like a problem for compiler optimizers to solve, not programmers.


It's actually the domain of build systems. Splitting code into as many independent files as possible gives the build system more data to work with, allowing it to compile parts of the program in parallel and to recompile them only when necessary.

If a file contains two functions and the developer changes one of them, both functions will be recompiled. If two files contain one function each, only the file with the changed function will be recompiled.

Build times increase with language power and complexity as well as the size of the project. Avoiding needless work is always a major victory.
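The timestamp logic behind that kind of selective recompilation can be sketched in a few lines of shell. The file names (add.c, mul.c) are hypothetical, and `touch` stands in for the compiler so the sketch runs without one; a real build system (make, ninja) applies the same newer-than rule per object file:

```shell
# Create a throwaway two-function "project", one function per file.
dir=$(mktemp -d)
cd "$dir"

printf 'int add(int a, int b) { return a + b; }\n' > add.c
printf 'int mul(int a, int b) { return a * b; }\n' > mul.c

# Initial "build": one object file per translation unit
# (touch stands in for `cc -c` here).
touch add.o mul.o

# Edit only add.c.
sleep 1
touch add.c

# make-style rule: a .o is rebuilt only if its .c is newer.
if [ add.c -nt add.o ]; then add_status=stale; else add_status=fresh; fi
if [ mul.c -nt mul.o ]; then mul_status=stale; else mul_status=fresh; fi
echo "add.o: $add_status, mul.o: $mul_status"
```

With one function per file, only add.o is stale after the edit; had both functions lived in one file, its single object would be rebuilt in full.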


In C++, the experience is the opposite - a "unity build", where everything is #included into a single translation unit, tends to be faster:

https://mesonbuild.com/Unity-builds.html

http://onqtam.com/programming/2018-07-07-unity-builds/

https://buffered.io/posts/the-magic-of-unity-builds/


Unity builds are useful too but they have limitations. They are equivalent to full rebuilds and can't be done in parallel. The optimizations they enable can also be achieved via link time optimization. Language features that leverage file scope can interact badly with this type of build. They require a lot of memory since the compiler reads and processes the entire source code of the project and its dependencies.

Unity builds improve compilation times because the preprocessor and compiler are invoked only once. They are most useful in projects with lots of huge dependencies that require the inclusion of complex headers. The effect is less pronounced in simpler projects, and they shouldn't be necessary at all in languages that have an actual module system instead of a preprocessor: Rust, Zig.


I have to clarify here a little bit and say that it is faster on one core. If you have multiple cores, having your translation unit count in the same order of magnitude as your core count will be faster. There is a lot more redundant work going on, but the parallelism can make up for it.


> If a file contains two functions and the developer changes one of them, both functions will be recompiled. If two files contain one function each, only the file with the changed function will be recompiled.

Still sounds like a compiler problem


Multiple files also seems like a problem for IDEs to solve, not programmers.


That brings to mind an interesting idea for an IDE: having one big virtual file that you edit, which gets split into multiple physical files on disk (based on module/class/whatever). Although, thinking about it, there are some languages that would make such automatic restructuring rather difficult.


You've just described Leo - leoeditor.com - where you're effectively editing a gigantic single xml file hidden by a GUI. The structuring is only occasionally automatic - mostly manual. It has python available the way emacs has elisp.

Git conflict resolution of that single file is intractable, so I convert the representation into thousands of tiny files for git, which I reassemble into the xml for Leo.


Yes! Why can't OOP language editors (IDEs) simply represent the source code of classes, interfaces and other type definitions as they are, without even revealing anything about the files they reside in? The technical detail of source code being stored in files is mundane.


> That brings to mind an interesting idea for an IDE: having one big virtual file that you edit, which gets split into multiple physical files on disk

If you're going to work with it as one big file, then what's the point of multiple physical files anyway? Just store it as one big file then.


Why store it as a (text) file at all? Why not store the code in a database? Or as binary? Then you can store metadata pertaining to the code and not just the code itself. Unreal Blueprints are an interesting way of structuring code and providing a componentized API. It would be interesting if they were more closely integrated with the code itself. Then you could manipulate data flows, code, and even do debugging from inside the same interface.

Yes, this is all pie in the sky stuff, but it's interesting to think about.


I have been toying with the idea of storing programming projects in a single sqlite3 database, but I've never seen enough value to actually pursue it.

As you mentioned though, it's interesting to think about.


There’s no one perfect answer; beauty is in the eye of the beholder.

I personally have a harder time coming up to speed on things that don’t break things down into fairly small chunks. I have an easier time dealing with abstraction, and would rather the implementation details of what I’m looking at be hidden until I drill in another level. IDEs make that latter part easy.

However I’ve come to realize that there’s not a one size fits all here. I’ve worked with people who are the exact opposite, and everything in between.

The best one can do is try to find the happiest medium for everyone involved and power on.



