More

yowlingcat · 2026-01-06T00:40:50 1767660050

Someone here has lost the plot and at this point I wonder if it is me. Is software supposed to be deterministic anymore? Are incremental steps expected to be upgrades and not regressions? Is stability of behavior and dependability desirable? Should we culturally reward striving to get more done with less.

...no, I haven't lost the plot. I'm seeing another fad of the intoxicated parting with their money bending a useful tool into a golden hammer of a caricature. I dread seeing the eventual wreckage and self-realization from the inevitable hangover.

Quarrelsome · 2026-01-06T02:27:14 1767666434

i always thought my job was to be able to prove the correctness of the system but maybe the reality is that my job was actually just to sling shit at someone until they were satisfied.

orangecat · 2026-01-06T03:32:56 1767670376

Is software supposed to be deterministic anymore?

I've never understood this argument. Do you ever work with other humans? They are very much not deterministic, yet they can often produce useful code that helps you achieve more than you could by yourself.

yowlingcat · 2026-01-02T02:51:27 1767322287

Run a model at all, run a model fast, run a model cheap. Pick 2.

With LLM workloads, you can run some of the larger local models (at all) and you can run them cheap on the unified 128G RAM machines (Strix Halo/Spark) - for example, gpt-oss-120b. At 4bit quantization given it's an MoE that's natively trained at NVFP4, it'll be pretty quick. Some of the other MoEs with highly compressed active parameter models will also be quick as well. But things will get sluggish as the active parameters increase. The best way to run these models is with a multi-GPU rig so you get speed and VRAM density at once, but that's expensive.

With other workloads such as image/video generation, the unified vram doesn't help as much and the operations themselves intrinsically run better on the beefier GPU cores, in part because many of the models are relatively small compared to LLM (6B-20B active parameters) but generating from those parameters is definitely GPU compute intensive. So you get infinitely more from a 3090 (maybe even a slightly lesser card) than you do from a unified memory rig.

If you're running a mixture of LLM and image/video generation workloads, there is no easy answer. Some folks on a budget opt for a unified memory machine with an eGPU to get the best of both worlds, but I hear drivers are an issue. Some folks use the Mac studios which while quite fast force you to be inside the Metal ecosystem rather than CUDA and aren't as pleasant for dev or user ecosystem. Some folks build a multi CPU server rig with a ton of vanilla RAM (used to be popular for folks who wanted to run DeepSeek before RAM prices spiked). Some folks buy older servers with VRAM dense but dated cards (thing Pascal, Volta, etc, or AMD MI50/100). There's no free lunch with any of these options, honestly.

If you don't have a very clear sense of something you can buy that you won't regret, it's hard to go wrong using any of the cloud GPU hyperscalers (Runpod, Modal, Northflank, etc) or something like Fal or Replicate where you can try out the open source models and pay per request. Sure, you'll spend a bit more on unit costs, but it'll force you to figure out if you have your workloads figured out enough to where the pain of having it in the cloud stings enough to where you want to buy and own the metal -- if the answer is no, even if you could afford it, you'll often be most happiest just using the right cloud service!

Ask me how I figured out all of the above the hard way...

fragmede · 2026-01-02T18:58:16 1767380296

Thank you for sharing your painful learning experiences! What do you have now?

yowlingcat · 2026-01-02T20:10:04 1767384604

Locally, I use a Mac Studio with a ton of VRAM and just accept the limitations of the Metal ecosystem, which is generally fine for the inference workloads I am consistently running locally (but I think would be a pain for a lot of people).

I can't see it making sense for training workloads if and when I get to them (which I'd put on the cloud). I have a box with a single 3090 to do CUDA dev if I need to but I haven't needed to do it that often. And frankly the Mac Studio has rough computational parity with a bit under a 3090 in terms of grunt, but with an order of magnitude more unified VRAM so it hits the mark for medium-ish MoE models I like to run locally as well as some of the diffusion inference workloads.

Anything that doesn't work great locally or which is throwaway (but needs to be fast) ends up getting thrown at the cloud. I pull it back to something I can run locally once I'm running it over and over again on a recurring basis.

Y_Y · 2026-01-02T11:47:04 1767354424

I choose fast and cheap

yowlingcat · 2026-01-02T02:06:55 1767319615

Very cool! Managing ones boxes as cattle and not pets almost always seems like a better idea in retrospect but historically it is easier said than done. Moreover, I like the idea of being able to diff a box's actual state from a current Ansible system to verify that it actually is as configured for further parity between deployed/planned.

_mig5 · 2026-01-02T02:19:34 1767320374

Definitely! It's all too easy to make a direct change and later forget to 'fold it in' to Ansible and run a playbook. My hope is that `enroll diff` serves as a good reminder if nothing else.

I'm pondering adding some sort of `--enforce` argument to make it re-apply a 'golden' harvest state if you really want to be strictly against drift. For now, it's notifications only though.

yowlingcat · 2025-12-31T03:34:02 1767152042

Ah, I remember scheming about buying an NF7-S + an Athlon XP Barton and unlocking it, combining it with a geforce 4 ti4200 and overclocking both but not even having enough of the pocket change to pull that off. By the time I was far enough along to have some of that in school, I picked up an A64 and a top of the line Geforce 5 from a black friday sale and had a great time gaming and coding.

Ironically, all the scheming I did about overclocking ended up being very unnecessary and I found it borderline impossible to actually stress the upper limits of the machine's muscle with day to day workloads and so all the research I put into overclocking was not really practically necessary, that it was freeing to not have to even think about the machine and instead focus on the work I wanted to do with the software I was using and building. Surely a lesson that continues to pay dividends, albeit from simpler times...

yowlingcat · 2025-12-29T19:00:40 1767034840

Are you experienced with DAWs as a composer or producer?

Many if not most professional producers use MIDI controllers with knobs/sliders/buttons MIDI mapped to DAW controls. As such the skeuomorphism actually plays a valuable role in ensuring that the physical instrument experience maps to their workflows. Secondarily, during production/mastering, producers are generally using automation lanes and envelopes to program parameters into the timeline, and the piano roll to polish the actual notes.

When I've historically done working sessions, the composition phase of what I'm doing tends to involve very little interaction with the keyboard, and is almost entirely driven by my interaction with the MIDI controller.

Conversely, when I'm at the production phase, I am generally not futzing with around with either knobs or the controller, and I am entirely interacting with the DAW through automation lanes or drawing in notes through the piano roll. So I don't really ever use the knob through a mouse and I've never really encountered any professional or even hobbyist musicians who do except for throwaway experimentation purposes.

yowlingcat · 2025-12-29T16:43:07 1767026587

I agree with you and I will also anecdotally note that I've been personally observing more and more of the younger generations (Z, esp gen Alpha) adopt these mechanisms en masse, viewing social media as the funhouse simulation of socialization that it always was and finding true social connection through other manners.

yowlingcat · 2025-12-29T01:41:42 1766972502

Technically it's a lot closer to monopsony (Sam Altman/OAI cornering 40% of the market on DRAM in a clever way for his interests that harms the rest of the world that would want to use it). I keep hoping that somehow necessity will spur China to become the mother of invention here and supply product to serve the now lopsided constrained supply given increasing demand but I just don't know how practical it will be.

yowlingcat · 2025-12-22T16:01:32 1766419292

Very cool! Found this note interesting:

```

Limitations

I haven't, yet, found a good way to implement filters (low-pass, high-pass, band-pass, etc.). It does not have Fourier transform, and we cannot operate on the frequency domain. However, the moving average can suffice as a simple filter.

```

I wonder if there's a way to implement the FFT using subqueries and OVER/partitioning? That would create a lot of room for some interesting stuff to happen, specifically making it easy/possible to implement filters, compression, reverberation, and other kinds of effects.

Two other primitives that would be valuable to figure out: 1. How to implement FM/phase distortion. You can basically implement a whole universe of physical modeling if you get basic 6 op sine wave primitives right with FM + envelopes. 2. Sampling/resampling - given clickhouse should do quite well with storing raw sample data, being able to sample/resample opens up a whole world of wavetable synthesis algorithms, as well as vocal editing, etc.

Honestly, although the repo's approach is basic, I think the overall approach is wonderful and have wanted to be able to use SQL to write music for a while. I've spent a lot of time writing music in trackers, and being able to use SQL feels like it would be one of the few things that could spiritually be a successor to it. I've looked at other live coding languages, many of which are well built and have been used by talented people to make good music (such as Tidal, Strudel, etc). But all of it seems like a step down in language from SQL. I'd rather have their capabilities accessible from SQL than have to use another language and its limitations just to get those capabilities.

Food for thought -- thanks for the interesting and thoughtful work!

yowlingcat · 2025-11-30T20:20:51 1764534051

I see a lot of value in spinning up microservices where the database is global across all services (and not inside the service) but I struggle more to see the value of separate core transactional databases for separate services unless/until the point where two separate parts of the organizations are almost two separate companies that cannot operate as a single org/single company. You lose data integrity, joining ability, one coherent state of the world, etc.

The main time I can see this making sense is when the data access patterns are so different in scale and frequency that they're optimizing for different things that cause resource contention, but even then, my question would become do you really need a separate instance of the same kind of DB inside the service, or do you need another global replica/a new instance of a new but different kind of DB (for example Clickhouse if you've been running Postgres and now need efficient OLAP on large columnar data).

Once you get to this scale, I can see the idea of cell-based architecture [1] making sense -- but even at this point, you're really looking at a multi-dimensionally sharded global persistence store where each cell is functionally isolated for a single slice of routing space. This makes me question the value of microservices with state bound to the service writ large and I can't really think of a good use case for it.

[1] https://docs.aws.amazon.com/wellarchitected/latest/reducing-...

mjr00 · 2025-11-30T20:56:38 1764536198

> I see a lot of value in spinning up microservices where the database is global across all services (and not inside the service)

This issue with this is schema evolution. As a very simple example, let's say you have a User table, and many microservices accessing this table. Now you want to add an "IsDeleted" column to implement soft deletion; how do you do that? First you need to add the actual column to the database, then you need to go update every single service which queries that table and ensure that it's filtering out IsDeleted=True, deploy all those services, and only then can you actually start using the column. If you must update services in lockstep like this, you've built a distributed monolith, which is all of the complexity of microservices with none of the benefits.

A proper service-oriented way to deal with this is have a single service with control of the User table and expose a `GetUsers` API. This way, only one database and its associated service needs to be updated to support IsDeleted. Because of API stability guarantees--another important guarantee of good SoA--other services will continue to only get non-deleted users when using this API, without any updates on their end.

> You lose data integrity, joining ability, one coherent state of the world, etc.

You do lose this! And it's one of the tradeoffs, and why understanding your domain is so important for doing SoA well. For subsets of the domain where data integrity is important, it should all be in one database, and controlled by one service. For most domains, though, a lot of features don't have strict integrity requirements. As a concrete though slightly simplified example, I work with IoT time-series data, and one feature of our platform is using some ML algorithms to predict future values based on historical trends. The prediction calculation and storage of its results is done in a separate service, with the results being linked back via a "foreign key" to the device ID in the primary database. Now, if that device is deleted from the primary database, what happens? You have a bunch of orphaned rows in the prediction service's database. But how big of a deal is this actually? We never "walk back" from any individual prediction record to the device via the ID in the row; queries are always some variant of "give me the predictions for device ID 123". So the only real consequence is a bit of database bloat, which can be resolved via regularly scheduled orphan checking processes if it's a concern.

It's definitely a mindshift if you're used to a "everything in one RDBMS linked by foreign keys" strategy, but I've seen this successfully deployed at many companies (AWS, among others).

yowlingcat · 2025-12-03T02:55:09 1764730509

I get your point around the soft deletion example but that sounds more like poor module separation + relational query abstraction/re-use rather than a shared database issue. Whether it's through a service, a module or a library, leaky abstraction boundaries will always cause issues. I see your point about separately versioning separate services which I think can make certain kinds of migrations more tractable -- but it comes at the expense of making it heavier and prolonging the duration of supporting two systems.

The difference I generally see with shared-state microservices is that now you introduce a network call (although you have a singular master for your OLTP state), and with isolated state microservices, now you are running into multiple store synchronization issues and conflict resolution. Those tradeoffs are very painful to make and borderline questionable to me without a really good reason to sacrifice them (reasons I rarely see but can't in good faith say never happen).

Pertaining to your IoT example -- that's definitely a spot where I see a reason to move out of the cozy RDBMS, which is an access pattern that is predominated by reads and writes of temporal data in a homogenous row layout and seemingly little to no updates -- a great use case for a columnar store such as Clickhouse. I've resisted moving onto it at $MYCORP because of the aforementioned paranoia about losing RDBMS niceties (and our data isn't really large enough for vertical scaling to not just work) but I could see that being different if our data got a lot larger a lot more quickly.

Maybe putting it together, there are maybe only several reasons I've seen where microservices are genuinely the right tool for a specific job (and which create value even with shared state/distributed monolith):

1) [Shared state] Polyglot implementation -- this is the most obvious one that's given leverage for me at other orgs; being able to have what's functionally a distributed monolith allows you to use multiple ecosystems at little to no ongoing cost of maintenance. This need doesn't happen for me that often given I often work in the Python ecosystem (so being able to drop down into cython, numba, etc is always an option for speed and the ecosystem is massive on its own), but at previous orgs, spinning up a service to make use of the Java ecosystem was a huge win for the org over being stuck in the original ecosystem. Aside from that, being able to deploy frontend and backend separately is probably the simplest and most useful variant of this that I've used just about everywhere (given I've mostly worked at shops that ship SPAs).

2) [Shared state] SDLC velocity -- as a monolith grows it just gets plain heavy to check out a large repository, set up the environment run tests, and have that occur over and over again in CI. Being able to know that a build recipe and test suite for just a subset of the codebase needs to occur can really create order of magnitude speed ups in wall to wall CI time which in my experience tends to be the speed limit for how quickly teams can ship code.

3) [Multi-store] Specialized access patterns at scale -- there really are certain workloads that don't play that well with RDBMS in a performant and simple way unless you take on significant ongoing maintenance burden -- two I can think of off the top of my head are large OLAP workloads and search/vector database workloads; no real way of getting around needing to use something like ElasticSearch when Postgres FTS won't cut it, and maybe no way around using something like Clickhouse for big temporal queries when it would be 10x more expensive and brittle to use postgres for it; even so, these still feel more like "multiple singleton stores" rather than "one store per service"

4) [Multi-store] Independent services aligned with separate lines of revenue -- this is probably the best case I can think of for microservices from a first principles level. Does the service stand on its own as a separate product and line of revenue from the rest of the codebase, and is it actually sold by and operated by the business that independently? If so, it really is and should be its own "company" inside a company and it makes sense for it to have the autonomy and independence to consume its upstream dependencies (and expose dependencies to its downstream) however it sees fit. When I was at AWS, this was a blaringly obvious justification, and one that made a lot of intuitive sense to me given that so much of the good stuff that Amazon builds to use internally is also built to be sold externally.

5 [Multi-store] Mechanism to enforce hygiene and accountability around organizational divisions of labor - to me, this feels like the most questionable and yet most common variant that I often see. Microservices are still sexy and have the allure of creating high visibility career advancing project track records for ambitious engineers, even if at the detriment to the good of the company they work for. Microservices can be used as a bureaucratic mechanism to enforce accountability and ownership of one part of the codebase to a specific team to prevent the illegibility of tragedy of the commons -- but ultimately, I've often found that the forces and challenges that lead to those original tragedy of the commons are not actually solved any better in the move to microservices and if anything the cost of solving it is actually increased.

yowlingcat · 2025-11-14T07:44:06 1763106246

I observe this compulsion a lot and in my opinion, it's almost always coming from resentment driving ego in an attempt to compensate for insecurity and self-loathing, which ultimately ends up misdirected towards others. It's almost always accidentally entertaining, but sadly ends up diminishing rather than elevating discourse.

To refute GP's point more broadly -- there is a lot in /applied/ computer science (which is what I think the harder aspects software engineering really is) that was and is done by individuals in software just building in a vacuum, open source holding tons of examples.

And to answer GP's somewhat rhetorical question more directly - none of those professions are paid to do open-ended knowledge work, so the analogy is extremely strained. You don't necessarily see them post on blogs (as opposed to LinkedIn/X, for example), but: investors, management consultants, lawyers, traders, and corporate executives all write a ton of this kind of long-form content that is blog post flavored all the time. And I think it comes from the same place -- they're paid to do open-ended knowledge work of some kind, and that leads people to write to reflect on what they think seems to work and what doesn't.

Some of it is interesting, some of it is pretty banal (for what it's worth, I don't really disagree that this blog post is uninteresting), but I find it odd to throw out the entire category even if a lot of it is noise.