I've never met anyone who say's "I like YAML, it is great"... most people that worked with it say something like "YAML is annoying, I don't like it"...
While introducing Kubernetes at our company in the last two years, we are currently in a process going more and more away from YAML with internal Helm charts to a much simpler process by just using HCL and Terraform, and defining Kuberentes resources as Terraform resources.
As a software developer HCL just makes so much more sense than this YAML + Helm + Go templates hell, which feels like C preprocessor hell all over again. Other solutions like kustomize are neat, but I don't see how all of these YAML workarounds should be better than something like HCL with Terraform. HCL feels like a real declarative programming language (with real conditions, variables, a module system and useful built-in functions). YAML feels like another more complex JSON and other tools like Helm or Kustomize try to work around the weaknesses of YAML with some kind of templating system.
YAML looks nice to read in simple demos and in small files, but is just not adequate in the real world (in my personal opinion - I know that YAML is used by a lot of people in production as of today).
> I've never met anyone who say's "I like YAML, it is great"
Maybe I'm older than you, but I have definitely heard that line.
Mostly because the alternatives were XML, INI or the myriad of bespoke formats, relayd/apachehttpd .conf or iptables etc;etc;
INI has parsers that operate in different ways and doesn't support heirarchies... so that's not ideal.
JSON and YAML came to the fore around the same time, and JSONs limitations in comments and it's picky semantics meant that people did prefer YAML over JSON for human readable configs.
YAML itself is fine, it has some really awkward warts and the parsers are usually programatically unsafe in their implementation (leading to less compatible "safe_load" or other types of loaders)[0]; the issue we actually have with YAML is that we:
A) Template it (jinja, mustache whatever)
B) Put entirely too much stuff into it. (kubernetes manfiests can grow to the hundreds of lines really easily)
These problems will affect any configuration file format we choose to use, including TOML (which is comparatively new on the block), because reading templated/enormous files is really difficult.
What I've taken to doing is programatically generating objects and then serialising them as whatever my software depends on. It might feel ugly to use an entire turing complete language to generate objects that are mostly static: but honestly... the ability to breakpoint, test and print the subsections of output is astonishingly nice.
The tooling is super mature, it's easy to emit, it's easy to parse, it's easy to validate, it can just a little hard to read and write by hand (and I mostly blame SOAP for that). Still, basic XML isn't that hard to read or write, thanks to editor support.
I like that you can use anchors and merges. It greatly simplifies complex, repetive structures. And most of the complaints about yaml can be worked around by string-quoting.
The whitespace can get in the way if you're templating, but then you can also use [1, 2, 3] as a list notation, for example.
In fact, most of the complaints could be resolved by running it through a linter.
I like YAML. More specifically, the subset of YAML, like the author suggests. Clear, intuitive, and allows expressing of complex data structures like JSON does. Much better than TOML, which easily becomes a mess with more complex data.
Yup exactly my experience as well, again a stupid idea to try and make a "configuration" language out of nested key value pairs that end up needing fancy interpreters allowing more and more semantic into the keys and values to start doing what a simple program could have done in half the time...
I ve worked in 4 companies over a period of 10 years, each had exactly this problem, with yml, json, xml, properties file (you dont want to see business logic conditionals in a properties text file, where the keys shapes command an interpreter to behave dynamically...)
The only times I saw a team do it well was a php backend of all things where the lead said they d program all their variations in php rather than source it from configuration flat descriptors and it was amazing, clear, simple and powerful. They had to release the backend at each config change instead of releasing the config change only, but Im still unsure why exactly that's a problem: the configs are software too if we re honest with ourselves, shoe-horning them in a descriptor language isnt gonna make them flat.
I don't think YAML is great, but I still think it's the best format out there.
The only confusing problem I've run into was the sexagesimal number notation and even that was fairly obvious. Perhaps it's because I tend to overquote strings?
I mean sure, the on/off to boolean mappings are annoying, but they also become very obvious when you're parsing config because the type validation will fail. If `flush_cache` has an enum `on` but no key `True` then the type validator will instantly complain about both the missing key and the extra key in the dictionary.
Same with accidental numbers, any type check will show that the parsing failed.
I find JSON for config files to become unreadable quickly because of the non-obvious nesting and the lack of comments. You can pick a JSON extension but then you need to pick one that your tooling will support.
Exactly this. I hear so many people recommend TOML over YAML.
I see the logic in it. For simple Key-value configurations, TOML is superior and more straightforward to YAML. You can add sub-level values and it isn't too bad (if there aren't too many), but beyond two levels, TOML becomes difficult to use.
If you really work in YAML in any sort of more advanced capacity (kubernetes, Ansible, CI/CD Pipelines) then you really need the complexity that YAML provides. You also get used to the "gotchas" mentioned here. Navigating them is fairly straightfoward.
I think the article was vastly overblown. Is YAML perfect? Certainly not. But you find a better way to display such complex data structures in a more human-readable and human-writable way. The complexity is YAML's strength, but it comes with caveats as all complexity generally does. I really think its the best we have.
I think problem in this particular case is using YAML as DSL. Every other data format would be equally bad here. Replace YAML with TOML and you're still in same templating hell.
YAML is least worst for me, and I don't think I ever hit the problems article is showing because
* I use editor that will highlight stuff like anchors
* I often generate config from CM so it can't have those errors
* Loading into defined struct in statically typed language also makes them impossible.
While introducing Kubernetes at our company in the last two years, we are currently in a process going more and more away from YAML with internal Helm charts to a much simpler process by just using HCL and Terraform, and defining Kuberentes resources as Terraform resources.
As a software developer HCL just makes so much more sense than this YAML + Helm + Go templates hell, which feels like C preprocessor hell all over again. Other solutions like kustomize are neat, but I don't see how all of these YAML workarounds should be better than something like HCL with Terraform. HCL feels like a real declarative programming language (with real conditions, variables, a module system and useful built-in functions). YAML feels like another more complex JSON and other tools like Helm or Kustomize try to work around the weaknesses of YAML with some kind of templating system.
YAML looks nice to read in simple demos and in small files, but is just not adequate in the real world (in my personal opinion - I know that YAML is used by a lot of people in production as of today).