
I couldn't help but notice almost immediately one feature that is not human oriented, but most likely exists because it's easier for a machine to parse: single (":") vs double colon ("::"). This is not human-friendly. A human wants to write "key" "is" "value", and YAML has for a very long time supported a single ":" for "is" regardless of the actual type of the value.

I shouldn't have to care what the type of the value is when writing what is effectively YAML. This double-colon feature will do nothing but lead to bug reports from people confused as to why their document is invalid.



The comment above gives explanations defending subjective preferences about what "human oriented" means. That's fine as long as you remember that is what is happening: justification of subjective preferences. Other people can reasonably (or unreasonably) have different subjective preferences.

Also notice what the commenter above hasn't done (yet, maybe they will?): a full "forest level" comparison of all the trade-offs between the current HUML specification and ... what is your alternative proposal, exactly?

Based on my experience, I would guess that most people who design a language (and write a parser for it) for the first time will: (i) be surprised at how quickly design decisions snowball and lead to unexpected places; (ii) discover just how entangled design choices really are; (iii) give up on trying to please everyone.

In my view, a language designer does really well to describe one's motivations, goals, tradeoffs, decisions, and then live with what you make, because... (a) making something real and useful is rad and (b) any language you make will probably have some weird stank you can't seem to get rid of.


The original designers of YAML explicitly declared in its name "Yet Another Markup Language" that it was designed to be a markup language, which it most certainly was not. Eventually somebody finally pointed out the mistake, and they sheepishly retronymed it "YAML Ain't Markup Language".

https://stackoverflow.com/questions/6968366/if-yaml-aint-mar...

On the other hand, and to your points, Relax/NG (both its XML and simplified syntaxes) is a beautiful successful example of wisely and collaboratively designing a new clean powerful system with the deep understanding of what a markup language really is (James Clark was deeply involved with many SGML and XML standards and implementations), and full cognizance of the strengths and weaknesses of other systems you're trying to replace (SGML DTDs, XML Schemas, TREX, RELAX, XDuce, and other experimental XML schema languages).

https://en.wikipedia.org/wiki/RELAX_NG

>In computing, RELAX NG (REgular LAnguage for XML Next Generation) is a schema language for XML—a RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema is itself an XML document but RELAX NG also offers a popular compact, non-XML syntax.[1] Compared to other XML schema languages RELAX NG is considered relatively simple.

RELAX NG Compact Syntax

https://www.oasis-open.org/committees/relax-ng/compact-20021...

https://relaxng.org/jclark/design.html

The Design of RELAX NG

James Clark (jjc@thaiopensource.com)

Abstract: RELAX NG is a new schema language for XML. This paper discusses various aspects of the design of RELAX NG including the treatment of attributes, datatyping, mixed content, unordered content namespaces, cross-references and modularity. [...]

>Composability

>RELAX NG is designed to be highly composable. A schema language (or indeed a programming language) provides a number of atomic objects and a number of methods of composition. The methods of composition can be used to combine atomic objects into compound objects which can in turn be composed into further compound objects. The composability of the language is the degree to which the various methods of composition can be applied uniformly to all the various objects of the language, both atomic and compound. For example, RELAX NG provides a choice element that can be applied uniformly to elements, attributes, datatypes and enumerated values. This is not mere syntactic overloading. The choice element has a single uniform semantic in all these cases and can have a single implementation. Another example is the grammar element, which is the container for definitions. The grammar element is just another pattern and can be composed in just the same way as other patterns. Composability improves ease of learning and ease of use. Composability also tends to improve the ratio between complexity and power: for a given amount of complexity, a more composable language will be more powerful than a less composable one. [...]

>XML syntax

RELAX NG uses XML instance syntax to express schemas. Although this makes for a rather verbose schema language, it has some major advantages. Since a user of an XML schema language must necessarily already learn XML instance syntax, using XML instance syntax for the schema language reduces the learning burden on a schema user. It also allows XML tools and technologies to be applied to the schema. For example, a schema can be used to specify the syntax of the schema language. Another important benefit of XML syntax is extensibility. RELAX NG has an open syntax that allows the RELAX NG defined elements and attributes to be annotated with elements and attributes from other namespaces. RELAX NG DTD Compatibility [12] uses this annotation mechanism to extend RELAX NG with a mechanism for declaring default values for attributes. RelaxNGCC [23] uses this annotation mechanism to allow users to embed Java code in RELAX NG schemas, which gets executed as an XML document is parsed against the schema. An unofficial non-XML syntax for RELAX NG has also been developed [8]. The non-XML syntax can be used for authoring RELAX NG schemas by hand and can then be transformed into the standard RELAX NG XML syntax for interchange. [...]


I’m not seeing how this is a response to my comment. It comes across as evangelism and thread hijacking.


Here, I'll break it down for you:

I'm directly illustrating your point that "a language designer does really well to describe one's motivations, goals, tradeoffs, decisions, and then live with what you make", with one negative example of YAML spectacularly failing, and another positive example of Relax/NG spectacularly succeeding.

Because if all I do is beat a dead horse by ragging on YAML without suggesting any counter examples, I'm just pointlessly whining.

The YAML developers were so ignorant of and incurious about past and existing technologies that they were under the mistaken impression that what they designed was a "markup language", so pathologically that two out of the four letters of its name are incorrect, and they had to change what the other two letters stood for retroactively after they finally figured it out. And the human interface -- the syntax of YAML itself -- suffered deeply because of that ignorance (as do its millions of users and apps and LLMs that still have to deal with those bad uninformed design decisions).

The Relax/NG developers knew what the hell they were doing because they had a long track record of working with real world markup language standards, and they eloquently and precisely described their motivations. They even KNEW what a markup language was, unlike the original YAML designers.

They deeply understood not only the history but the cutting edge competing technologies of the time, and Relax/NG greatly benefited because of it. They expertly performed, as you suggest, ''a full "forest level" comparison of all the trade-offs between the current'' RELAX/NG specification and many other schema languages past and present.

Bristling when somebody enthusiastically tells you about old technologies you should look at before designing new technologies is just self imposed ignorance, not a good look. Cultivating ignorance instead of learning from the past is what got the YAML designers in trouble, too.

So no, I'm not a Relax/NG evangelist, I have no stake in it, and I haven't used it for well over 15 years. And I've also used XML Schemas, which is another negative example of terrible language design, and why I appreciate Relax/NG so much. When I was learning and using Relax/NG, I read a lot of the papers and code on James Clark's web site, as well as the archives of the Relax/NG design group discussion (which were fascinating, if you're interested in user friendly language design and markup languages and schemas, which I am, and which this discussion is about).

http://www.jclark.com/

Of course I've written before, responding to other people's comments, about how Relax/NG compares to XML Schemas, the history of markup and schema and language design in general, and how amusing it is to compare the length of James Clark's Haskell implementation of Relax/NG with the length of his Java implementation (although all arguments about how terrible Java is these days can be instantly won by mentioning the word "lawnmower" before getting into the weeds of thorny technical issues):

https://news.ycombinator.com/item?id=28668752

But I'm not trying to convince you to use Relax/NG, I'm just suggesting you might learn from its design, and check out James Clark's design document as a shining example of how enlightened language designers should express their goals and intentions just like you said.

I'm sorry you can't or won't see it, but what I learned and shared about Relax/NG applies directly to this discussion about YAML and HUML's reaction to it, insofar as Relax/NG has a well designed, user friendly, human readable and writable compact syntax, which is perfectly compatible with its clumsy verbose XML syntax.

James Clark's paper, The Design of Relax/NG, is a perfectly on point illustrative example, directly addressing your comment about how language designers should "describe one's motivations, goals, tradeoffs, decisions, and then live with what you make".

https://relaxng.org/jclark/design.html

Did you bother reading or even skimming that excellent paper which exemplifies what you called for, before accusing me of not responding to your comment and hijacking your thread? Geez, sorry, dude. Read the paper, then you'll understand.

You can lead a horse to water...


Thanks for replying. I did indeed read your message but didn’t see a connection to the thread. Please don’t blame me for being honest about how your comment comes across (to me, and likely others, too).

Rather than ignoring your comment, I was open about how it came across. Some people don’t like that level of directness. It is hard to take feedback gracefully, I get it. A lot of people prefer to counterpunch.

Also, I think it’s probably a stretch to hope a person will read or skim through the quantity of words you’ve quoted at length and linked. Think about the readers. Whether you like it or not, you have to convince people that it’s worth their time, one way or another.

To use your metaphor, you didn’t lead the horse to the water. The horse didn’t see water; it saw a pile of undrinkable words.

What an awful phrase. Leading a horse to water is about giving an opportunity to drink. If you lead a horse to water and it doesn’t drink, it likely means it’s not thirsty! You don’t yell at the horse for not drinking it.

All this said, I do sincerely appreciate you taking the time to reply.


I find this very human-friendly: "[double colon] permits vectors to be defined inline without additional syntax such as [ ... ] or { ... }."

(One could question how human friendly it is to call lists and dicts "vectors" though...)

https://huml.io/specifications/v0-1-0/#why


It's especially clear in the "inline dict" example. I really like it!

  props:: mime_type: "text/html", encoding: "gzip" # Inline dict.


But that's not done for human readability, that's done for machine parsing? A human would understand just as well:

  props: mime_type: "text/html", encoding: "gzip" # Inline dict.


My mind has a feeling that this might be a possible interpretation:

    props: [mime: [html, encoding:[gzip]]]

(Even if that isn't legal, I have to backtrack and concentrate on this particular piece to be sure.)

There are many different humans. I definitely like the idea of separating “::” from “:”.


For me, it’s clearer with a double colon. Not intuitive, but extremely easy to get used to. When I see the first colon introduce a list, I have to go out of my way to not see the other colons as introducing lists.


I think we should have a name for the often undue examination and analysis of colon/semicolon usage in machine languages. I volunteer the name "colonoscopy."


And the obsession with inserting colons into things (and vice versa) should be called "colonialism".


Your point is funny, but if you're not anal-retentive about the syntax you get monstrosities such as the CSV-escape rule instead of the passwd-escape rule.


The double-colon is probably a necessity to disambiguate a scalar from an inline list that only has one scalar.

For example `x: 3` would be equivalent to `"x": 3` in JSON, but `x:: 3` is equivalent to `"x": [3]`
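To make that concrete, here is a toy Python sketch of the disambiguation. It is not the real HUML grammar or parser: `parse_line` is a made-up helper, and it only handles one line of simple JSON-style scalars with no commas or colons inside strings.

```python
import json


def parse_line(line: str):
    """Toy parser for `key: value` or `key:: v1, v2` (hypothetical, not real HUML)."""
    key, _, rest = line.partition(":")
    if rest.startswith(":"):
        # Double colon: everything after it is an inline list, even a single item.
        items = [json.loads(item.strip()) for item in rest[1:].split(",")]
        return key.strip(), items
    # Single colon: the value is exactly one scalar.
    return key.strip(), json.loads(rest.strip())


print(parse_line("x: 3"))         # ('x', 3)
print(parse_line("x:: 3"))        # ('x', [3])
print(parse_line("x:: 1, 2, 3"))  # ('x', [1, 2, 3])
```

With only a single colon available, the parser would have to guess whether a lone `3` means the scalar or a one-element list.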


A trailing comma would also solve this, but is equally awkward. A clear list delimiter would work for both machines and humans.


In case the author of HUML reads this: I would prefer mandatory braces for inline stuff. “X:: 3” feels like a trick question.


I found that example to be humorous, but specifically compared to the goals:

> Provide as few ways as possible—preferably one—of representing something.

Very Pythonic. Especially since representing a dict already has 2 ways, on the first page!


> Especially since representing a dict already has 2 ways,

And lists.

> Pythonic

Pythonic in the way that Python's 'There should be one...' is expressed through the existence of tuples, named tuples, dataclasses, regular classes and attrs (not part of the standard library but it seems to be as much of a goto as requests is)? ;)


You forgot to split out collections.namedtuple and typing.NamedTuple.

I love Python to death, but will readily admit that in no way is there “one obvious way” to do things.
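For anyone who hasn't run into them, a quick sketch of three of the overlapping standard-library ways to spell the same record (the `Point*` names are just placeholders for illustration):

```python
from collections import namedtuple
from dataclasses import dataclass
from typing import NamedTuple

# 1. collections.namedtuple: factory function, no type annotations.
PointA = namedtuple("PointA", ["x", "y"])


# 2. typing.NamedTuple: class syntax with annotations, still a tuple underneath.
class PointB(NamedTuple):
    x: int
    y: int


# 3. dataclass: a regular class with generated __init__/__repr__/__eq__.
@dataclass
class PointC:
    x: int
    y: int


print(PointA(1, 2), PointB(1, 2), PointC(1, 2))
# Both named tuples compare equal to a plain tuple; the dataclass does not.
print(PointA(1, 2) == PointB(1, 2) == (1, 2))
print(PointC(1, 2) == (1, 2))
```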


Also see string concatenation.


In my theory of "human-readability", odd double tokens like :: can exist successfully so long as there is sufficient utility and logic in the single token :

Tokens are inseparable from human instincts of single = less, double = more and the corresponding emotions: single=less=easier=quicker, double=more=complicated=longer=difficult

If you are not emphasizing the single token as the most common, it's going to cause confusion.


Maybe the plural of "is" is "are" :)

But why double the character instead of picking another one... plenty of other non-alphanumeric characters to choose from.


Language design often involves subjective tradeoffs. The author gives their rationale here: https://huml.io/specifications/v0-1-0/#why


Slightly off-topic, but I find the text of that anchor fragment amusing ("why? just why?") and a little disappointing. Automatic CMSs have robbed us of attention to detail - that anchor (for "Why `::`?") should probably be changed to `why-double-colon`.


There would at least be some ambiguity with single-value lists otherwise:

If

  numbers: 1, 2, 3

was a list, what would

  numbers: 1

be?


You could have it like in Python tuples where 1 is a scalar and 1, is a tuple.
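For reference, this is how the trailing comma behaves in plain Python (nothing HUML-specific here):

```python
scalar = 1         # an int
single = 1,        # the trailing comma makes a one-element tuple
triple = 1, 2, 3   # commas alone build the tuple; parentheses are optional
grouped = (1)      # parentheses without a comma are just grouping: still an int

print(type(scalar).__name__, type(single).__name__, type(grouped).__name__)
print(single, triple)
```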


True, but I think that wouldn't be any more intuitive or error safe than the :: syntax.


Error safe? With : versus ::, I kind of doubt it. I think that spotting a typing mistake (two colons instead of one) won't be easy.


The clear distinction between scalars and vectors appears to be the main advancement HUML offers.

I think it’s a neat improvement.



