
> Dr Johnson kicked a large rock and said, as his foot rebounded, “I refute it thus.”.

Well, I don't really like that: it just doesn't look nice to me. I would suggest one of the following:

Dr Johnson kicked a large rock and said, as his foot rebounded, “I refute it thus”.

You've chosen not to quote the full stop. There's no law that says you have to include everything in the quotation, right?

Dr Johnson kicked a large rock and said, as his foot rebounded: “I refute it thus.”

This also works if the quoted sentence ends with a question or exclamation mark.

It's a shame that punctuation of human languages can't be logical, but it seems that we're stuck with inconsistent requirements and messy compromises. Cases like the following really confuse and annoy me:

“On the other hand[,]”[,] she said, “we could wait till dark[.]”[.]

(Should that depend on whether the original spoken sentence would, if written, contain a comma after "hand"?)



But what would you do if Dr Johnson was surprised, and you yourself were shouting, i.e.:

Dr Johnson kicked a large rock and said, as his foot rebounded: “I refute it thus?”!

AFAICS, the only way to render this faithfully is the way I just did. In other words, you really do need the punctuation both of the outer sentence and the inner sentence. By extension, the only logical approach for the original sentence would be:

Dr Johnson kicked a large rock and said, as his foot rebounded: “I refute it thus.”.

On a different note, might I use this moment to complain about American books not closing quotations when they continue onto a new paragraph, and then opening them again? I.e.:

John said: "I have two things to say.

"One of these things is this.

"The other thing is this."


The quotation thing is irritating if you treat quotation marks like matched parentheses, but if you allow the opening and closing quotes to have different meanings, there is a logical interpretation. The opening quote is required syntax for the beginning of any quoted paragraph, so that the reader is reminded that we're still in an extended quote. The closing quote means "this person is finished speaking, and the next quote may be assumed to be a different person." The advantage is the streamlining of longer exchanges:

John spoke to Paul. John said: "I have two things to say.

"One of the things is this."

"What's the other?"

"The other thing is this."

Even in the purest programming languages, we're happy to design special-case idioms that sacrifice perfect orthogonality for better human factors, provided there's an unambiguous parse. Scheme provides (define <identifier> <expression>) - utterly elementary. Yet defining functions by binding identifiers to anonymous lambdas is so annoying that an unnecessary and inconsistent second syntax is provided, (define (<identifier> <args...>) <expression>).


Eh? Standard American and British usage is the same with regard to quotes that span multiple paragraphs. Given that it's understood that speakers can alternate without each quote being attributed, e.g.:

Bob said: "Any opinion on this, John?"

John said: "I have two things to say."

"What are they?"

"One of these things is this.

"The other thing is this."

– how would you punctuate that? If you close each paragraph with a quote, then there's no way to tell who's speaking except to label each paragraph:

Bob said: "Any opinion on this, John?"

John said: "I have two things to say."

Bob asked: "What are they?"

John answered: "One of these things is this."

John continued: "The other thing is this."

And if you don't open each quoted paragraph with a quote, it's very hard to tell which paragraphs are quoted:

John said: "I have two things to say.

One of these things is this.

The other thing is this."

The third thing, he kept to himself.


And what about when the quoted text is two sentences? Do we write a full stop for the first sentence, but not the second?


> There's no law that says you have to include everything in the quotation, right?

Oh, if only languages worked so consistently.

  DESCRIPTION_OF_WHAT_WAS_SAID: "$(DIRECT_QUOTE.)".
Then again, in programming things don't always nest either, since subshell calls have all sorts of oddities with regard to escape sequences.
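
To make the subshell point concrete, here is a minimal Python sketch. It assumes a POSIX shell and uses the standard library's shlex.quote; the nest helper and the sample command are made up for illustration. Each extra layer of sh -c has to re-quote the layer beneath it, so the punctuation piles up much like the nested quotation marks discussed above.

```python
import shlex

def nest(command: str, depth: int) -> str:
    """Wrap `command` in `depth` layers of `sh -c`, quoting at every layer.

    `nest` is a hypothetical helper for this illustration, not a library call.
    """
    for _ in range(depth):
        # Each layer must quote the whole command it contains,
        # including the quoting added by the layers below it.
        command = "sh -c " + shlex.quote(command)
    return command

inner = "echo 'I refute it thus.'"

for depth in range(3):
    # Print how the same command looks at each nesting depth.
    print(depth, nest(inner, depth))
```

The inner command never changes, but its written form gets longer and harder to read at every level.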

Has there ever been a case of anything nesting nicely, be it in languages, programming, or any other medium of writing? Maybe XML? I'm not sure.


I guess that's what XML namespaces were supposed to allow.

Reality seemed to involve eldritch abominations like one system I encountered that had entire Base64 encoded XML documents embedded as attribute values in a higher level document and then this approach applied recursively....

Edit: Of course, this wasn't XML's fault - but for some reason a lot of XML used in the "enterprise" world seemed to be primarily designed to eat the soul of whoever gazed upon it.


> Has there ever been a case of anything nesting nicely, be it in languages, programming, or any other medium of writing? Maybe XML? I'm not sure.

JSON?


Not quite; in practice you still occasionally stumble upon situations like these: https://stackoverflow.com/questions/51974631/how-do-i-proper...

Or this: https://github.com/spaceghoul/json_deep_parse#example-usage-...


I see what you mean, but the first appears to be a PHP bug (unless I am misreading?).

The second appears to be a tool for parsing a JSON blob which has been escaped and encoded as a simple string inside another JSON blob. That's certainly an interesting problem, and one that is likely to come up in a sufficiently complicated world. However, it's not an issue with parsing JSON. It's an issue with parsing /any/ data structure or language that may contain strings, and as such seems unavoidable.
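
For anyone who hasn't run into this, a small Python sketch of the JSON-inside-a-JSON-string situation, using only the standard json module. The "payload" field name and the two helpers are assumptions made up for the example; they are not taken from the linked tool.

```python
import json

def embed_json(document: dict, depth: int) -> str:
    """Serialize `document`, then wrap it `depth` times as a string field.

    The "payload" key is a hypothetical name chosen for this illustration.
    """
    text = json.dumps(document)
    for _ in range(depth):
        # The previous layer becomes an ordinary string value,
        # so every quote and backslash in it gets escaped again.
        text = json.dumps({"payload": text})
    return text

def extract_json(text: str, depth: int) -> dict:
    """Undo `embed_json`: peel off `depth` string layers, then parse."""
    for _ in range(depth):
        text = json.loads(text)["payload"]
    return json.loads(text)

doc = {"quote": 'He said "I refute it thus."'}

for depth in range(3):
    # Print the serialized form at each nesting depth.
    print(depth, embed_json(doc, depth))
```

The round trip is lossless, but the reader has to know how many layers to peel, and the backslashes multiply at each level.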


> It's an issue with parsing /any/ data structure or language that may contain strings and as such seems unavoidable.

Except for a data format which would allow embedding data in a nested fashion without altering it in any way. For example:

  some_object_field: "some value"
  some_other_field:
    with_sub_objects:
      and_sub_fields: "with values"
  and_also_fields:
    """
      which_allow_objects:
        embedded_as_strings: "without transforming the structure"
        which_both: "JSON and XML"
        have_somewhat: "failed to do"
      #""" control sequences should also be valid in the body, as long as there is proper indentation, a la Python
      # which could then be simply stripped for display, for example, based on the first """ having N indentation
      # then it would follow that the rest of the data entries have N+TAB_WIDTH, which could be simply stripped
      # also, processing the beginning of every line would be less expensive than iterating through the entire line in search of escaped \n or anything of the sort
    """
Multi-line strings in JSON are also an embarrassment: https://stackoverflow.com/a/2392888

As a consequence, the amount of parsing and processing that you need to do is really bad for performance. Of course, there are formats like YAML and TOML that go in the opposite direction - they try to cover all use cases and end up being overcomplicated.
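
A rough Python sketch of the indentation idea from the example above, under the assumption that an embedded block is delimited purely by its indentation and that one level is two spaces. The helper names and the INDENT constant are hypothetical. The body is never escaped, only shifted, so quotes, backslashes and even the """ marker pass through untouched.

```python
INDENT = "  "  # assumed width of one nesting level

def indent_embed(text: str) -> str:
    """Embed `text` one level deeper by prefixing every line with INDENT."""
    return "".join(INDENT + line for line in text.splitlines(keepends=True))

def indent_extract(block: str) -> str:
    """Recover the embedded text by stripping one INDENT from every line."""
    return "".join(line[len(INDENT):] for line in block.splitlines(keepends=True))

original = 'embedded_as_strings: "without transforming the structure"\n""" \\n stays as typed\n'

# Embed the same text two levels deep.
twice = indent_embed(indent_embed(original))
print(twice)
```

Only the start of each line is touched, which is the cheaper processing the comment in the example is hinting at.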

There are occasionally other attempts like JSON5 to improve things: https://json5.org/

However, those are often not very popular either, because too much of the ecosystem has been built around the older formats, like XML, JSON and even YAML.


Typographically, the first comma is redundant, as it's implied in the break:

"On the other hand", she said, "we could wait 'til dark".


> Well, I don't really like that: it just doesn't look nice to me.

Well, I don't really like that; it just doesn't look nice to me.


> It’s a shame that the punctuation of human languages can’t be logical.

    s/the punctuation of //


Lojban is a human language, even if it is not a natural language.


Lojban also hasn’t shown that it would retain its principles after years of broad day-to-day usage. And I dare speculate that it would not.



