Hacker Newsnew | past | comments | ask | show | jobs | submit | peter-fogg's commentslogin

OK, let's talk about parsing.

The article claims that parsing isn't necessary with structured editing. We're working on some abstract data structure representing our language's syntax. This might be an AST, like any programming language in common use today, or maybe in the future we've thought up some better way to represent programming languages within the compiler/interpreter. Since we're working directly with the AST or whatever, we don't need to parse! We just do whatever actions the programmer wants do over on our data structure.

Now, let's talk about how parsing works in your language of choice in 2015. We're going to take a string, and we'll turn it into a representation of our program. If we're lucky, our language implementation isn't awful, and our parser will be a function from some sort of Unicode-encoded text into an AST. OK, maybe our input is ASCII or some other text encoding, but the point is the same. Either way, we're taking some keypresses from the programmer, and turning them into a data structure representing the program.

Let's abstract a bit. Maybe in the future we don't use keyboards. Instead, we have whatever peripheral you like. This peripheral is capable of sensing some sort of action from the user, and turning it into actions within the computer. So now, our parser is a function from some user action to a data structure representing the program.

This sounds an awful lot like "we just do whatever actions the programmer wants do over on our data structure". The question is, how do we figure out what the user wants? The answer is, we parse it! It doesn't matter if we're parsing text or not. We will always need some way of determining the programmer's intention from the signals we get through their peripheral device. Viewed through this lens, the camera requires a parser just as much as the keyboard does, just as much as the mousepad does, just as much as the microphone does. We will always need a way to convert the unstructured thoughts of a human programmer into the formal language of a compiler's internals. Regardless of what representation we choose for any part of this process, it's going to be subject to the article's quote:

> This situation is a recipe for disaster. The parser often has bugs: it fails to handle some inputs according to the documented interface. The quoter often has bugs: it produces outputs that do not have the right meaning. Only on rare joyous occasions does it happen that the parser and the quoter both misinterpret the interface in the same way.


No. There is a large difference between parsing commands which affect the structure of a program and directly parsing the structure itself. For one, commands that affect the structure in an invalid way can be rejected. Second, commands can be stored so that a history of source code manipulations can be replayed (which while possible via text ala git is very lossy and subject to merge issues - as stated in the article). In addition, commands are limited by context. So when parsing the user's input we now have a limited context to work within greatly improving accuracy and also simplifying the input space of the user.

These are just a few of the reasons...


> 2. benign effects: in Haskell "proper", you do not have effects; rather you have "codes" for effects, which get interpreted into effects by the RTS; this rules out the possibility of, e.g., using effects to implement a semantically pure interface. On the other hand, OCaml has actual effects, which can be used in an open-ended way to implement all sorts of functional interfaces.

So, to be all pedantic and stuff... You've always got unsafePerformIO, first off. This is used in Debug.Trace (https://hackage.haskell.org/package/base-4.7.0.2/docs/src/De...) to allow printf debugging in pure code. This is also used in some Haskell libraries to provide restricted effects in monads other than IO. But if you don't want to use unsafe* functions, you can always use ST to get direct access to mutable memory in a safe way. If you can runST within that interface, then you can present a pure interface to users.


1. that's why I qualified it "proper". Of course there is unsafePerformIO, but in the presence of laziness, this is really unsafe. as opposed to ML, where doing IO may harm your ability to reason about stuff, but it won't be unsafe.

Debug.Trace is also unpredictable; actually, it's perfectly predictable if you understand laziness, but it's still not what's really wanted in most cases.

2. ST is a single instance of an effect that can be interpreted into "pure" code. There are plenty of other effects that don't work like this in Haskell, not to mention the problem of composing them together.

So thanks for chiming in! But what you have said is not really that different from what I have said.


I would love to talk to you a little about these kinds of things. I've been digging more seriously into OCaml the last few weeks and would love some signposts for making the Haskell->OCaml transition.


Hi! Please feel free to email me at any time. I can't promise that I will know everything you want to know (since I only know very little), but I would be happy to help with what I do know. jon [at] jonmsterling [dot] com.

BTW, I looked at your OCaml stuff yesterday, and it seemed pretty cool.


Could you say something about why `unsafePerformIO` is unsafe in the presence of laziness?

I know `unsafePerformIO` can be used to violate the type system in combination with `IORef`s, for example, but I don't see what laziness has to do with it.


In the presence of laziness, it can be very difficult to reason about what code executes when. When you don't know when your unsafe IO happens, things can get out of order or have other unforeseen side effects.


With regards to 2., could you briefly mention some other instances? I have no idea what I should be thinking of.


In this case, the dragon is slain without much trouble. In LVish (https://github.com/iu-parfunc/lvars) we've got an effect-tracking system that allows us to ensure determinism (read more here: http://www.cs.indiana.edu/~lkuper/papers/dissertation-draft-...). This would be pretty easily extended to only allow computations that are safe to log. And this is all in GHC Haskell! No external static analysis is necessary; the compiler can do it for us.


The dragon cannot be slain, only pushed back. It is the halting problem.

Yes, further automated analysis can find some of these problem. That's what the author of the essay proposes, and it's good to know that there are some solutions.

Let's go down the road further and expand the system so there are multiple loggers, and where subsystem logger configuration via an external configuration file occurs at run-time. Some configurations are acyclic, others are not.

This could be prohibited by edict - if a program cannot be analyzed then it's not valid. As the essay points out, there are also formal verification techniques - and that very few people use them. As I pointed out, it's surprisingly easy to make a Turing machine by accident. (See http://beza1e1.tuxen.de/articles/accidentally_turing_complet... and its HN comments https://news.ycombinator.com/item?id=6577671 .)

In practice then, it's very unlikely that most complex software will be able to use these tools, because it's hard to restrict the solution space to what those tools can analyze.

Here there be dragons.


Ruby and JS are both Turing-complete languages (as with just about any other programming language you'll ever use), which means that one can simulate the other. The simulation might be messy, complicated, or slow, but it will still work. In this case, Ruby and JS are close enough that it can be done without too much trouble (see the compiled JS).


But does "anything written in language A can be simulated using language B" imply "A can compile down to B"?


Yes, it does. It may not be easy, efficient, or elegant, but theoretically any problem solvable by a turing complete language can also be solved by any other turing complete language.



yes. If you can write a body of code X in language B which performs the same operations as any given body of code Y in language A, then all an A->B compiler needs to do is find X given Y.


Ok, but is finding X given Y tractable?

To me, it seems like compiling is translation of source (usually higher level source to lower level source), whereas simulation is more like translation of behavior.

Edit: and the ability to translate behavior does not seem to imply the ability to translate source


Couldn't you translate source given the ability to translate behaviour? If some behaviour in language X can be simulated by a state machine in Y, then you could just emit said state machine as the translation of the behaviour, give its output to the next state machine, etc. And gradually build up the behaviour of a program in X, but using a bunch of generated code in Y instead of just an interpreter. In other terms, if an interpreter for / simulation of X may be expressed in Y, then for each bytecode / AST node in X you could just inline the bytecode handling you would use in Y into the compiled output.


I think thats a very clever algorithm.

But I think your first articulation amounts to converting a turing machine to a huge generated state machine, which would be impossible. The behavior of the X program given one input could be modeled as a sequence of states if it terminates. But the process of generation that you described would have to be repeated for all possible inputs into the simulated X program, which would be impossible.

Maybe a single "behavior" could be modeled by a state machine, but it would also seem impossible to me to decompose a program into individual units of behavior.

But as for bytecode -- one could always compile the X code into bytecode, and then decompile that bytecode into Y code. If you can always find a common bytecode language for two languages X and Y, then I think that would be a generic algorithm for X->Y.

Edit: I guess that common bytecode could just be in a language that describes a turing machine.


It's worth remembering that a large portion of blue-collar criminals are arrested for possession of, say, a small amount of marijuana. There's a big difference between mugging someone and lighting a joint.


Not to be contrarian or anything, but why isn't Pamela Fox mentioned in the title? Seems odd that she's the only omission from the photos on the front page.


Good question. When I tried to post this initially, her name was in it. But the length of the title was too long. I figured that I would just keep the creators of programming languages/libraries/frameworks since this is a site about that.


I understand your point (and just to be pedantic, only one of them actually created a language), but this is not a site about "that", it's a site about tech and while Pamela Fox has not created a popular lang/framework (not that I know at least), she has contributed a lot in many other aspects, specially when trying to educate people around tech, one example of that is her involvement in the girldevelopit chapter in SF.


Totally agree. Wish I could edit the title.


Use short names in cases like this:

Guido, John Resig, DHH and Pamela Fox


GvR


Administration: If possible, can we please have the title manually edited to include everyone?

Thanks.


Shortening David Heinemeier Hansson to DHH should do the trick for getting another name in there.


I believe only one of the listed people are the creators of languages (Rails is a framework, jQuery is a library).


No need to be nitpicky about it, he probably meant that...


Just edited that.


Only one of the four without a wiki page?


I'm wondering the same thing.


Doesn't seem to be working in Firefox 25 on OS X.

    [00:06:34.535] SyntaxError: Using //@ to indicate source map URL pragmas is deprecated. Use //# instead @ http://ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js:1

    [00:06:35.468] Use of Mutation Events is deprecated. Use MutationObserver instead. @ chrome://divxhtml5/content/script.js:109
Chrome is good, though.


Browser support is listed in the README. https://github.com/hubspot/signet#support

I'd love to add additional support for Firefox. But unfortunately, Firebug doesn't support the full set of CSS properties in the console that are needed to pull off this effect.

That being said, we're open to PRs if you have a solution for FF!

Edit: We're working on the SyntaxError and MutationEvent warnings you mentioned. Follow here: https://github.com/HubSpot/signet/issues/3


Huh. After trying this in Chrome it seems a lot less pointless. I wondered why you needed a JS library just to console.log a list of strings :)

(Apparently they modified it to degrade to that after you posted.)


yeah, not working in lynx, either.


cURL here, not working either.


Broken in wget.


I'm pretty sure neither of those errors are coming from signet.


Also can confirm that chrome works


But in a situation like that, you could just annotate the type. In Haskell:

    add :: Int -> Int -> Int
    add a b = a + b


That 90% can have boyfriends, too, you know.


Most certainly, but keep in mind less than 5% of (American) men are gay.

(http://www.theatlantic.com/politics/archive/2012/05/american...)


It's underspecified anyway -- we need to know about the motion of rabbits, the troll, the density of black holes, etc.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: