Pipelining looks nice until you have to debug it. And exception handling is also difficult, because it means adding forks into your pipelines. Pipelines are only good for programming the happy path.
At the risk of making over-generalized pronouncements, ease of debugging usually comes down to how well designed your tooling happens to be. Most of the time the framework/language does that for you, but it's not the only option.
And for exceptions, why not solve it in the data model and reify failures? Push them further downstream and let your pipeline's nodes handle "monadic" result values.
Point being, it's always a tradeoff, but you can usually lessen the pain more than you think.
And that's without mentioning that a lot of "pipelining" is pure sugar over the same code we're already writing.
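A minimal Rust sketch of that idea (`parse_id` is a hypothetical stage): failures become ordinary `Result` values that travel through the pipeline as data, and a downstream stage decides what to do with them.

```rust
// Sketch: reifying failures as Result values so downstream stages
// handle them as ordinary data (names here are hypothetical).
fn parse_id(s: &str) -> Result<u32, String> {
    s.parse::<u32>().map_err(|e| format!("{s:?}: {e}"))
}

fn main() {
    let inputs = ["1", "two", "3"];
    // Failures flow through the pipeline instead of unwinding it;
    // partition splits successes from errors at the very end.
    let (ok, err): (Vec<_>, Vec<_>) = inputs
        .iter()
        .map(|s| parse_id(s))
        .partition(Result::is_ok);
    let ok: Vec<u32> = ok.into_iter().map(Result::unwrap).collect();
    println!("parsed: {ok:?}, failures: {}", err.len());
}
```

No `try`/`catch` fork anywhere in the chain; the "unhappy path" is just another column of the data.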
Pipelining simplifies debugging. Each step is obvious and it is trivial to insert logging between pipeline elements. It is easier to debug than the patterns compared in the article.
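In Rust iterator chains, for instance, `Iterator::inspect` exists precisely for this: you can drop a log line between any two stages without restructuring anything. A small sketch:

```rust
fn main() {
    let total: i32 = (1..=5)
        .filter(|n| n % 2 == 1)
        .inspect(|n| eprintln!("after filter: {n}")) // logging between stages
        .map(|n| n * n)
        .inspect(|n| eprintln!("after map: {n}"))    // and again, one line later
        .sum();
    assert_eq!(total, 35); // 1 + 9 + 25
}
```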
Exception handling is only a problem in languages that use exceptions. Fortunately there are many modern alternatives in wide use that don't use exceptions.
This is my experience too - when the errors are encoded into the type system, this becomes easier to reason about (which is much of the work when you’re debugging).
I've encountered and used this pattern in Python, Ruby, Haskell, Rust, C#, and maybe some other languages. It often feels nice to write, but reading can easily become difficult -- especially in Haskell where obscure operators can contain a lot of magic.
Debugging them interactively can be equally problematic, depending on the tooling. I'd argue it's commonly harder to debug a pipeline than the equivalent imperative code, and that in the best case it's equally hard.
I don't know what you're writing, but this sounds like language smell. If you can represent errors as data instead of exceptions (Either, Result, etc) then it is easy to see what went wrong, and offer fallback states in response to errors.
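A tiny Rust sketch of that (`port_from` and the 8080 fallback are made up): the error is just a value, and the fallback state lives at the end of the chain.

```rust
// Sketch: failure represented as data (Result), with a fallback
// supplied in the chain instead of caught as an exception.
fn port_from(s: &str) -> u16 {
    s.parse::<u16>()     // Result<u16, ParseIntError>
        .unwrap_or(8080) // fallback state on error
}

fn main() {
    assert_eq!(port_from("3000"), 3000);
    assert_eq!(port_from("oops"), 8080); // the error path, handled as data
}
```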
Programming should be focused on the happy path. Much of the syntax in primitive languages concerning exceptions and other early returns is pure noise.
Established debugging tools and logging conventions are not well suited to heavily pipelined code. Stack traces and debuggers rely heavily on line-based references, which are less useful in this style and can make diagnostic practices feel a little clumsy.
The old adage of not writing code so smart you can’t debug it applies here.
Pipelining runs contrary enough to standard imperative patterns that you don't just need a new mindset to write code this way: you need to think differently about how you structure your code overall, and you need different tools.
That’s not to say that doing things a different way isn’t great, but it does come with baggage that you need to be in a position to carry.
Pipelining is just syntactic sugar for nested function calls.
If you need to handle an unhappy path in a way that isn’t optimal for nested function calls then you shouldn’t be nesting your function calls. Pipelining doesn’t magically make things easier nor harder in that regard.
But if a particular sequence of function calls does suit nesting, then pipelining makes the code much more readable, because you're not mixing right-to-left syntax (function nests) with left-to-right syntax (i.e. your typical language syntax).
Nested loops aren’t pipelining. Some of the examples make heavy use of lambdas, so they do have nested loops happening as well, but in those examples the pipelining logic is still the nesting of the lambda functions.
Crudely put, in C-like languages, pipelining is just a way of turning
fn(fn(fn()))
…where the first function call is in the innermost, right-most parentheses,
into this:
fn | fn | fn
…which can be easily read sequentially from left-to-right.
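In Rust terms (the example functions here are mine), the two spellings invoke exactly the same calls:

```rust
fn double(n: i32) -> i32 { n * 2 }
fn inc(n: i32) -> i32 { n + 1 }
fn square(n: i32) -> i32 { n * n }

fn main() {
    // Nested: read inside-out, right to left.
    let a = square(inc(double(3)));
    // Chained: same functions, same order, read left to right.
    let b = Some(3).map(double).map(inc).map(square).unwrap();
    assert_eq!(a, b); // both 49
}
```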
Pipelining doesn’t do anything with iterating. It’s entirely about linking nested functions.
What you’re looking at is loops defined inside lambda functions. Pipelining makes it much easier to use anonymous functions and lambdas. But it doesn’t magically solve the problem of complex loops.
It kind of is something related to loops if the language supports object iterator interfaces (they bridge OOP to classical constructs like for/foreach). Or maybe even generators.
It does not solve it magically, but it does give the programmer options to coalesce different paradigms into one single working implementation.
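For example, in Rust any type implementing the `Iterator` trait works with both the classic `for` loop and pipeline-style chains. A sketch with a made-up `Countdown` type:

```rust
// Sketch: a hypothetical Countdown type implementing Iterator, so the
// same object bridges classic constructs and pipeline-style chains.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 { None } else { self.0 -= 1; Some(self.0 + 1) }
    }
}

fn main() {
    // Classic construct:
    for n in Countdown(3) {
        println!("{n}"); // prints 3, 2, 1
    }
    // Same object, pipeline style:
    let evens: Vec<u32> = Countdown(5).filter(|n| n % 2 == 0).collect();
    assert_eq!(evens, vec![4, 2]);
}
```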
> It's only magic if you don't understand how it works.
That’s why I used scare quotes around the term ;)
> To me it is awkward to describe but simple to understand.
It’s not awkward to describe though. It’s literally just syntactic sugar for chaining functions.
It’s probably one of the easiest programming concepts to describe.
From our conversation, I’m not sure you do understand it, because you keep bringing other tangential topics into the fold. Granted, I don’t think the article does a great job of explaining what pipelining is, but then I think its point was more to demonstrate cool syntactic tricks you can pull off when writing functions as a pipeline.
edit: just realised you aren't the same person who wrote the original comment claiming pipelining was about iteration. Apologies for getting you mixed together.
If you're going to be snarky then at least get your facts right.
`Iter` is a method of `data`. And do you know what a method is? It's a function attached to an object. A FUNCTION. Pipelining is just syntactic sugar around chaining functions.
You even proved my point when you quoted the article:
Literally the only thing changing is the syntax of the code. You've got all of the same functions being called, with the same parameters and in the same order.
The article itself makes no mention of this affecting how the code is executed either. Instead, it talks about code readability.
In fact the article further proves my point when it says:
> You can, of course, just assign the result of every filter and map call to a helper variable, and I will (begrudgingly) acknowledge that that works, and is significantly better than trying to do absurd levels of nesting.
What it means by this is something like the following:
list = iter(data)
list = map(list, |w| w.toWingding())
list = filter(list, |w| w.alive)
list = map(list, |w| w.id)
result = collect(list)
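In actual Rust, with a hypothetical `Widget` type standing in for the article's data, the helper-variable layout and the chained layout make exactly the same calls:

```rust
// Hypothetical Widget standing in for the article's data.
struct Widget { id: u32, alive: bool }

fn main() {
    let data = vec![
        Widget { id: 1, alive: true },
        Widget { id: 2, alive: false },
        Widget { id: 3, alive: true },
    ];

    // Helper-variable form:
    let step1 = data.iter().filter(|w| w.alive);
    let step2 = step1.map(|w| w.id);
    let result: Vec<u32> = step2.collect();

    // Chained form -- same functions, same order, different layout:
    let chained: Vec<u32> =
        data.iter().filter(|w| w.alive).map(|w| w.id).collect();

    assert_eq!(result, chained); // both [1, 3]
}
```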
While I do have some experience in this field (having written a pipeline-orientated programming language from scratch), I'll cite some other sources too, so it's not just my word against yours:
The reason the article focuses on map/reduce type functions is because it's a common idiom for nesting commands. In fact you'll be familiar with this in Bash:
cat largefile.txt | sort | uniq --count
(before you argue about "useless use of `cat`" and other optimisations that could be made, this is just an example to demonstrate my point).
In here, each command is a process but analogous to a function in general-purpose programming languages like Rust, LISP, JavaScript, etc. Those UNIX processes might internally loop through the contents of STDIN as an LF-delimited list, but that happens transparently to the pipeline. Bash, when piping each command to the next, doesn't know how each process will internally operate. And likewise, in general-purpose programming language world, pipelines in LISP, Rust, JavaScript (et al) don't know nor care how each function behaves internally with its passed parameters, just so long as the output data type is compatible with the data type of the next function -- and if it isn't, then that's an error in the code (ie a compile-time error in Rust or a runtime error in JavaScript).
So to summarise, pipelining has nothing to do with iteration. It's just syntactic sugar to make nested functions easier to read. And if the examples seem to focus on map/reduce, it's just because that's a common set of functions you'd want to chain, and which are particularly ugly to read in nested form. Ie they're an example of functions called in a pipeline, not the reason pipelines exist, nor proof that pipelines themselves have any internal logic around iteration.
Yeah, you are right, it's about syntactic sugar; I didn't read the article beyond the first two examples.
Pipelines are about iteration, of course. And they do have internal logic around iteration.
cat largefile.txt | sort | uniq --count
is an excellent example. While cat and uniq --count can process their input sequentially, sort has to buffer its entire input and allocate additional structures.
The pipeline isn’t doing any of that though. The commands are. The pipeline is just connecting the output of one command to the input of the other.
Iteration is about looping, and if the pipeline in the above example were some kind of secret sauce for iteration, then the commands above would fork multiple times, but they don’t.
Depends on the context - in a scripting language where you have some kind of console, you just don't copy all the lines at once: you paste the pipeline up to a given stage and see what each pipe does, one after another. This is pretty straightforward.
(Not talking about compiled code though)