Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Pass out-parameters by pointer: Widget*.

OP says this like this is an industry rule of sorts but I've pretty much always seen non-const references for this. Using pointers is terrible because now you have to add a check for null inside your function, as its API contract says "I can accept null values", while references make it clear that it does not make sense to call if you don't have a correct object to pass it.

> That’s right: the point of making a class with private members is to preserve invariants among those members.

How would OP make a class, which loads content on construction which must absolutely not being changed after construction then, even within its implementation ? "Being careful" is really not sufficient in my experience, and introducing child classes just to enforce the invariant that "const" provides is actively harmful to readability.



I absolutely hate non-const reference parameters. Personally, I never use them except for operator overloading or stl functions.

The reason is, as lelanthran said, that when I see func(param), I don't expect param to be modified.

The time I lost because of non-const reference parameters adds up to days, maybe weeks. In fact, bugs caused by stealthy references are often way harder to find than null pointer crashes. Null pointer crashes are among the bugs that concern me the least, sure crashes are bad, but the core dump often tells you exactly what's wrong, and they rarely lead to non-DoS exploits. See: offensive programming.

Now that I think of it, maybe IDEs should highlight non-const references in function calls.

I think it is unfortunate because I like the extra guarantees that references offer over pointers but the lack of explicit (de)referencing is enough for me to avoid.


> The time I lost because of non-const reference parameters adds up to days, maybe weeks.

Well, we have wildly different experiences, I can't remember a single instance where this was an issue in the projects I've been involved with.


And I believe you, here are the possible explanations:

- Personal experience: it is not a construct I use, and most of the code I work with don't use it either, so when it happens, it is a surprise. People who come from a pointerless language like Java may expect it.

- I work with a lot of messy code: large code bases I barely know, written by many people from different companies, without a style guide. So one function may use references and the next one may use pointers, adding to the confusion.

- Of course, documentation is misleading (to the point that I make a conscious effort not to read it), and so are function names. Things like getChild(n) that increments n when pretty much every getChild(n) function everywhere passes n by value. Terrible code, that's why there is a bug in the first place, bugs are usually not in the best written parts.

- I actually like jumping in other people terrible code and fix bugs, it is an interesting challenge and I became rather good at it where I work. As a result, I am often given these kinds of job. A few days is not that much compared to my entire experience of debugging terrible code, but enough to be significant.

- Of course I spend a lot more time fixing memory corruption (and the problem with stealthy reference is that it can look like memory corruption). But in my experience, references over pointers don't help that much with it. It helps against null pointer dereferencing, but as I said before, these are usually easy bugs to fix.


> Now that I think of it, maybe IDEs should highlight non-const references in function calls.

CLion highlights this with an off-color "&:" in front of the parameter.


C# requires explicit "out" on call site, I wonder if you could make a macro and lint rules for C++ to require this :) Haven't touched C++ in almost 10 years so not up-to-date on how good/extensible linters are.


You can create a trivial wrapper:

   https://godbolt.org/z/c5YeP6feG
The usual issue with these sort of things is that they are not idiomatic (but then again, there is no single idiomatic C++).

edit: std::reference_wrapper is of course the standard blessed way to do this, but still not idiomatic.


> Now that I think of it, maybe IDEs should highlight non-const references in function calls.

This is a great idea. When I type `ls` in a shell, I get important information from the colors of the output: one color for normal files, another for executables, another for directories, etc.

This would work well for parameter types, even if it were as simple as "white means it cannot be modified (pass by value, const reference, etc.) and yellow means it can (non-const reference, pointer, etc.)"

I wonder if this exists?


I believe it's called "semantic highlighting".


It would be really nice for C++ to add a mandatory “out” keyword for these parameters, just like C#. In that way you are forced to specify that this is an output parameter both at the caller and the callee, which makes things much more explicit and readable.


This is a great talk by Herb Sutter that goes into this in detail https://www.youtube.com/watch?v=6lurOCdaj0Y


Or, don't use out-parameters unless you really have to.

If a function returns a value, have it return that value, instead of using an out-parameter. Passing in-parameters by const reference rather than by value is a nice win because the semantics of the function remain clear. Input parameters still look like input parameters, you're just preventing unnecessary copies where it's easy to do so. Returning a value as an out-parameter rather than just returning it is a performance hack that prevents a copy, but at the expense of muddying the semantics of the function.

As the prophet says: we should forget about small efficiencies, say about 97% of the time. Write code that clearly expresses what it does. Return your return values. If you run a profiler over your code and you notice that some returns are causing a performance problem - sure, at that point, add an overload that takes an out-parameter and use it at the frequenty-executed hot spot.


The most common usage of out-parameters is when you want to return more than one thing from a function, in which case they can't be trivially converted to return values.


Surely in C++ you can return a tuple. It's only in C you can't.


Sure, but that doesn't always help readability/clarity. Compare this:

    SomeType result;
    if (!do_something(&result))
      return;
    process(result);
with this:

     std::tuple<bool, SomeType> ret = do_something();
     if (!std::get<0>(ret))
       return;
     process(std::get<1>(ret));
You could improve it a bit by using `auto` or returning a struct with named members, but quite often I still find it less readable. Especially in situations that are more complex than this example, where e.g. you fallback to another function to fetch `result` if `do_something` failed.


If there are only two status codes (success/failure) then it seems like optional would be more appropriate than a tuple of bool and SomeType.

    const std::optional<SomeType> result = do_something();
    if (!result.has_value()) {
        return;
    }
    process(result.value());
This is nicer than the output parameter in my opinion, because it may be unclear what the meaning of a default constructed SomeType is. Default values are sometimes dangerous if the default is a real, valid input - errors and invalid states can end up being passed through.

C++ doesn't have great support for sum types, so this is all always going to be pretty non-ergonomic, and the compiler doesn't always warn if you forget to do things. I might even say that the main advantage of using sum types is gone in C++ - you actually can forget to check has_value and call operator* on an empty optional.

It would be so easy for someone to make a mistake and write:

    process(*do_something());
And invoke undefined behaviour. Whereas processing a default-constructed SomeType is at least not undefined behaviour (depending on what SomeType is, and what process does...).

This whole thing is a mess, I hate C++, thinking about UB all the time...


for performance-oriented code, the original example, where the argument is stack-allocated in the caller, passed by reference, and used based on the return value, has benefits that can't be replicated in the std::optional<> version.

I like the std::optional<> version and use that pattern some of the time. But if the object being passed/returned to/from do_something() is non-trivial, i don't want the copy overhead.


> for performance-oriented code, the original example, where the argument is stack-allocated in the caller, passed by reference, and used based on the return value, has benefits that can't be replicated in the std::optional<> version.

Can you go into detail? What are the benefits that can't be replicated?

If anything, the std::optional version seems better, since we can avoid an invocation of the SomeType constructor. Otherwise, everything is the same - everything's allocated on the stack, there is no dynamic allocation.

> If an optional<T> contains a value, the value is guaranteed to be allocated as part of the optional object footprint, i.e. no dynamic memory allocation ever takes place. Thus, an optional object models an object, not a pointer, even though operator*() and operator->() are defined.

That's from here: https://en.cppreference.com/w/cpp/utility/optional

Since C++17, RVO is required by the standard too:

> Return value optimization is mandatory and no longer considered as copy elision; see above.

From here: https://en.cppreference.com/w/cpp/language/copy_elision

I admit I'm not really a C++ expert, so maybe I'm missing something.


Yes, in C++17, RVO would fix the issue. Alas, some of us are stuck with project conventions that have not reached "use C++17". Without RVO, there's an extra copy.


RVO is not new in C++17 though, only the standard requiring it in some cases.

Unless your compiler is truly ancient odds are good it RVOs just fine no?


There is no copy involved with the std::optional version.


The thing is, are we sure?

Probably many C++ aficionados will run to godbolt.org to prove this in a simple testcase, and they will probably be right. But then maybe it’s because you’ve only tested this for primitive or POD types. Maybe this might not be guaranteed for non-POD types with custom constructors? Or maybe this might be okay because of RVO or something? Hmm, let us read the specs again…

The problem is, C++ is a language that is very unintuitive about how your code is going to get mapped to actual hardware (assembly code). For many systems developers, it isn’t enough that the compiler might optimize this smartly; we instead want predictable compiler behavior. So if you want to make sure, and you’re not confident about the gnarly details of the C++ specification, it will be better for you just use out parameters instead of std::optional, if you are in the situation where you really need to squeeze out performance (which happens a lot for low-level systems programming).


Yes we are sure and it has nothing to do with optimizations, RVO, Godbolt or any of the completely irrelevant details you are talking about.

The implementation of std::optional is not permitted by the standard to allocate memory.


Note that I’m not talking about if std::optional itself allocates more than it needs to at construction (which I know it doesn’t, as cppreference.com says so), but if the compiler might generate an additional copy constructor call if it is value-returned from a function (which I’m worried about since NRVO isn’t in the spec and you can’t rely on it).

It’s more specific to the example code the original commenter posted (and about general out-parameter usage) than the flaws of std::optional itself. Maybe this didn’t get communicated properly from my comment.


For C++17, true. Not so if you're using an earlier compiler and/or language standard. Yes, yes, I know we should all get with the program, but that's alas not how the world operates.


There is no std::optional in earlier language standards. std::optional is a C++17 library feature.


Even for C++ 17, this is only guaranteed if you don't name the variable in your function. NRVO is not guaranteed at all. There is even an example downthread where gcc can't do it even at -O3.


boost::optional has existed for many, many years and does not require compiler/language support. AFAIK, the behavior is substantially identical to std::optional.


There are numerous ways of handling this that are significantly better, for example using an std::optional yields this code:

    do_something().and_then(process);
Or structured bindings:

    auto [error_code, result] = do_something();
    if(error_code == 0) {
      process(result);
    }


In the presence of types which are truthy (ie, contextually convertible to bool), its too easy to accidentally swap the unpacking:

    // Oops, backwards!
    auto [result, error_code] = do_something();
    if(error_code) {
      process(result);
    }


To me that's not easy...

It's a very well established pattern to have (error, result) I'd immediately see something off with (result, error)

And clang-tidy can catch this: https://clang.llvm.org/extra/clang-tidy/checks/readability-i...


Fair enough, in my codebase we always order data in terms of dependencies so that independent data comes before dependent data. In this case the result depends on the error_code, and the error_code is independent, so the error_code must be listed first.


    auto [error, value] = do_something()
    if(!error)
         process(value)
Is there something wrong with this?


Hmm, is it error, value or value, error? In this situation, I'd prefer something like std::optional or std::variant, which you can't get wrong as easily as swapping two variables in a structured bind.

Those have their problems too, though, and I'd never claim std::variant is "nice" to use.


tuples are unfortunately terrible. Its almost always better to make another type with the fields you need than use a tuple.


You can return a struct in C, but that's as unconventional as it gets.


I still don't really grok move semantics in C++ but is there a difference between assigning output to an out parameter and returning a value using std::move? Conceptually they seem the same to me but from what little I understand std::move doesn't quite do what I think it does.


You must not use std::move when returning a value, just let RVO do its job. The compiler will make it so that the variable you are returning from your function will be constructed in-place at the variable you are assigning the function call to.

However, if you are compiling at -O0 without any optimizations this may not happen.


Since C++17, compilers are now required to elide copies and use RVO: https://en.cppreference.com/w/cpp/language/copy_elision

There are probably exceptions and loopholes just like with literally everything in C++, but there's this line:

> Return value optimization is mandatory and no longer considered as copy elision; see above.


That's only in the unnamed case. Here's a simple case than GCC does not optimize even in -std=c++20 (and even at -O3 LOL): https://gcc.godbolt.org/z/jKPr98Mhx


Isn't it kind of confusing that RVO is mandatory, but it turns out RVO only refers to unnamed RVO and not NRVO? I would've thought that since NRVO is a type of RVO...

Thanks for the clarification. Clang does do NRVO in that case but I guess it's obviously not guaranteed by the standard.


Yes, I believe that explicitly qualifying unnamed and named rvo should be the norm and would clarify a lot of confusion on these topics.

I guess it makes sense that an optimization that GCC is not able to do at all should not be part of the standard ; I'm pretty sure it should be possible to write cases that clang wouldn't be able to optimize either.


Oof, that’s a bit painful. I thought you could rely on NRVO for recent versions of gcc and clang, but feel I should be a lot more careful after seeing that example.


From what I could see through hundreds of godbolt snippets, if you declare the variable you'll return at the very top of the function before any control flow, NRVO will pretty much always happen.


That only applies when you're returning a prvalue. NRVO isn't guaranteed: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58055


Since C++17 I believe this is guaranteed.


Just like you want to avoid a copy of the input arguments, you may also want to avoid a copy of the output/return value.


I really like the way C# did this:

    void foo<T>(T by_value, ref T by_ref, in T by_immutable_ref, out T out_only);
Append `?` to any type to make it optional. (Though that is more of a lint than truly part of the type system).

`out` parameters pass no value in and must pass a value out. Isn't that just a return value, semantically? Yes, but we didn't used to have value tuples and destructuring so we had to make do. It can still be nicer for some things:

    bool TryParse(string input, out int value)

    if (int.TryParse("123", out int value))
        // Use value


Then async comes along and you can't use `out` in async methods :)

Thus (anonymous) tuples were introduce so we can do it Go/Rust-style:

    async Task<(bool, T)> Foo<T>(T value) {
        /* do something async like a web request or file read etc. */
        var aThing = await GetAThing<T>(value);
        return (aThing != null, aThing);
    }


> OP says this like this is an industry rule of sorts but I've pretty much always seen non-const references for this. Using pointers is terrible because now you have to add a check for null inside your function, as its API contract says "I can accept null values", while references make it clear that it does not make sense to call if you don't have a correct object to pass it.

The other side of the coin is that it hurts readability if you are passing non-const references. When a function takes a non-const reference, the caller looks like this:

    foo (bar); // Have to read the header to know if bar is modified.
When a function takes a pointer, the caller looks like this:

    foo (&bar); // Immediately clear that we cannot gloss over this line in code-review.
Since I read code much more than I write it, it's easier for me to determine that local non-const variables that are used as arguments to a function are not getting modified.

A better (but not best) solution might be to have the language recognise and enforce nullability on arguments. Still not perfect (pointer may be non-null but invalid), but better.

One other benefit pointer parameters have over reference parameters is that the API contract is more flexible. For example, I have a function that accepts a network connection,

    mynet_accept(int listener_fd, char **remote_ip, uint16_t *remote_port);
Having the caller specify NULL for `remote_ip` means that the callee just ignores the parameter. Same for `remote_port`.

If that was changed to use references, the caller will have to declare variables just to satisfy the parameter list, even if they never intend to use them.


> The other side of the coin is that it hurts readability if you are passing non-const references. When a function takes a non-const reference, the caller looks like this:

This has honestly never ever been an issue for me. I don't remember the last time where I passed an argument to a C++ function and actually had to wonder if the argument was modified or not. Like, what if they are being modified if that's the called function has to do anyways ? And even if it's const, the implementation can just remove the const which while not a very nice thing to do, is legal in many cases.

    $ cd
    $ rg "const_cast" -g '*.cpp' | wc -l 
    2004
(and that does not even cover the c-style casts)

> mynet_accept(int listener_fd, char *remote_ip, uint16_t *remote_port);

For this exact case my go-to solution nowadays is

    struct connection_info { 
      int fd;
      std::string ip;
      uint16_t port;
    };  

    mynet_accept({ .fd = the_socket, .port = 4556 });
which is I think much more readable than any of the alternatives ( actual example for pretty much this exact case: https://github.com/ossia/libossia/blob/master/examples/Netwo... )


> I don't remember the last time where I passed an argument to a C++ function and actually had to wonder if the argument was modified or not. Like, what if they are being modified if that's the called function has to do anyways

That's not what I was complaining about. I am saying that, as a reader of the code, it is easier to reason about what happens in this situation:

    foo (&bar);
    ...
    if (bar) { do something }
than in this situation:

    foo (bar);
    ...
    if (bar) { do something }


> mynet_accept({ .fd = the_socket, .port = 4556 });

That's not functionally the same as the example I posted. In any case, I was using the example to point out that using pointers provides a more flexible API contract.

When using pointers, the contract can state that any parameters that are NULL are ignored, whether they are out parameters or not.


> Like, what if they are being modified if that's the called function has to do anyways ?

The idea is that you have some code like this:

  x = some_func();
  do_something(x);
  if x == 0 {
    do_something_else();
  }
You're trying to track down why do_something_else() wasn't called, let's say. Do you have to look at the definition of do_something()? If your coding standards say "no non-const references", then no. If do_something() was modifying x, it would have looked differently, so you can ignore it.


To stay on topic, if x was declared const, you wouldn't have to guess. The compiler would tell you.


> To stay on topic, if x was declared const, you wouldn't have to guess. The compiler would tell you.

Frequently it can't be declared constant, because it is being changed in that scope as it runs.


This was a discussion about passing out (or inout) parameters as `T&` vs `T*`. By definition, these parameters can't be const, so const is irrelevant to this thread.


Good point.


> You're trying to track down why do_something_else() wasn't called, let's say

Then I put a breakpoint at that line and hit f5 and don't try to guess what may or may not happen because in the end that's the only way to know what actually happens


> Then I put a breakpoint at that line and hit f5 and don't try to guess what may or may not happen because in the end that's the only way to know what actually happens

There's no need to guess if the parameter is passed by value only. When pointers are used visual inspection alone is enough to ensure that the reader knows what he value of the argument is after the callee returns.

I said that it's easier to read when pointers are used, and you propose stepping through the code as an alternative. Surely you see my point now?


You can't always reproduce a bug. Also, do you Step Over do_something(x) or Step Into? You definitely step into if it's do_something(&x).


> $ rg "const_cast" -g '*.cpp' | wc -l

not really a fair count though, since those casts may also be adding const


I am missing your readability point here: If I am reading you correctly, you seem to be saying that if you come across foo(&bar) then you need to look at what foo is doing in order to know if bar might be changed, while in the case of foo(bar), looking at foo's declaration might be sufficient. Is the latter not sometimes more readable than the former, and never less?


> I am missing your readability point here: If I am reading you correctly, you seem to be saying that if you come across foo(&bar) then you need to look at what foo is doing in order to know if bar might be changed, while in the case of foo(bar), looking at foo's declaration might be sufficient. Is the latter not sometimes more readable than the former, and never less?

Almost never, in my experience. During code review, when I see this:

    foo(&bar);
I know quite well the author of that code knows that bar might be changed. There are no questions to ask him here, and no feedback to be given.

When I see this:

    foo(bar);
I've no idea if the author realises that `bar` might be have a different value after `foo()` returns. I have to ask the question "Do you know that `var` might be changed by `foo`? Did you test with the different program states that result in the value that `foo` may assign to `bar`?"

In a place where the convention is to use pointers instead of mutable references I never have to ask the question - the code review of the implementation of `foo` would have changed the mutable reference parameter to a pointer parameter.

I read code much more often than I write it. References hurt readability.


That is an interesting point of view. I am usually reading code in order to understand what it does (or where it goes wrong) without the author being present, where the proper use of const in declarations can save a lot of time.


In our corner of said industry it's

   void foo(const T & in, const T * optional_in, T & out, T * optional_out);


> How would OP make a class, which loads content on construction which must absolutely not being changed after construction then, even within its implementation ? "Being careful" is really not sufficient in my experience, and introducing child classes just to enforce the invariant that "const" provides is actively harmful to readability.

Agreed. Also what did you mean by introducing child classes? If you move const fields from your class to a field which is an inner class, either it has an assignment operator (so you can overwrite the entire class by mistake) or not (so the outer class can't be assigned over). If you move all fields from your class to a field, it becomes substantially more tedious to access them. And I suppose you could move immutable fields to be private in a base class supplying a getter, but this feels like a hacky use of inheritance (you lose aggregate initialization, you expose an implementation detail in the header, etc).

Personally I want C++'s const fields to allow moving and swapping to replace them, but not regular methods. Is it possible for the compiler to distinguish moves/swaps and regular methods, or provide an escape hatch to allow `operator=() = default` or manually defining moves/swaps? And would the standards committee accept that type of change?


I can't see how that would make it better than simply not marking them const. Now you still have to be careful and keep edge cases in mind and deal with "const" fields actually changing.

There's one exception, though. I'd support const fields being non-const in the constructor and destructor. But then that should be coupled with the constructor being able to run code before constructing members (in a non-hacky way). The constructor's task is to establish class invariants, so while the constructor is running, it's reasonable to assume all invariants haven't been established, yet, including the immutability of some fields.

This isn't nearly as clear-cut for move construction/assignment, which would need to modify a different object's const members, which at the point of invocation is not at any special stage of its lifetime.


> Using pointers is terrible because now you have to add a check for null inside your function, as its API contract says "I can accept null values", while references make it clear that it does not make sense to call if you don't have a correct object to pass it.

You need no such NULL check. Specify as part of the function's contract that a pointer be non-NULL: if a caller breaks the contract, behavior is undefined. An unexpected NULL is just like any other violation of the function's contract: a programming error.

Spell out the contract in the function's documentation. A contract is more than the signature. There is absolutely no need for a runtime check (certain special cases aside) to enforce function contracts.

In some cases, assert()ing that a parameter that is expected to be non-NULL is in fact non-NULL can be useful --- but you don't need such checks for correctness of the program, and use of assertions in general ought to be a matter of taste and discretion.

Knee-jerk, automatic checking by random internal functions of a module of pointer parameters for NULL in every single place suggests that the author doesn't fully comprehend the concept of interface contracts generally.


It's wildly useful to have interface contracts that can be verified by the compiler. In my experience this benefit of using pointers (vs references) only when the argument is optional way outweighs the drawback of not seeing an '&' at the call site.


> In my experience this benefit of using pointers (vs references) only when the argument is optional way outweighs the drawback of not seeing an '&' at the call site.

Maybe so. Reasonable people differ on this subject.

My point is different: it's just not true that if a parameter is a pointer, then as a general requirement you have to check at runtime whether that pointer is non-NULL. It drives me nuts when I see these checks sprinkled throughout a codebase.


> My point is different: it's just not true that if a parameter is a pointer, then as a general requirement you have to check at runtime whether that pointer is non-NULL. It drives me nuts when I see these checks sprinkled throughout a codebase.

My point is that if you only ever use pointer arguments to indicate that the argument is optional, then the checks are required.


But it is a reasonable rule, hence it being a general requirement in a lot of places.

More nuanced, it's more important to have these null checks on a library's public interfaces than a library's private interfaces. Private interfaces can do with asserts for null check.


> it's more important to have these null checks on a library's public interfaces

No it isn't.


Library authors and users would disagree with you.


> Spell out the contract in the function's documentation

The sibling subthread is all about people not wanting to read the function's documentation.


I would love to have compiler enforced contracts. However I'm not sure how to make that happen or even if it is technically possible. The current contracts proposal is a step in the right direction, but just barely a step and I'm not sure if it can go far enough.

Until then I avoid adding undefined behavior to my code for good reasons.


> now you have to add a check for null inside your function

While I generally agree that non-const references make more sense for out-parameters (although I actually prefer return values), don't forget that it's possible for a reference to become "null" too. It's quite unlikely to happen, but if it happens it's much worse to debug than a null pointer.

Oversimplified example:

    const int& bla = *((int*)nullptr);


Dereferencing a null pointer is Undefined Behavior.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: