The recent bug in the Linux kernel Rust code, based on my understanding, was in unsafe code, and related to interop with C. So I wouldn't really classify it as a Rust bug. In fact, under normal circumstances (no interop), people rarely use unsafe in Rust, and the use is very isolated.
I think the idea of developers developing a "bugs antenna" is good in theory, though in practice the kernel, Redis, and many other projects suffer from these classes of bugs consistently. That's also why people use linters and code formatters even though developers can develop a sensitivity to coding conventions (in fact, these tools used to be unpopular in C-land). Trusting humans to develop that sensibility is just not enough.
Specifically, about the concurrency: Redis is (mostly) single-threaded, and I guess that's at least in part because of the difficulty of building safe, fast and highly-concurrent C applications (please correct me if I'm wrong).
Can people write safer C (e.g. by using sds.c and the like)? For sure! But we've been writing C for 50+ years at this point, and at some point "people can just do X" is no longer a valid argument. While we could, in practice we don't.
I hear "people rarely use unsafe rust" quite a lot, but every time I see a project or library with C-like performance, there's a _lot_ of unsafe code in there. Treating bugs in unsafe code as not being bugs in rust code is kind of silly, also.
Exactly. You don't need much unsafe if you use Rust to replace a Python project, for instance. If there is lower-level code, or high-performance needs, things change.
For replacing a Python project with Rust, unsafe blocks will comprise 0% of your code. For replacing a C project with Rust, unsafe blocks will comprise about 5% of your code. The fact that the percentage is higher in the latter case doesn't change the fact that 95% of your codebase is just as safe as the Python project would be.
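To make that 5% concrete, here's a rough sketch of what such an unsafe block typically looks like when wrapping a C boundary (libc's strlen is just a stand-in for "the C side"; the wrapper name is made up):

    use std::ffi::{CStr, CString};
    use std::os::raw::c_char;

    extern "C" {
        // libc's strlen, standing in for the C side of an interop boundary.
        fn strlen(s: *const c_char) -> usize;
    }

    // Safe wrapper: the invariant (valid, NUL-terminated pointer) is
    // established here once, so callers stay in 100% safe Rust.
    pub fn c_string_len(s: &CStr) -> usize {
        unsafe { strlen(s.as_ptr()) }
    }

    fn main() {
        let s = CString::new("hello").unwrap();
        assert_eq!(c_string_len(&s), 5);
    }

The unsafe part is a single expression behind a safe function, so the rest of the codebase never touches it directly.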
A big amount of C code does not do anything unsafe as well: it calls other functions, does loops, business logic, and so forth. It is also wrong to believe 100% of the C code is basically unsafe.
If so, then it should be trivial for someone to introduce something like Rust's `unsafe` keyword in C such that the unsafe operations can be explicitly annotated and encapsulated.
Of course, it's not actually this trivial because what you're saying is incorrect. C is not equipped to enforce memory safety; even mundane C code is thoroughly suffused with operations that threaten to spiral off the rails into undefined behavior.
It is not so hard to introduce a "safe" keyword in C. I have a patched GCC that does it. The subset of the language which can be used safely is a bit too small to be a full replacement on its own, but also not that small.
C lacks safe primitives or non-error-prone ways to build abstractions to refer to business objects. There are no safe string references, let alone ways to safely manipulate strings. Want to iterate over or index into a result set? You can try to remember to put bounds checks into every API function.
But even with explicit bounds checks, C has an ace up its sleeve.
    int cost_of_nth_item(int n) {
        if (n < 0 || n >= num_items)
            return -1; // error handling
        …
    }
Safe, right? Not so fast, because if the caller has a code path that forgets to initialize the argument, it’s UB.
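For contrast, here's a rough Rust sketch of the same lookup (hypothetical names, not anyone's actual API), where the bounds check is part of the return type and an uninitialized argument simply won't compile:

    fn cost_of_nth_item(items: &[i32], n: usize) -> Option<i32> {
        // Out-of-range indices return None instead of invoking UB.
        items.get(n).copied()
    }

    fn main() {
        let items = [10, 20, 30];
        assert_eq!(cost_of_nth_item(&items, 2), Some(30));
        assert_eq!(cost_of_nth_item(&items, 5), None); // no UB, just None
        // let n: usize;
        // cost_of_nth_item(&items, n); // error[E0381]: `n` isn't initialized
    }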
You're swapping definitions of unsafe. Earlier you were referring to the `unsafe` keyword. Now you're using `unsafe` to refer to a property of code. This makes it easy to say things like "It is also wrong to believe 100% of the C code is basically unsafe" but you're just swapping definitions partway through the conversation.
What I see is that antirez claims that absence of "safe" (as syntax) in C lang doesn't automatically mean that all of C code is unsafe (as property). There's no swapping of definitions as I see it.
I think there's a very clear switch of usage happening. Maybe it's hard to see so I'll try to point out exactly where it happens and how you can spot it.
First from antirez:
> You don't need much unsafe if you use Rust to replace a Python project, for instance. If there is lower-level code, or high-performance needs, things change.
Use of the term `unsafe` here refers to the keyword / "blocks" of code. Note that this statement would be nonsensical if it were talking about `unsafe` as a property of code; it would certainly be inconsistent with the later usage, since later it's claimed that C code is not inherently "unsafe" (and therefore Rust would not be inherently "unsafe" either).
Kibwen stays with that definition here:
> For replacing a Python project with Rust, unsafe blocks will comprise 0% of your code. For replacing a C project with Rust, unsafe blocks will comprise about 5% of your code.
Here is the switch:
> A big amount of C code does not do anything unsafe as well
Complete shift to "unsafe" as being a property of code, no longer talking about the keyword or about blocks of code. You can spot it by just rewriting the sentences to use Rust instead of C.
You can say:
"A big amount of 'unsafe' Rust code does not do anything unsafe as well"
"It is also wrong to believe 100% of the unsafe Rust code is basically unsafe."
I think that makes the conflation of terms clear, because we're now talking about the properties of the code within an "unsafe" block, or globally in C. Note how clear it is in these sentences that the term `unsafe` is being swapped; we can see this by referring to "Rust in unsafe blocks" explicitly.
This is just a change of definitions partway through the conversation.
p.s. @Dang can you remove my rate limit? It's been years, I'm a good boy now :)
High performance is not an on/off target. Safe rust really lets you express a lot of software patterns in a "zero-cost" way. Sure, there are a few patterns where you may need to touch unsafe, but safe rust itself is not slow by any means.
For your last sentence, I believe topics are conflated here.
Of course if one writes unsafe Rust and it leads to a CVE then that's on them. Who's denying that?
On the other hand, having to interact with the part of the landscape that's written in C mandates the use of the `unsafe` keyword and not everyone is ideally equipped to be careful.
I view the existence of `unsafe` as pragmatism; Rust never would have taken off without it. And if 5% of all Rust code is potentially unsafe, well, that's still much better than C where you can trivially introduce undefined behavior with many built-in constructs.
Obviously we can't fix everything in one fell swoop.
> Of course if one writes unsafe Rust and it leads to a CVE then that's on them. Who's denying that?

> The recent bug in the Linux kernel Rust code, based on my understanding, was in unsafe code, and related to interop with C. So I wouldn't really classify it as a Rust bug.
Why is glue code not normal code in Rust? I don't think anyone else would say that for any other language out there. Does it physically pain you to admit it's a bug in Rust code? I write bugs in all kinds of languages and never feel the need for adjectives like "technical", "normal", "everyday" or words like "outlier" to make me feel not let down by my language of choice.
I have worked with Rust for ~3.5 years. I had to use the `unsafe` keyword, twice. In that context it's definitely not everyday code. Hence it's difficult to use that to gauge the language and the ecosystem.
Of course it's a bug in Rust code. It's just not a bug that you would have to protect against often in most workplaces. I probably would have allowed that bug easily because it's not something I stumble upon more than once a year, if even that.
To that effect, I don't believe it's fair to gauge the ecosystem by such statistical outliers. I make no excuses for the people who allowed the bug. This thread is a very good demonstration as to why: everything Rust-related is super closely scrutinized and immediately blown out of proportion.
As for the rest of your emotionally-loaded language -- get civil, please.
I don't care if there can be a bug in Rust code. It doesn't diminish the language for me. I don't appreciate mental gymnastics when evidence is readily available, and your comments come across as a compulsive defense of something nobody was really attacking. I'm sorry for the jest in the comments.
I did latch onto semantics for a little while, that much is true, but you are making it look much worse than it is. And yes, I get PTSD and eye-roll syndrome from the constant close scrutiny of Rust, even though I haven't actively worked with it for a while now. It gets tiring to read, and many interpretations are dramatically negative for no reason other than some imagined "Rust zealots always defending it", which I have not seen in a long time here on HN.
But you and I seem to be much closer in opinion and stance than I thought. Thanks for clarifying that.
The bug in question is in rust glue code that interfaces with a C library. It's not in the rust-C interface or on the C side. If you write python glue code that interfaces with numpy and there's a bug in your glue, it's a python bug not a numpy bug.
I already agreed that technically it is indeed a bug in the Rust code. I would just contest that such a bug is representative, is all. People in this thread seem way too eager to extrapolate, which is neither intellectually curious nor fair.
In Rust you can avoid "unsafe" when you use Rust as if it were Go or Python.
If you write low-level code, that is, the kind where C is in theory replaceable only by Rust (and not by Go), then you find yourself needing to write many unsafe sections. And to lower the number of unsafe sections, you often have to build unnatural abstractions in order to group such unsafe sections into common patterns. It is a tradeoff, not a silver bullet.
Not necessarily at all. Go peruse the `regex` crate source code, including its dependencies.
The biggest `unsafe` sections are probably for SIMD accelerated search. There's no "unnatural abstractions" there. Just a memmem-like interface.
There's some `unsafe` for eliding bounds checks in the main DFA search loops. No unnatural abstractions there either.
There's also some `unsafe` for some synchronization primitives for managing mutable scratch space to use during a search. A C library (e.g., PCRE2) makes the caller handle this. The `regex` crate does it for you. But not for unnatural reasons. To make using regexes simpler. There are lower level APIs that provide the control of C if you need it.
That's pretty much it. All told, this is a teeny tiny fraction of the code in the `regex` crate (and all of its dependencies).
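For anyone curious, the bounds-check-eliding kind of unsafe generally looks something like this (an illustrative sketch, not code from the `regex` crate or its dependencies):

    // The invariant is validated once, up front...
    fn sum_transitions(table: &[u32], states: &[usize]) -> u64 {
        assert!(states.iter().all(|&s| s < table.len()));
        let mut total = 0u64;
        for &s in states {
            // ...so the per-iteration bounds check in the hot loop can be
            // elided. SAFETY: every index was checked against table.len().
            total += u64::from(unsafe { *table.get_unchecked(s) });
        }
        total
    }

    fn main() {
        assert_eq!(sum_transitions(&[1, 2, 3], &[0, 2, 2]), 7);
    }

The invariant is established outside the hot loop and the SAFETY comment documents why the elision is sound, which is the pattern described above.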
I think this framing is a bit backwards. Many C programs (and many parts of C programs) would benefit from being more like Go or Python, as evidenced by your very own sds.c.
Now, if what you're saying is that for super highly optimized sections of a codebase, or extremely specific circumstances (some kernel drivers), you'd need a bit of unsafe Rust: then sure. Though all of a sudden you've flipped the script, and unsafe becomes the exception, not the rule; and you can keep those pieces of code contained, similarly to how C programmers use inline assembly in some scenarios.
Funnily enough, this is similar to an area where Rust did the opposite of C, and is much better for it: immutable by default (`let mut` in Rust vs. `const` in C) and non-nullable by default (and even being able to define something as non-null).
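A tiny sketch of what those defaults mean in practice:

    fn main() {
        let x = 5;          // immutable by default; `let mut x = 5;` opts in to mutation
        // x += 1;          // error[E0384]: cannot assign twice to immutable variable
        println!("x = {x}");

        let name: Option<&str> = None;  // possible absence is spelled out in the type
        match name {                    // and must be handled: no silent null deref
            Some(n) => println!("hello, {n}"),
            None => println!("no name given"),
        }
    }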
Flipping the script so that GOOD is default and BAD is rare was a huge win.
I definitely don't think Rust is a silver bullet, though I'd definitely say it's at least a silver alloy bullet. At least when it comes to the above topics.
In my experience (several years of writing high performance rust code), there are really only two instances where you need unsafe blocks:
- C interop
- Low level machine code (eg inline assembly)
Most programs don't need to do either of those things. I think you could directly port redis to entirely safe rust, and it would be just as fast. (Though there will need to be unsafe code somewhere to wrap epoll.)
And even when you need a bit of unsafe, it’s usually a tiny minority of any given program.
I used to think you needed unsafe for custom container types, but now I write custom container types in purely safe rust on top of Vec. The code is simpler, and easier to debug. And I’m shocked to find performance has mostly improved as a result.
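As a rough illustration of that pattern (a toy arena with index handles built entirely on safe Vec; the names are made up):

    struct Arena<T> {
        items: Vec<T>,
    }

    impl<T> Arena<T> {
        fn new() -> Self {
            Arena { items: Vec::new() }
        }

        // Handles are plain indices; Vec's bounds checks stand in for raw pointers.
        fn insert(&mut self, value: T) -> usize {
            self.items.push(value);
            self.items.len() - 1
        }

        fn get(&self, handle: usize) -> Option<&T> {
            self.items.get(handle)
        }
    }

    fn main() {
        let mut arena = Arena::new();
        let h = arena.insert("hello");
        assert_eq!(arena.get(h), Some(&"hello"));
    }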
> was in unsafe code, and related to interop with C
1) "interop with C" is part of the fundamental requirements specification for any code running in the Linux kernel. If Rust can't handle that safely (not Rust "safe", but safely), it isn't appropriate for the job.
2) I believe the problem was related to the fact that Rust can't implement a doubly-linked list in safe code. This is a fundamental limitation, and again is an issue when the fundamental requirement for the task is to interface to data structures implemented as doubly-linked lists.
No matter how good a language is, if it doesn't have support for floating point types, it's not a good language for implementing math libraries. For most applications, the inability to safely express doubly-linked lists and difficulty in interfacing with C aren't fundamental problems - just don't use doubly-linked lists or interface with C code. (well, you still have to call system libraries, but these are slow-moving APIs that can be wrapped by Rust experts) For this particular example, however, C interop and doubly-linked lists are fundamental parts of the problem to be solved by the code.
If Rust is no less safe than C in such a regard, then what benefit is Rust providing that C could not? I am genuinely curious because OS development is not my forte. I assume the justification to implement Rust must be contingent on more than Rust just being 'newer = better', right?
The issue is unrelated to expressing linked lists, it's related to race conditions in the kernel, which is one of the hardest areas to get right.
This could have happened with no linked lists whatsoever. Kernel locks are notoriously difficult, even for Linus and other extremely experienced kernel devs.
I love rust, but C does make it a lot easier to make certain kinds of container types. Eg, intrusive lists are trivial in C but very awkward in rust. Even if you use unsafe, rust's noalias requirement can make a lot of code much harder to implement correctly. I've concluded for myself (after writing a lot of code and a lot of soul searching) that the best way to implement certain data structures is quite different in rust from how you would do the same thing in C. I don't think this is a bad thing - they're different languages. Of course the best ways to solve a problem in languages X and Y are different.
And safe abstractions mean this stuff usually only matters if you’re implementing new, complex collection types. Like an ECS, b-tree, or Fenwick tree. Most code can just use the standard collection types. (Vec, HashMap, etc). And then you don’t have to think about any of this.
>> I guess that's at least in part because of the difficulty of building safe, fast and highly-concurrent C applications (please correct me if I'm wrong).
You wrote that question in a browser mostly written in C++, running on an OS most likely written in C.
An OS can actually be pretty simple to make. Sometimes it's part of a CS curriculum to make one. If it were so much easier to do it in other languages (e.g. in Rust), don't you think we would already be using them?
Writing a real one? Who's gonna write all the drivers and the myriad other things?
And the claim was not that it's "so much easier", but that it is so much easier to write it in a secure way. Which claim is true. But it's still a complex and hard program.
(And don't even get started on browsers, it's no accident that even Microsoft dropped maintaining their own browser).
The toy one can still be as highly concurrent as the real one. The amount of drivers written for it doesn't matter.
The point is, if it were much easier, then they would overtake existing ones easily, just by adding features and iterating much faster; and that is clearly not the case.
>>difficulty of building safe, fast and highly-concurrent C
This was the original claim. The answer is, there is a tonne of C code out there that is safe, fast and concurrent. Isn't it logical? We have been using C for the last 50 years to build stuff with it and there is a lot of it. There doesn't seem to be a big jump in productivity with the newer generation of low level languages, even though they have many improvements over C.
This is anecdotal, but I used to do a lot of low level C and C++ development. C++ is a much bigger language than C, and honestly I don't think I was ever more productive with it. Maybe the code looked more organized and extendable, but it took the same or a larger amount of time to write it. On the other hand, when I develop with Javascript or C#, I'm easily 10 times more productive than I would be with either C or C++. This is a bit of an apples and oranges comparison, but what I'm trying to say is that new low level languages don't bring huge gains in productivity.