Who's working on a "smaller Rust"?

BatmanAoD@programming.dev · edit-2 5 months ago

Oh, Nim is possibly even a better example because it is "transpiled" rather than compiled, meaning the compiler actually generates C or C++ code. You can then compile that with whatever compiler you want. However, I don't know of any major projects in Nim to compare against ones in C, C++, Rust, etc.

Edit: and Zig should be extremely efficient as well.

BatmanAoD@programming.dev · 5 months ago

Who benefits from C being suppressed and attempts being made to replace him? I think there is only one answer - companies. Not developers.

You've missed the group that is most affected by software quality: end-users. Almost everyone in the world relies on computers in some way now, and bugs, especially security vulnerabilities, affect people who have no say in what languages people use to develop software.

But you as a programmer are (and must be) responsible for the code you write, not a language. And the one way not to do bugs - not doing them.

Sounds good. How do I, the end-user of software, hold developers accountable for bugs? I guess I can switch from one buggy operating system to another, or from one buggy browser to another.

But also, do you honestly think that the choice of language does not impact software quality at all? Surely if you were forced to write software in a raw assembly, you'd find it more difficult to write a large and complex system correctly; right? But assembly is just another language; what makes C easier to use correctly? And if C is easier to write correctly than assembly, why would it be surprising that there are languages that are even easier to write correctly, especially after five decades of development in the extremely young field of computer science? Tools are important; your programming language (and compiler) is your first, most important, and most impactful tool as a developer.

[C] remains the fastest among high-level languages.

How are you determining that? C, C++, Rust, Fortran, Ada, and D all compile down to machine code, with no garbage collector (D's is optional). So there's not really any theoretical reason why they shouldn't all perform roughly the same. And, in fact, this is largely supported by evidence:

There's a fair amount of competition among grep type tools. grep itself is written in C and heavily optimized. I think it's fairly well known by now that ripgrep, written in Rust, is probably the fastest of these tools.
The TechEmpower web framework benchmarks maintains a ranking of the fastest web frameworks, updated each year. It doesn't look like the current version of the site shows what language each framework is written in, but the top three (may-minihttp, xitca-web, and ntex) are all Rust projects. The fourth (h2o) is in C.
The Benchmarks Game lets people submit programs in a wide variety of languages to solve a variety of problems, and these submissions are benchmarked and compared. Rust and C are effectively neck-and-neck (note that Rust currently does actually beat C in several of the challenges). See the second graph here for an overall visual comparison among languages.

[Side-note: no one is "suppressing" C. I'm also not convinced anyone thinks C is "useless".]

BatmanAoD@programming.dev · 5 months ago

You're misunderstanding the posts you're explaining. Sanitizers, including ASan, HWASan, and bound sanitizer, are not "static analysis tools". They are runtime tools, which is why they have a performance impact. They're not intended to be deployed as part of a final executable.

I don't know how you can read this sentence and interpret it to mean that they "haven't onboarded AddressSanitizer":

In previous years our memory bug detection efforts were focused on Address Sanitizer (ASan).

BatmanAoD@programming.dev · 5 months ago

One problem with this is that C is in no way the "roots" of programming; it's older than most of the languages we use today, but Fortran, Lisp, and Cobol are all older and are also still in use. (And of course there are other languages that predate C but have mostly fallen out of use, such as Pascal.) It feels "low-level" because it closely reflects the hardware for which it was originally designed, the PDP-7 and later the PDP-11. But in fact it hasn't truly been "low-level" for a long time: I highly recommend the ACM article "C Is Not a Low-level Language; Your computer is not a fast PDP-11."

BatmanAoD@programming.dev · edit-2 5 months ago

You are making an extreme assumption, and it also sounds like you've misread what I wrote. The "attempts" I'm talking about are studies (formal and informal) to measure the root causes of bugs, not the C or C++ projects themselves.

I cited one specific measurement, Daniel Stenberg's analysis of the Curl codebase. Here's a separate post about the testing and static analysis used for Curl.

Here's a post with a list of other studies. The projects analyzed are:

Android (both the full codebase and the Bluetooth & media components)
iOS & MacOS
Chrome
Microsoft (this is probably the most commonly cited one)
Firefox
Ubuntu Linux

Do you really think that Google, Apple, Microsoft, Mozilla, and the Ubuntu project "don't even consider onboarding basic static analysis tools" in their C and C++ projects?

If you're curious about the specifics of how errors slip through anyway, here's a talk from CppCon 2017 about how Facebook, despite copious investment into static analysis, still had "curiously recurring" C++ errors. It's long, but I think it's worthwhile; the most interesting part to me, though, starts around 29:40, where he asks an audience of C++ users whether some specific code compiles, and only about 10% of them get the right answer, one of whom is an editor of the C++ standard.

BatmanAoD@programming.dev · 5 months ago

The animation that goes with this is pretty slick: https://x.com/Phantom_TheGame/status/1748457358521426375?s=20

BatmanAoD@programming.dev · 5 months ago

Oh hey, it's modern ed!

BatmanAoD@programming.dev · 5 months ago

I very much understand thinking that Rust has too much hype, but the differences between C and Rust are so fundamental that "switching between" them just to "keep your interest fresh" seems ill-advised to me. To be honest, your statements about both C and Rust so far seem pretty superficial; have you actually used Rust for anything nontrivial?

C syntax is simple, yes, but C semantics are not; there have been numerous attempts to quantify what percentage of C and C++ software bugs and/or security vulnerabilities are due to the lack of memory safety in these languages, and although the results have varied widely, the most conservative estimate (this blog post about curl; see the section "C is unsafe and always will be") ended up with an estimate of 40%, or 50% if you only count critical bugs. If I recall correctly, Microsoft did a similar study on one of their projects and declared a rate closer to 70%.

This means that the choice of language is not just about personal preference. Bugs aren't just extra work for software developers; they affect all users of software, which means they affect pretty much everyone. And, crucially, they're not just annoyances; cyberattacks of various kinds are extremely prevalent and can have a huge impact on people. So if 50% or more of critical software vulnerabilities are due to the choice of language, then that is a very good reason to pick a safer language.

Rust is not the only choice for memory-safe languages. If you like the simplicity of C, you should definitely learn Go (it's explicitly designed to be as simple as possible to learn). But I would also strongly recommend looking into Zig, which hews much closer to C than Rust does; in fact, it has probably the best interoperability with C of any modern language.

BatmanAoD@programming.dev · 5 months ago

Rust's 1.0 release (i.e. the date on which the language received any sort of stability guarantee) was in 2015, and this article was written in 2019. Measuring the pace of feature development of a four-year-old language by its release notes, and comparing against a 50-year-old language by counting bullet points in Wikipedia articles, is absolutely ridiculous.

Yes, younger languages adopt features more quickly, and Rust was stabilized in a "minimal viable product" state, with many planned features not yet ready for stabilization. So of course the pace of new features in Rust is high compared to older languages. But Wikipedia articles are in no way comparable to release notes as a measure of feature adoption.

I think C is faster, more powerful, and more elegant.

"More elegant" is a matter of opinion. But "faster" and "more powerful" should be measurable in some way. I'm not aware of any evidence that C is "faster" than Rust, and in fact this would be extremely surprising since they can both be optimized with LLVM, and several of the features Rust has that C doesn't, such as generics and ubiquitous strict aliasing, tend to improve performance.

"Powerful" can mean many things, but the most useful meaning I've encountered is essentially "flexibility of application" : that is, a more powerful language can be used in more niches, such as obscure embedded hardware platforms. It's really hard to compete with C in this regard, but that's largely a matter of momentum and historical lock-in: hardware vendors support C because it's currently the lowest common denominator for all hardware and software. There's nothing about Rust the language that makes it inappropriate for hardware vendors to support at a low level. Additionally, GCC is probably the toolchain with the broadest hardware support (even hardware vendors that use a bespoke compiler often do so by forking GCC), and Rust currently has two projects (mrustc and gccrs) working to provide a way to use GCC with Rust. So even the advantage C has in terms of hardware support is narrowing.

But note that there are also niches for which C is widely considered less appropriate than Rust! The most obvious example is probably use in a front-end web application. Yes, C should in theory be usable on the front-end using emscripten, but Rust has had decent support for compiling to WebAssembly almost as long as it's been stabilized.

BatmanAoD@programming.dev · 6 months ago

The poster finds it "kinda odd" that people don't immediately correctly interpret statements like this toot as being exclusively about the upstream kernel: https://fosstodon.org/@kernellogger/111742009818641713

I find it "kinda odd" that he expects that to be a natural reading of the original toot.

BatmanAoD@programming.dev · edit-2 6 months ago

Go is a "small" language in the sense that it has an exceptionally small number of concepts (i.e. features and syntactic constructs); someone else in this thread made a comment to the effect that it takes almost no time to learn because there's nothing to learn if you've already learned a different language. This is of course an exaggeration, but only slightly: Go was very intentionally and explicitly designed to be as simple as possible to learn and to use. As an example, it famously had no support for generics until almost 10 years after its 1.0 release. I think that when talking about the size of a language, some people do include the standard library while others don't; Go has quite a large standard library, but you don't actually have to learn the whole library or even be aware of what's available in order to be productive.

I personally don't think it makes sense to include the standard library in the "size" of a language for the purpose of this thread, or Boats' original blog posts. The fundamental point is about the learning curve of the language and the amount of work it takes to produce working code, and a large standard library tends to be more convenient, not less. Common functionality that isn't in Rust's standard library tends to come from libraries that become almost ubiquitous, such as serde, regex, crossbeam, and itertools. From the user's perspective, this isn't much more or less complicated than having the same functionality available via the standard library. (Of course, a large standard library certainly makes languages more difficult to implement and/or specify; if I recall correctly, about half the size of the C++ standard is dedicated to the standard library.)

I don't really know how to fairly compare the "size" of Rust and C++, partly because Rust is so much younger, and several C++ "features" overlap with each other or are attempts to replace others (e.g. brace-initialization as a replacement for parentheses). But I don't think I've ever heard someone claim that C++ is "small" or "minimal" in this sense, so it's in no way a good point of comparison for determining whether Rust is "small".

Edit to add: for what it's worth, if I weren't quoting Boats' blog post (which is sort of the "canonical" post on this concept), I probably would have opted for "simpler (to learn & use)" rather than "smaller."

BatmanAoD@programming.dev · 6 months ago

Not unless you consider Go a "bigger" language than Rust. The blog post means "smaller" in terms of what the user has to learn and think about, rather than smaller in implementation size or resulting binary.

BatmanAoD@programming.dev · 6 months ago

My understanding was that there's some ecosystem bifurcation, somewhat like Rust's. But I'll look into it again!

BatmanAoD@programming.dev · 6 months ago

I do want to learn Haskell some day, but it seems like it has a whole different set of reasons why it's tricky to learn; and I hear enough about the more complex features (e.g. arrow notation) having compiler bugs that I think it really doesn't sound like a "smaller" or "simpler" language than Rust.

That said, yeah, it definitely meets the criteria of having strong typing, a functional style, a garbage collector, and pretty good performance.

BatmanAoD@programming.dev · 6 months ago

OCaml seems really close, but I'm told that there are problems with its concurrency story. I do think it sounds like a really good language.

BatmanAoD@programming.dev · 6 months ago

Did you read the original "Notes" post? I thought it did a pretty good job of explaining why Rust-like ownership semantics are not necessarily at odds with having a garbage collector.

BatmanAoD@programming.dev · 6 months ago

Not necessarily interpreted, but possibly. I think a more likely path is something like Go that's compiled but still has a garbage collector.

BatmanAoD@programming.dev · edit-2 6 months ago

Who's working on a "smaller Rust"?

BatmanAoD@programming.dev · 6 months ago

Plus, if you eventually do need or want to learn C++ for some reason, I honestly think that knowing Rust first would be quite helpful.

BatmanAoD@programming.dev · 6 months ago

I remember thinking that Qt seemed like a really good approach that should be more widely adopted...until I actually had to use it.

Not that it's terrible; it's really not. But C++ is just not at all a good language for <s>anything</s> UI stuff.

BatmanAoD@programming.dev · 7 months ago

The parts that seem likely to cause this confusion (which I shared when I first started using C++11) are:

Moves in C++ are always a potentially-destructive operation on a reference, not just a memcopy.
Consequently, "moving" a temporary still requires having two separate instances of the type, despite that generally not being what you want, hence RVO.
...but move-semantics are generally presented and understood as an "optimization", and conceptually "take the guts of this value and re-use them as a new value" is both what RVO is doing and what move-semantics are doing.
std::move isn't a compiler intrinsic and doesn't force a move operation; it's just a function that returns an r-value reference. So it makes it harder, not easier, for the compiler to "see through" and optimize away, even in the case where "as if" rule should make that legal.