A new way of thinking

31 Jul 2022

Motivation

Rust was my language of the year. You know, that thing where programmers set out to learn a new programming language every year. Usually not to be productive at it but to familiarize themselves with current trends in language design, implementation, and paradigms. I had heard a lot of good stuff about Rust and decided late last year to make it my 2018 language. I even attended the all-hands of the Rust Core Team, and participated in a few discussions. To be fair, it was in the hopes of meeting my internet friend Yehuda Katz in person. He wasn’t there.

A few days ago (it’s November already) I began serious study using both Philipp Oppermann’s Writing an OS in Rust and The Rust Programming Language Book. You’ve probably heard it said that Rust is a systems programming language. Well, an operating system is the ultimate system, so it’s as perfect a match as it gets. Plus both materials are available for free.

I’m only a few days in but I’ve been smacked by some of what I consider the best ideas in programming I’ve encountered yet. I’ve also come to grips with gaps in my knowledge, things I took for granted but shouldn’t have, and a better understanding of my current tools of the trade. In this post I talk about two things: the move (haven’t met in any language aside Rust) and scope, a concept familiar to most programming language.

The Move

In most programming languages, code similar to what’s below will compile or run. What we want to do has been done many times and over. Assign a value to a variable, bind one more variable to the same value but referring to it by the previous variable, and continue to use both variables in the same scope.

1
2
3
4
5
6
7
8
9
// This code accessible on the Rust playground
// at this link:
// https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=1640e708979885d7219b84ec45917d76

fn main() {
  let s = String::from("hello");
  let t = s;
  println!("{} is the same as {}", t, s);
}

In every language I’ve written useful software in this is perfectly valid. But not in Rust, because, basically it fucks up memory reclamation if you’re allowed to do that.

You see, Rust is not garbage collected. That means you allocate and free memory yourself, C-style. If this second statement was self-evident to you then you just got smacked by the first falsehoods programmers believe about garbage collection. Rust isn’t GC-ed but you don’t manually malloc and free either. Instead heap memory is allocated and freed as you enter and leave scopes. And this is the difference between Rust and GC-ed languages: Rust frees unused memory immediately the scope has been exited while GC-ed languages like Go, through a sophistication that I now no longer find necessary, reserve dead objects to be pruned later.

That was a digression. Back to what The Move is or what happened in the code above. In simple terms, s moved to t, after all who needs two references to the same thing? Give me a good reason why you’d have two variables point to the same place in memory. I’ll wait. This is the commonsense explanation of The Move. The computer science-y explanation is: it avoid trying to free the same memory address twice when we exit the scope. Asking that the same memory location be cleared twice, accidentally, can have dire consequences. If in-between the two calls the operating system offered the spot to another process, we effectively made life difficult for this process. If Newton lived long enough to make laws of allocating and freeing memory, he’d have quipped that in a given scope, number of mallocs and frees are equal but opposite. In the code above there’s only one allocation and it’s on line 6. Line 7 doesn’t allocate. So there can be only one freedom at the end of the scope. Thus both s and t can’t survive to the end of the scope.

The book has more details on this behavior, and which types it applies to. It’s here, with more details about the Copy trait, clone-ing, and a lesson on how to explain potentially incendiary concepts.

Scope

Scope can be an elusive concept, but only if you let it be. You can get away with thinking of it as the stuff in between the curly braces (in programming languages with curly braces). In Python it’s defined by the indentation, in Ruby a mix of things. Let’s stick with the curly braces for now. Everything in the opening ({) and closing (}) braces are in the same scope, including, wait for it, other scopes. It could be scopes all the way down—it depends on how you like your sugar-free coke, really. For example, here’s an inception of scopes:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
fn main() {
  // parent scope
  let s = String::from("hello");
  // len, a function, defines another scope.
  // Thus the scope of `len` is a child of
  //`main`'s scope.
  let l = len(s);
  println!("{} has length of {}", s, l);
}

fn len(s: String) -> usize {
  // child scope
  s.len()
}

The asymmetry of life applies: child has access to everything in/on parent but not the other way round. By all means everything we already know and expect of scope inceptions.

That aside, the code above has been written many times over in several languages, text editors, and color schemes. And all variations compile or work. But not with Rust. The innocent looking code isn’t really innocent: it tries to double spend a value. How? you ask. Well it sold ownership of its variable s to another scope (the len function) but tries to use it immediately after. What? sold what? you ask again. Well, that’s how scopes work. They own, and can bequeath ownership of their resources. Thus passing s to len is a statement from the main scope to bequeath ownership of s to len. Henceforth len owns s. Which means that main has lost every right to use it (and so shouldn’t try to use it). It also means that immediately the scope of len is over, the memory occupied by s is freed. I wasn’t expecting this. Born and bred on the jarring principles of pass by value and pass by reference, Rust’s ownership is a breath of fresh air. Once a variable is passed to another scope, the parent scope has lost ownership and can’t use it again. I don’t even want to think in terms of variables any more. I want to think of everything as a resource. This is probably me taking it too far but that’s what happens when you’re hit by a revelation. It changes a lot of things for you at the same time.

Any workarounds? I’d love to still use s after finding its length. No, no workaround needed. We could, instead of bequeathing ownership, lease it to len. How? len could be defined as a function that doesn’t take ownership of its arguments but rather requests a lease valid for the duration of its scope. Here’s how the new len looks like (together with main):

1
2
3
4
5
6
7
8
9
10
fn main() {
  let s = String::from("hello");
  let l = len(&s);
  println!("{} has length of {}", s, l);
}

// The New & Improved `len`. Instead of taking
// ownership of its arguments, it requests a
// lease (aka a reference) and works with that.
fn len(s: &str) -> usize { s.len() }

With this change, the heap memory used by s won’t be reclaimed when len’s scope is over. In fact, in the current implementation there’s no allocation so there will be zero freeing. The allocation was in main and so the deallocation or freeing or reclamation will happen there, mainly because it didn’t bequeath ownership of s to anyone.

Conclusion

These concepts were pretty interesting to think about and use. And once you get a hang of it the entire field of garbage collection, which the best minds of our generation continue to tinker to some desired perfection, begin to look like wasted effort. But I’d like to think otherwise and hope that there’s a significant benefit to garbage collection (over immediately freeing memory) that I just don’t know yet.

That said I’ve learnt a few things and as I continue to write and maintain programs in GC-ed languages I’ll see what lessons from Rust I could apply there.

The biggest takeaway is that I’ll still do my interviews in Python.

Enjoy!

Got comments or corrections for factual errors? There’s a Hacker News thread for that.