Insignificant Truths

Yes, Rust is beginner-friendly

BLUF—Rust, the programming language and technology, is beginner-friendly; the official and recommended learning materials, on the other hand, aren’t. Unfortunately, a confused conflation here makes one attribute the badness of the learning materials to the object they try to teach.


Being a three-part retro-capture of a statement I made during the DevCongress Rust Study Group's sync session on Wednesday, Mar 22, 2023, to address a concern about the beginner-friendliness of Rust, the object of our study. I've untied a few knots—or so I hope—and provided more background and detail where necessary.

Among other things, Rust is notorious for being unfriendly to beginners1. Many people have struggled with its steep learning curve and have given up before seeing any benefits (patience is a resource in short supply, after all). Even experienced Rust programmers agree that it can be tough to get started with the language. In fact, Rust’s directors seem to agree with this evaluation (of a rough first encounter), and so they have committed quite a lot of resources towards preparing high-quality learning materials for the language. The official (& free) book, The Rust Programming Language, for example, is arguably the best introductory material on Rust. This is the material that the DevCongress Rust Study Group uses.

One careful look at the beginner-unfriendliness notoriety, and it becomes obvious that one thing, and one thing alone, is to blame. You see, Rust’s key feature is memory safety. It achieves that through a largely novel memory management technique publicly known as ownership2. If you’re learning Rust through the official book, you encounter it in chapter 4. The importance of grokking this subject matter is mildly but sternly emphasized at the start of the chapter (bold emphasis mine):

Ownership is Rust’s most unique feature and has deep implications for the rest of the language. It enables Rust to make memory safety guarantees without needing a garbage collector, so it’s important to understand how ownership works. In this chapter, we’ll talk about ownership as well as several related features: borrowing, slices, and how Rust lays data out in memory.

The implications are deep, indeed: at all times the Rust programmer must keep their mind fixed on ownership since the compiler rejects programs that violate the laws of ownership. Politely, but firmly. Although it tries to provide ample help to relieve any pent-up anger, unless you made an obviously silly mistake, the rejection, and the surprise it usually evokes, can mean just one thing: you don’t know what you’re doing! The feeling could easily be one of defeat. A feeling to be avoided by all means necessary. And so when we (the Study Group) arrived at the first chapter on ownership, we determined to spare no resources to make it make sense. We even invented and played several rounds of ownership-themed games. Well, three weeks of individual studies and 9 hours of group meetings later, I’m here to tell you that we really aren’t sure we have this beast under control. Worse, we’re not even sure we can ever put it under control.
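For the uninitiated, here is a tiny sketch of the kind of program the compiler rejects, with the offending line commented out so the rest compiles (the quoted error is the one rustc emits for this case):

```rust
fn main() {
    let s = String::from("hello");
    let t = s; // ownership of the String moves from `s` to `t`

    // println!("{}", s); // rejected by the compiler:
    //                    // error[E0382]: borrow of moved value: `s`

    println!("{}", t); // fine: `t` is the sole owner now
}
```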

Is ownership, by nature, difficult to understand?

Our own experience, and that of many other people on the internet, seems to discourage spending more than regulation time on trying to understand ownership. They say the extra time is better spent playing whack-a-mole with the compiler. You’re going to play it anyway, and for the rest of your life, so you’d better get the practice in now. It would seem that experienced Rust programmers are just extremely quick at hanging some apostrophes here, a few ampersands there, and shuffling them around until, voilà, out pops a permutation that satisfies the laws of ownership. But I’ve watched a few videos of Jon Gjengset at work and this doesn’t seem to be his modus operandi. If anything, Jon appears very much at home. He makes mistakes, of course, but these are largely restricted to syntax and mistaken implementations. He makes mistakes in his English as well. He barely confronts an unyielding borrow checker. So what’s amiss here? How did these masters slay the hydra?

Well, I don’t have a data-driven answer. All I have is an anecdotally derived hypothesis that the Rust book could be accidentally making life difficult for newcomers to the Rust programming language.

In a follow-on post, I’ll lay out my constructive feedback (read: criticism) of the Rust book. While it strives to remain pedagogical, I’m convinced that the effort has so far been in the wrong direction. I’ll try to persuade you to the same conclusion in the next post in the series.

Draft reviewed by Rasheed Muhammad. All errors mine.


  1. it’s important to distinguish between the two classes of beginners: the first is a beginner to programming overall. the second type of beginner has pre-existing programming experience. they may even be professional programmers. which means they’re familiar with popular programming concepts. when rust is accused of being unfriendly to beginners, both classes are implied. 

  2. this is not to suggest that there’s a private & different name. i’m not a computer scientist and so i’m not well-versed in the language of the field. but i’m an attentive listener, and i’ve heard rumors that ownership indeed has a different (formal?) name/vocabulary in academese, the language of academics. 

What is dependency injection?

The preliminary terror, which chokes off most fifth-form boys from even attempting to learn how to calculate, can be abolished once for all by simply stating what is the meaning–in common-sense terms–of the two principal symbols that are used in calculating.

— Silvanus P. Thompson, Calculus Made Easy

In my opinion, the biggest obstacle to understanding the dependency injection (DI) design pattern lies in knowing what a dependency is. In software, dependency can be used to refer to quite different things, making it harder to cleanly pin down a reliable definition. For example, entire libraries you use within your application (those things you install with yarn add ..., gem install ... and friends) are called dependencies too. So, are those the dependencies we’re trying to inject? Not necessarily.

If we accept a definition of dependency as a certain requirement, necessary or contingent, for a given piece of code to work, then we begin to see dependencies at all layers of a functioning system: from the lower layers of functions (as function parameters), through structures like classes (other/lower-level classes), packages/modules (other packages/modules), applications (packages, other applications), to services (applications, other services, etc.) and beyond. In my experience, at every layer a different style of dependency injection is necessary; the goal remains the same though: the dependency should be ready to use when and where necessary.

DI is usually meaningful up to the application layer. Mainly because (1) design patterns make the most sense at this layer and below, (2) when folks talk about dependency injection they have this layer in mind, and (3) in a layered/hierarchical assembly, higher units usually have little to no control over the internal organization of the lower layers they depend on1.

Injecting dependencies into functions

Dependency injection at the function level/layer has been variously simplified as function parameterization. Or perhaps I made up this term? To parameterize a function is to define unknown quantities as parameters instead of deriving them within the function’s body. Let me try to illustrate with two functions: parameterized_sum and non_parameterized_sum. Both add two integers A and B. Pay attention to how they receive A and B.

parameterized_sum(A, B) -> A+B

non_parameterized_sum() ->
  A = read(stdin, "A: ") as int
  B = read(stdin, "B: ") as int
  A+B

The difference between the two functions is quite clear: parameterized_sum outsources deriving A and B to the caller. non_parameterized_sum instead derives A and B itself by reading and parsing input from standard input. If this doesn’t look like a bad idea, consider that while non_parameterized_sum is severely constrained (to only adding numbers that were read from stdin), parameterized_sum has no such limitations: it really doesn’t care where A and B came from. A and B could have been derived from a random number generator, or from a file, or from a database table row. It really doesn’t matter to parameterized_sum. By outsourcing the initialization of A and B, parameterized_sum obtains robustness in the face of changing requirements, and robustness in the face of changing requirements is a critical quality of well-crafted code.
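The pseudocode above is language-agnostic; sketched in Rust (one possible translation, names kept the same), the two functions might look like this:

```rust
use std::io::{self, BufRead};

// `parameterized_sum` doesn't care where A and B came from.
fn parameterized_sum(a: i64, b: i64) -> i64 {
    a + b
}

// `non_parameterized_sum` is chained to stdin forever.
#[allow(dead_code)] // not called below: it would block waiting for input
fn non_parameterized_sum() -> i64 {
    let stdin = io::stdin();
    let mut line = String::new();
    stdin.lock().read_line(&mut line).expect("failed to read A");
    let a: i64 = line.trim().parse().expect("A must be an integer");
    line.clear();
    stdin.lock().read_line(&mut line).expect("failed to read B");
    let b: i64 = line.trim().parse().expect("B must be an integer");
    a + b
}

fn main() {
    // A and B could just as well come from a RNG, a file, or a database row.
    assert_eq!(parameterized_sum(2, 3), 5);
}
```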

But A and B are simple, scalar dependencies. Let’s examine a different situation where the dependencies require a slightly more complex initialization. Here’s a file lookup function. It’s called find. It retrieves a file from a given source. At the moment the sources we support are (1) database (where file content could be stored as a blob of data), (2) local file system, and (3) AWS S3. Here’s the code:

find(src, id) ->
  if src is db: database.lookup(id)
  if src is fs: filesystem.lookup(id)
  if src is s3: aws_s3.lookup(id)

Obviously we have dependencies on the database, the file system, and S3. At the moment it’s not clear where they came from—find jumps straight into using them. All three (and especially database and aws_s3) require slightly complex initialization. Take the database for example. database.lookup works if and only if a connection to the server has been established, and establishing that connection usually involves building a client, dialing the server, and running some checks/pings to confirm that the database is in a desirable state. This is all necessary work: it has to be done. The question is where and when to do it.

One option is to initialize them inside find’s body, like so:

find(src, id) ->
  database := init_db_client()
  filesystem := init_fs_client()
  aws_s3 := init_aws_s3_client()

  if src is db: database.lookup(id)
  if src is fs: filesystem.lookup(id)
  if src is s3: aws_s3.lookup(id)

Some people strongly believe that the code above is as ugly as it gets. Their go-to argument is that we’re only a few inches away from the entire codebase becoming a sprawling mess of init_db_client(), init_fs_client(), init_aws_s3_client() invocations, making it extremely difficult to figure out exactly where the first database or file system or S3 client initialization happens. The other argument is that dependencies usually form a directed acyclic graph. This means it’s critical that they’re initialized in a topologically sorted order. If there are more than a handful of dependencies, it could be a painful and error-prone task for a human. But not for a computer.
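To see what the computer’s half of that job looks like, here is a small sketch (all names hypothetical) of topologically ordering a cycle-free dependency graph, so that every dependency is initialized before anything that needs it:

```rust
use std::collections::HashMap;

// Depth-first topological sort of a dependency graph. Note: this toy
// version assumes the graph is acyclic; a cycle would recurse forever.
fn topo_order(deps: &HashMap<&'static str, Vec<&'static str>>) -> Vec<&'static str> {
    fn visit(
        node: &'static str,
        deps: &HashMap<&'static str, Vec<&'static str>>,
        done: &mut Vec<&'static str>,
    ) {
        if done.contains(&node) {
            return; // already initialized
        }
        // initialize everything `node` depends on first...
        for &dep in deps.get(node).into_iter().flatten() {
            visit(dep, deps, done);
        }
        done.push(node); // ...then `node` itself
    }

    let mut keys: Vec<_> = deps.keys().copied().collect();
    keys.sort(); // deterministic order for the demo
    let mut order = Vec::new();
    for k in keys {
        visit(k, deps, &mut order);
    }
    order
}

fn main() {
    let mut deps: HashMap<&'static str, Vec<&'static str>> = HashMap::new();
    deps.insert("find", vec!["database", "aws_s3", "filesystem"]);
    deps.insert("database", vec!["config"]);
    deps.insert("aws_s3", vec!["config"]);
    deps.insert("filesystem", vec![]);
    deps.insert("config", vec![]);

    let order = topo_order(&deps);
    let pos = |n: &str| order.iter().position(|&x| x == n).unwrap();
    assert!(pos("config") < pos("database")); // config is up before its dependents
    assert!(pos("database") < pos("find"));
}
```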

Just as we did with parameterized_sum, we can defer the initialization concern, for now, and define database, filesystem, and aws_s3 as parameters, like so:

find(src, id, database, filesystem, aws_s3) ->
  if src is db: database.lookup(id)
  if src is fs: filesystem.lookup(id)
  if src is s3: aws_s3.lookup(id)
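Sketched in Rust (with hypothetical stub dependencies), the parameterized find might look like this; note the blown-up parameter list:

```rust
// The three lookup dependencies are injected as plain functions, so `find`
// has no idea (and no opinion) about how they were initialized.
fn find(
    src: &str,
    id: u32,
    database: impl Fn(u32) -> Option<String>,
    filesystem: impl Fn(u32) -> Option<String>,
    aws_s3: impl Fn(u32) -> Option<String>,
) -> Option<String> {
    match src {
        "db" => database(id),
        "fs" => filesystem(id),
        "s3" => aws_s3(id),
        _ => None,
    }
}

fn main() {
    // Stub dependencies: trivial closures here, but they could just as well
    // wrap fully initialized database/filesystem/S3 clients.
    let db = |id| Some(format!("db file {id}"));
    let fs = |id| Some(format!("fs file {id}"));
    let s3 = |id| Some(format!("s3 file {id}"));

    assert_eq!(find("fs", 7, db, fs, s3), Some(String::from("fs file 7")));
}
```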

Technically we have achieved dependency injection, but at some cost. First, we have blown up find’s arity: it went from find/2 to find/5. I have written about keeping function arity to a maximum of two. Secondly, we haven’t solved the part where the computer, not the programmer, initializes the dependencies in a topologically sorted order. We just don’t have to do it within find’s body anymore, but it has to be done somewhere, and so far it looks like a human has to do it. But that’s if we don’t use a dependency injection framework.

Dependency injection frameworks (e.g. dagger, wire, dig) let you delegate dependency initialization to the computer. Below I’ll demonstrate a hypothetical dependency injection framework, di_framework.

di_framework allows us to restore find to its original state, where it accepted only two parameters: src and id. It provides facilities that let us inject database, filesystem, and aws_s3 into find at minimal cost. We do that by declaring a manifest of dependencies using the @di_framework::inject annotation. At runtime, di_framework will ensure that database, filesystem, and aws_s3 are properly initialized and ready to go!

@di_framework::inject(
  database,
  filesystem,
  aws_s3,
)
find(src, id) ->
  if src is db: database.lookup(id)
  if src is fs: filesystem.lookup(id)
  if src is s3: aws_s3.lookup(id)

The manifest doesn’t necessarily have to live outside the body of the function. For programming languages without annotations, the code below is an equally valid dependency declaration:

find(src, id) ->
  di_framework::inject(
    database,
    filesystem,
    aws_s3,
  )

  if src is db: database.lookup(id)
  if src is fs: filesystem.lookup(id)
  if src is s3: aws_s3.lookup(id)

Hell, the annotation name need not be inject2 either! It could be needs, or requires. The industry, dominated by English speakers, seems to have settled on inject.

And what about the initialization part? It looks like this:

@di_framework::provides(database)
init_db_client() ->
   params := get_params()
   client := build_client(params)
   healthcheck!
   client

@di_framework::provides(filesystem)
init_fs_client() ->
  params := get_params()
  fs := mount_fs(params)
  build_client(fs)

@di_framework::provides(aws_s3)
init_aws_s3_client() ->
   params := get_params()
   client := build_client(params)
   healthcheck!
   client

We annotate client initialization functions with @di_framework::provides. That’s how di_framework knows which function to call for each dependency. If, for example, we hadn’t annotated any function to provide aws_s3, we’d be in trouble since di_framework can’t honor any @di_framework::inject(aws_s3) command.

Typically, dependency injection frameworks provide more than the inject and provides annotations, especially if they have runtime behavior. For example, they can ensure that init_db_client(), init_fs_client(), and init_aws_s3_client() are invoked just once. The first time they’re invoked, the resulting object is stored in a container (some sort of dependency registry, if you will) so that satisfying a @di_framework::inject annotation is a cheap registry lookup. Other dependency injection frameworks do all their work at compile time. The net effect is that the initialization code is (1) topologically ordered, then (2) copy-pasted into all the right places. There is no global dependency registry hanging around during the lifetime of the application. Something to pay attention to when choosing a dependency injection framework: depending on your expectations, this may be undesired.
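To make the runtime-registry idea concrete, here is a toy sketch in Rust (all names hypothetical; real frameworks involve far more ceremony). Each provider runs at most once; every later injection is a cache lookup:

```rust
use std::collections::HashMap;

// A tiny dependency registry. Providers are plain functions; initialized
// values are cached so each provider runs at most once.
struct Registry {
    providers: HashMap<&'static str, fn() -> String>,
    cache: HashMap<&'static str, String>,
    calls: HashMap<&'static str, u32>, // how many times each provider ran
}

impl Registry {
    fn new() -> Self {
        Registry {
            providers: HashMap::new(),
            cache: HashMap::new(),
            calls: HashMap::new(),
        }
    }

    // the moral equivalent of @di_framework::provides(name)
    fn provide(&mut self, name: &'static str, init: fn() -> String) {
        self.providers.insert(name, init);
    }

    // the moral equivalent of @di_framework::inject(name)
    fn inject(&mut self, name: &'static str) -> String {
        if let Some(ready) = self.cache.get(name) {
            return ready.clone(); // already initialized: cheap registry lookup
        }
        let init = self.providers[name]; // panics if nothing provides `name`
        let value = init();
        *self.calls.entry(name).or_insert(0) += 1;
        self.cache.insert(name, value.clone());
        value
    }
}

fn main() {
    let mut di = Registry::new();
    di.provide("database", || String::from("database client"));

    di.inject("database");
    di.inject("database");
    assert_eq!(di.calls["database"], 1); // initialized exactly once
}
```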

Enough said about dependency injection. As with all principles, the practice can, and I believe should, be allowed to deviate from the theory. Thus, you should expect to see a wild variety of implementations, both of dependency injection itself and of its frameworks. Regardless, they should share a single animus, and I think it’s that soul that I’ve tried to capture here.

Vale!


  1. … a matrix on the n-level is represented on the n+1 level by its code. … loss of direct control over automatized processes on lower levels of the body hierarchy is part of the price paid for differentiation and specialization. — Arthur Koestler, The Act of Creation, chapter on The Ubiquitous Hierarchy.

  2. inject derives from the latin inicere with the imperative form inice. As a latinist, my preferred translation, and one that fits the purpose here, is throw into. See Wiktionary for more forms. inject as used in dependency injection is the imperative form of the verb. it appears that english derived the verb from the french injecter which has an interesting history. as it happens quite frequently in latin, verbs are backformed from supines of existing verbs. iniectare was backformed from inicere’s supine, iniectum, and took on a life of its own. french derived injecter from iniectare. later, english derived to inject from french’s injecter

A new way of thinking

Motivation

Rust was my language of the year. You know, that thing where programmers set out to learn a new programming language every year. Usually not to be productive in it, but to familiarize themselves with current trends in language design, implementation, and paradigms. I had heard a lot of good stuff about Rust and decided late last year to make it my 2018 language. I even attended the all-hands of the Rust Core Team, and participated in a few discussions. To be fair, it was in the hopes of meeting my internet friend Yehuda Katz in person. He wasn’t there.

A few days ago (it’s November already) I began serious study using both Philipp Oppermann’s Writing an OS in Rust and The Rust Programming Language Book. You’ve probably heard it said that Rust is a systems programming language. Well, an operating system is the ultimate system, so it’s as perfect a match as it gets. Plus both materials are available for free.

I’m only a few days in but I’ve been smacked by some of what I consider the best ideas in programming I’ve encountered yet. I’ve also come to grips with gaps in my knowledge, things I took for granted but shouldn’t have, and gained a better understanding of my current tools of the trade. In this post I talk about two things: the move (which I haven’t met in any language aside from Rust) and scope, a concept familiar to most programming languages.

The Move

In most programming languages, code similar to what’s below will compile and run. What we want to do has been done many times over: assign a value to a variable, bind one more variable to the same value by referring to it through the previous variable, and continue to use both variables in the same scope.

// This code accessible on the Rust playground
// at this link:
// https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=1640e708979885d7219b84ec45917d76

fn main() {
  let s = String::from("hello");
  let t = s;
  println!("{} is the same as {}", t, s); // error[E0382]: borrow of moved value: `s`
}

In every language I’ve written useful software in, this is perfectly valid. But not in Rust because, basically, it fucks up memory reclamation if you’re allowed to do that.

You see, Rust is not garbage collected. That means you allocate and free memory yourself, C-style. If that second statement was self-evident to you, then you just got smacked by one of the first falsehoods programmers believe about garbage collection. Rust isn’t GC-ed, but you don’t manually malloc and free either. Instead, heap memory is allocated and freed as you enter and leave scopes. And this is the difference between Rust and GC-ed languages: Rust frees unused memory as soon as the scope has been exited, while GC-ed languages like Go, through a sophistication that I now no longer find necessary, reserve dead objects to be pruned later.
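To see the scope-based reclamation in action, here is a small sketch using Rust’s Drop trait, the hook that runs the instant an owner goes out of scope:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// A flag flipped by the destructor so we can observe exactly when it ran.
static DROPPED: AtomicBool = AtomicBool::new(false);

struct Resource;

impl Drop for Resource {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

fn main() {
    {
        let _r = Resource;
        assert!(!DROPPED.load(Ordering::SeqCst)); // alive inside its scope
    } // `_r` goes out of scope: drop runs right here, not at a later GC pause

    assert!(DROPPED.load(Ordering::SeqCst)); // already freed
}
```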

That was a digression. Back to what The Move is, or what happened in the code above. In simple terms, s moved to t; after all, who needs two references to the same thing? Give me a good reason why you’d have two variables point to the same place in memory. I’ll wait. This is the commonsense explanation of The Move. The computer science-y explanation is: it avoids trying to free the same memory address twice when we exit the scope. Asking that the same memory location be cleared twice, accidentally, can have dire consequences: if in between the two calls the operating system offered the spot to another process, we’ve effectively made life difficult for that process. If Newton had lived long enough to make laws of allocating and freeing memory, he’d have quipped that in a given scope, the numbers of mallocs and frees are equal but opposite. In the code above there’s only one allocation, on the let s = String::from("hello") line; the let t = s line doesn’t allocate. So there can be only one free at the end of the scope, and thus s and t can’t both survive to the end of it.

The book has more details on this behavior, and which types it applies to. It’s here, with more details about the Copy trait, clone-ing, and a lesson on how to explain potentially incendiary concepts.
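And if you genuinely want two usable variables, the escape hatch the book describes is clone: pay for a second allocation so the malloc/free ledger stays balanced. A minimal sketch:

```rust
fn main() {
    let s = String::from("hello");
    let t = s.clone(); // a second allocation: two values, two owners, two frees

    println!("{} is the same as {}", t, s); // both survive to the end of the scope
    assert_eq!(s, t);
}
```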

Scope

Scope can be an elusive concept, but only if you let it be. You can get away with thinking of it as the stuff in between the curly braces (in programming languages with curly braces). In Python it’s defined by indentation; in Ruby, a mix of things. Let’s stick with the curly braces for now. Everything between the opening ({) and closing (}) braces is in the same scope, including, wait for it, other scopes. It could be scopes all the way down—it depends on how you like your sugar-free coke, really. For example, here’s an inception of scopes:

fn main() {
  // parent scope
  let s = String::from("hello");
  // len, a function, defines another scope.
  // Thus the scope of `len` is a child of
  //`main`'s scope.
  let l = len(s);
  println!("{} has length of {}", s, l);
}

fn len(s: String) -> usize {
  // child scope
  s.len()
}

The asymmetry of life applies: the child has access to everything in the parent, but not the other way round. This is everything we already know and expect of scope inceptions.

That aside, the code above has been written many times over in several languages, text editors, and color schemes. And all variations compile or work. But not with Rust. The innocent-looking code isn’t really innocent: it tries to double-spend a value. How? you ask. Well, it sold ownership of its variable s to another scope (the len function) but tries to use it immediately after. What? Sold what? you ask again. Well, that’s how scopes work. They own, and can bequeath ownership of, their resources. Thus passing s to len is a statement from the main scope bequeathing ownership of s to len. Henceforth len owns s. Which means that main has lost every right to use it (and so shouldn’t try to use it). It also means that as soon as the scope of len is over, the memory occupied by s is freed. I wasn’t expecting this. Born and bred on the jarring principles of pass by value and pass by reference, I find Rust’s ownership a breath of fresh air. Once a variable is passed to another scope, the parent scope has lost ownership and can’t use it again. I don’t even want to think in terms of variables any more; I want to think of everything as a resource. This is probably me taking it too far, but that’s what happens when you’re hit by a revelation. It changes a lot of things for you at the same time.

Any workarounds? I’d love to still use s after finding its length. No, no workaround needed. Instead of bequeathing ownership, we could lease it to len. How? len could be defined as a function that doesn’t take ownership of its arguments but rather requests a lease valid for the duration of its scope. Here’s how the new len looks (together with main):

fn main() {
  let s = String::from("hello");
  let l = len(&s);
  println!("{} has length of {}", s, l);
}

// The New & Improved `len`. Instead of taking
// ownership of its arguments, it requests a
// lease (aka a reference) and works with that.
fn len(s: &str) -> usize { s.len() }

With this change, the heap memory used by s won’t be reclaimed when len’s scope is over. In fact, in the new implementation len doesn’t allocate at all, so it has nothing to free. The allocation was in main, and so the deallocation (or freeing, or reclamation) will happen there, mainly because main never bequeathed ownership of s to anyone.

Conclusion

These concepts were pretty interesting to think about and use. And once you get the hang of them, the entire field of garbage collection, which the best minds of our generation continue to tinker with toward some desired perfection, begins to look like wasted effort. But I’d like to think otherwise and hope that there’s a significant benefit to garbage collection (over immediately freeing memory) that I just don’t know yet.

That said, I’ve learnt a few things, and as I continue to write and maintain programs in GC-ed languages I’ll see what lessons from Rust I can apply there.

The biggest takeaway is that I’ll still do my interviews in Python.

Enjoy!

Got comments or corrections for factual errors? There’s a Hacker News thread for that.