Insignificant Truths

Flattening nested JSON

Motivation

Once you start functional programming, you become alert to problems that lend themselves to a recursive solution. In my case the programming language is Elixir, and the problem in question was posed to my friend at his job interview. He was asked to flatten a heavily nested JSON object. In Python, his language of choice, that means a heavily nested dictionary.

The Problem

Flatten a nested JSON object. But first, a paragraph on JSON.

JSON is a data interchange format, popularized by the proliferation of APIs (thank you, REST). Part of its popularity is also due to the fact that it's text-based, human-readable, and has an incredibly simple specification, one that could be read and understood at the water cooler.

Here’s an example valid JSON:

'{"versions":[1],"name":"json","text":true,"creator":{"name":"Doug", "address":{"country":"USA"}}}'

Pretty printed:

{
  "creator": {
    "name": "Doug",
    "address": {
      "country": "USA"
    }
  },
  "name": "json",
  "versions": [1],
  "text": true
}

Our goal is to get rid of the nesting. For the above example, if we do a good job, we should have this result:

{
  "creator.name": "Doug"
  "creator.address.country": "USA",
  "name": "json",
  "versions": [1],
  "text": true,
}

To be fair, this is not an exciting problem. It’s not one of those designed to make you look stupid at the white board. It feels like something you’d actually do on the job.

The Recursive Approach

This is the function we’re going to fill up as we tease apart the problem using well-crafted input.

def flatten(json):
  pass

Input Type One (Empty or One-Level Deep)

{"a": "A"}

The simplest input we could be given is a JSON object that's either empty or only one level deep. In that case we just return the input as the output. Or we could build a new object and return that instead. It's not the most efficient choice, since we're unnecessarily duplicating the object, but it keeps the implementation consistent as we improve it, so let's do that.

def flatten(json):
  acc = {}
  for k, v in json.items():
    acc[k] = v

  return acc

We initialize an empty object as the accumulator and populate it as we iterate over the key-value pairs of the JSON object, generated by the items method. At the end we return our newly built object. It's a duplicate of the original input, but a separate one: it's not affected by changes to the original.

Input Type Two (Our First Encounter With Nesting)

{"a": {"b": "c"}, "b": "d"}

The end result of flattening the above input should be as below. I chose . as the separator but it can be anything that makes sense, really.

{"a.b": "c", "b": "d"}

Code solution:

def flatten(json, acc, prefix):
  for k, v in json.items():
    prefixed_k = (prefix + '.' + k) if prefix else k
    if type(v) is dict:
      flatten(v, acc, prefixed_k)
    else:
      acc[prefixed_k] = v

  return acc

It's quite a jump from handling the simple input to the slightly more sophisticated one. In the process our function has gained two more parameters (one of which we've met before) and three more lines. A couple of things, though, should remain familiar. For example, we still iterate over the items of the object to build our new object, the accumulator.

Two changes introduce the new behavior: the prefix and the type test.

First, the prefix. Since we're flattening nested objects, it's no longer safe to use a value's original key in the accumulator, as it could overwrite an existing pair or be overwritten by a later one. Prefixing the current key with all of its ancestors' keys prevents this data loss.

Second, the type test: if the value under consideration is a dict, we flatten it. The arguments to this call are the value under consideration, the original accumulator, and a prefix made up of the value's key and all its ancestors' keys. It is this call to flatten that makes the solution recursive. As it flattens the nested JSON, whenever it encounters further objects on its way it pauses to resolve them, and then continues.

The code above is all that was needed at my friend's interview. Well, not quite. The interviewer wanted a function that flattens a JSON object, but now we demand that they pass two other arguments to the function. It wasn't a disgraceful challenge, so we should be nice to them. Also, as the author of a post complaining bitterly about N-parameter functions, I owe it to you to do the right thing. So we'll bring it back down to a single-parameter function by taking advantage of one of the many niceties of Python.

In Python, a function can be defined inside another function, just like a variable. Even better, it can be called there (of course; what's the point of defining a function if it can't be used), just like an initialized variable can be used. Let's use this technique to bring our function's arity down to one:

def flatten(json):
  def _flatten(json, acc, prefix):
    for k, v in json.items():
      prefixed_k = (prefix + '.' + k) if prefix else k
      if type(v) is dict:
        _flatten(v, acc, prefixed_k)
      else:
        acc[prefixed_k] = v
    return acc

  return _flatten(json, {}, '')

This should please the interviewer. This new implementation also allows us to accept and return real JSON strings (the single-quoted stuff). To do that we do some pre-processing before calling _flatten and post-process its return value. Here’s the updated and final code:

from json import loads, dumps

def flatten(json):
  def _flatten(json, acc, prefix):
    for k, v in json.items():
      prefixed_k = (prefix + '.' + k) if prefix else k
      if type(v) is dict:
        _flatten(v, acc, prefixed_k)
      else:
        acc[prefixed_k] = v
    return acc

  json_obj = loads(json)
  return dumps(_flatten(json_obj, {}, ''))
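
To see it end to end, here's a quick sanity check of mine (not from the interview) that runs the final function on the example JSON from the top of the post. On Python 3.7+, where dicts preserve insertion order, the output looks like this:

nested = '{"versions":[1],"name":"json","text":true,"creator":{"name":"Doug","address":{"country":"USA"}}}'

print(flatten(nested))
# {"versions": [1], "name": "json", "text": true,
#  "creator.name": "Doug", "creator.address.country": "USA"}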

Caveat

Imperative languages don't do recursion well, and Python is no exception. In fact it has a hard limit on how deep you can recurse: by default it's 1000 (see $ python -c "import sys; print(sys.getrecursionlimit())"). The limit can be raised, of course, but it's there for a reason: to keep runaway recursion from crashing Python. So if you're going to move it around, please do so gently.

You know what has no such limit? The iterative solution. Its implementation is left as an exercise for the reader.

Got comments or corrections for factual errors? There’s a Hacker News thread for that.

Monitoring Go applications with Monit

Out-of-Memory

Out-of-Memory (OOM) occurs when the operating system can't find free memory to allocate. When this happens, an already running process has to be sacrificed to free up enough memory. Which process gets killed by the OOM killer depends on a time-tested heuristic that selects for a bad OOM score. The OOM score (also known as the badness score) is a cumulative statistic of how well or poorly a process uses system resources. Usually, when a process uses a lot of memory it has a high badness score, which in turn makes it a prime target of the OOM killer.

Go and OOMs

It's well-known that OOMs are tough on Go applications. Go 1.11's memory management improvements promise to keep the OOM killer at a safe distance, but that respite isn't always enough. As I write this, my long-running Go application (compiled with 1.11, of course) has a badness score of 181, which puts it right in the crosshairs of the killer. The application under consideration is a concurrent audio stream recorder and processor. At the peak of its activity it hoards a significant amount of memory [1]. This memory requirement won't change, and likewise the OOM killer won't relent. I mean, with such a reputation, the OOM killer is bound to develop an appetite for terminating my crucial but memory-intensive application. Fair.

Hence, I needed a solution that would resurrect the application after it has been killed.

The Solution

If you're familiar with Erlang or Elixir, you probably jumped ahead and said, yup, what you need is a supervisor [2]. And you're probably right.

Unfortunately Go, like most other programming languages, lacks a supervision tree in the manner of Erlang/Elixir's. You could build one, specific to your application, but I found Monit to be exactly the process monitor I needed. Besides, whatever I was going to build wasn't going to give me the assurance of the Erlang/Elixir supervisor nor the low footprint of Monit.

Monit

Monit is a small utility for managing and monitoring Unix systems. It can monitor a process and respond to specific events with pre-defined actions. Monit runs in cycles (two minutes apart by default), and during each run it figures out the state of the process. If the process is dead, it will be restarted with the start program action.

Installation

If there isn’t a package for your Linux/Unix distribution, you can follow the instructions here to download and install Monit on your system. On Ubuntu, my Linux distro, it’s as simple as

 $ sudo apt install monit 

and after a successful installation, started with

$ monit

Setup

In my case I wanted to monitor the main process of my Go application and restart it whenever it fell prey to the OOM killer. I added recorder.monit to my application root with the following contents:

check process recorder with pidfile /var/run/recorder.pid
  start program = "/etc/init.d/recorder start"
  stop program = "/etc/init.d/recorder stop"
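
As an aside, and not something in my actual setup, Monit's resource rules can also act before the OOM killer gets the chance, restarting the process once it crosses a memory threshold. A hypothetical version of the same check could read:

check process recorder with pidfile /var/run/recorder.pid
  start program = "/etc/init.d/recorder start"
  stop program = "/etc/init.d/recorder stop"
  # hypothetical: restart if resident memory stays above 1 GB for 5 consecutive cycles
  if totalmem > 1024 MB for 5 cycles then restart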

Next I symlinked recorder.monit into the directory from which Monit loads extra configuration. This means that the next time Monit (re)starts, the recorder process, whose pid is written to /var/run/recorder.pid, will be monitored. And that is indeed where the pid of my memory-intensive Go application is written. All set, but I don't reload Monit yet.

$ ln -s /path/to/recorder.monit /etc/monit/conf.d/recorder

As you can see in the Monit configuration above, I opted for an init script for starting and stopping the application. Below are the contents of recorder.sh, my init script, which I added to the root of the application files:

#! /bin/sh
set -e

### BEGIN INIT INFO
# Provides:          recorder
# Required-Start:    $local_fs $network
# Required-Stop:     $local_fs $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: AF Radio audio stream recorder
# Description:       AF Radio audio stream recorder
### END INIT INFO

export PATH=$PATH:/usr/local/bin

BASE=recorder_with_fingerprint-0.1.0
DB_PATH=/var/local/afaudio.db

LINUX_BIN=/usr/local/bin/$BASE-linux-amd64
LOGFILE=/var/log/recorder.log
PIDFILE=/var/run/recorder.pid # managed by app
RECORDER_PIDFILE=/var/run/$BASE.pid # managed by start-stop-daemon
RECORDER_DESC="AF Radio Audio Recorder"

# log_begin_msg, log_end_msg, log_warning_msg
. /lib/lsb/init-functions

# Handle start, stop, status, restart
case "$1" in
  start)
    log_begin_msg "Starting $RECORDER_DESC"
    start-stop-daemon --start --background \
      --no-close \
      --oknodo \
      --exec "$LINUX_BIN" \
      --pidfile "$RECORDER_PIDFILE" \
      --make-pidfile \
      -- \
        -pidfile "$PIDFILE" \
        -logfile "$LOGFILE" \
        -db "$DB_PATH"

    log_end_msg $?
    ;;

  stop)
    if [ -f "$RECORDER_PIDFILE" ]; then
      log_begin_msg "Stopping $RECORDER_DESC"
      start-stop-daemon --stop --pidfile "$RECORDER_PIDFILE" --retry 1

      log_end_msg $?
    else
      log_warning_msg "Recorder already stopped."
    fi
    ;;

  restart)
    recorder_pid=`cat "$RECORDER_PIDFILE" 2>/dev/null`
    [ -n "$recorder_pid" ] \
      && ps -p $recorder_pid >/dev/null 2>&1 \
      && $0 stop
    $0 start
    ;;

  status)
    status_of_proc -p "$RECORDER_PIDFILE" "$BASE" "$RECORDER_DESC"
    ;;

  *)
    echo "Usage: service recorder {start|stop|restart|status}"
    exit 1
    ;;
esac

exit 0

Next, recorder.sh is made executable, symlinked into the /etc/init.d/ directory, and added to the list of programs that should be started on boot. Then:

$ chmod +x /path/to/recorder.sh
$ ln -s /path/to/recorder.sh /etc/init.d/recorder
$ update-rc.d recorder defaults
$ systemctl start recorder

Now I’m ready to reload Monit:

$ monit reload

It's a liberating feeling to know your application will be restarted no matter how many times it goes down, and without manual intervention. Well, as long as Monit itself stays alive. As such I still check in once in a while (but less frequently) to see how both Monit and my application are doing, and usually I see statistics like this:

The Monit daemon 5.16 uptime: 2h 20m

Process 'recorder'
  status                            Running
  monitoring status                 Monitored
  pid                               12339
  parent pid                        1
  uid                               0
  effective uid                     0
  gid                               0
  uptime                            2h 20m
  threads                           6
  children                          0
  memory                            18.8 MB
  memory total                      18.8 MB
  memory percent                    0.5%
  memory percent total              0.5%
  cpu percent                       0.0%
  cpu percent total                 0.0%

Got comments or corrections for factual errors? There’s a Hacker News thread for that.

  1. I should mention that this isn't an indictment of Go's memory management; it's the nature of the application under consideration. It records audio tracks (to hard disk), then reads them into memory for processing before uploading them to S3 storage.

  2. Erlang and Elixir have a thing called the supervision tree. Roughly put, one (or more) of the processes started by your application acts as a supervisor of all child processes, and is able to bring them back up when they go down.

Strings in Go's runtime

Motivation

Consider this an explainer to Rob Pike’s blog post on Go’s strings. Written with Go programmers in mind.

In the blog post, he says without much evidence that Go's strings are implemented as slices. I guess it wasn't an oversight on his part; proving that point just wasn't the goal of the blog post. Secondly, anyone who wanted to find out for themselves had Go's code at their disposal to explore. That's what I did. And so in this post I try my best to explain how close Go's implementation of strings is to its implementation of slices.

I don't discuss string encoding; that's not in the runtime, the thing I explore. But I do discuss common string operations that mimic slices, such as the length of a string (len("go")), concatenation ("go" + "lang"), indexing ("golang"[0]), and slicing ("golang"[0:2]). To be fair, indexing and slicing are operations in their own right, which means their availability on strings has nothing (or very little) to do with the nature of strings. This is not entirely true, but please accept it, as the truth will take us into the Go compiler, through the underlying types of the so-called basic types, and back. Secondly, I don't write this post under oath.

The Nature of Strings

I'm yet to come across a programming language where strings have a different underlying memory structure than contiguous slots of memory. What this means is that the bytes of a string sit next to each other in memory, with nothing in between them. That is, if you used the famous 12-byte string hello, world in your program, and got the chance to inspect it in memory, you'd find the bytes sitting in the same row, each byte (or character) followed immediately by the next, with no empty spaces or foreign bytes in between them. As far as I know, Go doesn't deviate from this wisdom.

But this is all talk of memory, the physical stuff. At that level all things are the same and the differences between programming languages are effectively erased. So let's back up one level into runtimes; here the different languages go about their businesses differently, and this is where we find the details of how Go implements its string data type. Luckily a significant part of the Go runtime is written in Go, with, God bless the authors, extensive comments explaining both the obvious and the not-so-obvious implementation details. The runtime's implementation of string can be found here on GitHub as at the time of writing. Let's walk through it.

Go’s String

In the Go runtime, a string is a stringStruct type:

type stringStruct struct {
  str unsafe.Pointer
  len int
}

It consists of str, a pointer to the block of memory where the actual bytes are located and len, which is the length of the string. Since strings are immutable, these never change.
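
You can convince yourself of this two-word layout with a small program of my own (not from the runtime). On a 64-bit machine a string header is 16 bytes, a pointer plus an integer, no matter how long the text it points to is:

package main

import (
  "fmt"
  "unsafe"
)

func main() {
  short := "go"
  long := "a much longer string than the first one"

  // The header itself is always pointer + length (two words),
  // regardless of the size of the backing data: prints 16 16 on 64-bit.
  fmt.Println(unsafe.Sizeof(short), unsafe.Sizeof(long))

  // len reads the stored length field; it never walks the bytes.
  fmt.Println(len(short), len(long))
}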

Creating a New String

The function tasked with making new strings in the runtime is called rawstring. Here’s how it’s implemented as at the time of writing (comments in code are mine):

func rawstring(size int) (s string, b []byte) {
  // 1. Allocate memory block the size of the
  //    string and return a pointer to it:
  p := mallocgc(uintptr(size), nil, false)

  // 2. Make a stringStruct with the newly
  //    created pointer and the size of the string.
  stringStructOf(&s).str = p
  stringStructOf(&s).len = size

  // 3. Prepare a byte slice where the string's
  //    actual data will be stored.
  *(*slice)(unsafe.Pointer(&b)) = slice{p, size, size}
  return
}

rawstring returns a string and a byte slice where the actual bytes of the string should be stored; it is through this byte slice that the string's contents get written. We can safely call them data ([]byte) and metadata (stringStruct).

But this is not the end of it. This is perhaps the only time you have access to the real, non-zeroed byte slice behind the string. In fact, the comment on rawstring instructs the caller to use the byte slice only once (to write the contents of the string) and then drop it. The rest of the time, the string struct should be good enough.

Knowing this, let's look at how some common string operations are implemented. It will also become clear why good old concatenation isn't the recommended way to build large strings.

Common String Operations

Length (len("go"))

Since strings are immutable, the length of a string stays constant. Even better, we know it by the time we're storing the string, and that's what we keep in stringStruct's len field. Thus, requests for the length of a string take the same amount of time regardless of the size of the string. In Big-O terms, it's a constant-time operation.

Concatenation ("go" + "lang")

It's a simple process. Go first determines the length of the resultant string by summing the lengths of all the strings you want to concatenate. Then it requests a contiguous block of memory of that size. There's an optimization check and, more importantly, a safety check: the safety check ensures that the resultant string won't have a length that exceeds Go's maximum integer value.

Then the next step of the process begins. The bytes of the individual strings are copied one after another into the new string. That is, bytes in different locations in memory are copied into a new location. It’s unpleasant work, and should be avoided where possible. Hence the recommendation to use strings.Builder instead since it minimizes memory copying. It’s the closest we’ll come to mutable strings.
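
Here's a small, illustrative comparison of my own (not from the Go source): naive += concatenation re-copies everything written so far on every step, while strings.Builder keeps one growable buffer and copies each piece once.

package main

import (
  "fmt"
  "strings"
)

func main() {
  parts := []string{"go", "lang", " is", " fun"}

  // Naive concatenation: each += allocates a fresh string and copies
  // everything accumulated so far into it.
  s := ""
  for _, p := range parts {
    s += p
  }

  // strings.Builder appends into a single growable buffer.
  var b strings.Builder
  for _, p := range parts {
    b.WriteString(p)
  }

  fmt.Println(s == b.String()) // true
}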

Indexing ("golang"[0])

Go’s indexing operator is written as [index], where index is an integer. As at the time of writing it was available on arrays, slices, strings, and some pointers.

What arrays, slices, and strings have in common is their underlying type: in physical memory terms, a contiguous block of memory; in Go parlance, an array. For strings this is the byte slice that was returned by rawstring, where the string's contents are stored, and that's what we index into. It goes without saying that the "some pointers" I mentioned above as compatible with the indexing operator are those that point to arrays.

Note that maps use the same syntax but with different behavior: with maps, the map's key type determines what goes between the brackets.

Slicing ("golang"[0:2])

The slice operator has the same compatibility as the index operator: operand must have an array underlying type. Thus it works on the same set of types: arrays, slices, strings, and some pointers.

On strings there's a caveat. The full slice operator is [low:high:max]. In one go it allows you to create a slice and set the capacity of its underlying array. But remember that strings are immutable, so there will never be a need for an underlying array bigger than what's really needed for the string's contents. Hence the full (3-index) slice operator doesn't exist for strings.
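
A tiny illustration of mine:

package main

func main() {
  s := "golang"
  _ = s[0:2] // "go": plain slicing works on strings

  // _ = s[0:2:3] // compile error: 3-index slices are not allowed on strings

  b := []byte(s)
  _ = b[0:2:3] // fine on a byte slice, where capacity is meaningful
}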

The strings and strconv packages

Go provides the strings and strconv packages for dealing with strings. I already mentioned the more efficient Builder for building large strings; it's provided by the strings package, and there are other niceties in there. Together the two packages provide tuned functions for string transformations, conversions, comparisons, search and replace, etc. Check them out before you build your own.

Source of Confusion

cap(slice) vs cap(string)

The built-in function cap returns the capacity of a slice's underlying array. Throughout the life of a slice, the underlying array's capacity is allowed to keep changing; usually it grows to accommodate new elements. If a string is a slice, why doesn't it respond to a cap inquiry? the question goes. The simple answer: Go's strings are immutable. A string will never grow or shrink in size, which in turn means that if cap were implemented it would always be the same as len.
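
A quick sketch of mine to make the contrast concrete:

package main

import "fmt"

func main() {
  xs := make([]int, 2, 8)
  fmt.Println(len(xs), cap(xs)) // 2 8: the backing array has room to grow

  s := "golang"
  fmt.Println(len(s)) // 6
  // fmt.Println(cap(s)) // compile error: strings don't answer to cap;
  // if they did, it would always equal len(s).
}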

Got comments or corrections for factual errors? There’s a Hacker News thread for that.

ActiveRecord should ask for more than record identifier

This post is targeted at Rails developers who use ActiveRecord with, obviously, a relational database, one that allows setting functions as a column's default. PostgreSQL, for example.

What We Wanted

I took part in the discussions of an issue that led to ActiveRecord migrations accepting column default values and delegating them to a native database function. PostgreSQL already had this: for as long as I've used it, it's been possible to set the default value of a column to a function, whether one that comes with PostgreSQL, an extension, or a custom procedure. ActiveRecord just didn't have the facility to express that. Well, you could still execute your way to the desired state.

This was our past (with the mandatory down migration specification since we called execute):

class CreatePosts < ActiveRecord::Migration
  def up
    create_table :posts do |t|
      t.string :title

      t.timestamps
    end
    execute <<~SQL
      alter table posts
        alter title
        set default some_function()
    SQL
  end

  def down
    drop_table :posts
  end
end

And this is our present and future (which I love so much):

class CreatePosts < ActiveRecord::Migration[5.2]
  def change
    create_table :posts do |t|
      t.string :title, default: -> { "some_function()" }

      t.timestamps
    end
  end
end

Now it’s easier to set a database function as the default value of a column using an automatically reversible migration. Lots to love, and a proper thank you is in order to whoever worked on it.

Solving this problem led to another. If you read the comments on the issue, you'll see what I'm talking about. A new issue has been created to address it, and that is what I address in this post.

The Database As A Dumb Store

An ORM, like every other abstraction, tries to live by the true meaning of the word abstraction. It doesn't demand blind trust of you, but rather it demands that you think of your database, sorry, persistence layer (and this is the first pitfall), as, you guessed it, a persistence layer: the place where you store stuff that survives a crash, shutdown, or anything else that makes the computer lose its memory.

Unfortunately the database isn't a dumb store. As the server tasked with ensuring the storage and integrity of data, it gets to have the last word. And rightly so. It shouldn't (and doesn't) delegate this responsibility to any application (or layer) above it. It would be irresponsible and lazy of the database if it didn't perform this task.

What this means is that a couple of validations in your ActiveRecord model aren't enough to ensure data integrity. Technically, bad data could still end up in your database.

It also means that when you send a valid query to the database it might not even get the chance to run. It could be summarily aborted or heavily modified before it's applied. There's not much to worry about when the query is aborted; as far as I can tell, ActiveRecord behaves appropriately. It is when the query is modified that things can take a bad turn, because of ActiveRecord's confidence that it knows what the database will do with such a query. This is not hubris; it's all good intentions. After all, it's an abstraction, one that asks you to pay little attention, if any at all, to the underlying database, vague-ified as the persistence layer.

Pitfalls

ActiveRecord’s confidence is a pitfall. In most cases it does exactly what you want it to. You start fighting (and eventually hating) it when you have columns with default values generated by the database during insert and/or columns whose eventual value is affected by the result of a trigger.

What You Insert/Update Is Not What Is Stored

Remember when we said that the database, as part of its job to ensure data integrity, determines the fate of every query (and its values) it receives? If a query isn't aborted but run, then the eventual stored values could be very different from what was originally submitted. Let's look at two such cases.

Function Defaults

When the default of a column is a function (such as now(), gen_random_uuid(), etc.) the stored value is generated at insert time (to borrow the compile-time/runtime lingo). Function defaults are awesome. gen_random_uuid() might be the most popular, but stored procedures are easy to write, can be as complex as needed, and are tucked away in the database. Yet ActiveRecord punishes you for using a strength of the database.

Triggers

Triggers are like a Pub/Sub system. They allow you to listen to events and perform actions before or after they're applied. The staple of trigger examples is audit tables (tables that keep a history of changes to other tables). A simpler example is normalizing column values before they're stored, for example lowercasing email addresses. Where triggers usually shine is complex validations that depend on values in different tables. I've used them liberally this way and enjoyed the benefit of keeping logic close to the data it works on. If you haven't used triggers before, take a look at them. Again, ActiveRecord punishes you for using this incredible database feature.

What’s A Programmer To Do

The database has a mechanism for telling you what the eventual stored values were. It's the returning clause, which you can tack onto the end of your SQL query to specify which values should be returned as the result of the query. returning *, just like select *, returns the entire row, and these are the truest values, which is what you'd expect to get back when you've successfully inserted a new row or updated an existing one. But not with ActiveRecord.
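
For illustration only (plain SQL, not anything ActiveRecord emits): against a hypothetical users table with a function default and a lowercasing trigger, asking for the row as actually stored looks like this.

insert into users (email)
values ('  HeLLo@exaMPLe.oRg   ')
returning *;
-- the row comes back as stored: trimmed, lowercased email,
-- generated token, id, timestamps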

How ActiveRecord Punishes

Take for example the User model below. token has a database function as its default, while the email column's value is trimmed and lowercased before it is inserted or updated.

class CreateUsers < ActiveRecord::Migration[5.2]
  def change
    create_table :users do |t|
      t.string :email, null: false
      t.string :token, null: false, default: -> { "gen_random_uuid()" }

      t.timestamps
    end

    reversible do |dir|
      dir.up do
        execute <<-SQL
          create or replace function lower_trim_email()
          returns trigger as $$
          begin
            new.email = lower(trim(new.email));
            return new;
          end;
          $$ language plpgsql;

          create trigger trg_lower_trim_email
          before insert or update of email on users
          for each row
          execute procedure lower_trim_email();
        SQL
      end

      dir.down do
        execute "drop trigger trg_lower_trim_email on users"
        execute "drop function lower_trim_email()"
      end
    end
  end
end
user = User.create(email: "  HeLLo@exaMPLe.oRg   ")
puts user.email # "  HeLLo@exaMPLe.oRg   "
puts user.token # nil

What happened here? During an insert, ActiveRecord only asks for the auto-generated id back from the database. Thus, while a trimmed and lowercased email address was stored in the database, the application continues to use the original. Even worse, the user object doesn't have its token set. The workaround is to reload the model immediately after_{create,save}, and that's two queries already where one would suffice:

class User < ApplicationRecord
  after_save :reload

  # If you only have default values and no triggers
  # that affect updates then use after_create instead.
  # after_create :reload
end

During an update, you can forget about what your triggers did to the values; there is no RETURNING clause, not even for the specific values that were changed. This can produce a conflict where you work with one value in your application but have another in the database. In the example above, it does.

The ActiveRecord Way

If immediately reloading models after save bothers you, then don't use database functions as default values, and don't use triggers either. Embrace ActiveRecord fully and do as it expects: set the defaults in the model, and transform the attribute values (like trimming and lowercasing) before persisting.

I look forward to what the resolution of this new issue will bring. It's more complex than just writing the code to attach RETURNING * to the generated insert or update query. What I wish for is a directive on a model (in the manner of abstract_class) that marks it as one relying on database features such as those described above.

Got comments or corrections for factual errors? There’s a Hacker News thread for that.

Using environment variables

Software, especially a web application, is usually considered to run in an environment, and the difference between any two environments is defined by environment variables. They could define everything from the database(s) to connect to, through which API keys to use (e.g. livemode vs testmode), to log levels. During startup, the application figures out which environment it is running in and configures itself appropriately. After the application is up and running, there's barely any good reason for it to try to figure out its running environment again.

That is how we've used environment variables in the code base I work on. But we have also used some of them to influence logic in different places. This is a story of how that made life difficult for us during tests, and of the two solutions we came up with to address it.

Case

Before the application starts, we load all environment variables (they are OS-level environment variables, to be specific), cast them to the appropriate types, and set the necessary defaults. Any part of the code that needs them imports the environment variables module. Nice and clean. A simplified example:

import * as env from 'config/env'
function processTransaction (id: string) {
    const transaction = getTransaction(id)
    const result = process(transaction)
    if (env.rewardCustomer) {
        const customer = getCustomer(transaction.customer)
        rewardCustomer(customer)
    }
    return result
}

All well and good, until we have to test. To cover both cases, we have to test the processTransaction function once with rewardCustomer as false and again with it as true. This essentially means that we can't pin it to one value in the test environment. Taking the whole test suite into consideration, it is likely unwise to set it to any value at all. That's when we came up with our first approach: make it possible to overwrite some environment variables when running certain test cases. We add a new runWithEnvAs function to the environment variables module. Its implementation is similar to this:

// `store` is the module's internal map of parsed environment variables.
function runWithEnvAs (envOverwrites: object, func: () => any) {
  const original = {}
  // resets environment variables to their originals.
  const reset = () => {
    Object.entries(original).forEach(([k, v]) => {
      store[k] = v
    })
  }
  // overwrite.
  Object.entries(envOverwrites).forEach(([k, v]) => {
    if (store.hasOwnProperty(k)) {
      original[k] = store[k]
      store[k] = v
    }
  })
  // now, run the function.
  try {
    func()
  } finally {
    reset()
  }
}

During tests, one could safely overwrite any environment variable as such:

import * as env from 'config/env'
describe('processTransaction', () => {
  env.runWithEnvAs({rewardCustomer: false}, () => {
    it('does not reward customer', () => {...})
  })
  env.runWithEnvAs({rewardCustomer: true}, () => {
    it('rewards customer', () => {...})
  })
})

Complication

On the surface, the problem was gone. We could now set and unset variables as and when we wanted, and we could test the behavior of any function given any value of an environment variable. But there was still some discomfort. First, we had to introduce the runWithEnvAs function into the environment variables module only for it to be used during tests. Not that anyone will, but it could definitely be used in other environments as well. Secondly, tests that require the overwrites are ugly to write and look very out of place. There is no explanation for the specific overwrites: why is rewardCustomer overwritten? A few days into using the new runWithEnvAs, we realized we had conjured what looked like a clever (or stupid, depending on who you ask) but dangerous trick to hide a major dependency of the processTransaction function. Rewriting a few more test cases drove the point home. Sure, there was code that used environment variables but didn't have to be tested for different values of them, but it too could benefit from whatever we eventually arrived at.

Solution

The fix was quite simple: update all functions that use environment variables implicitly to accept an explicit argument. How this wasn't the first thing that came to mind is a testament to our ability to overthink problems, or rather, to our inability to step back and ask the right questions. Here, the question we asked ourselves was: how do I run a given test case with a preferred value for a given environment variable? I honestly don't know what the better question would have been. Here's processTransaction and its tests after the change:

type ProcessOpts = {rewardCustomer: boolean}
// Destructure the option under a new name so it doesn't shadow the
// rewardCustomer function called below.
function processTransaction (id: string, {rewardCustomer: shouldReward}: ProcessOpts) {
  const transaction = getTransaction(id)
  const result = process(transaction)
  if (shouldReward) {
    const customer = getCustomer(transaction.customer)
    rewardCustomer(customer)
  }
  return result
}
// Tests.
describe('processTransaction', () => {
  it('rewards customer', () => {
    processTransaction(id, {rewardCustomer: true})
    ...
  })
  it('does not reward customer', () => {
    processTransaction(id, {rewardCustomer: false})
    ...
  })
})

In essence, it was a classic application of the dependency injection principle, which we had already applied at the service level. I had never thought of it at such a low level, but getting rid of implicit dependencies and asking callers to pass an argument helped us clean up the code significantly. After the refactor, we were able to get rid of all environment variables in the test environment except those necessary for starting the application, such as database URLs, API keys, and log levels. It is not very often that we stumble on an application of a well-known principle in the remotest of places. I left this encounter asking myself which other principles I expect to see in the large but which have even saner implications at the very small scale. We're always learning.

Function parameters

Motivation

This post is brought to you by one part frustration, one part discovery, and another part a plea to reconsider how we create functions and APIs going forward.

Modern Functions

A modern function, the type you see in modern high-level programming languages such as JavaScript, Python, Ruby, Go, and PHP, has its origins in C: it accepts input (also known as arguments) and returns an output (or maybe not). Below is a simple Go function which accepts two integers as input and returns their sum (shrouded in some mandatory Go ceremony):

package main

import "fmt"

func main () {
  total := sum(1, 2)
  fmt.Printf("1 + 2 = %d", total)
}

func sum(a, b int) int {
  return a + b
}

Before we get any further, some definitions.

A function's arity is its number of parameters. Our sum function above has an arity of 2, written as sum/2 in the name-arity notation. Even though parameters and arguments are used interchangeably, a function is defined with parameters but called with arguments. Referring, again, to our sum function above, a and b are its parameters, while the call sum(1, 2) in main has 1 and 2 as arguments. Parameters are constant; arguments change. Function and method are also used interchangeably. In this post, I differentiate between them as such: a method is a function defined on a receiver. In imperative programming the receiver is usually an instance of a class.

With definitions out of the way, let’s address the biggest challenge of function parameters.

Function Parameters Are Positional

What does this mean?

During a function’s definition its parameters are given in a certain order. That order is a decree to future callers. If and when they use the function they should (1) pass the necessary arguments, but more importantly (2) in the right order. This so-called right order is fixed and non-negotiable. It’s in this stubborn rigidity that the problem lies. I’ll illustrate.

Take for example, exp, a nifty Python function for exponentiation. It calculates the e-th exponent of a base. exp is defined as follows:

def exp(e, b):
  """
  Returns the e-th exponent of b.
  To find the 5th exponent of 2, exp
  is called as such: exp(5, 2).
  """
  return b**e

Does the order of the parameters make sense to you? If your definition of exponentiation is a base b raised to a power e, you'd intuit that a function that does exactly that would be defined as such:

def exp(b, e):
  """
  Returns the e-th exponent of b. That is,
  b raised to the power e. To calculate 2
  raised to the power 5, exp is called as such:
  exp(2, 5)
  """
  return b**e

We agree on the implementation of the function. We agree that the base and exponent should be parameters so that we can use it on a variety of inputs, but we disagree on the order of the parameters, and if your version isn’t chosen then you’d have to live with the dissonance, constantly referring to the source code or documentation to learn the order of arguments. We’ve made a parameter’s position important and subjective to the programmer. In my opinion, this is not good.

We're lucky here, though. exp has an arity of two, so there's a 50% chance we'll guess the order right. An arity of 3 reduces that to about 17%. One more parameter and you can't rely on your intuition anymore. See for yourself:

You can put your intuition to the test here. Below are input fields which collect arguments for a function that prepares a sweet bio for your CV based on your first name, last name, Twitter, and GitHub handles. Can you guess which input field corresponds to which argument?

Make Me a Bio

Did you give it a try? Could you figure out the order of the parameters? God is with you if you did. Otherwise this is what probably happened: You thought the first input was first name, closely followed by last name. You were probably uncertain of the order of the Twitter and GitHub handles, but you found a (false) clue in the description of the function and thought it was Twitter first, followed by GitHub. But reality didn’t match expectation and so after a couple of tries you probably gave up. You subconsciously acknowledged that some form of documentation was necessary. “How is anyone expected to use the function without it?” you ask.

Did you give up? Did you peek under the hood for help? Did you find it there? Did you feel miserable? If you did, you’re not alone. Anybody who has tried to use a date/time function feels the same. And they usually have an arity of 6 and above. Can you imagine?

To date, figuring out the order of a function’s parameters is the number one reason I look at its documentation. In my opinion, referring to documentation or source code for no other reason than to learn the order of a function’s arguments is unacceptable. It’s a fixable problem, and we should actively work to fix it.

The Fix

We have seen that beyond three parameters our ability to intuit the order deteriorates beyond repair. Anything lower than three and our intuition is good enough. For example, we can guess with 100% accuracy the order of arguments to zero- and single-parameter functions, and we don't have to consult any documentation. At two, our accuracy drops to 50%. Trial and error makes sense here: if one ordering doesn't work, the other will. Again, we avoid another trip to the documentation or source code in search of the right order.

How do we put this discovery to use? Can we still make powerful and useful functions if we set maximum arity to two?

I’m glad you asked. The answer is yes. I argue in favor of zero, one, and two arity functions below.

Zero-Parameter Functions (fn/0)

In imperative (or mutative) programming, zero-parameter functions can be achieved with instance methods. Below is a class that represents a document. It has two methods, encrypt and publish, which take zero arguments but are able to achieve their goals because they have access to the internal state of the instance:

class Document
  # NOTE: We will fix this definition of initialize in the
  # next section when we make rules for one-argument functions.
  # All hope is not lost.
  def initialize(title, content, author, publisher, published, free)
    @title     = title
    @content   = content
    @author    = author
    @publisher = publisher
    @published = published
    @free      = free
  end

  # Returns an encrypted version of the document.
  # Please note: This is not how to encrypt.
  def encrypt
    @content.reverse
  end

  # Makes the document available on the interwebz.
  def publish
    @published = true
  end
end

Zero-parameter functions are meaningless in functional programming.

Single-Parameter Functions (fn/1)

Consider Elm, the delightful language for reliable web apps. As at the time of writing, all functions in Elm are defined with one and only one parameter. This is a non-negotiable and intentional constraint. Haskell too. These languages are able to achieve this because they rely heavily on (lightweight) types. They’re typeful languages. Composite types! Types, with their named fields, erase the necessity of positions.

Such functions are a pleasure to document and to talk about in conversation, and an even greater pleasure to use. Take for example a hypothetical send_email/1 function which sends, you guessed it, emails.

All we need for its documentation, as far as parameters are concerned, is: send_email/1 takes an email (with email hyperlinked to its documentation). Otherwise things get kind of messy: send_email/8 takes the following parameters in this order: from, to, reply_to, cc, bcc, subject, text_body, html_body. Depending on who should be cc-ed or bcc-ed and whether or not you want both a text and an HTML body, you're left with sawtoothed calls such as the one below.

send_email(
  "from@mail.com",
  "to@mail.com",
  "",
  [],
  [],
  "Subject",
  "Text Body",
  ""
)

As you can see, exploding the email type and passing its fields as individual arguments to send_email/8 introduces unnecessary overhead. It doesn't make for great conversation either.

Now, I said lightweight types because I want to exclude classes like the ones we get in Ruby, Python, Java, etc. They are not lightweight, either to create or to chug along: an empty Ruby class is already bogged down by about 60 methods and attributes. That heavy, it doesn't make sense to create classes just to be used as function arguments. Python's namedtuple comes close to a lightweight type. I've used Elixir's and Go's structs with a lot of delight; they are lightweight composite types that are fit for the purpose of single-parameter functions. We need something similarly lightweight in Ruby.
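
A rough sketch of the namedtuple route, for the curious (the Email shape and field names here are mine, not from any real library; the defaults keyword needs Python 3.7+):

from collections import namedtuple

# A lightweight, named composite type standing in for eight positional parameters.
# `defaults` applies to the rightmost fields, so only from_ and to are required.
Email = namedtuple(
    "Email",
    ["from_", "to", "reply_to", "cc", "bcc", "subject", "text_body", "html_body"],
    defaults=["", (), (), "", "", ""],
)

def send_email(email):
    # Fields are read by name, so argument order no longer matters to the caller.
    print(f"Sending '{email.subject}' from {email.from_} to {email.to}")

send_email(Email(from_="from@mail.com", to="to@mail.com", subject="Subject", text_body="Text Body"))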

In the absence of lightweight, named data structures to pass to our functions, we should turn to classes and methods. They can do a good job. For example, send_email/8 easily becomes send/0 on an Email class. With chain-able implementations of from, to, subject, etc., this beauty is a possibility:

email = Email.new
email.
  from("a@b.c").
  to("d@e.f").
  bcc(["g@h.i"]).
  subject("hey").
  text_body("text").
  html_body("html").
  send()

Empty initialization, attributes set with chain-able methods. Come to think of it, this gives us better API stability: when we don’t initialize new instances with arguments but set and unset them via methods we can easily add new attributes to the class without breaking existing code. A maxim is in order: Initialize empty, build later.

But maps, I hear you say. Yes, hashmaps or objects or dictionaries or associative arrays cut it. With them we don’t have to worry about order anymore. I’ll take them over sawtoothed calls. I wish maps could be typed though.

I consider variadic functions as single-parameter too. For example, Go rewrites variadic arguments to a single slice argument. More importantly, order doesn’t matter, which is what we’re ultimately striving for.

Types and classes get us far. Little, tiny, specific, special-purpose classes can make it possible not to exceed a single parameter. Use them liberally to achieve this goal. Pass a data structure to the function; it's most likely all you need. Or add another method to the class.

Two-Parameter Functions (fn/2)

These are very rare cases, very special functions: functions for comparisons, swaps, diffs, every function that needs two and exactly two things, usually of the same type. More than two and we can use a list and a single-parameter function instead.

Another special group of functions in this category is what I call applicators. These are functions that apply other functions to their arguments. Think of map, reduce, and filter on an iterable data type. Their imperative cousins can still remain single-parameter.

Another group is registrar functions. They usually register callbacks. The first argument is a known, predictable event. The second argument is usually a function to call or notify. Very popular in Pub/Sub systems such as event handling (see DOM’s JS API).

These special-purpose functions enable extensibility. I think applicators are a brilliant idea. If your function takes 2 arguments could it be for the reason of extensibility? Shouldn’t the second argument be a callable?

Of course there's always that one function that lies outside all patterns. You're unlucky if you have to write it. All I can do is wish you well. I hope it doesn't become your norm.

OK, I'll end here. This rant is already longer than it should have been. I've been unhappy at my work lately and found this a good way to vent my frustrations. But I like how it turned out. I hope that in a follow-up post I'm able to articulate a few rules I personally follow for making delightful functions. And I hope you'll find them useful.

Closing Thoughts

I’ll leave you with this closing statement. It’s often said that code is written to be read by humans. The sequel to that is, and functions are created to be used by humans.

Next time you build an API, keep this advice in mind. Be considerate of, first, your future self, and then your users. Intuition is a good thing; reinforce it! Above all, create and use types, and keep function parameters to a maximum of two. As I tried to show, it's very possible. Colleagues, strangers on the internet who find solace in your work, your future self, and, more importantly, your ongoing self will be thankful.

Got comments or corrections for factual errors? There’s a Hacker News thread for that.

Go-inspired HTTP clients in JavaScript

APIs, particularly HTTP APIs, are the lifeblood of most software applications. Building APIs is a pretty decent enterprise, and so is using them. In this post I'll talk about a pattern I've established for building integrations with external APIs. Before I proceed, I should say that sometimes the developers of an external API are generous enough to develop SDKs to go with it. There are a lot of languages, though, and maintaining SDKs in all of them can be tedious. I primarily use JavaScript (via TypeScript), and there's almost always an SDK for the APIs I want to use. So why reinvent the wheel? For a couple of reasons:

  1. It keeps dependencies to a minimum. SDKs are built for the entirety of the API. In my experience, we often use less than 20% of an API’s offering, and in a very specific way.
  2. If coding style (enforced either by team or programming language) means anything to you (and your team), cranking out your own integration might be your best bet at keeping a consistent style.
  3. Typing: TypeScript adoption is still in its infancy. The API's SDK might not ship with type declarations.

I take a lot of inspiration from the way Go's HTTP tools are built, specifically the idea that the client is configured and ready to take a request, perform it, and return a result. I deviate slightly: my client is configured for a specific API, takes a request, performs it, and sets the response on the original request object. Let me explain the different parts and the choices I made.

The client

Specimen.

At the core of it, the client configures a generic HTTP client to be very specific to the API at hand: a base URL, an authentication/authorization scheme, and a content type. The client has a method called do, which performs the request. Before firing off the request, though, it ensures that the necessary authentication/authorization is acquired. In the example client above, authentication is a separate call, with the result permanently added to the default headers. Sometimes it's a query parameter. At other times it's a certain property that should be set on all payloads sent to the API. In short, the client should know how to take care of authentication/authorization so that individual requests are not burdened.

Go also calls its HTTP client's executor method Do. As you can tell, the inspiration isn't restricted to design philosophy.
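
The real specimen is linked above; what follows is a stripped-down, hypothetical sketch of the shape I'm describing. The names (Client, ApiRequest, the bearer-token authentication) are illustrative, not lifted from the specimen, and it leans on the global fetch API to stay dependency-free.

// The interface the client expects every request to satisfy.
interface ApiRequest {
  method: 'GET' | 'POST' | 'PUT' | 'DELETE'
  to: string                      // endpoint path, e.g. '/charges'
  body?: Record<string, unknown>  // payload or query parameters
  response?: unknown              // set by the client after the call
}

class Client {
  private headers: Record<string, string> = { 'Content-Type': 'application/json' }

  constructor (private baseUrl: string, private apiKey: string) {}

  // Acquire whatever credential the API wants and remember it in the
  // default headers, so individual requests are not burdened.
  private async authenticate (): Promise<void> {
    if (this.headers['Authorization']) return
    this.headers['Authorization'] = `Bearer ${this.apiKey}`
  }

  // The Go-inspired part: one method that knows how to perform any request
  // and sets the decoded response back on the request object.
  async do (req: ApiRequest): Promise<void> {
    await this.authenticate()
    const res = await fetch(`${this.baseUrl}${req.to}`, {
      method: req.method,
      headers: this.headers,
      body: req.method === 'GET' ? undefined : JSON.stringify(req.body)
    })
    req.response = await res.json()
  }
}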

Requests

Specimen.

Requests are classes that implement an interface defined by the client, for the benefit of its do method. A request should have a body property, which can be used as the request's body or query parameters; a method property for the HTTP method to use; and a to property, which specifies the endpoint path for the request. If a request has all three properties, then the client can successfully attempt an API call.
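
Continuing the hypothetical sketch from the client section (again, the names are mine, not the specimen's), a concrete request might look like this:

// One small class per API call: it carries the method, path, and payload,
// and receives the decoded response from the client's do method.
class CreateChargeRequest {
  method = 'POST' as const
  to = '/charges'
  body: Record<string, unknown>
  response?: unknown

  constructor (params: { userName: string, authToken: string, amountUsd: number }) {
    // The client serializes whatever ends up in body; snake_casing happens later.
    this.body = params
  }
}

// const req = new CreateChargeRequest({ userName: 'User Name', authToken: 'authToken', amountUsd: 100 })
// await client.do(req)
// req.response now holds the decoded JSON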

The chance to keep a consistent coding style, or rather, a consistent variable/attribute naming convention, happens here. In most JavaScript projects I've worked on, variable names are spelled using camel-case. Unfortunately, this convention is thrown out the door when dealing with external input coming into our system. While the JSON data exchange format has its roots in JavaScript, snake-case (and sometimes dash-case, or whatever it's called) has become the de facto way to spell attribute names. It leads to code like this:

createCharge({
  user_name: 'User Name',
  auth_token: 'authToken',
  amount_usd: 100
})
.then(data => {
  if (data.charge_status === 'created') {
    console.log(`${data.charge_id} for ${data.amount_usd} has been created.`)
  }
})
.catch(console.error)

This problem isn’t solved by SDKs either. For example, Stripe’s Node SDK sticks to the snake-case spelling style. In the case of Stripe though, I’m willing to budge. It’s a complicated API.

The design of the request class is such that the params can be named and spelled differently from what is eventually assembled into the body. This gives us an opportunity to get rid of all snake-case spellings. In the example above, I use lodash's snakeCase module to convert attribute names. When the JSON response is received, before setting it on the request, the keys are transformed (at all levels) into camel-case using, once again, lodash's camelCase module, and a weak or strong type is enforced, depending on what you need the type for. The example above becomes:

createCharge({
  userName: 'User Name',
  authToken: 'authToken',
  amountUsd: 100
})
.then(data => {
  if (data.chargeStatus === 'created') {
    console.log(`${data.chargeId} for ${data.amountUsd} has been created.`)
  }
})
.catch(console.error)

Very consistent with our camel case convention.
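
The key transformation itself is a small recursive helper. Here is a sketch of mine for the camel-casing direction (the snake-casing one is its mirror image), built on lodash's camelCase:

import camelCase from 'lodash/camelCase'

// Recursively rename the keys of a decoded JSON value to camelCase.
// Arrays are walked element by element; primitives pass through untouched.
function camelizeKeys (value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(camelizeKeys)
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {}
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[camelCase(k)] = camelizeKeys(v)
    }
    return out
  }
  return value
}

// camelizeKeys({ charge_id: 'ch_1', amount_usd: 100 })
// => { chargeId: 'ch_1', amountUsd: 100 }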

Metrics

As time goes on, knowing how things fail and how long they take to complete or fail will become a business requirement. We had a similar request for the external APIs we use, and in our case it was easy to add instrumentation: all the code went into the client's do method. Since actual requests were instances of classes, instance.constructor.name gave us a good name we could use in our metrics. There are quite a few good opportunities that open up with this design philosophy.

The Bad

That said, there are quite a few pitfalls to be aware of. The first and most important is that it might be a laborious task, and depending on your commitment (both short and long term, to both the API and the code), it might not be worth it. Sometimes the work required cannot be automated away. Some APIs have crazy naming conventions; in my experience, a few of them have mixed snake-case, camel-case, and whateverthisstyleiscalled. Dealing with them might not be an exciting enterprise.

P.S. Both specimens here are the first public examples from when we started to develop SDKs using the pattern described here. There have been a few improvements since they were first made. For example, where it's up to me, I no longer use axios and instead opt for request.

Thank you.