Immutability

— Christopher R. Genovese

Plan for Today #

  • Immutability Case Study: Zippers
  • Design Principles and Practices (next time)

Note: The next assignment is on zippers; upcoming assignments focus on different notions of derivative.

The Algebra of Types #

Just a quick snapshot, but hopefully an evocative one.

Basic Types #

Void
The type with no realizable values; maps to 0.
Unit
The type with only one value; maps to 1.
Boolean
The type with two values (True and False); maps to 2.
Product types
Tuples and records, Cartesian product; maps to product
type Pair a b = Pair a b
type Triple a b c = Triple a b c
...

We write tuple types in literal form: (a, b), (a, b, c), …

Sum types
Alternatives, includes enums, disjoint union; maps to sum
type Color = Red | Green | Blue
type Either a b = Left a | Right b
type Maybe a = None | Some a
Exponentials
Functions; maps to exponentials

How many functions of type Boolean -> Pair Boolean Boolean?
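One way to check an answer to this question is brute-force enumeration. The sketch below is illustrative only; it assumes we model Boolean as Python's `bool`, Pair Boolean Boolean as a 2-tuple, and a function as a lookup table (a `dict`) from inputs to outputs.

```python
from itertools import product

booleans = [False, True]
# The 2 * 2 = 4 values of Pair Boolean Boolean.
pairs = list(product(booleans, repeat=2))

# A function Boolean -> Pair Boolean Boolean is fully determined by
# the output it assigns to False and the output it assigns to True.
functions = [
    dict(zip(booleans, outputs))
    for outputs in product(pairs, repeat=len(booleans))
]

num_functions = len(functions)
# Matches the exponential rule b^a with b = 4 and a = 2.
assert num_functions == len(pairs) ** len(booleans)
print(num_functions)  # 16
```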

Numbers      Types
0            Void
1            Unit
a + b        Either a b
a * b        Pair a b
2 = 1 + 1    Boolean
1 + a        Maybe a
b^a          a -> b

Puzzle: Show that a + a and 2 * a are isomorphic. What does this mean?

Puzzle: Show that a * b and b * a are isomorphic. What does this mean?
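As a hint at what "isomorphic" means here: two types are isomorphic when there is a pair of functions between them that undo each other, so no information is lost in either direction. The following is a minimal sketch, not the only encoding. It assumes Either a a is modeled as a tagged tuple `("Left", x)` or `("Right", x)`, and 2 * a as a tuple `(flag, x)` with a Boolean flag; the function names are made up for illustration.

```python
def either_to_pair(e):
    """a + a -> 2 * a: the tag becomes a Boolean flag."""
    tag, value = e
    return (tag == "Right", value)

def pair_to_either(p):
    """2 * a -> a + a: the Boolean flag becomes a tag."""
    flag, value = p
    return ("Right" if flag else "Left", value)

def swap(p):
    """a * b -> b * a; applying it twice gives back the original pair."""
    x, y = p
    return (y, x)

# Round trips in both directions return what we started with.
eithers = [("Left", 7), ("Right", 7)]
flagged = [(False, 7), (True, 7)]
round_trip_either = [pair_to_either(either_to_pair(e)) for e in eithers]
round_trip_pair = [either_to_pair(pair_to_either(p)) for p in flagged]
```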

Recursive Types #

Types that are defined in terms of themselves.

Common examples:

type List a = Nil | Cons a (List a)
type BinaryTree a = Leaf | Branch (BinaryTree a) a (BinaryTree a)
type Tree a = Node a (List (Tree a))

Look at List first. We can translate these three types, respectively, into the equations

x = 1 + a * x
  = 1 + a * (1 + a*x)
  = ...
  = 1 + a + a*a + a*a*a + ...
  =? 1/(1 - a)

x = 1 + a x^2
  = 1 + a (1 + a x^2)^2
  = 1 + a + 2 a^2 x^2 + a^3 x^4
  = 1 + a + 2a^2 + 5a^3 + 14a^4 + 42a^5 + ...

x = a * (1 + x + x^2 + x^3 + ...)
  = ...

Note: the coefficients in the binary tree example are the Catalan numbers (A000108) which (among other things) count the number of ways to build a binary tree out of a given number of nodes.
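The binary tree coefficients can be checked numerically by repeatedly substituting the equation x = 1 + a x^2 into itself, keeping only terms up to a fixed degree. This is a rough sketch under that assumption: power series in a are represented as plain coefficient lists, and the helper names are hypothetical.

```python
def multiply_truncated(p, q, n):
    """Multiply two coefficient lists, keeping only degrees 0 .. n-1."""
    out = [0] * n
    for i, p_i in enumerate(p[:n]):
        for j, q_j in enumerate(q[:n - i]):
            out[i + j] += p_i * q_j
    return out

def binary_tree_series(n):
    """First n coefficients of the solution of x = 1 + a * x^2."""
    x = [0] * n
    # Each substitution pins down one more coefficient.
    for _ in range(n):
        x_squared = multiply_truncated(x, x, n)
        x = [1] + x_squared[:n - 1]   # 1 + a * x^2 (multiplying by a shifts by one)
    return x

coefficients = binary_tree_series(6)
print(coefficients)  # [1, 1, 2, 5, 14, 42]
```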

Immutability #

We’ve talked several times about the benefits of immutability. Maintaining mutable state couples different parts of the code, makes it difficult to reason about the code, and makes it much easier to introduce bugs. (“Spooky action at a distance.”)

Two powerful strategies:

Persistent data structures
allow reasonably efficient updates to data structures without changing the objects of existing references.

Bonus: allow fast transient mutable changes where a single reference is guaranteed.

Ownership management
ensure (at the language level) that there are not two references to any object/value when mutation can take place. (Shared read-only references are ok.)

As a design principle, immutability is a good one to strive for, whatever the programming style. There are always trade-offs, and one needs to find a balance.

Zippers: Derivatives of Types #

Zippers are cursors into immutable data structures that allow navigation and modification near the current focus. We will also see that we can think of zippers as a derivative of an algebraic data type.

A familiar example: current working directory in a terminal. You move through the file system maintaining a current focus (current directory) and can look at the files near your current focus, make changes, move up and down as desired. This is – at least morally – the structure of a zipper for trees.

Motivating Example #

Consider a simple binary tree and suppose we want to navigate through the tree examining and maybe modifying data as we go. How can we do this?

  • Pointer shenanigans (mutable, extra structure needed)

  • Path copying: if we update something in the tree, replace everything that points to it (but as little as possible).

    This gives a new, updated tree without changing the original.

Pictures (standard pointer tree, parent and bidirectional trees, path copying)
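Path copying can be sketched in a few lines. The code below is an illustration, not a prescribed implementation: it assumes an immutable binary tree where `None` stands in for Leaf, and a hypothetical `update_at` function that takes a path of "L"/"R" steps from the root. Only the nodes along that path are rebuilt; every other subtree is shared with the original.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Branch:
    left: Optional["Branch"]    # None plays the role of Leaf
    value: int
    right: Optional["Branch"]

def update_at(tree, path, new_value):
    """Return a new tree with the value at `path` replaced; `tree` is untouched."""
    if tree is None:
        raise ValueError("path runs off the tree")
    if not path:
        return Branch(tree.left, new_value, tree.right)
    step, rest = path[0], path[1:]
    if step == "L":
        # Rebuild this node and the left spine; share the right subtree.
        return Branch(update_at(tree.left, rest, new_value), tree.value, tree.right)
    if step == "R":
        return Branch(tree.left, tree.value, update_at(tree.right, rest, new_value))
    raise ValueError("steps must be 'L' or 'R'")

old_tree = Branch(Branch(None, 1, None), 2, Branch(None, 3, None))
new_tree = update_at(old_tree, "L", 10)
```

Here `new_tree.left.value` is 10 while `old_tree.left.value` is still 1, and `new_tree.right` is the very same object as `old_tree.right`.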

Let’s extend this idea to a data structure that maintains

  • a current focus
  • a context around that focus allowing us to move, update etc

Picture: (focus, context) pair in a tree; pointers along path to root from context are reversed; context is in place of the parent node of the focus, points to sibling and up, acts like a recursive context.

With this in mind, a zipper can be viewed as a version of the original data structure with a hole. The focus and context are associated with the hole, and as we move or update we change where/what the hole is.

List Zippers #

What would a zipper for a Pair a a look like? Well, we can have a hole in either slot, which means we keep either the first or the second element. (We still have a focus and context.) So, Zipper (Pair a a) looks like a | a. In algebraic terms, the zipper for a^2 is 2 a. Hmmm… looks like a derivative.

Now consider lists.

An evocative notion: The algebraic expression for List a is 1 + a + a^2 + ... = 1/(1 - a).

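The derivative of this expression is 1/(1 - a)^2, suggesting that the derivative of List a is (List a, List a)! The pair describes the context around a hole: the elements to its left and the elements to its right. Let’s see what that means about zippers.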

What does it mean to see this as a derivative? First, it gives us a clue on how to construct these things. Second, it expresses the Zipper as a difference between the current structure and a possible future structure, giving a sense of its meaning.

Let’s see how this manifests for lists.

A zipper provides a focus on one element of the list and the context about the surrounding contents that allow us to move, update, et cetera.


x_1 -> x_2 -> x_3 -> x_4 -> x_5 ...

             focus
               v
x_1 <- x_2 <- x_3 -> x_4 -> x_5 ...
            ^     ^
         left     right
            context
type ListZipper a = record ListZipper where
                        focus : a
                        lefts : List a
                        rights : List a

Puzzle: Sketch or implement

zipperFrom : List a -> ListZipper a
toList : ListZipper a -> List a
focus : ListZipper a -> a
left : ListZipper a -> Maybe (ListZipper a)
right :  ListZipper a -> Maybe (ListZipper a)
replace : a -> ListZipper a -> ListZipper a
edit : (a -> a) -> ListZipper a -> ListZipper a

There are other things we might do. Feel free to change the argument order (or add arguments) as you see fit. For instance, we might want edit to take an optional list of extra arguments in addition to the node. We might also want a pipeline for a sequence of commands.
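One possible sketch of these operations in Python follows. It is a starting point under several assumptions, not a reference solution: lists are modeled as tuples, `lefts` is stored nearest-first (that is, reversed), `None` stands in for the Maybe type's None, and `zipper_from` returns `None` for an empty list, which departs slightly from the signature above. Tuple slicing copies, so moves here are not constant time as they would be with a linked list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ListZipper:
    focus: object
    lefts: tuple    # elements left of the focus, nearest first
    rights: tuple   # elements right of the focus, nearest first

def zipper_from(xs):
    """Focus on the first element; None if there is nothing to focus on."""
    if not xs:
        return None
    return ListZipper(xs[0], (), tuple(xs[1:]))

def to_list(z):
    return list(reversed(z.lefts)) + [z.focus] + list(z.rights)

def left(z):
    if not z.lefts:
        return None
    return ListZipper(z.lefts[0], z.lefts[1:], (z.focus,) + z.rights)

def right(z):
    if not z.rights:
        return None
    return ListZipper(z.rights[0], (z.focus,) + z.lefts, z.rights[1:])

def replace(x, z):
    return ListZipper(x, z.lefts, z.rights)

def edit(f, z):
    return ListZipper(f(z.focus), z.lefts, z.rights)

start = zipper_from([1, 2, 3, 4, 5])
at_three = right(right(start))
updated = to_list(replace(30, at_three))   # [1, 2, 30, 4, 5]
```

Every operation returns a new zipper, so `start` still describes the original list after the update.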

Puzzle: What would a list zipper look like with two holes? Why might that be useful?

Tree Zippers #

We can do the same thing for trees, seeing the context as just another zipper. It is helpful to track if we have modified a subtree because if we have not, we can move up very easily.

Here is a type definition for a tree zipper where the trees can have an arbitrary list of children. One could do something similar for binary trees.

type Tree a = Node a (List (Tree a))
type TreeZipper a = record TreeZipper where
                        node : Tree a
                        lefts : List (Tree a)
                        rights : List (Tree a)
                        parent : Maybe (TreeZipper a)
                        changed : Boolean
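To make the role of each field concrete, here is a rough Python sketch. It rests on assumptions that go beyond the type above: the focus holds a whole subtree, the sibling lists hold subtrees (stored nearest-first), `None` stands in for the Maybe type's None, and the operation names `down`, `right`, `up`, `edit`, and `root` are chosen for illustration. Note how `up` reuses the parent untouched when `changed` is false.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Tree:
    value: object
    children: Tuple["Tree", ...] = ()

@dataclass(frozen=True)
class TreeZipper:
    node: Tree
    lefts: Tuple[Tree, ...] = ()     # left siblings, nearest first
    rights: Tuple[Tree, ...] = ()    # right siblings, nearest first
    parent: Optional["TreeZipper"] = None
    changed: bool = False

def down(z):
    """Move to the first child, if any."""
    kids = z.node.children
    if not kids:
        return None
    return TreeZipper(kids[0], (), kids[1:], z, False)

def right(z):
    """Move to the next sibling, carrying the changed flag along."""
    if not z.rights:
        return None
    return TreeZipper(z.rights[0], (z.node,) + z.lefts, z.rights[1:],
                      z.parent, z.changed)

def up(z):
    if z.parent is None:
        return None
    if not z.changed:
        return z.parent   # nothing was modified at this level: reuse the parent
    siblings = tuple(reversed(z.lefts)) + (z.node,) + z.rights
    rebuilt = Tree(z.parent.node.value, siblings)
    return TreeZipper(rebuilt, z.parent.lefts, z.parent.rights,
                      z.parent.parent, True)

def edit(f, z):
    """Apply f to the value at the focus and mark the zipper as changed."""
    new_node = Tree(f(z.node.value), z.node.children)
    return TreeZipper(new_node, z.lefts, z.rights, z.parent, True)

def root(z):
    """Zip all the way up and return the resulting tree."""
    while z.parent is not None:
        z = up(z)
    return z.node

original = Tree("a", (Tree("b"), Tree("c", (Tree("d"),))))
at_d = down(right(down(TreeZipper(original))))
result = root(edit(str.upper, at_d))
```

After this, `result` has "D" in place of "d", `original` is unchanged, and the untouched subtree "b" is shared between the two trees.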

Appendix: R types for Maybe a and List a #

A simple linked list type for R.

Example use:

alphabet <- cons("a", cons("b", cons("c", nil)))
first(alphabet)  #=> "a"
rest(alphabet)   #=> "b" -> "c" -> nil
cons("!", alphabet) #=> "!" -> "a" -> "b" -> "c" -> nil

The code:

nil <- new.env()
nil$first <- NULL
nil$rest <- nil
class(nil) <- c("linkedList", "environment")


cons <- function(car, cdr = nil ) {
    xs <- new.env()
    xs$first <- car
    xs$rest <- cdr
    class(xs) <- c("linkedList", "environment")
    return(xs)
}

# snoc : List a -> (a, List a)
# Undoes a cons: splits a list into its first element and the rest.
snoc <- function(xs) {
    return(list(xs$first, xs$rest))
}

first <- function(xs) xs$first
rest <- function(xs) xs$rest
is.nil <- function(xs) identical(xs, nil)

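# S3 print method; the (x, ...) signature matches the print generic.
print.linkedList <- function(x, ...) {
    # Walk the list from the head, collecting each element as a string.
    focus <- x
    values <- character(0)

    while( !is.nil(focus) ) {
        values <- append(values, as.character(focus$first))
        focus <- focus$rest
    }
    values <- append(values, "nil")
    print(paste(values, collapse = " -> "))
    invisible(x)
}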

A Maybe a type for R

None <- new.env()
class(None) <- c("maybe", "environment")

Some <- function(a) {
    x <- new.env()
    x$is <- a
    class(x) <- c("maybe", "environment")
    return(x)
}

is.none <- function(ma) {
    return(identical(ma, None))
}

# maybe : b -> (a -> b) -> Maybe a -> b
maybe <- function(default, f, ma) {
    if( is.none(ma) ) {
        return(default)
    }
    return(f(ma$is))
}

# fromSome : Maybe a -> a
# Unsafe: assumes ma is not None. Use only
# when that is already known.
fromSome <- function(ma) {
    return(ma$is)
}

# chain.maybe : (a -> Maybe b) -> Maybe a -> Maybe b
chain.maybe <- function(f,  ma) {
    if( is.none(ma) ) {
        return(None)
    }
    return(f(ma$is))
}

# S3 print method; the (x, ...) signature matches the print generic.
print.maybe <- function(x, ...) {
    if( is.none(x) ) {
        print("None")
    } else {
        print(paste("Some", x$is))
    }
    return(invisible(x))
}