Building Trees

— Christopher R. Genovese

Announcement #

  • Target Thursday for Monoidal Folds first submission
  • Questions?
  • Office Hours: Thu 3-4, plus by Appt, Fri 1-2 (provisional)

Plan #

Our goal today is to look at building and traversing trees: hierarchical structures that arise in many contexts (including HW1).

What is a tree? #

There are various ways we can define trees, but we will use the following working definition:

A tree is a connected graph with n nodes and n - 1 edges.

Alternative definitions:

  • A tree is a connected and acyclic graph
  • A tree is an acyclic graph in which adding any edge would create a simple cycle
  • A tree is an (undirected) graph with exactly one simple path between any two nodes
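The working definition (connected, with n nodes and n - 1 edges) can be checked directly. The following is a minimal sketch, assuming nodes are labeled 0 to n - 1 and edges are given as pairs; the name `is_tree` and the choice to treat the empty graph as not a tree are assumptions made for illustration.

```python
from collections import deque


def is_tree(n, edges):
    """Return True if the undirected graph on nodes 0..n-1 is a tree.

    Uses the working definition: connected, with exactly n - 1 edges.
    Assumption: the empty graph (n == 0) is not counted as a tree.
    """
    if n == 0 or len(edges) != n - 1:
        return False

    # Build an adjacency list for the undirected graph.
    adjacency = {node: [] for node in range(n)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Breadth-first search from node 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == n


star = [(0, 1), (1, 2), (1, 3)]          # connected, 3 edges on 4 nodes
triangle_plus = [(0, 1), (1, 2), (2, 0)]  # 3 edges, but node 3 is isolated
```

Here `is_tree(4, star)` holds, while `is_tree(4, triangle_plus)` fails: the edge count is right, but the graph has a cycle and is not connected.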

The type of a tree #

A general tree can be described by the following types:

type Tree a = Node a (List (Tree a))
type Forest a = List (Tree a)

Here, the type parameter a represents the type of data stored at each node. The second component is a list of child subtrees.

Here are a few useful functions that illustrate how we use trees.

isLeaf : Tree a -> Boolean
isLeaf (Node _ Nil) = True
isLeaf (Node _ (Cons _ _)) = False

childrenOf : Tree a -> List (Tree a)
childrenOf (Node _ children) = children

dataAt : Tree a -> a
dataAt (Node data _) = data
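For readers who prefer to experiment in Python, the same type and helpers might be sketched as below. This is one possible encoding, not the only one: the class name `Tree` and the function names `is_leaf`, `children_of`, and `data_at` are chosen here to mirror the definitions above.

```python
from dataclasses import dataclass, field
from typing import Generic, List, TypeVar

A = TypeVar("A")


@dataclass
class Tree(Generic[A]):
    """A node holding data of type A plus a list of child subtrees."""
    data: A
    children: List["Tree[A]"] = field(default_factory=list)


def is_leaf(tree):
    """A leaf is a node with no children."""
    return not tree.children


def children_of(tree):
    """Return the list of child subtrees (empty for a leaf)."""
    return tree.children


def data_at(tree):
    """Return the data stored at the root of this tree."""
    return tree.data


# A small example: 1 has children 2 and 3; 3 has a single child 4.
example = Tree(1, [Tree(2), Tree(3, [Tree(4)])])
```

With this example, `is_leaf(example)` is false, `is_leaf(children_of(example)[0])` is true, and `data_at(example)` is 1.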

Traversals #

A traversal of a tree is the process of visiting every node in the tree exactly once, typically performing some action or computation at each node.

What are some operations that we might want to perform in such a traversal? Discuss in the context of the example below.

Example #

[
    ('op', '*'),
    [
        ('op', '+'),
        ('float', 1.2),
        ('int', 6),
        [
            ('op', '-'),
            ('int', 2),
            ('int', 4)
        ]
    ],
    [
        ('op', '+'),
        ('str', "foo"),
        ('str', "bar")
    ],
    ('symbol', 'a')
]

A similar structure can be built in R using nested lists.

What might we want to do?

  • Evaluate
  • Pretty Print
  • Substitute
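As one illustration, substitution can be sketched on the tagged representation used in the example. This sketch assumes that a list is an operator application (the operator tuple first, then the operands) and that a tuple is a leaf; the function name `substitute` and the `bindings` dictionary are hypothetical names introduced here.

```python
def substitute(expr, bindings):
    """Return a copy of expr with bound ('symbol', name) leaves replaced.

    Assumed representation: a list is [operator, operand, ...] and a
    tuple is a tagged leaf such as ('int', 6) or ('symbol', 'a').
    """
    if isinstance(expr, list):
        operator, *operands = expr
        # Keep the operator, recurse into each operand.
        return [operator] + [substitute(arg, bindings) for arg in operands]
    tag, value = expr
    if tag == 'symbol' and value in bindings:
        return bindings[value]
    return expr


expr = [
    ('op', '*'),
    [('op', '+'), ('int', 2), ('symbol', 'a')],
    ('symbol', 'b'),
]
result = substitute(expr, {'a': ('int', 10)})
```

The symbol `a` is replaced by `('int', 10)`, while the unbound symbol `b` is left alone. The original `expr` is not modified, because each list is rebuilt rather than edited in place.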

Operations #

Here are a few operations that are different forms of traversal, plus a building operation (unfold):

mapTree : (a -> b) -> Tree a -> Tree b
mapTree f (Node data children) = Node (f data) (map (mapTree f) children)

fold : (a -> List b -> b) -> Tree a -> b
fold f (Node data children) = f data (map (fold f) children)

traverse : (a -> Action b) -> Tree a -> Action (Tree b)
-- The outer traverse on the right is the List version, applied to the children
traverse f (Node data children) = lift2 Node (f data) (traverse (traverse f) children)

-- General Alternative
traverse f = go
  where go (Node data children) = lift2 Node (f data) (traverse go children)

unfold : (s -> Pair a (List s)) -> s -> Tree a
unfold f seed = let (a, seeds) = f seed
                in Node a (unfoldForest f seeds)

unfoldForest : (s -> Pair a (List s)) -> List s -> Forest a
unfoldForest f = map (unfold f)

where the type Action stands for arbitrary effects, satisfying constraints we will discuss later. For example, Action can represent an I/O effect (which we might write as IO b), like printing something; a computational effect, like gathering something into a List; a no-op, like an Identity type, in which we simply produce a Tree; or Action b = c -> IO b, where c is context data passed down the tree (e.g., the level).

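(The function traverse is more general than trees and applies to other container types, like List or Maybe, as well. Not every Functor supports it; the ones that do are called traversable.) Here, lift2 applies the two-argument function Node to the results of two actions, so the net effect of traverse is to convert a Tree (Action b) to an Action (Tree b), but don’t worry about that for now.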
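To make these operations concrete, here is a rough Python rendering of mapTree, fold, and unfold. The names `Tree`, `map_tree`, `split`, and `record` are choices made for this sketch. The function `traverse_with_level` is not the general traverse: it is a simplified stand-in for the case where the context c is the node's level and the effect is whatever side effect the supplied function performs.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Tree:
    data: Any
    children: List["Tree"] = field(default_factory=list)


def map_tree(f, tree):
    """Apply f to the data at every node, keeping the shape."""
    return Tree(f(tree.data), [map_tree(f, c) for c in tree.children])


def fold(f, tree):
    """Collapse bottom-up: f gets a node's data and its children's results."""
    return f(tree.data, [fold(f, c) for c in tree.children])


def unfold(f, seed):
    """Grow a tree from a seed: f returns (data, list of child seeds)."""
    data, child_seeds = f(seed)
    return Tree(data, [unfold(f, s) for s in child_seeds])


def traverse_with_level(f, tree, level=0):
    """Rebuild the tree from f(level, data), visiting parents before children."""
    new_data = f(level, tree.data)
    return Tree(new_data,
                [traverse_with_level(f, c, level + 1) for c in tree.children])


def split(n):
    """Hypothetical seed function: split n into two halves until n <= 1."""
    return (n, [] if n <= 1 else [n // 2, n - n // 2])


tree = unfold(split, 4)  # 4 -> (2 -> 1, 1), (2 -> 1, 1)
size = fold(lambda _, sizes: 1 + sum(sizes), tree)
depth = fold(lambda _, depths: 1 + max(depths, default=0), tree)
doubled = map_tree(lambda x: 2 * x, tree)

visited = []  # the "effect": gather (level, data) pairs in visit order


def record(level, data):
    visited.append((level, data))
    return f"{level}:{data}"


labelled = traverse_with_level(record, tree)
```

Note that size and depth are both instances of the same fold with different combining functions, which is the kind of generalization the tasks below ask about.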

Example #

[
    '*',
    [
        '+',
        1.2, 6, ['-', 2, 4]
    ],
    [
        '+',
        14, 6
    ]
]

This is a Tree a for what type a?

(In R, we can do the same thing with a list, which can contain lists or values.)

Task 1 #

  • Write several examples of expressions structured like trees. They can be numeric and/or symbolic, as you choose, and can include other operators like ‘%’ or ‘//’ if you like, or even functions like floor, ceil, max, min, et cetera.

  • Write functions that traverse such a tree and

    1. evaluate the expression into a value
    2. pretty print the expression
    3. simplify the expression by eliminating ‘+ 0’, ‘* 1’, and ‘/ 1’.
  • What can be generalized here?

Discussion of Task 1 #

  • Breaking down the task
  • Design with types, using the two special cases as examples. What are the types?
  • How does the function work at each stage?

Task 2 #

  • Python: Traverse the AST for some Python code, computing a query (e.g., find the names of all functions defined) or a transformation (e.g., rename all variables).

  • R: Use a list that represents a nested (but simplified) R expression and do the same thing. Restrict to simple things here.
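As a possible starting point for the Python version, the standard `ast` module can parse source code, and `ast.iter_child_nodes` yields each node's direct children. The sketch below only exposes the AST in the (data, children) shape used earlier, with the node's class name as the data; the helper names `ast_to_tree` and `labels` are hypothetical, and the query or transformation itself is left to the task.

```python
import ast


def ast_to_tree(node):
    """Convert an AST node into a (label, children) pair.

    The label is the node's class name; children are converted recursively.
    """
    return (type(node).__name__,
            [ast_to_tree(child) for child in ast.iter_child_nodes(node)])


def labels(tree):
    """List every label in the tree, parent before children."""
    label, children = tree
    found = [label]
    for child in children:
        found.extend(labels(child))
    return found


source = "x = 1 + 2"
tree = ast_to_tree(ast.parse(source))
all_labels = labels(tree)
```

The root label is `Module`, its only child is an `Assign` node, and the labels include `Name` and `BinOp`. Printing `ast.dump(ast.parse(source))` is a quick way to see the fields that this simplified view leaves out.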

Task 3 #

  • In monoidal folds,