Plan for Today #
- (Finish) Database Schema Design
- Core Concepts of Functional Programming
- Higher-Order Functions
- Activity
Paradigm Shift #
A programming paradigm is a framework of organizing concepts and abstractions that guides how we design and build software and how we think about solving problems.
Two familiar and commonly used paradigms are:
- object-oriented programming
- the class is the fundamental unit of abstraction - data and code are wrapped together, encapsulated, in objects that represent the nouns of our system.
- imperative programming
- a program is a sequence of commands and actions for the computer to perform that successively update program state. An imperative program describes how to perform a task.
These two paradigms often overlap in practice and are so familiar that alternatives can be hard to see.
Today, we will start a dive into related paradigms – functional programming, data-oriented programming, and declarative programming – that offer powerful benefits to our thinking and practices.
First Class Entities #
An entity is first class when it can be:
- created at run-time,
- passed as a parameter to a function,
- returned from a function, or
- assigned into a variable.
A first class entity typically has built-in support within a language and runtime environment.
For instance, (fixed-width) integers are first-class entities in all common languages.
- Created:
101
- Passed:
max(11, -4)
- Returned:
return 7
- Assigned:
i = 17
Similarly with floating-point numbers, booleans, strings, and arrays. Many languages have more specialized first-class entities:
- Arbitrary precision numbers (Clojure, Mathematica, Java)
- Hash Tables/Associative Arrays/Tables (Python, Clojure, Perl, Ruby, Lua)
- Raw Memory Locations (C, C++)
- Classes/Objects (Python, Ruby, C++, Java, \(\ldots\))
- Types (Idris)
- Unevaluated Code (Common Lisp, Racket, Scheme, Clojure, Rust, R)
- Functions (Python, R, Haskell, Clojure, OCaml, Rust, JavaScript, \(\ldots\))
The first-class entities in a language shape a language's central idioms and abstractions.
Core Concepts of Functional Programming #
Composability is a fundamental aspiration in software design. We want the parts of our system to interoperate and combine as freely as possible. This offers power, expressiveness, reusability, and many other benefits.
In functional programming (FP), functions are the fundamental unit of abstraction providing simple, easily composed building blocks for solving problems.
Central features of FP:
- Functions are first-class entities
- Functions are pure (whenever possible)
- Data is immutable (whenever possible)
- Programs have declarative structure (as much as possible)
- Exploit laziness when appropriate
- Referential transparency
Why use functional programming?
- Functions are simple, easily composed abstractions
- Easy to express many ideas with very little code
- Easier to reason about – and test – a functional program
- Avoids the significant complexity of mutating state
- Effective Concurrency/Parallelism by avoiding mutable state
- Focuses on data flow
- Efficiently exploits recursive thinking
- Fast development cycle
- Encourages code reuse
- Isolate side effects
Languages designed for FP: Clojure, Haskell, OCaml, Scala, Elm
Languages with great FP support: Rust, R, Common Lisp, Julia, Ruby, JavaScript
Decent FP Support: Python
Languages adding many FP features: C++, Java
First-Class Functions #
Having first-class functions means that functions are data too.
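To make this concrete, here is a quick Python sketch (not from the original notes) showing a function being created at run time, assigned to a variable, passed as an argument, and returned as a value:

def compose(f, g):            # f and g are functions passed as arguments
    def h(x):                 # a new function created at run time
        return f(g(x))
    return h                  # ...and returned as a value

double = lambda x: 2 * x      # a function assigned to a variable
increment = lambda x: x + 1
double_after_increment = compose(double, increment)
double_after_increment(10)    #=> 22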
Some use cases:
- Parameterized strategies
- Generic operations - abstracting operations across a range of types
- Dynamically-defined operations based on parameters and data
- Combine multiple pieces into useful aggregate functions
apply(M, ACROSS_COLS, function(x) { max(x[!is.na(x) & x != 999]) })
pairwise.distances(X, metric = function(x, y) { max(abs(x - y)) })
integrate(lambda x: x*x, 0, 1)
// Abstracting Array Iteration
function forEach(array, itemAction) {
    for ( var i = 0; i < array.length; i++ ) {
        itemAction(array[i]);
    }
}
forEach(["R", "SAS", "SPSS"], console.log);
forEach(["R", "SAS", "SPSS"], store);
forEach(["R", "SAS", "SPSS"], function(x) {myObject.add(x)});
import numpy as np
import jax

def fwd_solver(f, z_init):
    "Fixed-point solver by forward iteration"
    z_prev, z = z_init, f(z_init)
    while np.linalg.norm(z_prev - z) > 1e-5:
        z_prev, z = z, f(z)
    return z

def newton_solver(f, z_init):
    "Newton iteration fixed-point solver"
    f_root = lambda z: f(z) - z
    g = lambda z: z - np.linalg.solve(jax.jacobian(f_root)(z), f_root(z))
    return fwd_solver(g, z_init)
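As a quick usage sketch (not in the original notes): forward iteration finds the fixed point of cosine, the z satisfying z = cos(z):

z_star = fwd_solver(np.cos, np.array([1.0]))  # converges to the Dottie number
z_star #=> array([0.739...])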
;; *Markov Chain Monte Carlo (MCMC)* is a simulation method
;; where we create a Markov chain whose /limiting distribution/
;; is a distribution from which we want to sample.
(defn mcmc-step
  "Make one step in MH chain, choosing random move."
  [state moves]
  (let [move (random-choice moves)]
    (metropolis-hastings-step (move state))))
(def chain
  (mcmc-sample initial-state
               :select (mcmc-select :burn-in 1000 :skip 5)
               :extract :theta1
               :moves [(univariate-move :theta1 random-walk 0.1)
                       (univariate-move :theta2 random-walk 0.4)
                       (univariate-move :theta3 random-walk 0.2)]))
(take 100 chain)
Pure Functions #
What does this code do?
x = [5]
process( x )
x[0] = x[0] + 1
We cannot tell what x holds afterward without knowing whether process mutates its argument; that uncertainty is exactly what purity removes. A function is pure if it:
- always returns the same value when you pass it the same arguments
- has no observable side effects
What are “side effects”?
- Changing global variables
- Printing out results
- Modifying a database
- Editing a file
- Generating a random number
- …anything else that changes the rest of the world outside the function.
Examples:
pure <- function(x) {
    return( sin(x) )
}

global.state <- 10
not.pure <- function(x) {
    return( x + global.state )
}

also.not.pure <- function(x) {
    return( x + rnorm(1) )
}

another.not.pure <- function(x) {
    save_to_file(x, "storex.txt")
    return( x + rnorm(1) )
}

u <- 10
and.again <- function(x) {
    ...
    print(...)             # Input/Output
    z <- rnorm(n)          # Changing internal state
    my.list$foo <- mean(x) # Mutating objects' state
    u <<- u + 1            # Changing out-of-scope values
    ...
}
def foo(a):
    a[0] = -1
    return sum(a)

x = [1, 2, 3]
sum_of_x = foo(x)
x #=> [-1, 2, 3]
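A pure alternative builds a new list instead of mutating its argument; a minimal sketch (pure_foo is a hypothetical name):

def pure_foo(a):
    b = [-1] + a[1:]  # a new list; the argument a is untouched
    return sum(b)

x = [1, 2, 3]
sum_of_x = pure_foo(x)
x #=> [1, 2, 3]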
Benefits of pure functions
- deterministic and mathematically well-defined
- easy to reason about
- easy to test
- easy to change
- can be composed and reused
- can be run simultaneously in concurrent processes
- can be evaluated lazily
Calculations, Actions, Data #
- Actions
- results depend on when they are called, how many times they are called, or context in which they are called
- Calculations
- computations whose results do not affect the world when they run, processing inputs to outputs directly, independent of context or timing
- Data
- Facts about events, a record of something that happened, encodes meaning in structure
It is useful to practice distinguishing among these three categories in your code.
Prefer calculations (pure functions) where possible and keep a clear delineation.
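For instance, here is a small Python sketch (with hypothetical names) of each category:

import time

def greeting_now(name):                  # action: output depends on when it runs
    return f"Hello, {name}! It is {time.ctime()}."

def greeting(name, timestamp):           # calculation: same inputs, same output
    return f"Hello, {name}! It is {timestamp}."

login_event = {"type": "login", "name": "Ada"}  # data: a record of something that happened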
Immutable Data #
This is a common pattern in imperative programming:
a = initial value
for index in IndexSet:
    a[index] = update based on index, a[index], ....
return a
Each step in the loop updates – or mutates – the state of the object.
The familiarity of this operation obscures the tremendous increase in complexity it produces. Mutability couples different parts of your code, both to each other and across time, and it greatly complicates parallelism and concurrency.
Immutable (persistent) data structures do not change once created.
- We can pass the data to anywhere (even simultaneously), knowing that it will maintain its meaning.
- We can maintain the history of objects as they transform.
- With pure functions and immutable data, many calculations can be left to when they are needed (laziness).
These data structures achieve this with good performance by clever structure sharing. (E.g.: the pyrsistent library in Python, immer and immutable.js in JavaScript, the built-in collections in Clojure, rstackdeque in R, \(\ldots\))
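For instance, with pyrsistent's persistent vectors, an "update" returns a new vector that shares most of its structure with the old one (a small sketch):

from pyrsistent import pvector

v1 = pvector([1, 2, 3])
v2 = v1.set(0, 99)  # returns a new vector; v1 is unchanged
v1  #=> pvector([1, 2, 3])
v2  #=> pvector([99, 2, 3])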
Favoring immutable data and pure functions makes values and transformations, rather than actions, the key ingredients of programs. So FP prefers expressions over statements:
(foo (if (pred x) x (transform x)) y)
if pred(x):
    y = x
else:
    y = transform(x)
foo(y)
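In Python, a conditional expression gives back the same expression-oriented shape (a small sketch):

foo(x if pred(x) else transform(x))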
Declarative Structure #
A declarative program describes what it produces, not how to produce it.
qsort [] = []
qsort l@(x:xs) = qsort small ++ mid ++ qsort large
  where
    small = [y | y <- xs, y < x]
    mid   = [y | y <- l, y == x]
    large = [y | y <- xs, y > x]
Programming languages or packages often provide ways to thread data through multiple functions to clearly express the flow of operations:
tokenize <- function(line) {
    line %>%
        str_extract_all("([A-Za-z][-A-Za-z]+)") %>%
        unlist %>%
        sapply(tolower) %>%
        as.character
}
## much clearer than:
## as.character(sapply(unlist(str_extract_all(line, "([A-Za-z][-A-Za-z]+)")), tolower))
(defn tokenize [line]
  (->> line
       (re-seq #"[A-Za-z][-A-Za-z]+")   ; re-seq takes a regex literal
       (map lower-case)))               ; assumes clojure.string's lower-case
Laziness (sometimes) #
Laziness refers to only computing a result when it is needed.
This can be a powerful optimization, though it is not always
desired. The opposite of lazy here is eager
(or strict).
smallest = head (qsort list) -- Laziness makes this O(n) not O(n ln n)
fibs1 = 0 : 1 : zipWith (+) fibs1 (tail fibs1)
fibs2 = 0 : 1 : [ a+b | a <- fibs2 | b <- tail fibs2 ] -- needs ParallelListComp
take 10 fibs1 -- produces [0,1,1,2,3,5,8,13,21,34]
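Python generators give a similar flavor of laziness, computing values only as they are consumed; here is a sketch of the same infinite Fibonacci sequence:

from itertools import islice

def fibs():
    a, b = 0, 1
    while True:          # an infinite sequence, computed only on demand
        yield a
        a, b = b, a + b

list(islice(fibs(), 10)) #=> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]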
(defn mcmc-sample
  "Generates an MCMC sample from an initial state.
   Returns a lazy sequence.

   Keyword arguments:
     :moves   -- a collection of moves, which are
                 functions from state to a candidate
     :select  -- a selector function which indicates
                 a boolean for each index and state
                 if that state should be kept in the
                 output
     :extract -- a function to extract features
                 (e.g., parameters) from the state
  "
  [initial-state & {:keys [moves select extract]
                    :or {select (constantly true),
                         extract identity}}]
  (letfn [(stepper [index state]
            (lazy-seq
             (if (select index state)
               (cons (extract state)
                     (stepper (inc index) (mcmc-step state moves)))
               (stepper (inc index) (mcmc-step state moves)))))]
    (stepper 0 initial-state)))
Closures #
Closures are functions with an environment – variables and data – attached. The environment is persistent, private, and hidden. This is a powerful approach for associating state with functions that you pass into other functions. (In fact, an entire OOP system could be built from closures.)
counter <- function(start=0, inc=1) {
    value <- start
    return(function() {
        current <- value
        value <<- value + inc
        return(current)
    })
}
cc <- counter(10, 2)
dd <- counter(10, 2)
cc() # 10
cc() # 12
cc() # 14...
value # Error: object 'value' not found
dd() # 10
kernel_density <- function(data, bandwidth) {
    return(function(x) {
        ## calculate the density here
        for (row in 1:nrow(data)) {
            ## calculate some stuff
        }
        return(density)
    })
}
d <- kernel_density(data, 4)
d(3) #=> returns density at 3
Only the anonymous function returned by counter can access that internal state: it is private and unique to each instance.
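The same pattern works in Python, using nonlocal to rebind the enclosed variable (a minimal sketch mirroring counter above):

def counter(start=0, inc=1):
    value = start
    def step():
        nonlocal value   # the closed-over, private state
        current = value
        value += inc
        return current
    return step

cc = counter(10, 2)
cc() #=> 10
cc() #=> 12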
Higher-Order Functions #
Higher-order functions are functions that take other functions as arguments or return functions as results.
Three commonly used: map, filter, and reduce.
Map #
The map operation takes a function and a collection and calls the function for each successive element of the collection, producing a new collection out of the results. For example, in R:
square <- function(x) x^2
Map(square, 1:10) #=> list(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
or in Python:
list(map(lambda x: x*x, range(1, 11))) #=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
using an anonymous function, declared with the lambda keyword, that has no name.
map can usually do more than this. In most languages, it can take multiple sequences and a function that takes multiple arguments. It then calls the function with successive elements from each sequence as its arguments, producing a sequence of the results. For example, in R:
Map(`-`, 1:10, 10:1) #=> list(-9, -7, -5, -3, -1, 1, 3, 5, 7, 9)
and in Python
list(map(lambda x, y: x - y, range(1, 11), range(10, 0, -1)))
#=> [-9, -7, -5, -3, -1, 1, 3, 5, 7, 9]
We can use map to replace a common pattern we see over and over in code processing data:
transformed_data <- c()
for (ii in 1:length(data)) {
    transformed_data <- c(transformed_data, do_stuff_to(data[ii]))
}
## becomes
transformed_data <- Map(do_stuff_to, data)
The map operation expresses our meaning more clearly and concisely, and is more efficient to boot (since it does not repeatedly copy the transformed_data).
Filter #
The filter operation takes a predicate (a function returning a boolean) and a collection. It calls the predicate for each element, and returns a new collection containing only those elements for which the predicate returns true. For example, in R:
is_odd <- function(x) { (x %% 2 != 0) }
Filter(is_odd, 1:10) #=> c(1, 3, 5, 7, 9)
Filter(is_odd, as.list(1:10)) #=> list(1, 3, 5, 7, 9)
Notice that each result is the same type as the collection Filter was given.
In Python,
list(filter(lambda x: x % 2 != 0, range(1, 11))) #=> [1, 3, 5, 7, 9]
We can always combine a map with a filter, applying a function to only those elements matching a predicate. The composability of these operations is a major advantage, and much easier to deal with than building complicated loops over data manually.
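For example, squaring just the odd numbers in Python (a small sketch):

list(map(lambda x: x * x, filter(lambda x: x % 2 != 0, range(1, 11))))
#=> [1, 9, 25, 49, 81]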
Reduce #
The reduce operation, also sometimes called fold, is a much more general way to process the elements of a sequence. We could use it to build map, filter, or many other interesting operations.
reduce takes as arguments a function, an accumulator, and a sequence. The function takes two arguments – the accumulator and one element of the sequence – and returns an updated accumulator. The function is first called with the initial accumulator and the first element of the sequence, returning a new accumulator, which is passed to the function again with the second element of the sequence, and so on. The final value of the accumulator is returned by reduce.
For example, suppose we want to add up the numbers in a list, but keep separate sums of the odd numbers and the even numbers. In R:
parity_sum <- function(accum, element) {
if ( element %% 2 == 0 ) {
list(accum[[1]] + element, accum[[2]])
} else {
list(accum[[1]], accum[[2]] + element)
}
}
Reduce(parity_sum, 1:10, list(0,0)) #=> list(30, 25)
In Python:
from functools import reduce

def parity_sum(acc, x):
    even, odd = acc
    if x % 2 == 0:
        return [even + x, odd]
    else:
        return [even, odd + x]

reduce(parity_sum, range(1, 11), [0, 0]) #=> [30, 25]
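To see just how general reduce is, here is map built from reduce in Python (a sketch; map_via_reduce is a hypothetical helper name):

from functools import reduce

def map_via_reduce(f, xs):
    # the accumulator is the output list built so far
    return reduce(lambda acc, x: acc + [f(x)], xs, [])

map_via_reduce(lambda x: x * x, range(1, 6)) #=> [1, 4, 9, 16, 25]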