
Best Practices

— Christopher Genovese and Alex Reinhart

Programming is a form of communication to two audiences: the computer and human readers (including future you).

As long as your code is syntactically correct, the computer will run it, for better or worse. But your code will be checked, studied, tested, documented, debugged, used, modified, generalized, and reused by humans. To get the most value from your time spent programming, you need to pay attention to how humans process your code.

Indeed, the features that characterize good code are very similar in spirit to the features that characterize good writing.

There are many details to manage in a complex piece of code, and consequently there are many detailed choices to consider in practice. These include matters of

style
naming
documentation
organization
design
dependence
error handling
tooling

See the rubric or the book Code Complete by Steve McConnell. These are worth reading and studying.

But for our purposes today, we can cover a lot of ground with only a few basic principles.

Write code to be read #

Code communicates ideas and describes abstractions, often complicated ones. Its execution is like an unfolding story, with characters traveling along their own narrative arcs.

Try to maximize the ease and clarity with which the reader can process the code. Help them to

understand the ideas/abstractions behind the code
identify the characters/entities involved, and
follow the story.

And remember to prepare your reader for the information you are about to give.

Here are a few implications of this principle.

Format your code to make it easy to read
Use meaningful, concrete, and descriptive names
Arrange your code to bring out the central idea in each chunk
Make critical relationships salient
Structure your interfaces to present a clean and consistent abstraction
Avoid hidden side effects and obscure features
Use documentation to supplement the code, not to mimic it
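As a small illustration of the naming and documentation points above, here is a sketch of the same computation written twice. The function names and the filtering task are hypothetical, chosen only for this example.

```python
# Harder to read: terse names, and a comment that only mimics the code.
def f(x, t):
    r = []
    for i in x:
        # append i to r if i is greater than t
        if i > t:
            r.append(i)
    return r


# Easier to read: descriptive names, and a docstring that states the
# contract (strict comparison, order preserved) instead of echoing the code.
def values_above(measurements, threshold):
    """Return the measurements strictly greater than `threshold`, in order."""
    return [value for value in measurements if value > threshold]


readings = [1, 5, 3, 8]
print(f(readings, 3))
print(values_above(readings, 3))
```

Both functions compute the same thing; only the second tells the reader what the entities are and what to expect at the boundary.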

Be consistent #

A foolish consistency may be the hobgoblin of little minds, but for programming, a practical consistency is helpful in many ways.

Here are a few implications of this principle.

Use consistent formatting, spacing, and style
Use consistent naming schemes for variables, functions, classes, and files (for example, CamelCase, sausage-case, snake_case, or ALL_CAPS)
Use consistent documentation formatting, style, and scope
Use consistent interfaces to functions and classes
Use consistent error handling

Many conventions for naming, formatting, spacing, etc. are included in style guides used by projects or programming languages. For example, PEP 8 describes naming and formatting conventions for Python code, and your code will be expected to follow it. (PEP 8 is unusual because nearly every major Python project uses it.) R has a lot of historical cruft that means nobody uses the exact same style, but the tidyverse style guide is a good reference. Read these guides!
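For instance, the PEP 8 naming conventions look roughly like this in practice. The class, constant, and file name below are made up for illustration.

```python
MAX_RETRIES = 3  # module-level constant: ALL_CAPS


class DataLoader:  # class name: CapWords (CamelCase)
    def __init__(self, file_path):
        self.file_path = file_path  # attribute: snake_case

    def describe_source(self):  # method name: snake_case
        return f"loading from {self.file_path} (up to {MAX_RETRIES} tries)"


loader = DataLoader("data.csv")
print(loader.describe_source())
```

Once a reader learns the scheme, the form of a name alone tells them whether it refers to a constant, a class, or a function.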

Don’t Repeat Yourself #

Seriously, don’t repeat yourself. It’s inefficient to repeat yourself. So don’t do it. Really.

Keep your code DRY! (Not WET – wasting everyone’s time!)

Each piece of knowledge embodied in the code should have one unambiguous and authoritative representation.

Here are a few implications of this principle.

If you find yourself repeating a piece of code, put it in a function.
If you find yourself using a number or other literal, make it a named constant. (A few basic cases, such as 0 and 1, are exceptions.)
Documentation should not merely repeat what the code does but should add value. For instance: why, who, when?
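A minimal sketch of the first two implications, using a temperature conversion as a stand-in example (the names are hypothetical): the formula lives in one function, and its literals live in named constants.

```python
# Each piece of knowledge appears exactly once.
FREEZING_POINT_F = 32.0             # 0 degrees Celsius in Fahrenheit
F_DEGREES_PER_C_DEGREE = 9.0 / 5.0  # size of one Celsius degree in Fahrenheit


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * F_DEGREES_PER_C_DEGREE + FREEZING_POINT_F


# Instead of writing `10.0 * 9 / 5 + 32` and `25.0 * 9 / 5 + 32` inline:
morning = celsius_to_fahrenheit(10.0)
afternoon = celsius_to_fahrenheit(25.0)
print(morning, afternoon)
```

If the conversion ever needs to change, there is one place to change it and no copies to miss.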

It’s easier to chew small pieces #

Any stretch of code focuses on a few key ideas. Organize your code to bring out one idea at a time, rearranging as needed.

Organize your code modularly (paragraphs, functions, files)
Prefer functions that do one thing well
Prefer orthogonality (decoupling)
Prefer classes with a distinct purpose and identity
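One way to picture this, as a sketch with hypothetical function names: a task that parses, centers, and formats some numbers is split into three functions that each do one thing, plus a short function that composes them.

```python
def parse_numbers(text):
    """Turn a comma-separated string into a list of floats."""
    return [float(field) for field in text.split(",")]


def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [value - mean for value in values]


def format_values(values):
    """Render values with one decimal place, separated by commas."""
    return ", ".join(f"{value:.1f}" for value in values)


def centered_report(text):
    """Compose the three steps; each can be tested and reused on its own."""
    return format_values(center(parse_numbers(text)))


print(centered_report("1, 2, 3"))
```

Because the pieces are decoupled, changing the output format does not touch the parsing or the arithmetic.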

Keep the contract clear #

Each function or class has an explicit contract behind it. “I give you this, you give me that.”

Make that contract salient in your code, your tests, and your documentation.

An idea we will discuss: consider using assertions and pre/post conditions to check/enforce this contract.
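As a rough sketch of what checking a contract with assertions can look like, consider a hypothetical `normalize` function. Its preconditions say what the caller must provide, and its postcondition says what the function promises in return.

```python
def normalize(weights):
    """Scale nonnegative weights so that they sum to 1."""
    # Preconditions: what the caller must give us.
    assert len(weights) > 0, "weights must be nonempty"
    assert all(w >= 0 for w in weights), "weights must be nonnegative"
    total = sum(weights)
    assert total > 0, "at least one weight must be positive"

    result = [w / total for w in weights]

    # Postcondition: what we promise to give back.
    assert abs(sum(result) - 1.0) < 1e-9, "result must sum to 1"
    return result


print(normalize([1, 1, 2]))
```

Note that Python skips `assert` statements when run with the `-O` flag, so assertions suit internal sanity checks; for validating untrusted input, raising an explicit exception is the safer choice.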

Keep information on a need-to-know basis #

Each function, class, and module in your code needs some information to do its job.

Give it the information it needs but no more.

Giving too much information couples parts of the code that should be independent, making them harder to test, debug, and reason about.

Objects in particular should “encapsulate” the information they contain quite jealously.
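Here is a small sketch of both ideas, with hypothetical names. The class keeps its bookkeeping private (by Python convention, a leading underscore) and exposes only what callers need, and the reporting function receives a single number instead of the whole object.

```python
class RunningMean:
    """Track the mean of a stream of values."""

    def __init__(self):
        # Internal state; callers should not read or modify these directly.
        self._count = 0
        self._total = 0.0

    def add(self, value):
        self._count += 1
        self._total += value

    @property
    def mean(self):
        if self._count == 0:
            raise ValueError("no values added yet")
        return self._total / self._count


def summarize(mean_value):
    """Needs only a number, so it is given only a number."""
    return f"mean = {mean_value:.2f}"


tracker = RunningMean()
tracker.add(2.0)
tracker.add(4.0)
print(summarize(tracker.mean))
```

Since `summarize` never sees the `RunningMean` object, the class can change how it stores its state without affecting the report.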

Make it run, make it right, make it fast – in that order. #

Only optimize the bottlenecks!
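To find the bottlenecks, measure before optimizing. The sketch below uses Python's standard `cProfile` and `pstats` modules on a deliberately simple, hypothetical workload; in real code you would profile your own entry point.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """A stand-in for the expensive part of an analysis."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_analysis():
    return slow_sum_of_squares(200_000)


# Profile one run of the analysis.
profiler = cProfile.Profile()
profiler.enable()
result = run_analysis()
profiler.disable()

# Report the five entries with the largest cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report shows where the time is actually spent, which is often not where intuition suggests.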

A Demonstration #

In your local copy of the documents repository, do a git pull.
Open the file Activities/best-practices/shift-the-mean-1.r in an editor or in RStudio.

We will think about this code with respect to the principles and consider some modifications to improve it.

First, look over the code for five minutes and consider a few initial questions:

What does this code do? How might you figure it out?
What are the intended inputs?
What is the intended output?
Can you explain why anything is done the way it is?
What about this code’s formatting and style makes it difficult to answer the questions above?

Second, a few modifications. See the files in the Activities/best-practices directory of the documents repository.

An Interactive Exercise #

Copy one of the files Activities/best-practices/nnk.py or Activities/best-practices/nnk.r into another directory (outside documents).

We will think about this code and make a series of modifications, in light of the principles we have discussed today.

A few initial questions to consider as you examine the code:

What does this code do? How might you figure it out?
What are the intended inputs?
What is the intended output?
Can you explain why anything is done the way it is?
What works well here for clarity and readability? What does not?
Where is the code consistent or inconsistent?
Is there repeated code? What should you do about that?
Are the concepts within the code separated into meaningful chunks?
Is information properly encapsulated?

As you find the answers to these questions, restructure the code to make it follow our design principles.

Rough activity time: 30 minutes

You are encouraged to discuss this with your neighbors as you work, but you should enter your own changes.

Resources #

The book Code Complete by Steve McConnell
The Pragmatic Programmer by Andy Hunt and Dave Thomas
Community style guides
- tidyverse style guide for R, by Hadley Wickham
- PEP 8 style guide for Python code
- Google Style Guides for many languages (including R, Python, C, C++, Java, and Lisp)
