Design

— Christopher Genovese and Alex Reinhart

Let’s consider design. Suppose you have a major software task. You’ve been asked to build, or need to build as part of your research, a big software system with many pieces that do many things. How do you go about designing your code, so you know how to organize your work and how to fit all the pieces together?

Design Example: Collaborative Assessment #

At your workplace, your team periodically gets together to do project performance reviews. Each project’s performance is reviewed by everyone involved (manager, employees, users from other departments), and the team uses these reviews to put together recommendations for each project.

These meetings tend to be very long and involve lots of details, many of which get lost.

You decide to write a software system in which all the stakeholders can enter their data and perspectives and which compiles the information into a report.

How should you go about designing this system?

Design Reality #

Design is a sloppy process #

Expect dead ends, wrong turns, mistakes, head-slapping regrets… leading to a good outcome.

Good designs are often only subtly different from bad ones.

Design is about tradeoffs and priorities #

Recognizing those tradeoffs and aligning them with your priorities is a key step.

What are some of the tradeoffs in the example?

Design involves constraints #

Constraints lead to creative solutions.

Design is heuristic #

No one methodology or process works in all contexts.

Design is iterative #

Requirements change #

In what ways might the requirements change over time in the example?

Considerations #

What considerations or criteria might affect our design choices?

Design Principles #

Minimize Unnecessary Complexity #

Software engineering is about managing complexity.

Form Consistent Abstractions #

Abstraction is the process of representing the essential features of a mechanism without delving into details or explanation of the underlying implementation.

If the various abstractions that comprise a piece of software are conceptually consistent and aligned, the system becomes easier to work with. If they are inconsistent or misaligned, complexity increases.

What are some of the abstractions in the Collaborative Assessment example?

Loose Coupling, Strong Cohesion, and Encapsulation #

Coupling is the interdependence between different parts of a software system.

In tightly coupled systems, changes to one part of the system tend to cascade, forcing changes in many other parts of the system. The system becomes rigid and fragile.

Key example of coupling: one part of the system depending on low-level details of another part’s implementation.

Cohesion describes how well the pieces of a system or module fit together in working towards a single goal.

An example of a weakly cohesive design is one in which all the code for all parts of the system is in a single file.

Strong cohesion and loose coupling are often aided by encapsulating data and implementation details.

Example: getters and setters for objects
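
As a minimal sketch of this idea in Python, the hypothetical `Review` class below hides how a rating is stored behind a property. The class name, its fields, and the 1-to-5 rating scale are assumptions made for illustration; they are not part of the assessment example. Callers read and write `review.rating` and never touch the underlying attribute, so the validation rule or the storage format can change without breaking them.

```python
class Review:
    """One stakeholder's review of a project (hypothetical example class)."""

    def __init__(self, reviewer, rating):
        self.reviewer = reviewer
        self.rating = rating  # goes through the setter below, so it is validated

    @property
    def rating(self):
        """Getter: callers read review.rating without knowing how it is stored."""
        return self._rating

    @rating.setter
    def rating(self, value):
        """Setter: the single place that enforces the (assumed) 1-to-5 scale."""
        if not 1 <= value <= 5:
            raise ValueError("rating must be between 1 and 5")
        self._rating = value


review = Review("manager", 4)
review.rating = 5      # validated assignment
print(review.rating)
```

Because the check lives in one place, other parts of the system depend only on the `rating` interface, not on the `_rating` attribute behind it.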

Modularity and Single Responsibilities #

As part of the design process, we try to understand the function and responsibilities of different parts of the system and to divide the code into modules that each have a single, focused responsibility.

Modularity works with loose coupling, high cohesion, and encapsulation to help enforce a separation of concerns.

What are some ideas for modules in the assessment example?

Extensibility #

Requirements change, and users often want to use software in ways that the authors did not anticipate.

When the design is rigid, it is hard to extend the functionality of the program, and users’ needs are not met.

We want to keep extensibility in mind as we design our programs. Modularity and consistent abstractions help us create extensible software.
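
One common way to leave room for extension is a small registry of interchangeable functions. The sketch below assumes, purely for illustration, that a report is built from a list of comment strings; `FORMATTERS`, `register_formatter`, and `render_report` are hypothetical names. Adding a new output format means registering one new function, with no changes to the existing code.

```python
# Registry mapping a format name to a function that renders a report.
FORMATTERS = {}


def register_formatter(name):
    """Decorator that adds a rendering function to the registry."""
    def decorator(func):
        FORMATTERS[name] = func
        return func
    return decorator


@register_formatter("bulleted")
def format_bulleted(comments):
    return "\n".join(f"- {comment}" for comment in comments)


@register_formatter("numbered")
def format_numbered(comments):
    return "\n".join(
        f"{i}. {comment}" for i, comment in enumerate(comments, start=1)
    )


def render_report(comments, fmt="bulleted"):
    """Render comments with whichever formatter is registered under fmt."""
    if fmt not in FORMATTERS:
        raise ValueError(f"unknown report format: {fmt!r}")
    return FORMATTERS[fmt](comments)


print(render_report(["Ship sooner", "Add tests"], fmt="numbered"))
```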

What are some ways that we might need to extend the assessment software?

Reusability #

Writing software involves solving problems, some big, some small.

The same problems often recur, so why re-solve them each time?

Parts of our software can be reused (or built upon) in solving future problems. Modularity, encapsulation, and separation of concerns make it easier to build reusable components.

Examples: Building data visualizations

Ease of Maintenance #

Any software that is used for a period of time has to be maintained. Libraries (and even languages) change, as do data formats, communication methods, interfaces, and platforms. Several practices make maintenance easier:

  • Good documentation
  • Good tests
  • Package and dependency management (e.g., virtual environments)
  • Version control with good commit messages
  • Well-written code

Use Libraries When Possible #

Using well-used and well-tested (and especially standard) libraries supports all of the above principles and often improves performance.

A Design Process #

We will combine top-down (starting with the high-level tasks and moving towards the details) and bottom-up (starting with the details and building to the high-level) approaches.

  1. Develop a clear idea of the objectives and requirements of your program.
  2. Identify the main concerns/subsystems. These are good initial candidates for modules.
  3. Determine what kind of data your program will operate on. How will it obtain that data? From one or many possible sources? How will that data be stored and organized for the tasks at hand?
  4. Name and declare the entry-point functions, including their interface. Define tests if possible.
  5. Develop pseudo-code for the main entry points.
  6. Identify auxiliary functions needed in the main functions. Define tests if possible.
  7. Reconsider the high-level design.
  8. Consider low-level functions. How will you process and store your data? What pieces will bind together the different modules in your system? What basic utilities do you need? Define tests if possible.
  9. How do these low-level details affect the high-level design?
  10. Iterate!
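
To make steps 4 through 6 concrete, here is a small sketch for the Collaborative Assessment example. Everything specific in it is an assumption made for illustration: the function names `compile_report` and `group_by_project`, the representation of a review as a `(project, reviewer, rating)` tuple, and the choice of a mean rating as the summary. The point is the order of work: declare the entry point and its interface, write a test, outline the body, then identify the helper it needs.

```python
def compile_report(reviews):
    """Entry point (step 4): summarize reviews per project.

    reviews: list of (project, reviewer, rating) tuples (assumed format).
    Returns a dict mapping each project name to its mean rating.
    """
    # Outline from step 5:
    #   1. group the ratings by project
    #   2. average each group
    ratings_by_project = group_by_project(reviews)
    return {
        project: sum(ratings) / len(ratings)
        for project, ratings in ratings_by_project.items()
    }


def group_by_project(reviews):
    """Auxiliary function (step 6): collect the ratings for each project."""
    grouped = {}
    for project, _reviewer, rating in reviews:
        grouped.setdefault(project, []).append(rating)
    return grouped


def test_compile_report():
    """A test that can be written as soon as the interface is fixed."""
    reviews = [
        ("alpha", "manager", 4),
        ("alpha", "user", 2),
        ("beta", "user", 5),
    ]
    assert compile_report(reviews) == {"alpha": 3.0, "beta": 5.0}


test_compile_report()
```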

Team Design Task: The Challenge #

Split the room into two groups: one group doing classification-tree, one group doing shazam. Within your group (you can split into smaller groups if you’d like), you’ll work to design your code for the Challenge.

We’ll proceed in steps.

Exercise 1: The Data #

What data and entities do you expect to need to implement all the required features? Several steps:

  1. List all the data you must store. Indicate what kind of data it is (string? data frame? dictionary?).
    • Some data will be about relationships. A song is written by an artist; a node is the left child of a parent node. It may not be obvious how to represent this until we get to Exercise 3, so just list out this kind of data and what it is.
  2. Indicate if this data is temporary (stored in memory while your program runs) or persistent (stored in a file or database to be reused later).
  3. Come up with simple examples of this data that you could use in testing. (For example, in shazam some of your data might be audio recordings; are there very simple cases you could use that would make it easy to test that the recordings are being used correctly?)
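
As one possible answer to step 3, very simple synthetic data is often easier to reason about than real data. The sketch below is only an illustration: `sine_tone`, its default sample rate, and the tiny labelled data set are hypothetical and do not come from the Challenge descriptions. A pure tone has a known frequency, and the toy data set has an obvious split point, so it is easy to tell whether code that uses them behaves correctly.

```python
import math


def sine_tone(frequency_hz, duration_s=1.0, sample_rate=8000):
    """Return a pure tone as a list of samples in [-1, 1]."""
    n_samples = int(duration_s * sample_rate)
    return [
        math.sin(2 * math.pi * frequency_hz * t / sample_rate)
        for t in range(n_samples)
    ]


# A one-second tone at 1000 Hz: a "recording" whose content is fully known.
tone = sine_tone(1000)
print(len(tone))

# A toy classification data set: (feature value, label) pairs that are
# perfectly separated by any threshold between 2.0 and 8.0.
TINY_DATA = [(1.0, "a"), (2.0, "a"), (8.0, "b"), (9.0, "b")]
```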

Exercise 2: The Operations #

Review the Challenge description to determine what your program needs to be able to do. Focus on operations that affect the data you described in Exercise 1.

  1. What operations must you be able to do to the data? (Adding new rows, sorting, searching for specific things…)
  2. What kinds of data structures are good at storing each type of data and supporting each kind of operation?

If it helps, use a piece of paper to make a list of data types and the operations you need to perform on them.
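
As a small illustration of matching a data structure to an operation, the sketch below uses only Python built-ins and the standard `bisect` module. The song titles, artist names, and threshold values are made up. A dictionary suits lookup by key, while a sorted list suits ordered traversal and binary search.

```python
import bisect

# A dictionary supports fast lookup by key (here: song title -> artist).
artist_by_song = {"Song A": "Artist X", "Song B": "Artist Y"}
artist_by_song["Song C"] = "Artist X"   # adding a new entry
print(artist_by_song["Song B"])         # searching by key

# A sorted list supports ordered traversal and binary search.
thresholds = [0.5, 1.5, 3.0]
bisect.insort(thresholds, 2.0)                # insert, keeping the list sorted
index = bisect.bisect_left(thresholds, 1.5)   # position of 1.5 via binary search
print(thresholds, index)
```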

Exercise 3: Make It Concrete #

So far we’ve been thinking abstractly about data and operations. Now let’s think about how we’d turn this into code.

  1. If you were to write this in object-oriented style, what would the classes be? What methods would they have and what attributes would they contain?
    • For example, in classification-tree, would you have one Tree class? A Tree class and a Node class? What about random forests?
  2. On paper, sketch a diagram showing the classes, any inheritance relationships they have, and their methods.
  3. Annotate the diagram with the arguments each method needs to take. What is required?
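
For classification-tree, one possible answer to the first question is a `Tree` class that owns a root `Node`. The sketch below is just one option among several, not a recommended design: the attribute names, the "less than or equal goes left" rule, and the omission of any fitting code are all assumptions made to keep it short. It shows only how prediction could walk from the root to a leaf.

```python
class Node:
    """A tree node: either a leaf with a label, or a split with two children."""

    def __init__(self, label=None, feature=None, threshold=None,
                 left=None, right=None):
        self.label = label          # set only on leaves
        self.feature = feature      # index of the feature tested at a split
        self.threshold = threshold  # split point for that feature
        self.left = left            # child used when value <= threshold
        self.right = right          # child used when value > threshold

    def is_leaf(self):
        return self.label is not None


class Tree:
    """Wraps the root node and exposes the operations users call."""

    def __init__(self, root):
        self.root = root

    def predict(self, row):
        """Follow splits from the root until a leaf is reached."""
        node = self.root
        while not node.is_leaf():
            if row[node.feature] <= node.threshold:
                node = node.left
            else:
                node = node.right
        return node.label


# A hand-built tree with a single split on feature 0 at 5.0.
tree = Tree(Node(feature=0, threshold=5.0,
                 left=Node(label="a"), right=Node(label="b")))
print(tree.predict([2.0]))
```

With this split, a random forest could be a third class that holds a list of `Tree` objects, which is one reason to keep `Tree` and `Node` separate.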

Resources #

  • Programming on Purpose by P.J. Plauger (See e.g., link.)
  • Code Complete by Steve McConnell (Focuses on Object-Oriented Design)