Writing Solid Code¶
This section discusses general programming concepts.
The key idea:
It's about structure, not algorithms.
Structured Programming¶
Structured Programming is an old concept that is now essentially baked into programming languages.
The original idea was to avoid "spaghetti code" where go to statements were used to jump around the code. This is no longer an issue, but the broader ideas remain important.
One key insight is that code needs to be readable and maintainable. Good code reads a lot like text.
A long process is broken down into a few steps. Each step becomes a function that, ideally, carries out exactly one task.
Example:
function run_model()
    m = initialize_model()
    sol = solve_model(m)
    sim = simulate_model(m, sol)
    show_results(sim)
end
Notes:
- The code is self-explanatory. There is little need for comments.
- All objects are passed explicitly into and out of functions.
- No side-effects.
- No global states.
- The main function is short.
- The flow is linear.
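To see the skeleton run end to end, here is a minimal sketch with toy bodies for each step. The `Model` type, its fields, and the contents of every function are hypothetical placeholders, not part of any real model. The only other change is that `run_model` returns `sim` so that the result can be inspected.

```julia
# Hypothetical placeholder type; a real model would hold many more objects.
struct Model
    beta :: Float64   # discount factor (illustrative value)
    T :: Int          # number of simulated periods
end

initialize_model() = Model(0.95, 5)

# Toy "solution": a constant growth rate standing in for policy functions.
solve_model(m :: Model) = (growthRate = 1 - m.beta,)

# Simulate a path that starts at 1.0 and grows at the solved rate.
function simulate_model(m :: Model, sol)
    path = ones(m.T)
    for t in 2 : m.T
        path[t] = path[t-1] * (1 + sol.growthRate)
    end
    return path
end

show_results(sim) = println("Simulated path: ", round.(sim; digits = 3))

# Same structure as the example above: short, linear, no globals.
function run_model()
    m = initialize_model()
    sol = solve_model(m)
    sim = simulate_model(m, sol)
    show_results(sim)
    return sim
end

run_model()
```

Each step can be replaced by a real implementation without touching `run_model`.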
Top Down Design¶
One daunting task is to go from the problem description "Solve and simulate the model. Generate figures." to the actual code. The task seems unmanageably big.
The idea of top-down design by stepwise refinement is to write out, at a high level of abstraction, which steps need to be taken.
The top level of the code could literally look like the example above.
Then each step gets refined. For example:
function solve_model(m)
    sol = initialize_solution()
    solve_household!(sol, m)
    solve_firm!(sol, m)
    return sol
end
Note:
- The function is again short.
- All steps are at the same level of abstraction.
- Each function performs exactly one task.
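One more level of refinement might look like the sketch below. The `Solution` type, the fields of `m`, and both decision rules are assumptions made up for illustration. The point is only that each `!` function fills in its own part of `sol`.

```julia
# Hypothetical container for the solution; fields are filled in step by step.
mutable struct Solution
    saving :: Float64
    capitalDemand :: Float64
end

initialize_solution() = Solution(NaN, NaN)

# Household step (placeholder rule): save a fixed fraction of income.
function solve_household!(sol :: Solution, m)
    sol.saving = m.savingRate * m.income
    return nothing
end

# Firm step: capital demand solves alpha * k^(alpha - 1) = r.
function solve_firm!(sol :: Solution, m)
    sol.capitalDemand = (m.alpha / m.r) ^ (1 / (1 - m.alpha))
    return nothing
end

function solve_model(m)
    sol = initialize_solution()
    solve_household!(sol, m)
    solve_firm!(sol, m)
    return sol
end

# Parameters passed in explicitly as a named tuple (illustrative values).
m = (savingRate = 0.5, income = 2.0, alpha = 0.3, r = 0.05)
sol = solve_model(m)
```

Initializing the fields to `NaN` makes it obvious when a step was skipped.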
At some point, we arrive at a task that is sufficiently small that we write actual code.
- Some people first write pseudo code.
- Others just write the code directly.
- It depends on how easy the task is.
Then, right away, write unit tests. Really.
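As a sketch of what "right away" could mean, here is a small function together with its tests, using Julia's standard `Test` library. The function `crra_utility` is a hypothetical example of a task small enough to write directly.

```julia
using Test

# Hypothetical small task: CRRA utility, with log utility when sigma == 1.
function crra_utility(c :: Real, sigma :: Real)
    c > 0 || throw(DomainError(c, "consumption must be positive"))
    sigma == 1 && return log(c)
    return c ^ (1 - sigma) / (1 - sigma)
end

@testset "crra_utility" begin
    # Known values
    @test crra_utility(1.0, 1.0) == 0.0
    @test crra_utility(1.0, 2.0) ≈ -1.0
    # Utility is increasing in consumption
    @test crra_utility(2.0, 2.0) > crra_utility(1.0, 2.0)
    # Invalid inputs are rejected
    @test_throws DomainError crra_utility(-1.0, 2.0)
end
```

The tests check a few known values, one qualitative property, and one invalid input. That is often enough to catch a typo.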
Reusable code¶
Quite a bit of code is not project specific.
Examples: Utility functions, production functions, helpers for figures and tables.
It pays off over time to make this code generic / reusable and factor it out into a separate package.
Some rules¶
Style matters¶
This point is hard to overstate. It is extremely important to write code that is easy to understand and easy to maintain.
In practice, you often revisit programs months or years after they were written. They need to be well documented and well structured.
The programs needed to solve a stochastic OLG model have thousands of lines of code. The only way to understand something this complex is to break it into logical, self-contained pieces (a function that solves the household problem, another that solves the firm problem, etc.).
One example of how important this is:
Air traffic control centers still operate with hardware from the 1970s. The reason is that nobody understands the software well enough to port it to new hardware. The FAA has already spent billions of dollars on unsuccessful attempts to rewrite this mess.
Another example is the Space Shuttle, which runs (now "ran") on hardware from the 1960s. The reason is again that the software engineers can no longer understand the existing code.
There are many books on good programming style. One that I like is Writing Solid Code by Steve Maguire. Read it!
Test, test, test¶
Write small functions with lots of automated unit tests.
It may seem like a waste of time to test something that is "obviously correct." But remember:
One typo turns the obviously correct function into a very hard to find bug.
Code that is neither general purpose nor very performance critical should contain lots of self-testing code.
Catching bugs early makes them easier to find.
A trick to prevent your code from getting slowed down by self-testing:
- add a debugging switch as an input argument to each function (I call it dbg).
- if dbg == false: go for speed and turn off self-testing
- if dbg == true: run all self-test code
The process is then:
- Write code. Make sure it runs (correct syntax).
- Make sure it is correct (run all self-test code; this is slow).
- When you are confident that your code is good, set dbg = false and go for speed.
- But every now and then, randomly switch dbg on so that self tests are run (little cost in terms of run time; a lot of gain in terms of confidence in your code).
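A minimal sketch of the dbg switch follows. The function `consumption` and its checks are hypothetical. Passing `dbg` as a keyword argument with a default is one possible choice, and a positional argument would work just as well.

```julia
# Hypothetical example: consumption from a budget constraint, with optional self-tests.
function consumption(income :: Vector{Float64}, saving :: Vector{Float64};
        dbg :: Bool = false)
    if dbg
        # Input checks: only run when dbg == true
        @assert length(income) == length(saving) "income and saving differ in length"
        @assert all(income .>= 0.0) "income must be non-negative"
    end
    c = income .- saving
    if dbg
        # Output check: only run when dbg == true
        @assert all(c .> 0.0) "consumption must be positive"
    end
    return c
end

# Slow but safe: all self-tests run
c = consumption([1.0, 2.0], [0.2, 0.5]; dbg = true)

# Fast: self-tests are skipped
c = consumption([1.0, 2.0], [0.2, 0.5])
```

With `dbg = true`, a call such as `consumption([1.0], [2.0]; dbg = true)` fails immediately with an informative message instead of passing negative consumption on to later code.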
Avoid literals¶
Your code should rarely use specific values for any object. When you refer to an object, do so by its name.
When you see x = zeros([5,3]) or for i1 = 1 : 57, something is probably wrong.
This kind of code is not maintainable. What if you want 58 values instead of 57? Do you want to go through 10,000 lines of code to find all occurrences of 57 that need to be replaced?
It is much better to write:
const nTypes = 57
for i1 = 1 : nTypes
    # [...]
end
The Golden Rule is: Every literal must have a name. Its value is defined in one place only.
Related to this: do not hard-code functional forms.
- If you want to compute the marginal product of capital, write a function for it. Otherwise, if you want to switch from Cobb-Douglas to CES, you have to rewrite all your programs.
- Object oriented programming makes it easy to swap out entire parts of a model. We will talk about this later.
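One way to avoid hard-coding the functional form in Julia is multiple dispatch, sketched below. All type and function names (`ProductionFunction`, `CobbDouglas`, `CES`, `output`, `mpk`, `interest_rate`) are assumptions chosen for illustration.

```julia
# Hypothetical types; each production function carries its own parameters.
abstract type ProductionFunction end

struct CobbDouglas <: ProductionFunction
    A :: Float64       # productivity
    alpha :: Float64   # capital share
end

struct CES <: ProductionFunction
    A :: Float64
    alpha :: Float64
    rho :: Float64     # substitution parameter (rho != 0)
end

output(f :: CobbDouglas, k, l) = f.A * k ^ f.alpha * l ^ (1 - f.alpha)
output(f :: CES, k, l) =
    f.A * (f.alpha * k ^ f.rho + (1 - f.alpha) * l ^ f.rho) ^ (1 / f.rho)

# Marginal product of capital: one method per functional form.
mpk(f :: CobbDouglas, k, l) = f.alpha * output(f, k, l) / k
mpk(f :: CES, k, l) = f.A ^ f.rho * f.alpha * (output(f, k, l) / k) ^ (1 - f.rho)

# Code that uses the MPK never mentions the functional form.
interest_rate(f :: ProductionFunction, k, l, delta) = mpk(f, k, l) - delta

r1 = interest_rate(CobbDouglas(1.0, 0.3), 2.0, 1.0, 0.05)
r2 = interest_rate(CES(1.0, 0.3, 0.5), 2.0, 1.0, 0.05)
```

Switching from Cobb-Douglas to CES changes only the object that is passed in; `interest_rate` and everything built on it stay the same.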
Avoid globals¶
Globals make it hard to reason about your code (unless they are constant).
Globals make it hard to test your code. You never know who changed them.
A common mistake in economics is to make the model parameters into globals. The reasoning is that they are used everywhere, so they need to be global. This is a terrible idea.
- It is hard to remember what each parameter exactly means.
- One typo that changes a parameter value leads to very interesting debugging sessions.
- It is hard to keep track of where the parameter values are set.
A much better alternative is to store the parameters inside the model objects.
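A minimal sketch of this idea, assuming a hypothetical `Household` object with two parameters, is shown below. `Base.@kwdef` generates a keyword constructor with default values.

```julia
# Hypothetical model object: parameters are documented where they are defined.
Base.@kwdef struct Household
    beta :: Float64 = 0.96    # discount factor
    sigma :: Float64 = 2.0    # curvature of utility
end

# Functions receive the object explicitly instead of reading globals.
discounted_utility(h :: Household, c, t) =
    h.beta ^ t * c ^ (1 - h.sigma) / (1 - h.sigma)

h = Household()                    # default parameters
hPatient = Household(beta = 0.99)  # alternative calibration, set in one place

u = discounted_utility(h, 1.0, 0)
```

Because the struct is immutable, an accidental assignment such as `h.beta = 0.9` raises an error instead of silently changing a parameter. Two calibrations can also coexist in the same session, which is awkward with globals.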
Optimization¶
Optimization refers to program modifications that speed up execution.
Think before you optimize!
Most code runs so fast that optimization is simply a waste of time.
Also: Beware of your intuition about where the program spends most of its time.
Here is an example:
- Consider the function that solves a stochastic OLG model.
- It turns out that it spends 80% of its time running the Matlab interpolation function interp1!
- There is little point optimizing the rest of the code.
To find out what makes your program slow, use a Profiler.
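In Julia, the `Profile` standard library offers a sampling profiler. The sketch below assumes two hypothetical steps, `slow_step` and `fast_step`, and shows the basic workflow: run once to compile, then profile and print the report.

```julia
using Profile

# Hypothetical stand-in for an expensive step
function slow_step(n)
    total = 0.0
    for i in 1 : n
        total += sqrt(i) * sin(i)
    end
    return total
end

# Hypothetical stand-in for a cheap step
fast_step(n) = n + 1

run_all(n) = slow_step(n) + fast_step(n)

run_all(10)          # run once first so that compilation is not profiled
Profile.clear()      # discard any earlier samples
@profile run_all(10^7)

# Flat report, sorted by the number of samples collected per line
Profile.print(format = :flat, sortedby = :count)
```

The lines with the most samples are where the time goes. In this toy case that should be the loop inside `slow_step`, but the point of profiling is to check rather than guess.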
Material for Economists¶
Quantitative Economics by Sargent and Stachurski
- a really nice collection of lectures and exercises that covers both programming and the economics of the material (in Julia and Python)
Tony Smith: Tips for quantitative work in economics
Material Not for Economists¶
- Lifehacker: teach yourself how to code
- "Clean Code" by Robert Martin is a classic. It uses Java, but the general ideas apply everywhere.
- "Writing Solid Code" by Steve Maguire.