# Performance

## Profiling (1.5)

A blog post on profiling and benchmarking.

I find it most efficient to keep a global profiling environment in which Profile, BenchmarkTools, and StatProfilerHTML have been installed with Pkg.add. To use it:

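    using Pkg
    using MyPackage                    # load the package under test from its own environment
    Pkg.activate("path/to/profiling")  # switch to the shared profiling environment
    using BenchmarkTools, Profile, StatProfilerHTML
    Pkg.activate(".")                  # switch back to the package environment
    include("profiling_code.jl")       # profiling script for this package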
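For reference, here is a minimal sketch of what a profiling script such as profiling_code.jl might contain. It uses only the built-in Profile standard library; the function `sum_of_sqrts` is a made-up stand-in for real package code, and the keyword values passed to `Profile.print` are just one reasonable choice.

```julia
using Profile

# Hypothetical workload standing in for the package code being profiled.
function sum_of_sqrts(n)
    total = 0.0
    for i in 1:n
        total += sqrt(i)
    end
    return total
end

sum_of_sqrts(10)                     # run once first so compilation is not profiled
Profile.clear()                      # discard samples left over from earlier runs
@profile sum_of_sqrts(50_000_000)    # collect samples while the call runs

# Flat listing sorted by sample count; rarely hit lines are hidden.
Profile.print(format = :flat, sortedby = :count, mincount = 5)
```

The raw listing this prints is what the packages below turn into something more readable.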


The output generated by the built-in profiler is hard to read. Fortunately, there are packages that improve readability or graph the results.

ProfileView now compiles (as of 1.3), although compilation takes a surprisingly long time. Personally, I find the presentation of StatProfilerHTML more convenient.

### StatProfilerHTML

• It provides a flame graph with clickable links that show which lines in a function take up the most time.
• After running statprofilehtml(), you need to locate index.html and open it by hand in the browser. Alternatively, click the path link printed in the terminal.

### PProf.jl

• Requires Graphviz. On macOS, install it with brew install graphviz. However, Graphviz has a very large number of dependencies and did not install on my system; without it, PProf cannot be used.

### TimerOutputs.jl

• Can be used to time selected lines of code.
• Produces a nicely formatted table that is much easier to digest than profiler output.
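A minimal sketch of how this looks in practice, assuming TimerOutputs is installed. The function `run_model` and the section labels "setup" and "solve" are made up for illustration; `TimerOutput`, `@timeit`, and `show` are the package's documented entry points.

```julia
using TimerOutputs

const to = TimerOutput()   # collects the timings of all labelled sections

# Hypothetical two-step workload; the labels are arbitrary strings.
function run_model(n)
    x = @timeit to "setup" rand(n)       # @timeit returns the value of the timed expression
    s = @timeit to "solve" sum(sin, x)
    return s
end

for _ in 1:3
    run_model(100_000)
end

show(to)   # prints the table: calls, time, and allocations per section
```

Each label gets one row in the table, accumulated over all calls, so sections that run inside a loop are summarized rather than listed once per iteration.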

## Loops (1.5)

LoopVectorization.jl can give massive speed improvements for for loops. An example.
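As a rough sketch of typical usage, assuming LoopVectorization is installed: the macro is placed in front of a plain for loop. Current releases call the macro `@turbo`; releases from around Julia 1.5 called it `@avx`. The functions `dot_plain` and `dot_turbo` are made-up names, and how much speedup you see depends on the loop and the hardware.

```julia
using LoopVectorization

# Baseline: an ordinary loop computing a dot product.
function dot_plain(a, b)
    s = zero(eltype(a))
    @inbounds for i in eachindex(a, b)
        s += a[i] * b[i]
    end
    return s
end

# Same loop, with the vectorization macro applied.
function dot_turbo(a, b)
    s = zero(eltype(a))
    @turbo for i in eachindex(a, b)
        s += a[i] * b[i]
    end
    return s
end

a, b = rand(1_000), rand(1_000)
dot_plain(a, b) ≈ dot_turbo(a, b)   # results agree up to floating-point reordering
```

Comparing the two with BenchmarkTools (for example `@btime dot_turbo($a, $b)`) is the easiest way to check whether the macro pays off for a given loop.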

## Manual dispatch (1.5)

It is beneficial to manually dispatch at runtime when a variable could potentially take on many types (as far as the compiler knows) but we know that only a few of those are possible. This is done automatically for small unions (known as union splitting). But for parametric types, the compiler has to look up methods in the method table at runtime because they could be extended.

The package ManualDispatch.jl has a @unionsplit macro for this purpose. As far as I know, though, one may just as well write out an explicit if/elseif chain. This looks weird, because every branch makes the same call:

    if x isa A
        foo(x)
    elseif x isa B
        foo(x)
    end


but it seems to work. See the discussion on Discourse.
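Here is a self-contained sketch of the same pattern. The types `Shape`, `Circle`, and `Square` and the function `area` are invented for illustration and are not part of ManualDispatch.jl. Inside each `isa` branch the compiler knows the concrete type of `x`, so the call can be resolved statically; the final `else` falls back to ordinary dynamic dispatch.

```julia
abstract type Shape end

struct Circle <: Shape
    r::Float64
end

struct Square <: Shape
    s::Float64
end

area(c::Circle) = pi * c.r^2
area(q::Square) = q.s^2

# Elements of a Vector{Shape} are only known to be some Shape,
# so each call to area here goes through dynamic dispatch.
total_area_dynamic(shapes) = sum(area, shapes)

# Manual dispatch: branch on the few types we know can occur.
function total_area_manual(shapes)
    total = 0.0
    for x in shapes
        if x isa Circle
            total += area(x)            # x is known to be a Circle here
        elseif x isa Square
            total += area(x)            # x is known to be a Square here
        else
            total += area(x)::Float64   # any other Shape: dynamic dispatch
        end
    end
    return total
end

shapes = Shape[Circle(1.0), Square(2.0), Circle(0.5)]
total_area_manual(shapes) ≈ total_area_dynamic(shapes)
```

Whether this is actually faster is worth checking with BenchmarkTools for the case at hand; with only two subtypes the compiler may already do something similar on its own.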

## GPU computing (1.5)

Tutorials:

• Nextjournal 2019
• CUDA.jl tutorial